M. Basu, Y. Pan, and J. Wang (Eds.): ISBRA 2014, LNBI 8492, pp. 59–70, 2014. © Springer International Publishing Switzerland 2014

Algorithms Implemented for Cancer Gene Searching and Classifications

Murad M. Al-Rajab and Joan Lu
School of Computing and Engineering, University of Huddersfield, Huddersfield, UK
{U1174101,j.lu}@hud.ac.uk

Abstract. Understanding gene expression is an important factor in cancer diagnosis. One goal of this understanding is to implement cancer gene search and classification methods. However, cancer gene search and classification is challenging because no single exact algorithm can be applied on its own across the various cancer cells. This paper surveys the most common top-ranked algorithms implemented for cancer gene search and classification and how they are implemented to reach better performance. The paper distinguishes algorithms implemented for bio-image analysis of cancer cells from algorithms based on DNA array data. The main purpose of this paper is to explore a road map towards presenting the most current algorithms implemented for cancer gene search and classification.

Keywords: cancer, genes, searching algorithms, classification algorithms.

1 Introduction

Cancer is one of the most serious diseases in modern society and a major cause of death worldwide. Traditional diagnostic methods are based mainly on the morphological and clinical appearance of cancer, but their contribution is limited because cancer usually results from other environmental factors. There are several causes of cancer (carcinogens), such as smoke, radiation, synthetic chemicals, polluted water, and others that may accelerate mutations, as well as many undiscovered causes.
On the other hand, it is mandatory to select the most informative genes from large data sets: removing uninformative genes decreases noise, confusion, and complexity, and increases the chances of identifying diseases and predicting outcomes such as cancer types [1]. One of the challenging tasks in cancer diagnosis is how to identify salient expression genes from thousands of genes in microarray data that can directly contribute to the phenotype or symptom of disease [3]. The development of array technologies indicates the possibility of early detection and accurate prediction of cancer. Through these technologies, it is possible to obtain thousands of gene expression levels simultaneously and to use them to determine whether a sample is cancerous and to classify the cancer [5]. Thus, there is a need to identify the informative genes that contribute to a cancerous state. An informative gene is a gene that is useful and relevant for cancer classification [6].

Cancer classification, which can help to improve the health care of patients and the quality of life of individuals, is essential for cancer diagnosis and drug discovery [3]. Cancer classification or prediction refers to the process of constructing a model on the microarray dataset and then distinguishing one type of sample from other types within the induced model [7]. A microarray is a device or technology used to measure the expression levels of thousands of genes simultaneously in a cell mixture; it produces microarray data, also known as gene expression data. The task of cancer classification using microarray data is to classify tissue samples into related classes of phenotypes, such as cancer versus normal [8]. A major problem with microarray data is the high redundancy and noisy nature of many genes, which contribute information irrelevant to the accurate classification of cancer.
Only a small number of genes may be important [9].

Early and accurate detection and classification of cancer is critical to the wellbeing of patients. Methods and algorithms for cancer identification are therefore important and of great value in providing better treatment, and they can be built on the analysis of genetic data. For practical use an algorithm has to be fast and accurate as well as easy to implement, test, and maintain. The optimal algorithm for a given task would have adequate performance with minimal implementation complexity [10]. To study the algorithms implemented for cancer gene search and classification, a solid literature review is needed that runs from bioinformatics fundamentals through bio-image processing and algorithm analysis to the cancer gene searching and selection algorithms implemented in the field, how these algorithms can be applied to classify cancer cells, and how efficient they are.

New technologies such as microarrays produce large datasets characterized by a large number of features (genes); this is why feature selection (gene selection) has become very important in fields such as bioinformatics. The authors of [6, 11] introduced a new hybrid feature selection method that combines the advantages of a filter strategy based on the Laplacian Score with a simple wrapper strategy. The suggested algorithm resulted in a fast hybrid feature selector that can solve feature selection problems in high-dimensional datasets and select a small subset of informative genes that is most relevant to cancer classification. Another study developed an automated system for robust and reliable cancer diagnosis based on gene microarray data, as stated by the authors in [9].
They found that support vector machine classifiers outperform other algorithms such as k-nearest neighbors, naive Bayes, neural networks, and decision trees, and could thus identify the important genes for tumor classification. The authors in [12], on the other hand, found the smallest set of genes that can ensure highly accurate cancer classification from microarray data by using supervised machine learning algorithms. Moreover, the authors in [13] surveyed different feature selection techniques and their application to gene array data; they found two optimal search methods for cancer classification, Genetic Algorithms (GA) and Tabu Search (TS), to generate candidate genes for classification. They argued that GA is an optimal search method that behaves like evolutionary processes in nature, while TS is a heuristic method that guides the search for an optimal solution by making use of flexible memory.

The main purpose of this paper is to explore a road map towards presenting the most current algorithms implemented for cancer gene search and classification.

The remainder of this paper is structured as follows. Section 2 discusses the common algorithms implemented in the research topic, Section 3 gives an overview of the algorithms, and Section 4 presents the results and discussion. Finally, Section 5 concludes the paper.

2 Common Algorithms for Cancer Gene Search and Classification

The study of the algorithms is classified into two categories: first, algorithms that focus on gene expression analysis for cancer gene selection, and second, algorithms that focus on bio-image analysis and perform cancer classification. These categories are discussed below.

2.1 Analysis of Cancer Gene Selection and Classification Algorithms

Microarray data is influencing cancer diagnostics.
Accurate prediction of the type or size of tumors from microarray data depends on reliable and efficient classification algorithms, so that patients can be provided with better treatment or therapy response. The main issue with microarray data is its high dimensionality, which may lead to low efficiency in cancer gene classification and makes it difficult to classify the related genes. Among the thousands of genes whose expression levels are measured, not all are needed for classification [5]. Thus, one challenging task in cancer diagnosis is how to identify salient expression genes from thousands of genes in microarray data and how to select informative genes for classification that relate to the symptoms of disease [7]. Below is a summary of the most widely implemented classification algorithms applied in the field and argued to be efficient for the diagnosis and treatment of diverse cancer types.

Integrated Gene-Search Algorithm

The integrated algorithm [1] is based on a Genetic Algorithm (GA) and correlation-based heuristics (correlation-based feature selection, CFS) for data preprocessing, and on data mining (decision tree and support vector machine algorithms) for making predictions. Thereafter, bagging and stacking algorithms were applied to further enhance classification accuracy, and the analysis of data was performed with the WEKA data mining software. This work was proposed for and successfully applied to training and testing gene expression data sets of ovarian, prostate, and lung cancers, but it can also be applied to other cancers such as colon, breast, bladder, and leukemia. The algorithm consists of two phases, as shown in Figure 1. Phase I is iterative: the data are partitioned, the Decision Tree (DT) algorithm or any other data mining algorithm is executed on the data set, and then GA and CFS are applied for gene reduction.
After that, in phase II, data-mining algorithms are applied to the training and testing data sets generated from phase I, and their results are evaluated to determine the most significant gene set.

Fig. 1. Integrated Gene Search Algorithm

An Integrated Algorithm for Gene Selection and Classification Applied to Microarray Data for Ovarian Cancer

This approach applies a hybrid of algorithms (Genetic Algorithm "GA", Particle Swarm Optimization "PSO", Support Vector Machine "SVM", and Analysis of Variance "ANOVA") to select gene markers from target genes; finally, a fuzzy model is applied to classify cancer tissues [2]. Because of the huge amount of data generated from gene expression and the lack of a systematic procedure to analyze the information instantaneously, and in order to avoid higher computational complexity, it is necessary to select the most likely differential gene markers to explain the effects on ovarian cancer. It is concluded that the proposed algorithm has superior performance on ovarian cancer and can be applied to other cancer diagnosis studies, as can be seen from Table 1.

Table 1. The Proposed Algorithm: Accuracy of classification for various approaches

Cancer type   Hybrid of SVM and GA (%)   Hybrid of SVM and PSO (%)   Proposed algorithm (%)
Colon         95.65                      97.13                       99.13
Breast        96.23                      97.95                       98.55

Source: Zne-Jung Lee, An integrated algorithm for gene selection and classification applied to microarray data of ovarian cancer, International Journal Artificial Intelligence in Medicine 42 (2008) 91.

A Bootstrapped Genetic Algorithm and Support Vector Machine to Select Genes for Cancer Classification

Gene expression data obtained from microarrays have been shown to be useful in cancer classification. A novel system is suggested for selecting a set of genes for cancer classification.
The system is based on a linear support vector machine and a genetic algorithm. The proposed system considers two data sets, one for colon cancer and the other for leukemia. It is argued that this hybridization of a genetic algorithm, a support vector machine, and bootstrap methods is very efficient for classification problems [4].

A Novel Embedded Approach Composed of Two Main Phases to the Problem of Cancer Classification Using Gene Expression Data

Phase one uses gene selection to select the important predictive genes, which makes the samples easier to classify correctly later. The second phase builds powerful classifier models. For gene selection, three filter approaches are analyzed, Information Gain (IG), Relief Algorithm (RA), and t-statistics (TA), to obtain a reduced predictive feature (gene) space containing the most informative genes. Five well-known classifier algorithms (Support Vector Machine (SVM), K Nearest Neighbor (KNN), Naïve Bayes (NB), Neural Network (NN), and Decision Tree (DT)) are then used to classify nine well-known publicly available gene expression datasets. The experiments showed that the SVM classifier clearly outperformed KNN, NB, NN, and DT in 8 of the 9 datasets [9].

Genetic Algorithm (GA) with an Initial Solution Provided by t-statistics (t-GA) for Selecting a Group of Informative Genes from Cancer Microarray Data

The Decision Tree classifier (DT) is then built on top of these selected genes. The performance of the proposed approach was compared with other selection methods, and the results indicated that t-GA has the highest accuracy among the methods compared [14].
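The two-phase scheme that recurs in the approaches above (select informative genes, then classify on the reduced gene set) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of any cited paper: the data are synthetic, the function names (`welch_t`, `rank_genes`, `make_toy_data`, `nearest_centroid_predict`) are hypothetical, the filter is a plain Welch t-statistic, and a nearest-centroid rule stands in for the SVM or DT classifiers used in the cited work.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic between two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(samples, labels, top_k):
    """Phase 1 (filter): indices of the top_k genes with the largest |t|."""
    scores = []
    for g in range(len(samples[0])):
        tumor = [s[g] for s, y in zip(samples, labels) if y == 1]
        normal = [s[g] for s, y in zip(samples, labels) if y == 0]
        scores.append((abs(welch_t(tumor, normal)), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:top_k]]


def nearest_centroid_predict(train, train_labels, genes, sample):
    """Phase 2 (stand-in classifier): pick the closer class centroid on `genes`."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_labels)):
        rows = [s for s, y in zip(train, train_labels) if y == label]
        centroid = [mean(r[g] for r in rows) for g in genes]
        dist = sum((sample[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def make_toy_data(n_per_class=20, n_genes=50, informative=(3, 17), seed=0):
    """Synthetic stand-in for microarray data: only `informative` genes differ by class."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in informative:
                    row[g] += 4.0  # assumed expression shift in tumor samples
            samples.append(row)
            labels.append(label)
    return samples, labels


samples, labels = make_toy_data(seed=0)
selected = rank_genes(samples, labels, top_k=2)

# Evaluate on an independently generated hold-out set.
test_samples, test_labels = make_toy_data(seed=1)
hits = sum(
    nearest_centroid_predict(samples, labels, selected, s) == y
    for s, y in zip(test_samples, test_labels)
)
accuracy = hits / len(test_samples)
print("selected genes:", sorted(selected))
print("hold-out accuracy:", accuracy)
```

In a t-GA style method, a ranking of this kind would only provide the initial solution for the genetic algorithm; here it is used directly to keep the sketch short.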
2.2 Cancer Classification through Bio Image Analysis Algorithms

The CAIMAN system (CAncer IMage ANalysis) [15] is an online algorithm repository that analyzes images produced by experiments relevant to cancer research (www.caiman.org.uk). Three algorithms have been implemented in this project: one for measuring cellular migration, one for vasculature analysis, and one for image shading correction. Table 2 reports the performance estimation of the CAIMAN system. The three algorithms were tested with two groups of five images each, one of approximately 10 kb in size and the other of more than 1 Mb. The times are recorded from the moment the user opens the web page to the moment the email with the results is received.

Table 2. Proposed Algorithm Performance Estimation

Algorithm   Dimension (pixels)   Size (kb)   Time (s)
Migration   285 x 203            1001        62.6 ± 9.6
            127 x 900            1700        81.4 ± 16.7
Tracing     220 x 164            108         66.2 ± 20.3
            768 x 576            1300        207.4 ± 14.6
Shading     285 x 203            100         59.5 ± 14.1
            1270 x 900           1700        65.0 ± 15.6

Source: Constantino Carlos Reyes-Aldasoro, Michael K. Griffiths, Deniz Savas, Gillian M. Tozer, CAIMAN: An online algorithm repository for Cancer Image Analysis, Computer Methods and Programs in Biomedicine, Volume 103, Issue 2, August 2011, Page 103, ISSN 0169-2607, 10.1016/j.cmpb.2010.07.007.

Fig. 2. Integrated Cancer Selection and Classification criteria

3 Algorithms Overview

To classify cells as normal or cancerous, selection and searching algorithms must be implemented first, as shown in Figure 2.

3.1 Searching and Selection Algorithms

Genetic Algorithm (GA) is a search algorithm. A GA is initiated with a set of solutions (chromosomes) called the population [1, 16]. Solutions from one population are taken and used to form a new population.
This is motivated by the hope that the new population will be better than the old one. Solutions are selected to form new solutions according to their fitness: the more suitable they are, the more chances they have to reproduce [14, 16]. The flow chart of the GA is presented in Figure 3.

Correlation-based feature selection (CFS) is a process of selecting a subset of the original features so that the feature space is optimally reduced according to a certain evaluation criterion [17]. It reduces the number of features, removes irrelevant, redundant, or noisy data, and brings immediate benefits for applications [18].

Particle Swarm Optimization (PSO) is a population-based search algorithm based on the simulation of social behavior [19]. PSO is similar to GA in that the system is initialized with a population of random solutions. Unlike GA, however, each potential solution is also assigned a randomized velocity, and the potential solutions, called "particles", are then "flown" through the problem space [20].

Analysis of variance (ANOVA) is an extremely important method in exploratory and confirmatory data analysis [21].

Information Gain (IG) is a method that attempts to quantify the best possible class predictability that can be obtained by dividing the full range of a given gene expression into two disjoint intervals corresponding to the down-regulation of the gene. Samples in one interval are predicted to be normal and samples in the other to be cancerous [14].

Fig. 3. Block Diagram of Genetic Algorithm

3.2 Classification Algorithms

Support Vector Machine (SVM) is considered a popular classifier for microarray data [22]. Its advantage in cancer diagnostics is that its performance appears not to be affected by using the full set of genes [9].

k-Nearest Neighbor (KNN) is one of the simplest learning algorithms and is applied to a variety of problems.
It is used as a classifier over a given set of data and uses the class labels of the most similar neighbors to predict the class of a new sample [9].

Naïve Bayes (NB) is a classifier that can achieve relatively good performance on classification tasks, based on the elementary Bayes' theorem [9].

Decision Tree (DT): different methods exist to build a DT, which represents the given data in a tree structure, with each branch representing an association between attribute values and a class label [9]. The best-known DT method is the C4.5 algorithm, which partitions the training data set according to tests on the potential of attribute values to separate the classes.

Table 3. Feature Selection Algorithms Specifications

Methods/technology involved: Filter selection techniques
  Importance: Compute the importance of each feature (gene) and then select the top ranked
  Area: Gene selection
  Advantages: Simple; fast; scales easily to very high-dimensional data
  Disadvantages: Univariate, meaning each feature is considered and treated separately, ignoring any correlation between features
  Problems: Low classification performance

Methods/technology involved: Wrapper selection technique
  Importance: Selects a subset of features that is useful to build a good classifier or predictor
  Area: Gene selection
  Advantages: Can take into account the correlation between features and the interaction with the classifier
  Disadvantages: Prone to a high risk of overfitting; requires very intensive computation
  Problems: Infeasible for feature selection in high-dimensional data; more complex

4 Results and Discussion

In this paper, various algorithms were analyzed that perform the task of cancer gene search and classification by first selecting the informative genes and reducing the data size, and then distinguishing whether a cell is tumorous or not. Cancer gene selection is a preprocessing step used to find a reduced-size sample of the microarray data. This can be achieved by the two feature (gene) selection approaches stated in Table 3.
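The contrast between the two approaches in Table 3 can be made concrete with a small sketch. It is an illustration only, built on assumptions that are not in the source: the eight-sample data set is invented, the filter score is the absolute difference of class means, the wrapper is a greedy forward selection (one of several possible wrapper strategies), and a leave-one-out nearest-centroid classifier stands in for whichever classifier a real wrapper would use. All function names are hypothetical.

```python
from statistics import mean


def centroid_accuracy(samples, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier restricted to `genes`."""
    correct = 0
    for i, (sample, label) in enumerate(zip(samples, labels)):
        rest = [(s, y) for j, (s, y) in enumerate(zip(samples, labels)) if j != i]
        best_label, best_dist = None, float("inf")
        for cls in sorted(set(labels)):
            rows = [s for s, y in rest if y == cls]
            dist = sum((sample[g] - mean(r[g] for r in rows)) ** 2 for g in genes)
            if dist < best_dist:
                best_label, best_dist = cls, dist
        correct += best_label == label
    return correct / len(samples)


def filter_select(samples, labels, k):
    """Filter: score every gene on its own, then keep the k top-ranked genes."""
    def score(g):
        pos = mean(s[g] for s, y in zip(samples, labels) if y == 1)
        neg = mean(s[g] for s, y in zip(samples, labels) if y == 0)
        return abs(pos - neg)
    return sorted(range(len(samples[0])), key=score, reverse=True)[:k]


def wrapper_select(samples, labels, k):
    """Wrapper: greedy forward selection; each candidate subset is scored by the classifier."""
    chosen, evaluations = [], 0
    while len(chosen) < k:
        scored = []
        for g in range(len(samples[0])):
            if g in chosen:
                continue
            acc = centroid_accuracy(samples, labels, chosen + [g])
            evaluations += 1
            scored.append((acc, -g))  # ties are broken in favour of the lowest gene index
        _, neg_g = max(scored)
        chosen.append(-neg_g)
    return chosen, evaluations


# Invented toy data: 8 samples x 4 genes; gene 0 separates the classes clearly.
toy_samples = [
    [1.0, 2.0, 3.0, 0.5], [1.2, 2.5, 1.0, 0.7], [0.8, 1.5, 2.0, 0.4], [1.1, 2.2, 4.0, 0.6],
    [5.0, 3.0, 3.1, 0.5], [5.2, 2.8, 1.2, 0.6], [4.8, 3.5, 2.1, 0.5], [5.1, 2.4, 3.9, 0.7],
]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

filter_genes = filter_select(toy_samples, toy_labels, k=2)
wrapper_genes, n_evaluations = wrapper_select(toy_samples, toy_labels, k=2)
print("filter:", filter_genes)
print("wrapper:", wrapper_genes, "classifier evaluations:", n_evaluations)
```

The filter needs one pass over the genes, whereas the wrapper trains and tests the classifier once per candidate subset, which is where its extra cost on high-dimensional data comes from.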
Table 3 shows that both filter and wrapper models play a role in feature (gene) selection, but each has its pros and cons. The filter model is fast but may give low classification performance, while the wrapper model takes more time and is more complex but may give higher performance. Furthermore, Table 4 (see Appendix) shows that multiple algorithms have been implemented in integration and hybridization to analyze multiple kinds of cancer. In addition, the efficiency of the algorithms depended on the cancer type and on the algorithm implemented. A scientific methodology for determining the most efficient algorithm, or integration of algorithms, for each cancer type is still missing. By algorithm efficiency we mean how fast the algorithm runs when analyzing the cancer cells.

Furthermore, Table 5 (see Appendix) gives a summary of each individual algorithm and the cancer types to which it was applied. It is concluded from Table 5 that the Genetic Algorithm, as a selection algorithm, was applied to almost all cancer types with high performance, except brain cancer, while the Decision Tree and Support Vector Machine algorithms were applied to almost all types of cancer with high performance. In addition, Figure 4 shows that the Integrated Algorithm for gene selection and classification has the highest accuracy, 99%, for colon and breast cancers, while the Bootstrapped Genetic Algorithm and Support Vector Machine give good accuracy without a percentage being reported. The Integrated Gene Search Algorithm has the second-highest performance, with accuracy of up to 98%.

Fig. 4.
Algorithm Efficiency and Accuracy

On the other hand, based on a detailed review of many researchers' contributions, Table 6 (see Appendix) summarizes the most common algorithms used for cancer gene search and classification. Most of these algorithms were implemented in an integrated model or through hybridization, as discussed, in order to give the desired optimum result. The main issue with the previous algorithms is efficiency, which is why most of the suggested algorithms and technologies followed the hybridization methodology in order to achieve better efficiency and accuracy. By efficiency we mean less time and less memory, but the main concern is saving time.

5 Conclusion and Future Work

It is concluded that multiple computational algorithms are applied for cancer gene selection, using either filter or wrapper methods; each has its own advantages and disadvantages and aims to reach good performance. In order to classify cancer cells, selection algorithms must be implemented first to reduce the microarray sample size and reach the informative genes; it is then easier to implement classifier algorithms to distinguish tumor cells from normal cells.

Moreover, the paper showed that most algorithms are implemented in integration with one another in order to achieve better performance. Nevertheless, it was clear that the dominant algorithm applied in integration with other algorithms for gene selection was the Genetic Algorithm, while for classification it was the Support Vector Machine, as both reached better results. Future work will analyze the processing time of each of the algorithms implemented in order to decide which algorithm performs best.

References

1. Shah, S., Kusiak, A.: Cancer gene search with data mining and genetic algorithms.
Computers in Biology and Medicine 37(2), 251–261 (2007)
2. Lee, Z.-J.: An integrated algorithm for gene selection and classification applied to microarray data of ovarian cancer. International Journal Artificial Intelligence in Medicine 42, 81–93 (2008)
3. Liu, H., Liu, L., Zhang, H.: Ensemble gene selection for cancer classification. Pattern Recognition 43(8), 2763–2772 (2010) ISSN 0031-3203, 10.1016/j.patcog.2010.02.008
4. Chen, X.-W.: Gene selection for cancer classification using bootstrapped genetic algorithms and support vector machines. In: Proceedings of the 2003 IEEE Bioinformatics Conference, CSB 2003, August 11-14, pp. 504–505 (2003)
5. Park, C., Cho, S.-B.: Evolutionary ensemble classifier for lymphoma and colon cancer classification. In: The 2003 Congress on Evolutionary Computation, CEC 2003, December 8-12, vol. 4, pp. 2378–2385 (2003)
6. Mohamad, M.S., Omatu, S., Yoshioka, M., Deris, S.: An Approach Using Hybrid Methods to Select Informative Genes from Microarray Data for Cancer Classification. In: Second Asia International Conference on Modeling & Simulation, AICMS 2008, May 13-15, pp. 603–608 (2008)
7. Liu, H., Liu, L., Zhang, H.: Ensemble gene selection for cancer classification. Pattern Recognition 43(8), 2763–2772 (2010) ISSN 0031-3203
8. Mohamad, M.S., Omatu, S., Deris, S., Hashim, S.Z.M.: A Model for Gene Selection and Classification of Gene Expression Data. International Journal of Artificial Life & Robotics 11(2), 219–222 (2007)
9. Osareh, A., Shadgar, B.: Microarray data analysis for cancer classification. In: 2010 5th International Symposium on Health Informatics and Bioinformatics (HIBIT), April 20-22, pp. 125–132 (2010)
10. Nurminen, J.K.: Using software complexity measures to analyze algorithms - an experiment with the shortest-paths algorithms. Computers & Operations Research 30(8), 1121–1134 (2003) ISSN 0305-0548, 10.1016/S0305-0548(02)00060-6
11.
Solorio-Fernandez, S., Martinez-Trinidad, J.F., Carrasco-Ochoa, J.A., Zhang, Y.-Q.: Hybrid feature selection method for biomedical datasets. In: 2012 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), May 9-12, pp. 150–155 (2012)
12. Wang, L., Chu, F., Xie, W.: Accurate Cancer Classification Using Expressions of Very Few Genes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 4(1), 40–53 (2007)
13. Li, J., Su, H., Chen, H., Futscher, B.W.: Optimal Search-Based Gene Subset Selection for Gene Array Cancer Classification. IEEE Transactions on Information Technology in Biomedicine 11(4), 398–405 (2007)
14. Yeh, J.-Y., Wu, T.-S., Wu, M.-C., Chang, D.-M.: Applying Data Mining Techniques for Cancer Classification from Gene Expression Data. In: International Conference on Convergence Information Technology, November 21-23, pp. 703–708 (2007)
15. Reyes-Aldasoro, C.C., Griffiths, M.K., Savas, D., Tozer, G.M.: CAIMAN: An online algorithm repository for Cancer Image Analysis. Computer Methods and Programs in Biomedicine 103(2), 97–103 (2011) ISSN 0169-2607, 10.1016/j.cmpb.2010.07.007
16. Goldberg, D.E.: Genetic algorithms in search, optimization and machine learning. Addison Wesley, MA (1989)
17. Yu, L., Liu, H.: Feature selection for high-dimensional data: A fast correlation-based filter solution. In: ICML, pp. 856–863 (2003)
18. Tiwari, R., Singh, M.P.: Correlation-based Attribute Selection using Genetic Algorithm. International Journal of Computer Applications (0975 – 8887) 4(8), 28–34 (2010)
19. Khanesar, M.A., Teshnehlab, M., Shoorehdeli, M.A.: A novel binary particle swarm optimization. In: Mediterranean Conference on Control & Automation, MED 2007, June 27-29, pp. 1–6 (2007)
20. Eberhart, R.C., Shi, Y.: Particle swarm optimization: developments, applications and resources. In: Proceedings of the 2001 Congress on Evolutionary Computation, vol. 1, pp. 81–86 (2001)
21.
Gelman, A.: Analysis of Variance - Why it is More Important Than Ever. The Annals of Statistics 33(1), 1–53 (2005)
22. Vapnik, V.: Statistical learning theory. Wiley (1998)

Appendix

Table 4. Efficient Algorithms for various cancer types

Algorithm: Integrated Gene-Search Algorithm [1]
  Embedded algorithms: Genetic Algorithm; Correlation-based heuristics; Decision Tree; Support Vector Machine
  Cancer types: Ovarian; Prostate; Lung
  Comments: Can be successfully applied to other cancers such as colon, breast, bladder, and leukemia. High classification accuracy (94–98%).

Algorithm: An integrated algorithm for gene selection and classification [2]
  Embedded algorithms: Genetic Algorithm; Particle Swarm Optimization; Support Vector Machine; Analysis of Variance; Fuzzy Model
  Cancer types: Ovarian; Colon; Breast
  Comments: Superior performance for gene selection and classification (colon and breast, 99% accuracy).

Algorithm: Bootstrapped Genetic Algorithm and Support Vector Machine [4]
  Embedded algorithms: Genetic Algorithm; Support Vector Machine
  Cancer types: Colon; Leukemia
  Comments: Well suited for feature (gene) selection problems.

Algorithm: Novel Embedded Approach [9]
  Embedded algorithms: Information Gain; Relief Algorithm; t-statistics; Support Vector Machine; K Nearest Neighbour; Naïve Bayes; Neural Network; Decision Tree
  Cancer types: Lung; Prostate; Breast; Leukemia; Brain; Colon; Ovarian
  Comments: Support Vector Machines achieve accuracies > 85% in combination with Information Gain. Decision Trees are the worst model in accuracy.

Algorithm: Genetic Algorithm (GA) with an initial solution provided by t-statistics (t-GA) [14]
  Embedded algorithms: Genetic Algorithm; t-statistics; Decision Tree
  Cancer types: Colon; Leukemia; Lymphoma; Lung; Central Nervous System (CNS)
  Comments: Colon accuracy 89%; Leukemia accuracy 94%; Lymphoma accuracy 92%; Lung accuracy 98%; CNS accuracy 77%.

Algorithm: CAIMAN system (CAncer IMage ANalysis) [15]
  Embedded algorithms: Migration measurement; Vasculature tracing; Shading correction
  Cancer types: Cancer-related images
  Comments: More algorithms can be implemented.

Table 5.
Cancer Types Algorithms

Columns (cancer): Ovarian; Prostate; Lung; Colon; Breast; Bladder; Leukemia; Brain; Lymphoma; CNS
Rows (algorithm): Genetic Algorithm; Correlation-based heuristics; Decision Tree; Support Vector Machine; Particle Swarm Optimization; Analysis of Variance; Fuzzy Model; Information Gain; Relief Algorithm; t-statistics; K Nearest Neighbor; Naïve Bayes; Neural Network

Table 6. Common Feature Selection and Classifications Algorithms

Selection Algorithms                        Classification Algorithms
Genetic Algorithm (GA)                      Support Vector Machine (SVM)
Correlation-based feature selection (CFS)   Bootstrapped SVM
Particle Swarm Optimization (PSO)           K-Nearest Neighbors (KNN)
Analysis of Variance (ANOVA)                Naïve Bayes
Information Gain (IG)                       Neural Networks (NN)
Relief Algorithm (RA)                       Decision Tree (DT)
t-statistics (TA)                           Bagging and Stacking Algorithms
                                            Fuzzy Model
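Since the Genetic Algorithm is the selection method that appears most often in the tables above, a minimal sketch of the loop described in Section 3.1 (population, fitness-based selection, reproduction into a new population) may help. It rests on assumptions that are not taken from any cited paper: a chromosome is a 0/1 mask over genes, the `relevance` scores and the size `penalty` are invented, the fitness function is a hypothetical objective, and tournament selection, one-point crossover, bit-flip mutation, and elitism are just one common set of operator choices.

```python
import random


def fitness(mask, relevance, penalty):
    """Hypothetical objective: reward relevant genes, penalise every selected gene."""
    return sum(r - penalty for bit, r in zip(mask, relevance) if bit)


def tournament(population, scores, rng, size=3):
    """Return the fittest of `size` chromosomes drawn at random."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: scores[i])]


def evolve(relevance, penalty=1.0, pop_size=30, generations=60,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Evolve gene masks; returns (best mask, best fitness, best fitness per generation)."""
    rng = random.Random(seed)
    n = len(relevance)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(m, relevance, penalty) for m in population]
        elite = population[max(range(pop_size), key=lambda i: scores[i])]
        history.append(max(scores))
        next_population = [elite[:]]  # elitism: the best chromosome survives unchanged
        while len(next_population) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n - 1)  # one-point crossover
                child = parent_a[:cut] + parent_b[cut:]
            else:
                child = parent_a[:]
            # Bit-flip mutation: each position flips with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_population.append(child)
        population = next_population
    scores = [fitness(m, relevance, penalty) for m in population]
    best = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best])
    return population[best], scores[best], history


# Invented relevance scores for 20 genes; genes 2, 7 and 11 are the informative ones.
relevance = [0.1] * 20
for g in (2, 7, 11):
    relevance[g] = 5.0

best_mask, best_fit, history = evolve(relevance)
print("selected genes:", [g for g, bit in enumerate(best_mask) if bit])
print("best fitness:", best_fit)
```

In the hybrid methods surveyed here, the fitness would normally come from a classifier or a correlation-based score evaluated on the selected genes rather than from fixed per-gene scores.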