SUST Repository

Classification Algorithms Comparison


dc.contributor.author Abd alkreem, Musab Abd algader
dc.contributor.author Supervisor - Mohamed Elhafiz Mustafa Musa
dc.date.accessioned 2015-02-16T11:40:30Z
dc.date.available 2015-02-16T11:40:30Z
dc.date.issued 2014-09-22
dc.identifier.citation Abd alkreem, Musab Abd algader. Classification Algorithms Comparison - Case Study: Cancer Patients (SEER Data Set) / Musab Abd algader Abd alkreem; Mohamed Elhafiz Mustafa WAHBI. - Khartoum: Sudan University of Science and Technology, Computer Science, 2014. - 53 p.: ill.; 28 cm. - M.Sc. en_US
dc.identifier.uri http://repository.sustech.edu/handle/123456789/10579
dc.description Thesis en_US
dc.description.abstract Data mining is the automatic search of large volumes of data to discover patterns and trends that go beyond simple analysis. Data mining is also known as Knowledge Discovery in Data (KDD). This study investigates whether the survival rate, or survivability, of a given disease can be discovered by extracting knowledge from data related to that disease. Such an investigation requires a large data set; one such source is SEER [1] (Surveillance, Epidemiology, and End Results), which is a unique, reliable and essential resource for investigating the different aspects of cancer. In this study we investigated three data mining techniques: Multilayer Perceptron (MLP), K-nearest neighbor, and C4.5 decision trees. The goal is to find which technique predicts 5-year survivability of breast cancer most accurately. The SEER database (1973-2009, 657,712 records) was used. Starting from a previous study, we determined the commonly used variables; after preprocessing, 18 variables and 180,302 records remained. Weka was used to train and test the three techniques. The results show that the best technique is C4.5, with an accuracy of 95.6%; the second is K-nearest neighbor, with an accuracy of 95.4%; and the lowest is MLP, with an accuracy of 95.3%. en_US
dc.description.sponsorship Sudan University of Science and Technology en_US
dc.language.iso en_US en_US
dc.publisher Sudan University of Science and Technology en_US
dc.subject Classification Algorithms en_US
dc.subject Cancer Patients en_US
dc.subject SEER Data Set en_US
dc.subject Data mining en_US
dc.subject KDD en_US
dc.title Classification Algorithms Comparison en_US
dc.title.alternative Comparison of Classification Algorithms (مقارنه خوارزميات التصنيف) en_US
dc.type Thesis en_US
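
The abstract ranks the three techniques by classification accuracy on held-out data. As a rough illustration of that kind of evaluation, here is a minimal sketch in Python using only the standard library. It is not the thesis's Weka setup: it implements only a plain K-nearest-neighbor classifier, and the data are two synthetic clusters, not SEER records. The function names (`knn_predict`, `accuracy`, `make_toy_data`, `holdout_accuracy`) and all parameter values (`k=3`, a 30% test split) are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def make_toy_data(n_per_class=100, seed=0):
    """Two well-separated Gaussian clusters; purely synthetic, NOT SEER data."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in ((0, (0.0, 0.0)), (1, (5.0, 5.0))):
        for _ in range(n_per_class):
            xs.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
            ys.append(label)
    return xs, ys


def holdout_accuracy(xs, ys, test_fraction=0.3, k=3, seed=0):
    """Shuffle, split into train/test parts, and score k-NN on the held-out part."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    preds = [knn_predict(train_x, train_y, xs[i], k) for i in test_idx]
    return accuracy([ys[i] for i in test_idx], preds)


if __name__ == "__main__":
    xs, ys = make_toy_data()
    print(f"hold-out accuracy: {holdout_accuracy(xs, ys):.3f}")
```

Comparing several techniques in this style would mean scoring each classifier on the same held-out records and ranking the resulting accuracies, which is how the abstract's three figures are presented.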

