Please use this identifier to cite or link to this item:
https://repository.sustech.edu/handle/123456789/10579
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Abd alkreem, Musab Abd algader | |
dc.contributor.author | Supervisor - Mohamed Elhafiz Mustafa Musa | |
dc.date.accessioned | 2015-02-16T11:40:30Z | |
dc.date.available | 2015-02-16T11:40:30Z | |
dc.date.issued | 2014-09-22 | |
dc.identifier.citation | Abd alkreem, Musab Abd algader. Classification Algorithms Comparison - Case Study: Cancer Patients (SEER Data Set) / Musab Abd algader Abd alkreem ; Mohamed Elhafiz Mustafa WAHBI. - Khartoum : Sudan University of Science and Technology, Computer Science, 2014. - 53 p. : ill. ; 28 cm. - M.Sc. | en_US |
dc.identifier.uri | http://repository.sustech.edu/handle/123456789/10579 | |
dc.description | Thesis | en_US |
dc.description.abstract | Data mining is the automated search of large volumes of data to discover patterns and trends that go beyond simple analysis; it is also known as Knowledge Discovery in Data (KDD). This study investigates whether the survival rate, or survivability, of a given disease can be determined by extracting knowledge from data related to that disease. Such an investigation requires a large data set. One such source is SEER [1] (Surveillance, Epidemiology, and End Results), a unique, reliable, and essential resource for investigating different aspects of cancer. We compared three data mining techniques, Multilayer Perceptron (MLP), K-nearest neighbor, and C4.5 decision trees, to find which most accurately predicts 5-year survivability of breast cancer. The SEER database (1973-2009, 657,712 records) was used. Starting from previous work, we identified the commonly used variables; after preprocessing, 18 variables and 180,302 records remained. Weka was used to train and test the three techniques. The results show that C4.5 performed best with an accuracy of 95.6%, followed by K-nearest neighbor at 95.4%, while MLP performed worst at 95.3%. | en_US |
dc.description.sponsorship | Sudan University of Science and Technology | en_US |
dc.language.iso | en_US | en_US |
dc.publisher | Sudan University of Science and Technology | en_US |
dc.subject | Classification Algorithms | en_US |
dc.subject | Cancer Patients | en_US |
dc.subject | SEER Data Set | en_US |
dc.subject | Data mining | en_US |
dc.subject | KDD | en_US |
dc.title | Classification Algorithms Comparison | en_US |
dc.title.alternative | مقارنه خوارزميات التصنيف (Comparison of Classification Algorithms) | en_US |
dc.type | Thesis | en_US |
Appears in Collections: | Masters Dissertations : Computer Science and Information Technology |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Classification Algorithms Comparison ... .pdf | Title | 57.64 kB | Adobe PDF |
Research.pdf | Research | 487.28 kB | Adobe PDF |