Please use this identifier to cite or link to this item: https://repository.sustech.edu/handle/123456789/4807
Title: Predicting Lung Cancer Survivability Using Support Vector Machine
Other Titles: Predicting the Survivability of Lung Cancer Patients Using the Support Vector Machine Algorithm
Authors: Mukhtar, Fatima Mohamed
Supervisor - Mohamed Elhafiz Mustafa Musa
Keywords: Cancer
database
Cancer Survivability
Support Vector Machine
Data mining
SVM
Issue Date: 30-Oct-2013
Publisher: Sudan University of Science and Technology
Citation: Mukhtar, Fatima Mohamed. Predicting Lung Cancer Survivability Using Support Vector Machine - A Case Study of SEER database / Fatima Mohamed Mukhtar; Mohamed Elhafiz Mustafa. - Khartoum: Sudan University of Science and Technology, Computer Science, 2013. - 35 p.: ill.; 28 cm. - M.Sc.
Abstract: Data mining is the process of analyzing large quantities of data and summarizing them into useful information. In medical diagnosis, the role of data mining is increasing rapidly. Classification algorithms in particular are very helpful for classifying patient data, which is important in the decision-making process of medical practitioners. In this study, a Support Vector Machine (SVM) based classifier has been trained and tested to predict the 5-year survivability of lung cancer patients. The dataset used in this study consists of information about lung cancer patients collected by SEER. Preprocessing techniques have been applied to prepare the raw dataset and identify the relevant attributes for classification. The dataset is pre-classified into survived (11.3%) and not-survived (88.7%) cases. The purpose of this research is to verify the predictive effectiveness of the SVM algorithm on real, historical data. We used the Weka tool to train and test the classifier, using two implementations of SVM available in Weka: Sequential Minimal Optimization (SMO) and Library for Support Vector Machines (LIBSVM). The results show only slight differences in accuracy between these two training algorithms, but a difference in execution time. The accuracy of the proposed system (SMO and LIBSVM) is better than what is reported in the literature for classifiers trained on the same dataset. The results also indicate that SMO and LIBSVM are not robust against an imbalanced dataset.
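Illustrative note on the class split: with 11.3% survived and 88.7% not-survived, accuracy on its own can hide a classifier's weakness on the minority class. The minimal sketch below is not taken from the thesis. It assumes a hypothetical sample of 1000 patients with the reported split and a trivial baseline that always predicts "not-survived"; the function and variable names are invented for illustration.

```python
def metrics(y_true, y_pred, positive):
    """Return accuracy, recall, specificity and balanced accuracy for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # minority-class hit rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0     # majority-class hit rate
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Hypothetical sample: 1000 patients with the class split stated in the abstract.
N = 1000
N_SURVIVED = 113  # 11.3% of 1000
y_true = ["survived"] * N_SURVIVED + ["not-survived"] * (N - N_SURVIVED)

# Trivial baseline (an assumption for illustration): always predict the majority class.
y_pred = ["not-survived"] * N

baseline = metrics(y_true, y_pred, positive="survived")
for name, value in baseline.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the baseline reaches an accuracy of 0.887 while identifying none of the survived cases, so per-class measures such as recall or balanced accuracy are a useful complement to accuracy when reading results on this kind of dataset.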
Description: Thesis
URI: http://repository.sustech.edu/handle/123456789/4807
Appears in Collections: Masters Dissertations : Computer Science and Information Technology

Files in This Item:
File: Predicting Lung Cancer ....pdf (Restricted Access)
Description: Research
Size: 366.22 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.