Credit Scoring Using Data Mining Classification

El Hassan, Eiman Mohammed; Supervisor - Izzeldin Mohamed Osman

dc.contributor.author	El Hassan, Eiman Mohammed
dc.contributor.author	Supervisor - Izzeldin Mohamed Osman
dc.date.accessioned	2014-11-13T12:15:51Z
dc.date.available	2014-11-13T12:15:51Z
dc.date.issued	2014-07-10
dc.identifier.citation	El Hassan, Eiman Mohammed. Credit Scoring Using Data Mining Classification: Application on Sudanese Banks / Eiman Mohammed El Hassan; Izzeldin Mohammed Osman. - Khartoum: Sudan University of Science and Technology, Computer Science, 2014. - 248 p.: ill.; 28 cm. - M.Sc.	en_US
dc.identifier.uri	http://repository.sustech.edu/handle/123456789/8041
dc.description	Thesis	en_US
dc.description.abstract	The main aim of this thesis is to develop suitable, high-performance Credit Scoring Models (CSMs) to assess the credit risk of personal loans for Sudanese commercial banks using data mining techniques. Two Sudanese credit datasets were constructed from data provided by the Agricultural Bank of Sudan and Al Salam Commercial Bank. In addition, a German credit dataset was used as a benchmark. Three data mining classification techniques were employed: Artificial Neural Network (ANN), Support Vector Machine (SVM) and Decision Tree (DT). A Genetic Algorithm (GA) was also applied as a feature selection technique. Two validation methods were used to validate the proposed credit scoring models: split validation with two ratios (70:30 and 60:40) and 10-fold cross-validation. Combining GA with the specified classification techniques produced tables of attributes and their weights. Using these tables, a new reduced set of features was identified for each dataset (i.e. new reduced datasets were produced from the original datasets). Experiments were conducted in three stages. In stage 1, the classification techniques were applied individually to each dataset. In stage 2, these techniques were combined with GA. In stage 3, these techniques were applied to the reduced datasets. For each dataset, nine credit scoring models were developed in each stage. These models were compared for each dataset in terms of five evaluation measures: Accuracy, Precision (Defaulter), Precision (Non-defaulter), Type I error and Type II error. Based on these comparisons, the best models for each dataset were suggested. 
The experiments carried out in this research show that, for all datasets, combining GA as a wrapper feature selection technique with the ANN, SVM and DT classification techniques is more beneficial than applying these techniques individually. Applying the specified classification techniques to the reduced datasets does not bring a significant improvement for the majority of the models in terms of the five specified measures, compared to the models obtained by applying these techniques to the original datasets. In addition, as is well known, the performance of each technique depends heavily on the nature of the dataset.	en_US
dc.description.sponsorship	Sudan University of Science and Technology	en_US
dc.language.iso	en_US	en_US
dc.publisher	Sudan University of Science and Technology	en_US
dc.subject	Credit Scoring	en_US
dc.subject	Data Mining	en_US
dc.subject	Credit Scoring Models (CSMs)	en_US
dc.subject	Artificial Neural Network (ANN)	en_US
dc.subject	Support Vector Machine (SVM)	en_US
dc.subject	Decision Tree (DT)	en_US
dc.subject	Genetic Algorithm (GA)	en_US
dc.title	Credit Scoring Using Data Mining Classification	en_US
dc.title.alternative	تصنيف الإئتمان بإستخدام تقنيات تنقيب البيانات	en_US
dc.type	Thesis	en_US