SUST Repository

HYBRID ENSEMBLE APPROACHES TO CLASSIFY IMBALANCED DATA

dc.contributor.author Fadel Elmola, Shaza Merghani Abd Elrahman
dc.contributor.author Supervisor, Ajith Abraham
dc.date.accessioned 2016-10-25T07:24:48Z
dc.date.available 2016-10-25T07:24:48Z
dc.date.issued 2016-03-10
dc.identifier.citation Fadel Elmola, Shaza Merghani Abd Elrahman. HYBRID ENSEMBLE APPROACHES TO CLASSIFY IMBALANCED DATA / Shaza Merghani Abd Elrahman Fadel Elmola; Ajith Abraham. - Khartoum: Sudan University of Science and Technology, College of Computer Science and Information Technology, 2016. - 110 p.: ill.; 28 cm. - PhD. en_US
dc.identifier.uri http://repository.sustech.edu/handle/123456789/14383
dc.description Thesis en_US
dc.description.abstract Class imbalance is one of the challenges in the fields of machine learning and data mining. An imbalanced data set degrades the performance of data mining and machine learning techniques because overall accuracy and decision-making become biased toward the majority class, which leads to minority class samples being misclassified or even treated as noise. The classification problem becomes more complicated when the class of interest is relatively rare and has a small number of instances compared to the majority class. Moreover, the cost of misclassifying the minority class is very high in comparison with the cost of misclassifying the majority class, as occurs in many real applications such as medical diagnosis, fraud detection, and network intrusion detection. In this dissertation, we started by investigating the two-class classification problem. A series of experiments was conducted using imbalanced data with its original distribution, data balanced using sampling methods, and meta-learning methods. Then, we developed a hybrid ensemble that applies multiple resampling methods at various rates. The experimental results on many real-world two-class imbalanced data sets confirm that the proposed hybrid ensembles have better performance under different evaluation measures. Next, we investigated the multi-class imbalance problem. A series of experiments was conducted using direct multi-class classification and meta-learning methods. We developed a hybrid Error Correcting Output Code ensemble utilizing weighted Hamming distance and the AdaBoost meta-learning method. The experimental results on many real-world multi-class imbalanced data sets show that our proposed hybrid ensemble improved classification performance on the minority classes and significantly outperformed the other tested methods. en_US
dc.description.sponsorship Sudan University of Science and Technology en_US
dc.language.iso en en_US
dc.publisher Sudan University of Science and Technology en_US
dc.subject Computer Science en_US
dc.subject HYBRID ENSEMBLE APPROACHES en_US
dc.subject IMBALANCED DATA en_US
dc.title HYBRID ENSEMBLE APPROACHES TO CLASSIFY IMBALANCED DATA en_US
dc.title.alternative طرق المجاميع الهجينة لتصنيف البيانات غير المتوازنة en_US
dc.type Thesis en_US