Please use this identifier to cite or link to this item:
https://repository.sustech.edu/handle/123456789/18613
Title: | A New Hierarchical Support Vector Machine based Model for Classification of Imbalanced Multi-class Data |
Other Titles: | نموذج جديد مرتكز على آلة المتجهات الداعمة لتصنيف البيانات غير المتوازنة متعددة الأصناف (A New Support Vector Machine Based Model for Classification of Imbalanced Multi-class Data) |
Authors: | Othman, Hanaa Sameeh A.Aziz; Supervisor: Mohamed ElHafiz Mustafa |
Keywords: | Philosophy; Computer Science; Classification of Imbalanced Multi-class Data |
Issue Date: | 10-Mar-2017 |
Publisher: | Sudan University of Science and Technology |
Citation: | Othman, Hanaa Sameeh A.Aziz. A New Hierarchical Support Vector Machine based Model for Classification of Imbalanced Multi-class Data / Hanaa Sameeh A.Aziz Othman; Mohamed ElHafiz Mustafa. - Khartoum: Sudan University of Science and Technology, College of Computer Science and Information Technology, 2017. - 133 p.: ill.; 28 cm. - PhD. |
Abstract: | Imbalanced multi-class learning is one of the challenging problems in supervised machine learning. The imbalanced nature of the data (a skewed distribution of samples across classes), together with its multi-class nature (an instance may be assigned to one of several classes), leads to serious problems in both learning and performance evaluation. The research problem can be summarized as finding more accurate classification results for this kind of data. The methodology is therefore based on proposing a new hierarchical classification method built on the Multi-Class Support Vector Machine (Multi-Class SVM). The model rebalances the data by grouping small classes into bigger classes (artificial classes). It then classifies the compound classes into their constituent classes at a later stage. Experiments were conducted on nine different multi-class imbalanced datasets from the UCI repository. The experiments show that the new hierarchical model improves the classification results compared with those of some state-of-the-art solutions, even when empowered with weights for minority instances, across four different performance metrics. They also show that the model not only treats the imbalance problem simply, without added computational effort or algorithmic modification, but also requires no data pre-processing step, as many other solutions do. No additional adaptation is therefore needed at either the data level or the algorithmic level. Moreover, the experiments showed that the model performs well even when the ratio between minority and majority samples is high. They also demonstrate that the model works better on datasets with a large number of classes and performs poorly on datasets with few classes that cannot be combined into artificial classes with nearly balanced numbers of examples. |
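The two-stage idea described in the abstract (merge small classes into artificial classes, then split each compound class back into its constituents) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the greedy grouping rule, the names `group_small_classes` and `HierarchicalClassifier`, and the data layout are all hypothetical, and a simple nearest-centroid classifier stands in for the Multi-Class SVM so that the sketch needs nothing beyond the Python standard library.

```python
from collections import Counter


def group_small_classes(labels):
    """Greedily pack classes, smallest first, into groups whose total size
    does not exceed the largest class. Hypothetical rule: the abstract does
    not say how the artificial classes are formed."""
    counts = Counter(labels)
    target = max(counts.values())
    groups, current, size = [], [], 0
    for cls, n in sorted(counts.items(), key=lambda kv: (kv[1], str(kv[0]))):
        if current and size + n > target:
            groups.append(tuple(current))
            current, size = [], 0
        current.append(cls)
        size += n
    groups.append(tuple(current))
    # Map every original class to the tuple of classes in its group.
    return {cls: g for g in groups for cls in g}


class NearestCentroid:
    """Stand-in base classifier (NOT an SVM): predicts the label whose
    mean feature vector is closest in squared Euclidean distance."""

    def fit(self, X, y):
        counts, sums = Counter(y), {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {l: [v / counts[l] for v in s] for l, s in sums.items()}
        return self

    def predict_one(self, x):
        def dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=dist)


class HierarchicalClassifier:
    """Stage 1 separates the (artificial) groups; stage 2 separates the
    original classes inside each group that holds more than one class."""

    def __init__(self, make_clf=NearestCentroid):
        self.make_clf = make_clf  # factory for the base classifier

    def fit(self, X, y):
        self.group_of = group_small_classes(y)
        top_y = [self.group_of[label] for label in y]
        self.top = self.make_clf().fit(X, top_y)
        self.sub = {}
        for g in set(top_y):
            if len(g) > 1:  # only compound classes need a second stage
                idx = [i for i, label in enumerate(y) if self.group_of[label] == g]
                self.sub[g] = self.make_clf().fit(
                    [X[i] for i in idx], [y[i] for i in idx]
                )
        return self

    def predict_one(self, x):
        g = self.top.predict_one(x)
        return self.sub[g].predict_one(x) if g in self.sub else g[0]
```

Because the base classifier is passed in as a factory, any object exposing the same `fit` and `predict_one` methods, for example a thin wrapper around an SVM implementation, could be substituted for the stand-in.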
Description: | Thesis |
URI: | http://repository.sustech.edu/handle/123456789/18613 |
Appears in Collections: | PhD theses : Computer Science and Information Technology |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
A New Hierarchical Support ....pdf | Research | 1.66 MB | Adobe PDF | View/Open |