Please use this identifier to cite or link to this item: https://repository.sustech.edu/handle/123456789/27430
Title: Imbalanced data classification Enhancement Using SMOTE and NearMiss sampling Techniques
Other Titles: تحسين دقة تصنيف البيانات غير المتوازنة باستخدام تقنيتي (Improving the classification accuracy of imbalanced data using the two techniques of)
Authors: Babikir, Mayamin Tilal Abdelrahim
Supervisor: Wafaa Faisal
Keywords: Information Technology
Computer Science and Information Technology
Imbalanced data classification Enhancement
SMOTE and NearMiss sampling Techniques
Issue Date: 27-Jul-2022
Publisher: Sudan University of Science & Technology
Citation: Babikir, Mayamin Tilal Abdelrahim. Imbalanced data classification Enhancement Using SMOTE and NearMiss sampling Techniques / Mayamin Tilal Abdelrahim Babikir; Wafaa Faisal. - Khartoum: Sudan University of Science & Technology, College of Computer Science and Information Technology, 2022. - 47 p.: ill.; 28 cm. - M.Sc.
Abstract: An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Real-world datasets are often composed predominantly of "normal" examples, with only a small percentage of "abnormal" or "interesting" examples. The cost of misclassifying an abnormal (interesting) example as a normal example is also often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This research shows that combining over-sampling of the minority (abnormal) class with under-sampling of the majority (normal) class can achieve better classifier performance. The methodology involves acquiring the dataset from the UCI repository, applying SVM and Random Forest classifiers, applying the SMOTE method, and evaluating classification accuracy before and after balancing.
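The combination of over-sampling and under-sampling described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy 2-D data, the target class size of 30, the neighbour count `k=3`, the NearMiss-1 selection rule, and the helper names `nearest`, `smote`, and `near_miss` are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def nearest(point, pool, k):
    """Return the k points in pool closest to point, excluding point itself."""
    others = [p for p in pool if p is not point]
    return sorted(others, key=lambda p: math.dist(point, p))[:k]


def smote(minority, n_new, k=3, seed=0):
    """SMOTE-style over-sampling: each synthetic point lies on the segment
    between a minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbour = rng.choice(nearest(base, minority, k))
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def near_miss(majority, minority, n_keep, k=3):
    """NearMiss-1-style under-sampling (assumed variant): keep the majority
    points with the smallest mean distance to their k nearest minority points."""
    def mean_dist_to_minority(p):
        closest = sorted(math.dist(p, m) for m in minority)[:k]
        return sum(closest) / len(closest)
    return sorted(majority, key=mean_dist_to_minority)[:n_keep]


# Hypothetical toy data: 90 "normal" points and 10 "abnormal" points.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)]
minority = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(10)]

# Balance both classes at an assumed target size of 30.
target = 30
minority_balanced = minority + smote(minority, target - len(minority))
majority_balanced = near_miss(majority, minority, target)

print(len(majority), len(minority))                    # class sizes before
print(len(majority_balanced), len(minority_balanced))  # class sizes after
```

A classifier would then be trained on `majority_balanced + minority_balanced` and compared against one trained on the original data. On imbalanced data, overall accuracy alone can hide poor minority-class recall, so per-class measures are worth reporting alongside it.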
Description: Thesis
URI: http://repository.sustech.edu/handle/123456789/27430
Appears in Collections: Masters Dissertations : Computer Science and Information Technology

Files in This Item:
File: Imbalanced data ....pdf (Restricted Access)
Description: Research
Size: 731.19 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.