Abstract:
Renal failure has recently become widespread all over the world, especially in
Sudan, as indicated by WHO reports. It is therefore necessary to use all
available scientific methods to study the factors that lead to the
disease and to predict it at an early stage, so as to limit its spread. In this research,
data mining techniques were used to study and determine the factors that lead to
Chronic Kidney Disease in its early stages, and to build models to predict the
disease using the selected features. The data used in this research were collected from a
Medical Center for Renal Failure Treatment in India. The WEKA machine learning
software was used for all data mining operations, including data
exploration, feature selection, and model development. Supervised machine
learning algorithms, such as Naïve Bayes, Random Forest, C4.5 Tree and Neural
Networks, were used to select the important features and develop the models.
Several models were built with these algorithms; each model achieved high
accuracy and gave results that physicians found acceptably interpretable.
This research encourages other researchers to work intensively in this field
by forming research groups of data scientists and physicians to solve such
problems using real patient data.