Abstract:
Cluster analysis is one of the major data mining methods; it is a convenient way
to identify homogeneous groups of objects, called clusters.
Two-Step is a clustering algorithm primarily designed to analyze large datasets.
It handles both categorical and real-valued data, and it also finds the optimal
number of clusters. The goal of this research is to study the practical performance
of the Two-Step algorithm using Khartoum State farms data.
In this study, the Two-Step clustering method is used to group Khartoum State farms
data into clusters based on the procedures that can be applied to these farms. Three
experiments were conducted, each of which produced a number of interesting results.
Moreover, data preprocessing was carried out on the raw data before the experiments.
In the second experiment, the records of the waiver procedure are split between
clusters 1 and 2. The records in cluster 1 represent farms owned by persons and
associations; the remaining records, in cluster 2, represent farms owned by
companies and institutions.
The records of the customize procedure are split between clusters 1 and 2. The
records in cluster 1 represent farms owned by persons. The remaining records, in
cluster 2, represent farms owned by companies, institutions, and associations.
The records of the renewal procedure are split between clusters 2 and 4. The records
in cluster 2 represent farms owned by associations and companies. The remaining
records, in cluster 4, represent farms owned by persons and institutions. One of the
important results of these experiments is that one cluster is stable (i.e., it does
not change across the experiments). This stable cluster contains few records;
however, it has the largest area and investment.
The experiments show that most replacement transactions occur in Omdurman, most
waiver transactions occur in Bahri, and most changes of area from agricultural to
residential use occur in Khartoum. The experiments also show that the largest
agricultural area is in Omdurman; however, Omdurman also contains the most unused
area.