Abstract:
Missing data in household health surveys are a problem for researchers because they lead to incomplete analysis. Cluster analysis was applied to the data collected in Sudan's 2006 household health survey.
This research focuses specifically on the analysis of the collected data, and its objective is to deal with missing values in cluster analysis. Two-Step Cluster Analysis is applied, in which each participant is classified into one of the identified patterns and the optimal number of clusters is determined using IBM SPSS Statistics. As in other multivariate statistical techniques, any observation with missing data is excluded from the cluster analysis. Therefore, before the cluster analysis is performed, missing values are imputed using multiple imputation (IBM SPSS Statistics). The clustering results are displayed in tables: descriptive statistics and cluster frequencies are produced for the final cluster model, while the information criterion table displays results for a range of cluster solutions.
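The workflow described above (impute first, then cluster, then compare an information criterion across a range of cluster counts) can be illustrated with a minimal Python sketch. It is not the SPSS procedure: a simple random hot-deck draw stands in for SPSS multiple imputation, plain k-means stands in for Two-Step Cluster Analysis, the BIC-style score is a rough approximation, and averaging the score over imputed data sets is a simplification. The `survey` records and all function names are hypothetical.

```python
import math
import random

def complete_cases(rows):
    """Listwise deletion: keep only rows with no missing value."""
    return [row for row in rows if None not in row]

def impute_once(rows, rng):
    """Fill each None with a random observed value from the same column
    (a simple hot-deck stand-in, not the SPSS imputation model)."""
    observed = [[v for v in col if v is not None] for col in zip(*rows)]
    return [[rng.choice(observed[j]) if v is None else v
             for j, v in enumerate(row)] for row in rows]

def nearest(row, centers):
    """Index of the centre closest to row (Euclidean distance)."""
    return min(range(len(centers)), key=lambda c: math.dist(row, centers[c]))

def kmeans_sse(rows, k, rng, iterations=25):
    """Lloyd's k-means; returns the within-cluster sum of squared distances."""
    centers = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iterations):
        labels = [nearest(row, centers) for row in rows]
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return sum(math.dist(row, centers[nearest(row, centers)]) ** 2 for row in rows)

def bic(rows, k, rng):
    """Rough BIC-style score n*ln(SSE/n) + k*d*ln(n); lower is better."""
    n, d = len(rows), len(rows[0])
    sse = max(kmeans_sse(rows, k, rng), 1e-12)  # guard against log(0)
    return n * math.log(sse / n) + k * d * math.log(n)

# Hypothetical two-variable records; None marks a missing answer.
survey = [
    [1.0, 2.0], [1.2, None], [0.8, 2.1], [None, 1.9], [1.1, 2.2],
    [8.0, 9.0], [8.2, 9.1], [None, 8.8], [7.9, None], [8.1, 9.2],
]

rng = random.Random(0)
kept = complete_cases(survey)                                # rows surviving listwise deletion
imputed_sets = [impute_once(survey, rng) for _ in range(5)]  # m = 5 completed data sets

# Information-criterion table for k = 1..4, averaged over the imputed data sets.
bic_table = {k: sum(bic(data, k, rng) for data in imputed_sets) / len(imputed_sets)
             for k in range(1, 5)}
best_k = min(bic_table, key=bic_table.get)
print(len(kept), len(imputed_sets[0]), best_k)
```

In this toy data set, listwise deletion keeps only 6 of the 10 records, whereas each imputed data set retains all 10, which is the motivation for imputing before clustering.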
Furthermore, the objective is extended to include reducing the bias that arises because non-respondents may differ from those who participate, and bringing the sample data up to the target population totals.
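One common way to bring sample data up to population totals is post-stratification weighting, sketched below in Python. The abstract does not state which adjustment the study used, so this is an assumed illustration; the strata, the population totals, and the function name `poststratification_weights` are hypothetical.

```python
from collections import Counter

def poststratification_weights(sample_strata, population_totals):
    """Give each respondent the weight N_h / n_h, where N_h is the population
    total of the respondent's stratum and n_h is the number of respondents in it."""
    counts = Counter(sample_strata)
    unknown = set(counts) - set(population_totals)
    if unknown:
        raise ValueError(f"no population total for strata: {sorted(unknown)}")
    return [population_totals[s] / counts[s] for s in sample_strata]

# Hypothetical example: 3 urban and 2 rural respondents.
sample_strata = ["urban", "urban", "rural", "urban", "rural"]
population_totals = {"urban": 600, "rural": 900}

weights = poststratification_weights(sample_strata, population_totals)
print(weights)       # urban respondents get 600 / 3, rural respondents get 900 / 2
print(sum(weights))  # the weights sum to the population total of 1500
```

Strata that are under-represented among respondents receive larger weights, which reduces non-response bias only to the extent that respondents and non-respondents within the same stratum are similar.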