Abstract:
Missing data are widespread and pose problems for many statistical procedures. Researchers should use methods that treat missing data properly rather than deleting data or relying on single imputation. In particular, a researcher should choose the most appropriate analysis for the data in order to reach conclusions based on more accurate parameter estimates. To achieve this objective, an appropriate method for handling missing data must be chosen before the analysis begins.
This research presents a comparative study of the Multiple Imputation (MI) method against two other methods for estimating missing data: Regression Imputation and the Expectation-maximization (EM) algorithm.
The study is applied to randomly generated data from which values were deleted at different percentages (5%, 10%, 15%, 20% and 30%). SPSS was used as the statistical package for estimating and analyzing the data.
The generated data were tested using Little's test, on the basis of which the data were divided into missing completely at random and missing not completely at random. Based on descriptive statistics, the study found considerable differences between the means and variances of the estimated missing values. To test the statistical significance of these differences, the study used an ANOVA test, and the results showed no significant difference between the means.
The study also found that 98% of the correlations in the correlation matrix were not significant. Finally, the study compared the estimated missing values using the mean absolute error (MAE). Based on the results, the study concluded that the Expectation-maximization (EM) method of estimation is better than the other two methods at producing efficient estimates.
This study recommends that attention be paid to missing data in the design and conduct of studies and in the analysis of the resulting data. Sophisticated statistical analysis techniques should be applied only after maximal efforts have been made to reduce missing data, since missing data can cause bias and lead to invalid conclusions. The study also recommends using the Expectation-maximization (EM) method of estimation because its estimates are the most efficient.
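
The evaluation design described above (delete known values at fixed percentages, impute them, then score the estimates by MAE) can be illustrated with a minimal Python sketch. It is not the study's SPSS procedure and does not reproduce its results. The two-variable data model, the sample size, the seed, the function names, and the mean-imputation baseline are all assumptions made for illustration. Only regression imputation is shown; MI and EM are omitted.

```python
import numpy as np

def mean_absolute_error(true_values, estimates):
    """MAE between the deleted (known) values and their imputed estimates."""
    true_values = np.asarray(true_values, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    return float(np.mean(np.abs(true_values - estimates)))

def simulate(rate, n=500, seed=0):
    """Delete a fraction `rate` of y completely at random, impute, and score.

    Hypothetical data model (an assumption): y = 2x + noise.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 2.0 * x + rng.normal(scale=0.5, size=n)

    # Choose which y values to delete, independently of x and y
    # (missing completely at random).
    missing = np.zeros(n, dtype=bool)
    n_missing = int(round(rate * n))
    missing[rng.choice(n, size=n_missing, replace=False)] = True
    observed = ~missing

    # Regression imputation: fit y on x using complete cases only,
    # then predict the deleted values from their x.
    slope, intercept = np.polyfit(x[observed], y[observed], 1)
    regression_estimates = intercept + slope * x[missing]

    # Baseline (an assumption, not one of the study's methods):
    # replace every deleted value with the observed mean of y.
    mean_estimates = np.full(n_missing, y[observed].mean())

    return {
        "regression": mean_absolute_error(y[missing], regression_estimates),
        "mean": mean_absolute_error(y[missing], mean_estimates),
    }

if __name__ == "__main__":
    for rate in (0.05, 0.10, 0.15, 0.20, 0.30):
        scores = simulate(rate)
        print(f"{rate:.0%} missing: "
              f"regression MAE={scores['regression']:.3f}, "
              f"mean MAE={scores['mean']:.3f}")
```

Because the deleted values are known in a simulation, MAE can be computed directly against them. With real missing data this comparison is not possible.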