SUST Repository

Performance of Data Envelopment and Stochastic Frontier Models In the Presence of Misspecification-Multicollinearity-and Outliers

dc.contributor.author Ismail, Mohamed Abdel-Rahman
dc.date.accessioned 2013-11-26T06:53:55Z
dc.date.available 2013-11-26T06:53:55Z
dc.date.issued 2010-02-01
dc.identifier.citation Ismail, Mohamed Abdel-Rahman. Performance of Data Envelopment and Stochastic Frontier Models In the Presence of Misspecification- Multicollinearity- and Outliers / Mohamed Abdel-Rahman Ismail; Obaid Mahmood M. Alzabaee. - Khartoum: Sudan university of Science and Technology, College of Science, 2010. - 179p. : ill. ; 28cm. - PhD. en_US
dc.identifier.uri http://hdl.handle.net/123456789/2457
dc.description Thesis en_US
dc.description.abstract The literature on efficiency measurement is broadly divided into non-parametric and parametric methods. The two principal and most popular methods are Data Envelopment Analysis (DEA) and Stochastic Frontier (SF) models, representing the non-parametric and parametric approaches respectively. Many studies have compared DEA and SF models in empirical settings. However, the literature lacks empirical evidence on how closely the two approaches agree in measuring technical efficiency. This may be attributed to potential problems such as misspecification, multicollinearity, measurement errors, and outliers. This is the core research problem that this study attempted to solve. The primary objective of the study is to evaluate the performance of DEA and SF models in the presence of model misspecification, multicollinearity, and outliers. Since DEA and SF use mathematically distinct methods and assumptions to estimate efficiency, Monte Carlo simulation experiments were carried out to assess their relative performance. Each experiment was replicated one hundred times for three sample sizes: 25, 50, and 100. Three performance criteria, mean absolute deviation (MAD), median absolute deviation (MEDAD), and Spearman's correlation, were used to evaluate relative performance. The second objective of the study is to measure and evaluate the farm-level efficiency of wheat production in the Gezira scheme. The Monte Carlo estimates of SF efficiency are higher than the DEA estimates for all three sample sizes (25, 50, and 100), both in the presence and in the absence of random noise. The results confirm that omitting relevant variables leads to underestimated SF and DEA efficiency scores. Likewise, including irrelevant variables leads to overestimated SF and DEA efficiency scores. However, the effects of either omitting relevant variables or including irrelevant variables are smaller on SF models than on DEA models. 
This reveals that SF models are more robust to both the omission of relevant variables and the inclusion of irrelevant variables. The results show that SF models generally perform better than DEA models in the presence of multicollinearity. Under severe multicollinearity, dropping one of the collinear variables improves SF efficiency estimates for all sample sizes. In contrast, dropping one collinear variable from the DEA model under severe multicollinearity increases bias. Omitting an uncorrelated independent variable leads to considerable bias in both DEA and SF efficiency scores. In the presence of outlying observations, DEA models underestimate efficiency scores, whereas SF models produce consistent efficiency estimates. Thus, SF models outperform DEA models in the presence of outliers for all sample sizes. In addition, the performance of SF models in the presence of outliers improves as the sample size grows. To reduce the influence of outlying observations on DEA models, the author proposed a modified model that transforms the independent variable, the dependent variable, or both into logarithmic form, depending on which variable contains the outlying observations. The results show that the bias of the modified DEA model is much smaller than that of traditional DEA models and close to that of SF models. The second part of the study aimed to measure the efficiency of farm-level wheat production in the Gezira scheme and to identify the determinants of inefficiency. The data were collected through a crop-cutting survey of 959 wheat farmers in the 2005/2006 season. One output (wheat production) and five input variables (farm size, seeds, irrigation water, nitrogen fertilizer, and super phosphate) were used to measure wheat production efficiency. Input- and output-oriented DEA-BCC models were developed to measure the efficiency of wheat production. 
The mean input-oriented BCC efficiency was 0.751, implying that an average farmer can reduce input use by 24.9% (1 − 0.751) and still maintain the same level of wheat production. The mean output-oriented BCC efficiency was 0.6133, implying that an average farmer can increase wheat production by 63.1% (1/0.6133 − 1) at current input levels. The mean scale efficiency (SE) scores for the input- and output-oriented DEA models were 0.762 and 0.938, with respective standard deviations of 0.177 and 0.070. The high SE values, particularly for the output-oriented BCC model, reveal that the inefficiency of wheat production in the Gezira scheme was mainly due to improper input use. Two SF models, the Half-Normal Normal and Truncated-Normal Normal models, were built with the same output and input variables as the DEA models, using two computational algorithms: the Newton and Steepest Descent methods. The results of the two models showed that the Half-Normal Normal model is the most suitable. The model results show that 56.8% to 58.3% of the variation in wheat output among farms was due to differences in technical efficiency. The mean technical efficiency of wheat production for the sample was 80%. This implies that the average farmer can increase wheat production by 25% without increasing input levels. Four Tobit regression models were developed to analyze how environmental variables and farmers' characteristics affect wheat production inefficiency scores. Twelve variables were assumed to influence wheat inefficiency: education, age, sowing date, variety (two dummy variables), preceding crop, finance, tillage, farm location, weeding, spraying, and zone. The variables that significantly affect wheat inefficiency vary from model to model. However, four variables have a significant impact on wheat technical inefficiency in all models: preceding crop, tillage, farm location, and zone. Of these, only tillage is under the control of the farmer. 
The implication is that farmers who cultivated after fallow, adopted the recommended tillage, and had farms at the head of the irrigation canal and in the northern groups of the scheme were more efficient in wheat production. en_US
dc.description.sponsorship Sudan university of Science and Technology en_US
dc.language.iso en en_US
dc.subject Data Envelopment Analysis en_US
dc.subject Stochastic systems en_US
dc.title Performance of Data Envelopment and Stochastic Frontier Models In the Presence of Misspecification-Multicollinearity-and Outliers en_US
dc.type Thesis en_US

