Abstract:
The tax gap is defined as the difference between what taxpayers legally owe and what they voluntarily pay. This thesis attempts to identify a new approach that can help the taxation chamber detect taxpayers whose tax returns may require auditing. The dataset used in this thesis (the taxpayers dataset) contains 52,568 observations and represents the historical data of taxpayers’ monthly reports. A model based on the K-means clustering algorithm is developed to find patterns in these reports. The experiments showed that the appropriate number of clusters is 10, and that these clusters categorize the monthly report observations as high, medium, or low tax-gap observations. The identified pattern could allow auditing staff to focus on just 3% of the taxpayer data, namely the observations belonging to the high and medium tax-gap categories. In addition, the reduced workload would allow better use of the available human, financial, and technical resources.
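
As an illustration of the clustering step summarized above, the following is a minimal sketch of K-means (Lloyd's algorithm) applied to a single tax-gap feature. It is not the model developed in the thesis: the toy values in `gaps`, the function names `kmeans` and `initial_centroids`, the evenly spaced initialization, and the use of 3 clusters instead of 10 are all assumptions made only to keep the example small and reproducible.

```python
def initial_centroids(points, k):
    """Pick k starting centroids evenly spaced through the sorted points.

    Assumption for this sketch: a deterministic start is used instead of
    the random initialization that K-means implementations usually apply.
    """
    ordered = sorted(points)
    n = len(ordered)
    return [ordered[i * (n - 1) // max(k - 1, 1)] for i in range(k)]


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=100):
    """Cluster `points` (tuples of numbers) into k groups with Lloyd's algorithm."""
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical monthly tax-gap values (amount owed minus amount paid).
gaps = [0, 1, 50, 2, 200, 1, 52, 0, 210, 48, 2]
points = [(float(g),) for g in gaps]

labels, centroids = kmeans(points, k=3)

# Rank clusters by centroid value and name them low, medium, high.
ranked = sorted(range(3), key=lambda c: centroids[c])
level_of = dict(zip(ranked, ["low", "medium", "high"]))
levels = [level_of[label] for label in labels]

# Share of observations that would be passed on for auditing.
flagged_share = sum(level != "low" for level in levels) / len(levels)
print(levels)
print(flagged_share)
```

In this toy data the flagged share is large because the values were chosen to make the three groups easy to see; the 3% figure reported in the abstract comes from the thesis dataset and is not reproduced here.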