Abstract:
Research on handwritten Arabic optical character recognition remains an active topic in artificial intelligence. This thesis covers thirty-four forms of Arabic letters: the twenty-eight basic letter forms and six additional forms of some letters. The dataset used in this research is the Isolated Handwritten Arabic Characters (IHAC) dataset, which was collected by the Arabic Language Technology Research Group at Sudan University of Science and Technology. To address the strong similarity between some Arabic letters, this thesis proposes a two-stage classification method. The first stage contains a single classifier that assigns the input letter to one of fifteen subgroups. The second stage contains a number of classifiers, one for each subgroup (for instance, the group ب ت ن ث has a classifier that outputs only one of these four letters).
A backpropagation neural network (BPNN) is used to design and train the classifiers. In the group stage, the system achieved a recognition rate of 78.77% on the testing dataset and 99.4% on the training dataset. One character-stage classifier has been tested and achieved a recognition rate of 92.77% on the testing dataset.
To address the overfitting problem, which is reflected in the gap between the testing and training results, several overfitting remedies have been tested, and their results are encouraging.
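
The two-stage routing summarized in this abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the class name TwoStageClassifier, the group identifier GROUP_BA, and the toy stand-in classifiers are all hypothetical, and the trained BPNN models are replaced by plain functions so that only the control flow is shown.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]


class TwoStageClassifier:
    """Route an input through a group classifier, then a per-group classifier."""

    def __init__(
        self,
        group_classifier: Callable[[Features], int],
        letter_classifiers: Dict[int, Callable[[Features], str]],
    ) -> None:
        # Stage 1: maps a feature vector to a subgroup id.
        self.group_classifier = group_classifier
        # Stage 2: one classifier per subgroup id.
        self.letter_classifiers = letter_classifiers

    def predict(self, features: Features) -> str:
        group_id = self.group_classifier(features)
        # Only the classifier of the predicted subgroup is consulted.
        return self.letter_classifiers[group_id](features)


# Hypothetical id for the subgroup containing ب ت ن ث.
GROUP_BA = 0


def toy_group_classifier(features: Features) -> int:
    # Assumption: a stand-in for a trained stage-one network.
    return GROUP_BA


def toy_ba_classifier(features: Features) -> str:
    # Assumption: a stand-in for a trained stage-two network. It treats the
    # first four features as scores and returns the highest-scoring letter.
    letters = ["ب", "ت", "ن", "ث"]
    best = max(range(len(letters)), key=lambda i: features[i])
    return letters[best]


model = TwoStageClassifier(toy_group_classifier, {GROUP_BA: toy_ba_classifier})
predicted = model.predict([0.1, 0.7, 0.1, 0.1])
print(predicted)
```

In this sketch, a wrong subgroup at stage one cannot be corrected at stage two, because the second-stage classifier can only output letters from its own subgroup.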