Enhancing the Accuracy of Optical Character Recognition (OCR) for short text of Arabic language Image by implementing an algorithm to improve the Quality of Images

Mohamed, Ahmed Suliman Albashir; Supervisor: Mohammed Hamouda Karboos Hamid

URI: http://repository.sustech.edu/handle/123456789/27626

Date: 2022-05-22

Abstract:

Optical Character Recognition (OCR) plays a major role in understanding, learning, and recognizing language in the era of communication. OCR helps non-native speakers, and even non-human readers, to understand a language and recognize its texts, words, phrases, and structures. Although OCR provides a more accurate way to recognize texts, the Arabic language receives less interest and support in this field than other languages, especially English. This research aims to implement an algorithm that enhances the accuracy of Arabic text recognition by improving image quality. The algorithm applies a set of image processing operations to each image and repeats the process several times to achieve maximum accuracy, so that text recognition software can detect the text more easily. The approach was evaluated by applying it directly in an experimental environment. The average similarity rate of the texts recognized from the original, unmodified images ranged from 0.50 to 1. After enhancement, the average similarity rate ranged from 0.91 to 1, which is a much better result. The results also suggest that further improvements to image enhancement, together with the use of artificial intelligence, could yield an even higher similarity rate.
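The abstract does not list the specific image processing operations or define how the similarity rate is computed, so the following Python sketch is only an illustration of the general idea: apply candidate enhancement steps repeatedly and keep a result only when it raises the similarity between the recognized text and a known reference text. The names `similarity`, `scale_contrast`, `binarize`, and `enhance_for_ocr` are hypothetical, the two enhancement steps are stand-ins for whatever operations the thesis uses, and the similarity measure (the `difflib.SequenceMatcher` ratio) is an assumption. The OCR engine is passed in as a function, so any recognizer could be plugged in.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Sequence, Tuple

# Assumption: an image is a grid of grayscale pixel values in 0..255.
Image = List[List[int]]


def similarity(recognized: str, expected: str) -> float:
    """Similarity rate in [0, 1] between OCR output and the reference text.

    Assumption: the thesis does not define its metric; this uses the
    SequenceMatcher ratio from the standard library.
    """
    return SequenceMatcher(None, recognized, expected).ratio()


def scale_contrast(img: Image, factor: float = 1.5) -> Image:
    """Hypothetical step: stretch pixel values away from mid-gray (128)."""
    return [
        [max(0, min(255, int(128 + (p - 128) * factor))) for p in row]
        for row in img
    ]


def binarize(img: Image, threshold: int = 128) -> Image:
    """Hypothetical step: map every pixel to pure black or pure white."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


def enhance_for_ocr(
    img: Image,
    expected: str,
    ocr: Callable[[Image], str],
    steps: Sequence[Callable[[Image], Image]],
    max_rounds: int = 3,
) -> Tuple[Image, float]:
    """Repeat the enhancement steps, keeping only changes that help.

    Returns the best image found and its similarity rate.
    """
    best_img = img
    best_score = similarity(ocr(img), expected)
    for _ in range(max_rounds):
        improved = False
        for step in steps:
            candidate = step(best_img)
            score = similarity(ocr(candidate), expected)
            if score > best_score:
                best_img, best_score, improved = candidate, score, True
        # Stop early once a round brings no gain or the text matches exactly.
        if not improved or best_score == 1.0:
            break
    return best_img, best_score
```

In use, `ocr` would wrap a real recognizer and `expected` would be the known text of a test image; the returned score corresponds to the per-image similarity rate that the abstract averages over the image set.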