SUST Repository

Identifying Broken Plural in Arabic Information Retrieval Systems

dc.contributor.author Ahmed, Lojain Abdalhakeem
dc.contributor.author Osman, Manal Alshazali
dc.contributor.author Supervised, Ebtihal Mustafa Alameen
dc.date.accessioned 2017-02-15T06:03:14Z
dc.date.available 2017-02-15T06:03:14Z
dc.date.issued 2016-10-01
dc.identifier.citation Ahmed, Lojain Abdalhakeem. Identifying Broken Plural in Arabic Information Retrieval Systems / Lojain Abdalhakeem Ahmed, Manal Alshazali Osman; Ebtihal Mustafa Alameen.- Khartoum: Sudan University of Science & Technology, College of Computer Science, 2016.- 73p.: ill.; 28cm.- Search Bachelor. en_US
dc.identifier.uri http://repository.sustech.edu/handle/123456789/15544
dc.description Search Bachelor en_US
dc.description.abstract Arabic is one of the most widespread languages in the world and has only recently become associated with the internet. Information retrieval, one of the most important fields in computer science, is concerned with operations such as indexing, searching, and retrieving the information a user requires; search engines are examples of information retrieval systems (IRS). An IRS faces many challenges when searching in Arabic because the language is grammatically rich. Arabic plurals are divided into two types: Regular Plurals (RP) and Irregular/Broken Plurals (BP). An IRS can identify a regular plural because it maintains the basic structure of the word, but it fails to identify a BP because the basic structure of the word changes between the singular and plural forms. This harms indexing: if a user types a query that contains a BP, the system retrieves only the documents that contain the plural form and loses the documents that contain the singular form, which should also be retrieved. Failure to identify BP is therefore one of the challenges facing Arabic IRS, causing document loss and inaccurate results. This study aims to explain how significant a challenge BP is to Arabic IRS, and proposes a method to recognize BP and increase Recall without affecting Precision. The proposed method consists of five stages: pattern recognition, word recognition, singular candidates, selecting the right singular form, and expanding the query. The study covers only one BP pattern, (فعاليل). The results were compared with the system baseline before and after applying the proposed method. Based on these results, the study successfully identified BP and enhanced retrieval. en_US
dc.description.sponsorship Sudan University of Science and Technology en_US
dc.language.iso other en_US
dc.publisher Sudan University of Science & Technology en_US
dc.subject INFORMATION TECHNOLOGY en_US
dc.subject Identifying Broken en_US
dc.subject Arabic en_US
dc.subject Information Retrieval en_US
dc.title Identifying Broken Plural in Arabic Information Retrieval Systems en_US
dc.title.alternative التعرف على جموع التكسير في نظم استرجاع المعلومات العربية en_US
dc.type Thesis en_US
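
The five stages named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which this record does not describe. It assumes that the (فعاليل) pattern has the undiacritized surface shape C1 C2 ا C3 ي C4, that singular candidates are formed by dropping the two pattern letters and inserting one long vowel before the last consonant, and that the right singular is the first candidate found in a known vocabulary (for example, the index terms). All function and variable names here are hypothetical.

```python
import re

# Any single Arabic letter (U+0621..U+064A); diacritics are assumed to be stripped.
LETTER = "[\u0621-\u064A]"

# Assumed surface shape of the pattern: C1 C2 ا C3 ي C4.
FAALEEL = re.compile(f"({LETTER})({LETTER})ا({LETTER})ي({LETTER})")

# Long vowels tried before the final consonant of a candidate (an assumption).
LONG_VOWELS = "اوي"


def singular_candidates(word):
    """Stages 1-3: match the pattern and build candidate singular forms."""
    match = FAALEEL.fullmatch(word)
    if match is None:
        return []
    c1, c2, c3, c4 = match.groups()
    return [c1 + c2 + c3 + vowel + c4 for vowel in LONG_VOWELS]


def select_singular(candidates, vocabulary):
    """Stage 4: keep the first candidate that is a known word, if any."""
    for candidate in candidates:
        if candidate in vocabulary:
            return candidate
    return None


def expand_query(terms, vocabulary):
    """Stage 5: add the selected singular next to each recognized plural."""
    expanded = []
    for term in terms:
        expanded.append(term)
        singular = select_singular(singular_candidates(term), vocabulary)
        if singular is not None and singular not in expanded:
            expanded.append(singular)
    return expanded


if __name__ == "__main__":
    vocabulary = {"مفتاح", "عصفور", "كتاب"}
    print(expand_query(["مفاتيح", "كتاب"], vocabulary))
```

In this sketch a word counts as a recognized broken plural only when one of its candidates appears in the vocabulary, so a word that merely matches the shape leaves the query unchanged. The vocabulary check also matters because a single plural can yield more than one real word among its candidates: مفاتيح produces both مفتاح and مفتوح.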

