SUST Repository

Designing Broken Plurals Processing Method for Enhancing the Performance of Arabic Information Retrieval Systems

dc.contributor.author Mahmoud, MohamedAlmoayed TajAlsir MohamedSaeed
dc.contributor.author Supervised - Albaraa Abuobieda Mohamed Ali
dc.date.accessioned 2015-12-16T08:46:01Z
dc.date.available 2015-12-16T08:46:01Z
dc.date.issued 2015-11-04
dc.identifier.citation Mahmoud, MohamedAlmoayed TajAlsir MohamedSaeed. Designing Broken Plurals Processing Method for Enhancing the Performance of Arabic Information Retrieval Systems / MohamedAlmoayed TajAlsir MohamedSaeed Mahmoud ; Albaraa Abuobieda Mohamed Ali. - Khartoum: Sudan University of Science and Technology, Faculty of Computer Science and Information Technology, 2015. - 72 p. : ill. ; 28 cm. - M.Sc. en_US
dc.identifier.uri http://repository.sustech.edu/handle/123456789/12298
dc.description Thesis en_US
dc.description.abstract Information Retrieval (IR) is an area of computer science closely associated with the Internet. It is concerned with indexing, searching, and retrieving the information and documents that a user query requires. Search engines and e-library systems are examples of Information Retrieval Systems (IRS). IRS face fundamental challenges in some languages, especially Arabic, because it is a morphologically rich language. Plurals in Arabic are divided into two types: Sound Plurals (SP) and Broken Plurals (BP). An IRS can identify sound plurals easily because the structure of the word is preserved between its singular and plural forms. In contrast, an IRS fails to recognize BP because the structure of the word changes between its singular and plural forms. This negatively affects indexing in Arabic IR. For instance, if a user types a query containing a plural form, the system retrieves the documents that contain the plural form but misses documents that contain the singular form of the same word, which should also be retrieved. BP identification is therefore one of the challenges facing Arabic IRS: it causes the loss of relevant documents and consequently reduces the accuracy of Arabic IRS. This study explores how Arabic BP pose a challenge to Arabic IRS and proposes a method, based on the analysis of words, to solve the problem of BP identification and retrieval. The proposed method consists of three stages: preprocessing, BP identification, and query expansion. The study covers three patterns of the syntax of Montaha Jemoa (SMJ): Tfaaeel (تفاعيل), Faaeel (فعاعيل), and Fyaeel (فياعيل). The results were evaluated by comparing the baseline system before applying the proposed method with the same system after applying it. The findings show that the study successfully identifies broken plural words and improves retrieval and precision. en_US
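The three stages named in the abstract (preprocessing, BP identification, query expansion) could be sketched roughly as follows. This is an illustrative Python sketch only, not the thesis's implementation: the function names, the positional template test, the rule for stripping the definite article, and the tiny `SINGULARS` lookup table are all assumptions made for this example. The pattern labels reuse the abstract's transliterations.

```python
import re

# Arabic short-vowel marks (U+064B..U+0652) and tatweel (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

# Hypothetical plural -> singular lookup; a real system would need a full lexicon.
SINGULARS = {
    "تماثيل": "تمثال",
    "دكاكين": "دكان",
    "شياطين": "شيطان",
}


def preprocess(token):
    """Stage 1 (assumed): strip diacritics and a leading definite article."""
    token = _DIACRITICS.sub("", token)
    if token.startswith("ال") and len(token) > 5:
        token = token[2:]
    return token


def match_pattern(word):
    """Stage 2 (assumed): test a word against the three six-letter templates.

    All three templates have alef in position 3 and yaa in position 5.
    Returns the pattern label, or None if the word fits none of them.
    """
    if len(word) != 6 or word[2] != "ا" or word[4] != "ي":
        return None
    if word[1] == "ي":
        return "Fyaeel"   # second letter is yaa
    if word[1] == word[3]:
        return "Faaeel"   # second and fourth letters are identical
    if word[0] == "ت":
        return "Tfaaeel"  # word begins with taa
    return None


def expand_query(query, singulars=SINGULARS):
    """Stage 3 (assumed): add the singular form next to each recognized BP."""
    terms = []
    for token in query.split():
        word = preprocess(token)
        terms.append(word)
        if match_pattern(word) and word in singulars:
            terms.append(singulars[word])
    return terms
```

With these assumptions, `expand_query("التماثيل القديمة")` yields the plural term, its singular from the lookup table, and the remaining query term unchanged. A template test alone will also match words that are not broken plurals, so a practical system would need an additional check; the lookup table plays that role here.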
dc.description.sponsorship Sudan University of Science and Technology en_US
dc.language.iso en en_US
dc.publisher Sudan University of Science & Technology en_US
dc.subject Computer Science en_US
dc.subject Arabic Information Retrieval Systems en_US
dc.title Designing Broken Plurals Processing Method for Enhancing the Performance of Arabic Information Retrieval Systems en_US
dc.title.alternative تصميم طريقة لمعالجة جموع التكسير لتحسين أداء نظم إسترجاع المعلومات العربية en_US
dc.type Thesis en_US

