Please use this identifier to cite or link to this item: https://repository.sustech.edu/handle/123456789/12809
Full metadata record
DC Field | Value | Language
dc.contributor.author | Mohammed, Ebtihal Mustafa Elamin |
dc.contributor.author | Al-faki, Ali Ahmed (Supervisor) |
dc.date.accessioned | 2016-02-21T10:18:09Z |
dc.date.available | 2016-02-21T10:18:09Z |
dc.date.issued | 2015-11-01 |
dc.identifier.citation | Mohammed, Ebtihal Mustafa Elamin. Term Translation Disambiguation in Cross-Language Information Retrieval: Translation from Arabic to English / Ebtihal Mustafa Elamin Mohammed; Ali Ahmed Al-faki. - Khartoum: Sudan University of Science and Technology, College of Computer Science and Information Technology, 2015. - 53 p.; 28 cm. - M.Sc. | en_US
dc.identifier.uri | http://repository.sustech.edu/handle/123456789/12809 |
dc.description | Thesis | en_US
dc.description.abstract | Cross-language information retrieval (CLIR), where queries and documents are in different languages, has become one of the major topics within the information retrieval community. The key step in CLIR is translation. This research proposes a term translation disambiguation method based on co-occurrence statistics for Arabic-English CLIR. There are multiple ways to perform query translation: employing machine translation techniques, using parallel corpora, or using bilingual dictionaries. The first two approaches are very labour intensive. Manual hand-coding of linguistic, semantic and pragmatic knowledge is required for a machine translation engine to produce good translations, which can be overwhelming when the domain of coverage is wide. A great deal of work is also required to build parallel collections for the second approach. With the increasing availability of machine-readable bilingual dictionaries, the third approach has become viable for CLIR, but resolving term ambiguity is a crucial step in it. In this research the ambiguity problem is resolved using co-occurrence statistics. The co-occurrence technique is based on the hypothesis that correct translations tend to co-occur in the target-language collection. Therefore, the valid translation among a set of possible synonymous candidates for a given source query term is expected to co-occur frequently with the translations of the other terms in the same source query. The document set is first divided into fixed-size windows to overcome the problem of varying document length. The degree of association is then calculated using the mutual information measure, because it is simple and produces a high correlation between terms even when they do not appear very frequently in the document set. The results show that co-occurrence statistics can reduce the ambiguity problem and that the method works well in cases involving diacritics and homonyms. | en_US
dc.description.sponsorship | Sudan University of Science and Technology | en_US
dc.language.iso | en_US | en_US
dc.publisher | Sudan University of Science and Technology | en_US
dc.subject | Term Translation disambiguation | en_US
dc.subject | Cross-Language Information Retrieval | en_US
dc.subject | Translation From Arabic To English | en_US
dc.title | Term Translation disambiguation in Cross-Language Information Retrieval | en_US
dc.title.alternative | إزالة غموض الترجمة بإستخدام معامل الإرتباط في انظمة استرجاع المعلومات بين اللغات | en_US
dc.type | Thesis | en_US
Appears in Collections:Masters Dissertations : Computer Science and Information Technology
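
The disambiguation idea summarised in the abstract (fixed-size windows, mutual information between translation candidates, picking the candidate that associates most strongly with the translations of the other query terms) can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the toy English corpus, the two-entry bilingual dictionary, the window size of 5, the use of pointwise mutual information, and all function names are assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def split_into_windows(tokens, size):
    """Cut a token list into consecutive, non-overlapping windows of `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def window_counts(documents, size):
    """Count in how many windows each term and each unordered term pair occurs."""
    term_counts, pair_counts, n_windows = Counter(), Counter(), 0
    for doc in documents:
        for window in split_into_windows(doc.lower().split(), size):
            vocab = sorted(set(window))
            term_counts.update(vocab)
            pair_counts.update(combinations(vocab, 2))
            n_windows += 1
    return term_counts, pair_counts, n_windows


def mutual_information(a, b, term_counts, pair_counts, n_windows):
    """Pointwise mutual information of a and b over windows.

    Assumption for this sketch: pairs that never share a window score 0.0.
    """
    joint = pair_counts[tuple(sorted((a, b)))]
    if joint == 0:
        return 0.0
    return math.log2(joint * n_windows / (term_counts[a] * term_counts[b]))


def disambiguate(query_candidates, term_counts, pair_counts, n_windows):
    """For each source term, keep the candidate with the highest total
    association with the candidates of all other source terms."""
    chosen = {}
    for source, candidates in query_candidates.items():
        others = [c for s, cs in query_candidates.items() if s != source for c in cs]
        chosen[source] = max(
            candidates,
            key=lambda cand: sum(
                mutual_information(cand, o, term_counts, pair_counts, n_windows)
                for o in others
            ),
        )
    return chosen


# Hypothetical target-language (English) collection.
documents = [
    "the spring water is cold and the village drinks it",
    "clean water from the spring feeds the small river",
    "the doctor examined his eye after the long flight",
    "she rinsed her eye with water and felt better",
]

# Hypothetical dictionary lookups for the Arabic query terms.
query_candidates = {
    "عين": ["eye", "spring"],  # ambiguous source term
    "ماء": ["water"],
}

term_counts, pair_counts, n_windows = window_counts(documents, size=5)
result = disambiguate(query_candidates, term_counts, pair_counts, n_windows)
print(result)
```

On this toy collection, "spring" shares a window with "water" twice while "eye" never does (in the last document the two words fall into different windows), so the sketch keeps "spring" for the ambiguous term. A real system would need a much larger collection and a tuned window size.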

Files in This Item:
File | Description | Size | Format
Ebtihal Mustafa.pdf | Research | 1.94 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.