SUST Repository

Detecting Similarity Among Multiple Data Sources For Categorized DATA

dc.contributor.author Mohamed, Gamal Saad
dc.contributor.author Supervisor - Awad EL-Kareem Mohammed Yousof
dc.date.accessioned 2013-09-11T11:55:38Z
dc.date.available 2013-09-11T11:55:38Z
dc.date.issued 2012-09-01
dc.identifier.citation Mohamed, Gamal Saad. Detecting Similarity Among Multiple Data Sources / Gamal Saad Mohamed; Awad EL-Kareem Yousof. - Khartoum: Sudan University of Science & Technology, Computer Science, 2012. - 92p: ill; 28cm. - Ph.D. en_US
dc.identifier.uri http://repository.sustech.edu/handle/123456789/1506
dc.description Thesis en_US
dc.description.abstract Efficient techniques for detecting similar data across many data sources have become one of the most important and challenging issues in areas such as databases, bioinformatics, and data mining. In this research, a three-phase framework for similarity detection is proposed. In the first phase, data sources are collected from the web according to how they relate to a predetermined domain. The base source is the available data source that describes the domain. In the second phase, the collected sources are filtered to select those with a greater probability of containing data describing the domain, by examining the degree of similarity between the base source and each collected source (the "External Sources"). Only external sources whose simi_degree value is less than or equal to the average of the simi_degree values of all sources are selected. In the third phase, content similarity is examined between the base source and all the external sources selected in the second phase, using the proposed "Probability Measure", which gives a value that determines whether the content of an external source is similar to the content of the base source. Experimental results show that the proposed similarity framework can achieve better-quality results than conventional approaches. en_US
dc.description.sponsorship Sudan University of Science and Technology en_US
dc.language.iso en en_US
dc.publisher Sudan University of Science and Technology en_US
dc.subject Data management en_US
dc.title Detecting Similarity Among Multiple Data Sources For Categorized DATA en_US
dc.type Thesis en_US
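
The second-phase selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not say how simi_degree is computed, so the values below are hypothetical, and the function name select_external_sources and the source names are assumptions made for illustration only. The sketch applies just the stated rule: keep an external source when its simi_degree is less than or equal to the average over all sources.

```python
from statistics import mean


def select_external_sources(simi_degrees):
    """Keep sources whose simi_degree is <= the average over all sources.

    simi_degrees maps a source name to its simi_degree value. How that
    value is computed is not given in the abstract, so it is treated
    here as an already-known number.
    """
    threshold = mean(simi_degrees.values())
    return {name: d for name, d in simi_degrees.items() if d <= threshold}


# Hypothetical simi_degree values for three external sources.
degrees = {"source_a": 0.25, "source_b": 0.5, "source_c": 0.75}
selected = select_external_sources(degrees)
print(selected)
```

With these made-up values the average is 0.5, so source_a and source_b pass the filter and source_c is dropped. Only the selected sources would go on to the third-phase content comparison.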

