SUST Repository

A Model for Automatic Abstractive Multidocument Domain-Specific Summarization

dc.contributor.author Ahmed, Hadia Abbas Mohammed Elsied
dc.contributor.author Supervisor, - Naomie Binti Salim
dc.date.accessioned 2019-05-26T10:13:56Z
dc.date.available 2019-05-26T10:13:56Z
dc.date.issued 2019-03-01
dc.identifier.citation Ahmed, Hadia Abbas Mohammed Elsied. A Model for Automatic Abstractive Multidocument Domain-Specific Summarization / Hadia Abbas Mohammed Elsied Ahmed; Naomie Binti Salim. - Khartoum: Sudan University of Science & Technology, College of Computer Science and Information Technology, 2019. - 102 p.: ill.; 28 cm. - Ph.D. en_US
dc.identifier.uri http://repository.sustech.edu/handle/123456789/22661
dc.description Thesis en_US
dc.description.abstract Documents retrieved from the internet through online search often contain a large amount of text. In the context of news documents, different news sources reporting on the same event usually share common components that make up the main story. This study proposes a new model for multi-document abstractive summarization based on Semantic Role Labeling and Cross-document Structure Theory (SRL-CST). The study first preprocesses the texts, which includes sentence splitting, tokenization, stop-word elimination, and word stemming. It then applies Semantic Role Labeling (SRL) to each sentence and extracts the Predicate Argument Structures (PASs), which serve as the representation of the texts to be summarized. Since the study involves multiple documents, it further investigates the automatic identification of cross-document relations from unannotated text documents, for which a case-based reasoning (CBR) classification model is proposed. Cross-document relations are used to identify highly relevant sentences to include in the summary. Within CST, the researcher suggests merging relations with similar meanings into a single broader relation. Content selection for the summary proceeds as follows: each PAS is given a score calculated from the number of relation types it holds with other PASs, and the PASs are then combined according to CST-related rules suggested by the researcher so as to reduce redundancy. Next, the PASs are ranked by document number and by sentence position within that document. Lastly, the PASs whose scores fall in the top 20% are selected to form the final summary.
The system summaries are evaluated against human model summaries using the Pyramid method, and the results show that the proposed approach (AS-SRL-CST) yields better summarization results in terms of mean coverage score. en_US
dc.description.sponsorship Sudan University of Science and Technology en_US
dc.language.iso en en_US
dc.publisher Sudan University of Science & Technology en_US
dc.subject Automatic Abstractive en_US
dc.subject Multidocument en_US
dc.subject Domain-Specific en_US
dc.title A Model for Automatic Abstractive Multidocument Domain-Specific Summarization en_US
dc.title.alternative نموزج للتلخيص التلقائي للوثائق المتعددة في مجال محدد en_US
dc.type Thesis en_US
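
The content-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `PAS` class, its field names, the `select_summary` function, and the relation labels in the usage note are hypothetical, and the redundancy-reducing combination rules are omitted because the abstract does not specify them. The sketch assumes that a PAS's score is simply the number of distinct CST relation types it holds, that the top 20% by score are kept, and that the kept PASs are then ordered by document number and sentence position.

```python
from dataclasses import dataclass, field


@dataclass
class PAS:
    """Hypothetical container for one predicate argument structure."""
    doc_no: int        # number of the source document
    sent_pos: int      # position of the source sentence in that document
    text: str          # surface form of the PAS
    relation_types: set = field(default_factory=set)  # CST relation types held with other PASs


def score(pas):
    """Assumed scoring rule: count of distinct CST relation types."""
    return len(pas.relation_types)


def select_summary(pas_list, top_percent=20):
    """Keep the top-scoring PASs, then order them by document and position."""
    if not pas_list:
        return []
    # Integer ceiling of n * top_percent / 100, keeping at least one PAS.
    k = max(1, (len(pas_list) * top_percent + 99) // 100)
    # sorted() is stable, so ties keep their original input order.
    top = sorted(pas_list, key=score, reverse=True)[:k]
    return sorted(top, key=lambda p: (p.doc_no, p.sent_pos))
```

For example, given ten PASs of which one holds three relation types (say, the illustrative labels "identity", "overlap", and "elaboration") and another holds two, `select_summary` keeps those two and returns them in document order.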