Please use this identifier to cite or link to this item: https://repository.sustech.edu/handle/123456789/21226
Full metadata record
DC Field | Value | Language
dc.contributor.author | SHUMO, ALAA ISMAIEL IBRAHIM |
dc.contributor.author | SALIH, ESRA ADIL GALAL |
dc.contributor.author | KHALED, SAJDA LOTFY AHMED |
dc.contributor.author | ALBASHEER, SARA HASSABO ABDALLAH |
dc.contributor.author | Supervisor, -AHMED HAMZA ABDL-MONIEM HAMZA |
dc.date.accessioned | 2018-08-02T08:55:06Z |
dc.date.available | 2018-08-02T08:55:06Z |
dc.date.issued | 2017-10-02 |
dc.identifier.citation | SHUMO, ALAA ISMAIEL IBRAHIM. COMPARING THE PERFORMANCE OF APACHE SPARK AND APACHE HADOOP MAPREDUCE ON BIG DATA PROCESSING / ALAA ISMAIEL IBRAHIM SHUMO ... [et al.]; AHMED HAMZA ABDL-MONIEM HAMZA. - Khartoum: Sudan University of Science & Technology, College of Computer Science, 2017. - 110 p.: ill.; 28 cm. - Search Bachelor | en_US
dc.identifier.uri | http://repository.sustech.edu/handle/123456789/21226 |
dc.description | Search Bachelor | en_US
dc.description.abstract | The volume of data in the world is massive and grows every moment. Data that carries useful value, helping companies succeed and gain a competitive advantage, is called 'Big Data' because of its sheer Volume, Variety, Velocity and Veracity. This data may be unstructured, structured or semi-structured. Such large amounts of data created a need for new processing frameworks. Apache Hadoop MapReduce is a framework for processing large-scale datasets with parallel and distributed algorithms; it allows distributed processing of large data sets across clusters of computers using simple programming models. More recently, a framework called Apache Spark has emerged, focused on micro-batch data processing; its main feature is in-memory computation. In this research, we perform a comparative study of the performance of these two frameworks, using the BigDataBench tool to load datasets of up to 420 million records. Experimental results show that Spark performs better, with lower overall runtimes than Apache Hadoop MapReduce. | en_US
dc.description.sponsorship | Sudan University of Science & Technology | en_US
dc.language.iso | en | en_US
dc.publisher | Sudan University of Science and Technology | en_US
dc.subject | Computer Science | en_US
dc.subject | APACHE SPARK | en_US
dc.subject | APACHE HADOOP MAPREDUCE | en_US
dc.subject | BIG DATA PROCESSING | en_US
dc.title | COMPARING THE PERFORMANCE OF APACHE SPARK AND APACHE HADOOP MAPREDUCE ON BIG DATA PROCESSING | en_US
dc.type | Thesis | en_US
Appears in Collections:Bachelor of Computer Science and Information Technology

Files in This Item:
File | Description | Size | Format
COMPARING THE PERFORMANCE .....pdf | Research | 4.16 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.