College of Computing, Engineering & Construction
Master of Science in Computer and Information Sciences (MS)
University of North Florida. School of Computing
Dr. Sherif Elfayoumy
Dr. Roger Eggen
Dr. Sanjay P. Ahuja
Dr. Asai Asaithambi
Dr. Mark A. Tumeo
There is currently considerable enthusiasm around the MapReduce paradigm and the distributed computing paradigm for the analysis of large volumes of data. Apache Hadoop is the most popular open-source implementation of the MapReduce model, and LINQ to HPC is Microsoft's alternative to open-source Hadoop. In this thesis, the performance of LINQ to HPC and Hadoop is compared using different benchmarks.
To this end, we identified four benchmarks (Grep, Word Count, Read, and Write) and ran them on both LINQ to HPC and Hadoop. For each benchmark, we measured each system’s performance metrics (execution time, average CPU utilization, and average memory utilization) for various degrees of parallelism on clusters of different sizes. The results revealed some interesting trade-offs. For example, LINQ to HPC performed better on three of the four benchmarks (Grep, Read, and Write), whereas Hadoop performed better on the Word Count benchmark. While extensive research has focused on Hadoop, there are few references to similar research on the LINQ to HPC platform, which was slowly evolving during the writing of this thesis.
Sivasubramaniam, Ravishankar, "Performance Evaluation of LINQ to HPC and Hadoop for Big Data" (2013). UNF Graduate Theses and Dissertations. 463.