This work was done as part of the Advanced Big Data Systems course (CS 744) at the University of Wisconsin-Madison.
- Impact of Worker Failure
- Effect of Data Persistence
- Correlation between the number of partitions and job completion time
- Berkeley-Stanford web graph
- Wiki Articles
- Configure Hadoop and Spark as detailed in [Link to Assignment] and place them in /mnt/data/
- Execute each task using `./run.sh`