1.1.0

@rcchen rcchen released this 01 Feb 12:42

Enhancements ⭐️

  • Added a log4j.properties file so that Hadoop's log output goes to the console by default. This addresses two issues people ran into: (1) runtime errors such as NPEs were silent because the logger was not configured, and (2) progress was not reported to the console. Sample output after introducing an artificial runtime exception (here an ArrayIndexOutOfBoundsException) in the Mapper code:
2017-02-01 20:32:48,931 [Thread-17] INFO  org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2017-02-01 20:32:48,933 [Thread-17] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local368124732_0001
java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 100
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 100
    at edu.stanford.cs246.wordcount.WordCount$Map.map(WordCount.java:60)
    at edu.stanford.cs246.wordcount.WordCount$Map.map(WordCount.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2017-02-01 20:32:49,623 [main] INFO  org.apache.hadoop.mapreduce.Job - Job job_local368124732_0001 running in uber mode : false
2017-02-01 20:32:49,625 [main] INFO  org.apache.hadoop.mapreduce.Job -  map 0% reduce 0%
2017-02-01 20:32:49,627 [main] INFO  org.apache.hadoop.mapreduce.Job - Job job_local368124732_0001 failed with state FAILED due to: NA
2017-02-01 20:32:49,635 [main] INFO  org.apache.hadoop.mapreduce.Job - Counters: 0
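
For reference, a log4j 1.x configuration along the following lines would produce console output in the format shown above. This is a minimal sketch, not necessarily the exact file shipped in this release: the appender name `console` and the INFO threshold are assumptions, and the pattern is inferred from the sample log lines.

```properties
# Assumed root logger setup: INFO and above goes to an appender named "console".
log4j.rootLogger=INFO, console

# Write log events to standard output.
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout

# Inferred from the sample output: timestamp [thread] LEVEL logger - message
log4j.appender.console.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
```

Hadoop only picks the file up if it is on the classpath at runtime, so check the copy in the repository for the actual location and settings.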

Cleanup 🗑

  • Fixed indentation in sample WordCount.java file to be 4 spaces instead of 3 spaces. Sorry about that.

Upgrading ⬆️

The easiest way is to add this repository as the upstream remote of your fork, fetch upstream master, and merge it into your own branch. GitHub's documentation provides instructions for doing so.
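
The upstream workflow described above can be sketched as a small shell function. The function name `sync_with_upstream` and its two arguments are hypothetical, introduced only for illustration; the upstream URL is not given here, so you must supply the URL of this repository yourself. The sketch assumes the upstream default branch is `master`.

```shell
# Hypothetical helper: merge upstream/master into a local branch of your fork.
#   $1 = URL (or path) of the original repository (assumption: you supply it)
#   $2 = name of your local branch to merge into
sync_with_upstream() {
    upstream_url="$1"
    branch="$2"

    # Register the original repository as the "upstream" remote if it is missing.
    git remote get-url upstream >/dev/null 2>&1 || \
        git remote add upstream "$upstream_url" || return 1

    # Download the latest upstream commits without touching local branches.
    git fetch upstream || return 1

    # Switch to your branch and merge the upstream changes into it.
    git checkout "$branch" || return 1
    git merge upstream/master
}
```

Run it from inside your fork's working copy, for example `sync_with_upstream <upstream-url> master`. If you have edited the same files, Git may report merge conflicts that you need to resolve by hand.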

If Git is not your forte, you can view the git diff and apply the changes manually: