Workshop on crunch and mapreduce for geek night 28th jan

ThoughtworksGGN/bigdataworkshop

Bigdataworkshop

Workshop on Spark for Geek Night, 31st March

Description of parameters:

  1. user-name : Name of the Linux user that runs the job
  2. password : Password of that user
  3. class : Fully qualified name of the class to run, for example com.tw.example.MRExample
  4. args : Arguments passed to the main driver of the class

For example, to run SparkExample (Word Count):

    docker run -v <project_dir_path>:/local/git achalag/geeknight-spark <mvn_command>

Maven command (<mvn_command>):

    mvn clean deploy -Dusername=<user-name> -Dpassword=<password> -Dclass=com.tw.example.SparkExample -Dargs="<input-dir> <output-dir>"

For example:

    docker run -v /user/<local-user>/projects/bigdataworkshop:/local/git achalag/geeknight-spark mvn clean deploy -Dusername=<username> -Dpassword=<somepassword> -Dclass=com.tw.example.SparkExample -Dargs="/home/tw/data/tweets /home/tw/data/<username>/job1"
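The full command is long and easy to mistype, so a small shell function that assembles it from its parts may help. This is only a sketch: `build_job_command` is a hypothetical helper that is not part of the repository, and it only prints the command rather than running it. The image name, mount point and `-D` properties are the ones from this README.

```shell
# Hypothetical helper (not part of the repository): prints the docker + mvn
# command for a workshop job. It does not run anything.
# Arguments: project dir, user name, password, class, input dir, output dir.
build_job_command() {
    project_dir=$1
    user_name=$2
    password=$3
    class=$4
    input_dir=$5
    output_dir=$6
    # Same layout as the example above: mount the project at /local/git,
    # then pass the job parameters to Maven as -D properties.
    printf 'docker run -v %s:/local/git achalag/geeknight-spark mvn clean deploy -Dusername=%s -Dpassword=%s -Dclass=%s -Dargs="%s %s"\n' \
        "$project_dir" "$user_name" "$password" "$class" "$input_dir" "$output_dir"
}
```

Calling `build_job_command "$PWD" <username> <somepassword> com.tw.example.SparkExample /home/tw/data/tweets /home/tw/data/<username>/job1` with real values prints a command line that can be reviewed and then pasted into the terminal.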

Note: Delete the output directory (<output-dir>) before running a Spark job.
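The cleanup step can be sketched as a guarded shell function. `clean_output_dir` is a hypothetical helper, and it assumes the output directory is reachable as an ordinary path from the shell where it runs. If the workshop data lives on a different file system, that file system's own delete command would be needed instead.

```shell
# Hypothetical helper (not part of the repository): removes a job's output
# directory if it exists. Assumes the path is an ordinary directory that is
# reachable from the shell where this runs.
clean_output_dir() {
    output_dir=$1
    # Refuse an empty path or the root directory to avoid deleting the wrong thing.
    case "$output_dir" in
        ''|/)
            echo "refusing to delete '$output_dir'" >&2
            return 1
            ;;
    esac
    # Only delete when the directory is actually there; a missing directory is fine.
    if [ -d "$output_dir" ]; then
        rm -rf -- "$output_dir"
    fi
}
```

For the example job above this would be called as `clean_output_dir /home/tw/data/<username>/job1` before the `docker run` command.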

Web interfaces for checking logs:

  1. Spark Master Web UI : http://10.133.124.48:8080/
  2. FileSystem Web UI : http://10.133.124.48:3389/

Download the Docker image:

    http://10.133.124.48:3389/geeknight-spark.tgz

Load the Docker image:

    docker load < geeknight-spark.tgz
