Workshop on Spark for Geek Night, 31st March
Description of parameters:
- user-name : Name of the Linux user the job should run as
- password : Password of the user
- class : Fully qualified name of the class, for example com.tw.example.MRExample
- args : Arguments required by the class's main driver
For example, to run SparkExample (Word Count):
docker run -v <project_dir_path>:/local/git achalag/geeknight-spark <mvn_command>
Maven command
mvn clean deploy -Dusername=<user-name> -Dpassword=<password> -Dclass=com.tw.example.SparkExample -Dargs="<input-dir> <output-dir>"
For example
docker run -v /user/<local-user>/projects/bigdataworkshop:/local/git achalag/geeknight-spark mvn clean deploy -Dusername=<username> -Dpassword=<somepassword> -Dclass=com.tw.example.SparkExample -Dargs="/home/tw/data/tweets /home/tw/data/<username>/job1"
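The command above can be wrapped in a small shell function so the parameters are passed as arguments instead of being edited into one long line. This is only a sketch: run_spark_job and the DRY_RUN switch are hypothetical names introduced here, not part of the workshop project. The image name, mount point and -D options are taken from the command above.

```shell
#!/bin/sh
# Sketch only: run_spark_job and DRY_RUN are hypothetical helpers, not part
# of the workshop project. The docker/mvn arguments mirror the example above.
run_spark_job() {
  project_dir="$1"   # local project directory, mounted at /local/git
  user="$2"          # value for -Dusername
  password="$3"      # value for -Dpassword
  class="$4"         # fully qualified class name for -Dclass
  job_args="$5"      # "<input-dir> <output-dir>", kept as one argument

  # Build the command as positional parameters so -Dargs stays a single word.
  set -- docker run -v "${project_dir}:/local/git" achalag/geeknight-spark \
    mvn clean deploy \
    "-Dusername=${user}" "-Dpassword=${password}" \
    "-Dclass=${class}" "-Dargs=${job_args}"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Calling it with DRY_RUN=1 set prints the assembled command for review before anything is submitted. Without DRY_RUN it runs docker directly.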
Note: Delete the output directory before running a Spark job; otherwise the job may fail because the directory already exists.
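One way to make the cleanup hard to forget is a small guard function run before each job. This is a sketch under a loud assumption: it treats the output directory as an ordinary path reachable from the machine where the function runs. If the directory lives on a different filesystem (HDFS, for instance), that filesystem's own delete command is needed instead. clean_output_dir is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: clean_output_dir is a hypothetical helper. It assumes the
# output directory is an ordinary path visible from this machine.
clean_output_dir() {
  out_dir="$1"

  # Refuse empty or root paths so a missing argument cannot delete anything else.
  case "$out_dir" in
    ""|"/")
      echo "refusing to delete '$out_dir'" >&2
      return 1
      ;;
  esac

  # Remove the directory only if it exists; a missing directory is not an error.
  if [ -d "$out_dir" ]; then
    rm -rf -- "$out_dir"
  fi
}
```

For the example job above, the call would be clean_output_dir /home/tw/data/<username>/job1, made wherever that path is visible.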
Web interfaces for checking logs:
- Spark Master Web UI : http://10.133.124.48:8080/
- FileSystem Web UI : http://10.133.124.48:3389/
Download Docker image
http://10.133.124.48:3389/geeknight-spark.tgz
Load Docker image
docker load < geeknight-spark.tgz
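The download and load steps can be combined into one function. This is a sketch with assumptions: curl is installed, and the archive contains the image tagged achalag/geeknight-spark (inferred from the docker run command, not confirmed by the archive itself). fetch_and_load_image is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: fetch_and_load_image is a hypothetical helper. It assumes curl
# is installed and that the archive holds the achalag/geeknight-spark image.
IMAGE_URL="http://10.133.124.48:3389/geeknight-spark.tgz"
IMAGE_FILE="geeknight-spark.tgz"

fetch_and_load_image() {
  # Download only if the archive is not already in the current directory.
  if [ ! -f "$IMAGE_FILE" ]; then
    curl -fSL -o "$IMAGE_FILE" "$IMAGE_URL" || return 1
  fi

  # Load the image into the local Docker daemon.
  docker load < "$IMAGE_FILE" || return 1

  # List the image so a successful load is visible.
  docker images achalag/geeknight-spark
}
```

Running fetch_and_load_image a second time skips the download and only reloads the image from the existing archive.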