spark_vagrant_ansible

Try out Spark 1.5 using VMs provisioned by Vagrant and Ansible

What is this

This repository provides a Vagrantfile which, in combination with an Ansible playbook, installs:

To launch a Spark job, use the /opt/spark/bin/spark-submit-local script on the spark-master VM. Connect to this machine using vagrant ssh spark-master.

Spark runs in standalone mode, which means that there is no underlying Hadoop or HDFS. If you need to work on files, they must be available on all machines. The /data folder is shared between all machines and the host, so you can put files there and use them.
As all VMs run on your computer, memory is an issue. The Ansible playbook configures Spark with tight memory limits. You can raise these limits by first giving the VMs more memory (in the Vagrantfile) and then changing the launchers in the roles/spark-master and roles/spark-slave folders.
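
As a rough illustration of working with the shared /data folder, here is a minimal PySpark sketch that counts the lines of a file placed there. The script name line_count.py, the helper shared_uri, and the application name are hypothetical and not part of this repository. The sketch assumes PySpark is available on the VMs.

```python
import sys

SHARED_DIR = "/data"  # folder shared between all VMs and the host


def shared_uri(filename, shared_dir=SHARED_DIR):
    """Build a file:// URI for a file inside the shared folder.

    Without HDFS, each worker reads the file from its own filesystem,
    so the path has to resolve to the same content on every machine.
    """
    return "file://" + shared_dir.rstrip("/") + "/" + filename.lstrip("/")


def main(filename):
    # Imported here so the helper above works without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="shared-data-line-count")  # hypothetical name
    try:
        lines = sc.textFile(shared_uri(filename))
        print("line count: %d" % lines.count())
    finally:
        sc.stop()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: line_count.py <file name inside /data>")
```

A job like this would be submitted from the spark-master VM with /opt/spark/bin/spark-submit-local. The arguments that script accepts are not described here, so check the script before relying on a particular invocation.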
