Event Information
These instructions are meant to be used on the day of the HackReduce event. The servers will not be accessible except at the venue.
{CLUSTER NUMBER}: Will be assigned to your team at the event
git clone https://github.com/hoppertravel/HackReduce.git
- cd ~/.ssh
- Obtain the key:
  - OSX:
    curl -O http://hackreduce-manager.hopper.to/hackreduce.tar
  - Linux:
    wget http://hackreduce-manager.hopper.to/hackreduce.tar
- tar xvf hackreduce.tar
- chmod 700 hackreduce.pem
The team folders will be used for storing your code and data on the cluster's master node.
- ssh -i ~/.ssh/hackreduce.pem hadoop@hackreduce-cluster-{CLUSTER NUMBER}.hopper.to
- Create the code folder:
  mkdir -p ~/users/{team name}
  This is where you will store all of your code (small file storage).
- Create the data folder:
  mkdir -p /mnt/users/{team name}
  If you will be downloading large data files to the cluster or saving the output of your jobs, store them here (otherwise you might run out of disk space!).
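The two-folder layout above can be sketched as a small script. This version uses a scratch root so it can be tried off-cluster; TEAM=hopper and the $ROOT prefix are assumptions for illustration (on the master node the real paths are ~/users/{team name} and /mnt/users/{team name}):

```shell
# Sketch: per-team folder layout. ROOT is a scratch directory so this can
# be tried anywhere; on the cluster the real roots are ~ and /mnt.
ROOT=$(mktemp -d)
TEAM=hopper                         # example team name (assumption)
mkdir -p "$ROOT/users/$TEAM"        # code folder: small files only
mkdir -p "$ROOT/mnt/users/$TEAM"    # data folder: large inputs and job output
ls -d "$ROOT/users/$TEAM" "$ROOT/mnt/users/$TEAM"
```

Keeping code and data separated this way matters because the home partition on the master node is small; large job output belongs under /mnt.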
Starting on your local system:
- cd {HackReduce project}
- Compile your code with one of the following commands, depending on whether you're using Gradle or Ant:
  - Gradle:
    gradle
  - Ant:
    ant
- Copy your jar to the cluster's master node:
  scp -i ~/.ssh/hackreduce.pem build/libs/{HackReduce custom}.jar hadoop@hackreduce-cluster-{CLUSTER NUMBER}.hopper.to:~/users/{team name}
- Log onto the cluster:
  ssh -i ~/.ssh/hackreduce.pem hadoop@hackreduce-cluster-{CLUSTER NUMBER}.hopper.to
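The copy-and-login pair above can be scripted so each placeholder is filled in once. CLUSTER=3, TEAM=hopper, and myjob.jar are made-up example values, and the commands are echoed rather than executed, so this sketch is safe to run anywhere:

```shell
# Build the scp/ssh command lines from one set of variables (dry run:
# commands are printed, not executed). CLUSTER, TEAM, and JAR are examples.
CLUSTER=3
TEAM=hopper
JAR=build/libs/myjob.jar
KEY=$HOME/.ssh/hackreduce.pem
HOST=hackreduce-cluster-$CLUSTER.hopper.to
SCP_CMD="scp -i $KEY $JAR hadoop@$HOST:~/users/$TEAM"
SSH_CMD="ssh -i $KEY hadoop@$HOST"
echo "$SCP_CMD"
echo "$SSH_CMD"
```

Drop the echo step (run the commands directly) once the printed lines look right for your team.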
- Launch your job:
  hadoop/bin/hadoop jar ~/users/{team name}/{HackReduce custom}.jar {Java job class} /datasets/{dataset chosen} /users/{team name}/job/
  e.g.
  hadoop/bin/hadoop jar ~/users/hopper/myjar.jar org.hackreduce.examples.bixi.RecordCounter /datasets/bixi /users/hopper/bixi_recordcounts
- Track the progress of your job at:
  http://hackreduce-cluster-{CLUSTER NUMBER}.hopper.to:50030
- When the job is finished, you can download the output from HDFS to the local file system:
  hadoop/bin/hadoop dfs -copyToLocal /users/{team name}/job /mnt/users/{team name}/
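Launching a job and fetching its output can be chained in one small script. The values below mirror the bixi example above (TEAM, job class, and output path are examples), and the hadoop commands are echoed rather than executed so the sketch can be tested off-cluster; remove the echo prefixes to run it for real on the master node:

```shell
# Dry-run sketch: launch a job, then copy its HDFS output to local disk.
TEAM=hopper
CLASS=org.hackreduce.examples.bixi.RecordCounter   # example job class
OUT=/users/$TEAM/bixi_recordcounts                 # HDFS output path
echo hadoop/bin/hadoop jar ~/users/$TEAM/myjar.jar "$CLASS" /datasets/bixi "$OUT"
echo hadoop/bin/hadoop dfs -copyToLocal "$OUT" /mnt/users/$TEAM/
```

Note that the copy lands under /mnt/users/{team name}, the large-file area created earlier, not the small home partition.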
- Visit http://hackreduce-cluster-{CLUSTER NUMBER}.hopper.to:50070
- Click on "Live Nodes"
- Select any node in the list, and you'll see the contents of HDFS
- Log onto your namenode (hackreduce-cluster-{CLUSTER NUMBER}.hopper.to)
- Run the command
  hadoop/bin/hadoop dfs
  with no arguments to print the list of available HDFS file system commands
- The number of reducers used by a job needs to be set manually by one of the following methods:
  - Java: http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapreduce/Job.html#setNumReduceTasks(int)
  - Streaming: http://hadoop.apache.org/common/docs/current/streaming.html#Specifying+the+Number+of+Reducers
  - More information can be found at http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Reducer (see especially the "How Many Reduces?" section)
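As a sketch, the reducer count can also be passed on the command line, assuming your job's main class parses generic options (e.g. via ToolRunner); that is an assumption about your driver class, and if it doesn't hold, call setNumReduceTasks in the Java driver instead. The command is echoed here, not executed:

```shell
# Dry run: set the reducer count via the generic -D option. This only
# takes effect if the main class parses generic options (e.g. ToolRunner);
# otherwise call job.setNumReduceTasks(n) in your Java driver.
REDUCERS=4
CMD="hadoop/bin/hadoop jar ~/users/hopper/myjar.jar org.hackreduce.examples.bixi.RecordCounter -D mapred.reduce.tasks=$REDUCERS /datasets/bixi /users/hopper/out"
echo "$CMD"
```

Picking a sensible count matters: too few reducers leaves cluster nodes idle, too many wastes time on task startup overhead (see the "How Many Reduces?" section linked above).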