Building Shark Master Branch
michaeljones46 edited this page May 2, 2014 · 5 revisions
Shark's latest master branch depends on Spark's master branch, which is usually not yet published to Maven. However, Spark can be built and published to the local Ivy repository:
git clone git@github.com:apache/spark.git
cd spark
sbt/sbt package publish-local
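To confirm that the publish step worked before moving on, a quick check of the local Ivy repository can help. The following is a minimal sketch, not part of the original instructions. It assumes sbt's default publish-local location (~/.ivy2/local) and assumes the Spark artifacts are published under the organization org.apache.spark. The helper name ivy_has_org and the IVY_LOCAL override variable are hypothetical.

```shell
# Hypothetical helper: succeeds if the given organization directory exists
# and is non-empty under a local Ivy repository root.
ivy_has_org() {
  ivy_root="$1"
  org="$2"
  [ -d "$ivy_root/$org" ] && [ -n "$(ls -A "$ivy_root/$org" 2>/dev/null)" ]
}

# Assumptions: ~/.ivy2/local is sbt's default publish-local target, and
# org.apache.spark is the organization the Spark build publishes under.
if ivy_has_org "${IVY_LOCAL:-$HOME/.ivy2/local}" "org.apache.spark"; then
  echo "Spark artifacts found in the local Ivy repository"
else
  echo "No Spark artifacts found; consider re-running sbt/sbt package publish-local"
fi
```

If the check fails even though the build succeeded, the organization name or repository path on your machine may differ from the assumed defaults.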
Then check out the AMPLab distribution of Apache Hive and build it.
git clone https://github.com/amplab/hive.git -b shark-0.11
cd hive
ant package
ant package builds all Hive jars and puts them into the build/dist directory. On the EC2 AMI, you may first have to install ant-antlr.noarch and ant-contrib.noarch:
yum install ant-antlr.noarch
yum install ant-contrib.noarch
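Before running (or re-running) ant package, it may be worth confirming that the basic build tools are on the PATH. This is only a sketch: the helper name require_cmd and the list of tools checked are illustrative choices, not something the original page prescribes.

```shell
# Hypothetical helper: succeeds if the named command can be found on the PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report which of the tools used in this guide are available.
# The tool list is an assumption; adjust it to your setup.
for tool in git ant java; do
  if require_cmd "$tool"; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done
```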
Now check out Shark:
git clone git@github.com:amplab/shark.git
cd shark
Edit the configuration file conf/shark-env.sh so that the paths match your own machine, for example:
#!/usr/bin/env bash
export SHARK_MASTER_MEM=1g
export HIVE_DEV_HOME="/scratch/rxin/hive"
export HIVE_HOME="$HIVE_DEV_HOME/build/dist"
SPARK_JAVA_OPTS="-Dspark.local.dir=/tmp "
SPARK_JAVA_OPTS+="-Dspark.kryoserializer.buffer.mb=10 "
SPARK_JAVA_OPTS+="-verbose:gc -XX:-PrintGCDetails -XX:+PrintGCTimeStamps "
export SPARK_JAVA_OPTS
export SCALA_VERSION=2.9.3
export SCALA_HOME="/scratch/rxin/scala-2.9.3"
export SPARK_HOME="/scratch/rxin/spark"
export HADOOP_HOME="/scratch/rxin/hadoop-0.20.205.0"
export JAVA_HOME="/usr/lib/jvm/java-6-openjdk/jre"
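Because the paths in the example configuration are specific to one machine, a mistyped or stale directory is an easy mistake to make. The sketch below shows one way to check that each directory variable points at an existing directory after the file has been sourced. The helper name check_dir_vars is hypothetical and is not part of Shark.

```shell
# Hypothetical helper: prints the name of every listed variable whose value
# is unset, empty, or not an existing directory. Returns non-zero if any fail.
check_dir_vars() {
  rc=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ] || [ ! -d "$value" ]; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}

# Usage sketch (left commented out so that nothing runs unintentionally):
# . conf/shark-env.sh
# check_dir_vars HIVE_HOME SCALA_HOME SPARK_HOME HADOOP_HOME JAVA_HOME
```

Any variable name printed by the helper needs its path corrected in conf/shark-env.sh before building.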
Finally, build Shark:
sbt/sbt package