SPARKC-22 Upgrade to Scala 2.11 and Add cross build.
Helena Edelson committed Mar 5, 2015
1 parent 8b460c5 commit d2386da
Showing 32 changed files with 651 additions and 387 deletions.
3 changes: 2 additions & 1 deletion .travis.yml
@@ -1,8 +1,9 @@
language: scala
jdk: oraclejdk7
sudo: false
scala:
- 2.10.4
- 2.11.5

script:
- "sbt ++$TRAVIS_SCALA_VERSION test:compile"
- "sbt ++$TRAVIS_SCALA_VERSION it:compile"
1 change: 1 addition & 0 deletions CHANGES.txt
@@ -18,6 +18,7 @@
* Report Connector metrics to Spark metrics system (SPARKC-27)
* Upgraded to Spark 1.2.1 (SPARKC-30)
* Add conversion from java.util.Date to java.sql.Timestamp for Spark SQL (#512)
* Upgraded to Scala 2.11 and scala version cross build (SPARKC-22)

1.2.0 alpha 1
* Added support for TTL and timestamp in the writer (#153)
31 changes: 4 additions & 27 deletions README.md
@@ -1,6 +1,5 @@
# Spark Cassandra Connector [![Build Status](https://travis-ci.org/datastax/spark-cassandra-connector.svg)](http://travis-ci.org/datastax/spark-cassandra-connector)


## Lightning-fast cluster computing with Spark and Cassandra

This library lets you expose Cassandra tables as Spark RDDs, write Spark RDDs to Cassandra tables, and
@@ -10,6 +9,7 @@ execute arbitrary CQL queries in your Spark applications.

- Compatible with Apache Cassandra version 2.0 or higher and DataStax Enterprise 4.5 (see table below)
- Compatible with Apache Spark 1.0 and 1.1 (see table below)
- Compatible with Scala 2.10 and 2.11
- Exposes Cassandra tables as Spark RDDs
- Maps table rows to CassandraRow objects or tuples
- Offers customizable object mapper for mapping rows to objects of user-defined classes
@@ -43,32 +43,8 @@ If you want to access the functionality of Connector from Java, you may want to
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector-java" % "1.2.0-alpha2"

## Building

See [Building And Artifacts](doc/12_building_and_artifacts.md)

## Documentation

- [Quick-start guide](doc/0_quick_start.md)
@@ -83,6 +59,7 @@ The documentation will be generated to:
- [About The Demos](doc/9_demos.md)
- [The spark-cassandra-connector-embedded Artifact](doc/10_embedded.md)
- [Performance monitoring](doc/11_metrics.md)
- [Building And Artifacts](doc/12_building_and_artifacts.md)

## License
This software is available under the [Apache License, Version 2.0](LICENSE).
27 changes: 8 additions & 19 deletions doc/0_quick_start.md
@@ -17,32 +17,22 @@ Configure a new Scala project with the following dependencies:
- Apache Cassandra thrift and clientutil libraries matching the version of Cassandra
- DataStax Cassandra driver for your Cassandra version

This driver does not depend on the Cassandra server code.

- For a detailed dependency list, see [project/CassandraSparkBuild.scala](../project/CassandraSparkBuild.scala)
- For dependency versions, see [project/Versions.scala](../project/Versions.scala)

Add the `spark-cassandra-connector` jar and its dependency jars to the following classpaths.
**Make sure the Connector version you use matches your Spark version (e.g. Spark 1.2.x with Connector 1.2.x)**:

"com.datastax.spark" %% "spark-cassandra-connector" % Version

- the classpath of your project
- the classpath of every Spark cluster node
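
For example, a minimal `build.sbt` for an application that uses the Connector might look like the sketch below; the Spark dependency line and all version numbers are illustrative, so match them to your cluster:

```scala
// Minimal build.sbt sketch for an application using the Connector.
// Versions shown are examples; align them with your Spark cluster.
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // Spark is "provided" because the cluster supplies it at runtime.
  "org.apache.spark"   %% "spark-core"                % "1.2.1" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.0-alpha2"
)
```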

### Building
See [Building And Artifacts](12_building_and_artifacts.md) for instructions on building the Connector, its assembly jar, and the API documentation.

### Preparing example Cassandra schema
Create a simple keyspace and table in Cassandra. Run the following statements in `cqlsh`:

@@ -100,5 +90,4 @@ val collection = sc.parallelize(Seq(("key3", 3), ("key4", 4)))
collection.saveToCassandra("test", "kv", SomeColumns("key", "value"))
```
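
Reading the rows back is symmetric; a short sketch against the same `test.kv` table:

```scala
import com.datastax.spark.connector._ // adds cassandraTable to SparkContext

// Returns an RDD of CassandraRow objects backed by the test.kv table.
val rdd = sc.cassandraTable("test", "kv")
rdd.collect().foreach(println)
```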


[Next - Connecting to Cassandra](1_connecting.md)
1 change: 1 addition & 0 deletions doc/11_metrics.md
@@ -47,3 +47,4 @@ read-row-meter | Number of rows read from Cassandra
read-page-wait-timer | Time spent by the driver waiting for rows to be paged in from C*
read-task-timer | Timer to measure time of reading a single partition

[Next - Building And Artifacts](12_building_and_artifacts.md)
85 changes: 85 additions & 0 deletions doc/12_building_and_artifacts.md
@@ -0,0 +1,85 @@
# Documentation

## Building

### Scala Versions
You can choose to build, assemble and run both Spark and the Spark Cassandra Connector against Scala 2.10 or 2.11.

#### Scala 2.11
Running SBT with the following flag builds against Scala 2.11, and the generated artifact paths include the corresponding Scala `binary.version`:

sbt -Dscala-2.11=true

For Spark see: [Building Spark for Scala 2.11](http://spark.apache.org/docs/1.2.0/building-spark.html)
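
A sketch of how a build can key off that flag (illustrative only; the project's actual logic lives in `project/CassandraSparkBuild.scala` and `project/Versions.scala`):

    // Illustrative fragment: derive the build's Scala version
    // from the -Dscala-2.11 system property.
    val useScala211 = sys.props.get("scala-2.11").exists(_.toBoolean)
    val buildScalaVersion = if (useScala211) "2.11.5" else "2.10.4"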

To run individual tasks against Scala 2.11:

sbt -Dscala-2.11=true doc
sbt -Dscala-2.11=true package
sbt -Dscala-2.11=true assembly

#### Scala 2.10
To build against Scala 2.10, nothing extra is required.

#### Version Cross Build
To produce artifacts for both Scala versions, start SBT:

sbt -Dscala-2.11=true

Run in the SBT shell:

+ package
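
sbt's `+` prefix reruns a task for every entry in `crossScalaVersions`; a minimal sketch of the setting behind it (versions as in `.travis.yml`; the project's real build may organize this differently):

    // Illustrative build.sbt fragment enabling `+ package`, `+ test`, etc.
    scalaVersion := "2.10.4"
    crossScalaVersions := Seq("2.10.4", "2.11.5")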


### Building The Assembly Jar
In the root directory run

sbt assembly

To build the assembly jar against Scala 2.11:

sbt -Dscala-2.11=true assembly

A fat jar will be generated to both of these directories:
- `spark-cassandra-connector/target/scala-{binary.version}/`
- `spark-cassandra-connector-java/target/scala-{binary.version}/`

Select the former for Scala apps, the latter for Java.

### Building General Artifacts
All artifacts are generated to the standard output directories based on the Scala binary version you use.

In the root directory run:

sbt package
sbt doc

The library package jars will be generated to:
- `spark-cassandra-connector/target/scala-{binary.version}/`
- `spark-cassandra-connector-java/target/scala-{binary.version}/`

The documentation will be generated to:
- `spark-cassandra-connector/target/scala-{binary.version}/api/`
- `spark-cassandra-connector-java/target/scala-{binary.version}/api/`
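
To make these artifacts resolvable by other projects on the same machine, the standard SBT publish tasks work as usual and combine with the cross build:

    sbt publishLocal
    sbt "+ publishLocal"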

##### Build Tasks
The easiest way to add the Connector and all of its dependencies to an application's classpath is to build the assembly jar:

sbt assembly

Remember that if you need to build the assembly jar against Scala 2.11:

sbt -Dscala-2.11=true assembly

This will generate a jar file with all of the required dependencies in

spark-cassandra-connector/spark-cassandra-connector/target/scala-{binary.version}/spark-cassandra-connector-assembly-*.jar

Then add this jar to your Spark executor classpath by adding the following line to your spark-defaults.conf

spark.executor.extraClassPath spark-cassandra-connector/spark-cassandra-connector/target/scala-{binary.version}/spark-cassandra-connector-assembly-$CurrentVersion-SNAPSHOT.jar
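
Alternatively, the assembly jar can be supplied when the application is launched; Spark's `--jars` option ships it to the executors (paths illustrative):

    spark-submit --jars spark-cassandra-connector/spark-cassandra-connector/target/scala-{binary.version}/spark-cassandra-connector-assembly-$CurrentVersion-SNAPSHOT.jar your-app.jar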

This driver is also compatible with the Spark distribution provided in
[DataStax Enterprise 4.5](http://www.datastax.com/documentation/datastax_enterprise/4.5/datastax_enterprise/newFeatures.html).
8 changes: 7 additions & 1 deletion doc/9_demos.md
@@ -62,6 +62,10 @@ To run from SBT read on.
On the command line at the root of `spark-cassandra-connector`:

sbt simple-demos/run

Against Scala 2.11:

sbt -Dscala-2.11=true simple-demos/run

And then select which demo you want:

@@ -75,10 +79,12 @@ And then select which demo you want:
[6] com.datastax.spark.connector.demo.SQLDemo

#### Running The Kafka Streaming Demo
Spark does not yet publish the `spark-streaming-kafka` artifact for Scala 2.11, so this demo is currently only available against Scala 2.10.
On the command line at the root of `spark-cassandra-connector`:

sbt kafka-streaming/run

#### Running The Twitter Streaming Demo
First you need to set your Twitter auth credentials. This is required by Twitter.
The Twitter streaming sample expects these values to either already exist in the