SPARKC-22 Upgrade to Scala 2.11 and Add cross build.
Helena Edelson committed Mar 5, 2015
1 parent 8b460c5 commit d2386da
Showing 32 changed files with 651 additions and 387 deletions.
3 changes: 2 additions & 1 deletion .travis.yml
@@ -1,8 +1,9 @@
language: scala
jdk: oraclejdk7
sudo: false
scala:
- 2.10.4
- 2.11.5

script:
- "sbt ++$TRAVIS_SCALA_VERSION test:compile"
- "sbt ++$TRAVIS_SCALA_VERSION it:compile"
1 change: 1 addition & 0 deletions CHANGES.txt
@@ -18,6 +18,7 @@
* Report Connector metrics to Spark metrics system (SPARKC-27)
* Upgraded to Spark 1.2.1 (SPARKC-30)
* Add conversion from java.util.Date to java.sql.Timestamp for Spark SQL (#512)
* Upgraded to Scala 2.11 and scala version cross build (SPARKC-22)

1.2.0 alpha 1
* Added support for TTL and timestamp in the writer (#153)
31 changes: 4 additions & 27 deletions README.md
@@ -1,6 +1,5 @@
# Spark Cassandra Connector [![Build Status](https://travis-ci.org/datastax/spark-cassandra-connector.svg)](http://travis-ci.org/datastax/spark-cassandra-connector)


## Lightning-fast cluster computing with Spark and Cassandra

This library lets you expose Cassandra tables as Spark RDDs, write Spark RDDs to Cassandra tables, and
@@ -10,6 +9,7 @@ execute arbitrary CQL queries in your Spark applications.

- Compatible with Apache Cassandra version 2.0 or higher and DataStax Enterprise 4.5 (see table below)
- Compatible with Apache Spark 1.0 and 1.1 (see table below)
- Compatible with Scala 2.10 and 2.11
- Exposes Cassandra tables as Spark RDDs
- Maps table rows to CassandraRow objects or tuples
- Offers customizable object mapper for mapping rows to objects of user-defined classes
@@ -43,32 +43,8 @@ If you want to access the functionality of Connector from Java, you may want to
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector-java" % "1.2.0-alpha2"

## Building

See [Building And Artifacts](doc/12_building_and_artifacts.md)

## Documentation

- [Quick-start guide](doc/0_quick_start.md)
@@ -83,6 +59,7 @@ The documentation will be generated to:
- [About The Demos](doc/9_demos.md)
- [The spark-cassandra-connector-embedded Artifact](doc/10_embedded.md)
- [Performance monitoring](doc/11_metrics.md)
- [Building And Artifacts](doc/12_building_and_artifacts.md)

## License
This software is available under the [Apache License, Version 2.0](LICENSE).
27 changes: 8 additions & 19 deletions doc/0_quick_start.md
@@ -17,32 +17,22 @@ Configure a new Scala project with the following dependencies:
- Apache Cassandra thrift and clientutil libraries matching the version of Cassandra
- DataStax Cassandra driver for your Cassandra version

This driver does not depend on the Cassandra server code.

- For a detailed dependency list, see [project/CassandraSparkBuild.scala](../project/CassandraSparkBuild.scala)
- For dependency versions, see [project/Versions.scala](../project/Versions.scala)

Add the `spark-cassandra-connector` jar and its dependency jars to the following classpaths.
**Make sure the Connector version you use matches your Spark version (e.g. Spark 1.2.x with Connector 1.2.x)**:

"com.datastax.spark" %% "spark-cassandra-connector" % Version

- the classpath of your project
- the classpath of every Spark cluster node
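
For example, a minimal `build.sbt` for an application that uses the Connector might look like the sketch below; the Spark dependency line and all version numbers are illustrative, so match them to your cluster:

```scala
// Minimal build.sbt sketch for an application using the Connector.
// Versions shown are examples; align them with your Spark cluster.
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // Spark is "provided" because the cluster supplies it at runtime.
  "org.apache.spark"   %% "spark-core"                % "1.2.1" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.0-alpha2"
)
```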

### Building
See [Building And Artifacts](12_building_and_artifacts.md) for instructions on building the Connector, its assembly jar, and the API documentation.

### Preparing example Cassandra schema
Create a simple keyspace and table in Cassandra. Run the following statements in `cqlsh`:

@@ -100,5 +90,4 @@ val collection = sc.parallelize(Seq(("key3", 3), ("key4", 4)))
collection.saveToCassandra("test", "kv", SomeColumns("key", "value"))
```
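
Reading the rows back is symmetric; a short sketch against the same `test.kv` table:

```scala
import com.datastax.spark.connector._ // adds cassandraTable to SparkContext

// Returns an RDD of CassandraRow objects backed by the test.kv table.
val rdd = sc.cassandraTable("test", "kv")
rdd.collect().foreach(println)
```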


[Next - Connecting to Cassandra](1_connecting.md)
1 change: 1 addition & 0 deletions doc/11_metrics.md
@@ -47,3 +47,4 @@ read-row-meter | Number of rows read from Cassandra
read-page-wait-timer | Time spent by the driver waiting for rows to be paged in from C*
read-task-timer | Timer to measure time of reading a single partition

[Next - Building And Artifacts](12_building_and_artifacts.md)
85 changes: 85 additions & 0 deletions doc/12_building_and_artifacts.md
@@ -0,0 +1,85 @@
# Documentation

## Building

### Scala Versions
You can choose to build, assemble and run both Spark and the Spark Cassandra Connector against Scala 2.10 or 2.11.

#### Scala 2.11
Running SBT with the following flag builds against Scala 2.11, and the generated artifact paths include the corresponding Scala `binary.version`:

sbt -Dscala-2.11=true

For Spark see: [Building Spark for Scala 2.11](http://spark.apache.org/docs/1.2.0/building-spark.html)
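
A sketch of how a build can key off that flag (illustrative only; the project's actual logic lives in `project/CassandraSparkBuild.scala` and `project/Versions.scala`):

    // Illustrative fragment: derive the build's Scala version
    // from the -Dscala-2.11 system property.
    val useScala211 = sys.props.get("scala-2.11").exists(_.toBoolean)
    val buildScalaVersion = if (useScala211) "2.11.5" else "2.10.4"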

To run individual tasks against Scala 2.11:

sbt -Dscala-2.11=true doc
sbt -Dscala-2.11=true package
sbt -Dscala-2.11=true assembly

#### Scala 2.10
To build against Scala 2.10, nothing extra is required.

#### Version Cross Build
To produce artifacts for both Scala versions, start SBT:

sbt -Dscala-2.11=true

Run in the SBT shell:

+ package
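
sbt's `+` prefix reruns a task for every entry in `crossScalaVersions`; a minimal sketch of the setting behind it (versions as in `.travis.yml`; the project's real build may organize this differently):

    // Illustrative build.sbt fragment enabling `+ package`, `+ test`, etc.
    scalaVersion := "2.10.4"
    crossScalaVersions := Seq("2.10.4", "2.11.5")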


### Building The Assembly Jar
In the root directory run

sbt assembly

To build the assembly jar against Scala 2.11:

sbt -Dscala-2.11=true assembly

A fat jar will be generated to both of these directories:
- `spark-cassandra-connector/target/scala-{binary.version}/`
- `spark-cassandra-connector-java/target/scala-{binary.version}/`

Select the former for Scala apps, the latter for Java.

### Building General Artifacts
All artifacts are generated to the standard output directories based on the Scala binary version you use.

In the root directory run:

sbt package
sbt doc

The library package jars will be generated to:
- `spark-cassandra-connector/target/scala-{binary.version}/`
- `spark-cassandra-connector-java/target/scala-{binary.version}/`

The documentation will be generated to:
- `spark-cassandra-connector/target/scala-{binary.version}/api/`
- `spark-cassandra-connector-java/target/scala-{binary.version}/api/`
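
To make these artifacts resolvable by other projects on the same machine, the standard SBT publish tasks work as usual and combine with the cross build:

    sbt publishLocal
    sbt "+ publishLocal"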

##### Build Tasks
The easiest way to add the Connector and all of its dependencies to an application's classpath is to build the assembly jar:

sbt assembly

Remember that if you need to build the assembly jar against Scala 2.11:

sbt -Dscala-2.11=true assembly

This will generate a jar file with all of the required dependencies in

spark-cassandra-connector/spark-cassandra-connector/target/scala-{binary.version}/spark-cassandra-connector-assembly-*.jar

Then add this jar to your Spark executor classpath by adding the following line to your spark-defaults.conf

spark.executor.extraClassPath spark-cassandra-connector/spark-cassandra-connector/target/scala-{binary.version}/spark-cassandra-connector-assembly-$CurrentVersion-SNAPSHOT.jar
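
Alternatively, the assembly jar can be supplied when the application is launched; Spark's `--jars` option ships it to the executors (paths illustrative):

    spark-submit --jars spark-cassandra-connector/spark-cassandra-connector/target/scala-{binary.version}/spark-cassandra-connector-assembly-$CurrentVersion-SNAPSHOT.jar your-app.jar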

This driver is also compatible with the Spark distribution provided in
[DataStax Enterprise 4.5](http://www.datastax.com/documentation/datastax_enterprise/4.5/datastax_enterprise/newFeatures.html).
8 changes: 7 additions & 1 deletion doc/9_demos.md
@@ -62,6 +62,10 @@ To run from SBT read on.
On the command line at the root of `spark-cassandra-connector`:

sbt simple-demos/run

Against Scala 2.11:

sbt -Dscala-2.11=true simple-demos/run

And then select which demo you want:

@@ -75,10 +79,12 @@ And then select which demo you want:
[6] com.datastax.spark.connector.demo.SQLDemo

#### Running The Kafka Streaming Demo
Spark does not yet publish the `spark-streaming-kafka` artifact for Scala 2.11, so this demo is currently only available against Scala 2.10.
On the command line at the root of `spark-cassandra-connector`:

sbt kafka-streaming/run

#### Running The Twitter Streaming Demo
First you need to set your Twitter auth credentials. This is required by Twitter.
The Twitter streaming sample expects these values to either already exist in the