Preparing for V1.4.0
Updating build and README files.

Author: Hossein <[email protected]>

Closes #282 from falaki/v1.4.0.
falaki committed Mar 4, 2016
1 parent aa32be0 commit cbc72fe
Showing 2 changed files with 9 additions and 9 deletions.
README.md: 16 changes (8 additions, 8 deletions)
@@ -16,30 +16,30 @@ You can link against this library in your program at the following coordinates:
```
groupId: com.databricks
artifactId: spark-csv_2.10
-version: 1.3.0
+version: 1.4.0
```
### Scala 2.11
```
groupId: com.databricks
artifactId: spark-csv_2.11
-version: 1.3.0
+version: 1.4.0
```
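In an sbt build these coordinates collapse to a single dependency line; a minimal sketch (not part of this diff), assuming the Scala 2.11 artifact, with the 2.10 artifact analogous:

```scala
// build.sbt — sketch using the published coordinates above
libraryDependencies += "com.databricks" % "spark-csv_2.11" % "1.4.0"
```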

## Using with Spark shell
This package can be added to Spark using the `--packages` command line option. For example, to include it when starting the Spark shell:

### Spark compiled with Scala 2.11
```
-$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-csv_2.11:1.3.0
+$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-csv_2.11:1.4.0
```

### Spark compiled with Scala 2.10
```
-$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-csv_2.10:1.3.0
+$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-csv_2.10:1.4.0
```
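With the package on the shell's classpath, CSV files can be loaded through the data source's format name. A minimal sketch (not part of this diff), assuming a local `cars.csv` with a header row; the file name and option values are illustrative:

```scala
// In spark-shell (Spark 1.x), sqlContext is already created by the shell.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")      // use the first line as column names
  .option("inferSchema", "true") // infer column types instead of defaulting to string
  .load("cars.csv")

df.printSchema()
```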

## Features
-This package allows reading CSV files in a local or distributed filesystem as [Spark DataFrames](https://spark.apache.org/docs/1.3.0/sql-programming-guide.html).
+This package allows reading CSV files in a local or distributed filesystem as [Spark DataFrames](https://spark.apache.org/docs/1.6.0/sql-programming-guide.html).
When reading files, the API accepts several options:
* `path`: location of files. Similar to Spark, it can accept standard Hadoop globbing expressions.
* `header`: when set to true, the first line of files will be used to name columns and will not be included in the data. All types will be assumed string. Default value is false.
@@ -407,7 +407,7 @@ Automatically infer schema (data types), otherwise everything is assumed string:
```R
library(SparkR)

-Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.3.0" "sparkr-shell"')
+Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
sqlContext <- sparkRSQL.init(sc)

df <- read.df(sqlContext, "cars.csv", source = "com.databricks.spark.csv", inferSchema = "true")
@@ -419,7 +419,7 @@ You can manually specify schema:
```R
library(SparkR)

-Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.3.0" "sparkr-shell"')
+Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
sqlContext <- sparkRSQL.init(sc)
customSchema <- structType(
structField("year", "integer"),
@@ -437,7 +437,7 @@ You can save with compressed output:
```R
library(SparkR)

-Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.3.0" "sparkr-shell"')
+Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
sqlContext <- sparkRSQL.init(sc)

df <- read.df(sqlContext, "cars.csv", source = "com.databricks.spark.csv", inferSchema = "true")
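# The write call itself sits beyond this diff's fold; a hedged sketch of a
# compressed save, assuming the package's `codec` writer option and an
# illustrative output path:
write.df(df, "cars-out.csv", "com.databricks.spark.csv", "overwrite", codec = "gzip")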
build.sbt: 2 changes (1 addition, 1 deletion)
@@ -1,6 +1,6 @@
name := "spark-csv"

version := "1.4.0-SNAPSHOT"
version := "1.4.0"

organization := "com.databricks"

