Merge pull request #338 from SurajAralihalli/main-v2312-release
Main v2312 release
SurajAralihalli authored Dec 20, 2023
2 parents 3db9820 + 9eec316 commit 4d45eb0
Showing 30 changed files with 2,303 additions and 48 deletions.
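
Most of the replaced lines in this commit swap the release string 23.10 for 23.12 in jar names, Maven URLs, and branch references. A minimal sketch of how such a bump could be scripted is shown here. The helper name `bump_release` and its regex are illustrative assumptions and are not part of this commit or the repository's tooling.

```python
import re


# Hypothetical helper (not from the repository): rewrite one release string,
# e.g. "23.10", to another, e.g. "23.12", wherever it appears as a whole token.
def bump_release(text: str, old: str = "23.10", new: str = "23.12") -> str:
    # The lookbehind and lookahead keep the match from firing inside longer
    # numbers such as "123.10" or "23.101".
    pattern = re.compile(r"(?<![\d.])" + re.escape(old) + r"(?!\d)")
    return pattern.sub(new, text)


# Example input taken from the Databricks init script changed in this commit.
line = (
    "sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.10.0.jar "
    "https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/"
    "rapids-4-spark_2.12-23.10.0.jar"
)
print(bump_release(line))
```

Because the pattern matches only the old release string, pinned versions such as `22.02.0` (rapids-4-spark-ml) and `23.02.0` (cuSpatial) are left alone, which is consistent with the "do not update the version" comments in the hunks below.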
8 changes: 4 additions & 4 deletions .github/workflows/auto-merge.yml
@@ -18,7 +18,7 @@ name: auto-merge HEAD to BASE
on:
pull_request_target:
branches:
- branch-23.10
- branch-23.12
types: [closed]

jobs:
@@ -29,14 +29,14 @@ jobs:
steps:
- uses: actions/checkout@v3
with:
ref: branch-23.10 # force to fetch from latest upstream instead of PR ref
ref: branch-23.12 # force to fetch from latest upstream instead of PR ref

- name: auto-merge job
uses: ./.github/workflows/auto-merge
env:
OWNER: NVIDIA
REPO_NAME: spark-rapids-examples
HEAD: branch-23.10
BASE: branch-23.12
HEAD: branch-23.12
BASE: branch-24.02
AUTOMERGE_TOKEN: ${{ secrets.AUTOMERGE_TOKEN }} # use to merge PR

@@ -21,7 +21,7 @@ Navigate to your home directory in the UI and select **Create** > **File** from
create an `init.sh` scripts with contents:
```bash
#!/bin/bash
sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.10.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar
sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.12.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar
```
1. Select the Databricks Runtime Version from one of the supported runtimes specified in the
Prerequisites section.
@@ -68,7 +68,7 @@ create an `init.sh` scripts with contents:
```bash
spark.rapids.sql.python.gpu.enabled true
spark.python.daemon.module rapids.daemon_databricks
spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-23.10.0.jar:/databricks/spark/python
spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-23.12.0.jar:/databricks/spark/python
```
Note that since python memory pool require installing the cudf library, so you need to install cudf library in
each worker nodes `pip install cudf-cu11 --extra-index-url=https://pypi.nvidia.com` or disable python memory pool
2 changes: 1 addition & 1 deletion docs/get-started/xgboost-examples/csp/databricks/init.sh
@@ -1,7 +1,7 @@
sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-gpu_2.12--ml.dmlc__xgboost4j-gpu_2.12__1.5.2.jar
sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-spark-gpu_2.12--ml.dmlc__xgboost4j-spark-gpu_2.12__1.5.2.jar

sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.10.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar
sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.12.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar
sudo wget -O /databricks/jars/xgboost4j-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-gpu_2.12/1.7.1/xgboost4j-gpu_2.12-1.7.1.jar
sudo wget -O /databricks/jars/xgboost4j-spark-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-spark-gpu_2.12/1.7.1/xgboost4j-spark-gpu_2.12-1.7.1.jar
ls -ltr
@@ -40,7 +40,7 @@ export SPARK_DOCKER_IMAGE=<gpu spark docker image repo and name>
export SPARK_DOCKER_TAG=<spark docker image tag>

pushd ${SPARK_HOME}
wget https://github.com/NVIDIA/spark-rapids-examples/raw/branch-23.10/dockerfile/Dockerfile
wget https://github.com/NVIDIA/spark-rapids-examples/raw/branch-23.12/dockerfile/Dockerfile

# Optionally install additional jars into ${SPARK_HOME}/jars/

@@ -5,7 +5,7 @@ For simplicity export the location to these jars. All examples assume the packag
### Download the jars

Download the RAPIDS Accelerator for Apache Spark plugin jar
* [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)
* [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar)

### Build XGBoost Python Examples

@@ -5,7 +5,7 @@ For simplicity export the location to these jars. All examples assume the packag
### Download the jars

1. Download the RAPIDS Accelerator for Apache Spark plugin jar
* [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)
* [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar)

### Build XGBoost Scala Examples

2 changes: 1 addition & 1 deletion examples/ML+DL-Examples/Spark-cuML/pca/Dockerfile
@@ -18,7 +18,7 @@
ARG CUDA_VER=11.8.0
FROM nvidia/cuda:${CUDA_VER}-devel-ubuntu20.04
# Please do not update the BRANCH_VER version
ARG BRANCH_VER=23.10
ARG BRANCH_VER=23.12

RUN apt-get update
RUN apt-get install -y wget ninja-build git
6 changes: 3 additions & 3 deletions examples/ML+DL-Examples/Spark-cuML/pca/README.md
@@ -12,9 +12,9 @@ User can also download the release jar from Maven central:

[rapids-4-spark-ml_2.12-22.02.0-cuda11.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark-ml_2.12/22.02.0/rapids-4-spark-ml_2.12-22.02.0-cuda11.jar)

[rapids-4-spark_2.12-23.10.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)
[rapids-4-spark_2.12-23.12.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar)

Note: This demo could only work with v22.02.0 spark-ml version, and only compatible with spark-rapids versions prior to 23.10.0 . Please do not update the version in release.
Note: This demo could only work with v22.02.0 spark-ml version, and only compatible with spark-rapids versions prior to 23.12.0 . Please do not update the version in release.

## Sample code

@@ -49,7 +49,7 @@ It is assumed that a Standalone Spark cluster has been set up, the `SPARK_MASTER

``` bash
RAPIDS_ML_JAR=PATH_TO_rapids-4-spark-ml_2.12-22.02.0-cuda11.jar
PLUGIN_JAR=PATH_TO_rapids-4-spark_2.12-23.10.0.jar
PLUGIN_JAR=PATH_TO_rapids-4-spark_2.12-23.12.0.jar
jupyter toree install \
--spark_home=${SPARK_HOME} \
5 changes: 3 additions & 2 deletions examples/ML+DL-Examples/Spark-cuML/pca/pom.xml
@@ -20,7 +20,8 @@

<groupId>com.nvidia</groupId>
<artifactId>PCAExample</artifactId>
<version>23.10.0</version>
<packaging>jar</packaging>
<version>23.12.0-SNAPSHOT</version>

<properties>
<maven.compiler.source>8</maven.compiler.source>
@@ -51,7 +52,7 @@
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-ml_2.12</artifactId>
<!--The last rapids-4-spark-ml release version is 22.02.0, snapshot version is 23.04.0-SNPASHOT! Please do not update the version-->
<version>23.02.0</version>
<version>22.02.0</version>
</dependency>
</dependencies>

6 changes: 3 additions & 3 deletions examples/ML+DL-Examples/Spark-cuML/pca/spark-submit.sh
@@ -16,8 +16,8 @@
#

# Note that the last rapids-4-spark-ml release version is 22.02.0, snapshot version is 23.04.0-SNPASHOT, please do not update the version in release
ML_JAR=/root/.m2/repository/com/nvidia/rapids-4-spark-ml_2.12/23.04.0-SNAPSHOT/rapids-4-spark-ml_2.12-23.04.0-SNAPSHOT.jar
PLUGIN_JAR=/root/.m2/repository/com/nvidia/rapids-4-spark_2.12/23.10.0-SNAPSHOT/rapids-4-spark_2.12-23.10.0-SNAPSHOT.jar
ML_JAR=/root/.m2/repository/com/nvidia/rapids-4-spark-ml_2.12/22.02.0/rapids-4-spark-ml_2.12-22.02.0.jar
PLUGIN_JAR=/root/.m2/repository/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar
# Note: The last rapids-4-spark-ml release version is 22.02.0, snapshot version is 23.04.0-SNAPSHOT.

$SPARK_HOME/bin/spark-submit \
@@ -40,4 +40,4 @@ $SPARK_HOME/bin/spark-submit \
--conf spark.network.timeout=1000s \
--jars $ML_JAR,$PLUGIN_JAR \
--class com.nvidia.spark.examples.pca.Main \
/workspace/target/PCAExample-23.10.0-SNAPSHOT.jar
/workspace/target/PCAExample-23.12.0-SNAPSHOT.jar
@@ -22,7 +22,7 @@
"import os\n",
"# Change to your cluster ip:port and directories\n",
"SPARK_MASTER_URL = os.getenv(\"SPARK_MASTER_URL\", \"spark:your-ip:port\")\n",
"RAPIDS_JAR = os.getenv(\"RAPIDS_JAR\", \"/your-path/rapids-4-spark_2.12-23.10.0.jar\")\n"
"RAPIDS_JAR = os.getenv(\"RAPIDS_JAR\", \"/your-path/rapids-4-spark_2.12-23.12.0.jar\")\n"
]
},
{
2 changes: 1 addition & 1 deletion examples/UDF-Examples/RAPIDS-accelerated-UDFs/README.md
@@ -163,7 +163,7 @@ then do the following inside the Docker container.

### Get jars from Maven Central

[rapids-4-spark_2.12-23.10.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)
[rapids-4-spark_2.12-23.12.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar)


### Launch a local mode Spark
4 changes: 2 additions & 2 deletions examples/UDF-Examples/RAPIDS-accelerated-UDFs/pom.xml
@@ -25,7 +25,7 @@
user defined functions for use with the RAPIDS Accelerator
for Apache Spark
</description>
<version>23.10.0</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<maven.compiler.source>1.8</maven.compiler.source>
@@ -37,7 +37,7 @@
<cuda.version>cuda11</cuda.version>
<scala.binary.version>2.12</scala.binary.version>
<!-- Depends on release version, Snapshot version is not published to the Maven Central -->
<rapids4spark.version>23.10.0</rapids4spark.version>
<rapids4spark.version>23.12.0</rapids4spark.version>
<spark.version>3.1.1</spark.version>
<scala.version>2.12.15</scala.version>
<udf.native.build.path>${project.build.directory}/cpp-build</udf.native.build.path>
@@ -16,7 +16,7 @@

cmake_minimum_required(VERSION 3.23.1 FATAL_ERROR)

file(DOWNLOAD https://raw.githubusercontent.com/rapidsai/rapids-cmake/branch-23.10/RAPIDS.cmake
file(DOWNLOAD https://raw.githubusercontent.com/rapidsai/rapids-cmake/branch-23.12/RAPIDS.cmake
${CMAKE_BINARY_DIR}/RAPIDS.cmake)
include(${CMAKE_BINARY_DIR}/RAPIDS.cmake)

@@ -32,7 +32,7 @@ if(DEFINED GPU_ARCHS)
endif()
rapids_cuda_init_architectures(UDFEXAMPLESJNI)

project(UDFEXAMPLESJNI VERSION 23.10.0 LANGUAGES C CXX CUDA)
project(UDFEXAMPLESJNI VERSION 23.12.0 LANGUAGES C CXX CUDA)

option(PER_THREAD_DEFAULT_STREAM "Build with per-thread default stream" OFF)
option(BUILD_UDF_BENCHMARKS "Build the benchmarks" OFF)
@@ -84,10 +84,10 @@ set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -w --expt-extended-lambda --expt-relax
set(CUDA_USE_STATIC_CUDA_RUNTIME OFF)

rapids_cpm_init()
rapids_cpm_find(cudf 23.10.00
rapids_cpm_find(cudf 23.12.00
CPM_ARGS
GIT_REPOSITORY https://github.com/rapidsai/cudf.git
GIT_TAG branch-23.10
GIT_TAG branch-23.12
GIT_SHALLOW TRUE
SOURCE_SUBDIR cpp
OPTIONS "BUILD_TESTS OFF"
1 change: 1 addition & 0 deletions examples/UDF-Examples/Spark-cuSpatial/gpu-run.sh
@@ -32,6 +32,7 @@ rm -rf $DATA_OUT_PATH
JARS=$ROOT_PATH/jars

JARS_PATH=${JARS_PATH:-$JARS/rapids-4-spark_2.12-23.02.0.jar,$JARS/spark-cuspatial-23.02.0.jar}

$SPARK_HOME/bin/spark-submit --master spark://$HOSTNAME:7077 \
--name "Gpu Spatial Join UDF" \
--executor-memory 20G \
@@ -9,7 +9,7 @@
"source": [
"from pyspark.sql import SparkSession\n",
"import os\n",
"jarsPath = os.getenv(\"JARS_PATH\", \"/data/cuspatial_data/jars/rapids-4-spark_2.12-23.02.0-SNAPSHOT.jar,/data/cuspatial_data/jars/spark-cuspatial-23.02.0.jar\")\n",
"jarsPath = os.getenv(\"JARS_PATH\", \"/data/cuspatial_data/jars/rapids-4-spark_2.12-23.02.0.jar,/data/cuspatial_data/jars/spark-cuspatial-23.02.0.jar\")\n",
"spark = SparkSession.builder \\\n",
" .config(\"spark.jars\", jarsPath) \\\n",
" .config(\"spark.sql.adaptive.enabled\", \"false\") \\\n",
3 changes: 2 additions & 1 deletion examples/UDF-Examples/Spark-cuSpatial/pom.xml
@@ -24,12 +24,13 @@
<name>UDF of the cuSpatial case for the RAPIDS Accelerator</name>
<description>The RAPIDS accelerated user defined function of the cuSpatial case
for use with the RAPIDS Accelerator for Apache Spark</description>
<version>23.02.0</version>
<version>23.12.0-SNAPSHOT</version>

<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<java.major.version>8</java.major.version>
<!--The last compatible plugin version is v23.02-->
<rapids.version>23.02.0</rapids.version>
<scala.binary.version>2.12</scala.binary.version>
<spark.version>3.2.0</spark.version>
@@ -73,7 +73,7 @@
"Setting default log level to \"WARN\".\n",
"To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).\n",
"2022-11-30 06:57:40,550 WARN resource.ResourceUtils: The configuration of cores (exec = 2 task = 1, runnable tasks = 2) will result in wasted resources due to resource gpu limiting the number of runnable tasks per executor to: 1. Please adjust your configuration.\n",
"2022-11-30 06:57:54,195 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator 23.10.0 using cudf 23.10.0.\n",
"2022-11-30 06:57:54,195 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator 23.12.0 using cudf 23.12.0.\n",
"2022-11-30 06:57:54,210 WARN rapids.RapidsPluginUtils: spark.rapids.sql.multiThreadedRead.numThreads is set to 20.\n",
"2022-11-30 06:57:54,214 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator is enabled, to disable GPU support set `spark.rapids.sql.enabled` to false.\n",
"2022-11-30 06:57:54,214 WARN rapids.RapidsPluginUtils: spark.rapids.sql.explain is set to `NOT_ON_GPU`. Set it to 'NONE' to suppress the diagnostics logging about the query placement on the GPU.\n",
@@ -6,18 +6,18 @@
"source": [
"## Prerequirement\n",
"### 1. Download data\n",
"Dataset is derived from Fannie Mae’s [Single-Family Loan Performance Data](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html) with all rights reserved by Fannie Mae. Refer to these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.10/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset.\n",
"Dataset is derived from Fannie Mae’s [Single-Family Loan Performance Data](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html) with all rights reserved by Fannie Mae. Refer to these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.12/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset.\n",
"\n",
"### 2. Download needed jars\n",
"* [rapids-4-spark_2.12-23.10.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)\n",
"* [rapids-4-spark_2.12-23.12.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar)\n",
"\n",
"\n",
"### 3. Start Spark Standalone\n",
"Before running the script, please setup Spark standalone mode\n",
"\n",
"### 4. Add ENV\n",
"```\n",
"$ export SPARK_JARS=rapids-4-spark_2.12-23.10.0.jar\n",
"$ export SPARK_JARS=rapids-4-spark_2.12-23.12.0.jar\n",
"$ export PYSPARK_DRIVER_PYTHON=jupyter \n",
"$ export PYSPARK_DRIVER_PYTHON_OPTS=notebook\n",
"```\n",
@@ -63,7 +63,7 @@
"Setting default log level to \"WARN\".\n",
"To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).\n",
"2022-11-25 09:34:43,952 WARN resource.ResourceUtils: The configuration of cores (exec = 4 task = 1, runnable tasks = 4) will result in wasted resources due to resource gpu limiting the number of runnable tasks per executor to: 1. Please adjust your configuration.\n",
"2022-11-25 09:34:58,155 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator 23.10.0 using cudf 23.10.0.\n",
"2022-11-25 09:34:58,155 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator 23.12.0 using cudf 23.12.0.\n",
"2022-11-25 09:34:58,171 WARN rapids.RapidsPluginUtils: spark.rapids.sql.multiThreadedRead.numThreads is set to 20.\n",
"2022-11-25 09:34:58,175 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator is enabled, to disable GPU support set `spark.rapids.sql.enabled` to false.\n",
"2022-11-25 09:34:58,175 WARN rapids.RapidsPluginUtils: spark.rapids.sql.explain is set to `NOT_ON_GPU`. Set it to 'NONE' to suppress the diagnostics logging about the query placement on the GPU.\n"
@@ -84,7 +84,7 @@
"22/11/24 06:14:06 INFO org.apache.spark.SparkEnv: Registering BlockManagerMaster\n",
"22/11/24 06:14:06 INFO org.apache.spark.SparkEnv: Registering BlockManagerMasterHeartbeat\n",
"22/11/24 06:14:06 INFO org.apache.spark.SparkEnv: Registering OutputCommitCoordinator\n",
"22/11/24 06:14:07 WARN com.nvidia.spark.rapids.RapidsPluginUtils: RAPIDS Accelerator 23.10.0 using cudf 23.10.0.\n",
"22/11/24 06:14:07 WARN com.nvidia.spark.rapids.RapidsPluginUtils: RAPIDS Accelerator 23.12.0 using cudf 23.12.0.\n",
"22/11/24 06:14:07 WARN com.nvidia.spark.rapids.RapidsPluginUtils: spark.rapids.sql.multiThreadedRead.numThreads is set to 20.\n",
"22/11/24 06:14:07 WARN com.nvidia.spark.rapids.RapidsPluginUtils: RAPIDS Accelerator is enabled, to disable GPU support set `spark.rapids.sql.enabled` to false.\n",
"22/11/24 06:14:07 WARN com.nvidia.spark.rapids.RapidsPluginUtils: spark.rapids.sql.explain is set to `NOT_ON_GPU`. Set it to 'NONE' to suppress the diagnostics logging about the query placement on the GPU.\n"
@@ -16,18 +16,18 @@
"source": [
"## Prerequirement\n",
"### 1. Download data\n",
"<!-- Refer these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.10/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset -->\n",
"Refer to these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.10/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset.\n",
"<!-- Refer these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.12/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset -->\n",
"Refer to these [instructions](https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.12/docs/get-started/xgboost-examples/dataset/mortgage.md) to download the dataset.\n",
"\n",
"### 2. Download needed jars\n",
"* [rapids-4-spark_2.12-23.10.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.10.0/rapids-4-spark_2.12-23.10.0.jar)\n",
"* [rapids-4-spark_2.12-23.12.0.jar](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.0/rapids-4-spark_2.12-23.12.0.jar)\n",
"\n",
"### 3. Start Spark Standalone\n",
"Before Running the script, please setup Spark standalone mode\n",
"\n",
"### 4. Add ENV\n",
"```\n",
"$ export SPARK_JARS=rapids-4-spark_2.12-23.10.0.jar\n",
"$ export SPARK_JARS=rapids-4-spark_2.12-23.12.0.jar\n",
"\n",
"```\n",
"\n",
2 changes: 1 addition & 1 deletion examples/XGBoost-Examples/pom.xml
@@ -38,7 +38,7 @@

<properties>
<encoding>UTF-8</encoding>
<xgboost.version>2.0.0-SNAPSHOT</xgboost.version>
<xgboost.version>2.0.0</xgboost.version>
<spark.version>3.1.1</spark.version>
<scala.version>2.12.8</scala.version>
<scala.binary.version>2.12</scala.binary.version>
@@ -62,7 +62,7 @@
"Setting default log level to \"WARN\".\n",
"To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).\n",
"2022-11-30 08:02:10,103 WARN resource.ResourceUtils: The configuration of cores (exec = 2 task = 1, runnable tasks = 2) will result in wasted resources due to resource gpu limiting the number of runnable tasks per executor to: 1. Please adjust your configuration.\n",
"2022-11-30 08:02:23,737 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator 23.10.0 using cudf 23.10.0.\n",
"2022-11-30 08:02:23,737 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator 23.12.0 using cudf 23.12.0.\n",
"2022-11-30 08:02:23,752 WARN rapids.RapidsPluginUtils: spark.rapids.sql.multiThreadedRead.numThreads is set to 20.\n",
"2022-11-30 08:02:23,756 WARN rapids.RapidsPluginUtils: RAPIDS Accelerator is enabled, to disable GPU support set `spark.rapids.sql.enabled` to false.\n",
"2022-11-30 08:02:23,757 WARN rapids.RapidsPluginUtils: spark.rapids.sql.explain is set to `NOT_ON_GPU`. Set it to 'NONE' to suppress the diagnostics logging about the query placement on the GPU.\n",