This repository has been archived by the owner on Apr 15, 2022. It is now read-only.
Commit
* cleanup of api
* add context (primary) key to describe feature sets
* optional verbose to print sql in create_training_context
* added get_feature_dataset
* comments
* old code
* i hate upppercase
* commment
* sql format
* i still hate uppercase
* null tx
* sql format
* sql format
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* docs
* verbose
* docs
* docs
* column ordering
* feature param cleanup
* training context features
* removed clean_df
* to_lower
* docs
* better logic
* better logic
* label column validation
* refactor TrainingContext -> TrainingView, Feature Set Context Key -> Feature Set Join Key
* missed one
* exclude 2 more funcs
* docs
* as list
* missed some more
* hashable
* pep
* docs
* docs
* handleinvalid keep
* feature_vector_sql
* get-features_by_name requires names
* exclude members
* return Feature, docs fix
Ben Epstein authored Jan 5, 2021 · 1 parent 66d0ff9 · commit 7660e39
Showing 18 changed files with 610 additions and 241 deletions.
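As a rough illustration of the API changes described in the commit message (the TrainingContext -> TrainingView rename and get_features_by_name now requiring names), here is a minimal sketch. Only the class and function names come from the commit message and docs; the keyword argument and the feature names are illustrative assumptions, not taken from the repository.

.. code-block:: Python

    from pyspark.sql import SparkSession
    from splicemachine.spark import PySpliceContext
    from splicemachine.features import FeatureStore

    spark = SparkSession.builder.getOrCreate()
    splice = PySpliceContext(spark)   # Native Spark DataSource
    fs = FeatureStore(splice)         # Splice Machine Feature Store

    # Per the commit message, get_features_by_name now requires explicit names.
    # The keyword argument and the feature names below are assumptions for illustration only.
    features = fs.get_features_by_name(names=['TOTAL_SPEND', 'NUM_VISITS'])

    # The same commit renames TrainingContext -> TrainingView and
    # "Feature Set Context Key" -> "Feature Set Join Key" throughout the API.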
@@ -21,4 +21,4 @@ formats:
python:
  version: 3.7
  install:
-   - requirements: requirements.txt
+   - requirements: requirements-docs.txt
@@ -13,11 +13,17 @@ If you are running inside of the Splice Machine Cloud Service in a Jupyter Noteb
External Installation
---------------------

-If you would like to install outside of the K8s cluster (and use the ExtPySpliceContext), you can install with
+If you would like to install outside of the K8s cluster (and use the ExtPySpliceContext), you can install the stable build with

.. code-block:: sh

-    sudo pip install pysplice
+    sudo pip install git+http://www.github.com/splicemachine/[email protected]
+
+Or latest with
+
+.. code-block:: sh
+
+    sudo pip install git+http://www.github.com/splicemachine/pysplice

Usage
-----
@@ -28,24 +34,40 @@ This section covers importing and instantiating the Native Spark DataSource

.. tab:: Native Spark DataSource

-    To use the Native Spark DataSource inside of the cloud service, first create a Spark Session and then import your PySpliceContext
+    To use the Native Spark DataSource inside of the `cloud service <https://cloud.splicemachine.io/register?utm_source=pydocs&utm_medium=header&utm_campaign=sandbox>`_, first create a Spark Session and then import your PySpliceContext

    .. code-block:: Python

        from pyspark.sql import SparkSession
        from splicemachine.spark import PySpliceContext
+       from splicemachine.mlflow_support import * # Connects your MLflow session automatically
+       from splicemachine.features import FeatureStore # Splice Machine Feature Store

        spark = SparkSession.builder.getOrCreate()
-       splice = PySpliceContext(spark)
+       splice = PySpliceContext(spark) # The Native Spark Datasource (PySpliceContext) takes a Spark Session
+       fs = FeatureStore(splice) # Create your Feature Store
+       mlflow.register_splice_context(splice) # Gives mlflow native DB connection
+       mlflow.register_feature_store(fs) # Tracks Feature Store work in Mlflow automatically

.. tab:: External Native Spark DataSource

-    To use the External Native Spark DataSource, create a Spark Session with your external Jars configured. Then, import your ExtPySpliceContext and set the necessary parameters
+    To use the External Native Spark DataSource, create a Spark Session with your external Jars configured. Then, import your ExtPySpliceContext and set the necessary parameters.
+    Once created, the functionality is identical to the internal Native Spark Datasource (PySpliceContext)

    .. code-block:: Python

        from pyspark.sql import SparkSession
        from splicemachine.spark import ExtPySpliceContext
+       from splicemachine.mlflow_support import * # Connects your MLflow session automatically
+       from splicemachine.features import FeatureStore # Splice Machine Feature Store

        spark = SparkSession.builder.config('spark.jars', '/path/to/splice_spark2-3.0.0.1962-SNAPSHOT-shaded.jar').config('spark.driver.extraClassPath', 'path/to/Splice/jars/dir/*').getOrCreate()
        JDBC_URL = '' #Set your JDBC URL here. You can get this from the Cloud Manager UI. Make sure to append ';user=<USERNAME>;password=<PASSWORD>' after ';ssl=basic' so you can authenticate in
-       kafka_server = 'kafka-broker-0-' + JDBC_URL.split('jdbc:splice://jdbc-')[1].split(':1527')[0] + ':19092' # Formatting kafka URL from JDBC
+       # The ExtPySpliceContext communicates with the database via Kafka
+       kafka_server = 'kafka-broker-0-' + JDBC_URL.split('jdbc:splice://jdbc-')[1].split(':1527')[0] + ':19092' # Formatting kafka URL from JDBC
        splice = ExtPySpliceContext(spark, JDBC_URL=JDBC_URL, kafkaServers=kafka_server)
+       fs = FeatureStore(splice) # Create your Feature Store
+       mlflow.register_splice_context(splice) # Gives mlflow native DB connection
+       mlflow.register_feature_store(fs) # Tracks Feature Store work in Mlflow automatically
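For clarity on the kafka_server derivation in the snippet above, here is a tiny worked example of the same string manipulation with a made-up JDBC URL; the host name is purely illustrative and not taken from the repository.

.. code-block:: Python

    # Hypothetical JDBC URL (the host name is made up for illustration)
    JDBC_URL = 'jdbc:splice://jdbc-mycluster.splicemachine.io:1527/splicedb;ssl=basic;user=<USERNAME>;password=<PASSWORD>'

    # Same derivation as in the docs above: take the host between 'jdbc:splice://jdbc-' and ':1527',
    # prefix it with 'kafka-broker-0-', and append the Kafka port
    kafka_server = 'kafka-broker-0-' + JDBC_URL.split('jdbc:splice://jdbc-')[1].split(':1527')[0] + ':19092'
    print(kafka_server)  # kafka-broker-0-mycluster.splicemachine.io:19092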
@@ -0,0 +1,19 @@
+py4j==0.10.7.0
+pytest
+mlflow==1.8.0
+pyyaml==5.3.1
+mleap==0.15.0
+graphviz==0.13
+requests
+gorilla==0.3.0
+tqdm==4.43.0
+pyspark-dist-explore==0.1.8
+numpy==1.18.2
+pandas==1.0.3
+scipy==1.4.1
+tensorflow==2.2.1
+pyspark
+h2o-pysparkling-2.4==3.28.1.2-1
+sphinx-tabs
+IPython
+cloudpickle==1.6.0
@@ -14,6 +14,5 @@ scipy==1.4.1
tensorflow==2.2.1
pyspark
h2o-pysparkling-2.4==3.28.1.2-1
-sphinx-tabs
IPython
cloudpickle==1.6.0
@@ -1,3 +1,4 @@
from .feature import Feature
from .feature_set import FeatureSet
from .feature_store import FeatureStore
from .constants import FeatureType
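A quick note on what this __init__ change enables: the four classes can be imported directly from the package. Only the import path below is taken from the diff and the docs above; any further usage of these classes would be an assumption.

.. code-block:: Python

    # Package-level re-exports provided by splicemachine/features/__init__.py
    from splicemachine.features import Feature, FeatureSet, FeatureStore, FeatureType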