Releases: splicemachine/pysplice
2.8.0-k8
What's new?
This release has 90 commits and a number of major enhancements.
- JWT Support for the Feature Store and MLManager model deployment (@myles-novick, #138)
- MLflow 1.15 upgrade (@Ben-Epstein, #139)
- New native support for MLModel flavors fastai, spacy, and statsmodels (@Ben-Epstein, #139)
- New `feature_exists`, `feature_set_exists`, and `training_view_exists` functions (see the sketch after this list) (@Ben-Epstein, #140 #143)
- Versioning support for training sets (@Ben-Epstein, #144)
- New function to get the features in a feature set (@Ben-Epstein, #146)
- Migration from PySpliceContext artifact store to an HTTP Splice Artifact store for mlflow (@Ben-Epstein, #147)
- Native Feature Search for Jupyter notebook for Feature Store (@Ben-Epstein, #148)
- Extended the `get_training_set` functions to support returning pandas DataFrames and JSON data for users without a Spark session (@Ben-Epstein, #149)
- Function in mlflow to get the deployed models in an environment and their current statuses (@Ben-Epstein, #150)
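A minimal sketch of the new existence checks and the non-Spark return types, assuming a `FeatureStore` client named `fs`; the exact signatures, argument names, and the `return_type` parameter are assumptions, so check the SDK docs for your version.

```python
from splicemachine.features import FeatureStore  # assumed import path

fs = FeatureStore()

# New existence checks (function names from this release; signatures are assumptions)
if not fs.feature_set_exists('retail', 'customer_features'):  # illustrative schema/table
    print('Feature set does not exist yet')

if fs.feature_exists('total_spend') and fs.training_view_exists('churn_view'):
    # get_training_set can now return a pandas DataFrame for users without
    # a Spark session (the return_type parameter name is an assumption)
    df = fs.get_training_set(['total_spend'], return_type='pandas')
```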
Breaking Changes
None. Note, however, that you must be on the matching ml-workflow release for these functions to work, especially the HTTP artifact store.
This release is in tandem with the ml-workflow release
2.7.0-k8
What's new?
Nothing specific to this repo has been added, other than the SDK functions that map to the partnered ml-workflow release.
You can see all changes from the last release here
Breaking Changes
None
2.6.0-k8
What's New?
- New feature set design (#119, @sergioferragut)
- New `attributes` parameter for features that allows key-value pairs (`tags` has been changed to a list; see breaking changes) (#120) (@myles-novick)
- Undeploy Kubernetes function (#121) (@Ben-Epstein)
- Bug Fix: Notebook history tracking was causing errors running mlflow locally (#122) (@Ben-Epstein )
- Delete feature sets is now possible in certain scenarios (#124) (@Ben-Epstein )
- Labels are now allowed in get_training_set without a view, which forces the proper time-consistent joins for training set creation (#125) (@myles-novick )
Breaking Changes
The `tags` parameter no longer accepts a dictionary; it now accepts a list. Existing key-value tags must be moved to the new `attributes` parameter, which accepts a dictionary, as shown below.
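A minimal before/after sketch of the migration, assuming a `FeatureStore` client named `fs`; the `create_feature` call and its parameter names are hypothetical stand-ins for whichever feature-creation API you use.

```python
from splicemachine.features import FeatureStore  # assumed import path

fs = FeatureStore()

# Before 2.6.0-k8, tags accepted key-value pairs (hypothetical call):
# fs.create_feature(..., tags={'team': 'fraud', 'source': 'transactions'})

# From 2.6.0-k8 on, tags is a plain list, and key-value pairs move to attributes
fs.create_feature(
    schema_name='retail',            # illustrative names only
    table_name='customer_features',
    name='total_spend',
    tags=['fraud', 'transactions'],  # now a list of labels
    attributes={'team': 'fraud', 'source': 'transactions'},  # new dict parameter
)
```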
This release is in tandem with the ml-workflow release
Spark3 Release
This release adds Spark3 support to 2.5.1-k8.
No other changes were made beyond adding Spark3 support and removing Spark 2.4 support. All future releases will be Spark3 only.
2.5.0-k8
What's New?
The new Feature Store API!
- (Nearly) full server-side Feature Store API (@myles-novick) (many PRs)
- New APIs for the Feature Store adding functionality (delete features, better authentication, summary statistics) (@myles-novick, @Ben-Epstein) (many PRs)
- Better support for native mlflow model logging calls (@Ben-Epstein)
- `mlflow.watch_job` now throws an exception when the watched job fails (see the sketch after this list) (@Ben-Epstein) (#113)
- Upgrade to Spark3 while maintaining support for Spark2 (@Ben-Epstein) (#109)
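A short sketch of the new failure behavior, assuming the pysplice convention of importing the patched mlflow module from `splicemachine.mlflow_support`; the `deploy_kubernetes` call shown here is a hypothetical source of a job id.

```python
from splicemachine.mlflow_support import *  # assumed import that patches mlflow

job_id = mlflow.deploy_kubernetes(run_id)  # hypothetical call returning a job id
try:
    mlflow.watch_job(job_id)  # streams job logs; now raises if the job fails
except Exception as e:
    print(f'Deployment job failed: {e}')
```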
Breaking Changes
The old Feature Store API may still work, but it is highly recommended to switch to the new Server Side Feature Store API. The client side API will no longer be maintained or supported.
This release is in tandem with the ml-workflow release
There is no upgrade script for this release as no table structures have changed, only new tables have been added.
Patch Release for Feature Store
This is a patch release for 2.4.0-k8
The following features were added:
- Drift detection (@sergioferragut )
- Organized utility modules with helper functions for drift detection and training view SQL creation (@Ben-Epstein, @sergioferragut)
The following were fixed:
- Case sensitivity: The database's case sensitive column and table names were causing searchability issues. To remedy this, all column names, schema names, and table names are stored as UPPERCASE in the metadata, to match the default state of the database storage. (@sergioferragut )
- `datetime.min` (0001-01-01 00:00:00) was causing problems when Spark tried to parse and process it. Because so much of the system runs on Spark, this caused failures down the stack. To remedy this, we've replaced `datetime.min` with `datetime(1900, 1, 1)` (1900-01-01 00:00:00) for unspecified start times on Training Sets (see the sketch below). (@sergioferragut, @Ben-Epstein)
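A minimal illustration of the substitution; the `resolve_start_time` helper is hypothetical and only shows where the new sentinel slots in.

```python
from datetime import datetime

# datetime.min (year 1) breaks Spark's timestamp parsing, so unspecified
# start times are now backed by a Spark-safe sentinel instead
DEFAULT_START_TIME = datetime(1900, 1, 1)

def resolve_start_time(start_time=None):
    """Hypothetical helper: pick a Spark-safe start time for a training set."""
    return start_time if start_time is not None else DEFAULT_START_TIME
```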
2.4.0-k8
What's New?
- K8s deployment has been fixed and stabilized (@Ben-Epstein ) (4521)
- Feature Store API Beta 1 Release (@sergioferragut , @Ben-Epstein ) (#96)
- NSDS Merge Into API (@jpanko1 ) (#95)
- Moved call to get_current_transaction to the server side so users don't need permissions to make that call (@Ben-Epstein ) (#94)
- Better Pandas support and a `fileToTable` function for uploading data to the database (see the sketch after this list) (@Ben-Epstein) (#93)
- `createAndInsertTable` API for NSDS (@Ben-Epstein) (#92)
- MLflow run log history (all cells run, in the order they were executed) automatically recorded at the end of a run (@Ben-Epstein) (#91)
- Case insensitive column names for NSDS (@jpanko1 ) (#86)
- MLFlow model support for pyfunc models (@Ben-Epstein ) (#81)
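A rough sketch of the new data-loading helpers, assuming an active `PySpliceContext` named `splice`; the argument order and the `fileToTable` parameters are assumptions, not documented signatures.

```python
from pyspark.sql import SparkSession
from splicemachine.spark import PySpliceContext

spark = SparkSession.builder.getOrCreate()
splice = PySpliceContext(spark)  # connection details may be required in your environment

# Create a table from a DataFrame and insert its rows in one call
df = spark.createDataFrame([(1, 'a'), (2, 'b')], ['id', 'val'])
splice.createAndInsertTable(df, 'retail.demo_table')  # argument order is an assumption

# Load a local file straight into a database table (parameters are assumptions)
splice.fileToTable('/tmp/data.csv', 'retail.demo_table')
```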
Breaking Changes
- In `mlflow.deploy_db`, the `create_model_table` parameter now defaults to `True` (see the sketch below).
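A minimal sketch of opting out of the new default, assuming the patched mlflow module from `splicemachine.mlflow_support`; the schema, table, and surrounding parameters are illustrative.

```python
from splicemachine.mlflow_support import *  # assumed import that patches mlflow

# create_model_table now defaults to True; pass False explicitly to keep the
# old behavior of deploying into a table you have already created
mlflow.deploy_db(
    'retail',              # schema (illustrative)
    'model_predictions',   # table (illustrative)
    run_id,                # run whose logged model is deployed
    create_model_table=False,
)
```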
This release is in tandem with the ml-workflow release
The upgrade script is available here
2.3.0-k8
What's New?
- The External Native Spark Datasource API is now available (@jpanko1 )
- Added functions to `splicemachine.notebook` to access the Spark UI and the MLflow UI (see the sketch after this list) (@Ben-Epstein)
- Python dependency fixes for the October 2020 pip changes (@Ben-Epstein)
- More graceful errors for unsupported models (#74) (@Ben-Epstein)
- Better checking for spark datatypes (@Ben-Epstein, @ZachC16 )
- Deployment support for non-pipeline models (@Ben-Epstein, @ZachC16 )
- Support for Linear Support Vector Machine Spark Model (@Ben-Epstein, @ZachC16 )
- Better unit testing (@Ben-Epstein @ZachC16)
- New warning for Keras and Spark models when the number of label columns passed in doesn't match the model (@Ben-Epstein, @domclassen)
- Database Deployment Migrated to Server side running on Bobby pod (@abaveja313, @Ben-Epstein )
- Initial K8s deployment code available - known bug with init container hanging, expected to be working in next release (@abaveja313 )
- Models are now logged as MLModels instead of the raw model binary (@abaveja313 )
- Model caching for database deployment (@Ben-Epstein @sergioferragut )
- Fix for artifacts downloading without file extension (@Ben-Epstein )
- Model deployment metadata managed by Bobby (@abaveja313 )
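A short sketch of the new notebook helpers; `get_spark_ui` and `get_mlflow_ui` are the likely function names, but treat the names and parameters as assumptions and check the module's docs.

```python
from splicemachine.notebook import get_spark_ui, get_mlflow_ui  # names assumed

# Render the Spark UI inline in a Jupyter notebook (port is illustrative)
get_spark_ui(port=4040)

# Render the MLflow tracking UI inline
get_mlflow_ui()
```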
BREAKING CHANGES
- The models table no longer exists. The deployed model is instead stored in a new column of the Artifacts table called `database_binary`. You must run the migration scripts to alter the artifacts table, otherwise existing deployments won't work.
- Models currently saved in the database with `log_model` will not be deployable, as we have changed the model saving format from the raw model to MLModel. You must read in the model binary, deserialize it, and re-log the model under a new run (see the sketch below).
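A rough sketch of the re-logging step, assuming the patched mlflow module and that the old binary was pickled; the artifact-retrieval helper and its signature are assumptions.

```python
import cloudpickle
from splicemachine.mlflow_support import *  # assumed import that patches mlflow

# Fetch the raw model binary saved under the old format
# (the retrieval helper and its signature are assumptions)
mlflow.download_artifact('model', local_path='old_model.pkl', run_id=old_run_id)

with open('old_model.pkl', 'rb') as f:
    model = cloudpickle.load(f)  # deserialize the old binary

# Re-log under a new run so it is stored in the MLModel format
with mlflow.start_run():
    mlflow.log_model(model, 'model')
```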
This release is in tandem with the ml-workflow release.
Release 2.2.0
What's New?
- Stronger AWS Sagemaker deployment support using k8s ServiceAccounts
- Model metadata tracking for in-db deployed models using the MODEL_METADATA and LIVE_MODEL_STATUS table and view
- Support for in-db deployment for Keras linear models (LSTMs/RNNs/CNNs not yet supported).
- Support for in-db deployment of XGBoost using the H2O/SKlearn implementations
- SKLearn bug fix with fastnumbers
- SKlearn better support for non-double return types
- Upgrade from pickle -> cloudpickle for sklearn model serialization, adding support for both external and lambda functions inside SKLearn Pipelines
- Moved in-db deployment from a 2-table design to a 1-table design. All features plus the model prediction(s) are stored in a single table
- Support for deploying models to an existing table
- Support for selecting which columns from a table are used in the model prediction. This allows you to deploy models to a "subset" of a table.
- Better support for in-db deployment for sklearn Pipelines that have predict parameters
- `deploy_db` API cleanup: removed the model parameter and made `run_id` required. The model is pulled behind the scenes. The `df` parameter is optional and not required when deploying a model to an existing table (see the sketch below).
- General code cleanup
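A minimal sketch of the cleaned-up call, assuming the patched mlflow module; the schema, table, and `primary_key` values are illustrative, and parameter names other than `run_id` and `df` are assumptions.

```python
from splicemachine.mlflow_support import *  # assumed import that patches mlflow

# New-style call: no model object is passed; run_id is required and the model
# is pulled behind the scenes. df is only needed when creating a new table.
mlflow.deploy_db(
    'retail',             # schema (illustrative)
    'model_predictions',  # table (illustrative)
    run_id,               # required: run whose logged model to deploy
    df=training_df,       # optional: omit when deploying to an existing table
    primary_key={'CUSTOMER_ID': 'INT'},  # illustrative key definition
)
```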
BREAKING CHANGES
- `deploy_db` will no longer work with the old parameters. The new parameter set and order is required.
- `createTable` from the `PySpliceContext` now takes its parameters in the order (dataframe, schema_table_name) instead of the other way around, to match all other APIs in the module (see the sketch below).
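A quick before/after of the reordered call, assuming an active `PySpliceContext` named `splice` and a Spark DataFrame `df`.

```python
# Before 2.2.0: schema_table_name came first
# splice.createTable('retail.demo_table', df)

# From 2.2.0 on: the DataFrame comes first, matching the rest of the module
splice.createTable(df, 'retail.demo_table')
```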
This release is in tandem with the ml-workflow release. Upgrade scripts are attached to that release.
Patch Fix for bad insert logic
Update context.py (#59): fix for DataFrame insert logic