Add catboost classifier support (#1403)
Co-authored-by: Theofilos Papapanagiotou <[email protected]>
krishanbhasin-gc and theofpa authored Oct 3, 2023
1 parent 9b9580d commit 90c3520
Showing 20 changed files with 736 additions and 11 deletions.
1 change: 1 addition & 0 deletions .github/workflows/tests.yml
@@ -92,6 +92,7 @@ jobs:
- huggingface
- alibi-explain
- alibi-detect
- catboost
is-pr:
- ${{ github.event_name == 'pull_request' }}
exclude:
2 changes: 2 additions & 0 deletions README.md
@@ -75,6 +75,7 @@ Out of the box, MLServer provides support for:
| XGBoost || [MLServer XGBoost](./runtimes/xgboost) |
| Spark MLlib || [MLServer MLlib](./runtimes/mllib) |
| LightGBM || [MLServer LightGBM](./runtimes/lightgbm) |
| CatBoost || [MLServer CatBoost](./runtimes/catboost) |
| Tempo || [`github.com/SeldonIO/tempo`](https://github.com/SeldonIO/tempo) |
| MLflow || [MLServer MLflow](./runtimes/mlflow) |
| Alibi-Detect || [MLServer Alibi Detect](./runtimes/alibi-detect) |
@@ -91,6 +92,7 @@ MLServer to start serving your machine learning models.
- [Serving a `scikit-learn` model](./docs/examples/sklearn/README.md)
- [Serving a `xgboost` model](./docs/examples/xgboost/README.md)
- [Serving a `lightgbm` model](./docs/examples/lightgbm/README.md)
- [Serving a `catboost` model](./docs/examples/catboost/README.md)
- [Serving a `tempo` pipeline](./docs/examples/tempo/README.md)
- [Serving a custom model](./docs/examples/custom/README.md)
- [Serving an `alibi-detect` model](./docs/examples/alibi-detect/README.md)
190 changes: 190 additions & 0 deletions docs/examples/catboost/README.ipynb
@@ -0,0 +1,190 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Serving CatBoost models\n",
"\n",
"Out of the box, `mlserver` supports the deployment and serving of `catboost` models.\n",
"By default, it will assume that these models have been [serialised using the `save_model()` method](https://catboost.ai/en/docs/concepts/python-reference_catboost_save_model).\n",
"\n",
"In this example, we will cover how we can train and serialise a simple model, to then serve it using `mlserver`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Training\n",
"\n",
"To test the CatBoost Server, first we need to generate a simple CatBoost model using Python."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from catboost import CatBoostClassifier\n",
"\n",
"train_data = np.random.randint(0, 100, size=(100, 10))\n",
"train_labels = np.random.randint(0, 2, size=(100))\n",
"\n",
"model = CatBoostClassifier(iterations=2,\n",
" depth=2,\n",
" learning_rate=1,\n",
" loss_function='Logloss',\n",
" verbose=True)\n",
"model.fit(train_data, train_labels)\n",
"model.save_model('model.cbm')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our model will be persisted as a file named `model.cbm`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Serving\n",
"\n",
"Now that we have trained and saved our model, the next step will be to serve it using `mlserver`. \n",
"For that, we will need to create 2 configuration files: \n",
"\n",
"- `settings.json`: holds the configuration of our server (e.g. ports, log level, etc.).\n",
"- `model-settings.json`: holds the configuration of our model (e.g. input type, runtime to use, etc.)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `settings.json`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile settings.json\n",
"{\n",
" \"debug\": \"true\"\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `model-settings.json`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile model-settings.json\n",
"{\n",
" \"name\": \"catboost\",\n",
" \"implementation\": \"mlserver_catboost.CatboostModel\",\n",
" \"parameters\": {\n",
" \"uri\": \"./model.cbm\",\n",
" \"version\": \"v0.1.0\"\n",
" }\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Start serving our model\n",
"\n",
"Now that we have our config in-place, we can start the server by running `mlserver start .`. This needs to either be ran from the same directory where our config files are or pointing to the folder where they are.\n",
"\n",
"```shell\n",
"mlserver start .\n",
"```\n",
"\n",
"Since this command will start the server and block the terminal, waiting for requests, this will need to be ran in the background on a separate terminal."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Send test inference request\n",
"\n",
"We now have our model being served by `mlserver`.\n",
"To make sure that everything is working as expected, let's send a request from our test set.\n",
"\n",
"For that, we can use the Python types that `mlserver` provides out of the box, or we can build our request manually."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"import requests\n",
"import numpy as np\n",
"\n",
"test_data = np.random.randint(0, 100, size=(1, 10))\n",
"\n",
"x_0 = test_data[0:1]\n",
"inference_request = {\n",
" \"inputs\": [\n",
" {\n",
" \"name\": \"predict-prob\",\n",
" \"shape\": x_0.shape,\n",
" \"datatype\": \"FP32\",\n",
" \"data\": x_0.tolist()\n",
" }\n",
" ]\n",
"}\n",
"\n",
"endpoint = \"http://localhost:8080/v2/models/catboost/versions/v0.1.0/infer\"\n",
"response = requests.post(endpoint, json=inference_request)\n",
"\n",
"print(response.json())"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
104 changes: 104 additions & 0 deletions docs/examples/catboost/README.md
@@ -0,0 +1,104 @@
# Serving CatBoost models

Out of the box, `mlserver` supports the deployment and serving of `catboost` models.
By default, it will assume that these models have been [serialised using the `save_model()` method](https://catboost.ai/en/docs/concepts/python-reference_catboost_save_model).

In this example, we will cover how we can train and serialise a simple model, to then serve it using `mlserver`.

## Training

To test the CatBoost Server, first we need to generate a simple CatBoost model using Python.


```python
import numpy as np
from catboost import CatBoostClassifier

train_data = np.random.randint(0, 100, size=(100, 10))
train_labels = np.random.randint(0, 2, size=(100))

model = CatBoostClassifier(iterations=2,
                           depth=2,
                           learning_rate=1,
                           loss_function='Logloss',
                           verbose=True)
model.fit(train_data, train_labels)
model.save_model('model.cbm')
```

Our model will be persisted as a file named `model.cbm`.

## Serving

Now that we have trained and saved our model, the next step will be to serve it using `mlserver`.
For that, we will need to create 2 configuration files:

- `settings.json`: holds the configuration of our server (e.g. ports, log level, etc.).
- `model-settings.json`: holds the configuration of our model (e.g. input type, runtime to use, etc.).

### `settings.json`


```python
%%writefile settings.json
{
    "debug": "true"
}
```

### `model-settings.json`


```python
%%writefile model-settings.json
{
    "name": "catboost",
    "implementation": "mlserver_catboost.CatboostModel",
    "parameters": {
        "uri": "./model.cbm",
        "version": "v0.1.0"
    }
}
```
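
Outside a notebook, the `%%writefile` magic is not available. The following is a minimal sketch, using only Python's standard library, of writing the same two files from a plain script. The `write_config` helper and its `directory` argument are illustrative names and are not part of MLServer.

```python
import json
from pathlib import Path


def write_config(directory: str = ".") -> None:
    """Write settings.json and model-settings.json into `directory`.

    Hypothetical helper: the file contents mirror the two cells above.
    """
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)

    # Server-level configuration (same value as the settings.json cell).
    settings = {"debug": "true"}

    # Model-level configuration (same values as the model-settings.json cell).
    model_settings = {
        "name": "catboost",
        "implementation": "mlserver_catboost.CatboostModel",
        "parameters": {"uri": "./model.cbm", "version": "v0.1.0"},
    }

    (target / "settings.json").write_text(json.dumps(settings, indent=4))
    (target / "model-settings.json").write_text(json.dumps(model_settings, indent=4))
```

Calling `write_config(".")` from the folder that holds `model.cbm` should leave the directory in the same state as running the two notebook cells.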

### Start serving our model

Now that we have our config in place, we can start the server by running `mlserver start .`. This command needs to be run either from the directory that contains our config files, or with the path to that directory as its argument.

```shell
mlserver start .
```

Since this command will start the server and block the terminal while it waits for requests, it will need to be run in the background or in a separate terminal.
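
When the server is started from a script rather than by hand, it can help to wait until it is ready before sending requests. The following is a minimal sketch using only the standard library. It assumes the server listens on `localhost:8080` (the port used by the request in this example) and exposes the V2 inference protocol's readiness endpoint at `/v2/health/ready`. The `wait_until_ready` helper name and its parameters are illustrative, not part of MLServer.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(
    base_url: str = "http://localhost:8080",
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll the readiness endpoint until it returns 200 or `timeout` elapses.

    Assumption: the server implements `GET /v2/health/ready`.
    """
    deadline = time.monotonic() + timeout
    url = f"{base_url}/v2/health/ready"
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timed out, or a non-2xx answer: retry.
            pass
        time.sleep(interval)
    return False
```

A script could call `wait_until_ready()` after launching `mlserver start .` in the background and only proceed to the inference request once it returns `True`.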

### Send test inference request

We now have our model being served by `mlserver`.
To make sure that everything is working as expected, let's send a request from our test set.

For that, we can use the Python types that `mlserver` provides out of the box, or we can build our request manually.


```python
import requests
import numpy as np

test_data = np.random.randint(0, 100, size=(1, 10))

x_0 = test_data[0:1]
inference_request = {
    "inputs": [
        {
            "name": "predict-prob",
            "shape": x_0.shape,
            "datatype": "FP32",
            "data": x_0.tolist()
        }
    ]
}

endpoint = "http://localhost:8080/v2/models/catboost/versions/v0.1.0/infer"
response = requests.post(endpoint, json=inference_request)

print(response.json())
```
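
The request body above follows the V2 inference protocol: a list of named input tensors, each with a `shape`, a `datatype` and its `data`. The following sketch builds the same payload from plain Python lists, without NumPy, and reads the first output tensor from a response body. The helper names `build_inference_request` and `first_output_data` are hypothetical, and the response layout (`outputs[0]["data"]`) is the one the V2 protocol describes; the exact output name and datatype returned by the CatBoost runtime are not shown here.

```python
from typing import Any, Dict, List


def build_inference_request(
    rows: List[List[float]],
    name: str = "predict-prob",
    datatype: str = "FP32",
) -> Dict[str, Any]:
    """Build a V2 inference payload for a batch of equal-length feature rows."""
    if not rows or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("rows must be a non-empty list of equal-length lists")
    return {
        "inputs": [
            {
                "name": name,
                # [batch size, number of features]
                "shape": [len(rows), len(rows[0])],
                "datatype": datatype,
                "data": rows,
            }
        ]
    }


def first_output_data(response_body: Dict[str, Any]) -> List[Any]:
    """Return the `data` field of the first output tensor in a V2 response."""
    return response_body["outputs"][0]["data"]
```

The dictionary returned by `build_inference_request` can be passed as the `json=` argument of `requests.post`, and `first_output_data(response.json())` would then extract the predictions.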
8 changes: 8 additions & 0 deletions docs/examples/catboost/model-settings.json
@@ -0,0 +1,8 @@
{
    "name": "catboost",
    "implementation": "mlserver_catboost.CatboostModel",
    "parameters": {
        "uri": "./model.cbm",
        "version": "v0.1.0"
    }
}
Binary file added docs/examples/catboost/model.cbm
Binary file not shown.
3 changes: 3 additions & 0 deletions docs/examples/catboost/settings.json
@@ -0,0 +1,3 @@
{
    "debug": "true"
}
1 change: 1 addition & 0 deletions docs/examples/index.md
@@ -16,6 +16,7 @@ models](./custom/README.md)).
- [Serving Scikit-Learn models](./sklearn/README.md)
- [Serving XGBoost models](./xgboost/README.md)
- [Serving LightGBM models](./lightgbm/README.md)
- [Serving CatBoost models](./catboost/README.md)
- [Serving Tempo pipelines](./tempo/README.md)
- [Serving MLflow models](./mlflow/README.md)
- [Serving custom models](./custom/README.md)
3 changes: 3 additions & 0 deletions docs/runtimes/catboost.md
@@ -0,0 +1,3 @@
```{include} ../../runtimes/catboost/README.md
:relative-docs: ../../docs/
```
22 changes: 11 additions & 11 deletions docs/runtimes/index.md
@@ -22,17 +22,16 @@ class in your `model-settings.json` file.

## Included Inference Runtimes

| Framework | Package Name | Implementation Class | Example | Documentation |
| ------------- | ------------------------ | -------------------------------------------- | ---------------------------------------------------------- | ---------------------------------------------------------------- |
| Scikit-Learn | `mlserver-sklearn` | `mlserver_sklearn.SKLearnModel` | [Scikit-Learn example](../examples/sklearn/README.md) | [MLServer SKLearn](./sklearn) |
| XGBoost | `mlserver-xgboost` | `mlserver_xgboost.XGBoostModel` | [XGBoost example](../examples/xgboost/README.md) | [MLServer XGBoost](./xgboost) |
| HuggingFace | `mlserver-huggingface` | `mlserver_huggingface.HuggingFaceRuntime` | [HuggingFace example](../examples/huggingface/README.md) | [MLServer HuggingFace](./huggingface) |
| Spark MLlib | `mlserver-mllib` | `mlserver_mllib.MLlibModel` | Coming Soon | [MLServer MLlib](./mllib) |
| LightGBM | `mlserver-lightgbm` | `mlserver_lightgbm.LightGBMModel` | [LightGBM example](../examples/lightgbm/README.md) | [MLServer LightGBM](./lightgbm) |
| Tempo | `tempo` | `tempo.mlserver.InferenceRuntime` | [Tempo example](../examples/tempo/README.md) | [`github.com/SeldonIO/tempo`](https://github.com/SeldonIO/tempo) |
| MLflow | `mlserver-mlflow` | `mlserver_mlflow.MLflowRuntime` | [MLflow example](../examples/mlflow/README.md) | [MLServer MLflow](./mlflow) |
| Alibi-Detect | `mlserver-alibi-detect` | `mlserver_alibi_detect.AlibiDetectRuntime` | [Alibi-detect example](../examples/alibi-detect/README.md) | [MLServer Alibi-Detect](./alibi-detect) |
| Alibi-Explain | `mlserver-alibi-explain` | `mlserver_alibi_explain.AlibiExplainRuntime` | Coming Soon | [MLServer Alibi-Explain](./alibi-explain) |
| Framework | Package Name | Implementation Class | Example | Documentation |
| ------------ | ----------------------- | ------------------------------------------ | ---------------------------------------------------------- | ---------------------------------------------------------------- |
| Scikit-Learn | `mlserver-sklearn` | `mlserver_sklearn.SKLearnModel` | [Scikit-Learn example](../examples/sklearn/README.md) | [MLServer SKLearn](./sklearn) |
| XGBoost | `mlserver-xgboost` | `mlserver_xgboost.XGBoostModel` | [XGBoost example](../examples/xgboost/README.md) | [MLServer XGBoost](./xgboost) |
| Spark MLlib | `mlserver-mllib` | `mlserver_mllib.MLlibModel` | Coming Soon | [MLServer MLlib](./mllib) |
| LightGBM | `mlserver-lightgbm` | `mlserver_lightgbm.LightGBMModel` | [LightGBM example](../examples/lightgbm/README.md) | [MLServer LightGBM](./lightgbm) |
| CatBoost | `mlserver-catboost` | `mlserver_catboost.CatboostModel` | [CatBoost example](../examples/catboost/README.md) | [MLServer CatBoost](./catboost) |
| Tempo | `tempo` | `tempo.mlserver.InferenceRuntime` | [Tempo example](../examples/tempo/README.md) | [`github.com/SeldonIO/tempo`](https://github.com/SeldonIO/tempo) |
| MLflow | `mlserver-mlflow` | `mlserver_mlflow.MLflowRuntime` | [MLflow example](../examples/mlflow/README.md) | [MLServer MLflow](./mlflow) |
| Alibi-Detect | `mlserver-alibi-detect` | `mlserver_alibi_detect.AlibiDetectRuntime` | [Alibi-detect example](../examples/alibi-detect/README.md) | [MLServer Alibi-Detect](./alibi-detect) |

```{toctree}
:hidden:
@@ -44,6 +43,7 @@ MLflow <./mlflow>
Tempo <https://tempo.readthedocs.io>
Spark MLlib <./mllib>
LightGBM <./lightgbm>
CatBoost <./catboost>
Alibi-Detect <./alibi-detect>
Alibi-Explain <./alibi-explain>
HuggingFace <./huggingface>