Custom model no longer supporting scikit-learn 1.3.0 ColumnTransformer #128

ftrifoglio · 2024-11-06T16:53:03Z

I've been debugging the following error, which is raised when using a model for inference. I never got it before with the same deployment procedure.

I suspect that the issue is caused by ColumnTransformer. A string is passed (perhaps the name of the estimator) instead of the instance of the estimator.
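One way to check this suspicion locally is to look at what the fitted ColumnTransformer stores in its transformers_ attribute. The following is only a sketch: find_string_steps is a hypothetical helper, not part of scikit-learn, and it assumes transformers_ holds (name, transformer, columns) tuples. A step that is still a plain string other than "drop" is the kind of entry on which calling .transform would fail with the error below.

```python
from types import SimpleNamespace


def find_string_steps(column_transformer):
    """Return (name, value) for fitted steps whose transformer is still a string.

    Hypothetical helper: it only reads the ``transformers_`` attribute and
    ignores "drop", since dropped columns are never transformed.
    """
    string_steps = []
    for name, transformer, _columns in column_transformer.transformers_:
        if isinstance(transformer, str) and transformer != "drop":
            string_steps.append((name, transformer))
    return string_steps


# Stand-in object so the sketch runs without scikit-learn installed.
# With a real fitted ColumnTransformer, pass that object instead.
fake_ct = SimpleNamespace(
    transformers_=[
        ("num", "passthrough", ["X0", "X1"]),
        ("cat", object(), ["X2"]),
        ("unused", "drop", ["X3"]),
    ]
)
print(find_string_steps(fake_ct))  # [('num', 'passthrough')]
```

Running the helper on the fitted object in the environment that pickled it, and printing sklearn.__version__ both there and inside predict, may show whether the pickled object and the runtime library disagree about how "passthrough" is stored.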

This is my testing code. I'm using snowflake-ml-python 1.5.4 and scikit-learn 1.3.0.

from snowflake.ml.model import custom_model
from snowflake.ml.model.model_signature import DataType, FeatureSpec, ModelSignature
from snowflake.ml.registry import Registry
import pandas as pd
from loguru import logger


def test_deployment(session, model, data):

    class MyModel(custom_model.CustomModel):
        def __init__(self, context: custom_model.ModelContext) -> None:
            super().__init__(context)

        @custom_model.inference_api
        def predict(self, X: pd.DataFrame) -> pd.DataFrame:
            preds = pd.DataFrame(self.context.model_ref("model").transform(X)).iloc[:, 0]
            res_df = pd.DataFrame({"output": preds})
            return res_df

    my_model = MyModel(
        custom_model.ModelContext(
            models={
                "model": model,
            },
            artifacts={},
        )
    )

    model_signature = ModelSignature(
        inputs=[FeatureSpec(dtype=DataType.FLOAT, name=f"X{i}") for i in range(20)],
        outputs=[FeatureSpec(dtype=DataType.FLOAT, name="output")],
    )

    registry = Registry(session=session)

    logger.info("Logging model to registry...")
    registry.log_model(
        my_model,
        model_name="MyModel",
        version_name="v1",
        python_version="3.11",
        signatures={"predict": model_signature},
        conda_dependencies=["scikit-learn==1.3.0"],
        options=dict(relax_version=False)
    )
    logger.opt(colors=True).info("<green>Done</green>")

    try:
        logger.info("Testing deployed model...")
        mv = registry.get_model("MYMODEL").version("V1")
        mv.run(data, function_name="predict")
        logger.opt(colors=True).info("<green>PASS</green>")
    except Exception as e:
        logger.error(e)
    finally:
        logger.info("Cleaning up...")
        registry.delete_model("MYMODEL")
        logger.opt(colors=True).info("<green>Done</green>")

Failed test results:

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.compose import ColumnTransformer


X, y = make_classification()
X = pd.DataFrame(X, columns=["X" + str(i) for i in range(20)])
pipe = ColumnTransformer(
    [("num", "passthrough", X.columns)],
)
pipe.set_output(transform="pandas")
pipe.fit(X, y)

test_deployment(session, pipe, X)

2024-11-06 16:24:34.893 | INFO | __main__:test_deployment:36 - Logging model to registry...
2024-11-06 16:25:25.593 | INFO | __main__:test_deployment:46 - Done
2024-11-06 16:25:25.594 | INFO | __main__:test_deployment:49 - Testing deployed model...
2024-11-06 16:25:36.653 | ERROR | __main__:test_deployment:54 - (1300) (1304): 01b83179-0105-548e-00ff-7501687c0df7: 100357 (P0000): Python Interpreter Error:
Traceback (most recent call last):
File "/home/udf/4285898201/predict.py", line 78, in infer
predictions_df = runner(input_df[input_cols])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/folders/6l/309364ds7qd0ph4xl9p7c2k40000gq/T/ipykernel_86032/1446644176.py", line 16, in predict
File "/home/udf/4285898201/snowflake-ml-python.zip/snowflake/ml/model/custom_model.py", line 28, in __call__
return self._func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/utils/_set_output.py", line 313, in wrapped
data_to_wrap = f(self, X, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/compose/_column_transformer.py", line 1076, in transform
Xs = self._call_func_on_transformers(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/compose/_column_transformer.py", line 885, in _call_func_on_transformers
return Parallel(n_jobs=self.n_jobs)(jobs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/utils/parallel.py", line 74, in __call__
return super().__call__(iterable_with_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/joblib/parallel.py", line 1918, in __call__
return output if self.return_generator else list(output)
^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/joblib/parallel.py", line 1847, in _get_sequential_output
res = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/utils/parallel.py", line 136, in __call__
return self.function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/pipeline.py", line 1290, in _transform_one
res = transformer.transform(X, **params.transform)
^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'transform'
in function PREDICT with handler predict.infer
2024-11-06 16:25:36.654 | INFO | __main__:test_deployment:56 - Cleaning up...
2024-11-06 16:25:36.825 | INFO | __main__:test_deployment:58 - Done

Same as above, but without the call to set_output, so the ColumnTransformer contains just a no-op ("passthrough").

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.compose import ColumnTransformer


X, y = make_classification()
X = pd.DataFrame(X, columns=["X" + str(i) for i in range(20)])
pipe = ColumnTransformer(
    [("num", "passthrough", X.columns)],
)
pipe.fit(X, y)

test_deployment(session, pipe, X)

2024-11-06 16:29:05.159 | INFO | __main__:test_deployment:36 - Logging model to registry...
2024-11-06 16:29:51.906 | INFO | __main__:test_deployment:46 - Done
2024-11-06 16:29:51.908 | INFO | __main__:test_deployment:49 - Testing deployed model...
2024-11-06 16:29:59.890 | ERROR | __main__:test_deployment:54 - (1300) (1304): 01b8317d-0105-56a8-00ff-7501687c875f: 100357 (P0000): Python Interpreter Error:
Traceback (most recent call last):
File "/home/udf/4285898205/predict.py", line 78, in infer
predictions_df = runner(input_df[input_cols])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/folders/6l/309364ds7qd0ph4xl9p7c2k40000gq/T/ipykernel_86032/2800746830.py", line 16, in predict
File "/home/udf/4285898205/snowflake-ml-python.zip/snowflake/ml/model/custom_model.py", line 28, in __call__
return self._func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/utils/_set_output.py", line 313, in wrapped
data_to_wrap = f(self, X, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/compose/_column_transformer.py", line 1076, in transform
Xs = self._call_func_on_transformers(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/compose/_column_transformer.py", line 885, in _call_func_on_transformers
return Parallel(n_jobs=self.n_jobs)(jobs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/utils/parallel.py", line 74, in __call__
return super().__call__(iterable_with_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/joblib/parallel.py", line 1918, in __call__
return output if self.return_generator else list(output)
^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/joblib/parallel.py", line 1847, in _get_sequential_output
res = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/utils/parallel.py", line 136, in __call__
return self.function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python_udf/4efe4f8655d1cbd717f4875029b07a29850ba16ad61ee320466104b713e358ec/lib/python3.11/site-packages/sklearn/pipeline.py", line 1290, in _transform_one
res = transformer.transform(X, **params.transform)
^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'transform'
in function PREDICT with handler predict.infer
2024-11-06 16:29:59.890 | INFO | __main__:test_deployment:56 - Cleaning up...
2024-11-06 16:30:00.043 | INFO | __main__:test_deployment:58 - Done

Successful test results:

Here I show some successful results using just an estimator and even a pipeline.

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.preprocessing import MinMaxScaler


X, y = make_classification()
X = pd.DataFrame(X, columns=["X" + str(i) for i in range(20)])
pipe = MinMaxScaler()
pipe.set_output(transform="pandas")
pipe.fit(X, y)

test_deployment(session, pipe, X)

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


X, y = make_classification()
X = pd.DataFrame(X, columns=["X" + str(i) for i in range(20)])
pipe = Pipeline([("mm", MinMaxScaler())])
pipe.set_output(transform="pandas")
pipe.fit(X, y)

test_deployment(session, pipe, X)

ftrifoglio · 2024-11-07T09:59:55Z

The issue seems to be fixed when using scikit-learn 1.5.1 and snowflake-ml-python 1.6.4 (both of which are available and compatible in the Snowflake Anaconda Channel for Python 3.11).

This issue affected a production model that had been working for the last 6 months with scikit-learn 1.3.0, which was pinned both in the packages parameter of the stored procedure used for deployment and in the conda_dependencies of registry.log_model within the stored procedure's Python code.

I imagine a mitigation would be to export an environment.yml conda file from a successfully deployed environment and use it for future deployments?
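Short of exporting a full environment.yml, a lighter variant of the same idea is to record the exact version of every package the model touches in a known-good environment and pass all of them as pins. The sketch below uses only the standard library; pinned_requirements is a hypothetical helper, and it assumes the distribution names reported by importlib.metadata match the names used in the Snowflake Anaconda Channel, which may not hold for every package.

```python
from importlib import metadata


def pinned_requirements(packages, lookup=metadata.version):
    """Build "name==version" pins from what is installed right now.

    Hypothetical helper. ``lookup`` is injectable for testing; packages
    that are not installed are skipped rather than guessed.
    """
    pins = []
    for name in packages:
        try:
            pins.append(f"{name}=={lookup(name)}")
        except metadata.PackageNotFoundError:
            continue
    return pins


# Fake lookup so the example output does not depend on the local machine.
fake_versions = {"scikit-learn": "1.3.0", "pandas": "2.0.3"}


def fake_lookup(name):
    try:
        return fake_versions[name]
    except KeyError:
        raise metadata.PackageNotFoundError(name)


print(pinned_requirements(["scikit-learn", "pandas", "xgboost"], lookup=fake_lookup))
# ['scikit-learn==1.3.0', 'pandas==2.0.3']
```

The resulting list has the same shape as the conda_dependencies argument used in the test code above. Including indirect dependencies such as numpy, scipy and joblib in the input list is what would make this closer to a full environment export.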

sfc-gh-sdas · 2024-11-22T06:34:03Z

Glad to know that you were able to fix the issue. But it is also important that once production runs, it continues to run. Could you please tell us a bit more about your production setup? Are you training a new model within a sproc every time and trying to log it? And then, all of a sudden one day, log_model() stopped working. Am I right?

ftrifoglio · 2024-11-22T10:44:25Z

log_model() always works. No errors there. The error is raised from within the container when calling the successfully deployed model.

For example, in the log above you see that Logging model to registry... is always followed by Done. The error occurs after Testing deployed model..., which is when we call run() on the model.
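Because the failure only appears at run() time inside the runtime, it can help to make predict fail early with a clearer message when the library version there is not the one the model was built with. This is a minimal sketch using only the standard library; check_runtime_version is a hypothetical helper (not a Snowflake or scikit-learn API) that would be called at the top of predict.

```python
from importlib import metadata


def check_runtime_version(package, expected, lookup=metadata.version):
    """Raise a clear error if the installed version differs from the pin.

    Hypothetical helper. ``lookup`` is injectable so the check can be
    exercised without the package being installed.
    """
    try:
        installed = lookup(package)
    except metadata.PackageNotFoundError:
        raise RuntimeError(f"{package} is not installed in this runtime") from None
    if installed != expected:
        raise RuntimeError(
            f"{package} {installed} is installed, but the model expects {expected}"
        )
    return installed


# Simulated runtime that has the expected version.
print(check_runtime_version("scikit-learn", "1.3.0", lookup=lambda _: "1.3.0"))
# 1.3.0
```

With a mismatch, the error returned by run() would then name the two versions directly instead of surfacing as an AttributeError deep inside scikit-learn.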

My setup consists of two stored procedures: one for training the model, which dumps two scikit-learn objects (a preprocessing pipeline and a model) to a stage, and one for deploying it, which loads the objects from the stage, instantiates a custom model with them and logs it to the registry.

In the training procedure I've pinned these dependencies:

PACKAGES=('snowflake-snowpark-python==1.19.0', 'scikit-learn==1.5.1', 'pandas==2.0.3', 'xgboost==1.7.3', 'numpy==1.24.3', 'scipy==1.13.1', 'matplotlib==3.9.2', 'mlflow==2.3.1', 'joblib==1.4.2')

In the deployment procedure I've pinned these dependencies:

PACKAGES=('snowflake-snowpark-python==1.19.0', 'snowflake-ml-python==1.6.4', 'scikit-learn==1.5.1', 'pandas==2.0.3', 'xgboost==1.7.3', 'numpy==1.24.3', 'joblib==1.4.2')

I also pass conda_dependencies=["scikit-learn==1.3.0", "pandas==2.0.3", "xgboost==1.7.3", "numpy==1.24.3"] to log_model.
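As quoted, the deployment procedure and the model's conda dependencies pin different scikit-learn versions (1.5.1 and 1.3.0). If that is what actually runs, the objects are handled under one version in the procedure and another in the model runtime. A small check like the following could flag such disagreements before logging; parse_pins and conflicting_pins are hypothetical helpers, and only exact "name==version" pins are considered.

```python
def parse_pins(specs):
    """Map package name -> pinned version for "name==version" strings."""
    pins = {}
    for spec in specs:
        name, sep, version = spec.partition("==")
        if sep:  # ignore entries without an exact pin
            pins[name.strip().lower()] = version.strip()
    return pins


def conflicting_pins(procedure_packages, conda_dependencies):
    """Return {name: (procedure_version, model_version)} where pins differ."""
    proc = parse_pins(procedure_packages)
    model = parse_pins(conda_dependencies)
    return {
        name: (proc[name], model[name])
        for name in sorted(proc.keys() & model.keys())
        if proc[name] != model[name]
    }


# The two lists as quoted in this comment.
deployment_packages = [
    "snowflake-snowpark-python==1.19.0", "snowflake-ml-python==1.6.4",
    "scikit-learn==1.5.1", "pandas==2.0.3", "xgboost==1.7.3",
    "numpy==1.24.3", "joblib==1.4.2",
]
model_conda_dependencies = [
    "scikit-learn==1.3.0", "pandas==2.0.3", "xgboost==1.7.3", "numpy==1.24.3",
]
print(conflicting_pins(deployment_packages, model_conda_dependencies))
# {'scikit-learn': ('1.5.1', '1.3.0')}
```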

Every month we run both procedures. It worked for, I'd say, 6 consecutive runs and then suddenly stopped working.

If a new image is created every time we run log_model(), then I can see how that can happen, as we're only pinning some dependencies. I would imagine a more robust approach would be to pass an environment.yml, or maybe simply start using Snowflake Container Services?

ftrifoglio changed the title ~~Custom model no longer supporting scikit-learn ColumnTransformer~~ Custom model no longer supporting scikit-learn 1.3.0 ColumnTransformer Nov 7, 2024
