[BUG]: Config.yaml step config only used in first step when calling step multiple times #2145

christianversloot · 2023-12-13T11:19:05Z

Contact Details [Optional]

No response

System Information

zenml 0.50.0

What happened?

We have a pipeline with a step named run_model:

@step
def run_model(X_train: np.ndarray, y_train: np.ndarray, X_test: np.ndarray,
              y_test: np.ndarray, name: str, configuration: Dict):

Using the new pipeline/step syntax, it is called multiple times:

for model in models:
    if model_config[model]['active']:
        run_model(X_train, y_train, X_test, y_test, model, configuration, id=model)

We're using a config.yaml based configuration for the step:

run_model:
    enable_cache: false
    experiment_tracker: "trackername"
    settings:
      experiment_tracker.mlflow:
        experiment_name: "experimentname"
        nested: True

However, the configuration is only used in run_model, not in run_model_2, run_model_3 and run_model_4, of which the names are automatically generated.

Is this a bug?
If not, how can we prevent it without manually specifying the config for every invocation? That would be redundant and not DRY.

Thanks!

Reproduction steps

...

Relevant log output

No response

Code of Conduct

I agree to follow this project's Code of Conduct


Vishal-Padia · 2023-12-22T14:42:21Z

To use the same step instance and configuration, don't specify the id parameter when calling run_model. This will reuse the same step instance each time:

for model in models:
  if model_config[model]['active']:
    run_model(X_train, y_train, X_test, y_test, model, configuration)

Or I think you can create the step instance once and reuse it:

model_step = run_model.with_id("model")

for model in models:
  if model_config[model]['active']:
    model_step(X_train, y_train, X_test, y_test, model, configuration)

This isn't a bug - it's just creating new step instances each time run_model is called with a different id. To reuse the configuration, you need to reuse the same step instance.

ConX · 2024-08-22T01:13:42Z

It's not clear whether this is considered a bug or not.

Like @christianversloot, I expected the configuration to be used across all invocations of a step, but it isn't. I think it would be best if, by default, all invocations of a step used the same configuration regardless of their dynamic suffix (i.e., _1, _2, etc.).

In my case, I have a step that I invoke twice, passing two different values for one of its parameters and expecting all other parameters to be shared as defined in the configuration. My current workaround is to use YAML anchor notation. For example:

steps:
  my_step:
    parameters: &my_step_params
      shared_param_1: "value_1"
      shared_param_2: "value_2"
  my_step_2:
    parameters: *my_step_params
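
For anyone who builds the configuration in Python rather than by hand, the duplication that the anchor performs could also be done programmatically. The following is only a sketch: `replicate_step_config` is a hypothetical helper, not part of ZenML, and it assumes the auto-generated invocation names follow the `_2`, `_3`, ... suffix pattern described in this thread.

```python
import copy
from typing import Any, Dict


def replicate_step_config(
    steps: Dict[str, Dict[str, Any]], base_name: str, invocations: int
) -> Dict[str, Dict[str, Any]]:
    """Copy the config of `base_name` to `base_name_2` ... `base_name_<invocations>`.

    Hypothetical helper; the suffix scheme is an assumption taken from the thread.
    """
    expanded = copy.deepcopy(steps)
    for index in range(2, invocations + 1):
        invocation_id = f"{base_name}_{index}"
        # setdefault keeps any per-invocation config that was written by hand.
        expanded.setdefault(invocation_id, copy.deepcopy(steps[base_name]))
    return expanded


steps = {
    "my_step": {
        "parameters": {"shared_param_1": "value_1", "shared_param_2": "value_2"}
    }
}
expanded = replicate_step_config(steps, "my_step", invocations=2)
print(sorted(expanded))  # ['my_step', 'my_step_2']
```

The resulting dictionary has the same shape as the `steps:` section above and would still need to be written out or passed to the pipeline in whatever way your setup already uses.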

schustmi · 2024-08-27T07:23:20Z

This is on our roadmap: we will either apply the configuration to all invocations by default or provide a wildcard syntax that can match multiple step invocations. Until that is done, the suggested solution is the YAML anchors mentioned by @ConX.
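
To make the wildcard idea concrete, here is a rough user-side sketch of how pattern keys could be resolved against invocation names before the configuration is applied. This is not a ZenML feature or API: `resolve_wildcards` and the `run_model*` key are hypothetical, and the shell-style matching from Python's standard `fnmatch` module is just one possible choice of syntax.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, Iterable


def resolve_wildcards(
    steps: Dict[str, Dict[str, Any]], invocation_ids: Iterable[str]
) -> Dict[str, Dict[str, Any]]:
    """Map each invocation ID to the first config whose key matches it.

    Hypothetical helper: an exact key takes precedence over a pattern key.
    """
    resolved: Dict[str, Dict[str, Any]] = {}
    for invocation_id in invocation_ids:
        if invocation_id in steps:
            resolved[invocation_id] = steps[invocation_id]
            continue
        for pattern, config in steps.items():
            # Shell-style match, e.g. "run_model*" matches "run_model_2".
            if fnmatchcase(invocation_id, pattern):
                resolved[invocation_id] = config
                break
    return resolved


steps = {"run_model*": {"enable_cache": False, "experiment_tracker": "trackername"}}
ids = ["run_model", "run_model_2", "run_model_3", "load_data"]
resolved = resolve_wildcards(steps, ids)
print(sorted(resolved))  # ['run_model', 'run_model_2', 'run_model_3']
```

Invocations that match no key (here `load_data`) are simply left out, so they would fall back to whatever defaults apply.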

christianversloot added the "bug" label (Something isn't working) Dec 13, 2023

strickvl self-assigned this Feb 5, 2024
