From 5fd0510080c1f89932260051bd1a65df9d10a1c7 Mon Sep 17 00:00:00 2001
From: Jayesh Sharma
Date: Thu, 3 Oct 2024 17:33:20 +0530
Subject: [PATCH] clean up
---
 develop/api/actions/create-action.mdx | 3 -
 develop/api/actions/delete-action.mdx | 3 -
 develop/api/actions/get-action.mdx | 3 -
 develop/api/actions/list-actions.mdx | 3 -
 develop/api/actions/update-action.mdx | 3 -
 develop/api/api-reference.mdx | 79 -
 .../create-artifact-version.mdx | 3 -
 .../delete-artifact-version.mdx | 3 -
 .../get-artifact-version.mdx | 3 -
 .../get-artifact-visualization.mdx | 3 -
 .../list-artifact-versions.mdx | 3 -
 .../prune-artifact-versions.mdx | 3 -
 .../update-artifact-version.mdx | 3 -
 develop/api/artifacts/create-artifact.mdx | 3 -
 develop/api/artifacts/delete-artifact.mdx | 3 -
 develop/api/artifacts/get-artifact.mdx | 3 -
 develop/api/artifacts/list-artifacts.mdx | 3 -
 develop/api/artifacts/update-artifact.mdx | 3 -
 develop/api/auth/api-token.mdx | 3 -
 develop/api/auth/device-authorization.mdx | 3 -
 develop/api/auth/logout.mdx | 3 -
 develop/api/auth/token.mdx | 3 -
 .../delete-authorized-device.mdx | 3 -
 .../get-authorization-device.mdx | 3 -
 .../list-authorized-devices.mdx | 3 -
 .../update-authorized-device.mdx | 3 -
 .../verify-authorized-device.mdx | 3 -
 develop/api/builds/delete-build.mdx | 3 -
 develop/api/builds/get-build.mdx | 3 -
 develop/api/builds/list-builds.mdx | 3 -
 .../delete-code-repository.mdx | 3 -
 .../code_repositories/get-code-repository.mdx | 3 -
 .../list-code-repositories.mdx | 3 -
 .../update-code-repository.mdx | 3 -
 develop/api/deployments/delete-deployment.mdx | 3 -
 develop/api/deployments/deployment-logs.mdx | 3 -
 develop/api/deployments/get-deployment.mdx | 3 -
 develop/api/deployments/list-deployments.mdx | 3 -
 develop/api/endpoint/create.mdx | 4 -
 develop/api/endpoint/delete.mdx | 4 -
 develop/api/endpoint/get.mdx | 4 -
 .../api/event-sources/create-event-source.mdx | 3 -
 .../api/event-sources/delete-event-source.mdx | 3 -
 .../api/event-sources/get-event-source.mdx | 3 -
 .../api/event-sources/list-event-sources.mdx | 3 -
 .../api/event-sources/update-event-source.mdx | 3 -
 develop/api/flavors/create-flavor.mdx | 3 -
 develop/api/flavors/delete-flavor.mdx | 3 -
 develop/api/flavors/get-flavor.mdx | 3 -
 develop/api/flavors/list-flavors.mdx | 3 -
 develop/api/flavors/sync-flavors.mdx | 3 -
 develop/api/flavors/update-flavor.mdx | 3 -
 develop/api/health.mdx | 3 -
 .../list-model-version-artifact-links.mdx | 3 -
 .../list-model-version-pipeline-run-links.mdx | 3 -
 ...elete-all-model-version-artifact-links.mdx | 3 -
 .../delete-model-version-artifact-link.mdx | 3 -
 ...delete-model-version-pipeline-run-link.mdx | 3 -
 .../model_versions/delete-model-version.mdx | 3 -
 .../api/model_versions/get-model-version.mdx | 3 -
 .../model_versions/list-model-versions.mdx | 3 -
 .../model_versions/update-model-version.mdx | 3 -
 develop/api/models/delete-model.mdx | 3 -
 develop/api/models/get-model.mdx | 3 -
 develop/api/models/list-model-versions.mdx | 3 -
 develop/api/models/list-models.mdx | 3 -
 develop/api/models/update-model.mdx | 3 -
 develop/api/openapi.json | 29981 ----------------
 develop/api/pipelines/delete-pipeline.mdx | 3 -
 develop/api/pipelines/get-pipeline.mdx | 3 -
 develop/api/pipelines/list-pipeline-runs.mdx | 3 -
 develop/api/pipelines/list-pipelines.mdx | 3 -
 develop/api/pipelines/update-pipeline.mdx | 3 -
 develop/api/plugins/get-flavor.mdx | 3 -
 develop/api/plugins/list-flavors.mdx | 3 -
 develop/api/run_metadata/get-run-metadata.mdx | 3 -
 .../api/run_metadata/list-run-metadata.mdx | 3 -
.../api/run_templates/create-template-run.mdx | 3 - .../api/run_templates/delete-run-template.mdx | 3 - .../api/run_templates/get-run-template.mdx | 3 - .../api/run_templates/list-run-templates.mdx | 3 - .../api/run_templates/update-run-template.mdx | 3 - develop/api/runs/delete-run.mdx | 3 - .../api/runs/get-pipeline-configuration.mdx | 3 - develop/api/runs/get-run-dag.mdx | 3 - develop/api/runs/get-run-status.mdx | 3 - develop/api/runs/get-run-steps.mdx | 3 - develop/api/runs/get-run.mdx | 3 - develop/api/runs/list-runs.mdx | 3 - develop/api/runs/update-run.mdx | 3 - develop/api/schedules/delete-schedule.mdx | 3 - develop/api/schedules/get-schedule.mdx | 3 - develop/api/schedules/list-schedules.mdx | 3 - develop/api/schedules/update-schedule.mdx | 3 - develop/api/secrets/backup-secrets.mdx | 3 - develop/api/secrets/delete-secret.mdx | 3 - develop/api/secrets/get-secret.mdx | 3 - develop/api/secrets/list-secrets.mdx | 3 - develop/api/secrets/restore-secrets.mdx | 3 - develop/api/secrets/update-secret.mdx | 3 - develop/api/server/get-onboarding-state.mdx | 3 - develop/api/server/server-info.mdx | 3 - develop/api/server/version.mdx | 3 - .../api/service_accounts/create-api-key.mdx | 3 - .../create-service-account.mdx | 3 - .../api/service_accounts/delete-api-key.mdx | 3 - .../delete-service-account.mdx | 3 - develop/api/service_accounts/get-api-key.mdx | 3 - .../service_accounts/get-service-account.mdx | 3 - .../api/service_accounts/list-api-keys.mdx | 3 - .../list-service-accounts.mdx | 3 - .../api/service_accounts/rotate-api-key.mdx | 3 - .../api/service_accounts/update-api-key.mdx | 3 - .../update-service-account.mdx | 3 - .../delete-service-connector.mdx | 3 - ...ources-based-on-service-connector-info.mdx | 3 - .../get-service-connector-client.mdx | 3 - .../get-service-connector-type.mdx | 3 - .../get-service-connector.mdx | 3 - .../list-service-connector-types.mdx | 3 - .../list-service-connectors.mdx | 3 - .../update-service-connector.mdx | 3 - ...te-and-verify-service-connector-config.mdx | 3 - .../validate-and-verify-service-connector.mdx | 3 - develop/api/services/create-service.mdx | 3 - develop/api/services/delete-service.mdx | 3 - develop/api/services/get-service.mdx | 3 - develop/api/services/list-services.mdx | 3 - develop/api/services/update-service.mdx | 3 - .../deregister-stack-component.mdx | 3 - .../get-stack-component-types.mdx | 3 - .../stack_components/get-stack-component.mdx | 3 - .../list-stack-components.mdx | 3 - .../update-stack-component.mdx | 3 - develop/api/stacks/delete-stack.mdx | 3 - develop/api/stacks/get-deployed-stack.mdx | 3 - .../stacks/get-stack-deployment-config.mdx | 3 - .../api/stacks/get-stack-deployment-info.mdx | 3 - develop/api/stacks/get-stack.mdx | 3 - develop/api/stacks/list-stacks.mdx | 3 - develop/api/stacks/update-stack.mdx | 3 - develop/api/steps/create-run-step.mdx | 3 - develop/api/steps/get-step-configuration.mdx | 3 - develop/api/steps/get-step-logs.mdx | 3 - develop/api/steps/get-step-status.mdx | 3 - develop/api/steps/get-step.mdx | 3 - develop/api/steps/list-run-steps.mdx | 3 - develop/api/steps/update-step.mdx | 3 - develop/api/tags/create-tag.mdx | 3 - develop/api/tags/delete-tag.mdx | 3 - develop/api/tags/get-tag.mdx | 3 - develop/api/tags/list-tags.mdx | 3 - develop/api/tags/update-tag.mdx | 3 - develop/api/triggers/create-trigger.mdx | 3 - develop/api/triggers/delete-trigger.mdx | 3 - develop/api/triggers/get-trigger.mdx | 3 - develop/api/triggers/list-triggers.mdx | 3 - develop/api/triggers/update-trigger.mdx | 3 - 
develop/api/users/get-current-user.mdx | 3 - develop/api/users/get-user.mdx | 3 - develop/api/users/list-users.mdx | 3 - .../users/update-user-resource-membership.mdx | 3 - develop/api/webhook/webhook.mdx | 3 - develop/api/workspaces/create-build.mdx | 3 - .../api/workspaces/create-code-repository.mdx | 3 - develop/api/workspaces/create-deployment.mdx | 3 - .../create-model-version-artifact-link.mdx | 3 - ...create-model-version-pipeline-run-link.mdx | 3 - .../api/workspaces/create-model-version.mdx | 3 - develop/api/workspaces/create-model.mdx | 3 - .../api/workspaces/create-pipeline-run.mdx | 3 - develop/api/workspaces/create-pipeline.mdx | 3 - .../api/workspaces/create-run-metadata.mdx | 3 - .../api/workspaces/create-run-template.mdx | 3 - develop/api/workspaces/create-schedule.mdx | 3 - develop/api/workspaces/create-secret.mdx | 3 - .../workspaces/create-service-connector.mdx | 3 - develop/api/workspaces/create-service.mdx | 3 - .../api/workspaces/create-stack-component.mdx | 3 - develop/api/workspaces/create-stack.mdx | 3 - develop/api/workspaces/create-workspace.mdx | 3 - develop/api/workspaces/delete-workspace.mdx | 3 - .../workspaces/get-or-create-pipeline-run.mdx | 3 - .../workspaces/get-workspace-statistics.mdx | 3 - develop/api/workspaces/get-workspace.mdx | 3 - develop/api/workspaces/list-runs.mdx | 3 - .../list-service-connector-resources.mdx | 3 - .../api/workspaces/list-workspace-builds.mdx | 3 - .../list-workspace-code-repositories.mdx | 3 - .../workspaces/list-workspace-deployments.mdx | 3 - .../workspaces/list-workspace-pipelines.mdx | 3 - .../list-workspace-run-templates.mdx | 3 - .../list-workspace-service-connectors.mdx | 3 - .../list-workspace-stack-components.mdx | 3 - .../api/workspaces/list-workspace-stacks.mdx | 3 - develop/api/workspaces/list-workspaces.mdx | 3 - develop/api/workspaces/update-workspace.mdx | 3 - develop/getting-started/core-concepts.mdx | 192 - .../common-questions-answered.mdx | 24 - .../deploy-using-huggingface-spaces.mdx | 77 - .../deploy-with-custom-image.mdx | 106 - .../deploying-zenml/deploy-with-docker.mdx | 521 - .../deploying-zenml/deploy-with-helm.mdx | 690 - .../deploying-zenml/deploy-with-zenml-cli.mdx | 334 - .../deploying-zenml/deploying-zenml.mdx | 86 - .../deploying-zenml/introduction.mdx | 4 - .../manage-the-deployed-services.mdx | 19 - .../custom-secret-stores.mdx | 101 - .../troubleshoot-stack-components.mdx | 80 - .../troubleshoot-your-deployed-server.mdx | 97 - ...pgrade-the-version-of-the-zenml-server.mdx | 85 - .../zenml-pro/system-architectures.mdx | 56 - .../zenml-pro/user-management.mdx | 20 - .../deploying-zenml/zenml-pro/zenml-pro.mdx | 29 - .../deploying-zenml/zenml-pro/zenml-saas.mdx | 26 - develop/getting-started/faq.mdx | 75 - develop/getting-started/installation.mdx | 99 - develop/getting-started/introduction.mdx | 172 - .../migration-guide/migration-guide.mdx | 26 - .../migration-guide/migration-zero-forty.mdx | 660 - .../migration-guide/migration-zero-sixty.mdx | 74 - .../migration-guide/migration-zero-thirty.mdx | 17 - .../migration-guide/migration-zero-twenty.mdx | 602 - .../reference/community-and-content.mdx | 36 - develop/mint.json | 869 - develop/sdk/core-sdk/v1/core-_hub.md | 7 - develop/sdk/core-sdk/v1/core-actions.md | 7 - develop/sdk/core-sdk/v1/core-alerter.md | 7 - develop/sdk/core-sdk/v1/core-analytics.md | 7 - develop/sdk/core-sdk/v1/core-annotators.md | 7 - develop/sdk/core-sdk/v1/core-api.md | 7 - .../sdk/core-sdk/v1/core-artifact_stores.md | 7 - develop/sdk/core-sdk/v1/core-artifacts.md | 7 - 
develop/sdk/core-sdk/v1/core-assistant.md | 7 - develop/sdk/core-sdk/v1/core-client.md | 7 - .../core-sdk/v1/core-client_lazy_loader.md | 7 - .../sdk/core-sdk/v1/core-code_repositories.md | 7 - develop/sdk/core-sdk/v1/core-config.md | 7 - develop/sdk/core-sdk/v1/core-console.md | 7 - develop/sdk/core-sdk/v1/core-constants.md | 7 - .../core-sdk/v1/core-container_registries.md | 7 - .../sdk/core-sdk/v1/core-data_validators.md | 7 - develop/sdk/core-sdk/v1/core-entrypoints.md | 7 - develop/sdk/core-sdk/v1/core-enums.md | 7 - develop/sdk/core-sdk/v1/core-environment.md | 7 - develop/sdk/core-sdk/v1/core-event_hub.md | 7 - develop/sdk/core-sdk/v1/core-event_sources.md | 7 - develop/sdk/core-sdk/v1/core-exceptions.md | 7 - .../core-sdk/v1/core-experiment_trackers.md | 7 - .../sdk/core-sdk/v1/core-feature_stores.md | 7 - develop/sdk/core-sdk/v1/core-hooks.md | 3 - .../sdk/core-sdk/v1/core-image_builders.md | 7 - develop/sdk/core-sdk/v1/core-io.md | 7 - develop/sdk/core-sdk/v1/core-lineage_graph.md | 7 - develop/sdk/core-sdk/v1/core-logger.md | 7 - develop/sdk/core-sdk/v1/core-logging.md | 7 - develop/sdk/core-sdk/v1/core-materializers.md | 7 - develop/sdk/core-sdk/v1/core-metadata.md | 7 - develop/sdk/core-sdk/v1/core-model.md | 7 - .../sdk/core-sdk/v1/core-model_deployers.md | 7 - .../sdk/core-sdk/v1/core-model_registries.md | 7 - develop/sdk/core-sdk/v1/core-models.md | 7 - develop/sdk/core-sdk/v1/core-new.md | 7 - develop/sdk/core-sdk/v1/core-orchestrators.md | 7 - develop/sdk/core-sdk/v1/core-pipelines.md | 7 - develop/sdk/core-sdk/v1/core-plugins.md | 7 - .../sdk/core-sdk/v1/core-post_execution.md | 7 - develop/sdk/core-sdk/v1/core-secret.md | 7 - .../core-sdk/v1/core-service_connectors.md | 7 - develop/sdk/core-sdk/v1/core-services.md | 7 - develop/sdk/core-sdk/v1/core-stack.md | 7 - .../sdk/core-sdk/v1/core-stack_deployments.md | 7 - .../sdk/core-sdk/v1/core-step_operators.md | 7 - develop/sdk/core-sdk/v1/core-steps.md | 7 - develop/sdk/core-sdk/v1/core-types.md | 7 - develop/sdk/core-sdk/v1/core-utils.md | 7 - develop/sdk/core-sdk/v1/core-zen_server.md | 7 - develop/sdk/core-sdk/v1/core-zen_stores.md | 7 - .../v1/integrations-airflow.md | 7 - .../v1/integrations-argilla.md | 7 - .../integration-docs/v1/integrations-aws.md | 7 - .../integration-docs/v1/integrations-azure.md | 7 - .../v1/integrations-bentoml.md | 7 - .../v1/integrations-bitbucket.md | 7 - .../integration-docs/v1/integrations-comet.md | 7 - .../v1/integrations-constants.md | 7 - .../v1/integrations-databricks.md | 7 - .../v1/integrations-deepchecks.md | 7 - .../v1/integrations-discord.md | 7 - .../v1/integrations-evidently.md | 7 - .../v1/integrations-facets.md | 7 - .../integration-docs/v1/integrations-feast.md | 7 - .../integration-docs/v1/integrations-gcp.md | 7 - .../v1/integrations-github.md | 7 - .../v1/integrations-gitlab.md | 7 - .../v1/integrations-great_expectations.md | 7 - .../v1/integrations-huggingface.md | 7 - .../v1/integrations-hyperai.md | 7 - .../v1/integrations-integration.md | 7 - .../v1/integrations-kaniko.md | 7 - .../v1/integrations-kubeflow.md | 7 - .../v1/integrations-kubernetes.md | 7 - .../v1/integrations-label_studio.md | 7 - .../v1/integrations-langchain.md | 7 - .../v1/integrations-lightgbm.md | 7 - .../v1/integrations-lightning.md | 7 - .../v1/integrations-llama_index.md | 7 - .../v1/integrations-mlflow.md | 7 - .../v1/integrations-neptune.md | 7 - .../v1/integrations-neural_prophet.md | 7 - .../integration-docs/v1/integrations-numpy.md | 7 - .../v1/integrations-openai.md | 7 - 
.../v1/integrations-pandas.md | 7 - .../v1/integrations-pigeon.md | 7 - .../v1/integrations-pillow.md | 7 - .../v1/integrations-polars.md | 7 - .../v1/integrations-prodigy.md | 7 - .../v1/integrations-pycaret.md | 7 - .../v1/integrations-pytorch.md | 7 - .../v1/integrations-pytorch_lightning.md | 7 - .../v1/integrations-registry.md | 7 - .../integration-docs/v1/integrations-s3.md | 7 - .../integration-docs/v1/integrations-scipy.md | 7 - .../v1/integrations-seldon.md | 7 - .../v1/integrations-sklearn.md | 7 - .../v1/integrations-skypilot.md | 7 - .../v1/integrations-skypilot_aws.md | 7 - .../v1/integrations-skypilot_azure.md | 7 - .../v1/integrations-skypilot_gcp.md | 7 - .../v1/integrations-skypilot_lambda.md | 7 - .../integration-docs/v1/integrations-slack.md | 7 - .../integration-docs/v1/integrations-spark.md | 7 - .../v1/integrations-tekton.md | 7 - .../v1/integrations-tensorboard.md | 7 - .../v1/integrations-tensorflow.md | 7 - .../integration-docs/v1/integrations-utils.md | 7 - .../integration-docs/v1/integrations-wandb.md | 7 - .../v1/integrations-whylogs.md | 7 - .../v1/integrations-xgboost.md | 7 - develop/sdk/python-client.mdx | 143 - develop/snippets/snippet-intro.mdx | 4 - .../stack-components/alerters/alerters.mdx | 41 - develop/stack-components/alerters/custom.mdx | 142 - develop/stack-components/alerters/discord.mdx | 112 - develop/stack-components/alerters/slack.mdx | 150 - .../annotators/annotators.mdx | 67 - .../stack-components/annotators/argilla.mdx | 94 - .../stack-components/annotators/custom.mdx | 18 - .../annotators/label-studio.mdx | 103 - .../stack-components/annotators/pigeon.mdx | 106 - .../stack-components/annotators/prodigy.mdx | 94 - .../artifact-stores/artifact-stores.mdx | 166 - .../artifact-stores/azure.mdx | 234 - .../artifact-stores/custom.mdx | 187 - .../stack-components/artifact-stores/gcp.mdx | 223 - .../artifact-stores/local.mdx | 82 - .../stack-components/artifact-stores/s3.mdx | 230 - develop/stack-components/component-guide.mdx | 118 - .../container-registries/aws.mdx | 230 - .../container-registries/azure.mdx | 206 - .../container-registries.mdx | 37 - .../container-registries/custom.mdx | 121 - .../container-registries/default.mdx | 190 - .../container-registries/dockerhub.mdx | 69 - .../container-registries/gcp.mdx | 302 - .../container-registries/github.mdx | 58 - .../data-validators/custom.mdx | 62 - .../data-validators/data-validators.mdx | 50 - .../data-validators/deepchecks.mdx | 419 - .../data-validators/evidently.mdx | 632 - .../data-validators/great-expectations.mdx | 310 - .../data-validators/whylogs.mdx | 285 - .../experiment-trackers/comet.mdx | 300 - .../experiment-trackers/custom.mdx | 59 - .../experiment-trackers.mdx | 72 - .../experiment-trackers/mlflow.mdx | 213 - .../experiment-trackers/neptune.mdx | 187 - .../experiment-trackers/wandb.mdx | 197 - .../feature-stores/custom.mdx | 17 - .../stack-components/feature-stores/feast.mdx | 118 - .../feature-stores/feature-stores.mdx | 39 - .../image-builders/custom.mdx | 114 - .../stack-components/image-builders/gcp.mdx | 188 - .../image-builders/image-builders.mdx | 32 - .../image-builders/kaniko.mdx | 151 - .../stack-components/image-builders/local.mdx | 47 - .../model-deployers/bentoml.mdx | 255 - .../model-deployers/custom.mdx | 166 - .../model-deployers/databricks.mdx | 142 - .../model-deployers/huggingface.mdx | 150 - .../model-deployers/mlflow.mdx | 233 - .../model-deployers/model-deployers.mdx | 174 - .../model-deployers/seldon.mdx | 520 - .../model-registries/custom.mdx | 191 - 
.../model-registries/mlflow.mdx | 184 - .../model-registries/model-registries.mdx | 58 - .../orchestrators/airflow.mdx | 222 - .../orchestrators/azureml.mdx | 189 - .../stack-components/orchestrators/custom.mdx | 215 - .../orchestrators/databricks.mdx | 193 - .../orchestrators/hyperai.mdx | 82 - .../orchestrators/kubeflow.mdx | 350 - .../orchestrators/kubernetes.mdx | 273 - .../orchestrators/lightning.mdx | 163 - .../orchestrators/local-docker.mdx | 73 - .../stack-components/orchestrators/local.mdx | 40 - .../orchestrators/orchestrators.mdx | 69 - .../orchestrators/sagemaker.mdx | 270 - .../orchestrators/skypilot-vm.mdx | 466 - .../stack-components/orchestrators/tekton.mdx | 238 - .../stack-components/orchestrators/vertex.mdx | 297 - .../deploy-a-cloud-stack-with-terraform.mdx | 308 - .../stack-deployment/deploy-a-cloud-stack.mdx | 369 - .../deploy-a-stack-using-mlstacks.mdx | 214 - .../stack-deployment/deploying-stacks.mdx | 45 - .../implement-a-custom-integration.mdx | 127 - .../implement-a-custom-stack-component.mdx | 286 - .../register-a-cloud-stack.mdx | 263 - .../step-operators/azureml.mdx | 138 - .../step-operators/custom.mdx | 122 - .../step-operators/kubernetes.mdx | 227 - .../step-operators/sagemaker.mdx | 106 - .../step-operators/spark-kubernetes.mdx | 345 - .../step-operators/step-operators.mdx | 56 - .../step-operators/vertex.mdx | 136 - .../control-logging/control-logging.mdx | 13 - .../disable-colorful-logging.mdx | 28 - .../disable-rich-traceback.mdx | 31 - .../enable-or-disable-logs-storing.mdx | 54 - .../control-logging/set-logging-verbosity.mdx | 31 - .../view-logs-on-the-dasbhoard.mdx | 32 - .../debug-and-solve-issues.mdx | 204 - .../popular-integrations/aws-guide.mdx | 281 - .../popular-integrations/azure-guide.mdx | 175 - .../popular-integrations/gcp-guide.mdx | 161 - .../popular-integrations/kubeflow.mdx | 107 - .../popular-integrations/kubernetes.mdx | 65 - .../popular-integrations/mlflow.mdx | 105 - .../popular-integrations.mdx | 7 - .../popular-integrations/skypilot.mdx | 92 - ...ns-of-defining-steps-in-notebook-cells.mdx | 11 - .../run-a-single-step-from-a-notebook.mdx | 49 - ...ote-steps-and-pipelines-from-notebooks.mdx | 14 - .../associate-a-pipeline-with-a-model.mdx | 51 - .../connecting-artifacts-via-a-model.mdx | 133 - .../delete-a-model.mdx | 42 - .../linking-model-binaries-data-to-models.mdx | 127 - .../load-a-model-in-code.mdx | 54 - .../load-artifacts-from-model.mdx | 69 - .../model-versions.mdx | 131 - .../promote-a-model.mdx | 91 - .../register-a-model.mdx | 71 - .../use-the-model-control-plane.mdx | 25 - .../access-secrets-in-a-step.mdx | 42 - .../build-pipelines/build-pipelines.mdx | 73 - .../build-pipelines/compose-pipelines.mdx | 38 - .../configuring-a-pipeline-at-runtime.mdx | 18 - .../control-caching-behavior.mdx | 40 - .../control-execution-order-of-steps.mdx | 38 - .../build-pipelines/delete-a-pipeline.mdx | 46 - .../build-pipelines/fetching-pipelines.mdx | 370 - .../get-past-pipeline-step-runs.mdx | 20 - .../hyper-parameter-tuning.mdx | 91 - .../name-your-pipeline-and-runs.mdx | 37 - .../pipelines/build-pipelines/retry-steps.mdx | 49 - .../run-an-individual-step.mdx | 62 - .../run-pipelines-asynchronously.mdx | 32 - .../build-pipelines/schedule-a-pipeline.mdx | 67 - .../step-output-typing-and-annotation.mdx | 118 - .../use-failure-success-hooks.mdx | 265 - .../use-pipeline-step-parameters.mdx | 117 - .../using-a-custom-step-invocation-id.mdx | 28 - .../create-a-run-template.mdx | 24 - .../create-and-run-templates.mdx | 15 - 
.../run-a-template.mdx | 25 - .../trigger-a-pipeline-from-another.mdx | 49 - .../trigger-a-pipeline-from-client.mdx | 27 - .../trigger-a-pipeline-from-rest-api.mdx | 75 - .../trigger-pipelines/trigger-pipelines.mdx | 48 - .../configure-python-environments.mdx | 50 - .../configure-the-server-environment.mdx | 134 - .../handling-dependencies.mdx | 49 - .../customize-docker-builds.mdx | 9 - .../define-where-an-image-is-built.mdx | 13 - .../docker-settings-on-a-pipeline.mdx | 143 - .../docker-settings-on-a-step.mdx | 38 - ...cify-pip-dependencies-and-apt-packages.mdx | 139 - ...itories-to-speed-up-docker-build-times.mdx | 38 - .../use-your-own-docker-files.mdx | 38 - .../which-files-are-built-into-the-image.mdx | 38 - .../usage/project-setup/global-settings.mdx | 159 - .../best-practices.mdx | 99 - .../connect-your-git-repository.mdx | 165 - .../setting-up-a-project-repository.mdx | 12 - .../using-project-templates.mdx | 89 - .../autogenerate-a-template-yaml-file.mdx | 218 - .../configuration-hierarchy.mdx | 39 - .../how-to-use-config.mdx | 40 - .../retrieve-used-configuration-of-a-run.mdx | 15 - .../runtime-configuration.mdx | 75 - .../use-configuration-files.mdx | 14 - .../what-can-be-configured.mdx | 240 - .../use-secrets/interact-with-secrets.mdx | 148 - ...ference-secrets-in-stack-configuration.mdx | 66 - .../use-secrets/secret-management.mdx | 72 - .../auth-management/auth-management.mdx | 722 - .../auth-management/aws-service-connector.mdx | 2678 -- .../azure-service-connector.mdx | 1201 - .../best-security-practices.mdx | 851 - .../docker-service-connector.mdx | 111 - .../auth-management/gcp-service-connector.mdx | 2855 -- .../hyperai-service-connector.mdx | 60 - .../kubernetes-service-connector.mdx | 201 - .../service-connectors-guide.mdx | 2503 -- .../connect-in-with-your-user-interactive.mdx | 45 - .../connect-with-a-service-account.mdx | 78 - .../connecting-to-zenml.mdx | 7 - .../artifact-versioning.mdx | 47 - .../delete-an-artifact.mdx | 17 - .../get-arbitrary-artifacts-in-a-step.mdx | 25 - .../handle-custom-data-types.mdx | 640 - .../handle-data-artifacts.mdx | 38 - .../load-artifacts-into-memory.mdx | 77 - .../passing-artifacts-between-pipelines.mdx | 129 - .../return-multiple-outputs-from-a-step.mdx | 40 - .../handle-data-artifacts/tagging.mdx | 97 - .../unmaterialized-artifacts.mdx | 93 - .../attach-metadata-to-a-model.mdx | 59 - .../attach-metadata-to-an-artifact.mdx | 83 - .../attach-metadata-to-steps.mdx | 69 - .../fetch-metadata-within-pipeline.mdx | 46 - .../fetch-metadata-within-steps.mdx | 37 - .../grouping-metadata.mdx | 34 - .../logging-metadata.mdx | 35 - .../track-metrics-metadata.mdx | 12 - .../accelerate-distributed-training.mdx | 110 - .../training-with-gpus/training-with-gpus.mdx | 138 - .../creating-custom-visualizations.mdx | 113 - .../disabling-visualizations.mdx | 18 - .../types-of-visualizations.mdx | 25 - .../visualizations-in-dashboard.mdx | 23 - .../visualize-artifacts.mdx | 12 - .../examples/basic-nlp-with-bert.mdx | 5 - develop/user-guide/examples/basic-nlp.mdx | 0 .../examples/computer-vision-with-yolov8.mdx | 5 - develop/user-guide/examples/e2e-batch.mdx | 0 .../user-guide/examples/end-to-end-batch.mdx | 5 - .../user-guide/examples/llm-finetuning.mdx | 5 - develop/user-guide/examples/more-projects.mdx | 5 - develop/user-guide/examples/quickstart.mdx | 5 - develop/user-guide/guides/llmops-guide.mdx | 54 - .../guides/llmops-guide/evaluation.mdx | 33 - .../evaluation/evaluation-in-65-loc.mdx | 88 - .../evaluation/evaluation-in-practice.mdx | 43 - 
.../llmops-guide/evaluation/generation.mdx | 312 - .../llmops-guide/evaluation/retrieval.mdx | 241 - .../llmops-guide/finetuning-embeddings.mdx | 19 - .../evaluating-finetuned-embeddings.mdx | 94 - ...-embeddings-with-sentence-transformers.mdx | 73 - .../synthetic-data-generation.mdx | 175 - .../guides/llmops-guide/finetuning-llms.mdx | 7 - .../guides/llmops-guide/rag-with-zenml.mdx | 19 - .../basic-rag-inference-pipeline.mdx | 127 - .../rag-with-zenml/data-ingestion.mdx | 116 - .../rag-with-zenml/embeddings-generation.mdx | 160 - .../rag-with-zenml/rag-85-loc.mdx | 118 - ...toring-embeddings-in-a-vector-database.mdx | 130 - .../rag-with-zenml/understanding-rag.mdx | 49 - .../guides/llmops-guide/reranking.mdx | 14 - .../evaluating-reranking-performance.mdx | 228 - .../reranking/implementing-reranking.mdx | 117 - .../reranking/understanding-reranking.mdx | 32 - .../user-guide/guides/production-guide.mdx | 26 - .../guides/production-guide/ci-cd.mdx | 126 - .../production-guide/cloud-orchestration.mdx | 177 - .../production-guide/configure-pipeline.mdx | 164 - .../connect-code-repository.mdx | 116 - .../production-guide/deploying-zenml.mdx | 69 - .../guides/production-guide/end-to-end.mdx | 62 - .../production-guide/remote-storage.mdx | 205 - .../production-guide/understand-stacks.mdx | 236 - develop/user-guide/guides/starter-guide.mdx | 22 - .../cache-previous-executions.mdx | 181 - .../starter-guide/create-an-ml-pipeline.mdx | 346 - .../guides/starter-guide/manage-artifacts.mdx | 484 - .../guides/starter-guide/starter-project.mdx | 69 - .../guides/starter-guide/track-ml-models.mdx | 268 - 587 files changed, 75960 deletions(-) delete mode 100644 develop/api/actions/create-action.mdx delete mode 100644 develop/api/actions/delete-action.mdx delete mode 100644 develop/api/actions/get-action.mdx delete mode 100644 develop/api/actions/list-actions.mdx delete mode 100644 develop/api/actions/update-action.mdx delete mode 100644 develop/api/api-reference.mdx delete mode 100644 develop/api/artifact_versions/create-artifact-version.mdx delete mode 100644 develop/api/artifact_versions/delete-artifact-version.mdx delete mode 100644 develop/api/artifact_versions/get-artifact-version.mdx delete mode 100644 develop/api/artifact_versions/get-artifact-visualization.mdx delete mode 100644 develop/api/artifact_versions/list-artifact-versions.mdx delete mode 100644 develop/api/artifact_versions/prune-artifact-versions.mdx delete mode 100644 develop/api/artifact_versions/update-artifact-version.mdx delete mode 100644 develop/api/artifacts/create-artifact.mdx delete mode 100644 develop/api/artifacts/delete-artifact.mdx delete mode 100644 develop/api/artifacts/get-artifact.mdx delete mode 100644 develop/api/artifacts/list-artifacts.mdx delete mode 100644 develop/api/artifacts/update-artifact.mdx delete mode 100644 develop/api/auth/api-token.mdx delete mode 100644 develop/api/auth/device-authorization.mdx delete mode 100644 develop/api/auth/logout.mdx delete mode 100644 develop/api/auth/token.mdx delete mode 100644 develop/api/authorized_devices/delete-authorized-device.mdx delete mode 100644 develop/api/authorized_devices/get-authorization-device.mdx delete mode 100644 develop/api/authorized_devices/list-authorized-devices.mdx delete mode 100644 develop/api/authorized_devices/update-authorized-device.mdx delete mode 100644 develop/api/authorized_devices/verify-authorized-device.mdx delete mode 100644 develop/api/builds/delete-build.mdx delete mode 100644 develop/api/builds/get-build.mdx delete mode 100644 
develop/api/builds/list-builds.mdx delete mode 100644 develop/api/code_repositories/delete-code-repository.mdx delete mode 100644 develop/api/code_repositories/get-code-repository.mdx delete mode 100644 develop/api/code_repositories/list-code-repositories.mdx delete mode 100644 develop/api/code_repositories/update-code-repository.mdx delete mode 100644 develop/api/deployments/delete-deployment.mdx delete mode 100644 develop/api/deployments/deployment-logs.mdx delete mode 100644 develop/api/deployments/get-deployment.mdx delete mode 100644 develop/api/deployments/list-deployments.mdx delete mode 100644 develop/api/endpoint/create.mdx delete mode 100644 develop/api/endpoint/delete.mdx delete mode 100644 develop/api/endpoint/get.mdx delete mode 100644 develop/api/event-sources/create-event-source.mdx delete mode 100644 develop/api/event-sources/delete-event-source.mdx delete mode 100644 develop/api/event-sources/get-event-source.mdx delete mode 100644 develop/api/event-sources/list-event-sources.mdx delete mode 100644 develop/api/event-sources/update-event-source.mdx delete mode 100644 develop/api/flavors/create-flavor.mdx delete mode 100644 develop/api/flavors/delete-flavor.mdx delete mode 100644 develop/api/flavors/get-flavor.mdx delete mode 100644 develop/api/flavors/list-flavors.mdx delete mode 100644 develop/api/flavors/sync-flavors.mdx delete mode 100644 develop/api/flavors/update-flavor.mdx delete mode 100644 develop/api/health.mdx delete mode 100644 develop/api/model_version_artifacts/list-model-version-artifact-links.mdx delete mode 100644 develop/api/model_version_pipeline_runs/list-model-version-pipeline-run-links.mdx delete mode 100644 develop/api/model_versions/delete-all-model-version-artifact-links.mdx delete mode 100644 develop/api/model_versions/delete-model-version-artifact-link.mdx delete mode 100644 develop/api/model_versions/delete-model-version-pipeline-run-link.mdx delete mode 100644 develop/api/model_versions/delete-model-version.mdx delete mode 100644 develop/api/model_versions/get-model-version.mdx delete mode 100644 develop/api/model_versions/list-model-versions.mdx delete mode 100644 develop/api/model_versions/update-model-version.mdx delete mode 100644 develop/api/models/delete-model.mdx delete mode 100644 develop/api/models/get-model.mdx delete mode 100644 develop/api/models/list-model-versions.mdx delete mode 100644 develop/api/models/list-models.mdx delete mode 100644 develop/api/models/update-model.mdx delete mode 100644 develop/api/openapi.json delete mode 100644 develop/api/pipelines/delete-pipeline.mdx delete mode 100644 develop/api/pipelines/get-pipeline.mdx delete mode 100644 develop/api/pipelines/list-pipeline-runs.mdx delete mode 100644 develop/api/pipelines/list-pipelines.mdx delete mode 100644 develop/api/pipelines/update-pipeline.mdx delete mode 100644 develop/api/plugins/get-flavor.mdx delete mode 100644 develop/api/plugins/list-flavors.mdx delete mode 100644 develop/api/run_metadata/get-run-metadata.mdx delete mode 100644 develop/api/run_metadata/list-run-metadata.mdx delete mode 100644 develop/api/run_templates/create-template-run.mdx delete mode 100644 develop/api/run_templates/delete-run-template.mdx delete mode 100644 develop/api/run_templates/get-run-template.mdx delete mode 100644 develop/api/run_templates/list-run-templates.mdx delete mode 100644 develop/api/run_templates/update-run-template.mdx delete mode 100644 develop/api/runs/delete-run.mdx delete mode 100644 develop/api/runs/get-pipeline-configuration.mdx delete mode 100644 
develop/api/runs/get-run-dag.mdx delete mode 100644 develop/api/runs/get-run-status.mdx delete mode 100644 develop/api/runs/get-run-steps.mdx delete mode 100644 develop/api/runs/get-run.mdx delete mode 100644 develop/api/runs/list-runs.mdx delete mode 100644 develop/api/runs/update-run.mdx delete mode 100644 develop/api/schedules/delete-schedule.mdx delete mode 100644 develop/api/schedules/get-schedule.mdx delete mode 100644 develop/api/schedules/list-schedules.mdx delete mode 100644 develop/api/schedules/update-schedule.mdx delete mode 100644 develop/api/secrets/backup-secrets.mdx delete mode 100644 develop/api/secrets/delete-secret.mdx delete mode 100644 develop/api/secrets/get-secret.mdx delete mode 100644 develop/api/secrets/list-secrets.mdx delete mode 100644 develop/api/secrets/restore-secrets.mdx delete mode 100644 develop/api/secrets/update-secret.mdx delete mode 100644 develop/api/server/get-onboarding-state.mdx delete mode 100644 develop/api/server/server-info.mdx delete mode 100644 develop/api/server/version.mdx delete mode 100644 develop/api/service_accounts/create-api-key.mdx delete mode 100644 develop/api/service_accounts/create-service-account.mdx delete mode 100644 develop/api/service_accounts/delete-api-key.mdx delete mode 100644 develop/api/service_accounts/delete-service-account.mdx delete mode 100644 develop/api/service_accounts/get-api-key.mdx delete mode 100644 develop/api/service_accounts/get-service-account.mdx delete mode 100644 develop/api/service_accounts/list-api-keys.mdx delete mode 100644 develop/api/service_accounts/list-service-accounts.mdx delete mode 100644 develop/api/service_accounts/rotate-api-key.mdx delete mode 100644 develop/api/service_accounts/update-api-key.mdx delete mode 100644 develop/api/service_accounts/update-service-account.mdx delete mode 100644 develop/api/service_connectors/delete-service-connector.mdx delete mode 100644 develop/api/service_connectors/get-resources-based-on-service-connector-info.mdx delete mode 100644 develop/api/service_connectors/get-service-connector-client.mdx delete mode 100644 develop/api/service_connectors/get-service-connector-type.mdx delete mode 100644 develop/api/service_connectors/get-service-connector.mdx delete mode 100644 develop/api/service_connectors/list-service-connector-types.mdx delete mode 100644 develop/api/service_connectors/list-service-connectors.mdx delete mode 100644 develop/api/service_connectors/update-service-connector.mdx delete mode 100644 develop/api/service_connectors/validate-and-verify-service-connector-config.mdx delete mode 100644 develop/api/service_connectors/validate-and-verify-service-connector.mdx delete mode 100644 develop/api/services/create-service.mdx delete mode 100644 develop/api/services/delete-service.mdx delete mode 100644 develop/api/services/get-service.mdx delete mode 100644 develop/api/services/list-services.mdx delete mode 100644 develop/api/services/update-service.mdx delete mode 100644 develop/api/stack_components/deregister-stack-component.mdx delete mode 100644 develop/api/stack_components/get-stack-component-types.mdx delete mode 100644 develop/api/stack_components/get-stack-component.mdx delete mode 100644 develop/api/stack_components/list-stack-components.mdx delete mode 100644 develop/api/stack_components/update-stack-component.mdx delete mode 100644 develop/api/stacks/delete-stack.mdx delete mode 100644 develop/api/stacks/get-deployed-stack.mdx delete mode 100644 develop/api/stacks/get-stack-deployment-config.mdx delete mode 100644 
develop/api/stacks/get-stack-deployment-info.mdx delete mode 100644 develop/api/stacks/get-stack.mdx delete mode 100644 develop/api/stacks/list-stacks.mdx delete mode 100644 develop/api/stacks/update-stack.mdx delete mode 100644 develop/api/steps/create-run-step.mdx delete mode 100644 develop/api/steps/get-step-configuration.mdx delete mode 100644 develop/api/steps/get-step-logs.mdx delete mode 100644 develop/api/steps/get-step-status.mdx delete mode 100644 develop/api/steps/get-step.mdx delete mode 100644 develop/api/steps/list-run-steps.mdx delete mode 100644 develop/api/steps/update-step.mdx delete mode 100644 develop/api/tags/create-tag.mdx delete mode 100644 develop/api/tags/delete-tag.mdx delete mode 100644 develop/api/tags/get-tag.mdx delete mode 100644 develop/api/tags/list-tags.mdx delete mode 100644 develop/api/tags/update-tag.mdx delete mode 100644 develop/api/triggers/create-trigger.mdx delete mode 100644 develop/api/triggers/delete-trigger.mdx delete mode 100644 develop/api/triggers/get-trigger.mdx delete mode 100644 develop/api/triggers/list-triggers.mdx delete mode 100644 develop/api/triggers/update-trigger.mdx delete mode 100644 develop/api/users/get-current-user.mdx delete mode 100644 develop/api/users/get-user.mdx delete mode 100644 develop/api/users/list-users.mdx delete mode 100644 develop/api/users/update-user-resource-membership.mdx delete mode 100644 develop/api/webhook/webhook.mdx delete mode 100644 develop/api/workspaces/create-build.mdx delete mode 100644 develop/api/workspaces/create-code-repository.mdx delete mode 100644 develop/api/workspaces/create-deployment.mdx delete mode 100644 develop/api/workspaces/create-model-version-artifact-link.mdx delete mode 100644 develop/api/workspaces/create-model-version-pipeline-run-link.mdx delete mode 100644 develop/api/workspaces/create-model-version.mdx delete mode 100644 develop/api/workspaces/create-model.mdx delete mode 100644 develop/api/workspaces/create-pipeline-run.mdx delete mode 100644 develop/api/workspaces/create-pipeline.mdx delete mode 100644 develop/api/workspaces/create-run-metadata.mdx delete mode 100644 develop/api/workspaces/create-run-template.mdx delete mode 100644 develop/api/workspaces/create-schedule.mdx delete mode 100644 develop/api/workspaces/create-secret.mdx delete mode 100644 develop/api/workspaces/create-service-connector.mdx delete mode 100644 develop/api/workspaces/create-service.mdx delete mode 100644 develop/api/workspaces/create-stack-component.mdx delete mode 100644 develop/api/workspaces/create-stack.mdx delete mode 100644 develop/api/workspaces/create-workspace.mdx delete mode 100644 develop/api/workspaces/delete-workspace.mdx delete mode 100644 develop/api/workspaces/get-or-create-pipeline-run.mdx delete mode 100644 develop/api/workspaces/get-workspace-statistics.mdx delete mode 100644 develop/api/workspaces/get-workspace.mdx delete mode 100644 develop/api/workspaces/list-runs.mdx delete mode 100644 develop/api/workspaces/list-service-connector-resources.mdx delete mode 100644 develop/api/workspaces/list-workspace-builds.mdx delete mode 100644 develop/api/workspaces/list-workspace-code-repositories.mdx delete mode 100644 develop/api/workspaces/list-workspace-deployments.mdx delete mode 100644 develop/api/workspaces/list-workspace-pipelines.mdx delete mode 100644 develop/api/workspaces/list-workspace-run-templates.mdx delete mode 100644 develop/api/workspaces/list-workspace-service-connectors.mdx delete mode 100644 develop/api/workspaces/list-workspace-stack-components.mdx delete mode 
100644 develop/api/workspaces/list-workspace-stacks.mdx delete mode 100644 develop/api/workspaces/list-workspaces.mdx delete mode 100644 develop/api/workspaces/update-workspace.mdx delete mode 100644 develop/getting-started/core-concepts.mdx delete mode 100644 develop/getting-started/deploying-zenml/common-questions-answered.mdx delete mode 100644 develop/getting-started/deploying-zenml/deploy-using-huggingface-spaces.mdx delete mode 100644 develop/getting-started/deploying-zenml/deploy-with-custom-image.mdx delete mode 100644 develop/getting-started/deploying-zenml/deploy-with-docker.mdx delete mode 100644 develop/getting-started/deploying-zenml/deploy-with-helm.mdx delete mode 100644 develop/getting-started/deploying-zenml/deploy-with-zenml-cli.mdx delete mode 100644 develop/getting-started/deploying-zenml/deploying-zenml.mdx delete mode 100644 develop/getting-started/deploying-zenml/introduction.mdx delete mode 100644 develop/getting-started/deploying-zenml/manage-the-deployed-services.mdx delete mode 100644 develop/getting-started/deploying-zenml/manage-the-deployed-services/custom-secret-stores.mdx delete mode 100644 develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-stack-components.mdx delete mode 100644 develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-your-deployed-server.mdx delete mode 100644 develop/getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server.mdx delete mode 100644 develop/getting-started/deploying-zenml/zenml-pro/system-architectures.mdx delete mode 100644 develop/getting-started/deploying-zenml/zenml-pro/user-management.mdx delete mode 100644 develop/getting-started/deploying-zenml/zenml-pro/zenml-pro.mdx delete mode 100644 develop/getting-started/deploying-zenml/zenml-pro/zenml-saas.mdx delete mode 100644 develop/getting-started/faq.mdx delete mode 100644 develop/getting-started/installation.mdx delete mode 100644 develop/getting-started/introduction.mdx delete mode 100644 develop/getting-started/migration-guide/migration-guide.mdx delete mode 100644 develop/getting-started/migration-guide/migration-zero-forty.mdx delete mode 100644 develop/getting-started/migration-guide/migration-zero-sixty.mdx delete mode 100644 develop/getting-started/migration-guide/migration-zero-thirty.mdx delete mode 100644 develop/getting-started/migration-guide/migration-zero-twenty.mdx delete mode 100644 develop/getting-started/reference/community-and-content.mdx delete mode 100644 develop/mint.json delete mode 100644 develop/sdk/core-sdk/v1/core-_hub.md delete mode 100644 develop/sdk/core-sdk/v1/core-actions.md delete mode 100644 develop/sdk/core-sdk/v1/core-alerter.md delete mode 100644 develop/sdk/core-sdk/v1/core-analytics.md delete mode 100644 develop/sdk/core-sdk/v1/core-annotators.md delete mode 100644 develop/sdk/core-sdk/v1/core-api.md delete mode 100644 develop/sdk/core-sdk/v1/core-artifact_stores.md delete mode 100644 develop/sdk/core-sdk/v1/core-artifacts.md delete mode 100644 develop/sdk/core-sdk/v1/core-assistant.md delete mode 100644 develop/sdk/core-sdk/v1/core-client.md delete mode 100644 develop/sdk/core-sdk/v1/core-client_lazy_loader.md delete mode 100644 develop/sdk/core-sdk/v1/core-code_repositories.md delete mode 100644 develop/sdk/core-sdk/v1/core-config.md delete mode 100644 develop/sdk/core-sdk/v1/core-console.md delete mode 100644 develop/sdk/core-sdk/v1/core-constants.md delete mode 100644 develop/sdk/core-sdk/v1/core-container_registries.md delete mode 
100644 develop/sdk/core-sdk/v1/core-data_validators.md delete mode 100644 develop/sdk/core-sdk/v1/core-entrypoints.md delete mode 100644 develop/sdk/core-sdk/v1/core-enums.md delete mode 100644 develop/sdk/core-sdk/v1/core-environment.md delete mode 100644 develop/sdk/core-sdk/v1/core-event_hub.md delete mode 100644 develop/sdk/core-sdk/v1/core-event_sources.md delete mode 100644 develop/sdk/core-sdk/v1/core-exceptions.md delete mode 100644 develop/sdk/core-sdk/v1/core-experiment_trackers.md delete mode 100644 develop/sdk/core-sdk/v1/core-feature_stores.md delete mode 100644 develop/sdk/core-sdk/v1/core-hooks.md delete mode 100644 develop/sdk/core-sdk/v1/core-image_builders.md delete mode 100644 develop/sdk/core-sdk/v1/core-io.md delete mode 100644 develop/sdk/core-sdk/v1/core-lineage_graph.md delete mode 100644 develop/sdk/core-sdk/v1/core-logger.md delete mode 100644 develop/sdk/core-sdk/v1/core-logging.md delete mode 100644 develop/sdk/core-sdk/v1/core-materializers.md delete mode 100644 develop/sdk/core-sdk/v1/core-metadata.md delete mode 100644 develop/sdk/core-sdk/v1/core-model.md delete mode 100644 develop/sdk/core-sdk/v1/core-model_deployers.md delete mode 100644 develop/sdk/core-sdk/v1/core-model_registries.md delete mode 100644 develop/sdk/core-sdk/v1/core-models.md delete mode 100644 develop/sdk/core-sdk/v1/core-new.md delete mode 100644 develop/sdk/core-sdk/v1/core-orchestrators.md delete mode 100644 develop/sdk/core-sdk/v1/core-pipelines.md delete mode 100644 develop/sdk/core-sdk/v1/core-plugins.md delete mode 100644 develop/sdk/core-sdk/v1/core-post_execution.md delete mode 100644 develop/sdk/core-sdk/v1/core-secret.md delete mode 100644 develop/sdk/core-sdk/v1/core-service_connectors.md delete mode 100644 develop/sdk/core-sdk/v1/core-services.md delete mode 100644 develop/sdk/core-sdk/v1/core-stack.md delete mode 100644 develop/sdk/core-sdk/v1/core-stack_deployments.md delete mode 100644 develop/sdk/core-sdk/v1/core-step_operators.md delete mode 100644 develop/sdk/core-sdk/v1/core-steps.md delete mode 100644 develop/sdk/core-sdk/v1/core-types.md delete mode 100644 develop/sdk/core-sdk/v1/core-utils.md delete mode 100644 develop/sdk/core-sdk/v1/core-zen_server.md delete mode 100644 develop/sdk/core-sdk/v1/core-zen_stores.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-airflow.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-argilla.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-aws.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-azure.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-bentoml.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-bitbucket.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-comet.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-constants.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-databricks.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-deepchecks.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-discord.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-evidently.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-facets.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-feast.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-gcp.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-github.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-gitlab.md delete 
mode 100644 develop/sdk/integration-docs/v1/integrations-great_expectations.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-huggingface.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-hyperai.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-integration.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-kaniko.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-kubeflow.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-kubernetes.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-label_studio.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-langchain.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-lightgbm.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-lightning.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-llama_index.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-mlflow.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-neptune.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-neural_prophet.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-numpy.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-openai.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-pandas.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-pigeon.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-pillow.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-polars.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-prodigy.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-pycaret.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-pytorch.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-pytorch_lightning.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-registry.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-s3.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-scipy.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-seldon.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-sklearn.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-skypilot.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-skypilot_aws.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-skypilot_azure.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-skypilot_gcp.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-skypilot_lambda.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-slack.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-spark.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-tekton.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-tensorboard.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-tensorflow.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-utils.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-wandb.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-whylogs.md delete mode 100644 develop/sdk/integration-docs/v1/integrations-xgboost.md delete mode 100644 develop/sdk/python-client.mdx delete mode 100644 develop/snippets/snippet-intro.mdx delete mode 100644 develop/stack-components/alerters/alerters.mdx delete mode 100644 
develop/stack-components/alerters/custom.mdx delete mode 100644 develop/stack-components/alerters/discord.mdx delete mode 100644 develop/stack-components/alerters/slack.mdx delete mode 100644 develop/stack-components/annotators/annotators.mdx delete mode 100644 develop/stack-components/annotators/argilla.mdx delete mode 100644 develop/stack-components/annotators/custom.mdx delete mode 100644 develop/stack-components/annotators/label-studio.mdx delete mode 100644 develop/stack-components/annotators/pigeon.mdx delete mode 100644 develop/stack-components/annotators/prodigy.mdx delete mode 100644 develop/stack-components/artifact-stores/artifact-stores.mdx delete mode 100644 develop/stack-components/artifact-stores/azure.mdx delete mode 100644 develop/stack-components/artifact-stores/custom.mdx delete mode 100644 develop/stack-components/artifact-stores/gcp.mdx delete mode 100644 develop/stack-components/artifact-stores/local.mdx delete mode 100644 develop/stack-components/artifact-stores/s3.mdx delete mode 100644 develop/stack-components/component-guide.mdx delete mode 100644 develop/stack-components/container-registries/aws.mdx delete mode 100644 develop/stack-components/container-registries/azure.mdx delete mode 100644 develop/stack-components/container-registries/container-registries.mdx delete mode 100644 develop/stack-components/container-registries/custom.mdx delete mode 100644 develop/stack-components/container-registries/default.mdx delete mode 100644 develop/stack-components/container-registries/dockerhub.mdx delete mode 100644 develop/stack-components/container-registries/gcp.mdx delete mode 100644 develop/stack-components/container-registries/github.mdx delete mode 100644 develop/stack-components/data-validators/custom.mdx delete mode 100644 develop/stack-components/data-validators/data-validators.mdx delete mode 100644 develop/stack-components/data-validators/deepchecks.mdx delete mode 100644 develop/stack-components/data-validators/evidently.mdx delete mode 100644 develop/stack-components/data-validators/great-expectations.mdx delete mode 100644 develop/stack-components/data-validators/whylogs.mdx delete mode 100644 develop/stack-components/experiment-trackers/comet.mdx delete mode 100644 develop/stack-components/experiment-trackers/custom.mdx delete mode 100644 develop/stack-components/experiment-trackers/experiment-trackers.mdx delete mode 100644 develop/stack-components/experiment-trackers/mlflow.mdx delete mode 100644 develop/stack-components/experiment-trackers/neptune.mdx delete mode 100644 develop/stack-components/experiment-trackers/wandb.mdx delete mode 100644 develop/stack-components/feature-stores/custom.mdx delete mode 100644 develop/stack-components/feature-stores/feast.mdx delete mode 100644 develop/stack-components/feature-stores/feature-stores.mdx delete mode 100644 develop/stack-components/image-builders/custom.mdx delete mode 100644 develop/stack-components/image-builders/gcp.mdx delete mode 100644 develop/stack-components/image-builders/image-builders.mdx delete mode 100644 develop/stack-components/image-builders/kaniko.mdx delete mode 100644 develop/stack-components/image-builders/local.mdx delete mode 100644 develop/stack-components/model-deployers/bentoml.mdx delete mode 100644 develop/stack-components/model-deployers/custom.mdx delete mode 100644 develop/stack-components/model-deployers/databricks.mdx delete mode 100644 develop/stack-components/model-deployers/huggingface.mdx delete mode 100644 develop/stack-components/model-deployers/mlflow.mdx delete mode 
100644 develop/stack-components/model-deployers/model-deployers.mdx delete mode 100644 develop/stack-components/model-deployers/seldon.mdx delete mode 100644 develop/stack-components/model-registries/custom.mdx delete mode 100644 develop/stack-components/model-registries/mlflow.mdx delete mode 100644 develop/stack-components/model-registries/model-registries.mdx delete mode 100644 develop/stack-components/orchestrators/airflow.mdx delete mode 100644 develop/stack-components/orchestrators/azureml.mdx delete mode 100644 develop/stack-components/orchestrators/custom.mdx delete mode 100644 develop/stack-components/orchestrators/databricks.mdx delete mode 100644 develop/stack-components/orchestrators/hyperai.mdx delete mode 100644 develop/stack-components/orchestrators/kubeflow.mdx delete mode 100644 develop/stack-components/orchestrators/kubernetes.mdx delete mode 100644 develop/stack-components/orchestrators/lightning.mdx delete mode 100644 develop/stack-components/orchestrators/local-docker.mdx delete mode 100644 develop/stack-components/orchestrators/local.mdx delete mode 100644 develop/stack-components/orchestrators/orchestrators.mdx delete mode 100644 develop/stack-components/orchestrators/sagemaker.mdx delete mode 100644 develop/stack-components/orchestrators/skypilot-vm.mdx delete mode 100644 develop/stack-components/orchestrators/tekton.mdx delete mode 100644 develop/stack-components/orchestrators/vertex.mdx delete mode 100644 develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform.mdx delete mode 100644 develop/stack-components/stack-deployment/deploy-a-cloud-stack.mdx delete mode 100644 develop/stack-components/stack-deployment/deploy-a-stack-using-mlstacks.mdx delete mode 100644 develop/stack-components/stack-deployment/deploying-stacks.mdx delete mode 100644 develop/stack-components/stack-deployment/implement-a-custom-integration.mdx delete mode 100644 develop/stack-components/stack-deployment/implement-a-custom-stack-component.mdx delete mode 100644 develop/stack-components/stack-deployment/register-a-cloud-stack.mdx delete mode 100644 develop/stack-components/step-operators/azureml.mdx delete mode 100644 develop/stack-components/step-operators/custom.mdx delete mode 100644 develop/stack-components/step-operators/kubernetes.mdx delete mode 100644 develop/stack-components/step-operators/sagemaker.mdx delete mode 100644 develop/stack-components/step-operators/spark-kubernetes.mdx delete mode 100644 develop/stack-components/step-operators/step-operators.mdx delete mode 100644 develop/stack-components/step-operators/vertex.mdx delete mode 100644 develop/usage/development-monitoring/control-logging/control-logging.mdx delete mode 100644 develop/usage/development-monitoring/control-logging/disable-colorful-logging.mdx delete mode 100644 develop/usage/development-monitoring/control-logging/disable-rich-traceback.mdx delete mode 100644 develop/usage/development-monitoring/control-logging/enable-or-disable-logs-storing.mdx delete mode 100644 develop/usage/development-monitoring/control-logging/set-logging-verbosity.mdx delete mode 100644 develop/usage/development-monitoring/control-logging/view-logs-on-the-dasbhoard.mdx delete mode 100644 develop/usage/development-monitoring/debug-and-solve-issues.mdx delete mode 100644 develop/usage/development-monitoring/popular-integrations/aws-guide.mdx delete mode 100644 develop/usage/development-monitoring/popular-integrations/azure-guide.mdx delete mode 100644 
develop/usage/development-monitoring/popular-integrations/gcp-guide.mdx delete mode 100644 develop/usage/development-monitoring/popular-integrations/kubeflow.mdx delete mode 100644 develop/usage/development-monitoring/popular-integrations/kubernetes.mdx delete mode 100644 develop/usage/development-monitoring/popular-integrations/mlflow.mdx delete mode 100644 develop/usage/development-monitoring/popular-integrations/popular-integrations.mdx delete mode 100644 develop/usage/development-monitoring/popular-integrations/skypilot.mdx delete mode 100644 develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/limitations-of-defining-steps-in-notebook-cells.mdx delete mode 100644 develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-a-single-step-from-a-notebook.mdx delete mode 100644 develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-remote-steps-and-pipelines-from-notebooks.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/associate-a-pipeline-with-a-model.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/connecting-artifacts-via-a-model.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/delete-a-model.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/linking-model-binaries-data-to-models.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/load-a-model-in-code.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/load-artifacts-from-model.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/model-versions.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/promote-a-model.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/register-a-model.mdx delete mode 100644 develop/usage/development-monitoring/use-the-model-control-plane/use-the-model-control-plane.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/access-secrets-in-a-step.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/build-pipelines.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/compose-pipelines.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/configuring-a-pipeline-at-runtime.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/control-caching-behavior.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/control-execution-order-of-steps.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/delete-a-pipeline.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/fetching-pipelines.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/get-past-pipeline-step-runs.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/hyper-parameter-tuning.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/name-your-pipeline-and-runs.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/retry-steps.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/run-an-individual-step.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/run-pipelines-asynchronously.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/schedule-a-pipeline.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/step-output-typing-and-annotation.mdx delete mode 100644 
develop/usage/pipelines/build-pipelines/use-failure-success-hooks.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/use-pipeline-step-parameters.mdx delete mode 100644 develop/usage/pipelines/build-pipelines/using-a-custom-step-invocation-id.mdx delete mode 100644 develop/usage/pipelines/create-and-run-templates/create-a-run-template.mdx delete mode 100644 develop/usage/pipelines/create-and-run-templates/create-and-run-templates.mdx delete mode 100644 develop/usage/pipelines/create-and-run-templates/run-a-template.mdx delete mode 100644 develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another.mdx delete mode 100644 develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-client.mdx delete mode 100644 develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-rest-api.mdx delete mode 100644 develop/usage/pipelines/trigger-pipelines/trigger-pipelines.mdx delete mode 100644 develop/usage/project-setup/configure-python-environments/configure-python-environments.mdx delete mode 100644 develop/usage/project-setup/configure-python-environments/configure-the-server-environment.mdx delete mode 100644 develop/usage/project-setup/configure-python-environments/handling-dependencies.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/customize-docker-builds.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/define-where-an-image-is-built.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-step.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/specify-pip-dependencies-and-apt-packages.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/use-your-own-docker-files.mdx delete mode 100644 develop/usage/project-setup/customize-docker-builds/which-files-are-built-into-the-image.mdx delete mode 100644 develop/usage/project-setup/global-settings.mdx delete mode 100644 develop/usage/project-setup/setting-up-a-project-repository/best-practices.mdx delete mode 100644 develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository.mdx delete mode 100644 develop/usage/project-setup/setting-up-a-project-repository/setting-up-a-project-repository.mdx delete mode 100644 develop/usage/project-setup/setting-up-a-project-repository/using-project-templates.mdx delete mode 100644 develop/usage/project-setup/use-configuration-files/autogenerate-a-template-yaml-file.mdx delete mode 100644 develop/usage/project-setup/use-configuration-files/configuration-hierarchy.mdx delete mode 100644 develop/usage/project-setup/use-configuration-files/how-to-use-config.mdx delete mode 100644 develop/usage/project-setup/use-configuration-files/retrieve-used-configuration-of-a-run.mdx delete mode 100644 develop/usage/project-setup/use-configuration-files/runtime-configuration.mdx delete mode 100644 develop/usage/project-setup/use-configuration-files/use-configuration-files.mdx delete mode 100644 develop/usage/project-setup/use-configuration-files/what-can-be-configured.mdx delete mode 100644 develop/usage/project-setup/use-secrets/interact-with-secrets.mdx delete mode 100644 develop/usage/project-setup/use-secrets/reference-secrets-in-stack-configuration.mdx delete mode 100644 
develop/usage/project-setup/use-secrets/secret-management.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/auth-management.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/aws-service-connector.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/azure-service-connector.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/best-security-practices.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/docker-service-connector.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/gcp-service-connector.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/hyperai-service-connector.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/kubernetes-service-connector.mdx delete mode 100644 develop/usage/resource-data-management/auth-management/service-connectors-guide.mdx delete mode 100644 develop/usage/resource-data-management/connecting-to-zenml/connect-in-with-your-user-interactive.mdx delete mode 100644 develop/usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account.mdx delete mode 100644 develop/usage/resource-data-management/connecting-to-zenml/connecting-to-zenml.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/artifact-versioning.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/delete-an-artifact.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/get-arbitrary-artifacts-in-a-step.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/handle-data-artifacts.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/passing-artifacts-between-pipelines.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/return-multiple-outputs-from-a-step.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/tagging.mdx delete mode 100644 develop/usage/resource-data-management/handle-data-artifacts/unmaterialized-artifacts.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-steps.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-pipeline.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/grouping-metadata.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/logging-metadata.mdx delete mode 100644 develop/usage/resource-data-management/track-metrics-metadata/track-metrics-metadata.mdx delete mode 100644 develop/usage/resource-data-management/training-with-gpus/accelerate-distributed-training.mdx delete mode 100644 develop/usage/resource-data-management/training-with-gpus/training-with-gpus.mdx delete mode 100644 
develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations.mdx delete mode 100644 develop/usage/resource-data-management/visualize-artifacts/disabling-visualizations.mdx delete mode 100644 develop/usage/resource-data-management/visualize-artifacts/types-of-visualizations.mdx delete mode 100644 develop/usage/resource-data-management/visualize-artifacts/visualizations-in-dashboard.mdx delete mode 100644 develop/usage/resource-data-management/visualize-artifacts/visualize-artifacts.mdx delete mode 100644 develop/user-guide/examples/basic-nlp-with-bert.mdx delete mode 100644 develop/user-guide/examples/basic-nlp.mdx delete mode 100644 develop/user-guide/examples/computer-vision-with-yolov8.mdx delete mode 100644 develop/user-guide/examples/e2e-batch.mdx delete mode 100644 develop/user-guide/examples/end-to-end-batch.mdx delete mode 100644 develop/user-guide/examples/llm-finetuning.mdx delete mode 100644 develop/user-guide/examples/more-projects.mdx delete mode 100644 develop/user-guide/examples/quickstart.mdx delete mode 100644 develop/user-guide/guides/llmops-guide.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/evaluation.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-65-loc.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-practice.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/evaluation/generation.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/evaluation/retrieval.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/finetuning-embeddings.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/finetuning-embeddings/evaluating-finetuned-embeddings.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/finetuning-embeddings/synthetic-data-generation.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/finetuning-llms.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/rag-with-zenml.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/rag-with-zenml/basic-rag-inference-pipeline.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/rag-with-zenml/data-ingestion.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/rag-with-zenml/embeddings-generation.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/rag-with-zenml/rag-85-loc.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/rag-with-zenml/understanding-rag.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/reranking.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/reranking/evaluating-reranking-performance.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/reranking/implementing-reranking.mdx delete mode 100644 develop/user-guide/guides/llmops-guide/reranking/understanding-reranking.mdx delete mode 100644 develop/user-guide/guides/production-guide.mdx delete mode 100644 develop/user-guide/guides/production-guide/ci-cd.mdx delete mode 100644 develop/user-guide/guides/production-guide/cloud-orchestration.mdx delete mode 100644 develop/user-guide/guides/production-guide/configure-pipeline.mdx delete mode 100644 develop/user-guide/guides/production-guide/connect-code-repository.mdx delete mode 100644 
develop/user-guide/guides/production-guide/deploying-zenml.mdx delete mode 100644 develop/user-guide/guides/production-guide/end-to-end.mdx delete mode 100644 develop/user-guide/guides/production-guide/remote-storage.mdx delete mode 100644 develop/user-guide/guides/production-guide/understand-stacks.mdx delete mode 100644 develop/user-guide/guides/starter-guide.mdx delete mode 100644 develop/user-guide/guides/starter-guide/cache-previous-executions.mdx delete mode 100644 develop/user-guide/guides/starter-guide/create-an-ml-pipeline.mdx delete mode 100644 develop/user-guide/guides/starter-guide/manage-artifacts.mdx delete mode 100644 develop/user-guide/guides/starter-guide/starter-project.mdx delete mode 100644 develop/user-guide/guides/starter-guide/track-ml-models.mdx diff --git a/develop/api/actions/create-action.mdx b/develop/api/actions/create-action.mdx deleted file mode 100644 index ed366c9..0000000 --- a/develop/api/actions/create-action.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/actions ---- \ No newline at end of file diff --git a/develop/api/actions/delete-action.mdx b/develop/api/actions/delete-action.mdx deleted file mode 100644 index 4712cf0..0000000 --- a/develop/api/actions/delete-action.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/actions/{action_id} ---- \ No newline at end of file diff --git a/develop/api/actions/get-action.mdx b/develop/api/actions/get-action.mdx deleted file mode 100644 index 84bab26..0000000 --- a/develop/api/actions/get-action.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/actions/{action_id} ---- \ No newline at end of file diff --git a/develop/api/actions/list-actions.mdx b/develop/api/actions/list-actions.mdx deleted file mode 100644 index 2036bd4..0000000 --- a/develop/api/actions/list-actions.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/actions ---- \ No newline at end of file diff --git a/develop/api/actions/update-action.mdx b/develop/api/actions/update-action.mdx deleted file mode 100644 index f467604..0000000 --- a/develop/api/actions/update-action.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/actions/{action_id} ---- \ No newline at end of file diff --git a/develop/api/api-reference.mdx b/develop/api/api-reference.mdx deleted file mode 100644 index e211ad3..0000000 --- a/develop/api/api-reference.mdx +++ /dev/null @@ -1,79 +0,0 @@ ---- -title: "API reference" -description: "See the ZenML API reference." -icon: eye ---- - -The ZenML server is a FastAPI application, so the OpenAPI-compliant docs are available at `/docs` or `/redoc` of your ZenML server: - -In the local case (i.e. when using `zenml up`), the docs are available at `http://127.0.0.1:8237/docs`. - - - - - - - - - -## Using a bearer token to access the API programmatically - -If you are using the ZenML server API through the pages above, it is enough to be logged in to your ZenML account in the same browser session. However, to access the API programmatically, follow these steps: - -1. Create a [service account](/develop/usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account): - - -```Bash -zenml service-account create myserviceaccount -``` - -This will print out the ``, which you can use in the next command. - -1. 
Get an access token by using the `/api/v1/login` endpoint: - - -```shell - -curl -X 'POST' \ - - '/api/v1/login' \ - - -H 'accept: application/json' \ - - -H 'Content-Type: application/x-www-form-urlencoded' \ - - -d 'grant_type=zenml-api-key&username=&password='\''&client_id=&device_code=' -``` - -This will return a response like this: - - -```shell -{ - - "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI3MGJjZTg5NC1hN2VjLTRkOTYtYjE1Ny1kOTZkYWY5ZWM2M2IiLCJpc3MiOiJmMGQ5NjI1Ni04YmQyLTQxZDctOWVjZi0xMmYwM2JmYTVlMTYiLCJhdWQiOiJmMGQ5NjI1Ni04YmQyLTQxZDctOWVjZi0xMmYwM2JmYTVlMTYiLCJleHAiOjE3MTk0MDk0NjAsImFwaV9rZXlfaWQiOiIzNDkyM2U0NS0zMGFlLTRkMjctODZiZS0wZGRhNTdkMjA5MDcifQ.ByB1ngCPtBenGE6UugsWC6Blga3qPqkAiPJUSFDR-u4", - - "token_type": "bearer", - - "expires_in": 3600, - - "refresh_token": null, - - "scope": null - -} -``` - -1. Use the access token to make subsequent curl commands: - - -```shell -curl -X 'GET' \ - - '/api/v1/pipelines?hydrate=false&name=training' \ - - -H 'accept: application/json' \ - - -H 'Authorization: Bearer ' -``` - diff --git a/develop/api/artifact_versions/create-artifact-version.mdx b/develop/api/artifact_versions/create-artifact-version.mdx deleted file mode 100644 index ab51e05..0000000 --- a/develop/api/artifact_versions/create-artifact-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/artifact_versions ---- \ No newline at end of file diff --git a/develop/api/artifact_versions/delete-artifact-version.mdx b/develop/api/artifact_versions/delete-artifact-version.mdx deleted file mode 100644 index bd34925..0000000 --- a/develop/api/artifact_versions/delete-artifact-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/artifact_versions/{artifact_version_id} ---- \ No newline at end of file diff --git a/develop/api/artifact_versions/get-artifact-version.mdx b/develop/api/artifact_versions/get-artifact-version.mdx deleted file mode 100644 index 325d044..0000000 --- a/develop/api/artifact_versions/get-artifact-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/artifact_versions/{artifact_version_id} ---- \ No newline at end of file diff --git a/develop/api/artifact_versions/get-artifact-visualization.mdx b/develop/api/artifact_versions/get-artifact-visualization.mdx deleted file mode 100644 index 26b7d64..0000000 --- a/develop/api/artifact_versions/get-artifact-visualization.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/artifact_versions/{artifact_version_id}/visualize ---- \ No newline at end of file diff --git a/develop/api/artifact_versions/list-artifact-versions.mdx b/develop/api/artifact_versions/list-artifact-versions.mdx deleted file mode 100644 index 6f60495..0000000 --- a/develop/api/artifact_versions/list-artifact-versions.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/artifact_versions ---- \ No newline at end of file diff --git a/develop/api/artifact_versions/prune-artifact-versions.mdx b/develop/api/artifact_versions/prune-artifact-versions.mdx deleted file mode 100644 index ed80c48..0000000 --- a/develop/api/artifact_versions/prune-artifact-versions.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/artifact_versions/ ---- \ No newline at end of file diff --git a/develop/api/artifact_versions/update-artifact-version.mdx b/develop/api/artifact_versions/update-artifact-version.mdx deleted file mode 100644 index 0382d3e..0000000 --- a/develop/api/artifact_versions/update-artifact-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put 
/api/v1/artifact_versions/{artifact_version_id} ---- \ No newline at end of file diff --git a/develop/api/artifacts/create-artifact.mdx b/develop/api/artifacts/create-artifact.mdx deleted file mode 100644 index 9ae79be..0000000 --- a/develop/api/artifacts/create-artifact.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/artifacts ---- \ No newline at end of file diff --git a/develop/api/artifacts/delete-artifact.mdx b/develop/api/artifacts/delete-artifact.mdx deleted file mode 100644 index 7069744..0000000 --- a/develop/api/artifacts/delete-artifact.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/artifacts/{artifact_id} ---- \ No newline at end of file diff --git a/develop/api/artifacts/get-artifact.mdx b/develop/api/artifacts/get-artifact.mdx deleted file mode 100644 index bdb398c..0000000 --- a/develop/api/artifacts/get-artifact.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/artifacts/{artifact_id} ---- \ No newline at end of file diff --git a/develop/api/artifacts/list-artifacts.mdx b/develop/api/artifacts/list-artifacts.mdx deleted file mode 100644 index 9844f69..0000000 --- a/develop/api/artifacts/list-artifacts.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/artifacts ---- \ No newline at end of file diff --git a/develop/api/artifacts/update-artifact.mdx b/develop/api/artifacts/update-artifact.mdx deleted file mode 100644 index 15b8137..0000000 --- a/develop/api/artifacts/update-artifact.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/artifacts/{artifact_id} ---- \ No newline at end of file diff --git a/develop/api/auth/api-token.mdx b/develop/api/auth/api-token.mdx deleted file mode 100644 index 615d443..0000000 --- a/develop/api/auth/api-token.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/api_token ---- \ No newline at end of file diff --git a/develop/api/auth/device-authorization.mdx b/develop/api/auth/device-authorization.mdx deleted file mode 100644 index 851ceca..0000000 --- a/develop/api/auth/device-authorization.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/device_authorization ---- \ No newline at end of file diff --git a/develop/api/auth/logout.mdx b/develop/api/auth/logout.mdx deleted file mode 100644 index c45ec70..0000000 --- a/develop/api/auth/logout.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/logout ---- \ No newline at end of file diff --git a/develop/api/auth/token.mdx b/develop/api/auth/token.mdx deleted file mode 100644 index 9cc1f94..0000000 --- a/develop/api/auth/token.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/login ---- \ No newline at end of file diff --git a/develop/api/authorized_devices/delete-authorized-device.mdx b/develop/api/authorized_devices/delete-authorized-device.mdx deleted file mode 100644 index c85118a..0000000 --- a/develop/api/authorized_devices/delete-authorized-device.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/devices/{device_id} ---- \ No newline at end of file diff --git a/develop/api/authorized_devices/get-authorization-device.mdx b/develop/api/authorized_devices/get-authorization-device.mdx deleted file mode 100644 index a749a79..0000000 --- a/develop/api/authorized_devices/get-authorization-device.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/devices/{device_id} ---- \ No newline at end of file diff --git a/develop/api/authorized_devices/list-authorized-devices.mdx b/develop/api/authorized_devices/list-authorized-devices.mdx deleted file mode 100644 index 
242501e..0000000 --- a/develop/api/authorized_devices/list-authorized-devices.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/devices ---- \ No newline at end of file diff --git a/develop/api/authorized_devices/update-authorized-device.mdx b/develop/api/authorized_devices/update-authorized-device.mdx deleted file mode 100644 index 2185e20..0000000 --- a/develop/api/authorized_devices/update-authorized-device.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/devices/{device_id} ---- \ No newline at end of file diff --git a/develop/api/authorized_devices/verify-authorized-device.mdx b/develop/api/authorized_devices/verify-authorized-device.mdx deleted file mode 100644 index 47cf070..0000000 --- a/develop/api/authorized_devices/verify-authorized-device.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/devices/{device_id}/verify ---- \ No newline at end of file diff --git a/develop/api/builds/delete-build.mdx b/develop/api/builds/delete-build.mdx deleted file mode 100644 index f8a1810..0000000 --- a/develop/api/builds/delete-build.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/pipeline_builds/{build_id} ---- \ No newline at end of file diff --git a/develop/api/builds/get-build.mdx b/develop/api/builds/get-build.mdx deleted file mode 100644 index c9143b3..0000000 --- a/develop/api/builds/get-build.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipeline_builds/{build_id} ---- \ No newline at end of file diff --git a/develop/api/builds/list-builds.mdx b/develop/api/builds/list-builds.mdx deleted file mode 100644 index 048ca36..0000000 --- a/develop/api/builds/list-builds.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipeline_builds ---- \ No newline at end of file diff --git a/develop/api/code_repositories/delete-code-repository.mdx b/develop/api/code_repositories/delete-code-repository.mdx deleted file mode 100644 index 1efa7e5..0000000 --- a/develop/api/code_repositories/delete-code-repository.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/code_repositories/{code_repository_id} ---- \ No newline at end of file diff --git a/develop/api/code_repositories/get-code-repository.mdx b/develop/api/code_repositories/get-code-repository.mdx deleted file mode 100644 index 18fdfe2..0000000 --- a/develop/api/code_repositories/get-code-repository.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/code_repositories/{code_repository_id} ---- \ No newline at end of file diff --git a/develop/api/code_repositories/list-code-repositories.mdx b/develop/api/code_repositories/list-code-repositories.mdx deleted file mode 100644 index a6dd95f..0000000 --- a/develop/api/code_repositories/list-code-repositories.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/code_repositories ---- \ No newline at end of file diff --git a/develop/api/code_repositories/update-code-repository.mdx b/develop/api/code_repositories/update-code-repository.mdx deleted file mode 100644 index da06d5a..0000000 --- a/develop/api/code_repositories/update-code-repository.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/code_repositories/{code_repository_id} ---- \ No newline at end of file diff --git a/develop/api/deployments/delete-deployment.mdx b/develop/api/deployments/delete-deployment.mdx deleted file mode 100644 index 362824e..0000000 --- a/develop/api/deployments/delete-deployment.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/pipeline_deployments/{deployment_id} ---- \ No newline at end of file 
diff --git a/develop/api/deployments/deployment-logs.mdx b/develop/api/deployments/deployment-logs.mdx deleted file mode 100644 index 8d076e2..0000000 --- a/develop/api/deployments/deployment-logs.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipeline_deployments/{deployment_id}/logs ---- \ No newline at end of file diff --git a/develop/api/deployments/get-deployment.mdx b/develop/api/deployments/get-deployment.mdx deleted file mode 100644 index f2fdffd..0000000 --- a/develop/api/deployments/get-deployment.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipeline_deployments/{deployment_id} ---- \ No newline at end of file diff --git a/develop/api/deployments/list-deployments.mdx b/develop/api/deployments/list-deployments.mdx deleted file mode 100644 index 948b928..0000000 --- a/develop/api/deployments/list-deployments.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipeline_deployments ---- \ No newline at end of file diff --git a/develop/api/endpoint/create.mdx b/develop/api/endpoint/create.mdx deleted file mode 100644 index 5689f1b..0000000 --- a/develop/api/endpoint/create.mdx +++ /dev/null @@ -1,4 +0,0 @@ ---- -title: 'Create Plant' -openapi: 'POST /plants' ---- diff --git a/develop/api/endpoint/delete.mdx b/develop/api/endpoint/delete.mdx deleted file mode 100644 index 657dfc8..0000000 --- a/develop/api/endpoint/delete.mdx +++ /dev/null @@ -1,4 +0,0 @@ ---- -title: 'Delete Plant' -openapi: 'DELETE /plants/{id}' ---- diff --git a/develop/api/endpoint/get.mdx b/develop/api/endpoint/get.mdx deleted file mode 100644 index 0285123..0000000 --- a/develop/api/endpoint/get.mdx +++ /dev/null @@ -1,4 +0,0 @@ ---- -title: 'Get pipelines' -openapi: 'GET /api/v1/pipelines' ---- diff --git a/develop/api/event-sources/create-event-source.mdx b/develop/api/event-sources/create-event-source.mdx deleted file mode 100644 index e8989e9..0000000 --- a/develop/api/event-sources/create-event-source.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/event-sources ---- \ No newline at end of file diff --git a/develop/api/event-sources/delete-event-source.mdx b/develop/api/event-sources/delete-event-source.mdx deleted file mode 100644 index 2658766..0000000 --- a/develop/api/event-sources/delete-event-source.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/event-sources/{event_source_id} ---- \ No newline at end of file diff --git a/develop/api/event-sources/get-event-source.mdx b/develop/api/event-sources/get-event-source.mdx deleted file mode 100644 index 2f3e25a..0000000 --- a/develop/api/event-sources/get-event-source.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/event-sources/{event_source_id} ---- \ No newline at end of file diff --git a/develop/api/event-sources/list-event-sources.mdx b/develop/api/event-sources/list-event-sources.mdx deleted file mode 100644 index 4b40cf6..0000000 --- a/develop/api/event-sources/list-event-sources.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/event-sources ---- \ No newline at end of file diff --git a/develop/api/event-sources/update-event-source.mdx b/develop/api/event-sources/update-event-source.mdx deleted file mode 100644 index 216be6c..0000000 --- a/develop/api/event-sources/update-event-source.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/event-sources/{event_source_id} ---- \ No newline at end of file diff --git a/develop/api/flavors/create-flavor.mdx b/develop/api/flavors/create-flavor.mdx deleted file mode 100644 index 9872319..0000000 --- 
a/develop/api/flavors/create-flavor.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/flavors ---- \ No newline at end of file diff --git a/develop/api/flavors/delete-flavor.mdx b/develop/api/flavors/delete-flavor.mdx deleted file mode 100644 index 879bea8..0000000 --- a/develop/api/flavors/delete-flavor.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/flavors/{flavor_id} ---- \ No newline at end of file diff --git a/develop/api/flavors/get-flavor.mdx b/develop/api/flavors/get-flavor.mdx deleted file mode 100644 index 081e590..0000000 --- a/develop/api/flavors/get-flavor.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/flavors/{flavor_id} ---- \ No newline at end of file diff --git a/develop/api/flavors/list-flavors.mdx b/develop/api/flavors/list-flavors.mdx deleted file mode 100644 index c39abea..0000000 --- a/develop/api/flavors/list-flavors.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/flavors ---- \ No newline at end of file diff --git a/develop/api/flavors/sync-flavors.mdx b/develop/api/flavors/sync-flavors.mdx deleted file mode 100644 index 928b2ba..0000000 --- a/develop/api/flavors/sync-flavors.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: patch /api/v1/flavors/sync ---- \ No newline at end of file diff --git a/develop/api/flavors/update-flavor.mdx b/develop/api/flavors/update-flavor.mdx deleted file mode 100644 index 7e573eb..0000000 --- a/develop/api/flavors/update-flavor.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/flavors/{flavor_id} ---- \ No newline at end of file diff --git a/develop/api/health.mdx b/develop/api/health.mdx deleted file mode 100644 index f27caa8..0000000 --- a/develop/api/health.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /health ---- \ No newline at end of file diff --git a/develop/api/model_version_artifacts/list-model-version-artifact-links.mdx b/develop/api/model_version_artifacts/list-model-version-artifact-links.mdx deleted file mode 100644 index ab57f5b..0000000 --- a/develop/api/model_version_artifacts/list-model-version-artifact-links.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/model_version_artifacts ---- \ No newline at end of file diff --git a/develop/api/model_version_pipeline_runs/list-model-version-pipeline-run-links.mdx b/develop/api/model_version_pipeline_runs/list-model-version-pipeline-run-links.mdx deleted file mode 100644 index d942201..0000000 --- a/develop/api/model_version_pipeline_runs/list-model-version-pipeline-run-links.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/model_version_pipeline_runs ---- \ No newline at end of file diff --git a/develop/api/model_versions/delete-all-model-version-artifact-links.mdx b/develop/api/model_versions/delete-all-model-version-artifact-links.mdx deleted file mode 100644 index b5e90ed..0000000 --- a/develop/api/model_versions/delete-all-model-version-artifact-links.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/model_versions/{model_version_id}/artifacts ---- \ No newline at end of file diff --git a/develop/api/model_versions/delete-model-version-artifact-link.mdx b/develop/api/model_versions/delete-model-version-artifact-link.mdx deleted file mode 100644 index 78e57a3..0000000 --- a/develop/api/model_versions/delete-model-version-artifact-link.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/model_versions/{model_version_id}/artifacts/{model_version_artifact_link_name_or_id} ---- \ No newline at end of file diff --git 
a/develop/api/model_versions/delete-model-version-pipeline-run-link.mdx b/develop/api/model_versions/delete-model-version-pipeline-run-link.mdx deleted file mode 100644 index 16c741f..0000000 --- a/develop/api/model_versions/delete-model-version-pipeline-run-link.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/model_versions/{model_version_id}/runs/{model_version_pipeline_run_link_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/model_versions/delete-model-version.mdx b/develop/api/model_versions/delete-model-version.mdx deleted file mode 100644 index 070161b..0000000 --- a/develop/api/model_versions/delete-model-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/model_versions/{model_version_id} ---- \ No newline at end of file diff --git a/develop/api/model_versions/get-model-version.mdx b/develop/api/model_versions/get-model-version.mdx deleted file mode 100644 index 14b76ff..0000000 --- a/develop/api/model_versions/get-model-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/model_versions/{model_version_id} ---- \ No newline at end of file diff --git a/develop/api/model_versions/list-model-versions.mdx b/develop/api/model_versions/list-model-versions.mdx deleted file mode 100644 index b9a25a1..0000000 --- a/develop/api/model_versions/list-model-versions.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/model_versions ---- \ No newline at end of file diff --git a/develop/api/model_versions/update-model-version.mdx b/develop/api/model_versions/update-model-version.mdx deleted file mode 100644 index 53af6ef..0000000 --- a/develop/api/model_versions/update-model-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/model_versions/{model_version_id} ---- \ No newline at end of file diff --git a/develop/api/models/delete-model.mdx b/develop/api/models/delete-model.mdx deleted file mode 100644 index e36be81..0000000 --- a/develop/api/models/delete-model.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/models/{model_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/models/get-model.mdx b/develop/api/models/get-model.mdx deleted file mode 100644 index 453dd8a..0000000 --- a/develop/api/models/get-model.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/models/{model_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/models/list-model-versions.mdx b/develop/api/models/list-model-versions.mdx deleted file mode 100644 index d74fade..0000000 --- a/develop/api/models/list-model-versions.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/models/{model_name_or_id}/model_versions ---- \ No newline at end of file diff --git a/develop/api/models/list-models.mdx b/develop/api/models/list-models.mdx deleted file mode 100644 index 98895d6..0000000 --- a/develop/api/models/list-models.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/models ---- \ No newline at end of file diff --git a/develop/api/models/update-model.mdx b/develop/api/models/update-model.mdx deleted file mode 100644 index e49cc61..0000000 --- a/develop/api/models/update-model.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/models/{model_id} ---- \ No newline at end of file diff --git a/develop/api/openapi.json b/develop/api/openapi.json deleted file mode 100644 index 7658ce6..0000000 --- a/develop/api/openapi.json +++ /dev/null @@ -1,29981 +0,0 @@ -{ - "openapi": "3.1.0", - "info": { "title": "ZenML", "version": "0.65.0" }, - "paths": { - 
"/health": { - "get": { - "summary": "Health", - "description": "Get health status of the server.\n\nReturns:\n String representing the health status of the server.", - "operationId": "health_health_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "string", - "title": "Response Health Health Get" - } - } - } - } - } - } - }, - "/api/v1/actions": { - "get": { - "tags": ["actions"], - "summary": "List Actions", - "description": "List actions.\n\nArgs:\n action_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n Page of actions.", - "operationId": "list_actions_api_v1_actions_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "flavor", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Flavor" - } - }, - { - "name": "plugin_subtype", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Plugin Subtype" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ActionResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" 
- }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["actions"], - "summary": "Create Action", - "description": "Creates an action.\n\nArgs:\n action: Action to create.\n\nRaises:\n ValueError: If the action handler for flavor/type is not valid.\n\nReturns:\n The created action.", - "operationId": "create_action_api_v1_actions_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ActionRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ActionResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/actions/{action_id}": { - "get": { - "tags": ["actions"], - "summary": "Get Action", - "description": "Returns the requested action.\n\nArgs:\n action_id: ID of the action.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nRaises:\n ValueError: If the action handler for flavor/type is not valid.\n\nReturns:\n The requested action.", - "operationId": "get_action_api_v1_actions__action_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "action_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Action Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ActionResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["actions"], - "summary": "Update Action", - "description": "Update an action.\n\nArgs:\n action_id: ID of the action to update.\n action_update: The action update.\n\nRaises:\n ValueError: If the action handler for flavor/type is not 
valid.\n\nReturns:\n The updated action.", - "operationId": "update_action_api_v1_actions__action_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "action_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Action Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ActionUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ActionResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["actions"], - "summary": "Delete Action", - "description": "Delete an action.\n\nArgs:\n action_id: ID of the action.\n force: Flag deciding whether to force delete the action.\n\nRaises:\n ValueError: If the action handler for flavor/type is not valid.", - "operationId": "delete_action_api_v1_actions__action_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "action_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Action Id" - } - }, - { - "name": "force", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": false, "title": "Force" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/artifacts": { - "get": { - "tags": ["artifacts"], - "summary": "List Artifacts", - "description": "Get artifacts according to query filters.\n\nArgs:\n artifact_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The artifacts according to query filters.", - "operationId": "list_artifacts_api_v1_artifacts_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": 
"created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "has_custom_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Has Custom Name" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ArtifactResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["artifacts"], - "summary": "Create Artifact", - "description": "Create a new artifact.\n\nArgs:\n artifact: The artifact to create.\n\nReturns:\n The created artifact.", - "operationId": "create_artifact_api_v1_artifacts_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ArtifactRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ArtifactResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { 
- "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/artifacts/{artifact_id}": { - "get": { - "tags": ["artifacts"], - "summary": "Get Artifact", - "description": "Get an artifact by ID.\n\nArgs:\n artifact_id: The ID of the artifact to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The artifact with the given ID.", - "operationId": "get_artifact_api_v1_artifacts__artifact_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "artifact_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Artifact Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ArtifactResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["artifacts"], - "summary": "Update Artifact", - "description": "Update an artifact by ID.\n\nArgs:\n artifact_id: The ID of the artifact to update.\n artifact_update: The update to apply to the artifact.\n\nReturns:\n The updated artifact.", - "operationId": "update_artifact_api_v1_artifacts__artifact_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "artifact_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Artifact Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ArtifactUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ArtifactResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable 
Entity" - } - } - }, - "delete": { - "tags": ["artifacts"], - "summary": "Delete Artifact", - "description": "Delete an artifact by ID.\n\nArgs:\n artifact_id: The ID of the artifact to delete.", - "operationId": "delete_artifact_api_v1_artifacts__artifact_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "artifact_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Artifact Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/artifact_versions": { - "get": { - "tags": ["artifact_versions"], - "summary": "List Artifact Versions", - "description": "Get artifact versions according to query filters.\n\nArgs:\n artifact_version_filter_model: Filter model used for pagination,\n sorting, filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: The authentication context.\n\nReturns:\n The artifact versions according to query filters.", - "operationId": "list_artifact_versions_api_v1_artifact_versions_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": 
"Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "artifact_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Artifact Id" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "version", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Version" - } - }, - { - "name": "version_number", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "integer" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Version Number" - } - }, - { - "name": "uri", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Uri" - } - }, - { - "name": "materializer", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Materializer" - } - }, - { - "name": "type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Type" - } - }, - { - "name": "data_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Data Type" - } - }, - { - "name": "artifact_store_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Artifact Store Id" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "only_unused", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "default": false, - "title": "Only Unused" - } - }, - { - "name": "has_custom_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Has Custom Name" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ArtifactVersionResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["artifact_versions"], - "summary": "Create Artifact Version", - "description": "Create a 
new artifact version.\n\nArgs:\n artifact_version: The artifact version to create.\n\nReturns:\n The created artifact version.", - "operationId": "create_artifact_version_api_v1_artifact_versions_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ArtifactVersionRequest" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ArtifactVersionResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/artifact_versions/{artifact_version_id}": { - "get": { - "tags": ["artifact_versions"], - "summary": "Get Artifact Version", - "description": "Get an artifact version by ID.\n\nArgs:\n artifact_version_id: The ID of the artifact version to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The artifact version with the given ID.", - "operationId": "get_artifact_version_api_v1_artifact_versions__artifact_version_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "artifact_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Artifact Version Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ArtifactVersionResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["artifact_versions"], - "summary": "Update Artifact Version", - "description": "Update an artifact by ID.\n\nArgs:\n artifact_version_id: The ID of the artifact version to update.\n artifact_version_update: The update to apply to the artifact version.\n\nReturns:\n The updated artifact.", - "operationId": "update_artifact_version_api_v1_artifact_versions__artifact_version_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "artifact_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - 
"format": "uuid", - "title": "Artifact Version Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ArtifactVersionUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ArtifactVersionResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["artifact_versions"], - "summary": "Delete Artifact Version", - "description": "Delete an artifact version by ID.\n\nArgs:\n artifact_version_id: The ID of the artifact version to delete.", - "operationId": "delete_artifact_version_api_v1_artifact_versions__artifact_version_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "artifact_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Artifact Version Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/artifact_versions/": { - "delete": { - "tags": ["artifact_versions"], - "summary": "Prune Artifact Versions", - "description": "Prunes unused artifact versions and their artifacts.\n\nArgs:\n only_versions: Only delete artifact versions, keeping artifacts", - "operationId": "prune_artifact_versions_api_v1_artifact_versions__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "only_versions", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": true, - "title": "Only Versions" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - 
"application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/artifact_versions/{artifact_version_id}/visualize": { - "get": { - "tags": ["artifact_versions"], - "summary": "Get Artifact Visualization", - "description": "Get the visualization of an artifact.\n\nArgs:\n artifact_version_id: ID of the artifact version for which to get the visualization.\n index: Index of the visualization to get (if there are multiple).\n\nReturns:\n The visualization of the artifact version.", - "operationId": "get_artifact_visualization_api_v1_artifact_versions__artifact_version_id__visualize_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "artifact_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Artifact Version Id" - } - }, - { - "name": "index", - "in": "query", - "required": false, - "schema": { "type": "integer", "default": 0, "title": "Index" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/LoadedVisualization" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/login": { - "post": { - "tags": ["auth"], - "summary": "Token", - "description": "OAuth2 token endpoint.\n\nArgs:\n request: The request object.\n response: The response object.\n auth_form_data: The OAuth 2.0 authentication form data.\n\nReturns:\n An access token or a redirect response.\n\nRaises:\n ValueError: If the grant type is invalid.", - "operationId": "token_api_v1_login_post", - "requestBody": { - "content": { - "application/x-www-form-urlencoded": { - "schema": { - "allOf": [ - { - "$ref": "#/components/schemas/Body_token_api_v1_login_post" - } - ], - "title": "Body" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/OAuthTokenResponse" }, - { "$ref": "#/components/schemas/OAuthRedirectResponse" } - ], - "title": "Response Token Api V1 Login Post" - } - } - } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/HTTPValidationError" } - } - } - } - } - } - }, - "/api/v1/logout": { - "get": { - "tags": ["auth"], - "summary": "Logout", - "description": "Logs out the user.\n\nArgs:\n response: The response object.", - "operationId": "logout_api_v1_logout_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - 
"schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - } - } - } - }, - "/api/v1/device_authorization": { - "post": { - "tags": ["auth"], - "summary": "Device Authorization", - "description": "OAuth2 device authorization endpoint.\n\nThis endpoint implements the OAuth2 device authorization grant flow as\ndefined in https://tools.ietf.org/html/rfc8628. It is called to initiate\nthe device authorization flow by requesting a device and user code for a\ngiven client ID.\n\nFor a new client ID, a new OAuth device is created, stored in the DB and\nreturned to the client along with a pair of newly generated device and user\ncodes. If a device for the given client ID already exists, the existing\nDB entry is reused and new device and user codes are generated.\n\nArgs:\n request: The request object.\n client_id: The client ID.\n\nReturns:\n The device authorization response.", - "operationId": "device_authorization_api_v1_device_authorization_post", - "requestBody": { - "content": { - "application/x-www-form-urlencoded": { - "schema": { - "$ref": "#/components/schemas/Body_device_authorization_api_v1_device_authorization_post" - } - } - }, - "required": true - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/OAuthDeviceAuthorizationResponse" - } - } - } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/HTTPValidationError" } - } - } - } - } - } - }, - "/api/v1/api_token": { - "get": { - "tags": ["auth"], - "summary": "Api Token", - "description": "Get a workload API token for the current user.\n\nArgs:\n pipeline_id: The ID of the pipeline to get the API token for.\n schedule_id: The ID of the schedule to get the API token for.\n expires_minutes: The number of minutes for which the API token should\n be valid. 
If not provided, the API token will be valid indefinitely.\n auth_context: The authentication context.\n\nReturns:\n The API token.\n\nRaises:\n HTTPException: If the user is not authenticated.\n AuthorizationException: If trying to scope the API token to a different\n pipeline/schedule than the token used to authorize this request.", - "operationId": "api_token_api_v1_api_token_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "schedule_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Schedule Id" - } - }, - { - "name": "expires_minutes", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "Expires Minutes" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "string", - "title": "Response Api Token Api V1 Api Token Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/HTTPValidationError" } - } - } - } - } - } - }, - "/api/v1/devices": { - "get": { - "tags": ["authorized_devices"], - "summary": "List Authorized Devices", - "description": "Gets a page of OAuth2 authorized devices belonging to the current user.\n\nArgs:\n filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: The current auth context.\n\nReturns:\n Page of OAuth2 authorized device objects.", - "operationId": "list_authorized_devices_api_v1_devices_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { 
"type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_user", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope User" - } - }, - { - "name": "expires", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Expires" - } - }, - { - "name": "client_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Client Id" - } - }, - { - "name": "status", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/OAuthDeviceStatus" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Status" - } - }, - { - "name": "trusted_device", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "boolean" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Trusted Device" - } - }, - { - "name": "failed_auth_attempts", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "integer" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Failed Auth Attempts" - } - }, - { - "name": "last_login", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Last Login" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_OAuthDeviceResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/devices/{device_id}": { - "get": { - "tags": ["authorized_devices"], - "summary": "Get Authorization Device", - "description": "Gets a specific OAuth2 authorized device using its unique ID.\n\nArgs:\n device_id: The ID of the OAuth2 authorized device to get.\n user_code: The user code of the OAuth2 authorized device to get. 
Needs\n to be specified with devices that have not been verified yet.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: The current auth context.\n\nReturns:\n A specific OAuth2 authorized device object.\n\nRaises:\n KeyError: If the device with the given ID does not exist, does not\n belong to the current user or could not be verified using the\n given user code.", - "operationId": "get_authorization_device_api_v1_devices__device_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "device_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Device Id" - } - }, - { - "name": "user_code", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "User Code" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/OAuthDeviceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["authorized_devices"], - "summary": "Update Authorized Device", - "description": "Updates a specific OAuth2 authorized device using its unique ID.\n\nArgs:\n device_id: The ID of the OAuth2 authorized device to update.\n update: The model containing the attributes to update.\n auth_context: The current auth context.\n\nReturns:\n The updated OAuth2 authorized device object.\n\nRaises:\n KeyError: If the device with the given ID does not exist or does not\n belong to the current user.", - "operationId": "update_authorized_device_api_v1_devices__device_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "device_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Device Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/OAuthDeviceUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/OAuthDeviceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["authorized_devices"], - "summary": "Delete Authorized Device", - "description": "Deletes a specific OAuth2 authorized device using its unique ID.\n\nArgs:\n device_id: The ID of the 
OAuth2 authorized device to delete.\n auth_context: The current auth context.\n\nRaises:\n KeyError: If the device with the given ID does not exist or does not\n belong to the current user.", - "operationId": "delete_authorized_device_api_v1_devices__device_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "device_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Device Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/devices/{device_id}/verify": { - "put": { - "tags": ["authorized_devices"], - "summary": "Verify Authorized Device", - "description": "Verifies a specific OAuth2 authorized device using its unique ID.\n\nThis endpoint implements the OAuth2 device authorization grant flow as\ndefined in https://tools.ietf.org/html/rfc8628. It is called to verify\nthe user code for a given device ID.\n\nIf the user code is valid, the device is marked as verified and associated\nwith the user that authorized the device. This association is required to\nbe able to issue access tokens or revoke the device later on.\n\nArgs:\n device_id: The ID of the OAuth2 authorized device to update.\n request: The model containing the verification request.\n auth_context: The current auth context.\n\nReturns:\n The updated OAuth2 authorized device object.\n\nRaises:\n ValueError: If the device verification request fails.", - "operationId": "verify_authorized_device_api_v1_devices__device_id__verify_put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "device_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Device Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/OAuthDeviceVerificationRequest" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/OAuthDeviceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/code_repositories": { - "get": { - "tags": ["code_repositories"], - "summary": "List Code Repositories", - "description": "Gets a page of code repositories.\n\nArgs:\n filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n Page of code repository 
objects.", - "operationId": "list_code_repositories_api_v1_code_repositories_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_CodeRepositoryResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/code_repositories/{code_repository_id}": { - "get": { - "tags": ["code_repositories"], - "summary": "Get Code Repository", - "description": "Gets a specific code repository using its unique ID.\n\nArgs:\n code_repository_id: The ID of the code repository to get.\n hydrate: Flag deciding 
whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A specific code repository object.", - "operationId": "get_code_repository_api_v1_code_repositories__code_repository_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "code_repository_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Code Repository Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/CodeRepositoryResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["code_repositories"], - "summary": "Update Code Repository", - "description": "Updates a code repository.\n\nArgs:\n code_repository_id: The ID of the code repository to update.\n update: The model containing the attributes to update.\n\nReturns:\n The updated code repository object.", - "operationId": "update_code_repository_api_v1_code_repositories__code_repository_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "code_repository_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Code Repository Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/CodeRepositoryUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/CodeRepositoryResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["code_repositories"], - "summary": "Delete Code Repository", - "description": "Deletes a specific code repository.\n\nArgs:\n code_repository_id: The ID of the code repository to delete.", - "operationId": "delete_code_repository_api_v1_code_repositories__code_repository_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "code_repository_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - 
"title": "Code Repository Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/plugin-flavors": { - "get": { - "tags": ["plugins"], - "summary": "List Flavors", - "description": "Returns all event flavors.\n\nArgs:\n type: The type of Plugin\n subtype: The subtype of the plugin\n page: Page for pagination (offset +1)\n size: Page size for pagination\n hydrate: Whether to hydrate the response bodies\n\nReturns:\n A page of flavors.", - "operationId": "list_flavors_api_v1_plugin_flavors_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "type", - "in": "query", - "required": true, - "schema": { "$ref": "#/components/schemas/PluginType" } - }, - { - "name": "subtype", - "in": "query", - "required": true, - "schema": { - "$ref": "#/components/schemas/zenml__enums__PluginSubType__2" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { "type": "integer", "default": 1, "title": "Page" } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { "type": "integer", "default": 20, "title": "Size" } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_BasePluginFlavorResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/plugin-flavors/{name}": { - "get": { - "tags": ["plugins"], - "summary": "Get Flavor", - "description": "Returns the requested flavor.\n\nArgs:\n name: Name of the flavor.\n type: Type of Plugin\n subtype: Subtype of Plugin\n\nReturns:\n The requested flavor response.", - "operationId": "get_flavor_api_v1_plugin_flavors__name__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "name", - "in": "path", - "required": true, - "schema": { "type": "string", "title": "Name" } - }, - { - "name": "type", - "in": "query", - "required": true, - "schema": { "$ref": "#/components/schemas/PluginType" } - }, - { - "name": "subtype", - "in": "query", - "required": true, - "schema": { - "$ref": 
"#/components/schemas/zenml__enums__PluginSubType__2" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/BasePluginFlavorResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/event-sources": { - "get": { - "tags": ["event-sources"], - "summary": "List Event Sources", - "description": "Returns all event_sources.\n\nArgs:\n event_source_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All event_sources.", - "operationId": "list_event_sources_api_v1_event_sources_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "flavor", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Flavor" - } - }, - { - "name": "plugin_subtype", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { 
"type": "null" } - ], - "title": "Plugin Subtype" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_EventSourceResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["event-sources"], - "summary": "Create Event Source", - "description": "Creates an event source.\n\nArgs:\n event_source: EventSource to register.\n\nReturns:\n The created event source.\n\nRaises:\n ValueError: If the plugin for an event source is not a valid event\n source plugin.", - "operationId": "create_event_source_api_v1_event_sources_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/EventSourceRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/EventSourceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/event-sources/{event_source_id}": { - "get": { - "tags": ["event-sources"], - "summary": "Get Event Source", - "description": "Returns the requested event_source.\n\nArgs:\n event_source_id: ID of the event_source.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The requested event_source.\n\nRaises:\n ValueError: If the plugin for an event source is not a valid event\n source plugin.", - "operationId": "get_event_source_api_v1_event_sources__event_source_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "event_source_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Event Source Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/EventSourceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - 
}, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["event-sources"], - "summary": "Update Event Source", - "description": "Updates an event_source.\n\nArgs:\n event_source_id: Name of the event_source.\n event_source_update: EventSource to use for the update.\n\nReturns:\n The updated event_source.\n\nRaises:\n ValueError: If the plugin for an event source is not a valid event\n source plugin.", - "operationId": "update_event_source_api_v1_event_sources__event_source_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "event_source_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Event Source Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/EventSourceUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/EventSourceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["event-sources"], - "summary": "Delete Event Source", - "description": "Deletes a event_source.\n\nArgs:\n event_source_id: Name of the event_source.\n force: Flag deciding whether to force delete the event source.\n\nRaises:\n ValueError: If the plugin for an event source is not a valid event\n source plugin.", - "operationId": "delete_event_source_api_v1_event_sources__event_source_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "event_source_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Event Source Id" - } - }, - { - "name": "force", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": false, "title": "Force" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - 
"application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/flavors": { - "get": { - "tags": ["flavors"], - "summary": "List Flavors", - "description": "Returns all flavors.\n\nArgs:\n flavor_filter_model: Filter model used for pagination, sorting,\n filtering\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All flavors.", - "operationId": "list_flavors_api_v1_flavors_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Type" - } - }, - { - "name": "integration", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Integration" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_FlavorResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": 
"Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["flavors"], - "summary": "Create Flavor", - "description": "Creates a stack component flavor.\n\nArgs:\n flavor: Stack component flavor to register.\n\nReturns:\n The created stack component flavor.", - "operationId": "create_flavor_api_v1_flavors_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/FlavorRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/FlavorResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/flavors/{flavor_id}": { - "get": { - "tags": ["flavors"], - "summary": "Get Flavor", - "description": "Returns the requested flavor.\n\nArgs:\n flavor_id: ID of the flavor.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The requested stack.", - "operationId": "get_flavor_api_v1_flavors__flavor_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "flavor_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Flavor Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/FlavorResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["flavors"], - "summary": "Update Flavor", - "description": "Updates a flavor.\n\n# noqa: DAR401\n\nArgs:\n flavor_id: ID of the flavor to update.\n flavor_update: Flavor 
update.\n\nReturns:\n The updated flavor.", - "operationId": "update_flavor_api_v1_flavors__flavor_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "flavor_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Flavor Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/FlavorUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/FlavorResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["flavors"], - "summary": "Delete Flavor", - "description": "Deletes a flavor.\n\nArgs:\n flavor_id: ID of the flavor.", - "operationId": "delete_flavor_api_v1_flavors__flavor_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "flavor_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Flavor Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/flavors/sync": { - "patch": { - "tags": ["flavors"], - "summary": "Sync Flavors", - "description": "Purge all in-built and integration flavors from the DB and sync.\n\nReturns:\n None if successful. 
Raises an exception otherwise.", - "operationId": "sync_flavors_api_v1_flavors_sync_patch", - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "403": { - "description": "Forbidden", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "404": { - "description": "Not Found", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "422": { - "description": "Unprocessable Entity", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - } - }, - "security": [{ "CookieOAuth2TokenBearer": [] }] - } - }, - "/api/v1/models": { - "get": { - "tags": ["models"], - "summary": "List Models", - "description": "Get models according to query filters.\n\nArgs:\n model_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The models according to query filters.", - "operationId": "list_models_api_v1_models_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": 
"string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/Page_ModelResponse_" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/models/{model_name_or_id}": { - "get": { - "tags": ["models"], - "summary": "Get Model", - "description": "Get a model by name or ID.\n\nArgs:\n model_name_or_id: The name or ID of the model to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The model with the given name or ID.", - "operationId": "get_model_api_v1_models__model_name_or_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Model Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ModelResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["models"], - "summary": "Delete Model", - "description": "Delete a model by name or ID.\n\nArgs:\n model_name_or_id: The name or ID of the model to delete.", - "operationId": "delete_model_api_v1_models__model_name_or_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Model Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - 
"description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/models/{model_id}": { - "put": { - "tags": ["models"], - "summary": "Update Model", - "description": "Updates a model.\n\nArgs:\n model_id: Name of the stack.\n model_update: Stack to use for the update.\n\nReturns:\n The updated model.", - "operationId": "update_model_api_v1_models__model_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ModelUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ModelResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/models/{model_name_or_id}/model_versions": { - "get": { - "tags": ["models"], - "summary": "List Model Versions", - "description": "Get model versions according to query filters.\n\nThis endpoint serves the purpose of allowing scoped filtering by model_id.\n\nArgs:\n model_name_or_id: The name or ID of the model to list in.\n model_version_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: The authentication context.\n\nReturns:\n The model versions according to query filters.", - "operationId": "list_model_versions_api_v1_models__model_name_or_id__model_versions_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Model Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": 
"query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "number", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "Number" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "stage", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/ModelStages" }, - { "type": "null" } - ], - "title": "Stage" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ModelVersionResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/model_versions": { - "get": { - "tags": ["model_versions"], - "summary": "List Model Versions", - "description": "Get model versions according to query filters.\n\nArgs:\n model_version_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: The authentication context.\n\nReturns:\n The model versions 
according to query filters.", - "operationId": "list_model_versions_api_v1_model_versions_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "number", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "Number" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "stage", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/ModelStages" }, - { "type": "null" } - ], - "title": "Stage" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ModelVersionResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": 
"#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/model_versions/{model_version_id}": { - "get": { - "tags": ["model_versions"], - "summary": "Get Model Version", - "description": "Get a model version by ID.\n\nArgs:\n model_version_id: id of the model version to be retrieved.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The model version with the given name or ID.", - "operationId": "get_model_version_api_v1_model_versions__model_version_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ModelVersionResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["model_versions"], - "summary": "Update Model Version", - "description": "Get all model versions by filter.\n\nArgs:\n model_version_id: The ID of model version to be updated.\n model_version_update_model: The model version to be updated.\n\nReturns:\n An updated model version.", - "operationId": "update_model_version_api_v1_model_versions__model_version_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ModelVersionUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ModelVersionResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { 
- "tags": ["model_versions"], - "summary": "Delete Model Version", - "description": "Delete a model by name or ID.\n\nArgs:\n model_version_id: The name or ID of the model version to delete.", - "operationId": "delete_model_version_api_v1_model_versions__model_version_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/model_versions/{model_version_id}/artifacts/{model_version_artifact_link_name_or_id}": { - "delete": { - "tags": ["model_versions"], - "summary": "Delete Model Version Artifact Link", - "description": "Deletes a model version to artifact link.\n\nArgs:\n model_version_id: ID of the model version containing the link.\n model_version_artifact_link_name_or_id: name or ID of the model\n version to artifact link to be deleted.", - "operationId": "delete_model_version_artifact_link_api_v1_model_versions__model_version_id__artifacts__model_version_artifact_link_name_or_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - }, - { - "name": "model_version_artifact_link_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Model Version Artifact Link Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/model_versions/{model_version_id}/artifacts": { - "delete": { - "tags": ["model_versions"], - "summary": "Delete All Model Version Artifact Links", - "description": "Deletes all model version to artifact links.\n\nArgs:\n model_version_id: ID of the model version containing links.\n only_links: Whether to only delete the link to the artifact.", - "operationId": 
"delete_all_model_version_artifact_links_api_v1_model_versions__model_version_id__artifacts_delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - }, - { - "name": "only_links", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": true, - "title": "Only Links" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/model_versions/{model_version_id}/runs/{model_version_pipeline_run_link_name_or_id}": { - "delete": { - "tags": ["model_versions"], - "summary": "Delete Model Version Pipeline Run Link", - "description": "Deletes a model version link.\n\nArgs:\n model_version_id: name or ID of the model version containing the link.\n model_version_pipeline_run_link_name_or_id: name or ID of the model\n version link to be deleted.", - "operationId": "delete_model_version_pipeline_run_link_api_v1_model_versions__model_version_id__runs__model_version_pipeline_run_link_name_or_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - }, - { - "name": "model_version_pipeline_run_link_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Model Version Pipeline Run Link Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/model_version_artifacts": { - "get": { - "tags": ["model_version_artifacts"], - "summary": "List Model Version Artifact Links", - "description": "Get model version to artifact links according to query filters.\n\nArgs:\n model_version_artifact_link_filter_model: Filter model used for\n pagination, sorting, filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The model 
version to artifact links according to query filters.", - "operationId": "list_model_version_artifact_links_api_v1_model_version_artifacts_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "model_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Model Id" - } - }, - { - "name": "model_version_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Model Version Id" - } - }, - { - "name": "artifact_version_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Artifact Version Id" - } - }, - { - "name": "artifact_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Artifact Name" - } - }, - { - "name": "only_data_artifacts", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "default": false, - "title": "Only Data Artifacts" - } - }, - { - "name": "only_model_artifacts", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "default": false, 
- "title": "Only Model Artifacts" - } - }, - { - "name": "only_deployment_artifacts", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "default": false, - "title": "Only Deployment Artifacts" - } - }, - { - "name": "has_custom_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Has Custom Name" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ModelVersionArtifactResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/model_version_pipeline_runs": { - "get": { - "tags": ["model_version_pipeline_runs"], - "summary": "List Model Version Pipeline Run Links", - "description": "Get model version to pipeline run links according to query filters.\n\nArgs:\n model_version_pipeline_run_link_filter_model: Filter model used for\n pagination, sorting, and filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The model version to pipeline run links according to query filters.", - "operationId": "list_model_version_pipeline_run_links_api_v1_model_version_pipeline_runs_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "workspace_id", 
- "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "model_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Model Id" - } - }, - { - "name": "model_version_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Model Version Id" - } - }, - { - "name": "pipeline_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Run Id" - } - }, - { - "name": "pipeline_run_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Pipeline Run Name" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ModelVersionPipelineRunResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipelines": { - "get": { - "tags": ["pipelines"], - "summary": "List Pipelines", - "description": "Gets a list of pipelines.\n\nArgs:\n pipeline_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n List of pipeline objects.", - "operationId": "list_pipelines_api_v1_pipelines_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": 
false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipelines/{pipeline_id}": { - "get": { - "tags": ["pipelines"], - "summary": "Get Pipeline", - "description": "Gets a specific pipeline using its unique id.\n\nArgs:\n pipeline_id: ID of the pipeline to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A specific pipeline object.", - "operationId": "get_pipeline_api_v1_pipelines__pipeline_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "pipeline_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Pipeline Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": 
"#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["pipelines"], - "summary": "Update Pipeline", - "description": "Updates the attribute on a specific pipeline using its unique id.\n\nArgs:\n pipeline_id: ID of the pipeline to get.\n pipeline_update: the model containing the attributes to update.\n\nReturns:\n The updated pipeline object.", - "operationId": "update_pipeline_api_v1_pipelines__pipeline_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "pipeline_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Pipeline Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["pipelines"], - "summary": "Delete Pipeline", - "description": "Deletes a specific pipeline.\n\nArgs:\n pipeline_id: ID of the pipeline to delete.", - "operationId": "delete_pipeline_api_v1_pipelines__pipeline_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "pipeline_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Pipeline Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipelines/{pipeline_id}/runs": { - "get": { - "tags": ["pipelines"], - "summary": "List Pipeline Runs", - "description": "Get pipeline runs according to query filters.\n\nArgs:\n pipeline_run_filter_model: Filter model used for pagination, sorting,\n filtering\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The pipeline runs according to query filters.", - "operationId": 
"list_pipeline_runs_api_v1_pipelines__pipeline_id__runs_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "pipeline_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "orchestrator_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Orchestrator Run Id" - } - }, - { - "name": "pipeline_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Pipeline Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "schedule_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Schedule 
Id" - } - }, - { - "name": "build_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Build Id" - } - }, - { - "name": "deployment_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Deployment Id" - } - }, - { - "name": "code_repository_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Code Repository Id" - } - }, - { - "name": "template_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Template Id" - } - }, - { - "name": "status", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Status" - } - }, - { - "name": "start_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Start Time" - } - }, - { - "name": "end_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "End Time" - } - }, - { - "name": "unlisted", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Unlisted" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineRunResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipeline_builds": { - "get": { - "tags": ["builds"], - "summary": "List Builds", - "description": "Gets a list of builds.\n\nArgs:\n build_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n List of build objects.", - "operationId": "list_builds_api_v1_pipeline_builds_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - 
"required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "is_local", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Is Local" - } - }, - { - "name": "contains_code", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Contains Code" - } - }, - { - "name": "zenml_version", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Zenml Version" - } - }, - { - "name": "python_version", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Python Version" - } - }, - { - "name": "checksum", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Checksum" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineBuildResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - 
"application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipeline_builds/{build_id}": { - "get": { - "tags": ["builds"], - "summary": "Get Build", - "description": "Gets a specific build using its unique id.\n\nArgs:\n build_id: ID of the build to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A specific build object.", - "operationId": "get_build_api_v1_pipeline_builds__build_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "build_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Build Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/PipelineBuildResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["builds"], - "summary": "Delete Build", - "description": "Deletes a specific build.\n\nArgs:\n build_id: ID of the build to delete.", - "operationId": "delete_build_api_v1_pipeline_builds__build_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "build_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Build Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipeline_deployments": { - "get": { - "tags": ["deployments"], - "summary": "List Deployments", - "description": "Gets a list of deployment.\n\nArgs:\n deployment_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n List of deployment objects.", - "operationId": "list_deployments_api_v1_pipeline_deployments_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - 
"required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "build_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Build Id" - } - }, - { - "name": "schedule_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Schedule Id" - } - }, - { - "name": "template_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Template Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineDeploymentResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - 
"application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipeline_deployments/{deployment_id}": { - "get": { - "tags": ["deployments"], - "summary": "Get Deployment", - "description": "Gets a specific deployment using its unique id.\n\nArgs:\n deployment_id: ID of the deployment to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A specific deployment object.", - "operationId": "get_deployment_api_v1_pipeline_deployments__deployment_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "deployment_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Deployment Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/PipelineDeploymentResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["deployments"], - "summary": "Delete Deployment", - "description": "Deletes a specific deployment.\n\nArgs:\n deployment_id: ID of the deployment to delete.", - "operationId": "delete_deployment_api_v1_pipeline_deployments__deployment_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "deployment_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Deployment Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/pipeline_deployments/{deployment_id}/logs": { - "get": { - "tags": ["deployments"], - "summary": "Deployment Logs", - "description": "Get deployment 
logs.\n\nArgs:\n deployment_id: ID of the deployment.\n\nReturns:\n The deployment logs.", - "operationId": "deployment_logs_api_v1_pipeline_deployments__deployment_id__logs_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "deployment_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Deployment Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "string", - "title": "Response Deployment Logs Api V1 Pipeline Deployments Deployment Id Logs Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/runs": { - "get": { - "tags": ["runs"], - "summary": "List Runs", - "description": "Get pipeline runs according to query filters.\n\nArgs:\n runs_filter_model: Filter model used for pagination, sorting, filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The pipeline runs according to query filters.", - "operationId": "list_runs_api_v1_runs_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - 
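The deployment endpoints above (`GET /api/v1/pipeline_deployments`, `GET /api/v1/pipeline_deployments/{deployment_id}` and its `/logs` sub-route) can be called the same way. This sketch assumes the same server URL and auth header; the pipeline UUID is a placeholder.

```python
import requests

BASE = "http://localhost:8080"                      # assumed server address
HEADERS = {"Authorization": "Bearer <API_TOKEN>"}   # assumed auth header

# List deployments for a given pipeline, using the query parameters from the spec above.
resp = requests.get(
    f"{BASE}/api/v1/pipeline_deployments",
    headers=HEADERS,
    params={"pipeline_id": "<PIPELINE_UUID>", "size": 10},
)
resp.raise_for_status()

for deployment in resp.json().get("items", []):     # assumes Page_* responses carry "items"
    # The logs endpoint responds with a JSON-encoded string.
    logs = requests.get(
        f"{BASE}/api/v1/pipeline_deployments/{deployment['id']}/logs",
        headers=HEADERS,
    )
    print(deployment["id"], logs.json()[:200])
```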
"anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "orchestrator_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Orchestrator Run Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "pipeline_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Pipeline Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "schedule_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Schedule Id" - } - }, - { - "name": "build_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Build Id" - } - }, - { - "name": "deployment_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Deployment Id" - } - }, - { - "name": "code_repository_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Code Repository Id" - } - }, - { - "name": "template_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Template Id" - } - }, - { - "name": "status", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Status" - } - }, - { - "name": "start_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Start Time" - } - }, - { - "name": "end_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "End Time" - } - }, - { - "name": "unlisted", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Unlisted" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineRunResponse_" - } - } - } - }, - "401": { 
- "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/runs/{run_id}": { - "get": { - "tags": ["runs"], - "summary": "Get Run", - "description": "Get a specific pipeline run using its ID.\n\nArgs:\n run_id: ID of the pipeline run to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The pipeline run.", - "operationId": "get_run_api_v1_runs__run_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Run Id" } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRunResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["runs"], - "summary": "Update Run", - "description": "Updates a run.\n\nArgs:\n run_id: ID of the run.\n run_model: Run model to use for the update.\n\nReturns:\n The updated run model.", - "operationId": "update_run_api_v1_runs__run_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Run Id" } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRunUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRunResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - 
} - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["runs"], - "summary": "Delete Run", - "description": "Deletes a run.\n\nArgs:\n run_id: ID of the run.", - "operationId": "delete_run_api_v1_runs__run_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Run Id" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/runs/{run_id}/graph": { - "get": { - "tags": ["runs"], - "summary": "Get Run Dag", - "description": "Get the DAG for a given pipeline run.\n\nArgs:\n run_id: ID of the pipeline run to use to get the DAG.\n\nReturns:\n The DAG for a given pipeline run.", - "operationId": "get_run_dag_api_v1_runs__run_id__graph_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Run Id" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/LineageGraph" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/runs/{run_id}/steps": { - "get": { - "tags": ["runs"], - "summary": "Get Run Steps", - "description": "Get all steps for a given pipeline run.\n\nArgs:\n run_id: ID of the pipeline run.\n step_run_filter_model: Filter model used for pagination, sorting,\n filtering\n\nReturns:\n The steps for a given pipeline run.", - "operationId": "get_run_steps_api_v1_runs__run_id__steps_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Run Id" } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": 
"query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "code_hash", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Code Hash" - } - }, - { - "name": "cache_key", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Cache Key" - } - }, - { - "name": "status", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Status" - } - }, - { - "name": "start_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Start Time" - } - }, - { - "name": "end_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "End Time" - } - }, - { - "name": "pipeline_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Run Id" - } - }, - { - "name": "original_step_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Original Step Run Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_StepRunResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": 
"#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/runs/{run_id}/pipeline-configuration": { - "get": { - "tags": ["runs"], - "summary": "Get Pipeline Configuration", - "description": "Get the pipeline configuration of a specific pipeline run using its ID.\n\nArgs:\n run_id: ID of the pipeline run to get.\n\nReturns:\n The pipeline configuration of the pipeline run.", - "operationId": "get_pipeline_configuration_api_v1_runs__run_id__pipeline_configuration_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Run Id" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "object", - "title": "Response Get Pipeline Configuration Api V1 Runs Run Id Pipeline Configuration Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/runs/{run_id}/status": { - "get": { - "tags": ["runs"], - "summary": "Get Run Status", - "description": "Get the status of a specific pipeline run.\n\nArgs:\n run_id: ID of the pipeline run for which to get the status.\n\nReturns:\n The status of the pipeline run.", - "operationId": "get_run_status_api_v1_runs__run_id__status_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Run Id" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ExecutionStatus" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/run-metadata": { - "get": { - "tags": ["run_metadata"], - "summary": "List Run Metadata", - "description": "Get run metadata according to query filters.\n\nArgs:\n run_metadata_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: 
Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The pipeline runs according to query filters.", - "operationId": "list_run_metadata_api_v1_run_metadata_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "resource_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Resource Id" - } - }, - { - "name": "resource_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/MetadataResourceTypes" }, - { "type": "null" } - ], - "title": "Resource Type" - } - }, - { - "name": "stack_component_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Component Id" - } - }, - { - "name": "key", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Key" - } - }, - { - "name": "type", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/MetadataTypeEnum" }, - { "type": "null" } - ], - "title": "Type" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_RunMetadataResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - 
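The per-run sub-routes defined above (`/api/v1/runs/{run_id}/status`, `/steps`, and `/pipeline-configuration`) can be combined to inspect a single run. The run UUID below is a placeholder; everything else mirrors the spec.

```python
import requests

BASE = "http://localhost:8080"                      # assumed server address
HEADERS = {"Authorization": "Bearer <API_TOKEN>"}   # assumed auth header
run_id = "<RUN_UUID>"                               # placeholder

# Current execution status of the run (ExecutionStatus enum in the spec).
status = requests.get(f"{BASE}/api/v1/runs/{run_id}/status", headers=HEADERS)

# Steps of the run, paginated and sortable like the other list endpoints.
steps = requests.get(
    f"{BASE}/api/v1/runs/{run_id}/steps",
    headers=HEADERS,
    params={"sort_by": "created", "size": 50},
)

# Effective pipeline configuration for this run (a plain JSON object).
config = requests.get(f"{BASE}/api/v1/runs/{run_id}/pipeline-configuration", headers=HEADERS)

print(status.json(), len(steps.json().get("items", [])), sorted(config.json())[:5])
```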
"application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/run-metadata/{run_metadata_id}": { - "get": { - "tags": ["run_metadata"], - "summary": "Get Run Metadata", - "description": "Get run metadata by ID.\n\nArgs:\n run_metadata_id: The ID of run metadata.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The run metadata response.", - "operationId": "get_run_metadata_api_v1_run_metadata__run_metadata_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "run_metadata_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Run Metadata Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/RunMetadataResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/run_templates": { - "get": { - "tags": ["run_templates"], - "summary": "List Run Templates", - "description": "Get a page of run templates.\n\nArgs:\n filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n Page of run templates.", - "operationId": "list_run_templates_api_v1_run_templates_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": 
false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "build_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Build Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "code_repository_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Code Repository Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_RunTemplateResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/run_templates/{template_id}": { - "get": { - "tags": ["run_templates"], - "summary": "Get Run Template", - "description": "Get a run template.\n\nArgs:\n template_id: ID of the run template to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The run template.", - "operationId": "get_run_template_api_v1_run_templates__template_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": 
"template_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Template Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/RunTemplateResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["run_templates"], - "summary": "Update Run Template", - "description": "Update a run template.\n\nArgs:\n template_id: ID of the run template to get.\n update: The updates to apply.\n\nReturns:\n The updated run template.", - "operationId": "update_run_template_api_v1_run_templates__template_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "template_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Template Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/RunTemplateUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/RunTemplateResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["run_templates"], - "summary": "Delete Run Template", - "description": "Delete a run template.\n\nArgs:\n template_id: ID of the run template to delete.", - "operationId": "delete_run_template_api_v1_run_templates__template_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "template_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Template Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - 
"content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/run_templates/{template_id}/runs": { - "post": { - "tags": ["run_templates"], - "summary": "Create Template Run", - "description": "Run a pipeline from a template.\n\nArgs:\n template_id: The ID of the template.\n background_tasks: Background tasks.\n config: Configuration for the pipeline run.\n auth_context: Authentication context.\n\nReturns:\n The created pipeline run.", - "operationId": "create_template_run_api_v1_run_templates__template_id__runs_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "template_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Template Id" - } - } - ], - "requestBody": { - "content": { - "application/json": { - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineRunConfiguration" }, - { "type": "null" } - ], - "title": "Config" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRunResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/schedules": { - "get": { - "tags": ["schedules"], - "summary": "List Schedules", - "description": "Gets a list of schedules.\n\nArgs:\n schedule_filter_model: Filter model used for pagination, sorting,\n filtering\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n List of schedule objects.", - "operationId": "list_schedules_api_v1_schedules_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } 
- ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "orchestrator_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Orchestrator Id" - } - }, - { - "name": "active", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Active" - } - }, - { - "name": "cron_expression", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Cron Expression" - } - }, - { - "name": "start_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Start Time" - } - }, - { - "name": "end_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "End Time" - } - }, - { - "name": "interval_second", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "number" }, { "type": "null" }], - "title": "Interval Second" - } - }, - { - "name": "catchup", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Catchup" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "run_once_start_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Run Once Start Time" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ScheduleResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": 
"Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/schedules/{schedule_id}": { - "get": { - "tags": ["schedules"], - "summary": "Get Schedule", - "description": "Gets a specific schedule using its unique id.\n\nArgs:\n schedule_id: ID of the schedule to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A specific schedule object.", - "operationId": "get_schedule_api_v1_schedules__schedule_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "schedule_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Schedule Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ScheduleResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["schedules"], - "summary": "Update Schedule", - "description": "Updates the attribute on a specific schedule using its unique id.\n\nArgs:\n schedule_id: ID of the schedule to get.\n schedule_update: the model containing the attributes to update.\n\nReturns:\n The updated schedule object.", - "operationId": "update_schedule_api_v1_schedules__schedule_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "schedule_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Schedule Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ScheduleUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ScheduleResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - 
"tags": ["schedules"], - "summary": "Delete Schedule", - "description": "Deletes a specific schedule using its unique id.\n\nArgs:\n schedule_id: ID of the schedule to delete.", - "operationId": "delete_schedule_api_v1_schedules__schedule_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "schedule_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Schedule Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/secrets": { - "get": { - "tags": ["secrets"], - "summary": "List Secrets", - "description": "Gets a list of secrets.\n\nArgs:\n secret_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n List of secret objects.", - "operationId": "list_secrets_api_v1_secrets_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } 
- }, - { - "name": "scope", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/SecretScope" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Scope" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_SecretResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/secrets/{secret_id}": { - "get": { - "tags": ["secrets"], - "summary": "Get Secret", - "description": "Gets a specific secret using its unique id.\n\nArgs:\n secret_id: ID of the secret to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A specific secret object.", - "operationId": "get_secret_api_v1_secrets__secret_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "secret_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Secret Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/SecretResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["secrets"], - "summary": "Update Secret", - "description": "Updates the attribute on a specific secret using its unique id.\n\nArgs:\n secret_id: ID of the secret to get.\n secret_update: the model containing the attributes to update.\n patch_values: Whether to patch the secret values or replace them.\n\nReturns:\n The updated secret object.", - "operationId": "update_secret_api_v1_secrets__secret_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], 
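The secret endpoints above (`GET /api/v1/secrets` and `GET /api/v1/secrets/{secret_id}`) round out the read path. The secret name used as a filter is a placeholder; the single-object GET hydrates by default per the spec.

```python
import requests

BASE = "http://localhost:8080"                      # assumed server address
HEADERS = {"Authorization": "Bearer <API_TOKEN>"}   # assumed auth header

# Look up secrets by name using the list filters defined above.
secrets = requests.get(
    f"{BASE}/api/v1/secrets",
    headers=HEADERS,
    params={"name": "<SECRET_NAME>"},               # placeholder filter value
)
secrets.raise_for_status()

for secret in secrets.json().get("items", []):      # Page_SecretResponse_
    # Fetch the full secret object; hydrate defaults to true on this route.
    detail = requests.get(f"{BASE}/api/v1/secrets/{secret['id']}", headers=HEADERS)
    print(secret["id"], detail.status_code)
```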
- "parameters": [ - { - "name": "secret_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Secret Id" - } - }, - { - "name": "patch_values", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "default": false, - "title": "Patch Values" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/SecretUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/SecretResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["secrets"], - "summary": "Delete Secret", - "description": "Deletes a specific secret using its unique id.\n\nArgs:\n secret_id: ID of the secret to delete.", - "operationId": "delete_secret_api_v1_secrets__secret_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "secret_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Secret Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/secrets_operations/backup": { - "put": { - "tags": ["secrets"], - "summary": "Backup Secrets", - "description": "Backs up all secrets in the secrets store to the backup secrets store.\n\nArgs:\n ignore_errors: Whether to ignore individual errors when backing up\n secrets and continue with the backup operation until all secrets\n have been backed up.\n delete_secrets: Whether to delete the secrets that have been\n successfully backed up from the primary secrets store. 
Setting\n this flag effectively moves all secrets from the primary secrets\n store to the backup secrets store.", - "operationId": "backup_secrets_api_v1_secrets_operations_backup_put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "ignore_errors", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": true, - "title": "Ignore Errors" - } - }, - { - "name": "delete_secrets", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Delete Secrets" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/secrets_operations/restore": { - "put": { - "tags": ["secrets"], - "summary": "Restore Secrets", - "description": "Restores all secrets from the backup secrets store into the main secrets store.\n\nArgs:\n ignore_errors: Whether to ignore individual errors when restoring\n secrets and continue with the restore operation until all secrets\n have been restored.\n delete_secrets: Whether to delete the secrets that have been\n successfully restored from the backup secrets store. 
Setting\n this flag effectively moves all secrets from the backup secrets\n store to the primary secrets store.", - "operationId": "restore_secrets_api_v1_secrets_operations_restore_put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "ignore_errors", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Ignore Errors" - } - }, - { - "name": "delete_secrets", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Delete Secrets" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/version": { - "get": { - "tags": ["server"], - "summary": "Version", - "description": "Get version of the server.\n\nReturns:\n String representing the version of the server.", - "operationId": "version_api_v1_version_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "string", - "title": "Response Version Api V1 Version Get" - } - } - } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - } - } - } - }, - "/api/v1/info": { - "get": { - "tags": ["server"], - "summary": "Server Info", - "description": "Get information about the server.\n\nReturns:\n Information about the server.", - "operationId": "server_info_api_v1_info_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServerModel" } - } - } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "404": { - "description": "Not Found", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "422": { - "description": "Unprocessable Entity", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - } - } - } - }, - "/api/v1/onboarding_state": { - "get": { - "tags": ["server"], - "summary": "Get Onboarding State", - "description": "Get the onboarding state of the server.\n\nReturns:\n The onboarding state of the server.", - "operationId": "get_onboarding_state_api_v1_onboarding_state_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "items": { "type": "string" }, - "type": "array", - "title": "Response Get Onboarding State Api V1 Onboarding State Get" - } - } - } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "404": { - 
"description": "Not Found", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "422": { - "description": "Unprocessable Entity", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - } - }, - "security": [{ "CookieOAuth2TokenBearer": [] }] - } - }, - "/api/v1/service_accounts": { - "post": { - "tags": ["service_accounts", "api_keys"], - "summary": "Create Service Account", - "description": "Creates a service account.\n\nArgs:\n service_account: Service account to create.\n\nReturns:\n The created service account.", - "operationId": "create_service_account_api_v1_service_accounts_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceAccountRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceAccountResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "get": { - "tags": ["service_accounts", "api_keys"], - "summary": "List Service Accounts", - "description": "Returns a list of service accounts.\n\nArgs:\n filter_model: Model that takes care of filtering, sorting and\n pagination.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A list of service accounts matching the filter.", - "operationId": "list_service_accounts_api_v1_service_accounts_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, 
- { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "description", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Description" - } - }, - { - "name": "active", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "boolean" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Active" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ServiceAccountResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_accounts/{service_account_name_or_id}": { - "get": { - "tags": ["service_accounts", "api_keys"], - "summary": "Get Service Account", - "description": "Returns a specific service account.\n\nArgs:\n service_account_name_or_id: Name or ID of the service account.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The service account matching the given name or ID.", - "operationId": "get_service_account_api_v1_service_accounts__service_account_name_or_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Service Account Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceAccountResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["service_accounts", "api_keys"], - "summary": "Update Service Account", - "description": "Updates a specific service account.\n\nArgs:\n service_account_name_or_id: Name or ID of the service account.\n service_account_update: the service account to use for the update.\n\nReturns:\n The updated service account.", - "operationId": "update_service_account_api_v1_service_accounts__service_account_name_or_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" 
}, - { "type": "string", "format": "uuid" } - ], - "title": "Service Account Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceAccountUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceAccountResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["service_accounts", "api_keys"], - "summary": "Delete Service Account", - "description": "Delete a specific service account.\n\nArgs:\n service_account_name_or_id: Name or ID of the service account.", - "operationId": "delete_service_account_api_v1_service_accounts__service_account_name_or_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Service Account Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_accounts/{service_account_id}/api_keys": { - "post": { - "tags": ["service_accounts", "api_keys"], - "summary": "Create Api Key", - "description": "Creates an API key for a service account.\n\nArgs:\n service_account_id: ID of the service account for which to create the\n API key.\n api_key: API key to create.\n\nReturns:\n The created API key.", - "operationId": "create_api_key_api_v1_service_accounts__service_account_id__api_keys_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Account Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/APIKeyRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/APIKeyResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { 
- "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "get": { - "tags": ["service_accounts", "api_keys"], - "summary": "List Api Keys", - "description": "List API keys associated with a service account.\n\nArgs:\n service_account_id: ID of the service account to which the API keys\n belong.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n filter_model: Filter model used for pagination, sorting,\n filtering\n\nReturns:\n All API keys matching the filter and associated with the supplied\n service account.", - "operationId": "list_api_keys_api_v1_service_accounts__service_account_id__api_keys_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Account Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "service_account", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Service Account" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "description", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Description" - } - }, - { - "name": "active", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "boolean" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Active" - } - }, - { - "name": "last_login", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Last Login" - } - }, - { - "name": "last_rotated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": 
"string" }, - { "type": "null" } - ], - "title": "Last Rotated" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_APIKeyResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_accounts/{service_account_id}/api_keys/{api_key_name_or_id}": { - "get": { - "tags": ["service_accounts", "api_keys"], - "summary": "Get Api Key", - "description": "Returns the requested API key.\n\nArgs:\n service_account_id: ID of the service account to which the API key\n belongs.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n api_key_name_or_id: Name or ID of the API key to return.\n\nReturns:\n The requested API key.", - "operationId": "get_api_key_api_v1_service_accounts__service_account_id__api_keys__api_key_name_or_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Account Id" - } - }, - { - "name": "api_key_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Api Key Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/APIKeyResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["service_accounts", "api_keys"], - "summary": "Update Api Key", - "description": "Updates an API key for a service account.\n\nArgs:\n service_account_id: ID of the service account to which the API key\n belongs.\n api_key_name_or_id: Name or ID of the API key to update.\n api_key_update: API key update.\n\nReturns:\n The updated API key.", - "operationId": "update_api_key_api_v1_service_accounts__service_account_id__api_keys__api_key_name_or_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Account Id" - } - }, - { - "name": "api_key_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Api Key Name Or Id" - } - } - ], - 
"requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/APIKeyUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/APIKeyResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["service_accounts", "api_keys"], - "summary": "Delete Api Key", - "description": "Deletes an API key.\n\nArgs:\n service_account_id: ID of the service account to which the API key\n belongs.\n api_key_name_or_id: Name or ID of the API key to delete.", - "operationId": "delete_api_key_api_v1_service_accounts__service_account_id__api_keys__api_key_name_or_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Account Id" - } - }, - { - "name": "api_key_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Api Key Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_accounts/{service_account_id}/api_keys/{api_key_name_or_id}/rotate": { - "put": { - "tags": ["service_accounts", "api_keys"], - "summary": "Rotate Api Key", - "description": "Rotate an API key.\n\nArgs:\n service_account_id: ID of the service account to which the API key\n belongs.\n api_key_name_or_id: Name or ID of the API key to rotate.\n rotate_request: API key rotation request.\n\nReturns:\n The updated API key.", - "operationId": "rotate_api_key_api_v1_service_accounts__service_account_id__api_keys__api_key_name_or_id__rotate_put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_account_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Account Id" - } - }, - { - "name": "api_key_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Api Key Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/APIKeyRotateRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": 
"#/components/schemas/APIKeyResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connectors": { - "get": { - "tags": ["service_connectors"], - "summary": "List Service Connectors", - "description": "Get a list of all service connectors for a specific type.\n\nArgs:\n connector_filter_model: Filter model used for pagination, sorting,\n filtering\n expand_secrets: Whether to expand secrets or not.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n Page with list of service connectors for a specific type.", - "operationId": "list_service_connectors_api_v1_service_connectors_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "expand_secrets", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": true, - "title": "Expand Secrets" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "scope_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Scope Type" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "connector_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Connector Type" - } - }, - { - "name": 
"workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "auth_method", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Auth Method" - } - }, - { - "name": "resource_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Type" - } - }, - { - "name": "resource_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Id" - } - }, - { - "name": "labels_str", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Labels Str" - } - }, - { - "name": "secret_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Secret Id" - } - } - ], - "requestBody": { - "content": { - "application/json": { - "schema": { - "anyOf": [ - { - "type": "object", - "additionalProperties": { - "anyOf": [{ "type": "string" }, { "type": "null" }] - } - }, - { "type": "null" } - ], - "title": "Labels" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ServiceConnectorResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connectors/{connector_id}": { - "get": { - "tags": ["service_connectors"], - "summary": "Get Service Connector", - "description": "Returns the requested service connector.\n\nArgs:\n connector_id: ID of the service connector.\n expand_secrets: Whether to expand secrets or not.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The requested service connector.", - "operationId": "get_service_connector_api_v1_service_connectors__connector_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Connector Id" - } - }, - { - "name": "expand_secrets", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": true, - "title": "Expand Secrets" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful 
Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["service_connectors"], - "summary": "Update Service Connector", - "description": "Updates a service connector.\n\nArgs:\n connector_id: ID of the service connector.\n connector_update: Service connector to use to update.\n\nReturns:\n Updated service connector.", - "operationId": "update_service_connector_api_v1_service_connectors__connector_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Connector Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorUpdate" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["service_connectors"], - "summary": "Delete Service Connector", - "description": "Deletes a service connector.\n\nArgs:\n connector_id: ID of the service connector.", - "operationId": "delete_service_connector_api_v1_service_connectors__connector_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Connector Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": 
"#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connectors/verify": { - "post": { - "tags": ["service_connectors"], - "summary": "Validate And Verify Service Connector Config", - "description": "Verifies if a service connector configuration has access to resources.\n\nThis requires the service connector implementation to be installed\non the ZenML server, otherwise a 501 Not Implemented error will be\nreturned.\n\nArgs:\n connector: The service connector configuration to verify.\n list_resources: If True, the list of all resources accessible\n through the service connector is returned.\n\nReturns:\n The list of resources that the service connector configuration has\n access to.", - "operationId": "validate_and_verify_service_connector_config_api_v1_service_connectors_verify_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "list_resources", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": true, - "title": "List Resources" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorRequest" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorResourcesModel" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connectors/{connector_id}/verify": { - "put": { - "tags": ["service_connectors"], - "summary": "Validate And Verify Service Connector", - "description": "Verifies if a service connector instance has access to one or more resources.\n\nThis requires the service connector implementation to be installed\non the ZenML server, otherwise a 501 Not Implemented error will be\nreturned.\n\nArgs:\n connector_id: The ID of the service connector to verify.\n resource_type: The type of resource to verify access to.\n resource_id: The ID of the resource to verify access to.\n list_resources: If True, the list of all resources accessible\n through the service connector and matching the supplied resource\n type and ID are returned.\n\nReturns:\n The list of resources that the service connector has access to, scoped\n to the supplied resource type and ID, if provided.", - "operationId": "validate_and_verify_service_connector_api_v1_service_connectors__connector_id__verify_put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Connector Id" - } - }, - { - "name": "resource_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Type" - } - }, - { - "name": "resource_id", - "in": "query", - "required": 
false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Id" - } - }, - { - "name": "list_resources", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": true, - "title": "List Resources" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorResourcesModel" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connectors/{connector_id}/client": { - "get": { - "tags": ["service_connectors"], - "summary": "Get Service Connector Client", - "description": "Get a service connector client for a service connector and given resource.\n\nThis requires the service connector implementation to be installed\non the ZenML server, otherwise a 501 Not Implemented error will be\nreturned.\n\nArgs:\n connector_id: ID of the service connector.\n resource_type: Type of the resource to list.\n resource_id: ID of the resource to list.\n\nReturns:\n A service connector client that can be used to access the given\n resource.", - "operationId": "get_service_connector_client_api_v1_service_connectors__connector_id__client_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Connector Id" - } - }, - { - "name": "resource_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Type" - } - }, - { - "name": "resource_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connectors/full_stack_resources": { - "post": { - "tags": ["service_connectors"], - "summary": "Get Resources Based On Service Connector Info", - "description": "Gets the list of resources that a service connector can access.\n\nArgs:\n connector_info: The service connector info.\n connector_uuid: The 
service connector uuid.\n\nReturns:\n The list of resources that the service connector configuration has\n access to and consumable from UI/CLI.\n\nRaises:\n ValueError: If both connector_info and connector_uuid are provided.\n ValueError: If neither connector_info nor connector_uuid are provided.", - "operationId": "get_resources_based_on_service_connector_info_api_v1_service_connectors_full_stack_resources_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_uuid", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Connector Uuid" - } - } - ], - "requestBody": { - "content": { - "application/json": { - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceConnectorInfo" }, - { "type": "null" } - ], - "title": "Connector Info" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorResourcesInfo" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connector_types": { - "get": { - "tags": ["service_connectors"], - "summary": "List Service Connector Types", - "description": "Get a list of service connector types.\n\nArgs:\n connector_type: Filter by connector type.\n resource_type: Filter by resource type.\n auth_method: Filter by auth method.\n\nReturns:\n List of service connector types.", - "operationId": "list_service_connector_types_api_v1_service_connector_types_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Connector Type" - } - }, - { - "name": "resource_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Type" - } - }, - { - "name": "auth_method", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Auth Method" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "array", - "items": { - "$ref": "#/components/schemas/ServiceConnectorTypeModel" - }, - "title": "Response List Service Connector Types Api V1 Service Connector Types Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" 
- }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/service_connector_types/{connector_type}": { - "get": { - "tags": ["service_connectors"], - "summary": "Get Service Connector Type", - "description": "Returns the requested service connector type.\n\nArgs:\n connector_type: the service connector type identifier.\n\nReturns:\n The requested service connector type.", - "operationId": "get_service_connector_type_api_v1_service_connector_types__connector_type__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "connector_type", - "in": "path", - "required": true, - "schema": { "type": "string", "title": "Connector Type" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorTypeModel" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/services": { - "post": { - "tags": ["services"], - "summary": "Create Service", - "description": "Creates a new service.\n\nArgs:\n service: The model containing the attributes of the new service.\n\nReturns:\n The created service object.", - "operationId": "create_service_api_v1_services_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "get": { - "tags": ["services"], - "summary": "List Services", - "description": "Gets a page of service objects.\n\nArgs:\n filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n Page of service objects.", - "operationId": "list_services_api_v1_services_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - 
"title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Type" - } - }, - { - "name": "flavor", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Flavor" - } - }, - { - "name": "config", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "binary" }, - { "type": "null" } - ], - "title": "Config" - } - }, - { - "name": "pipeline_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Pipeline Name" - } - }, - { - "name": "pipeline_step_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Pipeline Step Name" - } - }, - { - "name": "running", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Running" - } - }, - { - "name": "model_version_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Model Version Id" - } - }, - { - "name": "pipeline_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Run Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - 
"content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ServiceResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/services/{service_id}": { - "get": { - "tags": ["services"], - "summary": "Get Service", - "description": "Gets a specific service using its unique ID.\n\nArgs:\n service_id: The ID of the service to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n A specific service object.", - "operationId": "get_service_api_v1_services__service_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["services"], - "summary": "Update Service", - "description": "Updates a service.\n\nArgs:\n service_id: The ID of the service to update.\n update: The model containing the attributes to update.\n\nReturns:\n The updated service object.", - "operationId": "update_service_api_v1_services__service_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - 
"404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["services"], - "summary": "Delete Service", - "description": "Deletes a specific service.\n\nArgs:\n service_id: The ID of the service to delete.", - "operationId": "delete_service_api_v1_services__service_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "service_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Service Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/stack-deployment/info": { - "get": { - "tags": ["stacks"], - "summary": "Get Stack Deployment Info", - "description": "Get information about a stack deployment provider.\n\nArgs:\n provider: The stack deployment provider.\n\nReturns:\n Information about the stack deployment provider.", - "operationId": "get_stack_deployment_info_api_v1_stack_deployment_info_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "provider", - "in": "query", - "required": true, - "schema": { "$ref": "#/components/schemas/StackDeploymentProvider" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StackDeploymentInfo" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/HTTPValidationError" } - } - } - } - } - } - }, - "/api/v1/stack-deployment/config": { - "get": { - "tags": ["stacks"], - "summary": "Get Stack Deployment Config", - "description": "Return the URL to deploy the ZenML stack to the specified cloud provider.\n\nArgs:\n request: The FastAPI request object.\n provider: The stack deployment provider.\n stack_name: The name of the stack.\n location: The location where the stack should be deployed.\n terraform: Whether the stack should be deployed using Terraform.\n auth_context: The authentication context.\n\nReturns:\n The cloud provider console URL where the stack will be deployed and\n the configuration for the stack deployment.", - "operationId": "get_stack_deployment_config_api_v1_stack_deployment_config_get", - "security": [{ 
"CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "provider", - "in": "query", - "required": true, - "schema": { "$ref": "#/components/schemas/StackDeploymentProvider" } - }, - { - "name": "stack_name", - "in": "query", - "required": true, - "schema": { "type": "string", "title": "Stack Name" } - }, - { - "name": "location", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Location" - } - }, - { - "name": "terraform", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Terraform" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/StackDeploymentConfig" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/HTTPValidationError" } - } - } - } - } - } - }, - "/api/v1/stack-deployment/stack": { - "get": { - "tags": ["stacks"], - "summary": "Get Deployed Stack", - "description": "Return a matching ZenML stack that was deployed and registered.\n\nArgs:\n provider: The stack deployment provider.\n stack_name: The name of the stack.\n location: The location where the stack should be deployed.\n date_start: The date when the deployment started.\n terraform: Whether the stack was deployed using Terraform.\n\nReturns:\n The ZenML stack that was deployed and registered or None if the stack\n was not found.", - "operationId": "get_deployed_stack_api_v1_stack_deployment_stack_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "provider", - "in": "query", - "required": true, - "schema": { "$ref": "#/components/schemas/StackDeploymentProvider" } - }, - { - "name": "stack_name", - "in": "query", - "required": true, - "schema": { "type": "string", "title": "Stack Name" } - }, - { - "name": "location", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Location" - } - }, - { - "name": "date_start", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Date Start" - } - }, - { - "name": "terraform", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Terraform" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/DeployedStack" }, - { "type": "null" } - ], - "title": "Response Get Deployed Stack Api V1 Stack Deployment Stack Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "422": { - "description": "Validation Error", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/HTTPValidationError" 
} - } - } - } - } - } - }, - "/api/v1/stacks": { - "get": { - "tags": ["stacks"], - "summary": "List Stacks", - "description": "Returns all stacks.\n\nArgs:\n stack_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All stacks.", - "operationId": "list_stacks_api_v1_stacks_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "description", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Description" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "component_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Component Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/Page_StackResponse_" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - 
"schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/stacks/{stack_id}": { - "get": { - "tags": ["stacks"], - "summary": "Get Stack", - "description": "Returns the requested stack.\n\nArgs:\n stack_id: ID of the stack.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The requested stack.", - "operationId": "get_stack_api_v1_stacks__stack_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "stack_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Stack Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StackResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["stacks"], - "summary": "Update Stack", - "description": "Updates a stack.\n\nArgs:\n stack_id: Name of the stack.\n stack_update: Stack to use for the update.\n\nReturns:\n The updated stack.", - "operationId": "update_stack_api_v1_stacks__stack_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "stack_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Stack Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StackUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StackResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["stacks"], - "summary": "Delete Stack", - "description": "Deletes a stack.\n\nArgs:\n stack_id: 
Name of the stack.", - "operationId": "delete_stack_api_v1_stacks__stack_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "stack_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Stack Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/components": { - "get": { - "tags": ["stack_components"], - "summary": "List Stack Components", - "description": "Get a list of all stack components for a specific type.\n\nArgs:\n component_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n List of stack components for a specific type.", - "operationId": "list_stack_components_api_v1_components_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "scope_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Scope Type" - } - }, - { - "name": "name", - "in": "query", - "required": false, - 
"schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "flavor", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Flavor" - } - }, - { - "name": "type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Type" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "connector_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Connector Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ComponentResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/components/{component_id}": { - "get": { - "tags": ["stack_components"], - "summary": "Get Stack Component", - "description": "Returns the requested stack component.\n\nArgs:\n component_id: ID of the stack component.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The requested stack component.", - "operationId": "get_stack_component_api_v1_components__component_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "component_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Component Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ComponentResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - 
"description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["stack_components"], - "summary": "Update Stack Component", - "description": "Updates a stack component.\n\nArgs:\n component_id: ID of the stack component.\n component_update: Stack component to use to update.\n\nReturns:\n Updated stack component.", - "operationId": "update_stack_component_api_v1_components__component_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "component_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Component Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ComponentUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ComponentResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["stack_components"], - "summary": "Deregister Stack Component", - "description": "Deletes a stack component.\n\nArgs:\n component_id: ID of the stack component.", - "operationId": "deregister_stack_component_api_v1_components__component_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "component_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Component Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/component-types": { - "get": { - "tags": ["stack_components"], - "summary": "Get Stack Component Types", - "description": "Get a list of all stack component types.\n\nReturns:\n List of stack components.", - "operationId": "get_stack_component_types_api_v1_component_types_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "items": { "type": "string" }, - "type": "array", - "title": "Response Get Stack Component Types Api V1 Component Types Get" - } - } 
- } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "403": { - "description": "Forbidden", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "404": { - "description": "Not Found", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "422": { - "description": "Unprocessable Entity", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - } - }, - "security": [{ "CookieOAuth2TokenBearer": [] }] - } - }, - "/api/v1/steps": { - "get": { - "tags": ["steps"], - "summary": "List Run Steps", - "description": "Get run steps according to query filters.\n\nArgs:\n step_run_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: Authentication context.\n\nReturns:\n The run steps according to query filters.", - "operationId": "list_run_steps_api_v1_steps_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "code_hash", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Code Hash" - } - }, - { - "name": "cache_key", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Cache Key" - } - }, - { - "name": "status", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - 
"title": "Status" - } - }, - { - "name": "start_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Start Time" - } - }, - { - "name": "end_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "End Time" - } - }, - { - "name": "pipeline_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Run Id" - } - }, - { - "name": "original_step_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Original Step Run Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_StepRunResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["steps"], - "summary": "Create Run Step", - "description": "Create a run step.\n\nArgs:\n step: The run step to create.\n\nReturns:\n The created run step.", - "operationId": "create_run_step_api_v1_steps_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StepRunRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StepRunResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/steps/{step_id}": { - "get": { - "tags": ["steps"], - "summary": "Get Step", - "description": "Get 
one specific step.\n\nArgs:\n step_id: ID of the step to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The step.", - "operationId": "get_step_api_v1_steps__step_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "step_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Step Id" } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StepRunResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["steps"], - "summary": "Update Step", - "description": "Updates a step.\n\nArgs:\n step_id: ID of the step.\n step_model: Step model to use for the update.\n\nReturns:\n The updated step model.", - "operationId": "update_step_api_v1_steps__step_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "step_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Step Id" } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StepRunUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StepRunResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/steps/{step_id}/step-configuration": { - "get": { - "tags": ["steps"], - "summary": "Get Step Configuration", - "description": "Get the configuration of a specific step.\n\nArgs:\n step_id: ID of the step to get.\n\nReturns:\n The step configuration.", - "operationId": "get_step_configuration_api_v1_steps__step_id__step_configuration_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "step_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Step Id" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - 
"type": "object", - "title": "Response Get Step Configuration Api V1 Steps Step Id Step Configuration Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/steps/{step_id}/status": { - "get": { - "tags": ["steps"], - "summary": "Get Step Status", - "description": "Get the status of a specific step.\n\nArgs:\n step_id: ID of the step for which to get the status.\n\nReturns:\n The status of the step.", - "operationId": "get_step_status_api_v1_steps__step_id__status_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "step_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Step Id" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ExecutionStatus" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/steps/{step_id}/logs": { - "get": { - "tags": ["steps"], - "summary": "Get Step Logs", - "description": "Get the logs of a specific step.\n\nArgs:\n step_id: ID of the step for which to get the logs.\n offset: The offset from which to start reading.\n length: The amount of bytes that should be read.\n\nReturns:\n The logs of the step.\n\nRaises:\n HTTPException: If no logs are available for this step.", - "operationId": "get_step_logs_api_v1_steps__step_id__logs_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "step_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Step Id" } - }, - { - "name": "offset", - "in": "query", - "required": false, - "schema": { "type": "integer", "default": 0, "title": "Offset" } - }, - { - "name": "length", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "default": 16777216, - "title": "Length" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "string", - "title": "Response Get Step Logs Api V1 Steps Step Id Logs Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": 
"#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/tags": { - "post": { - "tags": ["tags"], - "summary": "Create Tag", - "description": "Create a new tag.\n\nArgs:\n tag: The tag to create.\n\nReturns:\n The created tag.", - "operationId": "create_tag_api_v1_tags_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TagRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TagResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "get": { - "tags": ["tags"], - "summary": "List Tags", - "description": "Get tags according to query filters.\n\nArgs:\n tag_filter_model: Filter model used for pagination, sorting,\n filtering\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The tags according to query filters.", - "operationId": "list_tags_api_v1_tags_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } 
- ], - "title": "Updated" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "color", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "$ref": "#/components/schemas/ColorVariants" }, - { "type": "null" } - ], - "title": "Color" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/Page_TagResponse_" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/tags/{tag_name_or_id}": { - "get": { - "tags": ["tags"], - "summary": "Get Tag", - "description": "Get a tag by name or ID.\n\nArgs:\n tag_name_or_id: The name or ID of the tag to get.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The tag with the given name or ID.", - "operationId": "get_tag_api_v1_tags__tag_name_or_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "tag_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Tag Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TagResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["tags"], - "summary": "Delete Tag", - "description": "Delete a tag by name or ID.\n\nArgs:\n tag_name_or_id: The name or ID of the tag to delete.", - "operationId": "delete_tag_api_v1_tags__tag_name_or_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "tag_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Tag Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { 
"$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/tags/{tag_id}": { - "put": { - "tags": ["tags"], - "summary": "Update Tag", - "description": "Updates a tag.\n\nArgs:\n tag_id: Id or name of the tag.\n tag_update_model: Tag to use for the update.\n\nReturns:\n The updated tag.", - "operationId": "update_tag_api_v1_tags__tag_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "tag_id", - "in": "path", - "required": true, - "schema": { "type": "string", "format": "uuid", "title": "Tag Id" } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TagUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TagResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/triggers": { - "get": { - "tags": ["triggers"], - "summary": "List Triggers", - "description": "Returns all triggers.\n\nArgs:\n trigger_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All triggers.", - "operationId": "list_triggers_api_v1_triggers_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - 
{ - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "event_source_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Event Source Id" - } - }, - { - "name": "action_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Action Id" - } - }, - { - "name": "is_active", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Is Active" - } - }, - { - "name": "action_flavor", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Action Flavor" - } - }, - { - "name": "action_subtype", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Action Subtype" - } - }, - { - "name": "event_source_flavor", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Event Source Flavor" - } - }, - { - "name": "event_source_subtype", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Event Source Subtype" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_TriggerResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["triggers"], - "summary": "Create Trigger", - "description": "Creates a trigger.\n\nArgs:\n trigger: Trigger to register.\n\nReturns:\n The created trigger.\n\nRaises:\n ValueError: If the action flavor/subtype combination is not actually a webhook event source", - "operationId": "create_trigger_api_v1_triggers_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TriggerRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful 
Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TriggerResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/triggers/{trigger_id}": { - "get": { - "tags": ["triggers"], - "summary": "Get Trigger", - "description": "Returns the requested trigger.\n\nArgs:\n trigger_id: ID of the trigger.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The requested trigger.", - "operationId": "get_trigger_api_v1_triggers__trigger_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "trigger_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Trigger Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TriggerResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["triggers"], - "summary": "Update Trigger", - "description": "Updates a trigger.\n\nArgs:\n trigger_id: Name of the trigger.\n trigger_update: Trigger to use for the update.\n\nReturns:\n The updated trigger.\n\nRaises:\n ValueError: If the action flavor/subtype combination is not actually a webhook event source", - "operationId": "update_trigger_api_v1_triggers__trigger_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "trigger_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Trigger Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TriggerUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/TriggerResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": 
"#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["triggers"], - "summary": "Delete Trigger", - "description": "Deletes a trigger.\n\nArgs:\n trigger_id: Name of the trigger.", - "operationId": "delete_trigger_api_v1_triggers__trigger_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "trigger_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Trigger Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "403": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Forbidden" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/users": { - "get": { - "tags": ["users"], - "summary": "List Users", - "description": "Returns a list of all users.\n\nArgs:\n user_filter_model: Model that takes care of filtering, sorting and\n pagination.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: Authentication context.\n\nReturns:\n A list of all users.", - "operationId": "list_users_api_v1_users_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { 
"type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "full_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Full Name" - } - }, - { - "name": "email", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Email" - } - }, - { - "name": "active", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "boolean" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Active" - } - }, - { - "name": "email_opted_in", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "boolean" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Email Opted In" - } - }, - { - "name": "external_user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "External User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/Page_UserResponse_" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/users/{user_name_or_id}": { - "get": { - "tags": ["users"], - "summary": "Get User", - "description": "Returns a specific user.\n\nArgs:\n user_name_or_id: Name or ID of the user.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n auth_context: Authentication context.\n\nReturns:\n A specific user.", - "operationId": "get_user_api_v1_users__user_name_or_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "user_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "User Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/UserResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/users/{user_name_or_id}/resource_membership": { - "post": { - "tags": ["users"], - "summary": "Update User Resource Membership", - "description": "Updates resource memberships of a 
user.\n\nArgs:\n user_name_or_id: Name or ID of the user.\n resource_type: Type of the resource for which to update the\n membership.\n resource_id: ID of the resource for which to update the membership.\n actions: List of actions that the user should be able to perform on\n the resource. If the user currently has permissions to perform\n actions which are not passed in this list, the permissions will\n be removed.\n auth_context: Authentication context.\n\nRaises:\n ValueError: If a user tries to update their own membership.\n KeyError: If no resource with the given type and ID exists.", - "operationId": "update_user_resource_membership_api_v1_users__user_name_or_id__resource_membership_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "user_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "User Name Or Id" - } - }, - { - "name": "resource_type", - "in": "query", - "required": true, - "schema": { "type": "string", "title": "Resource Type" } - }, - { - "name": "resource_id", - "in": "query", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Resource Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "type": "array", - "items": { "type": "string" }, - "title": "Actions" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/current-user": { - "get": { - "tags": ["users"], - "summary": "Get Current User", - "description": "Returns the model of the authenticated user.\n\nArgs:\n auth_context: The authentication context.\n\nReturns:\n The model of the authenticated user.", - "operationId": "get_current_user_api_v1_current_user_get", - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/UserResponse" } - } - } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "404": { - "description": "Not Found", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - }, - "422": { - "description": "Unprocessable Entity", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - } - } - }, - "security": [{ "CookieOAuth2TokenBearer": [] }] - } - }, - "/api/v1/webhooks/{event_source_id}": { - "post": { - "tags": ["webhook"], - "summary": "Webhook", - "description": "Webhook to receive events from external event sources.\n\nArgs:\n event_source_id: The event_source_id\n request: The request object\n background_tasks: Background task handler\n raw_body: The raw request body\n\nReturns:\n Static dict stating that event is received.\n\nRaises:\n AuthorizationException: If the Event Source does not exist.\n 
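A sketch of the two user operations that are fully specified in this stretch: fetching the authenticated user and replacing a user's permitted actions on a resource. The paths, the required query parameters, and the array-of-strings body come from the removed spec; the server URL, token, user name, resource values, and action names are placeholders.

```python
import requests

ZENML_SERVER = "http://localhost:8080"             # placeholder server URL
HEADERS = {"Authorization": "Bearer <api-token>"}  # assumed bearer-token auth

# GET /api/v1/current-user: the UserResponse of the authenticated user.
me = requests.get(f"{ZENML_SERVER}/api/v1/current-user", headers=HEADERS)
me.raise_for_status()
print(me.json())

# POST /api/v1/users/{user_name_or_id}/resource_membership:
# replace the actions another user may perform on a single resource.
resp = requests.post(
    f"{ZENML_SERVER}/api/v1/users/some-user/resource_membership",
    headers=HEADERS,
    params={
        "resource_type": "stack",                               # required query parameter
        "resource_id": "11111111-2222-3333-4444-555555555555",  # required UUID query parameter
    },
    json=["read", "update"],  # body: JSON array of action strings; omitted actions are revoked
)
resp.raise_for_status()
```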
/api/v1/webhooks/{event_source_id}
  POST Webhook: receives events from external event sources and hands the raw body to the plugin behind the event source; no auth scheme on the route, but the handler raises AuthorizationException if the event source does not exist, KeyError if no matching plugin is registered, ValueError if the event source is not a webhook event source, and WebhookInactiveError if the webhook has been deactivated; path event_source_id (uuid); 200 returns a dict of strings acknowledging receipt; 422 HTTPValidationError
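The webhook receiver is the one unauthenticated route in this stretch. A sketch of delivering an external event to it; the payload shape is purely illustrative, since the spec only states that the raw body is forwarded to the plugin behind the event source.

```python
import json
import requests

ZENML_SERVER = "http://localhost:8080"                    # placeholder server URL
EVENT_SOURCE_ID = "11111111-2222-3333-4444-555555555555"  # UUID of a webhook event source

# POST /api/v1/webhooks/{event_source_id}: the raw body is passed through
# to the plugin backing the event source, so its shape depends on the sender.
payload = {"ref": "refs/heads/main", "after": "abc123"}   # illustrative payload
resp = requests.post(
    f"{ZENML_SERVER}/api/v1/webhooks/{EVENT_SOURCE_ID}",
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)
if resp.ok:
    print(resp.json())                  # small {str: str} acknowledgement
else:
    print(resp.status_code, resp.text)  # 422 carries an HTTPValidationError body
```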
"description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Workspace", - "description": "Creates a workspace based on the requestBody.\n\n# noqa: DAR401\n\nArgs:\n workspace: Workspace to create.\n\nReturns:\n The created workspace.", - "operationId": "create_workspace_api_v1_workspaces_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/WorkspaceRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/WorkspaceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}": { - "get": { - "tags": ["workspaces"], - "summary": "Get Workspace", - "description": "Get a workspace for given name.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The requested workspace.", - "operationId": "get_workspace_api_v1_workspaces__workspace_name_or_id__get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { "type": "boolean", "default": true, "title": "Hydrate" } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/WorkspaceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "put": { - "tags": ["workspaces"], - "summary": "Update Workspace", - "description": "Get a workspace for given name.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace to update.\n workspace_update: the workspace to use to update\n\nReturns:\n The updated workspace.", - "operationId": "update_workspace_api_v1_workspaces__workspace_name_or_id__put", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Workspace Name Or Id" - } - } - 
], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/WorkspaceUpdate" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/WorkspaceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "delete": { - "tags": ["workspaces"], - "summary": "Delete Workspace", - "description": "Deletes a workspace.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.", - "operationId": "delete_workspace_api_v1_workspaces__workspace_name_or_id__delete", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { "application/json": { "schema": {} } } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/stacks": { - "get": { - "tags": ["workspaces"], - "summary": "List Workspace Stacks", - "description": "Get stacks that are part of a specific workspace for the user.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n stack_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All stacks part of the specified workspace.", - "operationId": "list_workspace_stacks_api_v1_workspaces__workspace_name_or_id__stacks_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, 
- { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "description", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Description" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "component_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Component Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/Page_StackResponse_" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Stack", - "description": "Creates a stack for a particular workspace.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n stack: Stack to register.\n auth_context: Authentication context.\n\nReturns:\n The created stack.", - "operationId": "create_stack_api_v1_workspaces__workspace_name_or_id__stacks_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StackRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - 
"content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/StackResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/components": { - "get": { - "tags": ["workspaces"], - "summary": "List Workspace Stack Components", - "description": "List stack components that are part of a specific workspace.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n component_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All stack components part of the specified workspace.", - "operationId": "list_workspace_stack_components_api_v1_workspaces__workspace_name_or_id__components_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "scope_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Scope Type" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { 
- "name": "flavor", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Flavor" - } - }, - { - "name": "type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Type" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "connector_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Connector Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ComponentResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Stack Component", - "description": "Creates a stack component.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n component: Stack component to register.\n\nReturns:\n The created stack component.\n\nRaises:\n IllegalOperationError: If the workspace specified in the stack\n component does not match the current workspace.", - "operationId": "create_stack_component_api_v1_workspaces__workspace_name_or_id__components_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ComponentRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ComponentResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/pipelines": { - "get": { - "tags": ["workspaces"], 
- "summary": "List Workspace Pipelines", - "description": "Gets pipelines defined for a specific workspace.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n pipeline_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All pipelines within the workspace.", - "operationId": "list_workspace_pipelines_api_v1_workspaces__workspace_name_or_id__pipelines_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, 
- "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Pipeline", - "description": "Creates a pipeline.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n pipeline: Pipeline to create.\n\nReturns:\n The created pipeline.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the pipeline\n does not match the current workspace or authenticated user.", - "operationId": "create_pipeline_api_v1_workspaces__workspace_name_or_id__pipelines_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/pipeline_builds": { - "get": { - "tags": ["workspaces"], - "summary": "List Workspace Builds", - "description": "Gets builds defined for a specific workspace.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n build_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All builds within the workspace.", - "operationId": "list_workspace_builds_api_v1_workspaces__workspace_name_or_id__pipeline_builds_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", 
- "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "is_local", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Is Local" - } - }, - { - "name": "contains_code", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Contains Code" - } - }, - { - "name": "zenml_version", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Zenml Version" - } - }, - { - "name": "python_version", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Python Version" - } - }, - { - "name": "checksum", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Checksum" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineBuildResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Build", - "description": "Creates a build.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n build: Build 
to create.\n auth_context: Authentication context.\n\nReturns:\n The created build.\n\nRaises:\n IllegalOperationError: If the workspace specified in the build\n does not match the current workspace.", - "operationId": "create_build_api_v1_workspaces__workspace_name_or_id__pipeline_builds_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineBuildRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/PipelineBuildResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/pipeline_deployments": { - "get": { - "tags": ["workspaces"], - "summary": "List Workspace Deployments", - "description": "Gets deployments defined for a specific workspace.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n deployment_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All deployments within the workspace.", - "operationId": "list_workspace_deployments_api_v1_workspaces__workspace_name_or_id__pipeline_deployments_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": 
"string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "build_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Build Id" - } - }, - { - "name": "schedule_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Schedule Id" - } - }, - { - "name": "template_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Template Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineDeploymentResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Deployment", - "description": "Creates a deployment.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n deployment: Deployment to create.\n auth_context: Authentication context.\n\nReturns:\n The created deployment.\n\nRaises:\n IllegalOperationError: If the workspace specified in the\n deployment does not match the current workspace.", - "operationId": "create_deployment_api_v1_workspaces__workspace_name_or_id__pipeline_deployments_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - 
"application/json": { - "schema": { - "$ref": "#/components/schemas/PipelineDeploymentRequest" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/PipelineDeploymentResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/run_templates": { - "get": { - "tags": ["workspaces"], - "summary": "List Workspace Run Templates", - "description": "Get a page of run templates.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n Page of run templates.", - "operationId": "list_workspace_run_templates_api_v1_workspaces__workspace_name_or_id__run_templates_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - 
"schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "build_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Build Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "code_repository_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Code Repository Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_RunTemplateResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Run Template", - "description": "Create a run template.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n run_template: Run template to create.\n\nReturns:\n The created run template.\n\nRaises:\n IllegalOperationError: If the workspace specified in the\n run template does not match the current workspace.", - "operationId": "create_run_template_api_v1_workspaces__workspace_name_or_id__run_templates_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/RunTemplateRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/RunTemplateResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - 
"schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/runs": { - "get": { - "tags": ["workspaces"], - "summary": "List Runs", - "description": "Get pipeline runs according to query filters.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n runs_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n The pipeline runs according to query filters.", - "operationId": "list_runs_api_v1_workspaces__workspace_name_or_id__runs_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "tag", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Tag" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "orchestrator_run_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Orchestrator Run Id" - } - }, - { - "name": "pipeline_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - { - "name": "pipeline_name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Pipeline Name" - } - }, - { - 
"name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "stack_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Stack Id" - } - }, - { - "name": "schedule_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Schedule Id" - } - }, - { - "name": "build_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Build Id" - } - }, - { - "name": "deployment_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Deployment Id" - } - }, - { - "name": "code_repository_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Code Repository Id" - } - }, - { - "name": "template_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Template Id" - } - }, - { - "name": "status", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Status" - } - }, - { - "name": "start_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Start Time" - } - }, - { - "name": "end_time", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "End Time" - } - }, - { - "name": "unlisted", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Unlisted" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_PipelineRunResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Pipeline Run", - "description": "Creates a pipeline run.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n pipeline_run: Pipeline run to create.\n\nReturns:\n The created pipeline run.\n\nRaises:\n IllegalOperationError: If the workspace specified in the\n pipeline run does not match 
the current workspace.", - "operationId": "create_pipeline_run_api_v1_workspaces__workspace_name_or_id__runs_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRunRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRunResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/schedules": { - "post": { - "tags": ["workspaces"], - "summary": "Create Schedule", - "description": "Creates a schedule.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n schedule: Schedule to create.\n auth_context: Authentication context.\n\nReturns:\n The created schedule.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the\n schedule does not match the current workspace or authenticated user.", - "operationId": "create_schedule_api_v1_workspaces__workspace_name_or_id__schedules_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ScheduleRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ScheduleResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/runs/get-or-create": { - "post": { - "tags": ["workspaces"], - "summary": "Get Or Create Pipeline Run", - "description": "Get or create a pipeline run.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n pipeline_run: Pipeline run to create.\n auth_context: Authentication context.\n\nReturns:\n The pipeline run and a boolean indicating whether the run was created\n or not.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the\n pipeline run does not match the current workspace or authenticated\n user.", - "operationId": 
"get_or_create_pipeline_run_api_v1_workspaces__workspace_name_or_id__runs_get_or_create_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/PipelineRunRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "array", - "prefixItems": [ - { "$ref": "#/components/schemas/PipelineRunResponse" }, - { "type": "boolean" } - ], - "minItems": 2, - "maxItems": 2, - "title": "Response Get Or Create Pipeline Run Api V1 Workspaces Workspace Name Or Id Runs Get Or Create Post" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/run-metadata": { - "post": { - "tags": ["workspaces"], - "summary": "Create Run Metadata", - "description": "Creates run metadata.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n run_metadata: The run metadata to create.\n auth_context: Authentication context.\n\nReturns:\n The created run metadata.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the run\n metadata does not match the current workspace or authenticated user.\n RuntimeError: If the resource type is not supported.", - "operationId": "create_run_metadata_api_v1_workspaces__workspace_name_or_id__run_metadata_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/RunMetadataRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "array", - "items": { - "$ref": "#/components/schemas/RunMetadataResponse" - }, - "title": "Response Create Run Metadata Api V1 Workspaces Workspace Name Or Id Run Metadata Post" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/secrets": { - "post": { - "tags": ["workspaces"], - "summary": "Create Secret", - "description": "Creates a secret.\n\nArgs:\n workspace_name_or_id: Name or ID of 
the workspace.\n secret: Secret to create.\n\nReturns:\n The created secret.\n\nRaises:\n IllegalOperationError: If the workspace specified in the\n secret does not match the current workspace.", - "operationId": "create_secret_api_v1_workspaces__workspace_name_or_id__secrets_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/SecretRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/SecretResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/code_repositories": { - "get": { - "tags": ["workspaces"], - "summary": "List Workspace Code Repositories", - "description": "Gets code repositories defined for a specific workspace.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All code repositories within the workspace.", - "operationId": "list_workspace_code_repositories_api_v1_workspaces__workspace_name_or_id__code_repositories_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" 
} - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_CodeRepositoryResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Code Repository", - "description": "Creates a code repository.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n code_repository: Code repository to create.\n\nReturns:\n The created code repository.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the\n code repository does not match the current workspace or\n authenticated user.", - "operationId": "create_code_repository_api_v1_workspaces__workspace_name_or_id__code_repositories_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/CodeRepositoryRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/CodeRepositoryResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/statistics": { - "get": { - "tags": ["workspaces"], - "summary": "Get Workspace Statistics", - "description": "Gets 
statistics of a workspace.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace to get statistics for.\n auth_context: Authentication context.\n\nReturns:\n All pipelines within the workspace.", - "operationId": "get_workspace_statistics_api_v1_workspaces__workspace_name_or_id__statistics_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "object", - "additionalProperties": { "type": "integer" }, - "title": "Response Get Workspace Statistics Api V1 Workspaces Workspace Name Or Id Statistics Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/service_connectors": { - "get": { - "tags": ["workspaces"], - "summary": "List Workspace Service Connectors", - "description": "List service connectors that are part of a specific workspace.\n\n# noqa: DAR401\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n connector_filter_model: Filter model used for pagination, sorting,\n filtering.\n hydrate: Flag deciding whether to hydrate the output model(s)\n by including metadata fields in the response.\n\nReturns:\n All service connectors part of the specified workspace.", - "operationId": "list_workspace_service_connectors_api_v1_workspaces__workspace_name_or_id__service_connectors_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "hydrate", - "in": "query", - "required": false, - "schema": { - "type": "boolean", - "default": false, - "title": "Hydrate" - } - }, - { - "name": "sort_by", - "in": "query", - "required": false, - "schema": { - "type": "string", - "default": "created", - "title": "Sort By" - } - }, - { - "name": "logical_operator", - "in": "query", - "required": false, - "schema": { - "allOf": [{ "$ref": "#/components/schemas/LogicalOperators" }], - "default": "and", - "title": "Logical Operator" - } - }, - { - "name": "page", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "minimum": 1, - "default": 1, - "title": "Page" - } - }, - { - "name": "size", - "in": "query", - "required": false, - "schema": { - "type": "integer", - "maximum": 10000, - "minimum": 1, - "default": 20, - "title": "Size" - } - }, - { - "name": "id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Id" - } - }, - { - "name": "created", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" 
}, - { "type": "string" }, - { "type": "null" } - ], - "title": "Created" - } - }, - { - "name": "updated", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Updated" - } - }, - { - "name": "scope_workspace", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Scope Workspace" - } - }, - { - "name": "scope_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Scope Type" - } - }, - { - "name": "name", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - } - }, - { - "name": "connector_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Connector Type" - } - }, - { - "name": "workspace_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Workspace Id" - } - }, - { - "name": "user_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "User Id" - } - }, - { - "name": "auth_method", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Auth Method" - } - }, - { - "name": "resource_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Type" - } - }, - { - "name": "resource_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Id" - } - }, - { - "name": "labels_str", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Labels Str" - } - }, - { - "name": "secret_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Secret Id" - } - } - ], - "requestBody": { - "content": { - "application/json": { - "schema": { - "anyOf": [ - { - "type": "object", - "additionalProperties": { - "anyOf": [{ "type": "string" }, { "type": "null" }] - } - }, - { "type": "null" } - ], - "title": "Labels" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/Page_ServiceConnectorResponse_" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - }, - "post": { - "tags": ["workspaces"], - "summary": "Create Service Connector", - "description": "Creates a service connector.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n connector: Service connector to register.\n\nReturns:\n The created service 
connector.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the service\n connector does not match the current workspace or authenticated\n user.", - "operationId": "create_service_connector_api_v1_workspaces__workspace_name_or_id__service_connectors_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorRequest" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ServiceConnectorResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/service_connectors/resources": { - "get": { - "tags": ["workspaces"], - "summary": "List Service Connector Resources", - "description": "List resources that can be accessed by service connectors.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n connector_type: the service connector type identifier to filter by.\n resource_type: the resource type identifier to filter by.\n resource_id: the resource identifier to filter by.\n auth_context: Authentication context.\n\nReturns:\n The matching list of resources that available service\n connectors have access to.", - "operationId": "list_service_connector_resources_api_v1_workspaces__workspace_name_or_id__service_connectors_resources_get", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "connector_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Connector Type" - } - }, - { - "name": "resource_type", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Type" - } - }, - { - "name": "resource_id", - "in": "query", - "required": false, - "schema": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Resource Id" - } - } - ], - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "type": "array", - "items": { - "$ref": "#/components/schemas/ServiceConnectorResourcesModel" - }, - "title": "Response List Service Connector Resources Api V1 Workspaces Workspace Name Or Id Service Connectors Resources Get" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "404": { - "content": { - "application/json": { - "schema": { 
"$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Not Found" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/models": { - "post": { - "tags": ["workspaces"], - "summary": "Create Model", - "description": "Create a new model.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n model: The model to create.\n\nReturns:\n The created model.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the\n model does not match the current workspace or authenticated\n user.", - "operationId": "create_model_api_v1_workspaces__workspace_name_or_id__models_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ModelRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ModelResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/models/{model_name_or_id}/model_versions": { - "post": { - "tags": ["workspaces"], - "summary": "Create Model Version", - "description": "Create a new model version.\n\nArgs:\n model_name_or_id: Name or ID of the model.\n workspace_name_or_id: Name or ID of the workspace.\n model_version: The model version to create.\n auth_context: Authentication context.\n\nReturns:\n The created model version.\n\nRaises:\n IllegalOperationError: If the workspace specified in the\n model version does not match the current workspace.", - "operationId": "create_model_version_api_v1_workspaces__workspace_name_or_id__models__model_name_or_id__model_versions_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "model_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Model Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ModelVersionRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ModelVersionResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - 
"description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/model_versions/{model_version_id}/artifacts": { - "post": { - "tags": ["workspaces"], - "summary": "Create Model Version Artifact Link", - "description": "Create a new model version to artifact link.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n model_version_id: ID of the model version.\n model_version_artifact_link: The model version to artifact link to create.\n auth_context: Authentication context.\n\nReturns:\n The created model version to artifact link.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the\n model version does not match the current workspace or authenticated\n user.", - "operationId": "create_model_version_artifact_link_api_v1_workspaces__workspace_name_or_id__model_versions__model_version_id__artifacts_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ModelVersionArtifactRequest" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ModelVersionArtifactResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/model_versions/{model_version_id}/runs": { - "post": { - "tags": ["workspaces"], - "summary": "Create Model Version Pipeline Run Link", - "description": "Create a new model version to pipeline run link.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n model_version_id: ID of the model version.\n model_version_pipeline_run_link: The model version to pipeline run link to create.\n auth_context: Authentication context.\n\nReturns:\n - If Model Version to Pipeline Run Link already exists - returns the existing link.\n - Otherwise, returns the newly created model version to pipeline run link.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the\n model version does not match the current workspace or authenticated\n user.", - "operationId": "create_model_version_pipeline_run_link_api_v1_workspaces__workspace_name_or_id__model_versions__model_version_id__runs_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - 
"in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - }, - { - "name": "model_version_id", - "in": "path", - "required": true, - "schema": { - "type": "string", - "format": "uuid", - "title": "Model Version Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ModelVersionPipelineRunRequest" - } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ModelVersionPipelineRunResponse" - } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - }, - "/api/v1/workspaces/{workspace_name_or_id}/services": { - "post": { - "tags": ["workspaces"], - "summary": "Create Service", - "description": "Create a new service.\n\nArgs:\n workspace_name_or_id: Name or ID of the workspace.\n service: The service to create.\n\nReturns:\n The created service.\n\nRaises:\n IllegalOperationError: If the workspace or user specified in the\n model does not match the current workspace or authenticated\n user.", - "operationId": "create_service_api_v1_workspaces__workspace_name_or_id__services_post", - "security": [{ "CookieOAuth2TokenBearer": [] }], - "parameters": [ - { - "name": "workspace_name_or_id", - "in": "path", - "required": true, - "schema": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "Workspace Name Or Id" - } - } - ], - "requestBody": { - "required": true, - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceRequest" } - } - } - }, - "responses": { - "200": { - "description": "Successful Response", - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ServiceResponse" } - } - } - }, - "401": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unauthorized" - }, - "409": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Conflict" - }, - "422": { - "content": { - "application/json": { - "schema": { "$ref": "#/components/schemas/ErrorModel" } - } - }, - "description": "Unprocessable Entity" - } - } - } - } - }, - "components": { - "schemas": { - "APIKeyRequest": { - "properties": { - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the API Key." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the API Key." - } - }, - "type": "object", - "required": ["name"], - "title": "APIKeyRequest", - "description": "Request model for API keys." - }, - "APIKeyResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/APIKeyResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." 
- }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/APIKeyResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/APIKeyResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the API Key." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "APIKeyResponse", - "description": "Response model for API keys." - }, - "APIKeyResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "key": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The API key. Only set immediately after creation or rotation." - }, - "active": { - "type": "boolean", - "title": "Whether the API key is active.", - "default": true - }, - "service_account": { - "allOf": [ - { "$ref": "#/components/schemas/ServiceAccountResponse" } - ], - "title": "The service account associated with this API key." - } - }, - "type": "object", - "required": ["created", "updated", "service_account"], - "title": "APIKeyResponseBody", - "description": "Response body for API keys." - }, - "APIKeyResponseMetadata": { - "properties": { - "description": { - "type": "string", - "maxLength": 65535, - "title": "The description of the API Key.", - "default": "" - }, - "retain_period_minutes": { - "type": "integer", - "title": "Number of minutes for which the previous key is still valid after it has been rotated." - }, - "last_login": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Time when the API key was last used to log in." - }, - "last_rotated": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Time when the API key was last rotated." - } - }, - "type": "object", - "required": ["retain_period_minutes"], - "title": "APIKeyResponseMetadata", - "description": "Response metadata for API keys." - }, - "APIKeyResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "APIKeyResponseResources", - "description": "Class for all resource models associated with the APIKey entity." - }, - "APIKeyRotateRequest": { - "properties": { - "retain_period_minutes": { - "type": "integer", - "title": "Number of minutes for which the previous key is still valid after it has been rotated.", - "default": 0 - } - }, - "type": "object", - "title": "APIKeyRotateRequest", - "description": "Request model for API key rotation." - }, - "APIKeyUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the API Key." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the API Key." - }, - "active": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Whether the API key is active." 
- } - }, - "type": "object", - "title": "APIKeyUpdate", - "description": "Update model for API keys." - }, - "ActionRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the action." - }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the action", - "default": "" - }, - "flavor": { - "type": "string", - "maxLength": 255, - "title": "The flavor of the action." - }, - "plugin_subtype": { - "type": "string", - "enum": ["webhook", "pipeline_run"], - "maxLength": 255, - "title": "The subtype of the action.", - "description": "All possible types of Plugins." - }, - "configuration": { - "type": "object", - "title": "The configuration for the action." - }, - "service_account_id": { - "type": "string", - "format": "uuid", - "title": "The service account that is used to execute the action." - }, - "auth_window": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The time window in minutes for which the service account is authorized to execute the action. Set this to 0 to authorize the service account indefinitely (not recommended). If not set, a default value defined for each individual action type is used." - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "name", - "flavor", - "plugin_subtype", - "configuration", - "service_account_id" - ], - "title": "ActionRequest", - "description": "Model for creating a new action." - }, - "ActionResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ActionResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ActionResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/ActionResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the action." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ActionResponse", - "description": "Response model for actions." - }, - "ActionResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "flavor": { - "type": "string", - "maxLength": 255, - "title": "The flavor of the action." - }, - "plugin_subtype": { - "type": "string", - "enum": ["webhook", "pipeline_run"], - "maxLength": 255, - "title": "The subtype of the action.", - "description": "All possible types of Plugins." 
- } - }, - "type": "object", - "required": ["created", "updated", "flavor", "plugin_subtype"], - "title": "ActionResponseBody", - "description": "Response body for actions." - }, - "ActionResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the action.", - "default": "" - }, - "configuration": { - "type": "object", - "title": "The configuration for the action." - }, - "auth_window": { - "type": "integer", - "title": "The time window in minutes for which the service account is authorized to execute the action." - } - }, - "type": "object", - "required": ["workspace", "configuration", "auth_window"], - "title": "ActionResponseMetadata", - "description": "Response metadata for actions." - }, - "ActionResponseResources": { - "properties": { - "service_account": { - "allOf": [{ "$ref": "#/components/schemas/UserResponse" }], - "title": "The service account that is used to execute the action." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["service_account"], - "title": "ActionResponseResources", - "description": "Class for all resource models associated with the action entity." - }, - "ActionUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The new name for the action." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The new description for the action." - }, - "configuration": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The configuration for the action." - }, - "service_account_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The service account that is used to execute the action." - }, - "auth_window": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The time window in minutes for which the service account is authorized to execute the action. Set this to 0 to authorize the service account indefinitely (not recommended). If not set, a default value defined for each individual action type is used." - } - }, - "type": "object", - "title": "ActionUpdate", - "description": "Update model for actions." - }, - "ArtifactConfiguration-Input": { - "properties": { - "materializer_source": { - "items": { "$ref": "#/components/schemas/Source" }, - "type": "array", - "title": "Materializer Source" - }, - "default_materializer_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - } - }, - "additionalProperties": false, - "type": "object", - "required": ["materializer_source"], - "title": "ArtifactConfiguration", - "description": "Class representing a complete input/output artifact configuration." - }, - "ArtifactConfiguration-Output": { - "properties": { - "materializer_source": { - "items": { "$ref": "#/components/schemas/Source" }, - "type": "array", - "title": "Materializer Source" - }, - "default_materializer_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - } - }, - "additionalProperties": false, - "type": "object", - "required": ["materializer_source"], - "title": "ArtifactConfiguration", - "description": "Class representing a complete input/output artifact configuration." 
- }, - "ArtifactNode": { - "properties": { - "id": { "type": "string", "title": "Id" }, - "type": { "type": "string", "title": "Type", "default": "artifact" }, - "data": { "$ref": "#/components/schemas/ArtifactNodeDetails" } - }, - "type": "object", - "required": ["id", "data"], - "title": "ArtifactNode", - "description": "A class that represents an artifact node in a lineage graph." - }, - "ArtifactNodeDetails": { - "properties": { - "execution_id": { "type": "string", "title": "Execution Id" }, - "name": { "type": "string", "title": "Name" }, - "status": { "$ref": "#/components/schemas/ArtifactNodeStatus" }, - "is_cached": { "type": "boolean", "title": "Is Cached" }, - "artifact_type": { "type": "string", "title": "Artifact Type" }, - "artifact_data_type": { - "type": "string", - "title": "Artifact Data Type" - }, - "parent_step_id": { "type": "string", "title": "Parent Step Id" }, - "producer_step_id": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Producer Step Id" - }, - "uri": { "type": "string", "title": "Uri" }, - "metadata": { - "items": { - "prefixItems": [ - { "type": "string" }, - { "type": "string" }, - { "type": "string" } - ], - "type": "array", - "maxItems": 3, - "minItems": 3 - }, - "type": "array", - "title": "Metadata" - } - }, - "type": "object", - "required": [ - "execution_id", - "name", - "status", - "is_cached", - "artifact_type", - "artifact_data_type", - "parent_step_id", - "producer_step_id", - "uri", - "metadata" - ], - "title": "ArtifactNodeDetails", - "description": "Captures all artifact details for the node." - }, - "ArtifactNodeStatus": { - "type": "string", - "enum": ["cached", "created", "external", "unknown"], - "title": "ArtifactNodeStatus", - "description": "Enum that represents the status of an artifact." - }, - "ArtifactRequest": { - "properties": { - "name": { - "type": "string", - "maxLength": 255, - "title": "Name of the artifact." - }, - "has_custom_name": { - "type": "boolean", - "title": "Whether the name is custom (True) or auto-generated (False).", - "default": false - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Artifact tags.", - "description": "Should be a list of plain strings, e.g., ['tag1', 'tag2']" - } - }, - "type": "object", - "required": ["name"], - "title": "ArtifactRequest", - "description": "Artifact request model." - }, - "ArtifactResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ArtifactResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ArtifactResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/ArtifactResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "Name of the output in the parent step." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ArtifactResponse", - "description": "Artifact response model." 
- }, - "ArtifactResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "tags": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Tags associated with the model" - }, - "latest_version_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Latest Version Name" - }, - "latest_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Latest Version Id" - } - }, - "type": "object", - "required": ["created", "updated", "tags"], - "title": "ArtifactResponseBody", - "description": "Response body for artifacts." - }, - "ArtifactResponseMetadata": { - "properties": { - "has_custom_name": { - "type": "boolean", - "title": "Whether the name is custom (True) or auto-generated (False).", - "default": false - } - }, - "type": "object", - "title": "ArtifactResponseMetadata", - "description": "Response metadata for artifacts." - }, - "ArtifactResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ArtifactResponseResources", - "description": "Class for all resource models associated with the Artifact Entity." - }, - "ArtifactType": { - "type": "string", - "enum": [ - "DataAnalysisArtifact", - "DataArtifact", - "ModelArtifact", - "SchemaArtifact", - "ServiceArtifact", - "StatisticsArtifact", - "BaseArtifact" - ], - "title": "ArtifactType", - "description": "All possible types an artifact can have." - }, - "ArtifactUpdate": { - "properties": { - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "add_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Add Tags" - }, - "remove_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Remove Tags" - }, - "has_custom_name": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Has Custom Name" - } - }, - "type": "object", - "title": "ArtifactUpdate", - "description": "Artifact update model." - }, - "ArtifactVersionRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "artifact_id": { - "type": "string", - "format": "uuid", - "title": "ID of the artifact to which this version belongs." - }, - "version": { - "anyOf": [{ "type": "string" }, { "type": "integer" }], - "title": "Version of the artifact." - }, - "has_custom_name": { - "type": "boolean", - "title": "Whether the name is custom (True) or auto-generated (False).", - "default": false - }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/ArtifactType" }], - "title": "Type of the artifact." - }, - "artifact_store_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "ID of the artifact store in which this artifact is stored." - }, - "uri": { - "type": "string", - "maxLength": 65535, - "title": "URI of the artifact." - }, - "materializer": { - "allOf": [{ "$ref": "#/components/schemas/Source" }], - "title": "Materializer class to use for this artifact." 
- }, - "data_type": { - "allOf": [{ "$ref": "#/components/schemas/Source" }], - "title": "Data type of the artifact." - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags of the artifact.", - "description": "Should be a list of plain strings, e.g., ['tag1', 'tag2']" - }, - "visualizations": { - "anyOf": [ - { - "items": { - "$ref": "#/components/schemas/ArtifactVisualizationRequest" - }, - "type": "array" - }, - { "type": "null" } - ], - "title": "Visualizations of the artifact." - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "artifact_id", - "version", - "type", - "uri", - "materializer", - "data_type" - ], - "title": "ArtifactVersionRequest", - "description": "Request model for artifact versions." - }, - "ArtifactVersionResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ArtifactVersionResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { - "$ref": "#/components/schemas/ArtifactVersionResponseMetadata" - }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/ArtifactVersionResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "ArtifactVersionResponse", - "description": "Response model for artifact versions." - }, - "ArtifactVersionResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "artifact": { - "allOf": [{ "$ref": "#/components/schemas/ArtifactResponse" }], - "title": "Artifact to which this version belongs." - }, - "version": { "type": "string", "title": "Version of the artifact." }, - "uri": { - "type": "string", - "maxLength": 65535, - "title": "URI of the artifact." - }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/ArtifactType" }], - "title": "Type of the artifact." - }, - "materializer": { - "allOf": [{ "$ref": "#/components/schemas/Source" }], - "title": "Materializer class to use for this artifact." - }, - "data_type": { - "allOf": [{ "$ref": "#/components/schemas/Source" }], - "title": "Data type of the artifact." - }, - "tags": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Tags associated with the model" - }, - "producer_pipeline_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the pipeline run that generated this artifact version." - } - }, - "type": "object", - "required": [ - "created", - "updated", - "artifact", - "version", - "uri", - "type", - "materializer", - "data_type", - "tags" - ], - "title": "ArtifactVersionResponseBody", - "description": "Response body for artifact versions." 
- }, - "ArtifactVersionResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "artifact_store_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "ID of the artifact store in which this artifact is stored." - }, - "producer_step_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "ID of the step run that produced this artifact." - }, - "visualizations": { - "anyOf": [ - { - "items": { - "$ref": "#/components/schemas/ArtifactVisualizationResponse" - }, - "type": "array" - }, - { "type": "null" } - ], - "title": "Visualizations of the artifact." - }, - "run_metadata": { - "additionalProperties": { - "$ref": "#/components/schemas/RunMetadataResponse" - }, - "type": "object", - "title": "Metadata of the artifact.", - "default": {} - } - }, - "type": "object", - "required": ["workspace"], - "title": "ArtifactVersionResponseMetadata", - "description": "Response metadata for artifact versions." - }, - "ArtifactVersionResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ArtifactVersionResponseResources", - "description": "Class for all resource models associated with the artifact version entity." - }, - "ArtifactVersionUpdate": { - "properties": { - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "add_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Add Tags" - }, - "remove_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Remove Tags" - } - }, - "type": "object", - "title": "ArtifactVersionUpdate", - "description": "Artifact version update model." - }, - "ArtifactVisualizationRequest": { - "properties": { - "type": { "$ref": "#/components/schemas/VisualizationType" }, - "uri": { "type": "string", "title": "Uri" } - }, - "type": "object", - "required": ["type", "uri"], - "title": "ArtifactVisualizationRequest", - "description": "Request model for artifact visualization." - }, - "ArtifactVisualizationResponse": { - "properties": { - "body": { - "anyOf": [ - { - "$ref": "#/components/schemas/ArtifactVisualizationResponseBody" - }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { - "$ref": "#/components/schemas/ArtifactVisualizationResponseMetadata" - }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/ArtifactVisualizationResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "ArtifactVisualizationResponse", - "description": "Response model for artifact visualizations." - }, - "ArtifactVisualizationResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." 
- }, - "type": { "$ref": "#/components/schemas/VisualizationType" }, - "uri": { "type": "string", "title": "Uri" } - }, - "type": "object", - "required": ["created", "updated", "type", "uri"], - "title": "ArtifactVisualizationResponseBody", - "description": "Response body for artifact visualizations." - }, - "ArtifactVisualizationResponseMetadata": { - "properties": { - "artifact_version_id": { - "type": "string", - "format": "uuid", - "title": "Artifact Version Id" - } - }, - "type": "object", - "required": ["artifact_version_id"], - "title": "ArtifactVisualizationResponseMetadata", - "description": "Response metadata model for artifact visualizations." - }, - "ArtifactVisualizationResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ArtifactVisualizationResponseResources", - "description": "Class for all resource models associated with the artifact visualization." - }, - "AuthScheme": { - "type": "string", - "enum": ["NO_AUTH", "HTTP_BASIC", "OAUTH2_PASSWORD_BEARER", "EXTERNAL"], - "title": "AuthScheme", - "description": "The authentication scheme." - }, - "AuthenticationMethodModel": { - "properties": { - "name": { - "type": "string", - "title": "User readable name for the authentication method." - }, - "auth_method": { - "type": "string", - "maxLength": 255, - "title": "The name of the authentication method." - }, - "description": { - "type": "string", - "title": "A description of the authentication method.", - "default": "" - }, - "config_schema": { - "type": "object", - "title": "The JSON schema of the configuration for this authentication method." - }, - "min_expiration_seconds": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The minimum number of seconds that the authentication session can be configured to be valid for. Set to None for authentication sessions and long-lived credentials that don't expire." - }, - "max_expiration_seconds": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The maximum number of seconds that the authentication session can be configured to be valid for. Set to None for authentication sessions and long-lived credentials that don't expire." - }, - "default_expiration_seconds": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The default number of seconds that the authentication session is valid for. Set to None for authentication sessions and long-lived credentials that don't expire." - } - }, - "type": "object", - "required": ["name", "auth_method"], - "title": "AuthenticationMethodModel", - "description": "Authentication method specification.\n\nDescribes the schema for the configuration and secrets that need to be\nprovided to configure an authentication method." - }, - "BasePluginFlavorResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/BasePluginResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/BasePluginResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/BasePluginResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "name": { "type": "string", "title": "Name of the flavor." }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/PluginType" }], - "title": "Type of the plugin." 
- }, - "subtype": { - "allOf": [ - { "$ref": "#/components/schemas/zenml__enums__PluginSubType__2" } - ], - "title": "Subtype of the plugin." - } - }, - "type": "object", - "required": ["name", "type", "subtype"], - "title": "BasePluginFlavorResponse", - "description": "Base response for all Plugin Flavors." - }, - "BasePluginResponseBody": { - "properties": {}, - "type": "object", - "title": "BasePluginResponseBody", - "description": "Response body for plugins." - }, - "BasePluginResponseMetadata": { - "properties": {}, - "type": "object", - "title": "BasePluginResponseMetadata", - "description": "Response metadata for plugins." - }, - "BasePluginResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "BasePluginResponseResources", - "description": "Response resources for plugins." - }, - "BaseResponseMetadata": { - "properties": {}, - "type": "object", - "title": "BaseResponseMetadata", - "description": "Base metadata model.\n\nUsed as a base class for all metadata models associated with responses." - }, - "BaseSettings": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "BaseSettings", - "description": "Base class for settings.\n\nThe `LEVEL` class variable defines on which level the settings can be\nspecified. By default, subclasses can be defined on both pipelines and\nsteps." - }, - "Body_device_authorization_api_v1_device_authorization_post": { - "properties": { - "client_id": { - "type": "string", - "format": "uuid", - "title": "Client Id" - } - }, - "type": "object", - "required": ["client_id"], - "title": "Body_device_authorization_api_v1_device_authorization_post" - }, - "Body_token_api_v1_login_post": { - "properties": { - "grant_type": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Grant Type" - }, - "username": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Username" - }, - "password": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Password" - }, - "client_id": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Client Id" - }, - "device_code": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Device Code" - } - }, - "type": "object", - "title": "Body_token_api_v1_login_post" - }, - "BuildItem": { - "properties": { - "image": { "type": "string", "title": "The image name or digest." }, - "dockerfile": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The dockerfile used to build the image." - }, - "requirements": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The pip requirements installed in the image." - }, - "settings_checksum": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The checksum of the build settings." - }, - "contains_code": { - "type": "boolean", - "title": "Whether the image contains user files.", - "default": true - }, - "requires_code_download": { - "type": "boolean", - "title": "Whether the image needs to download files.", - "default": false - } - }, - "type": "object", - "required": ["image"], - "title": "BuildItem", - "description": "Pipeline build item.\n\nAttributes:\n image: The image name or digest.\n dockerfile: The contents of the Dockerfile used to build the image.\n requirements: The pip requirements installed in the image. 
This is a\n string consisting of multiple concatenated requirements.txt files.\n settings_checksum: Checksum of the settings used for the build.\n contains_code: Whether the image contains user files.\n requires_code_download: Whether the image needs to download files." - }, - "ClientLazyLoader": { - "properties": { - "method_name": { "type": "string", "title": "Method Name" }, - "call_chain": { - "items": { "$ref": "#/components/schemas/_CallStep" }, - "type": "array", - "title": "Call Chain", - "default": [] - }, - "exclude_next_call": { - "type": "boolean", - "title": "Exclude Next Call", - "default": false - } - }, - "type": "object", - "required": ["method_name"], - "title": "ClientLazyLoader", - "description": "Lazy loader for Client methods." - }, - "CodeReferenceRequest": { - "properties": { - "commit": { - "type": "string", - "title": "Commit", - "description": "The commit of the code reference." - }, - "subdirectory": { - "type": "string", - "title": "Subdirectory", - "description": "The subdirectory of the code reference." - }, - "code_repository": { - "type": "string", - "format": "uuid", - "title": "Code Repository", - "description": "The repository of the code reference." - } - }, - "type": "object", - "required": ["commit", "subdirectory", "code_repository"], - "title": "CodeReferenceRequest", - "description": "Request model for code references." - }, - "CodeReferenceResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeReferenceResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeReferenceResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeReferenceResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "CodeReferenceResponse", - "description": "Response model for code references." - }, - "CodeReferenceResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "commit": { - "type": "string", - "title": "Commit", - "description": "The commit of the code reference." - }, - "subdirectory": { - "type": "string", - "title": "Subdirectory", - "description": "The subdirectory of the code reference." - }, - "code_repository": { - "allOf": [ - { "$ref": "#/components/schemas/CodeRepositoryResponse" } - ], - "description": "The repository of the code reference." - } - }, - "type": "object", - "required": [ - "created", - "updated", - "commit", - "subdirectory", - "code_repository" - ], - "title": "CodeReferenceResponseBody", - "description": "Response body for code references." - }, - "CodeReferenceResponseMetadata": { - "properties": {}, - "type": "object", - "title": "CodeReferenceResponseMetadata", - "description": "Response metadata for code references." 
- }, - "CodeReferenceResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "CodeReferenceResponseResources", - "description": "Class for all resource models associated with the code reference entity." - }, - "CodeRepositoryRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the code repository." - }, - "config": { - "type": "object", - "title": "Config", - "description": "Configuration for the code repository." - }, - "source": { - "allOf": [{ "$ref": "#/components/schemas/Source" }], - "description": "The code repository source." - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Logo Url", - "description": "Optional URL of a logo (png, jpg or svg) for the code repository." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "Description", - "description": "Code repository description." - } - }, - "type": "object", - "required": ["user", "workspace", "name", "config", "source"], - "title": "CodeRepositoryRequest", - "description": "Request model for code repositories." - }, - "CodeRepositoryResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeRepositoryResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeRepositoryResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/CodeRepositoryResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the code repository." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "CodeRepositoryResponse", - "description": "Response model for code repositories." - }, - "CodeRepositoryResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "source": { - "allOf": [{ "$ref": "#/components/schemas/Source" }], - "description": "The code repository source." - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Logo Url", - "description": "Optional URL of a logo (png, jpg or svg) for the code repository." - } - }, - "type": "object", - "required": ["created", "updated", "source"], - "title": "CodeRepositoryResponseBody", - "description": "Response body for code repositories." 
- }, - "CodeRepositoryResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "config": { - "type": "object", - "title": "Config", - "description": "Configuration for the code repository." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "Description", - "description": "Code repository description." - } - }, - "type": "object", - "required": ["workspace", "config"], - "title": "CodeRepositoryResponseMetadata", - "description": "Response metadata for code repositories." - }, - "CodeRepositoryResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "CodeRepositoryResponseResources", - "description": "Class for all resource models associated with the code repository entity." - }, - "CodeRepositoryUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the code repository." - }, - "config": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Config", - "description": "Configuration for the code repository." - }, - "source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ], - "description": "The code repository source." - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Logo Url", - "description": "Optional URL of a logo (png, jpg or svg) for the code repository." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "Description", - "description": "Code repository description." - } - }, - "type": "object", - "title": "CodeRepositoryUpdate", - "description": "Update model for code repositories." - }, - "ColorVariants": { - "type": "string", - "enum": [ - "grey", - "purple", - "red", - "green", - "yellow", - "orange", - "lime", - "teal", - "turquoise", - "magenta", - "blue" - ], - "title": "ColorVariants", - "description": "All possible color variants for frontend." - }, - "ComponentInfo": { - "properties": { - "flavor": { "type": "string", "title": "Flavor" }, - "service_connector_index": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The id of the service connector from the list `service_connectors`.", - "description": "The id of the service connector from the list `service_connectors` from `FullStackRequest`." - }, - "service_connector_resource_id": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Service Connector Resource Id" - }, - "configuration": { - "type": "object", - "title": "Configuration", - "default": {} - } - }, - "type": "object", - "required": ["flavor"], - "title": "ComponentInfo", - "description": "Information about each stack components when creating a full stack." - }, - "ComponentRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the stack component." - }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/StackComponentType" }], - "title": "The type of the stack component." 
- }, - "flavor": { - "type": "string", - "maxLength": 255, - "title": "The flavor of the stack component." - }, - "configuration": { - "type": "object", - "title": "The stack component configuration." - }, - "connector_resource_id": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Connector Resource Id", - "description": "The ID of a specific resource instance to gain access to through the connector" - }, - "labels": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The stack component labels." - }, - "component_spec_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The path to the component spec used for mlstacks deployments." - }, - "connector": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The service connector linked to this stack component." - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "name", - "type", - "flavor", - "configuration" - ], - "title": "ComponentRequest", - "description": "Request model for components." - }, - "ComponentResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ComponentResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ComponentResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/ComponentResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the stack component." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ComponentResponse", - "description": "Response model for components." - }, - "ComponentResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/StackComponentType" }], - "title": "The type of the stack component." - }, - "flavor": { - "type": "string", - "maxLength": 255, - "title": "The flavor of the stack component." - }, - "integration": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the integration that the component's flavor belongs to." - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to a png,svg or jpg can be attached." - } - }, - "type": "object", - "required": ["created", "updated", "type", "flavor"], - "title": "ComponentResponseBody", - "description": "Response body for components." - }, - "ComponentResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "configuration": { - "type": "object", - "title": "The stack component configuration." 
- }, - "labels": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The stack component labels." - }, - "component_spec_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The path to the component spec used for mlstacks deployments." - }, - "connector_resource_id": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Connector Resource Id", - "description": "The ID of a specific resource instance to gain access to through the connector" - }, - "connector": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceConnectorResponse" }, - { "type": "null" } - ], - "title": "The service connector linked to this stack component." - } - }, - "type": "object", - "required": ["workspace", "configuration"], - "title": "ComponentResponseMetadata", - "description": "Response metadata for components." - }, - "ComponentResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ComponentResponseResources", - "description": "Class for all resource models associated with the component entity." - }, - "ComponentUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the stack component." - }, - "type": { - "anyOf": [ - { "$ref": "#/components/schemas/StackComponentType" }, - { "type": "null" } - ], - "title": "The type of the stack component." - }, - "flavor": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The flavor of the stack component." - }, - "configuration": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The stack component configuration." - }, - "connector_resource_id": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Connector Resource Id", - "description": "The ID of a specific resource instance to gain access to through the connector" - }, - "labels": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The stack component labels." - }, - "component_spec_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The path to the component spec used for mlstacks deployments." - }, - "connector": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The service connector linked to this stack component." - } - }, - "type": "object", - "title": "ComponentUpdate", - "description": "Update model for stack components." - }, - "DeployedStack": { - "properties": { - "stack": { - "allOf": [{ "$ref": "#/components/schemas/StackResponse" }], - "title": "The stack that was deployed.", - "description": "The stack that was deployed." - }, - "service_connector": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceConnectorResponse" }, - { "type": "null" } - ], - "title": "The service connector for the deployed stack.", - "description": "The service connector for the deployed stack." - } - }, - "type": "object", - "required": ["stack"], - "title": "DeployedStack", - "description": "Information about a deployed stack." - }, - "Edge": { - "properties": { - "id": { "type": "string", "title": "Id" }, - "source": { "type": "string", "title": "Source" }, - "target": { "type": "string", "title": "Target" } - }, - "type": "object", - "required": ["id", "source", "target"], - "title": "Edge", - "description": "A class that represents an edge in a lineage graph." 
- }, - "ErrorModel": { - "properties": { - "detail": { "anyOf": [{}, { "type": "null" }], "title": "Detail" } - }, - "type": "object", - "title": "ErrorModel", - "description": "Base class for error responses." - }, - "EventSourceRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the event source." - }, - "flavor": { - "type": "string", - "maxLength": 255, - "title": "The flavor of event source." - }, - "plugin_subtype": { - "allOf": [ - { "$ref": "#/components/schemas/zenml__enums__PluginSubType__2" } - ], - "title": "The plugin subtype of the event source." - }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the event source.", - "default": "" - }, - "configuration": { - "type": "object", - "title": "The event source configuration." - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "name", - "flavor", - "plugin_subtype", - "configuration" - ], - "title": "EventSourceRequest", - "description": "BaseModel for all event sources." - }, - "EventSourceResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/EventSourceResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/EventSourceResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/EventSourceResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the event source." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "EventSourceResponse", - "description": "Response model for event sources." - }, - "EventSourceResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "flavor": { - "type": "string", - "maxLength": 255, - "title": "The flavor of event source." - }, - "plugin_subtype": { - "allOf": [ - { "$ref": "#/components/schemas/zenml__enums__PluginSubType__2" } - ], - "title": "The plugin subtype of the event source." - }, - "is_active": { - "type": "boolean", - "title": "Whether the event source is active." - } - }, - "type": "object", - "required": [ - "created", - "updated", - "flavor", - "plugin_subtype", - "is_active" - ], - "title": "EventSourceResponseBody", - "description": "ResponseBody for event sources." - }, - "EventSourceResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." 
- }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the event source.", - "default": "" - }, - "configuration": { - "type": "object", - "title": "The event source configuration." - } - }, - "type": "object", - "required": ["workspace", "configuration"], - "title": "EventSourceResponseMetadata", - "description": "Response metadata for event sources." - }, - "EventSourceResponseResources": { - "properties": { - "triggers": { - "allOf": [{ "$ref": "#/components/schemas/Page_TriggerResponse_" }], - "title": "The triggers configured with this event source." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["triggers"], - "title": "EventSourceResponseResources", - "description": "Class for all resource models associated with the code repository entity." - }, - "EventSourceUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The updated name of the event source." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The updated description of the event source." - }, - "configuration": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The updated event source configuration." - }, - "is_active": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "The status of the event source." - } - }, - "type": "object", - "title": "EventSourceUpdate", - "description": "Update model for event sources." - }, - "ExecutionStatus": { - "type": "string", - "enum": ["initializing", "failed", "completed", "running", "cached"], - "title": "ExecutionStatus", - "description": "Enum that represents the current status of a step or pipeline run." - }, - "ExternalArtifactConfiguration-Input": { - "properties": { - "id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Id" - }, - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Version" - }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - } - }, - "type": "object", - "title": "ExternalArtifactConfiguration", - "description": "External artifact configuration.\n\nLightweight class to pass in the steps for runtime inference." - }, - "ExternalArtifactConfiguration-Output": { - "properties": { - "id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Id" - }, - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Version" - }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - } - }, - "type": "object", - "title": "ExternalArtifactConfiguration", - "description": "External artifact configuration.\n\nLightweight class to pass in the steps for runtime inference." - }, - "FlavorRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the Flavor." - }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/StackComponentType" }], - "title": "The type of the Flavor." 
- }, - "config_schema": { - "type": "object", - "title": "The JSON schema of this flavor's corresponding configuration." - }, - "connector_type": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The type of the connector that this flavor uses." - }, - "connector_resource_type": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The resource type of the connector that this flavor uses." - }, - "connector_resource_id_attr": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of an attribute in the stack component configuration that plays the role of resource ID when linked to a service connector." - }, - "source": { - "type": "string", - "maxLength": 255, - "title": "The path to the module which contains this Flavor." - }, - "integration": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the integration that the Flavor belongs to." - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to a png,svg or jpg can be attached." - }, - "docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to docs, within docs.zenml.io." - }, - "sdk_docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to SDK docs,within sdkdocs.zenml.io." - }, - "is_custom": { - "type": "boolean", - "title": "Whether or not this flavor is a custom, user created flavor.", - "default": true - }, - "workspace": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The workspace to which this resource belongs." - } - }, - "type": "object", - "required": [ - "user", - "name", - "type", - "config_schema", - "source", - "integration" - ], - "title": "FlavorRequest", - "description": "Request model for flavors." - }, - "FlavorResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/FlavorResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/FlavorResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/FlavorResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the Flavor." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "FlavorResponse", - "description": "Response model for flavors." - }, - "FlavorResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/StackComponentType" }], - "title": "The type of the Flavor." 
- }, - "integration": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the integration that the Flavor belongs to." - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to a png,svg or jpg can be attached." - } - }, - "type": "object", - "required": ["created", "updated", "type", "integration"], - "title": "FlavorResponseBody", - "description": "Response body for flavor." - }, - "FlavorResponseMetadata": { - "properties": { - "workspace": { - "anyOf": [ - { "$ref": "#/components/schemas/WorkspaceResponse" }, - { "type": "null" } - ], - "title": "The project of this resource." - }, - "config_schema": { - "type": "object", - "title": "The JSON schema of this flavor's corresponding configuration." - }, - "connector_type": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The type of the connector that this flavor uses." - }, - "connector_resource_type": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The resource type of the connector that this flavor uses." - }, - "connector_resource_id_attr": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of an attribute in the stack component configuration that plays the role of resource ID when linked to a service connector." - }, - "source": { - "type": "string", - "maxLength": 255, - "title": "The path to the module which contains this Flavor." - }, - "docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to docs, within docs.zenml.io." - }, - "sdk_docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to SDK docs,within sdkdocs.zenml.io." - }, - "is_custom": { - "type": "boolean", - "title": "Whether or not this flavor is a custom, user created flavor.", - "default": true - } - }, - "type": "object", - "required": ["workspace", "config_schema", "source"], - "title": "FlavorResponseMetadata", - "description": "Response metadata for flavors." - }, - "FlavorResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "FlavorResponseResources", - "description": "Class for all resource models associated with the flavor entity." - }, - "FlavorUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the Flavor." - }, - "type": { - "anyOf": [ - { "$ref": "#/components/schemas/StackComponentType" }, - { "type": "null" } - ], - "title": "The type of the Flavor." - }, - "config_schema": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The JSON schema of this flavor's corresponding configuration." - }, - "connector_type": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The type of the connector that this flavor uses." - }, - "connector_resource_type": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The resource type of the connector that this flavor uses." - }, - "connector_resource_id_attr": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of an attribute in the stack component configuration that plays the role of resource ID when linked to a service connector." 
- }, - "source": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The path to the module which contains this Flavor." - }, - "integration": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the integration that the Flavor belongs to." - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to a png,svg or jpg can be attached." - }, - "docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to docs, within docs.zenml.io." - }, - "sdk_docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a url pointing to SDK docs,within sdkdocs.zenml.io." - }, - "is_custom": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Whether or not this flavor is a custom, user created flavor." - }, - "workspace": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The workspace to which this resource belongs." - } - }, - "type": "object", - "title": "FlavorUpdate", - "description": "Update model for flavors." - }, - "HTTPValidationError": { - "properties": { - "detail": { - "items": { "$ref": "#/components/schemas/ValidationError" }, - "type": "array", - "title": "Detail" - } - }, - "type": "object", - "title": "HTTPValidationError" - }, - "InputSpec": { - "properties": { - "step_name": { "type": "string", "title": "Step Name" }, - "output_name": { "type": "string", "title": "Output Name" } - }, - "additionalProperties": false, - "type": "object", - "required": ["step_name", "output_name"], - "title": "InputSpec", - "description": "Step input specification." - }, - "LineageGraph": { - "properties": { - "nodes": { - "items": { - "anyOf": [ - { "$ref": "#/components/schemas/StepNode" }, - { "$ref": "#/components/schemas/ArtifactNode" } - ] - }, - "type": "array", - "title": "Nodes", - "default": [] - }, - "edges": { - "items": { "$ref": "#/components/schemas/Edge" }, - "type": "array", - "title": "Edges", - "default": [] - }, - "root_step_id": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Root Step Id" - }, - "run_metadata": { - "items": { - "prefixItems": [ - { "type": "string" }, - { "type": "string" }, - { "type": "string" } - ], - "type": "array", - "maxItems": 3, - "minItems": 3 - }, - "type": "array", - "title": "Run Metadata", - "default": [] - } - }, - "type": "object", - "title": "LineageGraph", - "description": "A lineage graph representation of a PipelineRunResponseModel." - }, - "LoadedVisualization": { - "properties": { - "type": { "$ref": "#/components/schemas/VisualizationType" }, - "value": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "binary" } - ], - "title": "Value" - } - }, - "type": "object", - "required": ["type", "value"], - "title": "LoadedVisualization", - "description": "Model for loaded visualizations." - }, - "LogicalOperators": { - "type": "string", - "enum": ["or", "and"], - "title": "LogicalOperators", - "description": "Logical Ops to use to combine filters on list methods." - }, - "LogsRequest": { - "properties": { - "uri": { "type": "string", "title": "The uri of the logs file" }, - "artifact_store_id": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "The artifact store ID to associate the logs with." 
- } - }, - "type": "object", - "required": ["uri", "artifact_store_id"], - "title": "LogsRequest", - "description": "Request model for logs." - }, - "LogsResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/LogsResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/LogsResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/LogsResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "LogsResponse", - "description": "Response model for logs." - }, - "LogsResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "uri": { - "type": "string", - "maxLength": 65535, - "title": "The uri of the logs file" - } - }, - "type": "object", - "required": ["created", "updated", "uri"], - "title": "LogsResponseBody", - "description": "Response body for logs." - }, - "LogsResponseMetadata": { - "properties": { - "step_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Step ID to associate the logs with.", - "description": "When this is set, pipeline_run_id should be set to None." - }, - "pipeline_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline run ID to associate the logs with.", - "description": "When this is set, step_run_id should be set to None." - }, - "artifact_store_id": { - "anyOf": [ - { "type": "string" }, - { "type": "string", "format": "uuid" } - ], - "title": "The artifact store ID to associate the logs with." - } - }, - "type": "object", - "required": ["artifact_store_id"], - "title": "LogsResponseMetadata", - "description": "Response metadata for logs." - }, - "LogsResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "LogsResponseResources", - "description": "Class for all resource models associated with the Logs entity." - }, - "MetadataResourceTypes": { - "type": "string", - "enum": [ - "pipeline_run", - "step_run", - "artifact_version", - "model_version" - ], - "title": "MetadataResourceTypes", - "description": "All possible resource types for adding metadata." - }, - "MetadataTypeEnum": { - "type": "string", - "enum": [ - "str", - "int", - "float", - "bool", - "list", - "dict", - "tuple", - "set", - "Uri", - "Path", - "DType", - "StorageSize" - ], - "title": "MetadataTypeEnum", - "description": "String Enum of all possible types that metadata can have." 
- }, - "Model": { - "properties": { - "name": { "type": "string", "title": "Name" }, - "license": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "License" - }, - "description": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Description" - }, - "audience": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Audience" - }, - "use_cases": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Use Cases" - }, - "limitations": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Limitations" - }, - "trade_offs": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Trade Offs" - }, - "ethics": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Ethics" - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags" - }, - "version": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelStages" }, - { "type": "integer" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Version" - }, - "save_models_to_registry": { - "type": "boolean", - "title": "Save Models To Registry", - "default": true - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Model Version Id" - }, - "suppress_class_validation_warnings": { - "type": "boolean", - "title": "Suppress Class Validation Warnings", - "default": false - }, - "was_created_in_this_run": { - "type": "boolean", - "title": "Was Created In This Run", - "default": false - } - }, - "type": "object", - "required": ["name"], - "title": "Model", - "description": "Model class to pass into pipeline or step to set it into a model context.\n\nname: The name of the model.\nlicense: The license under which the model is created.\ndescription: The description of the model.\naudience: The target audience of the model.\nuse_cases: The use cases of the model.\nlimitations: The known limitations of the model.\ntrade_offs: The tradeoffs of the model.\nethics: The ethical implications of the model.\ntags: Tags associated with the model.\nversion: The version name, version number or stage is optional and points model context\n to a specific version/stage. If skipped new version will be created.\nsave_models_to_registry: Whether to save all ModelArtifacts to Model Registry,\n if available in active stack.\nmodel_version_id: The ID of a specific Model Version, if given - it will override\n `name` and `version` settings. Used mostly internally." - }, - "ModelRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." 
- }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the model" - }, - "license": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The license model created under" - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the model" - }, - "audience": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The target audience of the model" - }, - "use_cases": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The use cases of the model" - }, - "limitations": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The know limitations of the model" - }, - "trade_offs": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The trade offs of the model" - }, - "ethics": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The ethical implications of the model" - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags associated with the model" - }, - "save_models_to_registry": { - "type": "boolean", - "title": "Whether to save all ModelArtifacts to Model Registry", - "default": true - } - }, - "type": "object", - "required": ["user", "workspace", "name", "tags"], - "title": "ModelRequest", - "description": "Request model for models." - }, - "ModelResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the model" - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ModelResponse", - "description": "Response model for models." - }, - "ModelResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "tags": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Tags associated with the model" - }, - "latest_version_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Latest Version Name" - }, - "latest_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Latest Version Id" - } - }, - "type": "object", - "required": ["created", "updated", "tags"], - "title": "ModelResponseBody", - "description": "Response body for models." 
- }, - "ModelResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "license": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The license model created under" - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the model" - }, - "audience": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The target audience of the model" - }, - "use_cases": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The use cases of the model" - }, - "limitations": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The know limitations of the model" - }, - "trade_offs": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The trade offs of the model" - }, - "ethics": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The ethical implications of the model" - }, - "save_models_to_registry": { - "type": "boolean", - "title": "Whether to save all ModelArtifacts to Model Registry", - "default": true - } - }, - "type": "object", - "required": ["workspace"], - "title": "ModelResponseMetadata", - "description": "Response metadata for models." - }, - "ModelResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ModelResponseResources", - "description": "Class for all resource models associated with the model entity." - }, - "ModelStages": { - "type": "string", - "enum": ["none", "staging", "production", "archived", "latest"], - "title": "ModelStages", - "description": "All possible stages of a Model Version." - }, - "ModelUpdate": { - "properties": { - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "license": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "License" - }, - "description": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Description" - }, - "audience": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Audience" - }, - "use_cases": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Use Cases" - }, - "limitations": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Limitations" - }, - "trade_offs": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Trade Offs" - }, - "ethics": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Ethics" - }, - "add_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Add Tags" - }, - "remove_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Remove Tags" - }, - "save_models_to_registry": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Save Models To Registry" - } - }, - "type": "object", - "title": "ModelUpdate", - "description": "Update model for models." - }, - "ModelVersionArtifactRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." 
- }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "model": { "type": "string", "format": "uuid", "title": "Model" }, - "model_version": { - "type": "string", - "format": "uuid", - "title": "Model Version" - }, - "artifact_version": { - "type": "string", - "format": "uuid", - "title": "Artifact Version" - }, - "is_model_artifact": { - "type": "boolean", - "title": "Is Model Artifact", - "default": false - }, - "is_deployment_artifact": { - "type": "boolean", - "title": "Is Deployment Artifact", - "default": false - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "model", - "model_version", - "artifact_version" - ], - "title": "ModelVersionArtifactRequest", - "description": "Request model for links between model versions and artifacts." - }, - "ModelVersionArtifactResponse": { - "properties": { - "body": { - "anyOf": [ - { - "$ref": "#/components/schemas/ModelVersionArtifactResponseBody" - }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/BaseResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/ModelVersionArtifactResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "ModelVersionArtifactResponse", - "description": "Response model for links between model versions and artifacts." - }, - "ModelVersionArtifactResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "model": { "type": "string", "format": "uuid", "title": "Model" }, - "model_version": { - "type": "string", - "format": "uuid", - "title": "Model Version" - }, - "artifact_version": { - "$ref": "#/components/schemas/ArtifactVersionResponse" - }, - "is_model_artifact": { - "type": "boolean", - "title": "Is Model Artifact", - "default": false - }, - "is_deployment_artifact": { - "type": "boolean", - "title": "Is Deployment Artifact", - "default": false - } - }, - "type": "object", - "required": [ - "created", - "updated", - "model", - "model_version", - "artifact_version" - ], - "title": "ModelVersionArtifactResponseBody", - "description": "Response body for links between model versions and artifacts." - }, - "ModelVersionArtifactResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ModelVersionArtifactResponseResources", - "description": "Class for all resource models associated with the model version artifact entity." 
- }, - "ModelVersionDataLazyLoader-Input": { - "properties": { - "model": { "$ref": "#/components/schemas/Model" }, - "artifact_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Artifact Name" - }, - "artifact_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Artifact Version" - }, - "metadata_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Metadata Name" - } - }, - "type": "object", - "required": ["model"], - "title": "ModelVersionDataLazyLoader", - "description": "Model Version Data Lazy Loader helper class.\n\nIt helps the inner codes to fetch proper artifact,\nmodel version metadata or artifact metadata from the\nmodel version during runtime time of the step." - }, - "ModelVersionDataLazyLoader-Output": { - "properties": { - "model": { "$ref": "#/components/schemas/Model" }, - "artifact_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Artifact Name" - }, - "artifact_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Artifact Version" - }, - "metadata_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Metadata Name" - } - }, - "type": "object", - "required": ["model"], - "title": "ModelVersionDataLazyLoader", - "description": "Model Version Data Lazy Loader helper class.\n\nIt helps the inner codes to fetch proper artifact,\nmodel version metadata or artifact metadata from the\nmodel version during runtime time of the step." - }, - "ModelVersionPipelineRunRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "model": { "type": "string", "format": "uuid", "title": "Model" }, - "model_version": { - "type": "string", - "format": "uuid", - "title": "Model Version" - }, - "pipeline_run": { - "type": "string", - "format": "uuid", - "title": "Pipeline Run" - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "model", - "model_version", - "pipeline_run" - ], - "title": "ModelVersionPipelineRunRequest", - "description": "Request model for links between model versions and pipeline runs." - }, - "ModelVersionPipelineRunResponse": { - "properties": { - "body": { - "anyOf": [ - { - "$ref": "#/components/schemas/ModelVersionPipelineRunResponseBody" - }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/BaseResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/ModelVersionPipelineRunResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "ModelVersionPipelineRunResponse", - "description": "Response model for links between model versions and pipeline runs." - }, - "ModelVersionPipelineRunResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." 
- }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "model": { "type": "string", "format": "uuid", "title": "Model" }, - "model_version": { - "type": "string", - "format": "uuid", - "title": "Model Version" - }, - "pipeline_run": { "$ref": "#/components/schemas/PipelineRunResponse" } - }, - "type": "object", - "required": [ - "created", - "updated", - "model", - "model_version", - "pipeline_run" - ], - "title": "ModelVersionPipelineRunResponseBody", - "description": "Response body for links between model versions and pipeline runs." - }, - "ModelVersionPipelineRunResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ModelVersionPipelineRunResponseResources", - "description": "Class for all resource models associated with the model version pipeline run entity." - }, - "ModelVersionRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "Name", - "description": "The name of the model version" - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "Description", - "description": "The description of the model version" - }, - "stage": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "Stage", - "description": "The stage of the model version" - }, - "number": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "Number", - "description": "The number of the model version" - }, - "model": { - "type": "string", - "format": "uuid", - "title": "Model", - "description": "The ID of the model containing version" - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags associated with the model version" - } - }, - "type": "object", - "required": ["user", "workspace", "model"], - "title": "ModelVersionRequest", - "description": "Request model for model versions." - }, - "ModelVersionResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelVersionResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelVersionResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelVersionResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "Name", - "description": "The name of the model version" - } - }, - "type": "object", - "required": ["id"], - "title": "ModelVersionResponse", - "description": "Response model for model versions." 
- }, - "ModelVersionResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "stage": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "Stage", - "description": "The stage of the model version" - }, - "number": { - "type": "integer", - "title": "Number", - "description": "The number of the model version" - }, - "model": { - "allOf": [{ "$ref": "#/components/schemas/ModelResponse" }], - "description": "The model containing version" - }, - "model_artifact_ids": { - "additionalProperties": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object" - }, - "type": "object", - "title": "Model Artifact Ids", - "description": "Model artifacts linked to the model version", - "default": {} - }, - "data_artifact_ids": { - "additionalProperties": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object" - }, - "type": "object", - "title": "Data Artifact Ids", - "description": "Data artifacts linked to the model version", - "default": {} - }, - "deployment_artifact_ids": { - "additionalProperties": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object" - }, - "type": "object", - "title": "Deployment Artifact Ids", - "description": "Deployment artifacts linked to the model version", - "default": {} - }, - "pipeline_run_ids": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object", - "title": "Pipeline Run Ids", - "description": "Pipeline runs linked to the model version", - "default": {} - }, - "tags": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Tags associated with the model version", - "default": [] - } - }, - "type": "object", - "required": ["created", "updated", "number", "model"], - "title": "ModelVersionResponseBody", - "description": "Response body for model versions." - }, - "ModelVersionResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "Description", - "description": "The description of the model version" - }, - "run_metadata": { - "additionalProperties": { - "$ref": "#/components/schemas/RunMetadataResponse" - }, - "type": "object", - "title": "Run Metadata", - "description": "Metadata linked to the model version", - "default": {} - } - }, - "type": "object", - "required": ["workspace"], - "title": "ModelVersionResponseMetadata", - "description": "Response metadata for model versions." - }, - "ModelVersionResponseResources": { - "properties": { - "services": { - "allOf": [{ "$ref": "#/components/schemas/Page_ServiceResponse_" }], - "description": "Services linked to the model version" - } - }, - "additionalProperties": true, - "type": "object", - "required": ["services"], - "title": "ModelVersionResponseResources", - "description": "Class for all resource models associated with the model version entity." 
- }, - "ModelVersionUpdate": { - "properties": { - "model": { - "type": "string", - "format": "uuid", - "title": "Model", - "description": "The ID of the model containing version" - }, - "stage": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/ModelStages" }, - { "type": "null" } - ], - "title": "Stage", - "description": "Target model version stage to be set" - }, - "force": { - "type": "boolean", - "title": "Force", - "description": "Whether existing model version in target stage should be silently archived or an error should be raised.", - "default": false - }, - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name", - "description": "Target model version name to be set" - }, - "description": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Description", - "description": "Target model version description to be set" - }, - "add_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Add Tags", - "description": "Tags to be added to the model version" - }, - "remove_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Remove Tags", - "description": "Tags to be removed from the model version" - } - }, - "type": "object", - "required": ["model"], - "title": "ModelVersionUpdate", - "description": "Update model for model versions." - }, - "OAuthDeviceAuthorizationResponse": { - "properties": { - "device_code": { "type": "string", "title": "Device Code" }, - "user_code": { "type": "string", "title": "User Code" }, - "verification_uri": { "type": "string", "title": "Verification Uri" }, - "verification_uri_complete": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Verification Uri Complete" - }, - "expires_in": { "type": "integer", "title": "Expires In" }, - "interval": { "type": "integer", "title": "Interval" } - }, - "type": "object", - "required": [ - "device_code", - "user_code", - "verification_uri", - "expires_in", - "interval" - ], - "title": "OAuthDeviceAuthorizationResponse", - "description": "OAuth2 device authorization grant response." - }, - "OAuthDeviceResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/OAuthDeviceResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/OAuthDeviceResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/OAuthDeviceResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "OAuthDeviceResponse", - "description": "Response model for OAuth2 devices." - }, - "OAuthDeviceResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." 
- }, - "client_id": { - "type": "string", - "format": "uuid", - "title": "Client Id", - "description": "The client ID of the OAuth2 device." - }, - "expires": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Expires", - "description": "The expiration date of the OAuth2 device after which the device is no longer valid and cannot be used for authentication." - }, - "trusted_device": { - "type": "boolean", - "title": "Trusted Device", - "description": "Whether the OAuth2 device was marked as trusted. A trusted device has a much longer validity time." - }, - "status": { - "allOf": [{ "$ref": "#/components/schemas/OAuthDeviceStatus" }], - "description": "The status of the OAuth2 device." - }, - "os": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Os", - "description": "The operating system of the device used for authentication." - }, - "ip_address": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Ip Address", - "description": "The IP address of the device used for authentication." - }, - "hostname": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Hostname", - "description": "The hostname of the device used for authentication." - } - }, - "type": "object", - "required": [ - "created", - "updated", - "client_id", - "trusted_device", - "status" - ], - "title": "OAuthDeviceResponseBody", - "description": "Response body for OAuth2 devices." - }, - "OAuthDeviceResponseMetadata": { - "properties": { - "python_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Python Version", - "description": "The Python version of the device used for authentication." - }, - "zenml_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Zenml Version", - "description": "The ZenML version of the device used for authentication." - }, - "city": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "City", - "description": "The city where the device is located." - }, - "region": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Region", - "description": "The region where the device is located." - }, - "country": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Country", - "description": "The country where the device is located." - }, - "failed_auth_attempts": { - "type": "integer", - "title": "Failed Auth Attempts", - "description": "The number of failed authentication attempts." - }, - "last_login": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Last Login", - "description": "The date of the last successful login." - } - }, - "type": "object", - "required": ["failed_auth_attempts", "last_login"], - "title": "OAuthDeviceResponseMetadata", - "description": "Response metadata for OAuth2 devices." - }, - "OAuthDeviceResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "OAuthDeviceResponseResources", - "description": "Class for all resource models associated with the OAuthDevice entity." - }, - "OAuthDeviceStatus": { - "type": "string", - "enum": ["pending", "verified", "active", "locked"], - "title": "OAuthDeviceStatus", - "description": "The OAuth device status." - }, - "OAuthDeviceUpdate": { - "properties": { - "locked": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Locked", - "description": "Whether to lock or unlock the OAuth2 device. 
A locked device cannot be used for authentication." - } - }, - "type": "object", - "title": "OAuthDeviceUpdate", - "description": "OAuth2 device update model." - }, - "OAuthDeviceVerificationRequest": { - "properties": { - "user_code": { "type": "string", "title": "User Code" }, - "trusted_device": { - "type": "boolean", - "title": "Trusted Device", - "default": false - } - }, - "type": "object", - "required": ["user_code"], - "title": "OAuthDeviceVerificationRequest", - "description": "OAuth2 device authorization verification request." - }, - "OAuthRedirectResponse": { - "properties": { - "authorization_url": { - "type": "string", - "title": "Authorization Url" - } - }, - "type": "object", - "required": ["authorization_url"], - "title": "OAuthRedirectResponse", - "description": "Redirect response." - }, - "OAuthTokenResponse": { - "properties": { - "access_token": { "type": "string", "title": "Access Token" }, - "token_type": { "type": "string", "title": "Token Type" }, - "expires_in": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "Expires In" - }, - "refresh_token": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Refresh Token" - }, - "scope": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Scope" - } - }, - "type": "object", - "required": ["access_token", "token_type"], - "title": "OAuthTokenResponse", - "description": "OAuth2 device authorization token response." - }, - "Page_APIKeyResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/APIKeyResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[APIKeyResponse]" - }, - "Page_ActionResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ActionResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ActionResponse]" - }, - "Page_ArtifactResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ArtifactResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ArtifactResponse]" - }, - "Page_ArtifactVersionResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 
0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ArtifactVersionResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ArtifactVersionResponse]" - }, - "Page_BasePluginFlavorResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { - "$ref": "#/components/schemas/BasePluginFlavorResponse" - }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[BasePluginFlavorResponse]" - }, - "Page_CodeRepositoryResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/CodeRepositoryResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[CodeRepositoryResponse]" - }, - "Page_ComponentResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ComponentResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ComponentResponse]" - }, - "Page_EventSourceResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/EventSourceResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[EventSourceResponse]" - }, - "Page_FlavorResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/FlavorResponse" }, - "type": "array", - "title": "Items" - } - }, 
- "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[FlavorResponse]" - }, - "Page_ModelResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ModelResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ModelResponse]" - }, - "Page_ModelVersionArtifactResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { - "$ref": "#/components/schemas/ModelVersionArtifactResponse" - }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ModelVersionArtifactResponse]" - }, - "Page_ModelVersionPipelineRunResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { - "$ref": "#/components/schemas/ModelVersionPipelineRunResponse" - }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ModelVersionPipelineRunResponse]" - }, - "Page_ModelVersionResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ModelVersionResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ModelVersionResponse]" - }, - "Page_OAuthDeviceResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/OAuthDeviceResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[OAuthDeviceResponse]" - }, - "Page_PipelineBuildResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - 
"max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/PipelineBuildResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[PipelineBuildResponse]" - }, - "Page_PipelineDeploymentResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { - "$ref": "#/components/schemas/PipelineDeploymentResponse" - }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[PipelineDeploymentResponse]" - }, - "Page_PipelineResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/PipelineResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[PipelineResponse]" - }, - "Page_PipelineRunResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/PipelineRunResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[PipelineRunResponse]" - }, - "Page_RunMetadataResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/RunMetadataResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[RunMetadataResponse]" - }, - "Page_RunTemplateResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": 
"#/components/schemas/RunTemplateResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[RunTemplateResponse]" - }, - "Page_ScheduleResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ScheduleResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ScheduleResponse]" - }, - "Page_SecretResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/SecretResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[SecretResponse]" - }, - "Page_ServiceAccountResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ServiceAccountResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ServiceAccountResponse]" - }, - "Page_ServiceConnectorResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { - "$ref": "#/components/schemas/ServiceConnectorResponse" - }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ServiceConnectorResponse]" - }, - "Page_ServiceResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/ServiceResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[ServiceResponse]" - }, - "Page_StackResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - 
"title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/StackResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[StackResponse]" - }, - "Page_StepRunResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/StepRunResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[StepRunResponse]" - }, - "Page_TagResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[TagResponse]" - }, - "Page_TriggerExecutionResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { - "$ref": "#/components/schemas/TriggerExecutionResponse" - }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[TriggerExecutionResponse]" - }, - "Page_TriggerResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/TriggerResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[TriggerResponse]" - }, - "Page_UserResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/UserResponse" }, - "type": "array", - "title": "Items" - 
} - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[UserResponse]" - }, - "Page_WorkspaceResponse_": { - "properties": { - "index": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Index" - }, - "max_size": { - "type": "integer", - "exclusiveMinimum": 0.0, - "title": "Max Size" - }, - "total_pages": { - "type": "integer", - "minimum": 0.0, - "title": "Total Pages" - }, - "total": { "type": "integer", "minimum": 0.0, "title": "Total" }, - "items": { - "items": { "$ref": "#/components/schemas/WorkspaceResponse" }, - "type": "array", - "title": "Items" - } - }, - "type": "object", - "required": ["index", "max_size", "total_pages", "total", "items"], - "title": "Page[WorkspaceResponse]" - }, - "PartialArtifactConfiguration": { - "properties": { - "materializer_source": { - "anyOf": [ - { - "items": { "$ref": "#/components/schemas/Source" }, - "type": "array" - }, - { "type": "null" } - ], - "title": "Materializer Source" - }, - "default_materializer_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - } - }, - "additionalProperties": false, - "type": "object", - "title": "PartialArtifactConfiguration", - "description": "Class representing a partial input/output artifact configuration." - }, - "PipelineBuildBase": { - "properties": { - "images": { - "additionalProperties": { - "$ref": "#/components/schemas/BuildItem" - }, - "type": "object", - "title": "The images of this build.", - "default": {} - }, - "is_local": { - "type": "boolean", - "title": "Whether the build images are stored in a container registry or locally." - }, - "contains_code": { - "type": "boolean", - "title": "Whether any image of the build contains user code." - }, - "zenml_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The version of ZenML used for this build." - }, - "python_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The Python version used for this build." - } - }, - "type": "object", - "required": ["is_local", "contains_code"], - "title": "PipelineBuildBase", - "description": "Base model for pipeline builds." - }, - "PipelineBuildRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "images": { - "additionalProperties": { - "$ref": "#/components/schemas/BuildItem" - }, - "type": "object", - "title": "The images of this build.", - "default": {} - }, - "is_local": { - "type": "boolean", - "title": "Whether the build images are stored in a container registry or locally." - }, - "contains_code": { - "type": "boolean", - "title": "Whether any image of the build contains user code." - }, - "zenml_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The version of ZenML used for this build." - }, - "python_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The Python version used for this build." - }, - "checksum": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The build checksum." - }, - "stack_checksum": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The stack checksum." - }, - "stack": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The stack that was used for this build." 
- }, - "pipeline": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The pipeline that was used for this build." - } - }, - "type": "object", - "required": ["user", "workspace", "is_local", "contains_code"], - "title": "PipelineBuildRequest", - "description": "Request model for pipelines builds." - }, - "PipelineBuildResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineBuildResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineBuildResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineBuildResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "PipelineBuildResponse", - "description": "Response model for pipeline builds." - }, - "PipelineBuildResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - } - }, - "type": "object", - "required": ["created", "updated"], - "title": "PipelineBuildResponseBody", - "description": "Response body for pipeline builds." - }, - "PipelineBuildResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "pipeline": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineResponse" }, - { "type": "null" } - ], - "title": "The pipeline that was used for this build." - }, - "stack": { - "anyOf": [ - { "$ref": "#/components/schemas/StackResponse" }, - { "type": "null" } - ], - "title": "The stack that was used for this build." - }, - "images": { - "additionalProperties": { - "$ref": "#/components/schemas/BuildItem" - }, - "type": "object", - "title": "The images of this build.", - "default": {} - }, - "zenml_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The version of ZenML used for this build." - }, - "python_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The Python version used for this build." - }, - "checksum": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The build checksum." - }, - "stack_checksum": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The stack checksum." - }, - "is_local": { - "type": "boolean", - "title": "Whether the build images are stored in a container registry or locally." - }, - "contains_code": { - "type": "boolean", - "title": "Whether any image of the build contains user code." - } - }, - "type": "object", - "required": ["workspace", "is_local", "contains_code"], - "title": "PipelineBuildResponseMetadata", - "description": "Response metadata for pipeline builds." 
- }, - "PipelineBuildResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "PipelineBuildResponseResources", - "description": "Class for all resource models associated with the pipeline build entity." - }, - "PipelineConfiguration-Input": { - "properties": { - "enable_cache": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Cache" - }, - "enable_artifact_metadata": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Metadata" - }, - "enable_artifact_visualization": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Visualization" - }, - "enable_step_logs": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Step Logs" - }, - "settings": { - "additionalProperties": { - "$ref": "#/components/schemas/BaseSettings" - }, - "type": "object", - "title": "Settings", - "default": {} - }, - "extra": { "type": "object", "title": "Extra", "default": {} }, - "failure_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "success_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - }, - "parameters": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Parameters" - }, - "retry": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRetryConfig" }, - { "type": "null" } - ] - }, - "name": { "type": "string", "title": "Name" } - }, - "additionalProperties": false, - "type": "object", - "required": ["name"], - "title": "PipelineConfiguration", - "description": "Pipeline configuration class." - }, - "PipelineConfiguration-Output": { - "properties": { - "enable_cache": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Cache" - }, - "enable_artifact_metadata": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Metadata" - }, - "enable_artifact_visualization": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Visualization" - }, - "enable_step_logs": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Step Logs" - }, - "settings": { - "additionalProperties": { - "$ref": "#/components/schemas/BaseSettings" - }, - "type": "object", - "title": "Settings", - "default": {} - }, - "extra": { "type": "object", "title": "Extra", "default": {} }, - "failure_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "success_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - }, - "parameters": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Parameters" - }, - "retry": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRetryConfig" }, - { "type": "null" } - ] - }, - "name": { "type": "string", "title": "Name" } - }, - "additionalProperties": false, - "type": "object", - "required": ["name"], - "title": "PipelineConfiguration", - "description": "Pipeline configuration class." - }, - "PipelineDeploymentRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." 
- }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "run_name_template": { - "type": "string", - "title": "The run name template for runs created using this deployment." - }, - "pipeline_configuration": { - "allOf": [ - { "$ref": "#/components/schemas/PipelineConfiguration-Input" } - ], - "title": "The pipeline configuration for this deployment." - }, - "step_configurations": { - "additionalProperties": { - "$ref": "#/components/schemas/Step-Input" - }, - "type": "object", - "title": "The step configurations for this deployment.", - "default": {} - }, - "client_environment": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "The client environment for this deployment.", - "default": {} - }, - "client_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The version of the ZenML installation on the client side." - }, - "server_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The version of the ZenML installation on the server side." - }, - "pipeline_version_hash": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The pipeline version hash of the deployment." - }, - "pipeline_spec": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineSpec-Input" }, - { "type": "null" } - ], - "title": "The pipeline spec of the deployment." - }, - "stack": { - "type": "string", - "format": "uuid", - "title": "The stack associated with the deployment." - }, - "pipeline": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The pipeline associated with the deployment." - }, - "build": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The build associated with the deployment." - }, - "schedule": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The schedule associated with the deployment." - }, - "code_reference": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeReferenceRequest" }, - { "type": "null" } - ], - "title": "The code reference associated with the deployment." - }, - "code_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optional path where the code is stored in the artifact store." - }, - "template": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Template", - "description": "Template used for the deployment." - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "run_name_template", - "pipeline_configuration", - "stack" - ], - "title": "PipelineDeploymentRequest", - "description": "Request model for pipeline deployments." - }, - "PipelineDeploymentResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineDeploymentResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { - "$ref": "#/components/schemas/PipelineDeploymentResponseMetadata" - }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/PipelineDeploymentResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." 
- }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "PipelineDeploymentResponse", - "description": "Response model for pipeline deployments." - }, - "PipelineDeploymentResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - } - }, - "type": "object", - "required": ["created", "updated"], - "title": "PipelineDeploymentResponseBody", - "description": "Response body for pipeline deployments." - }, - "PipelineDeploymentResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "run_name_template": { - "type": "string", - "title": "The run name template for runs created using this deployment." - }, - "pipeline_configuration": { - "allOf": [ - { "$ref": "#/components/schemas/PipelineConfiguration-Output" } - ], - "title": "The pipeline configuration for this deployment." - }, - "step_configurations": { - "additionalProperties": { - "$ref": "#/components/schemas/Step-Output" - }, - "type": "object", - "title": "The step configurations for this deployment.", - "default": {} - }, - "client_environment": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "The client environment for this deployment.", - "default": {} - }, - "client_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The version of the ZenML installation on the client side." - }, - "server_version": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The version of the ZenML installation on the server side." - }, - "pipeline_version_hash": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The pipeline version hash of the deployment." - }, - "pipeline_spec": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineSpec-Output" }, - { "type": "null" } - ], - "title": "The pipeline spec of the deployment." - }, - "code_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optional path where the code is stored in the artifact store." - }, - "pipeline": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineResponse" }, - { "type": "null" } - ], - "title": "The pipeline associated with the deployment." - }, - "stack": { - "anyOf": [ - { "$ref": "#/components/schemas/StackResponse" }, - { "type": "null" } - ], - "title": "The stack associated with the deployment." - }, - "build": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineBuildResponse" }, - { "type": "null" } - ], - "title": "The pipeline build associated with the deployment." - }, - "schedule": { - "anyOf": [ - { "$ref": "#/components/schemas/ScheduleResponse" }, - { "type": "null" } - ], - "title": "The schedule associated with the deployment." - }, - "code_reference": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeReferenceResponse" }, - { "type": "null" } - ], - "title": "The code reference associated with the deployment." 
- }, - "template_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Template Id", - "description": "Template used for the pipeline run." - } - }, - "type": "object", - "required": [ - "workspace", - "run_name_template", - "pipeline_configuration", - "client_version", - "server_version" - ], - "title": "PipelineDeploymentResponseMetadata", - "description": "Response metadata for pipeline deployments." - }, - "PipelineDeploymentResponseResources": { - "properties": { - "triggers": { - "allOf": [{ "$ref": "#/components/schemas/Page_TriggerResponse_" }], - "title": "The triggers configured with this event source." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["triggers"], - "title": "PipelineDeploymentResponseResources", - "description": "Class for all resource models associated with the pipeline deployment entity." - }, - "PipelineRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the pipeline." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the pipeline." - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags of the pipeline." - } - }, - "type": "object", - "required": ["user", "workspace", "name"], - "title": "PipelineRequest", - "description": "Request model for pipelines." - }, - "PipelineResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the pipeline." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "PipelineResponse", - "description": "Response model for pipelines." - }, - "PipelineResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "latest_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the latest run of the pipeline." - }, - "latest_run_status": { - "anyOf": [ - { "$ref": "#/components/schemas/ExecutionStatus" }, - { "type": "null" } - ], - "title": "The status of the latest run of the pipeline." 
- } - }, - "type": "object", - "required": ["created", "updated"], - "title": "PipelineResponseBody", - "description": "Response body for pipelines." - }, - "PipelineResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "description": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The description of the pipeline." - } - }, - "type": "object", - "required": ["workspace"], - "title": "PipelineResponseMetadata", - "description": "Response metadata for pipelines." - }, - "PipelineResponseResources": { - "properties": { - "tags": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Tags associated with the pipeline." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["tags"], - "title": "PipelineResponseResources", - "description": "Class for all resource models associated with the pipeline entity." - }, - "PipelineRunConfiguration": { - "properties": { - "run_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Run Name" - }, - "enable_cache": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Cache" - }, - "enable_artifact_metadata": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Metadata" - }, - "enable_artifact_visualization": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Visualization" - }, - "enable_step_logs": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Step Logs" - }, - "schedule": { - "anyOf": [ - { "$ref": "#/components/schemas/Schedule" }, - { "type": "null" } - ] - }, - "build": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineBuildBase" }, - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Build" - }, - "steps": { - "additionalProperties": { - "$ref": "#/components/schemas/StepConfigurationUpdate" - }, - "type": "object", - "title": "Steps", - "default": {} - }, - "settings": { - "additionalProperties": { - "$ref": "#/components/schemas/BaseSettings" - }, - "type": "object", - "title": "Settings", - "default": {} - }, - "extra": { "type": "object", "title": "Extra", "default": {} }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - }, - "parameters": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Parameters" - }, - "retry": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRetryConfig" }, - { "type": "null" } - ] - }, - "failure_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "success_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - } - }, - "additionalProperties": false, - "type": "object", - "title": "PipelineRunConfiguration", - "description": "Class for pipeline run configurations." - }, - "PipelineRunRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the pipeline run." - }, - "deployment": { - "type": "string", - "format": "uuid", - "title": "The deployment associated with the pipeline run." 
- }, - "pipeline": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The pipeline associated with the pipeline run." - }, - "orchestrator_run_id": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The orchestrator run ID." - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The start time of the pipeline run." - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The end time of the pipeline run." - }, - "status": { - "allOf": [{ "$ref": "#/components/schemas/ExecutionStatus" }], - "title": "The status of the pipeline run." - }, - "client_environment": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "Environment of the client that initiated this pipeline run (OS, Python version, etc.).", - "default": {} - }, - "orchestrator_environment": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "Environment of the orchestrator that executed this pipeline run (OS, Python version, etc.).", - "default": {} - }, - "trigger_execution_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "ID of the trigger execution that triggered this run." - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags of the pipeline run." - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the model version that was configured by this pipeline run explicitly." - } - }, - "type": "object", - "required": ["user", "workspace", "name", "deployment", "status"], - "title": "PipelineRunRequest", - "description": "Request model for pipeline runs." - }, - "PipelineRunResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineRunResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineRunResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineRunResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the pipeline run." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "PipelineRunResponse", - "description": "Response model for pipeline runs." - }, - "PipelineRunResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "status": { - "allOf": [{ "$ref": "#/components/schemas/ExecutionStatus" }], - "title": "The status of the pipeline run." 
- }, - "stack": { - "anyOf": [ - { "$ref": "#/components/schemas/StackResponse" }, - { "type": "null" } - ], - "title": "The stack that was used for this run." - }, - "pipeline": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineResponse" }, - { "type": "null" } - ], - "title": "The pipeline this run belongs to." - }, - "build": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineBuildResponse" }, - { "type": "null" } - ], - "title": "The pipeline build that was used for this run." - }, - "schedule": { - "anyOf": [ - { "$ref": "#/components/schemas/ScheduleResponse" }, - { "type": "null" } - ], - "title": "The schedule that was used for this run." - }, - "code_reference": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeReferenceResponse" }, - { "type": "null" } - ], - "title": "The code reference that was used for this run." - }, - "deployment_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The deployment that was used for this run." - }, - "trigger_execution": { - "anyOf": [ - { "$ref": "#/components/schemas/TriggerExecutionResponse" }, - { "type": "null" } - ], - "title": "The trigger execution that triggered this run." - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the model version that was configured by this pipeline run explicitly." - } - }, - "type": "object", - "required": ["created", "updated", "status"], - "title": "PipelineRunResponseBody", - "description": "Response body for pipeline runs." - }, - "PipelineRunResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "run_metadata": { - "additionalProperties": { - "$ref": "#/components/schemas/RunMetadataResponse" - }, - "type": "object", - "title": "Metadata associated with this pipeline run.", - "default": {} - }, - "steps": { - "additionalProperties": { - "$ref": "#/components/schemas/StepRunResponse" - }, - "type": "object", - "title": "The steps of this run.", - "default": {} - }, - "config": { - "allOf": [ - { "$ref": "#/components/schemas/PipelineConfiguration-Output" } - ], - "title": "The pipeline configuration used for this pipeline run." - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The start time of the pipeline run." - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The end time of the pipeline run." - }, - "client_environment": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "Environment of the client that initiated this pipeline run (OS, Python version, etc.).", - "default": {} - }, - "orchestrator_environment": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "Environment of the orchestrator that executed this pipeline run (OS, Python version, etc.).", - "default": {} - }, - "orchestrator_run_id": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The orchestrator run ID." - }, - "code_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optional path where the code is stored in the artifact store." 
- }, - "template_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Template Id", - "description": "Template used for the pipeline run." - } - }, - "type": "object", - "required": ["workspace", "config"], - "title": "PipelineRunResponseMetadata", - "description": "Response metadata for pipeline runs." - }, - "PipelineRunResponseResources": { - "properties": { - "model_version": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelVersionResponse" }, - { "type": "null" } - ] - }, - "tags": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Tags associated with the pipeline run." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["tags"], - "title": "PipelineRunResponseResources", - "description": "Class for all resource models associated with the pipeline run entity." - }, - "PipelineRunUpdate": { - "properties": { - "status": { - "anyOf": [ - { "$ref": "#/components/schemas/ExecutionStatus" }, - { "type": "null" } - ] - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "End Time" - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the model version that was configured by this pipeline run explicitly." - }, - "add_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "New tags to add to the pipeline run." - }, - "remove_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags to remove from the pipeline run." - } - }, - "type": "object", - "title": "PipelineRunUpdate", - "description": "Pipeline run update model." - }, - "PipelineSpec-Input": { - "properties": { - "version": { "type": "string", "title": "Version", "default": "0.4" }, - "source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "parameters": { - "type": "object", - "title": "Parameters", - "default": {} - }, - "steps": { - "items": { "$ref": "#/components/schemas/StepSpec-Input" }, - "type": "array", - "title": "Steps" - } - }, - "additionalProperties": false, - "type": "object", - "required": ["steps"], - "title": "PipelineSpec", - "description": "Specification of a pipeline." - }, - "PipelineSpec-Output": { - "properties": { - "version": { "type": "string", "title": "Version", "default": "0.4" }, - "source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "parameters": { - "type": "object", - "title": "Parameters", - "default": {} - }, - "steps": { - "items": { "$ref": "#/components/schemas/StepSpec-Output" }, - "type": "array", - "title": "Steps" - } - }, - "additionalProperties": false, - "type": "object", - "required": ["steps"], - "title": "PipelineSpec", - "description": "Specification of a pipeline." - }, - "PipelineUpdate": { - "properties": { - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the pipeline." - }, - "add_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "New tags to add to the pipeline." - }, - "remove_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags to remove from the pipeline." 
- } - }, - "type": "object", - "title": "PipelineUpdate", - "description": "Update model for pipelines." - }, - "PluginType": { - "type": "string", - "enum": ["event_source", "action"], - "title": "PluginType", - "description": "All possible types of Plugins." - }, - "ResourceTypeModel": { - "properties": { - "name": { - "type": "string", - "title": "User readable name for the resource type." - }, - "resource_type": { - "type": "string", - "title": "Resource type identifier." - }, - "description": { - "type": "string", - "title": "A description of the resource type.", - "default": "" - }, - "auth_methods": { - "items": { "type": "string" }, - "type": "array", - "title": "The list of authentication methods that can be used to access resources of this type." - }, - "supports_instances": { - "type": "boolean", - "title": "Specifies if a single connector instance can be used to access multiple instances of this resource type. If set to True, the connector is able to provide a list of resource IDs identifying all the resources that it can access and a resource ID needs to be explicitly configured or supplied when access to a resource is requested. If set to False, a connector instance is only able to access a single resource and a resource ID is not required to access the resource.", - "default": false - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a URL pointing to a png,svg or jpg file can be attached." - }, - "emoji": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a python-rich emoji can be attached." - } - }, - "type": "object", - "required": ["name", "resource_type", "auth_methods"], - "title": "ResourceTypeModel", - "description": "Resource type specification.\n\nDescribes the authentication methods and resource instantiation model for\none or more resource types." - }, - "ResourcesInfo": { - "properties": { - "flavor": { "type": "string", "title": "Flavor" }, - "flavor_display_name": { - "type": "string", - "title": "Flavor Display Name" - }, - "required_configuration": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "Required Configuration", - "default": {} - }, - "use_resource_value_as_fixed_config": { - "type": "boolean", - "title": "Use Resource Value As Fixed Config", - "default": false - }, - "accessible_by_service_connector": { - "items": { "type": "string" }, - "type": "array", - "title": "Accessible By Service Connector" - }, - "connected_through_service_connector": { - "items": { "$ref": "#/components/schemas/ComponentResponse" }, - "type": "array", - "title": "Connected Through Service Connector" - } - }, - "type": "object", - "required": [ - "flavor", - "flavor_display_name", - "accessible_by_service_connector", - "connected_through_service_connector" - ], - "title": "ResourcesInfo", - "description": "Information about the resources needed for CLI and UI." - }, - "RunMetadataRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "resource_id": { - "type": "string", - "format": "uuid", - "title": "The ID of the resource that this metadata belongs to." - }, - "resource_type": { - "allOf": [{ "$ref": "#/components/schemas/MetadataResourceTypes" }], - "title": "The type of the resource that this metadata belongs to." 
- }, - "stack_component_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the stack component that this metadata belongs to." - }, - "values": { - "additionalProperties": { - "anyOf": [ - { "type": "string" }, - { "type": "integer" }, - { "type": "number" }, - { "type": "boolean" }, - { "type": "object" }, - { "items": {}, "type": "array" }, - { "items": {}, "type": "array", "uniqueItems": true } - ] - }, - "type": "object", - "title": "The metadata to be created." - }, - "types": { - "additionalProperties": { - "$ref": "#/components/schemas/MetadataTypeEnum" - }, - "type": "object", - "title": "The types of the metadata to be created." - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "resource_id", - "resource_type", - "stack_component_id", - "values", - "types" - ], - "title": "RunMetadataRequest", - "description": "Request model for run metadata." - }, - "RunMetadataResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/RunMetadataResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/RunMetadataResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/RunMetadataResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "RunMetadataResponse", - "description": "Response model for run metadata." - }, - "RunMetadataResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "key": { "type": "string", "title": "The key of the metadata." }, - "value": { - "anyOf": [ - { "type": "string" }, - { "type": "integer" }, - { "type": "number" }, - { "type": "boolean" }, - { "type": "object" }, - { "items": {}, "type": "array" }, - { "items": {}, "type": "array", "uniqueItems": true } - ], - "title": "The value of the metadata." - }, - "type": { - "allOf": [{ "$ref": "#/components/schemas/MetadataTypeEnum" }], - "title": "The type of the metadata." - } - }, - "type": "object", - "required": ["created", "updated", "key", "value", "type"], - "title": "RunMetadataResponseBody", - "description": "Response body for run metadata." - }, - "RunMetadataResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "resource_id": { - "type": "string", - "format": "uuid", - "title": "The ID of the resource that this metadata belongs to." - }, - "resource_type": { - "allOf": [{ "$ref": "#/components/schemas/MetadataResourceTypes" }], - "title": "The type of the resource that this metadata belongs to." 
- }, - "stack_component_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the stack component that this metadata belongs to." - } - }, - "type": "object", - "required": [ - "workspace", - "resource_id", - "resource_type", - "stack_component_id" - ], - "title": "RunMetadataResponseMetadata", - "description": "Response metadata for run metadata." - }, - "RunMetadataResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "RunMetadataResponseResources", - "description": "Class for all resource models associated with the run metadata entity." - }, - "RunTemplateRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the run template." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the run template." - }, - "source_deployment_id": { - "type": "string", - "format": "uuid", - "title": "The deployment that should be the base of the created template." - }, - "tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags of the run template." - } - }, - "type": "object", - "required": ["user", "workspace", "name", "source_deployment_id"], - "title": "RunTemplateRequest", - "description": "Request model for run templates." - }, - "RunTemplateResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/RunTemplateResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/RunTemplateResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/RunTemplateResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the run template." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "RunTemplateResponse", - "description": "Response model for run templates." - }, - "RunTemplateResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "runnable": { - "type": "boolean", - "title": "If a run can be started from the template." - }, - "latest_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the latest run of the run template." 
- }, - "latest_run_status": { - "anyOf": [ - { "$ref": "#/components/schemas/ExecutionStatus" }, - { "type": "null" } - ], - "title": "The status of the latest run of the run template." - } - }, - "type": "object", - "required": ["created", "updated", "runnable"], - "title": "RunTemplateResponseBody", - "description": "Response body for run templates." - }, - "RunTemplateResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "description": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The description of the run template." - }, - "pipeline_spec": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineSpec-Output" }, - { "type": "null" } - ], - "title": "The spec of the pipeline." - }, - "config_template": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Run configuration template." - }, - "config_schema": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Run configuration schema." - } - }, - "type": "object", - "required": ["workspace"], - "title": "RunTemplateResponseMetadata", - "description": "Response metadata for run templates." - }, - "RunTemplateResponseResources": { - "properties": { - "source_deployment": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineDeploymentResponse" }, - { "type": "null" } - ], - "title": "The deployment that is the source of the template." - }, - "pipeline": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineResponse" }, - { "type": "null" } - ], - "title": "The pipeline associated with the template." - }, - "build": { - "anyOf": [ - { "$ref": "#/components/schemas/PipelineBuildResponse" }, - { "type": "null" } - ], - "title": "The pipeline build associated with the template." - }, - "code_reference": { - "anyOf": [ - { "$ref": "#/components/schemas/CodeReferenceResponse" }, - { "type": "null" } - ], - "title": "The code reference associated with the template." - }, - "tags": { - "items": { "$ref": "#/components/schemas/TagResponse" }, - "type": "array", - "title": "Tags associated with the run template." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["tags"], - "title": "RunTemplateResponseResources", - "description": "All resource models associated with the run template." - }, - "RunTemplateUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the run template." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The description of the run template." - }, - "add_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "New tags to add to the run template." - }, - "remove_tags": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "Tags to remove from the run template." - } - }, - "type": "object", - "title": "RunTemplateUpdate", - "description": "Run template update model." 
- }, - "Schedule": { - "properties": { - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "cron_expression": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Cron Expression" - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Start Time" - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "End Time" - }, - "interval_second": { - "anyOf": [ - { "type": "string", "format": "duration" }, - { "type": "null" } - ], - "title": "Interval Second" - }, - "catchup": { - "type": "boolean", - "title": "Catchup", - "default": false - }, - "run_once_start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Run Once Start Time" - } - }, - "type": "object", - "title": "Schedule", - "description": "Class for defining a pipeline schedule.\n\nAttributes:\n name: Optional name to give to the schedule. If not set, a default name\n will be generated based on the pipeline name and the current date\n and time.\n cron_expression: Cron expression for the pipeline schedule. If a value\n for this is set it takes precedence over the start time + interval.\n start_time: datetime object to indicate when to start the schedule.\n end_time: datetime object to indicate when to end the schedule.\n interval_second: datetime timedelta indicating the seconds between two\n recurring runs for a periodic schedule.\n catchup: Whether the recurring run should catch up if behind schedule.\n For example, if the recurring run is paused for a while and\n re-enabled afterward. If catchup=True, the scheduler will catch\n up on (backfill) each missed interval. Otherwise, it only\n schedules the latest interval if more than one interval is ready to\n be scheduled. Usually, if your pipeline handles backfill\n internally, you should turn catchup off to avoid duplicate backfill.\n run_once_start_time: datetime object to indicate when to run the\n pipeline once. This is useful for one-off runs." - }, - "ScheduleRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." 
- }, - "name": { "type": "string", "title": "Name" }, - "active": { "type": "boolean", "title": "Active" }, - "cron_expression": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Cron Expression" - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Start Time" - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "End Time" - }, - "interval_second": { - "anyOf": [ - { "type": "string", "format": "duration" }, - { "type": "null" } - ], - "title": "Interval Second" - }, - "catchup": { - "type": "boolean", - "title": "Catchup", - "default": false - }, - "run_once_start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Run Once Start Time" - }, - "orchestrator_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Orchestrator Id" - }, - "pipeline_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "name", - "active", - "orchestrator_id", - "pipeline_id" - ], - "title": "ScheduleRequest", - "description": "Request model for schedules." - }, - "ScheduleResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ScheduleResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ScheduleResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/ScheduleResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "Name of this schedule." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ScheduleResponse", - "description": "Response model for schedules." - }, - "ScheduleResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." 
- }, - "active": { "type": "boolean", "title": "Active" }, - "cron_expression": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Cron Expression" - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Start Time" - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "End Time" - }, - "interval_second": { - "anyOf": [ - { "type": "string", "format": "duration" }, - { "type": "null" } - ], - "title": "Interval Second" - }, - "catchup": { - "type": "boolean", - "title": "Catchup", - "default": false - }, - "run_once_start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Run Once Start Time" - } - }, - "type": "object", - "required": ["created", "updated", "active"], - "title": "ScheduleResponseBody", - "description": "Response body for schedules." - }, - "ScheduleResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "orchestrator_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Orchestrator Id" - }, - "pipeline_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - "type": "object", - "required": ["workspace", "orchestrator_id", "pipeline_id"], - "title": "ScheduleResponseMetadata", - "description": "Response metadata for schedules." - }, - "ScheduleResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ScheduleResponseResources", - "description": "Class for all resource models associated with the schedule entity." - }, - "ScheduleUpdate": { - "properties": { - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "active": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Active" - }, - "cron_expression": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Cron Expression" - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Start Time" - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "End Time" - }, - "interval_second": { - "anyOf": [ - { "type": "string", "format": "duration" }, - { "type": "null" } - ], - "title": "Interval Second" - }, - "catchup": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Catchup" - }, - "run_once_start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Run Once Start Time" - }, - "orchestrator_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Orchestrator Id" - }, - "pipeline_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Pipeline Id" - } - }, - "type": "object", - "title": "ScheduleUpdate", - "description": "Update model for schedules." - }, - "SecretRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." 
- }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the secret." - }, - "scope": { - "allOf": [{ "$ref": "#/components/schemas/SecretScope" }], - "title": "The scope of the secret.", - "default": "workspace" - }, - "values": { - "additionalProperties": { - "anyOf": [ - { "type": "string", "format": "password", "writeOnly": true }, - { "type": "null" } - ] - }, - "type": "object", - "title": "The values stored in this secret." - } - }, - "type": "object", - "required": ["user", "workspace", "name"], - "title": "SecretRequest", - "description": "Request models for secrets." - }, - "SecretResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/SecretResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/SecretResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/SecretResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the secret." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "SecretResponse", - "description": "Response model for secrets." - }, - "SecretResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "scope": { - "allOf": [{ "$ref": "#/components/schemas/SecretScope" }], - "title": "The scope of the secret.", - "default": "workspace" - }, - "values": { - "additionalProperties": { - "anyOf": [ - { "type": "string", "format": "password", "writeOnly": true }, - { "type": "null" } - ] - }, - "type": "object", - "title": "The values stored in this secret." - } - }, - "type": "object", - "required": ["created", "updated"], - "title": "SecretResponseBody", - "description": "Response body for secrets." - }, - "SecretResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - } - }, - "type": "object", - "required": ["workspace"], - "title": "SecretResponseMetadata", - "description": "Response metadata for secrets." - }, - "SecretResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "SecretResponseResources", - "description": "Class for all resource models associated with the secret entity." - }, - "SecretScope": { - "type": "string", - "enum": ["workspace", "user"], - "title": "SecretScope", - "description": "Enum for the scope of a secret." - }, - "SecretUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the secret." 
- }, - "scope": { - "anyOf": [ - { "$ref": "#/components/schemas/SecretScope" }, - { "type": "null" } - ], - "title": "The scope of the secret." - }, - "values": { - "anyOf": [ - { - "additionalProperties": { - "anyOf": [ - { - "type": "string", - "format": "password", - "writeOnly": true - }, - { "type": "null" } - ] - }, - "type": "object" - }, - { "type": "null" } - ], - "title": "The values stored in this secret." - } - }, - "type": "object", - "title": "SecretUpdate", - "description": "Secret update model." - }, - "SecretsStoreType": { - "type": "string", - "enum": ["none", "sql", "aws", "gcp", "azure", "hashicorp", "custom"], - "title": "SecretsStoreType", - "description": "Secrets Store Backend Types." - }, - "ServerDatabaseType": { - "type": "string", - "enum": ["sqlite", "mysql", "other"], - "title": "ServerDatabaseType", - "description": "Enum for server database types." - }, - "ServerDeploymentType": { - "type": "string", - "enum": [ - "local", - "docker", - "kubernetes", - "aws", - "gcp", - "azure", - "alpha", - "other", - "hf_spaces", - "sandbox", - "cloud" - ], - "title": "ServerDeploymentType", - "description": "Enum for server deployment types." - }, - "ServerModel": { - "properties": { - "id": { - "type": "string", - "format": "uuid", - "title": "The unique server id." - }, - "version": { - "type": "string", - "title": "The ZenML version that the server is running." - }, - "active": { - "type": "boolean", - "title": "Flag to indicate whether the server is active.", - "default": true - }, - "debug": { - "type": "boolean", - "title": "Flag to indicate whether ZenML is running on debug mode.", - "default": false - }, - "deployment_type": { - "allOf": [{ "$ref": "#/components/schemas/ServerDeploymentType" }], - "title": "The ZenML server deployment type.", - "default": "other" - }, - "database_type": { - "allOf": [{ "$ref": "#/components/schemas/ServerDatabaseType" }], - "title": "The database type that the server is using.", - "default": "other" - }, - "secrets_store_type": { - "allOf": [{ "$ref": "#/components/schemas/SecretsStoreType" }], - "title": "The type of secrets store that the server is using.", - "default": "none" - }, - "auth_scheme": { - "allOf": [{ "$ref": "#/components/schemas/AuthScheme" }], - "title": "The authentication scheme that the server is using." - }, - "server_url": { - "type": "string", - "title": "The URL where the ZenML server API is reachable. If not specified, the clients will use the same URL used to connect them to the ZenML server.", - "default": "" - }, - "dashboard_url": { - "type": "string", - "title": "The URL where the ZenML dashboard is reachable. If not specified, the `server_url` value will be used instead.", - "default": "" - }, - "analytics_enabled": { - "type": "boolean", - "title": "Enable server-side analytics.", - "default": true - }, - "metadata": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "The metadata associated with the server.", - "default": {} - }, - "use_legacy_dashboard": { - "type": "boolean", - "title": "Flag to indicate whether the server is using the legacy dashboard.", - "default": false - }, - "last_user_activity": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Timestamp of latest user activity traced on the server." - } - }, - "type": "object", - "required": ["version", "auth_scheme"], - "title": "ServerModel", - "description": "Domain model for ZenML servers." 
- }, - "ServiceAccountRequest": { - "properties": { - "name": { - "type": "string", - "maxLength": 255, - "title": "The unique name for the service account." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "A description of the service account." - }, - "active": { - "type": "boolean", - "title": "Whether the service account is active or not." - } - }, - "type": "object", - "required": ["name", "active"], - "title": "ServiceAccountRequest", - "description": "Request model for service accounts." - }, - "ServiceAccountResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceAccountResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceAccountResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/ServiceAccountResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The unique username for the account." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ServiceAccountResponse", - "description": "Response model for service accounts." - }, - "ServiceAccountResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "active": { - "type": "boolean", - "title": "Whether the account is active.", - "default": false - } - }, - "type": "object", - "required": ["created", "updated"], - "title": "ServiceAccountResponseBody", - "description": "Response body for service accounts." - }, - "ServiceAccountResponseMetadata": { - "properties": { - "description": { - "type": "string", - "maxLength": 65535, - "title": "A description of the service account.", - "default": "" - } - }, - "type": "object", - "title": "ServiceAccountResponseMetadata", - "description": "Response metadata for service accounts." - }, - "ServiceAccountResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ServiceAccountResponseResources", - "description": "Class for all resource models associated with the service account entity." - }, - "ServiceAccountUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The unique name for the service account." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "A description of the service account." - }, - "active": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Whether the service account is active or not." - } - }, - "type": "object", - "title": "ServiceAccountUpdate", - "description": "Update model for service accounts." 
- }, - "ServiceConnectorInfo": { - "properties": { - "type": { "type": "string", "title": "Type" }, - "auth_method": { "type": "string", "title": "Auth Method" }, - "configuration": { - "type": "object", - "title": "Configuration", - "default": {} - } - }, - "type": "object", - "required": ["type", "auth_method"], - "title": "ServiceConnectorInfo", - "description": "Information about the service connector when creating a full stack." - }, - "ServiceConnectorRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The service connector name." - }, - "connector_type": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/ServiceConnectorTypeModel" } - ], - "title": "The type of service connector." - }, - "description": { - "type": "string", - "title": "The service connector instance description.", - "default": "" - }, - "auth_method": { - "type": "string", - "maxLength": 255, - "title": "The authentication method that the connector instance uses to access the resources." - }, - "resource_types": { - "items": { "type": "string" }, - "type": "array", - "title": "The type(s) of resource that the connector instance can be used to gain access to." - }, - "resource_id": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "Uniquely identifies a specific resource instance that the connector instance can be used to access. If omitted, the connector instance can be used to access any and all resource instances that the authentication method and resource type(s) allow." - }, - "supports_instances": { - "type": "boolean", - "title": "Indicates whether the connector instance can be used to access multiple instances of the configured resource type.", - "default": false - }, - "expires_at": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Time when the authentication credentials configured for the connector expire. If omitted, the credentials do not expire." - }, - "expires_skew_tolerance": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The number of seconds of tolerance to apply when checking whether the authentication credentials configured for the connector have expired. If omitted, no tolerance is applied." - }, - "expiration_seconds": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The duration, in seconds, that the temporary credentials generated by this connector should remain valid. Only applicable for connectors and authentication methods that involve generating temporary credentials from the ones configured in the connector." - }, - "configuration": { - "type": "object", - "title": "The service connector configuration, not including secrets." - }, - "secrets": { - "additionalProperties": { - "anyOf": [ - { "type": "string", "format": "password", "writeOnly": true }, - { "type": "null" } - ] - }, - "type": "object", - "title": "The service connector secrets." - }, - "labels": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "Service connector labels." 
- } - }, - "type": "object", - "required": [ - "user", - "workspace", - "name", - "connector_type", - "auth_method" - ], - "title": "ServiceConnectorRequest", - "description": "Request model for service connectors." - }, - "ServiceConnectorResourcesInfo": { - "properties": { - "connector_type": { "type": "string", "title": "Connector Type" }, - "components_resources_info": { - "additionalProperties": { - "items": { "$ref": "#/components/schemas/ResourcesInfo" }, - "type": "array" - }, - "type": "object", - "title": "Components Resources Info" - } - }, - "type": "object", - "required": ["connector_type", "components_resources_info"], - "title": "ServiceConnectorResourcesInfo", - "description": "Information about the service connector resources needed for CLI and UI." - }, - "ServiceConnectorResourcesModel": { - "properties": { - "id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the service connector instance providing this resource." - }, - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the service connector instance providing this resource." - }, - "connector_type": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/ServiceConnectorTypeModel" } - ], - "title": "The type of service connector." - }, - "resources": { - "items": { - "$ref": "#/components/schemas/ServiceConnectorTypedResourcesModel" - }, - "type": "array", - "title": "The list of resources that the service connector instance can give access to. Contains one entry for every resource type that the connector is configured for." - }, - "error": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "A global error message describing why the service connector instance could not authenticate to the remote service." - } - }, - "type": "object", - "required": ["connector_type"], - "title": "ServiceConnectorResourcesModel", - "description": "Service connector resources list.\n\nLists the resource types and resource instances that a service connector\ncan provide access to." - }, - "ServiceConnectorResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceConnectorResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { - "$ref": "#/components/schemas/ServiceConnectorResponseMetadata" - }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/ServiceConnectorResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The service connector name." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ServiceConnectorResponse", - "description": "Response model for service connectors." - }, - "ServiceConnectorResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." 
- }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "description": { - "type": "string", - "title": "The service connector instance description.", - "default": "" - }, - "connector_type": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/ServiceConnectorTypeModel" } - ], - "title": "The type of service connector." - }, - "auth_method": { - "type": "string", - "maxLength": 255, - "title": "The authentication method that the connector instance uses to access the resources." - }, - "resource_types": { - "items": { "type": "string" }, - "type": "array", - "title": "The type(s) of resource that the connector instance can be used to gain access to." - }, - "resource_id": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "Uniquely identifies a specific resource instance that the connector instance can be used to access. If omitted, the connector instance can be used to access any and all resource instances that the authentication method and resource type(s) allow." - }, - "supports_instances": { - "type": "boolean", - "title": "Indicates whether the connector instance can be used to access multiple instances of the configured resource type.", - "default": false - }, - "expires_at": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Time when the authentication credentials configured for the connector expire. If omitted, the credentials do not expire." - }, - "expires_skew_tolerance": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The number of seconds of tolerance to apply when checking whether the authentication credentials configured for the connector have expired. If omitted, no tolerance is applied." - } - }, - "type": "object", - "required": ["created", "updated", "connector_type", "auth_method"], - "title": "ServiceConnectorResponseBody", - "description": "Response body for service connectors." - }, - "ServiceConnectorResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "configuration": { - "type": "object", - "title": "The service connector configuration, not including secrets." - }, - "secret_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the secret that contains the service connector secret configuration values." - }, - "expiration_seconds": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The duration, in seconds, that the temporary credentials generated by this connector should remain valid. Only applicable for connectors and authentication methods that involve generating temporary credentials from the ones configured in the connector." - }, - "secrets": { - "additionalProperties": { - "anyOf": [ - { "type": "string", "format": "password", "writeOnly": true }, - { "type": "null" } - ] - }, - "type": "object", - "title": "The service connector secrets." - }, - "labels": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "Service connector labels." - } - }, - "type": "object", - "required": ["workspace"], - "title": "ServiceConnectorResponseMetadata", - "description": "Response metadata for service connectors." 
- }, - "ServiceConnectorResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ServiceConnectorResponseResources", - "description": "Class for all resource models associated with the service connector entity." - }, - "ServiceConnectorTypeModel": { - "properties": { - "name": { - "type": "string", - "title": "User readable name for the service connector type." - }, - "connector_type": { - "type": "string", - "maxLength": 255, - "title": "The type of service connector. It can be used to represent a generic resource (e.g. Docker, Kubernetes) or a group of different resources accessible through a common interface or point of access and authentication (e.g. a cloud provider or a platform)." - }, - "description": { - "type": "string", - "title": "A description of the service connector.", - "default": "" - }, - "resource_types": { - "items": { "$ref": "#/components/schemas/ResourceTypeModel" }, - "type": "array", - "title": "A list of resource types that the connector can be used to access." - }, - "auth_methods": { - "items": { - "$ref": "#/components/schemas/AuthenticationMethodModel" - }, - "type": "array", - "title": "A list of specifications describing the authentication methods that are supported by the service connector, along with the configuration and secrets attributes that need to be configured for them." - }, - "supports_auto_configuration": { - "type": "boolean", - "title": "Models if the connector can be configured automatically based on information extracted from a local environment.", - "default": false - }, - "logo_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a URL pointing to a png,svg or jpg can be attached." - }, - "emoji": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a python-rich emoji can be attached." - }, - "docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a URL pointing to docs, within docs.zenml.io." - }, - "sdk_docs_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Optionally, a URL pointing to SDK docs,within sdkdocs.zenml.io." - }, - "local": { - "type": "boolean", - "title": "If True, the service connector is available locally.", - "default": true - }, - "remote": { - "type": "boolean", - "title": "If True, the service connector is available remotely.", - "default": false - } - }, - "type": "object", - "required": [ - "name", - "connector_type", - "resource_types", - "auth_methods" - ], - "title": "ServiceConnectorTypeModel", - "description": "Service connector type specification.\n\nDescribes the types of resources to which the service connector can be used\nto gain access and the authentication methods that are supported by the\nservice connector.\n\nThe connector type, resource types, resource IDs and authentication\nmethods can all be used as search criteria to lookup and filter service\nconnector instances that are compatible with the requirements of a consumer\n(e.g. a stack component)." - }, - "ServiceConnectorTypedResourcesModel": { - "properties": { - "resource_type": { - "type": "string", - "maxLength": 255, - "title": "The type of resource that the service connector instance can be used to access." - }, - "resource_ids": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "The resource IDs of all resource instances that the service connector instance can be used to access. 
Omitted (set to None) for multi-type service connectors that didn't explicitly request to fetch resources for all resource types. Also omitted if an error occurred while listing the resource instances or if no resources are listed due to authorization issues or lack of permissions (in both cases the 'error' field is set to an error message). For resource types that do not support multiple instances, a single resource ID is listed." - }, - "error": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "An error message describing why the service connector instance could not list the resources that it is configured to access." - } - }, - "type": "object", - "required": ["resource_type"], - "title": "ServiceConnectorTypedResourcesModel", - "description": "Service connector typed resources list.\n\nLists the resource instances that a service connector can provide\naccess to." - }, - "ServiceConnectorUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The service connector name." - }, - "connector_type": { - "anyOf": [ - { "type": "string" }, - { "$ref": "#/components/schemas/ServiceConnectorTypeModel" }, - { "type": "null" } - ], - "title": "The type of service connector." - }, - "description": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The service connector instance description." - }, - "auth_method": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The authentication method that the connector instance uses to access the resources." - }, - "resource_types": { - "anyOf": [ - { "items": { "type": "string" }, "type": "array" }, - { "type": "null" } - ], - "title": "The type(s) of resource that the connector instance can be used to gain access to." - }, - "resource_id": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "Uniquely identifies a specific resource instance that the connector instance can be used to access. If omitted, the connector instance can be used to access any and all resource instances that the authentication method and resource type(s) allow." - }, - "supports_instances": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Indicates whether the connector instance can be used to access multiple instances of the configured resource type." - }, - "expires_at": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "Time when the authentication credentials configured for the connector expire. If omitted, the credentials do not expire." - }, - "expires_skew_tolerance": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The number of seconds of tolerance to apply when checking whether the authentication credentials configured for the connector have expired. If omitted, no tolerance is applied." - }, - "expiration_seconds": { - "anyOf": [{ "type": "integer" }, { "type": "null" }], - "title": "The duration, in seconds, that the temporary credentials generated by this connector should remain valid. Only applicable for connectors and authentication methods that involve generating temporary credentials from the ones configured in the connector." - }, - "configuration": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The service connector configuration, not including secrets." 
- }, - "secrets": { - "anyOf": [ - { - "additionalProperties": { - "anyOf": [ - { - "type": "string", - "format": "password", - "writeOnly": true - }, - { "type": "null" } - ] - }, - "type": "object" - }, - { "type": "null" } - ], - "title": "The service connector secrets." - }, - "labels": { - "anyOf": [ - { - "additionalProperties": { "type": "string" }, - "type": "object" - }, - { "type": "null" } - ], - "title": "Service connector labels." - } - }, - "type": "object", - "title": "ServiceConnectorUpdate", - "description": "Model used for service connector updates.\n\nMost fields in the update model are optional and will not be updated if\nomitted. However, the following fields are \"special\" and leaving them out\nwill also cause the corresponding value to be removed from the service\nconnector in the database:\n\n* the `resource_id` field\n* the `expiration_seconds` field\n\nIn addition to the above exceptions, the following rules apply:\n\n* the `configuration` and `secrets` fields together represent a full\nvalid configuration update, not just a partial update. If either is\nset (i.e. not None) in the update, their values are merged together and\nwill replace the existing configuration and secrets values.\n* the `secret_id` field value in the update is ignored, given that\nsecrets are managed internally by the ZenML store.\n* the `labels` field is also a full labels update: if set (i.e. not\n`None`), all existing labels are removed and replaced by the new labels\nin the update.\n\nNOTE: the attributes here override the ones in the base class, so they\nhave a None default value." - }, - "ServiceRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the service." - }, - "service_type": { - "allOf": [{ "$ref": "#/components/schemas/ServiceType" }], - "title": "The type of the service." - }, - "service_source": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The class of the service.", - "description": "The fully qualified class name of the service implementation." - }, - "admin_state": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceState" }, - { "type": "null" } - ], - "title": "The admin state of the service.", - "description": "The administrative state of the service, e.g., ACTIVE, INACTIVE." - }, - "config": { - "type": "object", - "title": "The service config.", - "description": "A dictionary containing configuration parameters for the service." - }, - "labels": { - "anyOf": [ - { - "additionalProperties": { "type": "string" }, - "type": "object" - }, - { "type": "null" } - ], - "title": "The service labels." - }, - "status": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The status of the service." - }, - "endpoint": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The service endpoint." - }, - "prediction_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The service endpoint URL." - }, - "health_check_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The service health check URL." - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The model version id linked to the service." 
- }, - "pipeline_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "string" }, - { "type": "null" } - ], - "title": "Pipeline Run Id", - "description": "By the event source this trigger is attached to." - } - }, - "type": "object", - "required": ["user", "workspace", "name", "service_type", "config"], - "title": "ServiceRequest", - "description": "Request model for services." - }, - "ServiceResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the service." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "ServiceResponse", - "description": "Response model for services." - }, - "ServiceResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this component was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this component was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "service_type": { - "allOf": [{ "$ref": "#/components/schemas/ServiceType" }], - "title": "The type of the service." - }, - "labels": { - "anyOf": [ - { - "additionalProperties": { "type": "string" }, - "type": "object" - }, - { "type": "null" } - ], - "title": "The service labels." - }, - "state": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceState" }, - { "type": "null" } - ], - "title": "The current state of the service." - } - }, - "type": "object", - "required": ["created", "updated", "service_type"], - "title": "ServiceResponseBody", - "description": "Response body for services." - }, - "ServiceResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "service_source": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The class of the service." - }, - "admin_state": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceState" }, - { "type": "null" } - ], - "title": "The admin state of the service." - }, - "config": { "type": "object", "title": "The service config." }, - "status": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The status of the service." - }, - "endpoint": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The service endpoint." - }, - "prediction_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The service endpoint URL." - }, - "health_check_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The service health check URL." 
- } - }, - "type": "object", - "required": [ - "workspace", - "service_source", - "admin_state", - "config", - "status" - ], - "title": "ServiceResponseMetadata", - "description": "Response metadata for services." - }, - "ServiceResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "ServiceResponseResources", - "description": "Class for all resource models associated with the service entity." - }, - "ServiceState": { - "type": "string", - "enum": [ - "inactive", - "active", - "pending_startup", - "pending_shutdown", - "error", - "scaled_to_zero" - ], - "title": "ServiceState", - "description": "Possible states for the service and service endpoint." - }, - "ServiceType": { - "properties": { - "type": { "type": "string", "title": "Type" }, - "flavor": { "type": "string", "title": "Flavor" }, - "name": { "type": "string", "title": "Name", "default": "" }, - "description": { - "type": "string", - "title": "Description", - "default": "" - }, - "logo_url": { "type": "string", "title": "Logo Url", "default": "" } - }, - "type": "object", - "required": ["type", "flavor"], - "title": "ServiceType", - "description": "Service type descriptor.\n\nAttributes:\n type: service type\n flavor: service flavor\n name: name of the service type\n description: description of the service type\n logo_url: logo of the service type" - }, - "ServiceUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the service." - }, - "admin_state": { - "anyOf": [ - { "$ref": "#/components/schemas/ServiceState" }, - { "type": "null" } - ], - "title": "The admin state of the service.", - "description": "The administrative state of the service, e.g., ACTIVE, INACTIVE." - }, - "service_source": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The class of the service.", - "description": "The fully qualified class name of the service implementation." - }, - "status": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The status of the service." - }, - "endpoint": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The service endpoint." - }, - "prediction_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The service endpoint URL." - }, - "health_check_url": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The service health check URL." - }, - "labels": { - "anyOf": [ - { - "additionalProperties": { "type": "string" }, - "type": "object" - }, - { "type": "null" } - ], - "title": "The service labels." - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The model version id linked to the service." - } - }, - "type": "object", - "title": "ServiceUpdate", - "description": "Update model for stack components." - }, - "Source": { - "properties": { - "module": { "type": "string", "title": "Module" }, - "attribute": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Attribute" - }, - "type": { "$ref": "#/components/schemas/SourceType" } - }, - "additionalProperties": true, - "type": "object", - "required": ["module", "type"], - "title": "Source", - "description": "Source specification.\n\nA source specifies a module name as well as an optional attribute of that\nmodule. 
These values can be used to import the module and get the value\nof the attribute inside the module.\n\nExample:\n The source `Source(module=\"zenml.config.source\", attribute=\"Source\")`\n references the class that this docstring is describing. This class is\n defined in the `zenml.config.source` module and the name of the\n attribute is the class name `Source`.\n\nAttributes:\n module: The module name.\n attribute: Optional name of the attribute inside the module.\n type: The type of the source." - }, - "SourceType": { - "type": "string", - "enum": [ - "user", - "builtin", - "internal", - "distribution_package", - "code_repository", - "notebook", - "unknown" - ], - "title": "SourceType", - "description": "Enum representing different types of sources." - }, - "StackComponentType": { - "type": "string", - "enum": [ - "alerter", - "annotator", - "artifact_store", - "container_registry", - "data_validator", - "experiment_tracker", - "feature_store", - "image_builder", - "model_deployer", - "orchestrator", - "step_operator", - "model_registry" - ], - "title": "StackComponentType", - "description": "All possible types a `StackComponent` can have." - }, - "StackDeploymentConfig": { - "properties": { - "deployment_url": { - "type": "string", - "title": "The cloud provider console URL where the stack will be deployed." - }, - "deployment_url_text": { - "type": "string", - "title": "A textual description for the cloud provider console URL." - }, - "configuration": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Configuration for the stack deployment that the user must manually configure into the cloud provider console." - }, - "instructions": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Instructions for deploying the stack." - } - }, - "type": "object", - "required": ["deployment_url", "deployment_url_text"], - "title": "StackDeploymentConfig", - "description": "Configuration about a stack deployment." - }, - "StackDeploymentInfo": { - "properties": { - "provider": { - "allOf": [ - { "$ref": "#/components/schemas/StackDeploymentProvider" } - ], - "title": "The provider of the stack deployment." - }, - "description": { - "type": "string", - "title": "The description of the stack deployment.", - "description": "The description of the stack deployment." - }, - "instructions": { - "type": "string", - "title": "The instructions for deploying the stack.", - "description": "The instructions for deploying the stack." - }, - "post_deploy_instructions": { - "type": "string", - "title": "The instructions for post-deployment.", - "description": "The instructions for post-deployment." - }, - "integrations": { - "items": { "type": "string" }, - "type": "array", - "title": "ZenML integrations required for the stack.", - "description": "The list of ZenML integrations that need to be installed for the stack to be usable." - }, - "permissions": { - "additionalProperties": { - "items": { "type": "string" }, - "type": "array" - }, - "type": "object", - "title": "The permissions granted to ZenML to access the cloud resources.", - "description": "The permissions granted to ZenML to access the cloud resources, as a dictionary grouping permissions by resource." - }, - "locations": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "The locations where the stack can be deployed.", - "description": "The locations where the stack can be deployed, as a dictionary mapping location names to descriptions." 
- }, - "skypilot_default_regions": { - "additionalProperties": { "type": "string" }, - "type": "object", - "title": "The locations where the Skypilot clusters can be deployed by default.", - "description": "The locations where the Skypilot clusters can be deployed by default, as a dictionary mapping location names to descriptions." - } - }, - "type": "object", - "required": [ - "provider", - "description", - "instructions", - "post_deploy_instructions", - "integrations", - "permissions", - "locations", - "skypilot_default_regions" - ], - "title": "StackDeploymentInfo", - "description": "Information about a stack deployment." - }, - "StackDeploymentProvider": { - "type": "string", - "enum": ["aws", "gcp", "azure"], - "title": "StackDeploymentProvider", - "description": "All possible stack deployment providers." - }, - "StackRequest": { - "properties": { - "user": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "User" - }, - "workspace": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "Workspace" - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the stack." - }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the stack", - "default": "" - }, - "stack_spec_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The path to the stack spec used for mlstacks deployments." - }, - "components": { - "additionalProperties": { - "items": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "$ref": "#/components/schemas/ComponentInfo" } - ] - }, - "type": "array" - }, - "type": "object", - "title": "The mapping for the components of the full stack registration.", - "description": "The mapping from component types to either UUIDs of existing components or request information for brand new components." - }, - "labels": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The stack labels." - }, - "service_connectors": { - "items": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "$ref": "#/components/schemas/ServiceConnectorInfo" } - ] - }, - "type": "array", - "title": "The service connectors dictionary for the full stack registration.", - "description": "The UUID of an already existing service connector or request information to create a service connector from scratch.", - "default": [] - } - }, - "type": "object", - "required": ["name", "components"], - "title": "StackRequest", - "description": "Request model for a stack." - }, - "StackResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/StackResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/StackResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/StackResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the stack." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "StackResponse", - "description": "Response model for stacks." 
- }, - "StackResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - } - }, - "type": "object", - "required": ["created", "updated"], - "title": "StackResponseBody", - "description": "Response body for stacks." - }, - "StackResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "components": { - "additionalProperties": { - "items": { "$ref": "#/components/schemas/ComponentResponse" }, - "type": "array" - }, - "type": "object", - "title": "A mapping of stack component types to the actualinstances of components of this type." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The description of the stack", - "default": "" - }, - "stack_spec_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The path to the stack spec used for mlstacks deployments." - }, - "labels": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The stack labels." - } - }, - "type": "object", - "required": ["workspace", "components"], - "title": "StackResponseMetadata", - "description": "Response metadata for stacks." - }, - "StackResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "StackResponseResources", - "description": "Class for all resource models associated with the stack entity." - }, - "StackUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The name of the stack." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The description of the stack" - }, - "stack_spec_path": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "The path to the stack spec used for mlstacks deployments." - }, - "components": { - "anyOf": [ - { - "additionalProperties": { - "items": { "type": "string", "format": "uuid" }, - "type": "array" - }, - "type": "object" - }, - { "type": "null" } - ], - "title": "A mapping of stack component types to the actualinstances of components of this type." - }, - "labels": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The stack labels." - } - }, - "type": "object", - "title": "StackUpdate", - "description": "Update model for stacks." - }, - "Step-Input": { - "properties": { - "spec": { "$ref": "#/components/schemas/StepSpec-Input" }, - "config": { "$ref": "#/components/schemas/StepConfiguration-Input" } - }, - "additionalProperties": false, - "type": "object", - "required": ["spec", "config"], - "title": "Step", - "description": "Class representing a ZenML step." - }, - "Step-Output": { - "properties": { - "spec": { "$ref": "#/components/schemas/StepSpec-Output" }, - "config": { "$ref": "#/components/schemas/StepConfiguration-Output" } - }, - "additionalProperties": false, - "type": "object", - "required": ["spec", "config"], - "title": "Step", - "description": "Class representing a ZenML step." 
- }, - "StepConfiguration-Input": { - "properties": { - "name": { "type": "string", "title": "Name" }, - "enable_cache": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Cache" - }, - "enable_artifact_metadata": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Metadata" - }, - "enable_artifact_visualization": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Visualization" - }, - "enable_step_logs": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Step Logs" - }, - "step_operator": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Step Operator" - }, - "experiment_tracker": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Experiment Tracker" - }, - "parameters": { - "type": "object", - "title": "Parameters", - "default": {} - }, - "settings": { - "additionalProperties": { - "$ref": "#/components/schemas/BaseSettings" - }, - "type": "object", - "title": "Settings", - "default": {} - }, - "extra": { "type": "object", "title": "Extra", "default": {} }, - "failure_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "success_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - }, - "retry": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRetryConfig" }, - { "type": "null" } - ] - }, - "outputs": { - "additionalProperties": { - "$ref": "#/components/schemas/ArtifactConfiguration-Input" - }, - "type": "object", - "title": "Outputs", - "default": {} - }, - "caching_parameters": { - "type": "object", - "title": "Caching Parameters", - "default": {} - }, - "external_input_artifacts": { - "additionalProperties": { - "$ref": "#/components/schemas/ExternalArtifactConfiguration-Input" - }, - "type": "object", - "title": "External Input Artifacts", - "default": {} - }, - "model_artifacts_or_metadata": { - "additionalProperties": { - "$ref": "#/components/schemas/ModelVersionDataLazyLoader-Input" - }, - "type": "object", - "title": "Model Artifacts Or Metadata", - "default": {} - }, - "client_lazy_loaders": { - "additionalProperties": { - "$ref": "#/components/schemas/ClientLazyLoader" - }, - "type": "object", - "title": "Client Lazy Loaders", - "default": {} - } - }, - "additionalProperties": false, - "type": "object", - "required": ["name"], - "title": "StepConfiguration", - "description": "Step configuration class." 
- }, - "StepConfiguration-Output": { - "properties": { - "name": { "type": "string", "title": "Name" }, - "enable_cache": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Cache" - }, - "enable_artifact_metadata": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Metadata" - }, - "enable_artifact_visualization": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Visualization" - }, - "enable_step_logs": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Step Logs" - }, - "step_operator": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Step Operator" - }, - "experiment_tracker": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Experiment Tracker" - }, - "parameters": { - "type": "object", - "title": "Parameters", - "default": {} - }, - "settings": { - "additionalProperties": { - "$ref": "#/components/schemas/BaseSettings" - }, - "type": "object", - "title": "Settings", - "default": {} - }, - "extra": { "type": "object", "title": "Extra", "default": {} }, - "failure_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "success_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - }, - "retry": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRetryConfig" }, - { "type": "null" } - ] - }, - "outputs": { - "additionalProperties": { - "$ref": "#/components/schemas/ArtifactConfiguration-Output" - }, - "type": "object", - "title": "Outputs", - "default": {} - }, - "caching_parameters": { - "type": "object", - "title": "Caching Parameters", - "default": {} - }, - "external_input_artifacts": { - "additionalProperties": { - "$ref": "#/components/schemas/ExternalArtifactConfiguration-Output" - }, - "type": "object", - "title": "External Input Artifacts", - "default": {} - }, - "model_artifacts_or_metadata": { - "additionalProperties": { - "$ref": "#/components/schemas/ModelVersionDataLazyLoader-Output" - }, - "type": "object", - "title": "Model Artifacts Or Metadata", - "default": {} - }, - "client_lazy_loaders": { - "additionalProperties": { - "$ref": "#/components/schemas/ClientLazyLoader" - }, - "type": "object", - "title": "Client Lazy Loaders", - "default": {} - } - }, - "additionalProperties": false, - "type": "object", - "required": ["name"], - "title": "StepConfiguration", - "description": "Step configuration class." 
- }, - "StepConfigurationUpdate": { - "properties": { - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "enable_cache": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Cache" - }, - "enable_artifact_metadata": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Metadata" - }, - "enable_artifact_visualization": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Artifact Visualization" - }, - "enable_step_logs": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Enable Step Logs" - }, - "step_operator": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Step Operator" - }, - "experiment_tracker": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Experiment Tracker" - }, - "parameters": { - "type": "object", - "title": "Parameters", - "default": {} - }, - "settings": { - "additionalProperties": { - "$ref": "#/components/schemas/BaseSettings" - }, - "type": "object", - "title": "Settings", - "default": {} - }, - "extra": { "type": "object", "title": "Extra", "default": {} }, - "failure_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "success_hook_source": { - "anyOf": [ - { "$ref": "#/components/schemas/Source" }, - { "type": "null" } - ] - }, - "model": { - "anyOf": [ - { "$ref": "#/components/schemas/Model" }, - { "type": "null" } - ] - }, - "retry": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRetryConfig" }, - { "type": "null" } - ] - }, - "outputs": { - "additionalProperties": { - "$ref": "#/components/schemas/PartialArtifactConfiguration" - }, - "type": "object", - "title": "Outputs", - "default": {} - } - }, - "additionalProperties": false, - "type": "object", - "title": "StepConfigurationUpdate", - "description": "Class for step configuration updates." - }, - "StepNode": { - "properties": { - "id": { "type": "string", "title": "Id" }, - "type": { "type": "string", "title": "Type", "default": "step" }, - "data": { "$ref": "#/components/schemas/StepNodeDetails" } - }, - "type": "object", - "required": ["id", "data"], - "title": "StepNode", - "description": "A class that represents a step node in a lineage graph." - }, - "StepNodeDetails": { - "properties": { - "execution_id": { "type": "string", "title": "Execution Id" }, - "name": { "type": "string", "title": "Name" }, - "status": { "$ref": "#/components/schemas/ExecutionStatus" }, - "entrypoint_name": { "type": "string", "title": "Entrypoint Name" }, - "parameters": { "type": "object", "title": "Parameters" }, - "configuration": { "type": "object", "title": "Configuration" }, - "inputs": { "type": "object", "title": "Inputs" }, - "outputs": { "type": "object", "title": "Outputs" }, - "metadata": { - "items": { - "prefixItems": [ - { "type": "string" }, - { "type": "string" }, - { "type": "string" } - ], - "type": "array", - "maxItems": 3, - "minItems": 3 - }, - "type": "array", - "title": "Metadata" - } - }, - "type": "object", - "required": [ - "execution_id", - "name", - "status", - "entrypoint_name", - "parameters", - "configuration", - "inputs", - "outputs", - "metadata" - ], - "title": "StepNodeDetails", - "description": "Captures all artifact details for the node." 
- }, - "StepRetryConfig": { - "properties": { - "max_retries": { - "type": "integer", - "title": "Max Retries", - "default": 1 - }, - "delay": { "type": "integer", "title": "Delay", "default": 0 }, - "backoff": { "type": "integer", "title": "Backoff", "default": 0 } - }, - "additionalProperties": false, - "type": "object", - "title": "StepRetryConfig", - "description": "Retry configuration for a step.\n\nDelay is an integer (specified in seconds)." - }, - "StepRunRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the pipeline run step." - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The start time of the step run." - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The end time of the step run." - }, - "status": { - "allOf": [{ "$ref": "#/components/schemas/ExecutionStatus" }], - "title": "The status of the step." - }, - "cache_key": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The cache key of the step run." - }, - "code_hash": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The code hash of the step run." - }, - "docstring": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The docstring of the step function or class." - }, - "source_code": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The source code of the step function or class." - }, - "pipeline_run_id": { - "type": "string", - "format": "uuid", - "title": "The ID of the pipeline run that this step run belongs to." - }, - "original_step_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the original step run if this step was cached." - }, - "parent_step_ids": { - "items": { "type": "string", "format": "uuid" }, - "type": "array", - "title": "The IDs of the parent steps of this step run." - }, - "inputs": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object", - "title": "The IDs of the input artifact versions of the step run.", - "default": {} - }, - "outputs": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object", - "title": "The IDs of the output artifact versions of the step run.", - "default": {} - }, - "logs": { - "anyOf": [ - { "$ref": "#/components/schemas/LogsRequest" }, - { "type": "null" } - ], - "title": "Logs associated with this step run." - }, - "deployment": { - "type": "string", - "format": "uuid", - "title": "The deployment associated with the step run." - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the model version that was configured by this step run explicitly." - } - }, - "type": "object", - "required": [ - "user", - "workspace", - "name", - "status", - "pipeline_run_id", - "deployment" - ], - "title": "StepRunRequest", - "description": "Request model for step runs." 
- }, - "StepRunResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRunResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRunResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/StepRunResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the pipeline run step." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "StepRunResponse", - "description": "Response model for step runs." - }, - "StepRunResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "status": { - "allOf": [{ "$ref": "#/components/schemas/ExecutionStatus" }], - "title": "The status of the step." - }, - "inputs": { - "additionalProperties": { - "$ref": "#/components/schemas/ArtifactVersionResponse" - }, - "type": "object", - "title": "The input artifact versions of the step run.", - "default": {} - }, - "outputs": { - "additionalProperties": { - "$ref": "#/components/schemas/ArtifactVersionResponse" - }, - "type": "object", - "title": "The output artifact versions of the step run.", - "default": {} - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the model version that was configured by this step run explicitly." - } - }, - "type": "object", - "required": ["created", "updated", "status"], - "title": "StepRunResponseBody", - "description": "Response body for step runs." - }, - "StepRunResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "config": { - "allOf": [ - { "$ref": "#/components/schemas/StepConfiguration-Output" } - ], - "title": "The configuration of the step." - }, - "spec": { - "allOf": [{ "$ref": "#/components/schemas/StepSpec-Output" }], - "title": "The spec of the step." - }, - "cache_key": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The cache key of the step run." - }, - "code_hash": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The code hash of the step run." - }, - "docstring": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The docstring of the step function or class." - }, - "source_code": { - "anyOf": [ - { "type": "string", "maxLength": 65535 }, - { "type": "null" } - ], - "title": "The source code of the step function or class." - }, - "start_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The start time of the step run." 
- }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The end time of the step run." - }, - "logs": { - "anyOf": [ - { "$ref": "#/components/schemas/LogsResponse" }, - { "type": "null" } - ], - "title": "Logs associated with this step run." - }, - "deployment_id": { - "type": "string", - "format": "uuid", - "title": "The deployment associated with the step run." - }, - "pipeline_run_id": { - "type": "string", - "format": "uuid", - "title": "The ID of the pipeline run that this step run belongs to." - }, - "original_step_run_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the original step run if this step was cached." - }, - "parent_step_ids": { - "items": { "type": "string", "format": "uuid" }, - "type": "array", - "title": "The IDs of the parent steps of this step run." - }, - "run_metadata": { - "additionalProperties": { - "$ref": "#/components/schemas/RunMetadataResponse" - }, - "type": "object", - "title": "Metadata associated with this step run.", - "default": {} - } - }, - "type": "object", - "required": [ - "workspace", - "config", - "spec", - "deployment_id", - "pipeline_run_id" - ], - "title": "StepRunResponseMetadata", - "description": "Response metadata for step runs." - }, - "StepRunResponseResources": { - "properties": { - "model_version": { - "anyOf": [ - { "$ref": "#/components/schemas/ModelVersionResponse" }, - { "type": "null" } - ] - } - }, - "additionalProperties": true, - "type": "object", - "title": "StepRunResponseResources", - "description": "Class for all resource models associated with the step run entity." - }, - "StepRunUpdate": { - "properties": { - "outputs": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object", - "title": "The IDs of the output artifact versions of the step run.", - "default": {} - }, - "saved_artifact_versions": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object", - "title": "The IDs of artifact versions that were saved by this step run.", - "default": {} - }, - "loaded_artifact_versions": { - "additionalProperties": { "type": "string", "format": "uuid" }, - "type": "object", - "title": "The IDs of artifact versions that were loaded by this step run.", - "default": {} - }, - "status": { - "anyOf": [ - { "$ref": "#/components/schemas/ExecutionStatus" }, - { "type": "null" } - ], - "title": "The status of the step." - }, - "end_time": { - "anyOf": [ - { "type": "string", "format": "date-time" }, - { "type": "null" } - ], - "title": "The end time of the step run." - }, - "model_version_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The ID of the model version that was configured by this step run explicitly." - } - }, - "type": "object", - "title": "StepRunUpdate", - "description": "Update model for step runs." 
- }, - "StepSpec-Input": { - "properties": { - "source": { "$ref": "#/components/schemas/Source" }, - "upstream_steps": { - "items": { "type": "string" }, - "type": "array", - "title": "Upstream Steps" - }, - "inputs": { - "additionalProperties": { - "$ref": "#/components/schemas/InputSpec" - }, - "type": "object", - "title": "Inputs", - "default": {} - }, - "pipeline_parameter_name": { - "type": "string", - "title": "Pipeline Parameter Name", - "default": "" - } - }, - "additionalProperties": false, - "type": "object", - "required": ["source", "upstream_steps"], - "title": "StepSpec", - "description": "Specification of a pipeline." - }, - "StepSpec-Output": { - "properties": { - "source": { "$ref": "#/components/schemas/Source" }, - "upstream_steps": { - "items": { "type": "string" }, - "type": "array", - "title": "Upstream Steps" - }, - "inputs": { - "additionalProperties": { - "$ref": "#/components/schemas/InputSpec" - }, - "type": "object", - "title": "Inputs", - "default": {} - }, - "pipeline_parameter_name": { - "type": "string", - "title": "Pipeline Parameter Name", - "default": "" - } - }, - "additionalProperties": false, - "type": "object", - "required": ["source", "upstream_steps"], - "title": "StepSpec", - "description": "Specification of a pipeline." - }, - "TagRequest": { - "properties": { - "name": { - "type": "string", - "maxLength": 255, - "title": "Name", - "description": "The unique title of the tag." - }, - "color": { - "allOf": [{ "$ref": "#/components/schemas/ColorVariants" }], - "description": "The color variant assigned to the tag." - } - }, - "type": "object", - "required": ["name"], - "title": "TagRequest", - "description": "Request model for tags." - }, - "TagResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/TagResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/BaseResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/TagResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "Name", - "description": "The unique title of the tag." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "TagResponse", - "description": "Response model for tags." - }, - "TagResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "color": { - "allOf": [{ "$ref": "#/components/schemas/ColorVariants" }], - "description": "The color variant assigned to the tag." - }, - "tagged_count": { - "type": "integer", - "title": "Tagged Count", - "description": "The count of resources tagged with this tag." - } - }, - "type": "object", - "required": ["created", "updated", "tagged_count"], - "title": "TagResponseBody", - "description": "Response body for tags." 
- }, - "TagResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "TagResponseResources", - "description": "Class for all resource models associated with the tag entity." - }, - "TagUpdate": { - "properties": { - "name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Name" - }, - "color": { - "anyOf": [ - { "$ref": "#/components/schemas/ColorVariants" }, - { "type": "null" } - ] - } - }, - "type": "object", - "title": "TagUpdate", - "description": "Update model for tags." - }, - "TriggerExecutionResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/TriggerExecutionResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { - "$ref": "#/components/schemas/TriggerExecutionResponseMetadata" - }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { - "$ref": "#/components/schemas/TriggerExecutionResponseResources" - }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - } - }, - "type": "object", - "required": ["id"], - "title": "TriggerExecutionResponse", - "description": "Response model for trigger executions." - }, - "TriggerExecutionResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - } - }, - "type": "object", - "required": ["created", "updated"], - "title": "TriggerExecutionResponseBody", - "description": "Response body for trigger executions." - }, - "TriggerExecutionResponseMetadata": { - "properties": { - "event_metadata": { - "type": "object", - "title": "Event Metadata", - "default": {} - } - }, - "type": "object", - "title": "TriggerExecutionResponseMetadata", - "description": "Response metadata for trigger executions." - }, - "TriggerExecutionResponseResources": { - "properties": { - "trigger": { - "allOf": [{ "$ref": "#/components/schemas/TriggerResponse" }], - "title": "The event source that activates this trigger." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["trigger"], - "title": "TriggerExecutionResponseResources", - "description": "Class for all resource models associated with the trigger entity." - }, - "TriggerRequest": { - "properties": { - "user": { - "type": "string", - "format": "uuid", - "title": "The id of the user that created this resource." - }, - "workspace": { - "type": "string", - "format": "uuid", - "title": "The workspace to which this resource belongs." - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the trigger." - }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the trigger", - "default": "" - }, - "action_id": { - "type": "string", - "format": "uuid", - "title": "The action that is executed by this trigger." - }, - "schedule": { - "anyOf": [ - { "$ref": "#/components/schemas/Schedule" }, - { "type": "null" } - ], - "title": "The schedule for the trigger. Either a schedule or an event source is required." 
- }, - "event_source_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The event source that activates this trigger. Either a schedule or an event source is required." - }, - "event_filter": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Filter applied to events that activate this trigger. Only set if the trigger is activated by an event source." - } - }, - "type": "object", - "required": ["user", "workspace", "name", "action_id"], - "title": "TriggerRequest", - "description": "Model for creating a new trigger." - }, - "TriggerResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/TriggerResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/TriggerResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/TriggerResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The name of the trigger" - } - }, - "type": "object", - "required": ["id", "name"], - "title": "TriggerResponse", - "description": "Response model for models." - }, - "TriggerResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "user": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponse" }, - { "type": "null" } - ], - "title": "The user who created this resource." - }, - "action_flavor": { - "type": "string", - "maxLength": 255, - "title": "The flavor of the action that is executed by this trigger." - }, - "action_subtype": { - "type": "string", - "title": "The subtype of the action that is executed by this trigger." - }, - "event_source_flavor": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The flavor of the event source that activates this trigger. Not set if the trigger is activated by a schedule." - }, - "event_source_subtype": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The subtype of the event source that activates this trigger. Not set if the trigger is activated by a schedule." - }, - "is_active": { - "type": "boolean", - "title": "Whether the trigger is active." - } - }, - "type": "object", - "required": [ - "created", - "updated", - "action_flavor", - "action_subtype", - "is_active" - ], - "title": "TriggerResponseBody", - "description": "Response body for triggers." - }, - "TriggerResponseMetadata": { - "properties": { - "workspace": { - "allOf": [{ "$ref": "#/components/schemas/WorkspaceResponse" }], - "title": "The workspace of this resource." - }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the trigger.", - "default": "" - }, - "event_filter": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "The event that activates this trigger. Not set if the trigger is activated by a schedule." 
- }, - "schedule": { - "anyOf": [ - { "$ref": "#/components/schemas/Schedule" }, - { "type": "null" } - ], - "title": "The schedule that activates this trigger. Not set if the trigger is activated by an event source." - } - }, - "type": "object", - "required": ["workspace"], - "title": "TriggerResponseMetadata", - "description": "Response metadata for triggers." - }, - "TriggerResponseResources": { - "properties": { - "action": { - "allOf": [{ "$ref": "#/components/schemas/ActionResponse" }], - "title": "The action that is executed by this trigger." - }, - "event_source": { - "anyOf": [ - { "$ref": "#/components/schemas/EventSourceResponse" }, - { "type": "null" } - ], - "title": "The event source that activates this trigger. Not set if the trigger is activated by a schedule." - }, - "executions": { - "allOf": [ - { "$ref": "#/components/schemas/Page_TriggerExecutionResponse_" } - ], - "title": "The executions of this trigger." - } - }, - "additionalProperties": true, - "type": "object", - "required": ["action", "executions"], - "title": "TriggerResponseResources", - "description": "Class for all resource models associated with the trigger entity." - }, - "TriggerUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The new name for the trigger." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The new description for the trigger." - }, - "event_filter": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "New filter applied to events that activate this trigger. Only valid if the trigger is already configured to be activated by an event source." - }, - "schedule": { - "anyOf": [ - { "$ref": "#/components/schemas/Schedule" }, - { "type": "null" } - ], - "title": "The updated schedule for the trigger. Only valid if the trigger is already configured to be activated by a schedule." - }, - "is_active": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "The new status of the trigger." - } - }, - "type": "object", - "title": "TriggerUpdate", - "description": "Update model for triggers." - }, - "UserResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." - }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/UserResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The unique username for the account." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "UserResponse", - "description": "Response model for user and service accounts.\n\nThis returns the activation_token that is required for the\nuser-invitation-flow of the frontend. The email is returned optionally as\nwell for use by the analytics on the client-side." - }, - "UserResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." 
- }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - }, - "active": { - "type": "boolean", - "title": "Whether the account is active.", - "default": false - }, - "activation_token": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The activation token for the user. Only relevant for user accounts." - }, - "full_name": { - "type": "string", - "maxLength": 255, - "title": "The full name for the account owner. Only relevant for user accounts.", - "default": "" - }, - "email_opted_in": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Whether the user agreed to share their email. Only relevant for user accounts", - "description": "`null` if not answered, `true` if agreed, `false` if skipped." - }, - "is_service_account": { - "type": "boolean", - "title": "Indicates whether this is a service account or a user account." - }, - "is_admin": { - "type": "boolean", - "title": "Whether the account is an administrator." - } - }, - "type": "object", - "required": ["created", "updated", "is_service_account", "is_admin"], - "title": "UserResponseBody", - "description": "Response body for users." - }, - "UserResponseMetadata": { - "properties": { - "email": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The email address associated with the account. Only relevant for user accounts.", - "default": "" - }, - "external_user_id": { - "anyOf": [ - { "type": "string", "format": "uuid" }, - { "type": "null" } - ], - "title": "The external user ID associated with the account. Only relevant for user accounts." - }, - "user_metadata": { - "type": "object", - "title": "The metadata associated with the user.", - "default": {} - } - }, - "type": "object", - "title": "UserResponseMetadata", - "description": "Response metadata for users." - }, - "UserResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "UserResponseResources", - "description": "Class for all resource models associated with the user entity." - }, - "ValidationError": { - "properties": { - "loc": { - "items": { "anyOf": [{ "type": "string" }, { "type": "integer" }] }, - "type": "array", - "title": "Location" - }, - "msg": { "type": "string", "title": "Message" }, - "type": { "type": "string", "title": "Error Type" } - }, - "type": "object", - "required": ["loc", "msg", "type"], - "title": "ValidationError" - }, - "VisualizationType": { - "type": "string", - "enum": ["csv", "html", "image", "markdown"], - "title": "VisualizationType", - "description": "All currently available visualization types." - }, - "WorkspaceRequest": { - "properties": { - "name": { - "type": "string", - "maxLength": 255, - "title": "The unique name of the workspace." - }, - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the workspace.", - "default": "" - } - }, - "type": "object", - "required": ["name"], - "title": "WorkspaceRequest", - "description": "Request model for workspaces." - }, - "WorkspaceResponse": { - "properties": { - "body": { - "anyOf": [ - { "$ref": "#/components/schemas/WorkspaceResponseBody" }, - { "type": "null" } - ], - "title": "The body of the resource." - }, - "metadata": { - "anyOf": [ - { "$ref": "#/components/schemas/WorkspaceResponseMetadata" }, - { "type": "null" } - ], - "title": "The metadata related to this resource." 
- }, - "resources": { - "anyOf": [ - { "$ref": "#/components/schemas/WorkspaceResponseResources" }, - { "type": "null" } - ], - "title": "The resources related to this resource." - }, - "id": { - "type": "string", - "format": "uuid", - "title": "The unique resource id." - }, - "permission_denied": { - "type": "boolean", - "title": "Permission Denied", - "default": false - }, - "name": { - "type": "string", - "maxLength": 255, - "title": "The unique name of the workspace." - } - }, - "type": "object", - "required": ["id", "name"], - "title": "WorkspaceResponse", - "description": "Response model for workspaces." - }, - "WorkspaceResponseBody": { - "properties": { - "created": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was created." - }, - "updated": { - "type": "string", - "format": "date-time", - "title": "The timestamp when this resource was last updated." - } - }, - "type": "object", - "required": ["created", "updated"], - "title": "WorkspaceResponseBody", - "description": "Response body for workspaces." - }, - "WorkspaceResponseMetadata": { - "properties": { - "description": { - "type": "string", - "maxLength": 255, - "title": "The description of the workspace.", - "default": "" - } - }, - "type": "object", - "title": "WorkspaceResponseMetadata", - "description": "Response metadata for workspaces." - }, - "WorkspaceResponseResources": { - "properties": {}, - "additionalProperties": true, - "type": "object", - "title": "WorkspaceResponseResources", - "description": "Class for all resource models associated with the workspace entity." - }, - "WorkspaceUpdate": { - "properties": { - "name": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The unique name of the workspace." - }, - "description": { - "anyOf": [ - { "type": "string", "maxLength": 255 }, - { "type": "null" } - ], - "title": "The description of the workspace." - } - }, - "type": "object", - "title": "WorkspaceUpdate", - "description": "Update model for workspaces." - }, - "_CallStep": { - "properties": { - "attribute_name": { - "anyOf": [{ "type": "string" }, { "type": "null" }], - "title": "Attribute Name" - }, - "is_call": { - "anyOf": [{ "type": "boolean" }, { "type": "null" }], - "title": "Is Call", - "default": false - }, - "call_args": { - "anyOf": [{ "items": {}, "type": "array" }, { "type": "null" }], - "title": "Call Args", - "default": [] - }, - "call_kwargs": { - "anyOf": [{ "type": "object" }, { "type": "null" }], - "title": "Call Kwargs", - "default": {} - }, - "selector": { "anyOf": [{}, { "type": "null" }], "title": "Selector" } - }, - "type": "object", - "title": "_CallStep" - }, - "zenml__enums__PluginSubType__1": { - "type": "string", - "enum": ["webhook", "pipeline_run"], - "maxLength": 255, - "title": "PluginSubType", - "description": "All possible types of Plugins." - }, - "zenml__enums__PluginSubType__2": { - "type": "string", - "enum": ["webhook", "pipeline_run"], - "title": "PluginSubType", - "description": "All possible types of Plugins." 
- } - }, - "securitySchemes": { - "CookieOAuth2TokenBearer": { - "type": "oauth2", - "flows": { "password": { "scopes": {}, "tokenUrl": "/api/v1/login" } } - } - } - } -} diff --git a/develop/api/pipelines/delete-pipeline.mdx b/develop/api/pipelines/delete-pipeline.mdx deleted file mode 100644 index 9e1b9dc..0000000 --- a/develop/api/pipelines/delete-pipeline.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/pipelines/{pipeline_id} ---- \ No newline at end of file diff --git a/develop/api/pipelines/get-pipeline.mdx b/develop/api/pipelines/get-pipeline.mdx deleted file mode 100644 index 2ac2937..0000000 --- a/develop/api/pipelines/get-pipeline.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipelines/{pipeline_id} ---- \ No newline at end of file diff --git a/develop/api/pipelines/list-pipeline-runs.mdx b/develop/api/pipelines/list-pipeline-runs.mdx deleted file mode 100644 index 24b60dc..0000000 --- a/develop/api/pipelines/list-pipeline-runs.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipelines/{pipeline_id}/runs ---- \ No newline at end of file diff --git a/develop/api/pipelines/list-pipelines.mdx b/develop/api/pipelines/list-pipelines.mdx deleted file mode 100644 index 8fd17a7..0000000 --- a/develop/api/pipelines/list-pipelines.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/pipelines ---- \ No newline at end of file diff --git a/develop/api/pipelines/update-pipeline.mdx b/develop/api/pipelines/update-pipeline.mdx deleted file mode 100644 index 724eac8..0000000 --- a/develop/api/pipelines/update-pipeline.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/pipelines/{pipeline_id} ---- \ No newline at end of file diff --git a/develop/api/plugins/get-flavor.mdx b/develop/api/plugins/get-flavor.mdx deleted file mode 100644 index 10b6c91..0000000 --- a/develop/api/plugins/get-flavor.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/plugin-flavors/{name} ---- \ No newline at end of file diff --git a/develop/api/plugins/list-flavors.mdx b/develop/api/plugins/list-flavors.mdx deleted file mode 100644 index 1db745c..0000000 --- a/develop/api/plugins/list-flavors.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/plugin-flavors ---- \ No newline at end of file diff --git a/develop/api/run_metadata/get-run-metadata.mdx b/develop/api/run_metadata/get-run-metadata.mdx deleted file mode 100644 index 9d6296a..0000000 --- a/develop/api/run_metadata/get-run-metadata.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/run-metadata/{run_metadata_id} ---- \ No newline at end of file diff --git a/develop/api/run_metadata/list-run-metadata.mdx b/develop/api/run_metadata/list-run-metadata.mdx deleted file mode 100644 index 0d65a61..0000000 --- a/develop/api/run_metadata/list-run-metadata.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/run-metadata ---- \ No newline at end of file diff --git a/develop/api/run_templates/create-template-run.mdx b/develop/api/run_templates/create-template-run.mdx deleted file mode 100644 index 828cc55..0000000 --- a/develop/api/run_templates/create-template-run.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/run_templates/{template_id}/runs ---- \ No newline at end of file diff --git a/develop/api/run_templates/delete-run-template.mdx b/develop/api/run_templates/delete-run-template.mdx deleted file mode 100644 index d72f17e..0000000 --- a/develop/api/run_templates/delete-run-template.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete 
/api/v1/run_templates/{template_id} ---- \ No newline at end of file diff --git a/develop/api/run_templates/get-run-template.mdx b/develop/api/run_templates/get-run-template.mdx deleted file mode 100644 index e33ce0e..0000000 --- a/develop/api/run_templates/get-run-template.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/run_templates/{template_id} ---- \ No newline at end of file diff --git a/develop/api/run_templates/list-run-templates.mdx b/develop/api/run_templates/list-run-templates.mdx deleted file mode 100644 index baf97a0..0000000 --- a/develop/api/run_templates/list-run-templates.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/run_templates ---- \ No newline at end of file diff --git a/develop/api/run_templates/update-run-template.mdx b/develop/api/run_templates/update-run-template.mdx deleted file mode 100644 index faaaa7c..0000000 --- a/develop/api/run_templates/update-run-template.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/run_templates/{template_id} ---- \ No newline at end of file diff --git a/develop/api/runs/delete-run.mdx b/develop/api/runs/delete-run.mdx deleted file mode 100644 index a282d9e..0000000 --- a/develop/api/runs/delete-run.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/runs/{run_id} ---- \ No newline at end of file diff --git a/develop/api/runs/get-pipeline-configuration.mdx b/develop/api/runs/get-pipeline-configuration.mdx deleted file mode 100644 index 4237fd2..0000000 --- a/develop/api/runs/get-pipeline-configuration.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/runs/{run_id}/pipeline-configuration ---- \ No newline at end of file diff --git a/develop/api/runs/get-run-dag.mdx b/develop/api/runs/get-run-dag.mdx deleted file mode 100644 index 8ec5a0c..0000000 --- a/develop/api/runs/get-run-dag.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/runs/{run_id}/graph ---- \ No newline at end of file diff --git a/develop/api/runs/get-run-status.mdx b/develop/api/runs/get-run-status.mdx deleted file mode 100644 index acb2c02..0000000 --- a/develop/api/runs/get-run-status.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/runs/{run_id}/status ---- \ No newline at end of file diff --git a/develop/api/runs/get-run-steps.mdx b/develop/api/runs/get-run-steps.mdx deleted file mode 100644 index 9e9a253..0000000 --- a/develop/api/runs/get-run-steps.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/runs/{run_id}/steps ---- \ No newline at end of file diff --git a/develop/api/runs/get-run.mdx b/develop/api/runs/get-run.mdx deleted file mode 100644 index 05925f8..0000000 --- a/develop/api/runs/get-run.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/runs/{run_id} ---- \ No newline at end of file diff --git a/develop/api/runs/list-runs.mdx b/develop/api/runs/list-runs.mdx deleted file mode 100644 index bc9274c..0000000 --- a/develop/api/runs/list-runs.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/runs ---- \ No newline at end of file diff --git a/develop/api/runs/update-run.mdx b/develop/api/runs/update-run.mdx deleted file mode 100644 index 13ea92f..0000000 --- a/develop/api/runs/update-run.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/runs/{run_id} ---- \ No newline at end of file diff --git a/develop/api/schedules/delete-schedule.mdx b/develop/api/schedules/delete-schedule.mdx deleted file mode 100644 index 3fb7122..0000000 --- a/develop/api/schedules/delete-schedule.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete 
/api/v1/schedules/{schedule_id} ---- \ No newline at end of file diff --git a/develop/api/schedules/get-schedule.mdx b/develop/api/schedules/get-schedule.mdx deleted file mode 100644 index 74ec0d3..0000000 --- a/develop/api/schedules/get-schedule.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/schedules/{schedule_id} ---- \ No newline at end of file diff --git a/develop/api/schedules/list-schedules.mdx b/develop/api/schedules/list-schedules.mdx deleted file mode 100644 index abb4df3..0000000 --- a/develop/api/schedules/list-schedules.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/schedules ---- \ No newline at end of file diff --git a/develop/api/schedules/update-schedule.mdx b/develop/api/schedules/update-schedule.mdx deleted file mode 100644 index 228f2ae..0000000 --- a/develop/api/schedules/update-schedule.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/schedules/{schedule_id} ---- \ No newline at end of file diff --git a/develop/api/secrets/backup-secrets.mdx b/develop/api/secrets/backup-secrets.mdx deleted file mode 100644 index 2accddf..0000000 --- a/develop/api/secrets/backup-secrets.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/secrets_operations/backup ---- \ No newline at end of file diff --git a/develop/api/secrets/delete-secret.mdx b/develop/api/secrets/delete-secret.mdx deleted file mode 100644 index 0e547a7..0000000 --- a/develop/api/secrets/delete-secret.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/secrets/{secret_id} ---- \ No newline at end of file diff --git a/develop/api/secrets/get-secret.mdx b/develop/api/secrets/get-secret.mdx deleted file mode 100644 index fc6a255..0000000 --- a/develop/api/secrets/get-secret.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/secrets/{secret_id} ---- \ No newline at end of file diff --git a/develop/api/secrets/list-secrets.mdx b/develop/api/secrets/list-secrets.mdx deleted file mode 100644 index 93ec736..0000000 --- a/develop/api/secrets/list-secrets.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/secrets ---- \ No newline at end of file diff --git a/develop/api/secrets/restore-secrets.mdx b/develop/api/secrets/restore-secrets.mdx deleted file mode 100644 index 66e8290..0000000 --- a/develop/api/secrets/restore-secrets.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/secrets_operations/restore ---- \ No newline at end of file diff --git a/develop/api/secrets/update-secret.mdx b/develop/api/secrets/update-secret.mdx deleted file mode 100644 index ce17369..0000000 --- a/develop/api/secrets/update-secret.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/secrets/{secret_id} ---- \ No newline at end of file diff --git a/develop/api/server/get-onboarding-state.mdx b/develop/api/server/get-onboarding-state.mdx deleted file mode 100644 index 9bb7ebf..0000000 --- a/develop/api/server/get-onboarding-state.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/onboarding_state ---- \ No newline at end of file diff --git a/develop/api/server/server-info.mdx b/develop/api/server/server-info.mdx deleted file mode 100644 index 360e7b4..0000000 --- a/develop/api/server/server-info.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/info ---- \ No newline at end of file diff --git a/develop/api/server/version.mdx b/develop/api/server/version.mdx deleted file mode 100644 index 052f773..0000000 --- a/develop/api/server/version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/version ---- \ No newline at end 
of file diff --git a/develop/api/service_accounts/create-api-key.mdx b/develop/api/service_accounts/create-api-key.mdx deleted file mode 100644 index b7113f3..0000000 --- a/develop/api/service_accounts/create-api-key.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/service_accounts/{service_account_id}/api_keys ---- \ No newline at end of file diff --git a/develop/api/service_accounts/create-service-account.mdx b/develop/api/service_accounts/create-service-account.mdx deleted file mode 100644 index 6a0d9ef..0000000 --- a/develop/api/service_accounts/create-service-account.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/service_accounts ---- \ No newline at end of file diff --git a/develop/api/service_accounts/delete-api-key.mdx b/develop/api/service_accounts/delete-api-key.mdx deleted file mode 100644 index 62d88d4..0000000 --- a/develop/api/service_accounts/delete-api-key.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/service_accounts/{service_account_id}/api_keys/{api_key_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/service_accounts/delete-service-account.mdx b/develop/api/service_accounts/delete-service-account.mdx deleted file mode 100644 index 36a05da..0000000 --- a/develop/api/service_accounts/delete-service-account.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/service_accounts/{service_account_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/service_accounts/get-api-key.mdx b/develop/api/service_accounts/get-api-key.mdx deleted file mode 100644 index 1ceac6f..0000000 --- a/develop/api/service_accounts/get-api-key.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_accounts/{service_account_id}/api_keys/{api_key_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/service_accounts/get-service-account.mdx b/develop/api/service_accounts/get-service-account.mdx deleted file mode 100644 index 6f772b0..0000000 --- a/develop/api/service_accounts/get-service-account.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_accounts/{service_account_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/service_accounts/list-api-keys.mdx b/develop/api/service_accounts/list-api-keys.mdx deleted file mode 100644 index 87bc6bc..0000000 --- a/develop/api/service_accounts/list-api-keys.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_accounts/{service_account_id}/api_keys ---- \ No newline at end of file diff --git a/develop/api/service_accounts/list-service-accounts.mdx b/develop/api/service_accounts/list-service-accounts.mdx deleted file mode 100644 index a844a91..0000000 --- a/develop/api/service_accounts/list-service-accounts.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_accounts ---- \ No newline at end of file diff --git a/develop/api/service_accounts/rotate-api-key.mdx b/develop/api/service_accounts/rotate-api-key.mdx deleted file mode 100644 index 2dcff4f..0000000 --- a/develop/api/service_accounts/rotate-api-key.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/service_accounts/{service_account_id}/api_keys/{api_key_name_or_id}/rotate ---- \ No newline at end of file diff --git a/develop/api/service_accounts/update-api-key.mdx b/develop/api/service_accounts/update-api-key.mdx deleted file mode 100644 index 886f40a..0000000 --- a/develop/api/service_accounts/update-api-key.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put 
/api/v1/service_accounts/{service_account_id}/api_keys/{api_key_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/service_accounts/update-service-account.mdx b/develop/api/service_accounts/update-service-account.mdx deleted file mode 100644 index d5db3f8..0000000 --- a/develop/api/service_accounts/update-service-account.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/service_accounts/{service_account_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/service_connectors/delete-service-connector.mdx b/develop/api/service_connectors/delete-service-connector.mdx deleted file mode 100644 index f9deba5..0000000 --- a/develop/api/service_connectors/delete-service-connector.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/service_connectors/{connector_id} ---- \ No newline at end of file diff --git a/develop/api/service_connectors/get-resources-based-on-service-connector-info.mdx b/develop/api/service_connectors/get-resources-based-on-service-connector-info.mdx deleted file mode 100644 index c258ee1..0000000 --- a/develop/api/service_connectors/get-resources-based-on-service-connector-info.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/service_connectors/full_stack_resources ---- \ No newline at end of file diff --git a/develop/api/service_connectors/get-service-connector-client.mdx b/develop/api/service_connectors/get-service-connector-client.mdx deleted file mode 100644 index d7b63c1..0000000 --- a/develop/api/service_connectors/get-service-connector-client.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_connectors/{connector_id}/client ---- \ No newline at end of file diff --git a/develop/api/service_connectors/get-service-connector-type.mdx b/develop/api/service_connectors/get-service-connector-type.mdx deleted file mode 100644 index 70f5f25..0000000 --- a/develop/api/service_connectors/get-service-connector-type.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_connector_types/{connector_type} ---- \ No newline at end of file diff --git a/develop/api/service_connectors/get-service-connector.mdx b/develop/api/service_connectors/get-service-connector.mdx deleted file mode 100644 index 310213c..0000000 --- a/develop/api/service_connectors/get-service-connector.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_connectors/{connector_id} ---- \ No newline at end of file diff --git a/develop/api/service_connectors/list-service-connector-types.mdx b/develop/api/service_connectors/list-service-connector-types.mdx deleted file mode 100644 index a7ea52f..0000000 --- a/develop/api/service_connectors/list-service-connector-types.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_connector_types ---- \ No newline at end of file diff --git a/develop/api/service_connectors/list-service-connectors.mdx b/develop/api/service_connectors/list-service-connectors.mdx deleted file mode 100644 index 9495533..0000000 --- a/develop/api/service_connectors/list-service-connectors.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/service_connectors ---- \ No newline at end of file diff --git a/develop/api/service_connectors/update-service-connector.mdx b/develop/api/service_connectors/update-service-connector.mdx deleted file mode 100644 index dfee0c8..0000000 --- a/develop/api/service_connectors/update-service-connector.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/service_connectors/{connector_id} ---- \ No newline at end of file diff 
--git a/develop/api/service_connectors/validate-and-verify-service-connector-config.mdx b/develop/api/service_connectors/validate-and-verify-service-connector-config.mdx deleted file mode 100644 index 9044785..0000000 --- a/develop/api/service_connectors/validate-and-verify-service-connector-config.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/service_connectors/verify ---- \ No newline at end of file diff --git a/develop/api/service_connectors/validate-and-verify-service-connector.mdx b/develop/api/service_connectors/validate-and-verify-service-connector.mdx deleted file mode 100644 index 38afed5..0000000 --- a/develop/api/service_connectors/validate-and-verify-service-connector.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/service_connectors/{connector_id}/verify ---- \ No newline at end of file diff --git a/develop/api/services/create-service.mdx b/develop/api/services/create-service.mdx deleted file mode 100644 index 45c62f6..0000000 --- a/develop/api/services/create-service.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/services ---- \ No newline at end of file diff --git a/develop/api/services/delete-service.mdx b/develop/api/services/delete-service.mdx deleted file mode 100644 index 3eebc86..0000000 --- a/develop/api/services/delete-service.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/services/{service_id} ---- \ No newline at end of file diff --git a/develop/api/services/get-service.mdx b/develop/api/services/get-service.mdx deleted file mode 100644 index bbce710..0000000 --- a/develop/api/services/get-service.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/services/{service_id} ---- \ No newline at end of file diff --git a/develop/api/services/list-services.mdx b/develop/api/services/list-services.mdx deleted file mode 100644 index 360d3c1..0000000 --- a/develop/api/services/list-services.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/services ---- \ No newline at end of file diff --git a/develop/api/services/update-service.mdx b/develop/api/services/update-service.mdx deleted file mode 100644 index 5300023..0000000 --- a/develop/api/services/update-service.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/services/{service_id} ---- \ No newline at end of file diff --git a/develop/api/stack_components/deregister-stack-component.mdx b/develop/api/stack_components/deregister-stack-component.mdx deleted file mode 100644 index 6cdc29e..0000000 --- a/develop/api/stack_components/deregister-stack-component.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/components/{component_id} ---- \ No newline at end of file diff --git a/develop/api/stack_components/get-stack-component-types.mdx b/develop/api/stack_components/get-stack-component-types.mdx deleted file mode 100644 index 7e892e6..0000000 --- a/develop/api/stack_components/get-stack-component-types.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/component-types ---- \ No newline at end of file diff --git a/develop/api/stack_components/get-stack-component.mdx b/develop/api/stack_components/get-stack-component.mdx deleted file mode 100644 index 8b4ca6c..0000000 --- a/develop/api/stack_components/get-stack-component.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/components/{component_id} ---- \ No newline at end of file diff --git a/develop/api/stack_components/list-stack-components.mdx b/develop/api/stack_components/list-stack-components.mdx deleted file mode 100644 index 3afd2fb..0000000 --- 
a/develop/api/stack_components/list-stack-components.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/components ---- \ No newline at end of file diff --git a/develop/api/stack_components/update-stack-component.mdx b/develop/api/stack_components/update-stack-component.mdx deleted file mode 100644 index 0e052de..0000000 --- a/develop/api/stack_components/update-stack-component.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/components/{component_id} ---- \ No newline at end of file diff --git a/develop/api/stacks/delete-stack.mdx b/develop/api/stacks/delete-stack.mdx deleted file mode 100644 index ac67b9c..0000000 --- a/develop/api/stacks/delete-stack.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/stacks/{stack_id} ---- \ No newline at end of file diff --git a/develop/api/stacks/get-deployed-stack.mdx b/develop/api/stacks/get-deployed-stack.mdx deleted file mode 100644 index 6d363c6..0000000 --- a/develop/api/stacks/get-deployed-stack.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/stack-deployment/stack ---- \ No newline at end of file diff --git a/develop/api/stacks/get-stack-deployment-config.mdx b/develop/api/stacks/get-stack-deployment-config.mdx deleted file mode 100644 index d81944d..0000000 --- a/develop/api/stacks/get-stack-deployment-config.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/stack-deployment/config ---- \ No newline at end of file diff --git a/develop/api/stacks/get-stack-deployment-info.mdx b/develop/api/stacks/get-stack-deployment-info.mdx deleted file mode 100644 index a68cef3..0000000 --- a/develop/api/stacks/get-stack-deployment-info.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/stack-deployment/info ---- \ No newline at end of file diff --git a/develop/api/stacks/get-stack.mdx b/develop/api/stacks/get-stack.mdx deleted file mode 100644 index 2e61707..0000000 --- a/develop/api/stacks/get-stack.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/stacks/{stack_id} ---- \ No newline at end of file diff --git a/develop/api/stacks/list-stacks.mdx b/develop/api/stacks/list-stacks.mdx deleted file mode 100644 index f261c64..0000000 --- a/develop/api/stacks/list-stacks.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/stacks ---- \ No newline at end of file diff --git a/develop/api/stacks/update-stack.mdx b/develop/api/stacks/update-stack.mdx deleted file mode 100644 index 8123760..0000000 --- a/develop/api/stacks/update-stack.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/stacks/{stack_id} ---- \ No newline at end of file diff --git a/develop/api/steps/create-run-step.mdx b/develop/api/steps/create-run-step.mdx deleted file mode 100644 index 1063f24..0000000 --- a/develop/api/steps/create-run-step.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/steps ---- \ No newline at end of file diff --git a/develop/api/steps/get-step-configuration.mdx b/develop/api/steps/get-step-configuration.mdx deleted file mode 100644 index 74dfc94..0000000 --- a/develop/api/steps/get-step-configuration.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/steps/{step_id}/step-configuration ---- \ No newline at end of file diff --git a/develop/api/steps/get-step-logs.mdx b/develop/api/steps/get-step-logs.mdx deleted file mode 100644 index 9aaeac5..0000000 --- a/develop/api/steps/get-step-logs.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/steps/{step_id}/logs ---- \ No newline at end of file diff --git a/develop/api/steps/get-step-status.mdx 
b/develop/api/steps/get-step-status.mdx deleted file mode 100644 index 8e44b54..0000000 --- a/develop/api/steps/get-step-status.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/steps/{step_id}/status ---- \ No newline at end of file diff --git a/develop/api/steps/get-step.mdx b/develop/api/steps/get-step.mdx deleted file mode 100644 index f58aef9..0000000 --- a/develop/api/steps/get-step.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/steps/{step_id} ---- \ No newline at end of file diff --git a/develop/api/steps/list-run-steps.mdx b/develop/api/steps/list-run-steps.mdx deleted file mode 100644 index 91e6119..0000000 --- a/develop/api/steps/list-run-steps.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/steps ---- \ No newline at end of file diff --git a/develop/api/steps/update-step.mdx b/develop/api/steps/update-step.mdx deleted file mode 100644 index b0d3424..0000000 --- a/develop/api/steps/update-step.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/steps/{step_id} ---- \ No newline at end of file diff --git a/develop/api/tags/create-tag.mdx b/develop/api/tags/create-tag.mdx deleted file mode 100644 index 73a8aee..0000000 --- a/develop/api/tags/create-tag.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/tags ---- \ No newline at end of file diff --git a/develop/api/tags/delete-tag.mdx b/develop/api/tags/delete-tag.mdx deleted file mode 100644 index e400225..0000000 --- a/develop/api/tags/delete-tag.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/tags/{tag_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/tags/get-tag.mdx b/develop/api/tags/get-tag.mdx deleted file mode 100644 index ad008af..0000000 --- a/develop/api/tags/get-tag.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/tags/{tag_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/tags/list-tags.mdx b/develop/api/tags/list-tags.mdx deleted file mode 100644 index f93bbdb..0000000 --- a/develop/api/tags/list-tags.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/tags ---- \ No newline at end of file diff --git a/develop/api/tags/update-tag.mdx b/develop/api/tags/update-tag.mdx deleted file mode 100644 index 9750cb4..0000000 --- a/develop/api/tags/update-tag.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/tags/{tag_id} ---- \ No newline at end of file diff --git a/develop/api/triggers/create-trigger.mdx b/develop/api/triggers/create-trigger.mdx deleted file mode 100644 index 3a5e967..0000000 --- a/develop/api/triggers/create-trigger.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/triggers ---- \ No newline at end of file diff --git a/develop/api/triggers/delete-trigger.mdx b/develop/api/triggers/delete-trigger.mdx deleted file mode 100644 index 99d2174..0000000 --- a/develop/api/triggers/delete-trigger.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/triggers/{trigger_id} ---- \ No newline at end of file diff --git a/develop/api/triggers/get-trigger.mdx b/develop/api/triggers/get-trigger.mdx deleted file mode 100644 index 77156d3..0000000 --- a/develop/api/triggers/get-trigger.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/triggers/{trigger_id} ---- \ No newline at end of file diff --git a/develop/api/triggers/list-triggers.mdx b/develop/api/triggers/list-triggers.mdx deleted file mode 100644 index c6001f2..0000000 --- a/develop/api/triggers/list-triggers.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/triggers ---- \ No 
newline at end of file diff --git a/develop/api/triggers/update-trigger.mdx b/develop/api/triggers/update-trigger.mdx deleted file mode 100644 index e235541..0000000 --- a/develop/api/triggers/update-trigger.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/triggers/{trigger_id} ---- \ No newline at end of file diff --git a/develop/api/users/get-current-user.mdx b/develop/api/users/get-current-user.mdx deleted file mode 100644 index 4794e46..0000000 --- a/develop/api/users/get-current-user.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/current-user ---- \ No newline at end of file diff --git a/develop/api/users/get-user.mdx b/develop/api/users/get-user.mdx deleted file mode 100644 index fa4066a..0000000 --- a/develop/api/users/get-user.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/users/{user_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/users/list-users.mdx b/develop/api/users/list-users.mdx deleted file mode 100644 index 461877f..0000000 --- a/develop/api/users/list-users.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/users ---- \ No newline at end of file diff --git a/develop/api/users/update-user-resource-membership.mdx b/develop/api/users/update-user-resource-membership.mdx deleted file mode 100644 index 4386070..0000000 --- a/develop/api/users/update-user-resource-membership.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/users/{user_name_or_id}/resource_membership ---- \ No newline at end of file diff --git a/develop/api/webhook/webhook.mdx b/develop/api/webhook/webhook.mdx deleted file mode 100644 index c9e9ee1..0000000 --- a/develop/api/webhook/webhook.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/webhooks/{event_source_id} ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-build.mdx b/develop/api/workspaces/create-build.mdx deleted file mode 100644 index 1baa4f7..0000000 --- a/develop/api/workspaces/create-build.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/pipeline_builds ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-code-repository.mdx b/develop/api/workspaces/create-code-repository.mdx deleted file mode 100644 index 58eaa4a..0000000 --- a/develop/api/workspaces/create-code-repository.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/code_repositories ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-deployment.mdx b/develop/api/workspaces/create-deployment.mdx deleted file mode 100644 index 1134cd0..0000000 --- a/develop/api/workspaces/create-deployment.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/pipeline_deployments ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-model-version-artifact-link.mdx b/develop/api/workspaces/create-model-version-artifact-link.mdx deleted file mode 100644 index e9e14af..0000000 --- a/develop/api/workspaces/create-model-version-artifact-link.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/model_versions/{model_version_id}/artifacts ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-model-version-pipeline-run-link.mdx b/develop/api/workspaces/create-model-version-pipeline-run-link.mdx deleted file mode 100644 index f7b44d1..0000000 --- a/develop/api/workspaces/create-model-version-pipeline-run-link.mdx +++ /dev/null @@ 
-1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/model_versions/{model_version_id}/runs ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-model-version.mdx b/develop/api/workspaces/create-model-version.mdx deleted file mode 100644 index 8c8babe..0000000 --- a/develop/api/workspaces/create-model-version.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/models/{model_name_or_id}/model_versions ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-model.mdx b/develop/api/workspaces/create-model.mdx deleted file mode 100644 index 2b5b369..0000000 --- a/develop/api/workspaces/create-model.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/models ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-pipeline-run.mdx b/develop/api/workspaces/create-pipeline-run.mdx deleted file mode 100644 index 347b57c..0000000 --- a/develop/api/workspaces/create-pipeline-run.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/runs ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-pipeline.mdx b/develop/api/workspaces/create-pipeline.mdx deleted file mode 100644 index ae58850..0000000 --- a/develop/api/workspaces/create-pipeline.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/pipelines ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-run-metadata.mdx b/develop/api/workspaces/create-run-metadata.mdx deleted file mode 100644 index 28173d4..0000000 --- a/develop/api/workspaces/create-run-metadata.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/run-metadata ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-run-template.mdx b/develop/api/workspaces/create-run-template.mdx deleted file mode 100644 index 7c38d77..0000000 --- a/develop/api/workspaces/create-run-template.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/run_templates ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-schedule.mdx b/develop/api/workspaces/create-schedule.mdx deleted file mode 100644 index 16abfe9..0000000 --- a/develop/api/workspaces/create-schedule.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/schedules ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-secret.mdx b/develop/api/workspaces/create-secret.mdx deleted file mode 100644 index cc5d6f7..0000000 --- a/develop/api/workspaces/create-secret.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/secrets ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-service-connector.mdx b/develop/api/workspaces/create-service-connector.mdx deleted file mode 100644 index 81d9bec..0000000 --- a/develop/api/workspaces/create-service-connector.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/service_connectors ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-service.mdx b/develop/api/workspaces/create-service.mdx deleted file mode 100644 index 85c1527..0000000 --- a/develop/api/workspaces/create-service.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/services ---- \ No 
newline at end of file diff --git a/develop/api/workspaces/create-stack-component.mdx b/develop/api/workspaces/create-stack-component.mdx deleted file mode 100644 index c3536dd..0000000 --- a/develop/api/workspaces/create-stack-component.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/components ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-stack.mdx b/develop/api/workspaces/create-stack.mdx deleted file mode 100644 index 4b0dad6..0000000 --- a/develop/api/workspaces/create-stack.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/stacks ---- \ No newline at end of file diff --git a/develop/api/workspaces/create-workspace.mdx b/develop/api/workspaces/create-workspace.mdx deleted file mode 100644 index e861a62..0000000 --- a/develop/api/workspaces/create-workspace.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces ---- \ No newline at end of file diff --git a/develop/api/workspaces/delete-workspace.mdx b/develop/api/workspaces/delete-workspace.mdx deleted file mode 100644 index 171d104..0000000 --- a/develop/api/workspaces/delete-workspace.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: delete /api/v1/workspaces/{workspace_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/workspaces/get-or-create-pipeline-run.mdx b/develop/api/workspaces/get-or-create-pipeline-run.mdx deleted file mode 100644 index 5b61597..0000000 --- a/develop/api/workspaces/get-or-create-pipeline-run.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: post /api/v1/workspaces/{workspace_name_or_id}/runs/get-or-create ---- \ No newline at end of file diff --git a/develop/api/workspaces/get-workspace-statistics.mdx b/develop/api/workspaces/get-workspace-statistics.mdx deleted file mode 100644 index 7e76139..0000000 --- a/develop/api/workspaces/get-workspace-statistics.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/statistics ---- \ No newline at end of file diff --git a/develop/api/workspaces/get-workspace.mdx b/develop/api/workspaces/get-workspace.mdx deleted file mode 100644 index 398a914..0000000 --- a/develop/api/workspaces/get-workspace.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id} ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-runs.mdx b/develop/api/workspaces/list-runs.mdx deleted file mode 100644 index adfa9d3..0000000 --- a/develop/api/workspaces/list-runs.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/runs ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-service-connector-resources.mdx b/develop/api/workspaces/list-service-connector-resources.mdx deleted file mode 100644 index 5604c17..0000000 --- a/develop/api/workspaces/list-service-connector-resources.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/service_connectors/resources ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-builds.mdx b/develop/api/workspaces/list-workspace-builds.mdx deleted file mode 100644 index 60e2480..0000000 --- a/develop/api/workspaces/list-workspace-builds.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/pipeline_builds ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-code-repositories.mdx 
b/develop/api/workspaces/list-workspace-code-repositories.mdx deleted file mode 100644 index 1810ea5..0000000 --- a/develop/api/workspaces/list-workspace-code-repositories.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/code_repositories ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-deployments.mdx b/develop/api/workspaces/list-workspace-deployments.mdx deleted file mode 100644 index 0c64617..0000000 --- a/develop/api/workspaces/list-workspace-deployments.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/pipeline_deployments ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-pipelines.mdx b/develop/api/workspaces/list-workspace-pipelines.mdx deleted file mode 100644 index 6107b84..0000000 --- a/develop/api/workspaces/list-workspace-pipelines.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/pipelines ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-run-templates.mdx b/develop/api/workspaces/list-workspace-run-templates.mdx deleted file mode 100644 index 6108cfa..0000000 --- a/develop/api/workspaces/list-workspace-run-templates.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/run_templates ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-service-connectors.mdx b/develop/api/workspaces/list-workspace-service-connectors.mdx deleted file mode 100644 index 4e9ccd5..0000000 --- a/develop/api/workspaces/list-workspace-service-connectors.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/service_connectors ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-stack-components.mdx b/develop/api/workspaces/list-workspace-stack-components.mdx deleted file mode 100644 index 9db1757..0000000 --- a/develop/api/workspaces/list-workspace-stack-components.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/components ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspace-stacks.mdx b/develop/api/workspaces/list-workspace-stacks.mdx deleted file mode 100644 index 088475a..0000000 --- a/develop/api/workspaces/list-workspace-stacks.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces/{workspace_name_or_id}/stacks ---- \ No newline at end of file diff --git a/develop/api/workspaces/list-workspaces.mdx b/develop/api/workspaces/list-workspaces.mdx deleted file mode 100644 index 1ed624b..0000000 --- a/develop/api/workspaces/list-workspaces.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: get /api/v1/workspaces ---- \ No newline at end of file diff --git a/develop/api/workspaces/update-workspace.mdx b/develop/api/workspaces/update-workspace.mdx deleted file mode 100644 index 1d1b498..0000000 --- a/develop/api/workspaces/update-workspace.mdx +++ /dev/null @@ -1,3 +0,0 @@ ---- -openapi: put /api/v1/workspaces/{workspace_name_or_id} ---- \ No newline at end of file diff --git a/develop/getting-started/core-concepts.mdx b/develop/getting-started/core-concepts.mdx deleted file mode 100644 index e703eef..0000000 --- a/develop/getting-started/core-concepts.mdx +++ /dev/null @@ -1,192 +0,0 @@ ---- -title: "Core concepts" -description: "Discovering the core concepts behind ZenML." 
-icon: wand-magic-sparkles ---- - -**ZenML** is an extensible, open-source MLOps framework for creating portable, production-ready **MLOps pipelines**. It's built for data scientists, ML engineers, and MLOps developers to collaborate as they move workflows from development to production. To achieve this goal, ZenML introduces various concepts for the different aspects of an ML workflow; we can categorize these concepts under three different threads: - - - As a developer, how do I design my machine learning workflows? - - - While executing, how do my workflows utilize the large landscape of MLOps tooling/infrastructure? - - -How do I establish and maintain a production-grade and efficient solution? - - - - -## 1\. Development - -First, let's look at the main concepts which play a role during the development stage of an ML workflow with ZenML. - -#### Step - -Steps are functions annotated with the `@step` decorator. The simplest one could look like this: - - -```python -@step - -def step_1() -> str: - - """Returns a string.""" - - return "world" -``` - -These functions can also have inputs and outputs. For ZenML to work properly, these should preferably be typed. - - -```python -@step(enable_cache=False) - -def step_2(input_one: str, input_two: str) -> str: - - """Combines the two strings passed in.""" - - combined_str = f"{input_one} {input_two}" - - return combined_str -``` - -#### Pipelines - -At its core, ZenML follows a pipeline-based workflow for your projects. A **pipeline** consists of a series of **steps**, organized in any order that makes sense for your use case. - - - - - -As seen in the image, a step might use the outputs from a previous step and thus must wait until the previous step is completed before starting. This is something to keep in mind when organizing your steps. - -Pipelines and steps are defined in code using Python _decorators_ or _classes_. This is where the core business logic and value of your work lives, and you will spend most of your time defining these two things. - -Even though pipelines are simple Python functions, you are only allowed to call steps within this function. The inputs for steps called within a pipeline can either be the outputs of previous steps or, alternatively, values that you pass in directly (as long as they're JSON-serializable). - - -```python -@pipeline - -def my_pipeline(): - - output_step_one = step_1() - - step_2(input_one="hello", input_two=output_step_one) -``` - -Executing the pipeline is as easy as calling the function that you decorated with the `@pipeline` decorator: - - -```python -if __name__ == "__main__": - - my_pipeline() -``` - -#### Artifacts - -Artifacts represent the data that goes through your steps as inputs and outputs, and they are automatically tracked and stored by ZenML in the artifact store. They are produced by and circulated among steps whenever your step returns an object or a value. This means the data is not passed between steps in memory. Rather, when the execution of a step is completed, its outputs are written to storage, and when a new step gets executed, its inputs are loaded from storage. - -The serialization and deserialization logic of artifacts is defined by [Materializers](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types). - -#### Models - -Models are used to represent the outputs of a training process along with all metadata associated with that output. In other words: models in ZenML are more broadly defined as the weights as well as any associated information.
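As a rough sketch of how this looks in code (assuming the `Model` class and the `model` parameter of the `@pipeline` decorator available in recent ZenML releases; the model name and the training step below are hypothetical placeholders):

```python
from zenml import Model, pipeline, step


@step
def train() -> str:
    """Hypothetical training step standing in for real model training."""
    return "trained-weights"


# Attaching a Model groups every run of this pipeline (and its artifacts)
# under a version of "iris_classifier".
@pipeline(model=Model(name="iris_classifier"))
def training_pipeline():
    train()
```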
Models are first-class citizens in ZenML and as such viewing and using them is unified and centralized in the ZenML API, client as well as on the [ZenML Pro](https://zenml.io/pro) dashboard. - -#### Materializers - -Materializers define how artifacts live in between steps. More precisely, they define how data of a particular type can be serialized/deserialized, so that the steps are able to load the input data and store the output data. - -All materializers use the base abstraction called the `BaseMaterializer` class. While ZenML comes built-in with various implementations of materializers for different datatypes, if you are using a library or a tool that doesn't work with our built-in options, you can write [your own custom materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) to ensure that your data can be passed from step to step. - -#### Parameters & Settings - -When we think about steps as functions, we know they receive input in the form of artifacts. We also know that they produce output (in the form of artifacts, stored in the artifact store). But steps also take parameters. The parameters that you pass into the steps are also (helpfully!) stored by ZenML. This helps freeze the iterations of your experimentation workflow in time, so you can return to them exactly as you run them. On top of the parameters that you provide for your steps, you can also use different `Setting`s to configure runtime configurations for your infrastructure and pipelines. - -#### Model and model versions - -ZenML exposes the concept of a `Model`, which consists of multiple different model versions. A model version represents a unified view of the ML models that are created, tracked, and managed as part of a ZenML project. Model versions link all other entities to a centralized view. - -## 2\. Execution - -Once you have implemented your workflow by using the concepts described above, you can focus your attention on the execution of the pipeline run. - -#### Stacks & Components - -When you want to execute a pipeline run with ZenML, **Stacks** come into play. A **Stack** is a collection of **stack components**, where each component represents the respective configuration regarding a particular function in your MLOps pipeline such as orchestration systems, artifact repositories, and model deployment platforms. - -For instance, if you take a close look at the default local stack of ZenML, you will see two components that are **required** in every stack in ZenML, namely an _orchestrator_ and an _artifact store_. - - - - - - - Keep in mind, that each one of these components is built on top of base abstractions and is completely extensible. - - -#### Orchestrator - -An **Orchestrator** is a workhorse that coordinates all the steps to run in a pipeline. Since pipelines can be set up with complex combinations of steps with various asynchronous dependencies between them, the orchestrator acts as the component that decides what steps to run and when to run them. - -ZenML comes with a default _local orchestrator_ designed to run on your local machine. This is useful, especially during the exploration phase of your project. You don't have to rent a cloud instance just to try out basic things. - -#### Artifact Store - -An **Artifact Store** is a component that houses all data that pass through the pipeline as inputs and outputs. 
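To make the custom materializer idea from the Materializers section above more concrete, a minimal sketch could look like the following (this assumes the `BaseMaterializer` interface and the `fileio` utilities; `MyObj` and the `data.json` file name are purely illustrative):

```python
import json
import os
from typing import Type

from zenml.enums import ArtifactType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer


class MyObj:
    """Toy object that the built-in materializers don't know how to handle."""

    def __init__(self, name: str):
        self.name = name


class MyObjMaterializer(BaseMaterializer):
    ASSOCIATED_TYPES = (MyObj,)
    ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA

    def load(self, data_type: Type[MyObj]) -> MyObj:
        # Read the serialized object back from the artifact store.
        with fileio.open(os.path.join(self.uri, "data.json"), "r") as f:
            return MyObj(**json.load(f))

    def save(self, data: MyObj) -> None:
        # Write the object to the artifact store as JSON.
        with fileio.open(os.path.join(self.uri, "data.json"), "w") as f:
            json.dump({"name": data.name}, f)
```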
Each artifact that gets stored in the artifact store is tracked and versioned and this allows for extremely useful features like data caching which speeds up your workflows. - -Similar to the orchestrator, ZenML comes with a default _local artifact store_ designed to run on your local machine. This is useful, especially during the exploration phase of your project. You don't have to set up a cloud storage system to try out basic things. - -#### Flavor - -ZenML provides a dedicated base abstraction for each stack component type. These abstractions are used to develop solutions, called **Flavors**, tailored to specific use cases/tools. With ZenML installed, you get access to a variety of built-in and integrated Flavors for each component type, but users can also leverage the base abstractions to create their own custom flavors. - -#### Stack Switching - -When it comes to production-grade solutions, it is rarely enough to just run your workflow locally without including any cloud infrastructure. - -Thanks to the separation between the pipeline code and the stack in ZenML, you can easily switch your stack independently from your code. For instance, all it would take you to switch from an experimental local stack running on your machine to a remote stack that employs a full-fledged cloud infrastructure is a single CLI command. - -## 3\. Management - -In order to benefit from the aforementioned core concepts to their fullest extent, it is essential to deploy and manage a production-grade environment that interacts with your ZenML installation. - -#### ZenML Server - -To use _stack components_ that are running remotely on a cloud infrastructure, you need to deploy a [**ZenML Server**](/develop/user-guide/guides/production-guide/deploying-zenml) so it can communicate with these stack components and run your pipelines. The server is also responsible for managing ZenML business entities like pipelines, steps, models, etc. - - - - -#### Server Deployment - -In order to benefit from the advantages of using a deployed ZenML server, you can either choose to use the [**ZenML Pro SaaS offering**](/develop/getting-started/deploying-zenml/zenml-pro/zenml-pro) which provides a control plane for you to create managed instances of ZenML servers, or [deploy it in your self-hosted environment](/develop/getting-started/deploying-zenml). - -#### Metadata Tracking - -On top of the communication with the stack components, the **ZenML Server** also keeps track of all the bits of metadata around a pipeline run. With a ZenML server, you are able to access all of your previous experiments with the associated details. This is extremely helpful in troubleshooting. - -#### Secrets - -The **ZenML Server** also acts as a [centralized secrets store](/develop/usage/project-setup/use-secrets/secret-management) that safely and securely stores sensitive data such as credentials used to access the services that are part of your stack. It can be configured to use a variety of different backends for this purpose, such as the AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, and Hashicorp Vault. - -Secrets are sensitive data that you don't want to store in your code or configure alongside your stacks and pipelines. ZenML includes a [centralized secrets store](/develop/usage/project-setup/use-secrets/secret-management) that you can use to store and access your secrets securely. 
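As a minimal sketch of how this can look in practice (assuming a deployed ZenML server with a configured secrets store; the secret name and key below are hypothetical, and the same can be achieved with the `zenml secret` CLI commands):

```python
from zenml import step
from zenml.client import Client

# Register a secret once, e.g. from a setup script
# (roughly equivalent to `zenml secret create my_api_creds --api_token=...`).
Client().create_secret(name="my_api_creds", values={"api_token": "xyz"})


@step
def call_external_service() -> None:
    # Look the credential up at runtime instead of hard-coding it in the
    # pipeline code or the stack configuration.
    token = Client().get_secret("my_api_creds").secret_values["api_token"]
    ...
```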
- -#### Collaboration - -Collaboration is a crucial aspect of any MLOps team as they often need to bring together individuals with diverse skills and expertise to create a cohesive and effective workflow for machine learning projects. A successful MLOps team requires seamless collaboration between data scientists, engineers, and DevOps professionals to develop, train, deploy, and maintain machine learning models. - -With a deployed **ZenML Server**, users have the ability to create their own teams and project structures. They can easily share pipelines, runs, stacks, and other resources, streamlining the workflow and promoting teamwork. - -#### Dashboard - -The **ZenML Dashboard** also communicates with **the ZenML Server** to visualize your _pipelines_, _stacks_, and _stack components_. The dashboard serves as a visual interface to showcase collaboration with ZenML. You can invite _users_ and share your stacks with them. - -When you start working with ZenML, you'll start with a local ZenML setup, and when you want to transition you will need to [deploy ZenML](/develop/getting-started/deploying-zenml). Don't worry though, there is a one-click way to do it which we'll learn about later. - -#### VS Code Extension - -ZenML also provides a [VS Code extension](https://marketplace.visualstudio.com/items?itemName=ZenML.zenml-vscode) that allows you to interact with your ZenML stacks, runs and server directly from your VS Code editor. If you're working on code in your editor, you can easily switch between and inspect the stacks you're using, as well as delete and inspect pipelines. diff --git a/develop/getting-started/deploying-zenml/common-questions-answered.mdx b/develop/getting-started/deploying-zenml/common-questions-answered.mdx deleted file mode 100644 index 1d8a42f..0000000 --- a/develop/getting-started/deploying-zenml/common-questions-answered.mdx +++ /dev/null @@ -1,24 +0,0 @@ ---- -title: "Deployment FAQ" -description: "Questions about deploying ZenML" -icon: question ---- - -### How do I deploy ZenML on my internal company cluster? - -Read the documentation on self-hosted ZenML deployments in which several options are presented. - -### How do I deploy cloud infrastructure and/or MLOps stacks? - -ZenML is designed to be stack-agnostic, so you can use it with any cloud infrastructure or MLOps stack. Each of the documentation pages for stack components explains how to deploy these components on the most popular cloud providers. - -We also build and maintain the mlstacks package and library which offers a dedicated way to spin up infrastructure for your ZenML pipelines. It's fully integrated into ZenML's CLI and is a great way to get started with deploying your infrastructure. ZenML also publishes and maintains modules on the Terraform Registry (which are used by mlstacks under the hood) which you can also use as a standalone solution if you are familiar with Terraform. - -We also have a one-click deployment solution for stacks in the Dashboard. - -### How do I upgrade my ZenML client and/or server? - -Upgrading your ZenML client package is as simple as running `pip install --upgrade zenml` in your terminal. For upgrading your ZenML server, please refer to the dedicated documentation section, which covers most of the ways you might do this as well as common troubleshooting steps.
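As a quick illustration of the client-side upgrade mentioned above (the server upgrade procedure depends on how you deployed it):

```Bash
# Upgrade the ZenML client package
pip install --upgrade zenml

# Check which client version is now installed
zenml version
```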
diff --git a/develop/getting-started/deploying-zenml/deploy-using-huggingface-spaces.mdx b/develop/getting-started/deploying-zenml/deploy-using-huggingface-spaces.mdx deleted file mode 100644 index 6191335..0000000 --- a/develop/getting-started/deploying-zenml/deploy-using-huggingface-spaces.mdx +++ /dev/null @@ -1,77 +0,0 @@ ---- -title: "Deploy using HuggingFace Spaces" -description: "Deploying ZenML to Huggingface Spaces." -icon: face-grin-tongue-squint ---- - -A quick way to deploy ZenML and get started is to use [HuggingFace Spaces](https://huggingface.co/spaces). HuggingFace Spaces is a platform for hosting and sharing ML projects and workflows, and it also works to deploy ZenML. You can be up and running in minutes (for free) with a hosted ZenML server, so it's a good option if you want to try out ZenML without any infrastructure overhead. - -If you are planning to use HuggingFace Spaces for production use, make sure you have [persistent storage turned on](https://huggingface.co/docs/hub/en/spaces-storage) so as to prevent loss of data. See our [other deployment options](/develop/getting-started/deploying-zenml) if you want alternative options. - - - - -In this diagram, you can see what the default deployment of ZenML on HuggingFace looks like. - -## Deploying ZenML on HuggingFace Spaces - -You can deploy ZenML on HuggingFace Spaces with just a few clicks: - - -[![Deploy to Spaces](/_assets/deploy-to-spaces-lg.svg)](https://huggingface.co/new-space?template=zenml/zenml) - - - -To set up your ZenML app, you need to specify three main components: the Owner (either your personal account or an organization), a Space name, and the Visibility (a bit lower down the page). Note that the space visibility needs to be set to 'Public' if you wish to connect to the ZenML server from your local machine. - - - - - -You have the option here to select a higher-tier machine to use for your server. The advantage of selecting a paid CPU instance is that it is not subject to auto-shutdown policies and thus will stay up as long as you leave it up. In order to make use of a persistent CPU, you'll likely want to create and set up a MySQL database to connect to (see below). - -To personalize your Space's appearance, such as the title, emojis, and colors, navigate to "Files and Versions" and modify the metadata in your README.md file. Full information on Spaces configuration parameters can be found on the HuggingFace [documentation reference guide](https://huggingface.co/docs/hub/spaces-config-reference). - -After creating your Space, you'll notice a 'Building' status along with logs displayed on the screen. When this switches to 'Running', your Space is ready for use. If the ZenML login UI isn't visible, try refreshing the page. - -In the upper-right hand corner of your space you'll see a button with three dots which, when you click on it, will offer you a menu option to "Embed this Space". (See [the HuggingFace documentation](https://huggingface.co/docs/hub/spaces-embed) for more details on this feature.) Copy the "Direct URL" shown in the box that you can now see on the screen. This should look something like this: `https://-.hf.space`. Open that URL and follow the instructions to initialize your ZenML server and set up an initial admin user account. - -## Connecting to your ZenML Server from your local machine - -Once you have your ZenML server up and running, you can connect to it from your local machine. To do this, you'll need to get your Space's 'Direct URL' (see above). 
- - - Your Space's URL will only be available and usable for connecting from your local machine if the visibility of the space is set to 'Public'. - - -You can use the 'Direct URL' to connect to your ZenML server from your local machine with the following CLI command (after installing ZenML, and using your custom URL instead of the placeholder): - -```Bash -zenml connect --url '' -``` - -You can also use the Direct URL in your browser to use the ZenML dashboard as a fullscreen application (i.e. without the HuggingFace Spaces wrapper around it). - -## Extra configuration options - -By default, the ZenML application will be configured to use an SQLite non-persistent database. If you want to use a persistent database, you can configure this by amending the `Dockerfile` in your Space's root directory. For full details on the various parameters you can change, see [our reference documentation](/develop/getting-started/deploying-zenml/deploy-with-docker#advanced-server-configuration-options) on configuring ZenML when deployed with Docker. - - - If you are using the space just for testing and experimentation, you don't need to make any changes to the configuration. Everything will work out of the box. - - -You can also use an external secrets backend together with your HuggingFace Spaces as described in [our documentation](/develop/getting-started/deploying-zenml/deploy-with-docker#advanced-server-configuration-options). You should be sure to use HuggingFace's inbuilt 'Repository secrets' functionality to configure any secrets you need to use in your `Dockerfile` configuration. [See the documentation](https://huggingface.co/docs/hub/spaces-sdks-docker#secret-management) for more details on how to set this up. - - - If you wish to use a cloud secrets backend together with ZenML for secrets management, **you must update your password** on your ZenML Server on the Dashboard. This is because the default user created by the HuggingFace Spaces deployment process has no password assigned to it and, as the Space is publicly accessible, _potentially anyone could access your secrets without this extra step_. To change your password, navigate to the Settings page by clicking the button in the upper right-hand corner of the Dashboard and then click 'Update Password'. - - -## Troubleshooting - -If you are having trouble with your ZenML server on HuggingFace Spaces, you can view the logs by clicking on the "Open Logs" button at the top of the space. This will give you more context on what's happening with your server. - -If you have any other issues, please feel free to reach out to us on our [Slack channel](https://zenml.io/slack/) for more support. - -## Upgrading your ZenML Server on HF Spaces - -The default space will use the latest version of ZenML automatically. If you want to update your version, you can simply select the 'Factory reboot' option within the 'Settings' tab of the space. Note that this will wipe any data contained within the space and so if you are not using a MySQL persistent database (as described above) you will lose any data contained within your ZenML deployment on the space. You can also configure the space to use an earlier version by updating the `Dockerfile`'s `FROM` statement at the very top.
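For example, pinning the Space to a specific ZenML release is a one-line change at the top of the `Dockerfile`. The image name and tag below are illustrative assumptions, not taken from the template; use whatever base image and version your Space's `Dockerfile` already references:

```Bash
# First line of the Space's Dockerfile: pin a specific ZenML server version instead of the latest
FROM zenmldocker/zenml-server:0.66.0
```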
diff --git a/develop/getting-started/deploying-zenml/deploy-with-custom-image.mdx b/develop/getting-started/deploying-zenml/deploy-with-custom-image.mdx deleted file mode 100644 index da0f7df..0000000 --- a/develop/getting-started/deploying-zenml/deploy-with-custom-image.mdx +++ /dev/null @@ -1,106 +0,0 @@ ---- -title: "Deploy with custom images" -description: "Deploying ZenML with custom Docker images." -icon: image ---- - -In most cases, deploying ZenML with the default `zenmldocker/zenml-server` Docker image should work just fine. However, there are some scenarios when you might need to deploy ZenML with a custom Docker image: - -* You have implemented a custom artifact store for which you want to enable [artifact visualizations](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/usage/resource-data-management/handle-data-artifacts/visualize-artifacts.md) or [step logs](/develop/usage/project-setup/setting-up-a-project-repository/best-practices#logging) in your dashboard. -* You have forked the ZenML repository and want to deploy a ZenML server based on your own fork because you made changes to the server / database logic. - - - Deploying ZenML with custom Docker images is only possible for [Docker](/develop/getting-started/deploying-zenml/deploy-with-docker) or [Helm](/develop/getting-started/deploying-zenml/deploy-with-helm) deployments. - - -### Build and Push Custom ZenML Server Docker Image - -Here is how you can build a custom ZenML server Docker image: - -1. Set up a container registry of your choice. E.g., as an individual developer you could create a free [Docker Hub](https://hub.docker.com/) account and then set up a free Docker Hub repository. -2. Clone ZenML (or your ZenML fork) and check out the branch that you want to deploy, e.g., if you want to deploy ZenML version 0.41.0, run - -```Bash -git checkout release/0.41.0 -``` -3. Copy the [ZenML base.Dockerfile](https://github.com/zenml-io/zenml/blob/main/docker/base.Dockerfile), e.g.: - -```Bash -cp docker/base.Dockerfile docker/custom.Dockerfile -``` -4. Modify the copied Dockerfile: - * Add additional dependencies: - -```Bash -RUN pip install -``` - * (Forks only) Install local files instead of the official ZenML package: - -```Bash -RUN pip install -e .[server,secrets-aws,secrets-gcp,secrets-azure,secrets-hashicorp,s3fs,gcsfs,adlfs,connectors-aws,connectors-gcp,connectors-azure] -``` -5. Build and push an image based on your Dockerfile: - -```Bash -docker build -f docker/custom.Dockerfile . -t /: --platform linux/amd64 -docker push /: -``` - - - If you want to verify your custom image locally, you can follow the [Deploy a custom ZenML image via Docker](/develop/getting-started/deploying-zenml/deploy-with-custom-image#deploy-a-custom-zenml-image-via-docker) section below to deploy the ZenML server locally first. - - -### Deploy ZenML with your custom image - -Next, adjust your preferred deployment strategy to use the custom Docker image you just built. - -#### Deploy a custom ZenML image via CLI - -You can deploy your custom image via the `zenml deploy` CLI command by setting the `--config` argument to a custom configuration file that has both `zenmlserver_image_repo` and `zenmlserver_image_tag` set: - -1.
Define a custom `config.yaml` based on the [base deployment configuration file](/develop/getting-started/deploying-zenml/deploy-with-zenml-cli#base-configuration-file) and set `zenmlserver_image_repo` and `zenmlserver_image_tag` according to the custom image you built: - -```Bash -zenmlserver_image_repo: / -zenmlserver_image_tag: -``` -2. Run `zenml deploy` with the custom config file: - -```Bash -zenml deploy --config=/PATH/TO/FILE -``` - -See the general [ZenML CLI Deployment Guide](/develop/getting-started/deploying-zenml/deploy-with-zenml-cli) for more information on how to use the `zenml deploy` CLI command and what other options can be configured. - -#### Deploy a custom ZenML image via Docker - -To deploy your custom image via Docker, first familiarize yourself with the general [ZenML Docker Deployment Guide](/develop/getting-started/deploying-zenml/deploy-with-docker). - -To use your own image, follow the general guide step by step but replace all mentions of `zenmldocker/zenml-server` with your custom image reference `/:`. E.g.: - -* To run the ZenML server with Docker based on your custom image, do - -```shell -docker run -it -d -p 8080:8080 --name zenml /: -``` - -* To use `docker-compose`, adjust your `docker-compose.yml`: - -```Bash -services: - zenml: - image: /: -``` - -#### Deploy a custom ZenML image via Helm - -To deploy your custom image via Helm, first familiarize yourself with the general [ZenML Helm Deployment Guide](/develop/getting-started/deploying-zenml/deploy-with-helm). - -To use your own image, the only thing you need to do differently is to modify the `image` section of your `values.yaml` file: - -```Bash -zenml: - image: - repository: / - tag: -``` diff --git a/develop/getting-started/deploying-zenml/deploy-with-docker.mdx b/develop/getting-started/deploying-zenml/deploy-with-docker.mdx deleted file mode 100644 index e2445de..0000000 --- a/develop/getting-started/deploying-zenml/deploy-with-docker.mdx +++ /dev/null @@ -1,521 +0,0 @@ ---- -title: "Deploy with Docker" -description: "Deploying ZenML in a Docker container." -icon: docker ---- - -The ZenML server container image is available at [zenmldocker/zenml-server](https://hub.docker.com/r/zenmldocker/zenml/) and can be used to deploy ZenML with a container management or orchestration tool like Docker and docker-compose, or a serverless platform like [Cloud Run](https://cloud.google.com/run), [Container Apps](https://learn.microsoft.com/en-us/azure/container-apps/overview), and more! This guide walks you through the various configuration options that the ZenML server container expects as well as a few deployment use cases. - -## Try it out locally first - -If you're just looking for a quick way to deploy the ZenML server using a container, without going through the hassle of interacting with a container management tool like Docker and manually configuring your container, you can use the ZenML CLI to do so. You only need to have Docker installed and running on your machine: - -```Bash -zenml up --docker -``` - -This command deploys a ZenML server locally in a Docker container, then connects your client to it. Similar to running plain `zenml up`, the server and the local ZenML client share the same SQLite database. - -The rest of this guide is addressed to advanced users who are looking to manually deploy and manage a containerized ZenML server. 
- -## ZenML server configuration options - -If you're planning on deploying a custom containerized ZenML server yourself, you probably need to configure some settings for it like the **database** it should use, the **default user details,** and more. The ZenML server container image uses sensible defaults, so you can simply start a container without worrying too much about the configuration. However, if you're looking to connect the ZenML server to an external MySQL database or secrets management service, to persist the internal SQLite database, or simply want to control other settings like the default account, you can do so by customizing the container's environment variables. - -The following environment variables can be passed to the container: - -* **ZENML\_STORE\_URL**: This URL should point to an SQLite database file _mounted in the container_, or to a MySQL-compatible database service _reachable from the container_. It takes one of these forms: - -``` -sqlite:////path/to/zenml.db -``` -or: - -``` -mysql://username:password@host:port/database -``` -* **ZENML\_STORE\_SSL\_CA**: This can be set to a custom server CA certificate in use by the MySQL database service. Only valid when `ZENML_STORE_URL` points to a MySQL database that uses SSL-secured connections. The variable can be set either to the path where the certificate file is mounted inside the container or to the certificate contents themselves. -* **ZENML\_STORE\_SSL\_CERT**: This can be set to a client SSL certificate required to connect to the MySQL database service. Only valid when `ZENML_STORE_URL` points to a MySQL database that uses SSL-secured connections and requires client SSL certificates. The variable can be set either to the path where the certificate file is mounted inside the container or to the certificate contents themselves. This variable also requires `ZENML_STORE_SSL_KEY` to be set. -* **ZENML\_STORE\_SSL\_KEY**: This can be set to a client SSL private key required to connect to the MySQL database service. Only valid when `ZENML_STORE_URL` points to a MySQL database that uses SSL-secured connections and requires client SSL certificates. The variable can be set either to the path where the certificate file is mounted inside the container or to the certificate contents themselves. This variable also requires `ZENML_STORE_SSL_CERT` to be set. -* **ZENML\_STORE\_SSL\_VERIFY\_SERVER\_CERT**: This boolean variable controls whether the SSL certificate in use by the MySQL server is verified. Only valid when `ZENML_STORE_URL` points to a MySQL database that uses SSL-secured connections. Defaults to `False`. -* **ZENML\_LOGGING\_VERBOSITY**: Use this variable to control the verbosity of logs inside the container. It can be set to one of the following values: `NOTSET`, `ERROR`, `WARN`, `INFO` (default), `DEBUG` or `CRITICAL`. -* **ZENML\_STORE\_BACKUP\_STRATEGY**: This variable controls the database backup strategy used by the ZenML server. See the [Database backup and recovery](/develop/getting-started/deploying-zenml/deploy-with-docker#database-backup-and-recovery) section for more details about this feature and other related environment variables. Defaults to `in-memory`. -* **ZENML\_SERVER\_RATE\_LIMIT\_ENABLED**: This variable controls the rate limiting for ZenML API (currently only for the `LOGIN` endpoint). It is disabled by default, so set it to `1` only if you need to enable rate limiting. 
To determine unique users, an `X_FORWARDED_FOR` header or `request.client.host` is used, so before enabling this make sure that your network configuration is associating proper information with your clients in order to avoid disruptions for legitimate requests. -* **ZENML\_SERVER\_LOGIN\_RATE\_LIMIT\_MINUTE**: If rate limiting is enabled, this variable controls how many requests will be allowed to query the login endpoint in a one minute interval. Set it to a desired integer value; defaults to `5`. -* **ZENML\_SERVER\_LOGIN\_RATE\_LIMIT\_DAY**: If rate limiting is enabled, this variable controls how many requests will be allowed to query the login endpoint in a one day interval. Set it to a desired integer value; defaults to `1000`. - -If none of the `ZENML_STORE_*` variables are set, the container will default to creating and using an SQLite database file stored at `/zenml/.zenconfig/local_stores/default_zen_store/zenml.db` inside the container. The `/zenml/.zenconfig/local_stores` base path where the default SQLite database is located can optionally be overridden by setting the `ZENML_LOCAL_STORES_PATH` environment variable to point to a different path (e.g. a persistent volume or directory that is mounted from the host). - -### Secret store environment variables - -Unless explicitly disabled or configured otherwise, the ZenML server will use the SQL database as [a secrets store backend](/develop/usage/project-setup/use-secrets/secret-management) where secret values are stored. If you want to use an external secrets management service like the AWS Secrets Manager, GCP Secrets Manager, Azure Key Vault, HashiCorp Vault or even your custom Secrets Store back-end implementation instead, you need to configure it explicitly using Docker environment variables. Depending on where you deploy your ZenML server and how your Kubernetes cluster is configured, you will also need to provide the credentials needed to access the secrets management service API. - -> **Important:** If you are updating the configuration of your ZenML Server container to use a different secrets store back-end or location, you should follow [the documented secrets migration strategy](/develop/usage/project-setup/use-secrets/secret-management#secrets-migration-strategy) to minimize downtime and to ensure that existing secrets are also properly migrated. - - - - The SQL database is used as the default secret store location. You only need to configure these options if you want to change the default behavior. - -It is particularly recommended to enable encryption at rest for the SQL database if you plan on using it as a secrets store backend. You'll have to configure the secret key used to encrypt the secret values. If not set, encryption will not be used and passwords will be stored unencrypted in the database. - -* **ZENML\_SECRETS\_STORE\_TYPE:** Set this to `sql` in order to explicitly set this type of secret store. -* **ZENML\_SECRETS\_STORE\_ENCRYPTION\_KEY**: the secret key used to encrypt all secrets stored in the SQL secrets store. It is recommended to set this to a random string with a length of at least 32 characters, e.g.: - -```python -from secrets import token_hex -token_hex(32) -``` -or: - -```Bash -openssl rand -hex 32 -``` - -> **Important:** If you configure encryption for your SQL database secrets store, you should keep the `ZENML_SECRETS_STORE_ENCRYPTION_KEY` value somewhere safe and secure, as it will always be required by the ZenML server to decrypt the secrets in the database.
If you lose the encryption key, you will not be able to decrypt the secrets in the database and will have to reset them. - - - -These configuration options are only relevant if you're using the AWS Secrets Manager as the secrets store backend. - -* **ZENML\_SECRETS\_STORE\_TYPE:** Set this to `aws` in order to set this type of secret store. - -The AWS Secrets Store uses the ZenML AWS Service Connector under the hood to authenticate with the AWS Secrets Manager API. This means that you can use any of the [authentication methods supported by the AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector#authentication-methods) to authenticate with the AWS Secrets Manager API. - -The minimum set of permissions that must be attached to the implicit or configured AWS credentials are: `secretsmanager:CreateSecret`, `secretsmanager:GetSecretValue`, `secretsmanager:DescribeSecret`, `secretsmanager:PutSecretValue`, `secretsmanager:TagResource` and `secretsmanager:DeleteSecret` and they must be associated with secrets that have a name starting with `zenml/` in the target region and account. The following IAM policy example can be used as a starting point: - -``` -{ - "Version": "2012-10-17", - "Statement": [ - { - "Sid": "ZenMLSecretsStore", - "Effect": "Allow", - "Action": [ - "secretsmanager:CreateSecret", - "secretsmanager:GetSecretValue", - "secretsmanager:DescribeSecret", - "secretsmanager:PutSecretValue", - "secretsmanager:TagResource", - "secretsmanager:DeleteSecret" - ], - "Resource": "arn:aws:secretsmanager:::secret:zenml/*" - } - ] -} -``` -The following configuration options are supported: - -* **ZENML\_SECRETS\_STORE\_AUTH\_METHOD**: The AWS Service Connector authentication method to use (e.g. `secret-key` or `iam-role`). -* **ZENML\_SECRETS\_STORE\_AUTH\_CONFIG**: The AWS Service Connector configuration, in JSON format (e.g. `{"aws_access_key_id":"","aws_secret_access_key":"","region":""}`). - -> **Note:** The remaining configuration options are deprecated and may be removed in a future release. Instead, you should set the `ZENML_SECRETS_STORE_AUTH_METHOD` and `ZENML_SECRETS_STORE_AUTH_CONFIG` variables to use the AWS Service Connector authentication method. - -* **ZENML\_SECRETS\_STORE\_REGION\_NAME**: The AWS region to use. This must be set to the region where the AWS Secrets Manager service that you want to use is located. -* **ZENML\_SECRETS\_STORE\_AWS\_ACCESS\_KEY\_ID**: The AWS access key ID to use for authentication. This must be set to a valid AWS access key ID that has access to the AWS Secrets Manager service that you want to use. If you are using an IAM role attached to an EKS cluster to authenticate, you can omit this variable. -* **ZENML\_SECRETS\_STORE\_AWS\_SECRET\_ACCESS\_KEY**: The AWS secret access key to use for authentication. This must be set to a valid AWS secret access key that has access to the AWS Secrets Manager service that you want to use. If you are using an IAM role attached to an EKS cluster to authenticate, you can omit this variable. - - - -These configuration options are only relevant if you're using the GCP Secrets Manager as the secrets store backend. - -* **ZENML\_SECRETS\_STORE\_TYPE:** Set this to `gcp` in order to set this type of secret store. - -The GCP Secrets Store uses the ZenML GCP Service Connector under the hood to authenticate with the GCP Secrets Manager API. 
This means that you can use any of the [authentication methods supported by the GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector#authentication-methods) to authenticate with the GCP Secrets Manager API. - -The minimum set of permissions that must be attached to the implicit or configured GCP credentials are as follows: - -* `secretmanager.secrets.create` for the target GCP project (i.e. no condition on the name prefix) -* `secretmanager.secrets.get`, `secretmanager.secrets.update`, `secretmanager.versions.access`, `secretmanager.versions.add` and `secretmanager.secrets.delete` for the target GCP project and for secrets that have a name starting with `zenml-` - -This can be achieved by creating two custom IAM roles and attaching them to the principal (e.g. user or service account) that will be used to access the GCP Secrets Manager API with a condition configured when attaching the second role to limit access to secrets with a name prefix of `zenml-`. The following `gcloud` CLI command examples can be used as a starting point: - -```bash - -gcloud iam roles create ZenMLServerSecretsStoreCreator \ - --project \ - --title "ZenML Server Secrets Store Creator" \ - --description "Allow the ZenML Server to create new secrets" \ - --stage GA \ - --permissions "secretmanager.secrets.create" - -gcloud iam roles create ZenMLServerSecretsStoreEditor \ - --project \ - --title "ZenML Server Secrets Store Editor" \ - --description "Allow the ZenML Server to manage its secrets" \ - --stage GA \ - --permissions "secretmanager.secrets.get,secretmanager.secrets.update,secretmanager.versions.access,secretmanager.versions.add,secretmanager.secrets.delete" - -gcloud projects add-iam-policy-binding \ - --member serviceAccount: \ - --role projects//roles/ZenMLServerSecretsStoreCreator \ - --condition None - -# NOTE: use the GCP project NUMBER, not the project ID in the condition -gcloud projects add-iam-policy-binding \ - --member serviceAccount: \ - --role projects//roles/ZenMLServerSecretsStoreEditor \ - --condition 'title=limit_access_zenml,description="Limit access to secrets with prefix zenml-",expression=resource.name.startsWith("projects//secrets/zenml-")' - -``` - -The following configuration options are supported: - -* **ZENML\_SECRETS\_STORE\_AUTH\_METHOD**: The GCP Service Connector authentication method to use (e.g. `service-account`). -* **ZENML\_SECRETS\_STORE\_AUTH\_CONFIG**: The GCP Service Connector configuration, in JSON format (e.g. `{"project_id":"my-project","service_account_json":{ ... }}`). - -> **Note:** The remaining configuration options are deprecated and may be removed in a future release. Instead, you should set the `ZENML_SECRETS_STORE_AUTH_METHOD` and `ZENML_SECRETS_STORE_AUTH_CONFIG` variables to use the GCP Service Connector authentication method. - -* **ZENML\_SECRETS\_STORE\_PROJECT\_ID**: The GCP project ID to use. This must be set to the project ID where the GCP Secrets Manager service that you want to use is located. -* **GOOGLE\_APPLICATION\_CREDENTIALS**: The path to the GCP service account credentials file to use for authentication. This must be set to a valid GCP service account credentials file that has access to the GCP Secrets Manager service that you want to use. If you are using a GCP service account attached to a GKE cluster to authenticate, you can omit this variable. NOTE: the path to the credentials file must be mounted into the container. 
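Putting these options together, a sketch of a server container configured to use the GCP Secrets Manager might look as follows. The project ID is a placeholder, the `docker run` pattern is the same one used later in this guide, and the full service account key JSON would go where the `{ ... }` placeholder is:

```Bash
# Point the ZenML server at the GCP Secrets Manager using the Service Connector method
docker run -it -d -p 8080:8080 --name zenml \
  --env ZENML_SECRETS_STORE_TYPE=gcp \
  --env ZENML_SECRETS_STORE_AUTH_METHOD=service-account \
  --env ZENML_SECRETS_STORE_AUTH_CONFIG='{"project_id":"my-project","service_account_json":{ ... }}' \
  zenmldocker/zenml-server
```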
- - -These configuration options are only relevant if you're using Azure Key Vault as the secrets store backend. - -* **ZENML\_SECRETS\_STORE\_TYPE:** Set this to `azure` in order to set this type of secret store. -* **ZENML\_SECRETS\_STORE\_KEY\_VAULT\_NAME**: The name of the Azure Key Vault. This must be set to point to the Azure Key Vault instance that you want to use. - -The Azure Secrets Store uses the ZenML Azure Service Connector under the hood to authenticate with the Azure Key Vault API. This means that you can use any of the [authentication methods supported by the Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector#authentication-methods) to authenticate with the Azure Key Vault API. The following configuration options are supported: - -* **ZENML\_SECRETS\_STORE\_AUTH\_METHOD**: The Azure Service Connector authentication method to use (e.g. `service-account`). -* **ZENML\_SECRETS\_STORE\_AUTH\_CONFIG**: The Azure Service Connector configuration, in JSON format (e.g. `{"tenant_id":"my-tenant-id","client_id":"my-client-id","client_secret": "my-client-secret"}`). - -> **Note:** The remaining configuration options are deprecated and may be removed in a future release. Instead, you should set the `ZENML_SECRETS_STORE_AUTH_METHOD` and `ZENML_SECRETS_STORE_AUTH_CONFIG` variables to use the Azure Service Connector authentication method. - -* **ZENML\_SECRETS\_STORE\_AZURE\_CLIENT\_ID**: The Azure application service principal client ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable. -* **ZENML\_SECRETS\_STORE\_AZURE\_CLIENT\_SECRET**: The Azure application service principal client secret to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable. -* **ZENML\_SECRETS\_STORE\_AZURE\_TENANT\_ID**: The Azure application service principal tenant ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable. - - -These configuration options are only relevant if you're using Hashicorp Vault as the secrets store backend. - -* **ZENML\_SECRETS\_STORE\_TYPE:** Set this to `hashicorp` in order to set this type of secret store. -* **ZENML\_SECRETS\_STORE\_VAULT\_ADDR**: The URL of the HashiCorp Vault server to connect to. NOTE: this is the same as setting the `VAULT_ADDR` environment variable. -* **ZENML\_SECRETS\_STORE\_VAULT\_TOKEN**: The token to use to authenticate with the HashiCorp Vault server. NOTE: this is the same as setting the `VAULT_TOKEN` environment variable. -* **ZENML\_SECRETS\_STORE\_VAULT\_NAMESPACE**: The Vault Enterprise namespace. Not required for Vault OSS. NOTE: this is the same as setting the `VAULT_NAMESPACE` environment variable. -* **ZENML\_SECRETS\_STORE\_MAX\_VERSIONS**: The maximum number of secret versions to keep for each Vault secret. If not set, the default value of 1 will be used (only the latest version will be kept). - - - These configuration options are only relevant if you're using a custom secrets store backend implementation. 
For this to work, you must have [a custom implementation of the secrets store API](/develop/getting-started/deploying-zenml/manage-the-deployed-services/custom-secret-stores) in the form of a class derived from `zenml.zen_stores.secrets_stores.base_secrets_store.BaseSecretsStore`. This class must be importable from within the ZenML server container, which means you most likely need to mount the directory containing the class into the container or build a custom container image that contains the class. - -The following configuration option is required: - -* **ZENML\_SECRETS\_STORE\_TYPE:** Set this to `custom` in order to set this type of secret store. -* **ZENML\_SECRETS\_STORE\_CLASS\_PATH**: The fully qualified path to the class that implements the custom secrets store API (e.g. `my_package.my_module.MySecretsStore`). - -If your custom secrets store implementation requires additional configuration options, you can pass them as environment variables using the following naming convention: - -* `ZENML_SECRETS_STORE_`: The name of the option to pass to the custom secrets store class. The option name must be in uppercase and any hyphens (`-`) must be replaced with underscores (`_`). ZenML will automatically convert the environment variable name to the corresponding option name by removing the prefix and converting the remaining characters to lowercase. For example, the environment variable `ZENML_SECRETS_STORE_MY_OPTION` will be converted to the option name `my_option` and passed to the custom secrets store class configuration. - - - - -**ZENML\_SECRETS\_STORE\_TYPE**: Set this variable to `none` to disable the secrets store functionality altogether. - - -#### Backup secrets store - -[A backup secrets store](/develop/usage/project-setup/use-secrets/secret-management#backup-secrets-store) back-end may be configured for high-availability and backup purposes, or as an intermediate step in the process of [migrating secrets to a different external location or secrets manager provider](/develop/usage/project-setup/use-secrets/secret-management#secrets-migration-strategy). - -To configure a backup secrets store in the Docker container, use the same approach and instructions documented for the primary secrets store, but set the `ZENML_BACKUP_SECRETS_STORE_*` environment variables instead of `ZENML_SECRETS_STORE_*`, e.g.: - -```Bash -ZENML_BACKUP_SECRETS_STORE_TYPE: aws - -ZENML_BACKUP_SECRETS_STORE_AUTH_METHOD: secret-key - -ZENML_BACKUP_SECRETS_STORE_AUTH_CONFIG: '{"aws_access_key_id":"","aws_secret_access_key":"","role_arn":""}' -``` - -### Advanced server configuration options - -These configuration options are not required for most use cases, but can be useful in certain scenarios that require mirroring the same ZenML server configuration across multiple container instances (e.g. a Kubernetes deployment with multiple replicas): - -* **ZENML\_SERVER\_JWT\_SECRET\_KEY**: This is a secret key used to sign JWT tokens used for authentication. If not explicitly set, a random key is generated automatically by the server on startup and stored in the server's global configuration. This should be set to a random string with a recommended length of at least 32 characters, e.g.: - -```python -from secrets import token_hex -token_hex(32) -``` -or: - -```Bash -openssl rand -hex 32 -``` - -The environment variables starting with _ZENML\_SERVER\_SECURE\_HEADERS\__\* can be used to enable, disable or set custom values for security headers in the ZenML server's HTTP responses.
The following values can be set for any of the supported secure headers configuration options: - -* `enabled`, `on`, `true` or `yes` \- enables the secure header with the default value. -* `disabled`, `off`, `false`, `none` or `no` \- disables the secure header entirely, so that it is not set in the ZenML server's HTTP responses. -* any other value - sets the secure header to the specified value. - -The following secure headers environment variables are supported: - -* **ZENML\_SERVER\_SECURE\_HEADERS\_SERVER**: The `Server` HTTP header value used to identify the server. The default value is the ZenML server ID. -* **ZENML\_SERVER\_SECURE\_HEADERS\_HSTS**: The `Strict-Transport-Security` HTTP header value. The default value is `max-age=63072000; includeSubDomains`. -* **ZENML\_SERVER\_SECURE\_HEADERS\_XFO**: The `X-Frame-Options` HTTP header value. The default value is `SAMEORIGIN`. -* **ZENML\_SERVER\_SECURE\_HEADERS\_XXP**: The `X-XSS-Protection` HTTP header value. The default value is `0`. NOTE: this header is deprecated and should not be customized anymore. The `Content-Security-Policy` header should be used instead. -* **ZENML\_SERVER\_SECURE\_HEADERS\_CONTENT**: The `X-Content-Type-Options` HTTP header value. The default value is `nosniff`. -* **ZENML\_SERVER\_SECURE\_HEADERS\_CSP**: The `Content-Security-Policy` HTTP header value. This is by default set to a strict CSP policy that only allows content from the origins required by the ZenML dashboard. NOTE: customizing this header is discouraged, as it may cause the ZenML dashboard to malfunction. -* **ZENML\_SERVER\_SECURE\_HEADERS\_REFERRER**: The `Referrer-Policy` HTTP header value. The default value is `no-referrer-when-downgrade`. -* **ZENML\_SERVER\_SECURE\_HEADERS\_CACHE**: The `Cache-Control` HTTP header value. The default value is `no-store, no-cache, must-revalidate`. -* **ZENML\_SERVER\_SECURE\_HEADERS\_PERMISSIONS**: The `Permissions-Policy` HTTP header value. The default value is `accelerometer=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=()`. - -If you prefer to activate the server automatically during the initial deployment and also automate the creation of the initial admin user account, this legacy behavior can be brought back by setting the following environment variables: - -* **ZENML\_SERVER\_AUTO\_ACTIVATE**: Set this to `1` to automatically activate the server and create the initial admin user account when the server is first deployed. Defaults to `0`. -* **ZENML\_DEFAULT\_USER\_NAME**: The name of the initial admin user account created by the server on the first deployment, during database initialization. Defaults to `default`. -* **ZENML\_DEFAULT\_USER\_PASSWORD**: The password to use for the initial admin user account. Defaults to an empty password value, if not set. - -## Run the ZenML server with Docker - -As previously mentioned, the ZenML server container image uses sensible defaults for most configuration options. This means that you can simply run the container with Docker without any additional configuration and it will work out of the box for most use cases: - -```shell -docker run -it -d -p 8080:8080 --name zenml zenmldocker/zenml-server -``` - -> **Note:** It is recommended to use a ZenML container image version that matches the version of your client, to avoid any potential API incompatibilities (e.g. `zenmldocker/zenml-server:0.21.1` instead of `zenmldocker/zenml-server`). 
- -The above command will start a containerized ZenML server running on your machine that uses a temporary SQLite database file stored in the container. Temporary means that the database and all its contents (stacks, pipelines, pipeline runs, etc.) will be lost when the container is removed with `docker rm`. - -You need to visit the ZenML dashboard at `http://localhost:8080` and activate the server by creating an initial admin user account. You can then connect your client to the server with the web login flow: - -```shell -$ zenml connect --url http://localhost:8080 - -Connecting to: 'http://localhost:8080'... - -If your browser did not open automatically, please open the following URL into your browser to proceed with the authentication: - -http://localhost:8080/devices/verify?device_id=f7a7333a-3ef0-4f39-85a9-f190279456d3&user_code=9375f5cdfdaf36772ce981fe3ee6172c - -Successfully logged in. - -Creating default stack for user 'default'... - -Updated the global store configuration. -``` - - -The `localhost` URL **will** work, even if you are using Docker-backed ZenML orchestrators in your stack, like [the local Docker orchestrator](/develop/stack-components/orchestrators/local-docker) or [a locally deployed Kubeflow orchestrator](/develop/stack-components/orchestrators/kubeflow). - -ZenML makes use of specialized DNS entries such as `host.docker.internal` and `host.k3d.internal` to make the ZenML server accessible from the pipeline steps running inside other Docker containers on the same machine. - - -You can manage the container with the usual Docker commands: - -* `docker logs zenml` to view the server logs -* `docker stop zenml` to stop the server -* `docker start zenml` to start the server again -* `docker rm zenml` to remove the container - -If you are looking for a customized ZenML server Docker deployment, you can configure one or more of [the supported environment variables](/develop/getting-started/deploying-zenml/deploy-with-docker#zenml-server-configuration-options) and then pass them to the container using the `docker run` `--env` or `--env-file` arguments (see the [Docker documentation](https://docs.docker.com/engineversions/0.66.0/reference/commandline/run/#set-environment-variables--e---env---env-file) for more details). For example: - -```shell -docker run -it -d -p 8080:8080 --name zenml \ - --env ZENML_STORE_URL=mysql://username:password@host:port/database \ - zenmldocker/zenml-server -``` - -If you're looking for a quick way to run both the ZenML server and a MySQL database with Docker, you can [deploy the ZenML server with Docker Compose](/develop/getting-started/deploying-zenml/deploy-with-docker#zenml-server-with-docker-compose). - -The rest of this guide covers various advanced use cases for running the ZenML server with Docker. - -### Persisting the SQLite database - -Depending on your use case, you may also want to mount a persistent volume or directory from the host into the container to store the ZenML SQLite database file. This can be done using the `--mount` flag (see the [Docker documentation](https://docs.docker.com/storage/volumes/) for more details). For example: - -```shell -mkdir zenml-server - -docker run -it -d -p 8080:8080 --name zenml \ - --mount type=bind,source=$PWD/zenml-server,target=/zenml/.zenconfig/local_stores/default_zen_store \ - zenmldocker/zenml-server -``` - -This deployment has the advantage that the SQLite database file is persisted even when the container is removed with `docker rm`.
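A Docker named volume works just as well if you prefer not to bind-mount a host directory; this is a generic Docker sketch rather than a ZenML-specific requirement:

```Bash
# Create a named volume and use it to persist the server's SQLite database
docker volume create zenml-data

docker run -it -d -p 8080:8080 --name zenml \
  --mount type=volume,source=zenml-data,target=/zenml/.zenconfig/local_stores/default_zen_store \
  zenmldocker/zenml-server
```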
- -### Docker MySQL database - -As a recommended alternative to the SQLite database, you can run a MySQL database service as another Docker container and connect the ZenML server container to it. - -A command like the following can be run to start the containerized MySQL database service: - -```shell -docker run --name mysql -d -p 3306:3306 -e MYSQL_ROOT_PASSWORD=password mysql:8.0 -``` - -If you also wish to persist the MySQL database data, you can mount a persistent volume or directory from the host into the container using the `--mount` flag, e.g.: - -```shell -mkdir mysql-data -docker run --name mysql -d -p 3306:3306 -e MYSQL_ROOT_PASSWORD=password \ - --mount type=bind,source=$PWD/mysql-data,target=/var/lib/mysql \ - mysql:8.0 -``` - -Configuring the ZenML server container to connect to the MySQL database is just a matter of setting the `ZENML_STORE_URL` environment variable. We use the special `host.docker.internal` DNS name that is resolved from within the Docker containers to the gateway IP address used by the Docker network (see the [Docker documentation](https://docs.docker.com/desktop/networking/#use-cases-and-workarounds-for-all-platforms) for more details). On Linux, this needs to be explicitly enabled in the `docker run` command with the `--add-host` argument: - -```shell -docker run -it -d -p 8080:8080 --name zenml \ - --add-host host.docker.internal:host-gateway \ - --env ZENML_STORE_URL=mysql://root:password@host.docker.internal/zenml \ - zenmldocker/zenml-server -``` - -You need to visit the ZenML dashboard at `http://localhost:8080` and activate the server by creating an initial admin user account. You can then connect your client to the server with the web login flow: - -```shell -zenml connect --url http://localhost:8080 -``` - -### Direct MySQL database connection - -This scenario is similar to the previous one, but instead of running a ZenML server, the client is configured to connect directly to a MySQL database running in a Docker container. - -As previously covered, the containerized MySQL database service can be started with a command like the following: - -```shell -docker run --name mysql -d -p 3306:3306 -e MYSQL_ROOT_PASSWORD=password mysql:8.0 -``` - -The ZenML client on the host machine can then be configured to connect directly to the database with a slightly different `zenml connect` command: - -```shell -zenml connect --url mysql://127.0.0.1/zenml --username root --password password -``` - -> **Note** The `localhost` hostname will not work with MySQL databases. You need to use the `127.0.0.1` IP address instead. - -### ZenML server with `docker-compose` - -Docker compose offers a simpler way of managing multi-container setups on your local machine, which is the case for instance if you are looking to deploy the ZenML server container and connect it to a MySQL database service also running in a Docker container. - -To use Docker Compose, you need to [install the docker-compose plugin](https://docs.docker.com/compose/install/linux/) on your machine first. 
- -A `docker-compose.yml` file like the one below can be used to start and manage the ZenML server container and the MySQL database service all at once: - - - -```yaml -version: "3.9" - -services: - mysql: - image: mysql:8.0 - ports: - - 3306:3306 - environment: - - MYSQL_ROOT_PASSWORD=password - zenml: - image: zenmldocker/zenml-server - ports: - - "8080:8080" - environment: - - ZENML_STORE_URL=mysql://root:password@host.docker.internal/zenml - links: - - mysql - depends_on: - - mysql - extra_hosts: - - "host.docker.internal:host-gateway" - restart: on-failure -``` - -Note the following: - -* `ZENML_STORE_URL` is set to the special Docker `host.docker.internal` hostname to instruct the server to connect to the database over the Docker network. -* The `extra_hosts` section is needed on Linux to make the `host.docker.internal` hostname resolvable from the ZenML server container. - -To start the containers, run the following command from the directory where the `docker-compose.yml` file is located: - -```Bash -docker compose -p zenml up -d -``` - -or, if you need to use a different filename or path: - -```Bash -docker compose -f /path/to/docker-compose.yml -p zenml up -d -``` - -You need to visit the ZenML dashboard at `http://localhost:8080` to activate the server by creating an initial admin account. You can then connect your client to the server with the web login flow: - -```Bash -zenml connect --url http://localhost:8080 -``` - -Tearing down the installation is as simple as running: - -```Bash -docker compose -p zenml down -``` - -## Database backup and recovery - -An automated database backup and recovery feature is enabled by default for all Docker deployments. The ZenML server will automatically back up the database in-memory before every database schema migration and restore it if the migration fails. - - -The database backup automatically created by the ZenML server is only temporary and only used as an immediate recovery in case of database migration failures. It is not meant to be used as a long-term backup solution. If you need to back up your database for long-term storage, you should use a dedicated backup solution. - -Several database backup strategies are supported, depending on where and how the backup is stored. The strategy can be configured by means of the `ZENML_STORE_BACKUP_STRATEGY` environment variable: - -* `disabled` \- no backup is performed -* `in-memory` \- the database schema and data are stored in memory. This is the fastest backup strategy, but the backup is not persisted across container restarts, so no manual intervention is possible in case the automatic DB recovery fails after a failed DB migration. Adequate memory resources should be allocated to the ZenML server container when using this backup strategy with larger databases. This is the default backup strategy. -* `database` \- the database is copied to a backup database in the same database server. This requires the `ZENML_STORE_BACKUP_DATABASE` environment variable to be set to the name of the backup database. This backup strategy is only supported for MySQL compatible databases and the user specified in the database URL must have permissions to manage (create, drop, and modify) the backup database in addition to the main database. -* `dump-file` \- the database schema and data are dumped to a filesystem location inside the ZenML server container. This location can be customized by means of the `ZENML_STORE_BACKUP_DIRECTORY` environment variable.
When this strategy is configured, users should mount a host directory in the container and point the `ZENML_STORE_BACKUP_DIRECTORY` variable to where it's mounted inside the container. If a host directory is not mounted, the dump file will be stored in the container's filesystem and will be lost when the container is removed. - -The following additional rules are applied concerning the creation and lifetime of the backup: - -* a backup is not attempted if the database doesn't need to undergo a migration (e.g. when the ZenML server is upgraded to a new version that doesn't require a database schema change or if the ZenML version doesn't change at all). -* a backup file or database is created before every database migration attempt (i.e. when the container starts). If a backup already exists (i.e. persisted in a mounted host directory or backup database), it is overwritten. -* the persistent backup file or database is cleaned up after the migration is completed successfully or if the database doesn't need to undergo a migration. This includes backups created by previous failed migration attempts. -* the persistent backup file or database is NOT cleaned up after a failed migration. This allows the user to manually inspect and/or apply the backup if the automatic recovery fails. - -The following example shows how to deploy the ZenML server to use a mounted host directory to persist the database backup file during a database migration: - -```Bash -mkdir mysql-data - -docker run --name mysql -d -p 3306:3306 -e MYSQL_ROOT_PASSWORD=password \ - --mount type=bind,source=$PWD/mysql-data,target=/var/lib/mysql \ - mysql:8.0 - -docker run -it -d -p 8080:8080 --name zenml \ - --add-host host.docker.internal:host-gateway \ - --mount type=bind,source=$PWD/mysql-data,target=/db-dump \ - --env ZENML_STORE_URL=mysql://root:password@host.docker.internal/zenml \ - --env ZENML_STORE_BACKUP_STRATEGY=dump-file \ - --env ZENML_STORE_BACKUP_DIRECTORY=/db-dump \ - zenmldocker/zenml-server -``` - -## Troubleshooting - -You can check the logs of the container to verify if the server is up and, depending on where you have deployed it, you can also access the dashboard at a `localhost` port (if running locally) or through some other service that exposes your container to the internet. - -### CLI Docker deployments - -If you used the `zenml up --docker` CLI command to deploy the Docker ZenML server, you can check the logs with the command: - -```Bash -zenml logs -f -``` - -### Manual Docker deployments - -If you used the `docker run` command to manually deploy the Docker ZenML server, you can check the logs with the command: - -```Bash -docker logs zenml -f -``` - -If you used the `docker compose` command to manually deploy the Docker ZenML server, you can check the logs with the command: - -```Bash -docker compose -p zenml logs -f -``` diff --git a/develop/getting-started/deploying-zenml/deploy-with-helm.mdx b/develop/getting-started/deploying-zenml/deploy-with-helm.mdx deleted file mode 100644 index 7f8af41..0000000 --- a/develop/getting-started/deploying-zenml/deploy-with-helm.mdx +++ /dev/null @@ -1,690 +0,0 @@ ---- -title: "Deploy with Helm" -description: "Deploying ZenML in a Kubernetes cluster with Helm." -icon: face-smile ---- - -If you wish to manually deploy and manage ZenML in a Kubernetes cluster of your choice, ZenML also includes a Helm chart among its available deployment options. 
- -You can find the chart on this [ArtifactHub repository](https://artifacthub.io/packages/helm/zenml/zenml), along with the templates, default values and instructions on how to install it. Read on to find detailed explanations on prerequisites, configuration, and deployment scenarios. - -## Prerequisites - -You'll need the following: - -* A Kubernetes cluster -* Optional, but recommended: a MySQL-compatible database reachable from the Kubernetes cluster (e.g. one of the managed databases offered by Google Cloud, AWS, or Azure). A MySQL server version of 8.0 or higher is required -* the [Kubernetes client](https://kubernetes.io/docs/tasks/tools/#kubectl) already installed on your machine and configured to access your cluster -* [Helm](https://helm.sh/docs/intro/install/) installed on your machine -* Optional: an external Secrets Manager service (e.g. one of the managed secrets management services offered by Google Cloud, AWS, Azure, or HashiCorp Vault). By default, ZenML stores secrets inside the SQL database that it's connected to, but you also have the option of using an external cloud Secrets Manager service if you already happen to use one of those cloud or service providers - -## ZenML Helm Configuration - -You can start by taking a look at the [values.yaml file](https://artifacthub.io/packages/helm/zenml/zenml?modal=values) and familiarize yourself with some of the configuration settings that you can customize for your ZenML deployment. - -In addition to tools and infrastructure, you will also need to collect and [prepare information related to your database](/develop/getting-started/deploying-zenml/deploy-with-helm#collect-information-from-your-sql-database-service) and [information related to your external secrets management service](/develop/getting-started/deploying-zenml/deploy-with-helm#collect-information-from-your-secrets-management-service) to be used for the Helm chart configuration and you may also want to install additional [optional services in your cluster](/develop/getting-started/deploying-zenml/deploy-with-helm#optional-cluster-services). - -When you are ready, you can proceed to the [installation](/develop/getting-started/deploying-zenml/deploy-with-helm#zenml-helm-installation) section. - -### Collect information from your SQL database service - -Using an external MySQL-compatible database service is optional, but is recommended for production deployments. If omitted, ZenML will default to using an embedded SQLite database, which has the following limitations: - -* the SQLite database is not persisted, meaning that it will be lost if the ZenML server pod is restarted or deleted -* the SQLite database does not scale horizontally, meaning that you will not be able to use more than one replica at a time for the ZenML server pod - -If you decide to use an external MySQL-compatible database service, you will need to collect and prepare the following information for the Helm chart configuration: - -* the hostname and port where the SQL database is reachable from the Kubernetes cluster -* the username and password that will be used to connect to the database. It is recommended that you create a dedicated database user for the ZenML server and that you restrict its privileges to only access the database that will be used by ZenML. Enforcing secure SSL connections for the user/database is also recommended. See the [MySQL documentation](https://dev.mysql.com/doc/refman/5.7/en/access-control.html) for more information on how to set up users and privileges. 
-* the name of the database that will be used by ZenML. The database does not have to exist prior to the deployment ( ZenML will create it on the first start). However, you need to create the database if you follow the best practice of restricting database user privileges to only access it. -* if you plan on using SSL to secure the client database connection, you may also need to prepare additional SSL certificates and keys: - * the TLS CA certificate that was used to sign the server TLS certificate, if you're using a self-signed certificate or signed by a custom certificate authority that is not already trusted by default by most operating systems. - * the TLS client certificate and key. This is only needed if you decide to use client certificates for your DB connection (some managed DB services support this, CloudSQL is an example). - -### Collect information from your secrets management service - -Using an externally managed secrets management service like those offered by Google Cloud, AWS, Azure or HashiCorp Vault is optional, but is recommended if you are already using those cloud service providers. If omitted, ZenML will default to using the SQL database to store secrets. - -If you decide to use an external secrets management service, you will need to collect and prepare the following information for the Helm chart configuration (for supported back-ends only): - -For the AWS secrets manager: - -* the AWS region that you want to use to store your secrets -* an AWS access key ID and secret access key that provides full access to the AWS secrets manager service. You can create a dedicated IAM user for this purpose, or use an existing user with the necessary permissions. If you deploy the ZenML server in an EKS Kubernetes cluster that is already configured to use implicit authorization with an IAM role for service accounts, you can omit this step. - -For the Google Cloud secrets manager: - -* the Google Cloud project ID that you want to use to store your secrets -* a Google Cloud service account that has access to the secrets manager service. You can create a dedicated service account for this purpose, or use an existing service account with the necessary permissions. - -For the Azure Key Vault: - -* the name of the Azure Key Vault that you want to use to store your secrets -* the Azure tenant ID, client ID, and client secret associated with the Azure service principal that will be used to access the Azure Key Vault. You can create a dedicated application service principal for this purpose, or use an existing service principal with the necessary permissions. If you deploy the ZenML server in an AKS Kubernetes cluster that is already configured to use implicit authorization through the Azure-managed identity service, you can omit this step. - -For the HashiCorp Vault: - -* the URL of the HashiCorp Vault server -* the token that will be used to access the HashiCorp Vault server. - -### Optional cluster services - -It is common practice to install additional infrastructure-related services in a Kubernetes cluster to support the deployment and long-term management of applications. For example: - -* an Ingress service like [nginx-ingress](https://kubernetes.github.io/ingress-nginx/deploy/) is recommended if you want to expose HTTP services to the internet. An Ingress is required if you want to use secure HTTPS for your ZenML deployment. The alternative is to use a LoadBalancer service to expose the ZenML service using plain HTTP, but this is not recommended for production. 
-* a [cert-manager](https://cert-manager.io/docs/installation/) is recommended if you want to generate and manage TLS certificates for your ZenML deployment. It can be used to automatically provision TLS certificates from a certificate authority (CA) of your choice, such as [Let's Encrypt](https://letsencrypt.org/). As an alternative, the ZenML Helm chart can be configured to auto-generate self-signed certificates, or you can generate the certificates yourself and provide them to the Helm chart, but this makes it more difficult to manage the certificates and you need to manually renew them when they expire.
-
-## ZenML Helm Installation
-
-### Configure the Helm chart
-
-To use the Helm chart with custom values that include paths to files like the database SSL certificates, you need to pull the chart to your local directory first. You can do this with the following command:
-
-```Bash
-helm pull oci://public.ecr.aws/zenml/zenml --version --untar
-```
-
-Next, to customize the Helm chart for your deployment, you should create a copy of the `values.yaml` file that you can find at `./zenml/values.yaml` (let’s call this `custom-values.yaml`). You’ll use this as a template to customize your configuration. You should simply remove any values that you don’t override from your `custom-values.yaml` file to keep it clean and compatible with future Helm chart releases.
-
-In most cases, you’ll need to change the following configuration values in `custom-values.yaml`:
-
-* the database configuration, if you mean to use an external database:
-  * the database URL, formatted as `mysql://:@:/`
-  * CA and/or client TLS certificates, if you’re using SSL to secure the connection to the database
-* the Ingress configuration, if enabled:
-  * enabling TLS
-  * enabling self-signed certificates
-  * configuring the hostname that will be used to access the ZenML server, if different from the IP address or hostname associated with the Ingress service installed in your cluster
-
-> **Note** All the file paths that you use in your helm chart (e.g. for certificates like `database.sslCa`) must be relative to the `./zenml` helm chart directory, meaning that you also have to copy these files there.
-
-### Install the Helm chart
-
-Once everything is configured, you can run the following command in the `./zenml` folder to install the Helm chart.
-
-```
-helm -n install zenml-server . --create-namespace --values custom-values.yaml
-```
-
-### Connect to the deployed ZenML server
-
-Immediately after deployment, the ZenML server needs to be activated before it can be used. The activation process includes creating an initial admin user account and configuring some server settings. You can do this only by visiting the ZenML server URL in your browser and following the on-screen instructions. Connecting your local ZenML client to the server is not possible until the server is properly initialized.
-
-The Helm chart should print out a message with the URL of the deployed ZenML server. You can use the URL to open the ZenML UI in your browser.
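If the post-install notes have scrolled out of your terminal, you can usually recover the server URL from the release itself. The commands below are only a sketch and assume the release name `zenml-server` and a namespace also called `zenml-server` (adjust both to whatever you used in the install command above):

```Bash
# Re-print the Helm release notes, which include the ZenML server URL
helm -n zenml-server get notes zenml-server

# Or inspect the Ingress / Service resources directly to find the exposed address
kubectl -n zenml-server get ingress
kubectl -n zenml-server get svc
```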
-
-To connect your local client to the ZenML server, you can either pass the configuration as command line arguments or as a YAML file:
-
-```Bash
-zenml connect --url=https://zenml.example.com:8080 --no-verify-ssl
-```
-
-or
-
-```Bash
-zenml connect --config=/path/to/zenml_server_config.yaml
-```
-
-The YAML file should have the following structure when connecting to a ZenML server:
-
-```Bash
-url:
-
-verify_ssl: |
-
-```
-
-Example of a ZenML server YAML configuration file:
-
-```Bash
-url: https://ac8ef63af203226194a7725ee71d85a-7635928635.us-east-1.elb.amazonaws.com/zenml
-
-verify_ssl: |
-
-  -----BEGIN CERTIFICATE-----
-
-...
-
-  -----END CERTIFICATE-----
-```
-
-To disconnect from the current ZenML server and revert to using the local default database, use the following command:
-
-```Bash
-zenml disconnect
-```
-
-## ZenML Helm Deployment Scenarios
-
-This section covers some common Helm deployment scenarios for ZenML.
-
-### Minimal deployment
-
-The example below is a minimal configuration for a ZenML server deployment that uses a temporary SQLite database and a ClusterIP service that is not exposed to the internet:
-
-```Bash
-zenml:
-
-  ingress:
-
-    enabled: false
-```
-
-Once deployed, you have to use port-forwarding to access the ZenML server and to connect to it from your local machine:
-
-```Bash
-kubectl -n zenml-server port-forward svc/zenml-server 8080:8080
-
-zenml connect --url=http://localhost:8080
-```
-
-This simple example is only fit for testing and evaluation purposes. For production deployments, you should use an external database and an Ingress service with TLS certificates to secure and expose the ZenML server to the internet.
-
-### Basic deployment with local database
-
-This deployment use-case still uses a local database, but it exposes the ZenML server to the internet using an Ingress service with TLS certificates generated by the cert-manager and signed by Let's Encrypt.
-
-First, you need to install cert-manager and nginx-ingress in your Kubernetes cluster. You can use the following commands to install them with their default configuration:
-
-```Bash
-helm repo add jetstack https://charts.jetstack.io
-helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
-helm repo update
-helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true
-helm install nginx-ingress ingress-nginx/ingress-nginx --namespace nginx-ingress --create-namespace
-```
-
-Next, you need to create a ClusterIssuer resource that will be used by cert-manager to generate TLS certificates with Let's Encrypt:
-
-```Bash
-cat <<EOF | kubectl apply -f -
-apiVersion: cert-manager.io/v1
-kind: ClusterIssuer
-metadata:
-  name: letsencrypt-staging
-spec:
-  acme:
-    server: https://acme-staging-v02.api.letsencrypt.org/directory
-    email:
-    privateKeySecretRef:
-      name: letsencrypt-staging
-    solvers:
-      - http01:
-          ingress:
-            class: nginx
-EOF
-```
-
-Finally, you can deploy the ZenML server with the following Helm values:
-
-```Bash
-zenml:
-
-  ingress:
-    enabled: true
-    annotations:
-      cert-manager.io/cluster-issuer: "letsencrypt-staging"
-    tls:
-      enabled: true
-      generateCerts: false
-```
-
-> **Note** This use-case exposes ZenML at the root URL path of the IP address or hostname of the Ingress service. You cannot share the same Ingress hostname and URL path for multiple applications. See the next section for a solution to this problem.
-
-### Shared Ingress controller
-
-If the root URL path of your Ingress controller is already in use by another application, you cannot use it for ZenML. This section presents three possible solutions to this problem.
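Before picking one of the approaches below, it can help to see which hosts and URL paths are already claimed on your Ingress controller. A quick, read-only check (assuming your `kubectl` context has cluster-wide read access) is:

```Bash
# List every Ingress resource together with the hosts and addresses it claims
kubectl get ingress --all-namespaces
```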
- -#### Use a dedicated Ingress hostname for ZenML - -If you know the IP address of the load balancer in use by your Ingress controller, you can use a service like https://nip.io/ to create a new DNS name associated with it and expose ZenML at this new root URL path. For example, if your Ingress controller has the IP address `192.168.10.20`, you can use a DNS name like `zenml.192.168.10.20.nip.io` to expose ZenML at the root URL path `https://zenml.192.168.10.20.nip.io`. - -To find the IP address of your Ingress controller, you can use a command like the following: - -``` -kubectl -n nginx-ingress get svc nginx-ingress-ingress-nginx-controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}' -``` - -You can deploy the ZenML server with the following Helm values: - - - -```BASH -zenml: - - ingress: - enabled: true - annotations: - cert-manager.io/cluster-issuer: "letsencrypt-staging" - host: zenml..nip.io - tls: - enabled: true - generateCerts: false -``` - -> **Note** This method does not work if your Ingress controller is behind a load balancer that uses a hostname mapped to several IP addresses instead of an IP address. - -#### Use a dedicated Ingress URL path for ZenML - -If you cannot use a dedicated Ingress hostname for ZenML, you can use a dedicated Ingress URL path instead. For example, you can expose ZenML at the URL path `https:///zenml`. - -To deploy the ZenML server with a dedicated Ingress URL path, you can use the following Helm values: - -```Bash -zenml: - - ingress: - enabled: true - annotations: - cert-manager.io/cluster-issuer: "letsencrypt-staging" - nginx.ingress.kubernetes.io/rewrite-target: /$1 - path: /zenml/?(.*) - tls: - enabled: true - generateCerts: false -``` - -> **Note** This method has one current limitation: the ZenML UI does not support URL rewriting and will not work properly if you use a dedicated Ingress URL path. You can still connect your client to the ZenML server and use it to run pipelines as usual, but you will not be able to use the ZenML UI. - -#### Use a DNS service to map a different hostname to the Ingress controller - -This method requires you to configure a DNS service like AWS Route 53 or Google Cloud DNS to map a different hostname to the Ingress controller. For example, you can map the hostname `zenml.` to the Ingress controller's IP address or hostname. Then, simply use the new hostname to expose ZenML at the root URL path. - -### Secret Store configuration - -Unless explicitly disabled or configured otherwise, the ZenML server will use the SQL database as [a secrets store backend](/develop/usage/project-setup/use-secrets/secret-management) where secret values are stored. If you want to use an external secrets management service like the AWS Secrets Manager, GCP Secrets Manager, Azure Key Vault, HashiCorp Vault or even your custom Secrets Store back-end implementation instead, you need to configure it in the Helm values. Depending on where you deploy your ZenML server and how your Kubernetes cluster is configured, you will also need to provide the credentials needed to access the secrets management service API. - -> **Important:** If you are updating the configuration of your ZenML Server deployment to use a different secrets store back-end or location, you should follow [the documented secrets migration strategy](/develop/usage/project-setup/use-secrets/secret-management#secrets-migration-strategy) to minimize downtime and to ensure that existing secrets are also properly migrated. 
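Before switching back-ends, it can also be useful to snapshot the Helm values currently applied to the release so that you have a known-good configuration to fall back to. This is just a sketch and assumes the `zenml-server` release and namespace used elsewhere in this guide:

```Bash
# Save the currently applied values before changing the secrets store configuration
helm -n zenml-server get values zenml-server > pre-migration-values.yaml
```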
- - - -**Using the SQL database as a secrets store backend (default)** - -The SQL database is used as the default location where the ZenML secrets store keeps the secret values. You only need to configure these options if you want to change the default behavior. - -It is particularly recommended to enable encryption at rest for the SQL database if you plan on using it as a secrets store backend. You'll have to configure the secret key used to encrypt the secret values. If not set, encryption will not be used and passwords will be stored unencrypted in the database. This value should be set to a random string with a recommended length of at least 32 characters, e.g.: - -* generate a random string with Python: - -```Bash -from secrets import token_hex - -token_hex(32) -``` - -* or with OpenSSL: - -```Bash -openssl rand -hex 32 -``` - -* then configure it in the Helm values: - -```Bash - zenml: - - # ... - - # Secrets store settings. This is used to store centralized secrets. - secretsStore: - - # The type of the secrets store - type: sql - - # Configuration for the SQL secrets store - sql: - encryptionKey: 0f00e4282a3181be32c108819e8a860a429b613e470ad58531f0730afff64545 -``` - -> **Important:** If you configure encryption for your SQL database secrets store, you should keep the `encryptionKey` value somewhere safe and secure, as it will always be required by the ZenML Server to decrypt the secrets in the database. If you lose the encryption key, you will not be able to decrypt the secrets anymore and will have to reset them. - - - -**Using the AWS Secrets Manager as a secrets store backend** - -The AWS Secrets Store uses the ZenML AWS Service Connector under the hood to authenticate with the AWS Secrets Manager API. This means that you can use any of the [authentication methods supported by the AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector#authentication-methods) to authenticate with the AWS Secrets Manager API. - -The minimum set of permissions that must be attached to the implicit or configured AWS credentials are: `secretsmanager:CreateSecret`, `secretsmanager:GetSecretValue`, `secretsmanager:DescribeSecret`, `secretsmanager:PutSecretValue`, `secretsmanager:TagResource` and `secretsmanager:DeleteSecret` and they must be associated with secrets that have a name starting with `zenml/` in the target region and account. The following IAM policy example can be used as a starting point: - -```JSON -{ - "Version": "2012-10-17", - "Statement": [ - { - "Sid": "ZenMLSecretsStore", - "Effect": "Allow", - "Action": [ - "secretsmanager:CreateSecret", - "secretsmanager:GetSecretValue", - "secretsmanager:DescribeSecret", - "secretsmanager:PutSecretValue", - "secretsmanager:TagResource", - "secretsmanager:DeleteSecret" - ], - "Resource": "arn:aws:secretsmanager:::secret:zenml/*" - } - ] -} -``` - -Example configuration for the AWS Secrets Store: - -```Bash - zenml: - - # ... - - # Secrets store settings. This is used to store centralized secrets. - secretsStore: - - # Set to false to disable the secrets store. - enabled: true - - # The type of the secrets store - type: aws - - # Configuration for the AWS Secrets Manager secrets store - aws: - - # The AWS Service Connector authentication method to use. - authMethod: secret-key - - # The AWS Service Connector configuration. - authConfig: - # The AWS region to use. This must be set to the region where the AWS - # Secrets Manager service that you want to use is located. 
- region: us-east-1 - - # The AWS credentials to use to authenticate with the AWS Secrets - aws_access_key_id: - aws_secret_access_key: -``` - - -**Using the GCP Secrets Manager as a secrets store backend** - -The GCP Secrets Store uses the ZenML GCP Service Connector under the hood to authenticate with the GCP Secrets Manager API. This means that you can use any of the [authentication methods supported by the GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector#authentication-methods) to authenticate with the GCP Secrets Manager API. - -The minimum set of permissions that must be attached to the implicit or configured GCP credentials are as follows: - -* `secretmanager.secrets.create` for the target GCP project (i.e. no condition on the name prefix) -* `secretmanager.secrets.get`, `secretmanager.secrets.update`, `secretmanager.versions.access`, `secretmanager.versions.add` and `secretmanager.secrets.delete` for the target GCP project and for secrets that have a name starting with `zenml-` - -This can be achieved by creating two custom IAM roles and attaching them to the principal (e.g. user or service account) that will be used to access the GCP Secrets Manager API with a condition configured when attaching the second role to limit access to secrets with a name prefix of `zenml-`. The following `gcloud` CLI command examples can be used as a starting point: - -```Bash -gcloud iam roles create ZenMLServerSecretsStoreCreator \ - --project \ - --title "ZenML Server Secrets Store Creator" \ - --description "Allow the ZenML Server to create new secrets" \ - --stage GA \ - --permissions "secretmanager.secrets.create" - -gcloud iam roles create ZenMLServerSecretsStoreEditor \ - --project \ - --title "ZenML Server Secrets Store Editor" \ - --description "Allow the ZenML Server to manage its secrets" \ - --stage GA \ - --permissions "secretmanager.secrets.get,secretmanager.secrets.update,secretmanager.versions.access,secretmanager.versions.add,secretmanager.secrets.delete" - -gcloud projects add-iam-policy-binding \ - --member serviceAccount: \ - --role projects//roles/ZenMLServerSecretsStoreCreator \ - --condition None - -# NOTE: use the GCP project NUMBER, not the project ID in the condition -gcloud projects add-iam-policy-binding \ - --member serviceAccount: \ - --role projects//roles/ZenMLServerSecretsStoreEditor \ - --condition 'title=limit_access_zenml,description="Limit access to secrets with prefix zenml-",expression=resource.name.startsWith("projects//secrets/zenml-")' -``` - -Example configuration for the GCP Secrets Store: - -```Bash - - zenml: - - # ... - - # Secrets store settings. This is used to store centralized secrets. - secretsStore: - - # Set to false to disable the secrets store. - enabled: true - - # The type of the secrets store - type: gcp - - # Configuration for the GCP Secrets Manager secrets store - gcp: - - # The GCP Service Connector authentication method to use. - authMethod: service-account - - # The GCP Service Connector configuration. - authConfig: - - # The GCP project ID to use. This must be set to the project ID where the - # GCP Secrets Manager service that you want to use is located. - project_id: my-gcp-project - - # GCP credentials JSON to use to authenticate with the GCP Secrets - # Manager instance. 
- google_application_credentials: | - { - "type": "service_account", - "project_id": "my-project", - "private_key_id": "...", - "private_key": "-----BEGIN PRIVATE KEY-----\n...=\n-----END PRIVATE KEY-----\n", - "client_email": "...", - "client_id": "...", - "auth_uri": "https://accounts.google.com/o/oauth2/auth", - "token_uri": "https://oauth2.googleapis.com/token", - "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", - "client_x509_cert_url": "..." - } - - serviceAccount: - - # If you're using workload identity, you need to annotate the service - # account with the GCP service account name (see https://cloud.google.com/kubernetes-engine/docs/versions/0.66.0/how-to/workload-identity) - annotations: - iam.gke.io/gcp-service-account: @.iam.gserviceaccount.com - -``` - - - **Using the Azure Key Vault as a secrets store backend** - -The Azure Secrets Store uses the ZenML Azure Service Connector under the hood to authenticate with the Azure Key Vault API. This means that you can use any of the [authentication methods supported by the Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector#authentication-methods) to authenticate with the Azure Key Vault API. - -Example configuration for the Azure Key Vault Secrets Store: - -```Bash - zenml: - - # ... - - # Secrets store settings. This is used to store centralized secrets. - secretsStore: - - # Set to false to disable the secrets store. - enabled: true - - # The type of the secrets store - type: azure - - # Configuration for the Azure Key Vault secrets store - azure: - - # The name of the Azure Key Vault. This must be set to point to the Azure - # Key Vault instance that you want to use. - key_vault_name: - - # The Azure Service Connector authentication method to use. - authMethod: service-principal - - # The Azure Service Connector configuration. - authConfig: - - # The Azure application service principal credentials to use to - # authenticate with the Azure Key Vault API. - client_id: - client_secret: - tenant_id: -``` - - -**Using the HashiCorp Vault as a secrets store backend** - -To use the HashiCorp Vault service as a Secrets Store back-end, it must be configured in the Helm values: - -```Bash - zenml: - - # ... - - # Secrets store settings. This is used to store centralized secrets. - secretsStore: - - # Set to false to disable the secrets store. - enabled: true - - # The type of the secrets store - type: hashicorp - - # Configuration for the HashiCorp Vault secrets store - hashicorp: - - # The url of the HashiCorp Vault server to use - vault_addr: https://vault.example.com - # The token used to authenticate with the Vault server - vault_token: - # The Vault Enterprise namespace. Not required for Vault OSS. - vault_namespace: -``` - - -**Using a custom secrets store backend implementation** - -You have the option of using [a custom implementation of the secrets store API](/develop/usage/project-setup/use-secrets/secret-management) as your secrets store back-end. This must come in the form of a class derived from `zenml.zen_stores.secrets_stores.base_secrets_store.BaseSecretsStore`. This class must be importable from within the ZenML server container, which means you most likely need to build a custom container image that contains the class. Then, you can configure the Helm values to use your custom secrets store as follows: -```Bash - - zenml: - - # ... - - # Secrets store settings. This is used to store centralized secrets. 
-  secretsStore:
-
-    # Set to false to disable the secrets store.
-    enabled: true
-
-    # The type of the secrets store
-    type: custom
-
-    # Configuration for the custom secrets store
-    custom:
-
-      # The class path of the custom secrets store implementation. This should
-      # point to a full Python class that extends the
-      # `zenml.zen_stores.secrets_stores.base_secrets_store.BaseSecretsStore`
-      # base class. The class should be importable from the container image
-      # that you are using for the ZenML server.
-      class_path: my.custom.secrets.store.MyCustomSecretsStore
-
-      # Extra environment variables used to configure the custom secrets store.
-      environment:
-        ZENML_SECRETS_STORE_OPTION_1: value1
-        ZENML_SECRETS_STORE_OPTION_2: value2
-
-      # Extra environment variables to set in the ZenML server container that
-      # should be kept secret and are used to configure the custom secrets store.
-      secretEnvironment:
-        ZENML_SECRETS_STORE_SECRET_OPTION_3: value3
-        ZENML_SECRETS_STORE_SECRET_OPTION_4: value4
-```
-
-
-
-#### Backup secrets store
-
-[A backup secrets store](/develop/usage/project-setup/use-secrets/secret-management#backup-secrets-store) back-end may be configured for high-availability and backup purposes, or as an intermediate step in the process of [migrating secrets to a different external location or secrets manager provider](/develop/usage/project-setup/use-secrets/secret-management#secrets-migration-strategy).
-
-To configure a backup secrets store in the Helm chart, use the same approach and instructions documented for the primary secrets store, but using the `backupSecretsStore` configuration section instead of `secretsStore`, e.g.:
-
-```Bash
- zenml:
-
-  # ...
-
-  # Backup secrets store settings. This is used as a backup for the primary
-  # secrets store.
-  backupSecretsStore:
-
-    # Set to true to enable the backup secrets store.
-    enabled: true
-
-    # The type of the backup secrets store
-    type: aws
-
-    # Configuration for the AWS Secrets Manager backup secrets store
-    aws:
-
-      # The AWS Service Connector authentication method to use.
-      authMethod: secret-key
-
-      # The AWS Service Connector configuration.
-      authConfig:
-        # The AWS region to use. This must be set to the region where the AWS
-        # Secrets Manager service that you want to use is located.
-        region: us-east-1
-
-        # The AWS credentials to use to authenticate with the AWS Secrets
-        # Manager API.
-        aws_access_key_id:
-        aws_secret_access_key:
-```
-
-### Database backup and recovery
-
-An automated database backup and recovery feature is enabled by default for all Helm deployments. The ZenML server will automatically back up the database before every upgrade and restore it if the upgrade fails in a way that affects the database.
-
-  The database backup automatically created by the ZenML server is only temporary and only used as an immediate recovery in case of database migration failures. It is not meant to be used as a long-term backup solution. If you need to back up your database for long-term storage, you should use a dedicated backup solution.
-
-Several database backup strategies are supported, depending on where and how the backup is stored. The strategy can be configured by means of the `zenml.database.backupStrategy` Helm value:
-
-* `disabled` \- no backup is performed
-* `in-memory` \- the database schema and data are stored in memory.
This is the fastest backup strategy, but the backup is not persisted across pod restarts, so no manual intervention is possible in case the automatic DB recovery fails after a failed DB migration. Adequate memory resources should be allocated to the ZenML server pod when using this backup strategy with larger databases. This is the default backup strategy. -* `database` \- the database is copied to a backup database in the same database server. This requires the `backupDatabase` option to be set to the name of the backup database. This backup strategy is only supported for MySQL compatible databases and the user specified in the database URL must have permissions to manage (create, drop, and modify) the backup database in addition to the main database. -* `dump-file` \- the database schema and data are dumped to a file local to the database initialization and upgrade job. Users may optionally configure a persistent volume where the dump file will be stored by setting the `backupPVStorageSize` and optionally the `backupPVStorageClass` options. If a persistent volume is not configured, the dump file will be stored in an emptyDir volume, which is not persisted. If configured, the user is responsible for deleting the resulting PVC when uninstalling the Helm release. - -> **NOTE:** You should also set the `podSecurityContext.fsGroup` option if you are using a persistent volume to store the dump file. - -The following additional rules are applied concerning the creation and lifetime of the backup: - -* a backup is not attempted if the database doesn't need to undergo a migration (e.g. when the ZenML server is upgraded to a new version that doesn't require a database schema change or if the ZenML version doesn't change at all). -* a backup file or database is created before every database migration attempt (i.e. during every Helm upgrade). If a backup already exists (i.e. persisted in a persistent volume or backup database), it is overwritten. -* the persistent backup file or database is cleaned up after the migration is completed successfully or if the database doesn't need to undergo a migration. This includes backups created by previous failed migration attempts. -* the persistent backup file or database is NOT cleaned up after a failed migration. This allows the user to manually inspect and/or apply the backup if the automatic recovery fails. - -The following example shows how to configure the ZenML server to use a persistent volume to store the database dump file: - -```bash - zenml: - - # ... - - database: - url: "mysql://admin:password@my.database.org:3306/zenml" - - # Configure the database backup strategy - backupStrategy: dump-file - backupPVStorageSize: 1Gi - -podSecurityContext: - fsGroup: 1000 # if you're using a PVC for backup, this should necessarily be set. -``` diff --git a/develop/getting-started/deploying-zenml/deploy-with-zenml-cli.mdx b/develop/getting-started/deploying-zenml/deploy-with-zenml-cli.mdx deleted file mode 100644 index 334c8ce..0000000 --- a/develop/getting-started/deploying-zenml/deploy-with-zenml-cli.mdx +++ /dev/null @@ -1,334 +0,0 @@ ---- -title: "Deploy with ZenML CLI" -description: "Deploying ZenML on cloud using the ZenML CLI." -icon: terminal ---- - -The easiest and fastest way to get running on the cloud is by using the `deploy` CLI command. It currently only supports deploying to Kubernetes on managed cloud services. 
You can check the [overview page](/develop/getting-started/deploying-zenml#deploying-a-zenml-server) to learn about other options that you have. - -Before we begin, it will help to understand the [architecture](/develop/getting-started/deploying-zenml) around the ZenML server and the database that it uses. Now, depending on your setup, you may find one of the following scenarios relevant. - -## Option 1: Starting from scratch - -If you don't have an existing Kubernetes cluster, you have the following two options to set it up: - -* Creating it manually using the documentation for your cloud provider. For convenience, here are links for [AWS](https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html), [Azure](https://learn.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-portal?tabs=azure-cli), and [GCP](https://cloud.google.com/kubernetes-engine/docs/versions/0.66.0/how-to/creating-a-zonal-cluster#before%5Fyou%5Fbegin). -* Using a [stack recipe](/develop/stack-components/stack-deployment/deploy-a-stack-using-mlstacks) that sets up a cluster along with other tools that you might need in your cloud stack like artifact stores and secret managers. Take a look at all [available stack recipes](https://github.com/zenml-io/mlstacks) to see if there's something that works for you. - - -Once you have created your cluster, make sure that you configure your [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) client to talk to it. - -You're now ready to deploy ZenML! Run the following command: - -```Bash -zenml deploy -``` - -You will be prompted to provide a name for your deployment and details like what cloud provider you want to deploy to — and that's it! It creates the database and any VPCs, permissions, and more that are needed. - - - In order to be able to run the `deploy` command, you should have your cloud provider's CLI configured locally with permissions to create resources like MySQL databases and networks. - - -Reasonable defaults are in place for you already and if you wish to configure more settings, take a look at the next scenario that uses a config file. - -## Option 2: Using existing cloud resources - -### Existing Kubernetes cluster - -If you already have an existing cluster without an ingress controller, you can jump straight to the `deploy` command above to get going with the defaults. Please make sure that you have your local `kubectl` configured to talk to your cluster. - -#### Having an existing NGINX Ingress Controller - -The `deploy` command, by default, tries to create an NGINX ingress controller on your cluster. If you already have an existing controller, you can tell ZenML to not re-deploy it through the use of a config file. This file can be found in the [Configuration File Templates](/develop/getting-started/deploying-zenml/deploy-with-zenml-cli#configuration-file-templates) towards the end of this guide. It offers a host of configuration options that you can leverage for advanced use cases. - -* Check if an ingress controller is running on your cluster by running the following command. You should see an entry in the output with the hostname populated. - -```Bash -# change the namespace to any other where -# You might have the controller installed -kubectl get svc -n ingress-nginx -``` -* Set `create_ingress_controller` to `false`. -* Supply your controller's hostname to the `ingress_controller_hostname` variable. -> **Note:** The address should not have a trailing `/`. 
-* You can now run the `deploy` command and pass the config file above, to it. - -```Bash -zenml deploy --config=/PATH/TO/FILE -``` -> **Note:** To be able to run the deploy command, you should have your cloud provider's CLI configured locally with permissions to create resources like MySQL databases and networks. - -### Existing hosted SQL database - -If you also already have a database that you would want to use with the deployment, you can choose to configure it with the use of the config file. Here, we will demonstrate setting the database. - -* Fill the fields below from the config file with values from your database. - -```sql -# The username and password for the database. -database_username: -database_password: -# The URL of the database to use for the ZenML server. -database_url: -# The path to the SSL CA certificate to use for the database connection. -database_ssl_ca: -# The path to the client SSL certificate to use for the database connection. -database_ssl_cert: -# The path to the client SSL key to use for the database connection. -database_ssl_key: -# Whether to verify the database server SSL certificate. -database_ssl_verify_server_cert: -``` -* Run the `deploy` command and pass the config file above to it. -Copy -``` -zenml deploy --config=/PATH/TO/FILE -``` -> **Note** To be able to run the deploy command, you should have your cloud provider's CLI configured locally with permissions to create resources like MySQL databases and networks. - -## Configuration file templates - -#### Base configuration file - -Below is the general structure of a config file. Use this as a base and then add any cloud-specific parameters from the sections below. - -```shell -# Name of the server deployment. -name: - -# The server provider type, one of aws, gcp or azure. -provider: - -# The path to the kubectl config file to use for deployment. -kubectl_config_path: - -# The Kubernetes namespace to deploy the ZenML server to. -namespace: zenmlserver - -# The path to the ZenML server helm chart to use for deployment. -helm_chart: - -# The repository and tag to use for the ZenML server Docker image. -zenmlserver_image_repo: zenmldocker/zenml -zenmlserver_image_tag: latest - -# Whether to deploy an nginx ingress controller as part of the deployment. -create_ingress_controller: true - -# Whether to use TLS for the ingress. -ingress_tls: true - -# Whether to generate self-signed TLS certificates for the ingress. -ingress_tls_generate_certs: true - -# The name of the Kubernetes secret to use for the ingress. -ingress_tls_secret_name: zenml-tls-certs - -# The ingress controller's IP address. The ZenML server will be exposed on a subdomain of this IP. For AWS, if you have a hostname instead, use the following command to get the IP address: `dig +short `. -ingress_controller_ip: - -# Whether to create a SQL database service as part of the recipe. -deploy_db: true - -# The username and password for the database. -database_username: user -database_password: - -# The URL of the database to use for the ZenML server. -database_url: - -# The path to the SSL CA certificate to use for the database connection. -database_ssl_ca: - -# The path to the client SSL certificate to use for the database connection. -database_ssl_cert: - -# The path to the client SSL key to use for the database connection. -database_ssl_key: - -# Whether to verify the database server SSL certificate. -database_ssl_verify_server_cert: true - -# The log level to set the terraform client. 
Choose one of TRACE, -# DEBUG, INFO, WARN, or ERROR (case insensitive). -log_level: ERROR -``` - - - - Feel free to include only those variables that you want to customize, in your file. For all other variables, the default values (shown above) will be used. - - -#### Cloud-specific settings - - - -```Bash -# The AWS region to deploy to. -region: eu-west-1 - -# The name of the RDS instance to create -rds_name: zenmlserver - -# Name of RDS database to create. -db_name: zenmlserver - -# Type of RDS database to create. -db_type: mysql - -# Version of RDS database to create. -db_version: 5.7.38 - -# Instance class of RDS database to create. -db_instance_class: db.t3.micro - -# Allocated storage of RDS database to create. -db_allocated_storage: 5 -``` -The `database_username` and `database_password` from the general config is used to set those variables for the AWS RDS instance. - - -```Bash -# The project in GCP to deploy the server in. - -project_id: - -# The GCP region to deploy to. - -region: europe-west3 - -# The name of the CloudSQL instance to create. - -cloudsql_name: zenmlserver - -# Name of CloudSQL database to create. - -db_name: zenmlserver - -# Instance class of CloudSQL database to create. - -db_instance_tier: db-n1-standard-1 - -# Allocated storage of CloudSQL database, in GB, to create. - -db_disk_size: 10 - -# Whether or not to enable the Secrets Manager API. Disable this if you - -# don't have ListServices permissions on the project. - -enable_secrets_manager_api: true -``` - -* The `project_id` is required to be set. -* The `database_username` and `database_password` from the general config is used to set those variables for the CloudSQL instance. -* SSL is disabled by default on the database and the option to enable it is coming soon! - - -```Bash -# The Azure resource_group to deploy to. - -resource_group: zenml - -# The name of the Flexible MySQL instance to create. - -db_instance_name: zenmlserver - -# Name of RDS database to create. - -db_name: zenmlserver - -# Version of MySQL database to create. - -db_version: 5.7 - -# The sku_name for the database resource. - -db_sku_name: B_Standard_B1s - -# Allocated storage of MySQL database to create. - -db_disk_size: 20 -``` -The `database_username` and `database_password` from the general config is used to set those variables for the Azure Flexible MySQL server. - - - - - -## Connecting to deployed ZenML - -Immediately after deployment, the ZenML server needs to be activated before it can be used. The activation process includes creating an initial admin user account and configuring some server settings. You can do this only by visiting the ZenML server URL in your browser and following the on-screen instructions. Connecting your local ZenML client to the server is not possible until the server is properly initialized. - -Once ZenML is deployed, one or multiple users can connect to it with the `zenml connect` command. 
- -```bash -zenml connect -``` - - - If no arguments are supplied, ZenML will attempt to connect to the last ZenML server deployed from the local host using the `zenml deploy` command: - - -In order to connect to a specific ZenML server, you can either pass the configuration as command line arguments or as a YAML file: - - -```Bash -zenml connect --url=https://zenml.example.com:8080 --no-verify-ssl -``` - -or - -```Bash -zenml connect --config=/path/to/zenml_server_config.yaml -``` - -The YAML file should have the following structure when connecting to a ZenML server: - -```Bash -# The URL of the ZenML server - -url: - -# Either a boolean, in which case it controls whether the server's TLS - -# certificate is verified, or a string, in which case it must be a path - -# to a CA certificate bundle to use or the CA bundle value itself - -verify_ssl: -``` - -Here is an example of a ZenML server YAML configuration file: - - -```Shell -url: https://ac8ef63af203226194a7725ee71d85a-7635928635.us-east-1.elb.amazonaws.com/zenml - -verify_ssl: | - - -----BEGIN CERTIFICATE----- - -... - - -----END CERTIFICATE----- -``` - -To disconnect from the current ZenML server and revert to using the local default database, use the following command: - - -```Bash -zenml disconnect -``` - -## How does it work? - -Here's an architecture diagram that shows how the workflow looks like when you do `zenml deploy`. - - -Running zenml deploy - -The deploy CLI makes use of a "recipe" inside the `zenml-io/zenml` repository to deploy the server on the right cloud. Any configuration that you pass with the CLI, is sent to the recipe as input variables. diff --git a/develop/getting-started/deploying-zenml/deploying-zenml.mdx b/develop/getting-started/deploying-zenml/deploying-zenml.mdx deleted file mode 100644 index f2c5ab6..0000000 --- a/develop/getting-started/deploying-zenml/deploying-zenml.mdx +++ /dev/null @@ -1,86 +0,0 @@ ---- -title: "Overview" -description: "Why do we need to deploy ZenML?" -icon: person-circle-question ---- - -Moving your ZenML Server to a production environment offers several benefits over staying local: - -1. **Scalability**: Production environments are designed to handle large-scale workloads, allowing your models to process more data and deliver faster results. -2. **Reliability**: Production-grade infrastructure ensures high availability and fault tolerance, minimizing downtime and ensuring consistent performance. -3. **Collaboration**: A shared production environment enables seamless collaboration between team members, making it easier to iterate on models and share insights. - -Despite these advantages, transitioning to production can be challenging due to the complexities involved in setting up the needed infrastructure. - -### ZenML Server - -When you first get started with ZenML, it relies with the following architecture on your machine. - - - - -The SQLite database that you can see in this diagram is used to store information about pipelines, pipeline runs, stacks, and other configurations. Users can run the `zenml up` command to spin up a local REST server to serve the dashboard. The diagram for this looks as follows: - - - - -For the local REST server option, the `zenml up` command implicitly connects the client to the server. - -Currently the ZenML server supports a legacy and a brand-new version of the dashboard. 
To use the legacy version simply use the following command `zenml up --legacy` - -In order to move into production, the ZenML server needs to be deployed somewhere centrally so that the different cloud stack components can read from and write to the server. Additionally, this also allows all your team members to connect to it and share stacks and pipelines. - - - - -### Deploying a ZenML Server - -Deploying the ZenML Server is a crucial step towards transitioning to a production-grade environment for your machine learning projects. By setting up a deployed ZenML Server instance, you gain access to powerful features, allowing you to use stacks with remote components, centrally track progress, collaborate effectively, and achieve reproducible results. - -Currently, there are two main options to access a deployed ZenML server: - -1. **SaaS:** With [ZenML Pro](/develop/getting-started/deploying-zenml/zenml-pro/zenml-pro) offering you can utilize a control plane to create ZenML servers, also known as tenants. These tenants are managed and maintained by ZenML's dedicated team, alleviating the burden of server management from your end. Importantly, your data remains securely within your stack, and ZenML's role is primarily to handle tracking of metadata and server maintenance. -2. **Self-hosted Deployment:** Alternatively, you have the ability to deploy ZenML on your own self-hosted environment. This can be achieved through various methods, including using [our CLI](/develop/getting-started/deploying-zenml/deploy-with-zenml-cli), [Docker](/develop/stack-components/model-registries), [Helm](/develop/getting-started/deploying-zenml/deploy-with-helm), or [HuggingFace Spaces](/develop/getting-started/deploying-zenml/deploy-using-huggingface-spaces). We also offer our Pro version for self-hosted deployments, so you can use our full paid feature-set while staying fully in control with an airgapped solution on your infrastructure. - -Currently the ZenML server supports a legacy and a brand-new version of the dashboard. To use the legacy version which supports stack registration from the dashboard simply set the following environment variable in the deployment environment: `export ZEN_SERVER_USE_LEGACY_DASHBOARD=True`. - -Both options offer distinct advantages, allowing you to choose the deployment approach that best aligns with your organization's needs and infrastructure preferences. Whichever path you select, ZenML facilitates a seamless and efficient way to take advantage of the ZenML Server and enhance your machine learning workflows for production-level success. - -## How to deploy ZenML - -Documentation for the various deployment strategies can be found in the following pages below (in our 'how-to' guides): - - - - Deploying ZenML on cloud using the ZenML CLI. - - - Deploying ZenML in a Docker container. - - - Deploying ZenML in a Kubernetes cluster with Helm. - - - Deploying ZenML to Hugging Face Spaces. - - diff --git a/develop/getting-started/deploying-zenml/introduction.mdx b/develop/getting-started/deploying-zenml/introduction.mdx deleted file mode 100644 index 584e1d8..0000000 --- a/develop/getting-started/deploying-zenml/introduction.mdx +++ /dev/null @@ -1,4 +0,0 @@ -## Deploying ZenML - -ZenML is designed to be deployed in a variety of environments. This section -covers the different options you have for deploying ZenML. 
\ No newline at end of file diff --git a/develop/getting-started/deploying-zenml/manage-the-deployed-services.mdx b/develop/getting-started/deploying-zenml/manage-the-deployed-services.mdx deleted file mode 100644 index 5ef2e13..0000000 --- a/develop/getting-started/deploying-zenml/manage-the-deployed-services.mdx +++ /dev/null @@ -1,19 +0,0 @@ ---- -title: "Overview" -description: "Once deployed, here's how to manage ZenML and its Stack Components" -icon: gear ---- - -Maintaining and troubleshooting your ZenML deployment and the stack components deployed through ZenML is quite easy. Here are a few guides that will explain how: - - - -Learn how to upgrade your server to a new version of ZenML for different deployment options. - - -Troubleshooting tips for your ZenML deployment. - - -Learn how to troubleshoot Stack Components deployed with ZenML. - - diff --git a/develop/getting-started/deploying-zenml/manage-the-deployed-services/custom-secret-stores.mdx b/develop/getting-started/deploying-zenml/manage-the-deployed-services/custom-secret-stores.mdx deleted file mode 100644 index 52a2a8d..0000000 --- a/develop/getting-started/deploying-zenml/manage-the-deployed-services/custom-secret-stores.mdx +++ /dev/null @@ -1,101 +0,0 @@ ---- -title: "Custom secret stores" -description: "Learning how to develop a custom secret store." -icon: shop ---- - -The secrets store acts as the one-stop shop for all the secrets to which your pipeline or stack components might need access. It is responsible for storing, updating and deleting _only the secrets values_ for ZenML secrets, while the ZenML secret metadata is stored in the SQL database. The secrets store interface implemented by all available secrets store back-ends is defined in the `zenml.zen_stores.secrets_stores.secrets_store_interface` core module and looks more or less like this: - - -```py -class SecretsStoreInterface(ABC): - """ZenML secrets store interface. - - All ZenML secrets stores must implement the methods in this interface. - """ - - # --------------------------------- - # Initialization and configuration - # --------------------------------- - - @abstractmethod - def _initialize(self) -> None: - """Initialize the secrets store. - - This method is called immediately after the secrets store is created. - It should be used to set up the backend (database, connection etc.). - """ - - # --------- - # Secrets - # --------- - - @abstractmethod - def store_secret_values( - self, - secret_id: UUID, - secret_values: Dict[str, str], - ) -> None: - """Store secret values for a new secret. - - Args: - secret_id: ID of the secret. - secret_values: Values for the secret. - """ - - @abstractmethod - def get_secret_values(self, secret_id: UUID) -> Dict[str, str]: - """Get the secret values for an existing secret. - - Args: - secret_id: ID of the secret. - - Returns: - The secret values. - - Raises: - KeyError: if no secret values for the given ID are stored in the - secrets store. - """ - - @abstractmethod - def update_secret_values( - self, - secret_id: UUID, - secret_values: Dict[str, str], - ) -> None: - """Updates secret values for an existing secret. - - Args: - secret_id: The ID of the secret to be updated. - secret_values: The new secret values. - - Raises: - KeyError: if no secret values for the given ID are stored in the - secrets store. - """ - - @abstractmethod - def delete_secret_values(self, secret_id: UUID) -> None: - """Deletes secret values for an existing secret. - - Args: - secret_id: The ID of the secret. 
- - Raises: - KeyError: if no secret values for the given ID are stored in the - secrets store. - """ -``` - - - This is a slimmed-down version of the real interface which aims to highlight the abstraction layer. In order to see the full definition and get the complete docstrings, please check the [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-zen%5Fstores/#zenml.zen%5Fstores.secrets%5Fstores.secrets%5Fstore%5Finterface.SecretsStoreInterface) . - - -## Build your own custom secrets store - -If you want to create your own custom secrets store implementation, you can follow the following steps: - -1. Create a class that inherits from the `zenml.zen_stores.secrets_stores.base_secrets_store.BaseSecretsStore` base class and implements the `abstractmethod`s shown in the interface above. Use `SecretsStoreType.CUSTOM` as the `TYPE` value for your secrets store class. -2. If you need to provide any configuration, create a class that inherits from the `SecretsStoreConfiguration` class and add your configuration parameters there. Use that as the `CONFIG_TYPE` value for your secrets store class. -3. To configure the ZenML server to use your custom secrets store, make sure your code is available in the container image that is used to run the ZenML server. Then, use environment variables or helm chart values to configure the ZenML server to use your custom secrets store, as covered in the [deployment guide](/develop/getting-started/deploying-zenml). diff --git a/develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-stack-components.mdx b/develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-stack-components.mdx deleted file mode 100644 index 61fbd96..0000000 --- a/develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-stack-components.mdx +++ /dev/null @@ -1,80 +0,0 @@ ---- -title: "Troubleshoot stack components" -description: "Learn how to troubleshoot Stack Components deployed with ZenML." -icon: "layer-group" ---- - -There are two ways in which you can understand if something has gone wrong while deploying your stack or stack components. - -## Error logs from the CLI - -The CLI will show any errors that the deployment runs into. Most of these would be coming from the underlying terraform library and could range from issues like resources with the same name existing in your cloud to a wrong naming scheme for some resource. - -Most of these are easy to fix and self-explanatory but feel free to ask any questions or doubts you may have to us on the ZenML Slack! - -## Debugging errors with already deployed components - -Sometimes, an application might fail after an initial successful deployment. This section will cover steps on how to debug failures in such a case, for Kubernetes apps, since they form a majority of all tools deployed with the CLI. - - - Other components include cloud-specific apps like Vertex AI, Sagemaker, S3 buckets, and more. Information on what has gone wrong with them would be best found on the web console for the respective clouds. - - -### Getting access to the Kubernetes Cluster - -The first step to figuring out the problem with a deployed Kubernetes app is to get access to the underlying cluster hosting it. When you deploy apps that require a cluster, ZenML creates a cluster for you and this is reused for all subsequent apps that need it. - - -If you've used the `zenml stack deploy` flow to deploy your components, your local `kubectl` might already have access to the cluster. 
Check by running the following command: - -``` -kubectl get nodes -``` - -#### Stack Component Deploy - - - -1. Get the name of the deployed cluster.
-`zenml stack recipe output eks-cluster-name` -2. Figure out the region that the cluster is deployed to. By default, the region is set to `eu-west-1` , which you should use in the next step if you haven't supplied a custom value while creating the cluster. -3. Run the following command.
-`aws eks update-kubeconfig --name --region ` - -
- -1. Get the name of the deployed cluster. - -`zenml stack recipe output gke-cluster-name`\\ -2. Figure out the region that the cluster is deployed to. By default, the region is set to `europe-west1`, which you should use in the next step if you haven't supplied a custom value while creating the cluster.\\ -3. Figure out the project that the cluster is deployed to. You must have passed in a project ID while creating a GCP resource for the first time.\\ -4. Run the following command. - -`gcloud container clusters get-credentials --region --project ` - - - -You may already have your `kubectl` client configured with your cluster. Check by running `kubectl get nodes` before proceeding. - - -1. Get the name of the deployed cluster. - -`zenml stack recipe output k3d-cluster-name`\\ -2. Set the `KUBECONFIG` env variable to the `kubeconfig` file from the cluster. - -`export KUBECONFIG=$(k3d kubeconfig get )`\\ - -3. You can now use the `kubectl` client to talk to the cluster. - - -
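Once `kubectl` is pointed at the right cluster, a few generic commands are usually enough to locate a failing application. These are standard `kubectl` calls rather than anything ZenML-specific; substitute your own namespace and pod names:

```Bash
# Confirm access and list workloads that are not healthy
kubectl get nodes
kubectl get pods --all-namespaces | grep -vE 'Running|Completed'

# Inspect a failing pod: recent events and logs from the last (crashed) container
kubectl -n <namespace> describe pod <pod-name>
kubectl -n <namespace> logs <pod-name> --previous
```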
- -#### Stack Recipe Deploy - -The steps for the stack recipe case should be the same as the ones listed above. The only difference that you need to take into account is the name of the outputs that contain your cluster name and the default regions. - -Each recipe might have its own values and here's how you can ascertain those values. - -* For the cluster name, go into the `outputs.tf` file in the root directory and search for the output that exposes the cluster name. -* For the region, check out the `variables.tf` or the `locals.tf` file for the default value assigned to it. - diff --git a/develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-your-deployed-server.mdx b/develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-your-deployed-server.mdx deleted file mode 100644 index d64ec03..0000000 --- a/develop/getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-your-deployed-server.mdx +++ /dev/null @@ -1,97 +0,0 @@ ---- -title: "Troubleshoot the deployed server" -description: "Troubleshooting tips for your ZenML deployment" -icon: "screwdriver-wrench" ---- - -In this document, we will go over some common issues that you might face when deploying ZenML and how to solve them. - -## Viewing logs - -Analyzing logs is a great way to debug issues. Depending on whether you have a Kubernetes (using Helm or `zenml deploy`) or a Docker deployment, you can view the logs in different ways. - - - If you are using Kubernetes, you can view the logs of the ZenML server using the following method: - -* Check all pods that are running your ZenML deployment. - -```Bash -kubectl -n get pods -``` - -* If you see that the pods aren't running, you can use the command below to get the logs for all pods at once. - -```Bash -kubectl -n logs -l app.kubernetes.io/name=zenml -``` - -Note that the error can either be from the `zenml-db-init` container that connects to the MySQL database or from the `zenml` container that runs the server code. If the get pods command shows that the pod is failing in the `Init` state then use `zenml-db-init` as the container name, otherwise use `zenml`. - -```Bash -kubectl -n logs -l app.kubernetes.io/name=zenml -c -``` - - - You can also use the `--tail` flag to limit the number of lines to show or the `--follow` flag to follow the logs in real-time. - - - -If you are using Docker, you can view the logs of the ZenML server using the following method: - -* If you used the `zenml up --docker` CLI command to deploy the Docker ZenML server, you can check the logs with the command: - -```Bash -zenml logs -f -``` -* If you used the `docker run` command to manually deploy the Docker ZenML server, you can check the logs with the command: - -```bash -docker logs zenml -f -``` -* If you used the `docker compose` command to manually deploy the Docker ZenML server, you can check the logs with the command: - -```bash -docker compose -p zenml logs -f -``` - - - - - -## Fixing database connection problems - -If you are using a MySQL database, you might face issues connecting to it. The logs from the `zenml-db-init` container should give you a good idea of what the problem is. Here are some common issues and how to fix them: - -* If you see an error like `ERROR 1045 (28000): Access denied for user using password YES`, it means that the username or password is incorrect. Make sure that the username and password are correctly set for whatever deployment method you are using. 
-* If you see an error like `ERROR 2003 (HY000): Can't connect to MySQL server on ()`, it means that the host is incorrect. Make sure that the host is correctly set for whatever deployment method you are using. - -You can test the connection and the credentials by running the following command from your machine: - -```Bash -mysql -h -u -p -``` - - -If you are using a Kubernetes deployment, you can use the `kubectl port-forward` command to forward the MySQL port to your local machine. This will allow you to connect to the database from your machine. - - -## Fixing database initialization problems - -If you’ve migrated from a newer ZenML version to an older version and see errors like `Revision not found` in your `zenml-db-init` logs, one way out is to drop the database and create a new one with the same name. - -* Log in to your MySQL instance. - -```Bash -mysql -h -u -p -``` -* Drop the database for the server. - -```Bash -drop database ; -``` -* Create the database with the same name. - -```Bash -create database ; -``` -* Restart the Kubernetes pods or the docker container running your server to trigger the database initialization again. diff --git a/develop/getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server.mdx b/develop/getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server.mdx deleted file mode 100644 index 3582fcb..0000000 --- a/develop/getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server.mdx +++ /dev/null @@ -1,85 +0,0 @@ ---- -title: "Upgrade the version of the ZenML server" -description: "Learn how to upgrade your server to a new version of ZenML for the different deployment options." -icon: "arrow-up-right-dots" ---- - -The way to upgrade your ZenML server depends a lot on how you deployed it. - - - - -To upgrade your ZenML server that was deployed with the `zenml deploy` command to a newer version, you can follow the steps below. - -* In the config file, set `zenmlserver_image_tag` to the version that you want your ZenML server to be running. -* Run the deploy command again with this config file: -```Bash -zenml deploy --config=/PATH/TO/FILE -``` - -Any database schema updates are automatically handled by ZenML and unless mentioned otherwise, all of your data is migrated to the new version, intact. - - -To upgrade to a new version with docker, you have to delete the existing container and then run the new version of the `zenml-server` image. - - -Check that your data is persisted (either on persistent storage or on an external MySQL instance) before doing this. - -Optionally also perform a backup before the upgrade. - - -* Delete the existing ZenML container, for example like this: - -```shell -# find your container ID -docker ps -``` - -```shell -# stop the container -docker stop -# remove the container -docker rm -``` -* Deploy the version of the `zenml-server` image that you want to use. Find all versions [here](https://hub.docker.com/r/zenmldocker/zenml-server/tags). 
-```shell -docker run -it -d -p 8080:8080 --name zenmldocker/zenml-server: -``` - - - To upgrade your ZenML server Helm release to a new version, follow the steps below: - -* Pull the latest version of the Helm chart from the ZenML GitHub repository, or a version of your choice, e.g.: - -```Bash -# If you haven't cloned the ZenML repository yet -git clone https://github.com/zenml-io/zenml.git -# Optional: checkout an explicit release tag -# git checkout 0.21.1 -git pull -# Switch to the directory that hosts the helm chart -cd src/zenml/zen_server/deploy/helm/ -``` - -* Simply reuse the `custom-values.yaml` file that you used during the previous installation or upgrade. If you don't have it handy, you can extract the values from the ZenML Helm deployment using the following command: - -```Bash -helm -n get values zenml-server > custom-values.yaml -``` -* Upgrade the release using your modified values file. Make sure you are in the directory that hosts the helm chart: -Copy -```Bash -helm -n upgrade zenml-server . -f custom-values.yaml -``` - -It is not recommended to change the container image tag in the Helm chart to custom values, since every Helm chart version is tested to work only with the default image tag. However, if you know what you're doing you can change the `zenml.image.tag` value in your `custom-values.yaml` file to the desired ZenML version (e.g. `0.32.0`). - - - - - Downgrading the server to an older version is not supported and can lead to unexpected behavior. - - - - The version of the Python client that connects to the server should be kept at the same version as the server. - diff --git a/develop/getting-started/deploying-zenml/zenml-pro/system-architectures.mdx b/develop/getting-started/deploying-zenml/zenml-pro/system-architectures.mdx deleted file mode 100644 index 5759458..0000000 --- a/develop/getting-started/deploying-zenml/zenml-pro/system-architectures.mdx +++ /dev/null @@ -1,56 +0,0 @@ ---- -title: "System Architectures" -description: "Different variations of the ZenML architecture depending on your needs." -icon: compass-drafting ---- - - - If you're interested in assessing ZenML Pro, you can create a [free account](https://cloud.zenml.io/?utm%5Fsource=docs&utm%5Fmedium=referral%5Flink&utm%5Fcampaign=cloud%5Fpromotion&utm%5Fcontent=signup%5Flink), which defaults to a [Scenario 1](/develop/getting-started/deploying-zenml/zenml-pro#scenario-1-full-saas) deployment. To upgrade to different scenarios, please [reach out to us](mailto:cloud@zenml.io). - - -The ZenML Pro offers many additional features to increase your teams productivity. No matter your specific needs, the hosting options for ZenML Pro range from easy SaaS integration to completely airgapped deployments on your own infrastructure. - -A ZenML Pro deployment consists of the following moving pieces for both the SaaS product as well as the self-hosted version.: - -* **ZenML Pro Control Plane**: This is a centralized MLOps control plane that includes a managed ZenML dashboard and a special ZenML server optimized for production MLOps workloads. -* **Single Sign-On (SSO)**: The ZenML Pro API is integrated with [Auth0](https://auth0.com/) as an SSO provider to manage user authentication and authorization. Users can log in to the ZenML Pro app using their social media accounts or their corporate credentials. -* **Secrets Store**: All secrets and credentials required to access customer infrastructure services are stored in a secure secrets store. 
The ZenML Pro API has access to these secrets and uses them to access customer infrastructure services on behalf of the ZenML Pro. The secrets store can be hosted either by the ZenML Pro or by the customer. -* **ML Metadata Store**: This is where all ZenML metadata is stored, including ML metadata such as tracking and versioning information about pipelines and models. - -The above four interact with other MLOps stack components, secrets, and data in varying scenarios described below. - -## Scenario 1: Full SaaS - - - - -In this scenario, all services are hosted on infrastructure managed by the ZenML Team, except the MLOps stack components. Customer secrets and credentials required to access customer infrastructure are stored and managed by the ZenML Pro Control Plane. - -On the ZenML Pro infrastructure, only ML _metadata_ (e.g. pipeline and model tracking and versioning information) is stored. All the actual ML data artifacts (e.g. data produced or consumed by pipeline steps, logs and visualizations, models) are stored on the customer cloud. This can be set up quite easily by configuring an [artifact store](/develop/stack-components/artifact-stores) with your MLOps stack. - -Your tenant only needs permissions to read from this data to display artifacts on the ZenML dashboard. The tenant also needs direct access to parts of the customer infrastructure services to support dashboard control plane features such as CI/CD, triggering and running pipelines, triggering model deployments etc. - -This scenario is meant for customers who want to quickly get started with ZenML and can to a certain extent allow ingress connections into their infrastructure from an external SaaS provider. - -## Scenario 2: Hybrid SaaS with Customer Secret Store managed by ZenML - - - - - -This scenario is a version of Scenario 1, modified to store all sensitive information on the customer side. In this case, the customer connects their own secret store directly to the ZenML server that is managed by us. All ZenML secrets used by running pipelines to access infrastructure services and resources are stored in the customer secret store. This allows users to use [service connectors](/develop/usage/resource-data-management/auth-management/service-connectors-guide) and the [secrets API](/develop/usage/project-setup/use-secrets/interact-with-secrets) to authenticate ZenML pipelines and the ZenML Pro to 3rd party services and infrastructure while ensuring that credentials are always stored on the customer side. - -Even though they are stored on the customer side, access to ZenML secrets is fully managed by ZenML Pro. The individually deployed ZenML Servers can also be allowed to use some of those credentials to connect directly to customer infrastructure services to implement control plane features such as artifact visualization or triggering pipelines. This implies that the secret values are allowed to leave the customer environment so that their access can be managed centrally by the ZenML Pro and access control policies can be enforced, but the ZenML users and pipelines never have direct access to the secret store. - -All access to customer secrets is, of course, regulated through authentication and RBAC, so that only authorized users can access the secrets. This deployment scenario is meant for customers who want to use the ZenML Pro but want to keep their secrets on their own infrastructure. - -## Scenario 3: Fully On-prem - - - - - -In this scenario, all services, data, and secrets are deployed on the customer cloud.
This is the opposite of Scenario 1, and is meant for customers who require completely airgapped deployments, for the tightest security standards. [Reach out to us](mailto:cloud@zenml.io) if you want to set this up. - -Are you interested in ZenML Pro? [Sign up](https://cloud.zenml.io/?utm%5Fsource=docs&utm%5Fmedium=referral%5Flink&utm%5Fcampaign=cloud%5Fpromotion&utm%5Fcontent=signup%5Flink) and get access to Scenario 1\. with a free 14 day trial now! - diff --git a/develop/getting-started/deploying-zenml/zenml-pro/user-management.mdx b/develop/getting-started/deploying-zenml/zenml-pro/user-management.mdx deleted file mode 100644 index f9902ee..0000000 --- a/develop/getting-started/deploying-zenml/zenml-pro/user-management.mdx +++ /dev/null @@ -1,20 +0,0 @@ ---- -title: "User Management" -description: In ZenML Pro, there is a slightly different entity hierarchy as compared to the open-source ZenML framework. This document walks you through the key differences and new concepts that are pro-only. -icon: users-gear ---- - -## Organizations, Tenants, and Roles - -ZenML Pro arranges various aspects of your work experience around the concept of an **Organization**. This is the top-most level structure within the ZenML Cloud environment. Generally, an organization contains a group of users and one or more **tenants**. Tenants are individual, isolated deployments of the ZenML server. - -Every user in an organization has a distinct role. Each role configures what they can view, modify, and their level of involvement in collaborative tasks. A role thus helps determine the level of access that a user has within an organization. - -The `admin` has all permissions on an organization. They are allowed to add members, adjust the billing information and assign roles. The `editor` can still fully manage tenants and members but is not allowed to access the subscription information or delete the organization. The `viewer` Role allows you to allow users to access the tenants within the organization with only view permissions. - -## Inviting Team Members - -Inviting users to your organization to work on the organization's tenants is easy. Simply click `Add Member` in the Organization settings, and give them an initial Role. The User will be sent an invitation email. If a user is part of an organization, they can utilize their login on all tenants they have authority to access. - - - diff --git a/develop/getting-started/deploying-zenml/zenml-pro/zenml-pro.mdx b/develop/getting-started/deploying-zenml/zenml-pro/zenml-pro.mdx deleted file mode 100644 index a5540d7..0000000 --- a/develop/getting-started/deploying-zenml/zenml-pro/zenml-pro.mdx +++ /dev/null @@ -1,29 +0,0 @@ ---- -title: "Overview" -description: "The Pro version of ZenML comes with a number of features that expand the functionality of the Open Source product" -icon: cloud ---- - -### Role-based access control and permissions - -Utilizing ZenML Pro provides you with access to a robust control plane that unifies user management and optimizes your workflows. Efficiently manage access and permissions through centralized user administration. Create fine-grained permissions for resources such as stacks, pipelines, models, etc. - -See the section on [user management](/develop/getting-started/deploying-zenml/zenml-pro/user-management) to learn more. - -### A brand-new, modern MLOps experience - - - - - -We have built the ZenML Pro experience from the ground-up. With ZenML Pro, you get access to a new dashboard, with a better experience. 
The new dashboard features more functionality such as the [Model Control Plane](/develop/user-guide/guides/starter-guide/track-ml-models) and [Artifact Control Plane](/develop/user-guide/guides/starter-guide/manage-artifacts). - -### Run templates for running your pipelines from the dashboard or the API - -ZenML Pro enables you to [create and run templates](/develop/usage/pipelines/create-and-run-templates). This way, you can use the dashboard or our Client/REST API to run a pipeline with updated configuration which allows you to iterate quickly with minimal friction. - -### Triggers, CI/CD, Reports and more - -Additionally, ZenML Pro users get exclusive access to an array of cloud-specific features, such as triggers, integrating with your code repository CI/CD system, generating usage reports and more. - -Learn more about ZenML Pro on the [ZenML Website](https://zenml.io/pro). diff --git a/develop/getting-started/deploying-zenml/zenml-pro/zenml-saas.mdx b/develop/getting-started/deploying-zenml/zenml-pro/zenml-saas.mdx deleted file mode 100644 index cfd1f5a..0000000 --- a/develop/getting-started/deploying-zenml/zenml-pro/zenml-saas.mdx +++ /dev/null @@ -1,26 +0,0 @@ ---- -title: "ZenML SaaS" -description: "Your one-stop MLOps control plane." -icon: user-ninja ---- - -One of the most straightforward paths to start with a deployed ZenML server is to use [ZenML Pro](https://zenml.io/pro). The ZenML Pro offering eliminates the need for you to dedicate time and resources to deploy and manage a ZenML server, allowing you to focus primarily on your MLOps workflows. - - - If you're interested in assessing ZenML Pro, you can simply create a [free account](https://cloud.zenml.io/?utm%5Fsource=docs&utm%5Fmedium=referral%5Flink&utm%5Fcampaign=cloud%5Fpromotion&utm%5Fcontent=signup%5Flink). Learn more about ZenML Pro on the [ZenML Website](https://zenml.io/pro). - - -## Key features - -ZenML Pro comes as a Software-as-a-Service (SaaS) platform that enhances the functionalities of the open-source ZenML product. It equips you with a centralized interface to seamlessly launch and manage ZenML server instances. While it remains rooted in the robust open-source offering, ZenML Pro offers extra features designed to optimize your machine learning workflow. - -### Managed ZenML Server (Multi-tenancy) - -ZenML Pro simplifies your machine learning workflows, enabling you to deploy a managed instance of ZenML servers with just one click. This eradicates the need to handle infrastructure complexities, making the set-up and management of your machine learning pipelines a breeze. We handle all pertinent system updates and backups, thus ensuring your system stays current and robust, allowing you to zero in on your essential MLOps tasks. As a ZenML Pro user, you'll also have priority support, giving you the necessary aid to fully utilize the platform. - -### Maximum data security - -At ZenML Pro, your data security and privacy are our top priority. The platform enables a secure connection to your infrastructure, tracking only metadata via an encrypted connection to maintain the confidentiality of your sensitive information. ZenML Pro integrates smoothly with your cloud services via service connectors, allowing a straightforward connection with various cloud resources without sacrificing data security. We hold your confidential information in a secure and isolated environment, offering an extra degree of protection. 
If desired, you can even [supply your own secret store](/develop/getting-started/deploying-zenml/manage-the-deployed-services/custom-secret-stores). - -Click [here](/develop/getting-started/deploying-zenml/zenml-pro/system-architectures) to understand about the ZenML Pro system architecture. - diff --git a/develop/getting-started/faq.mdx b/develop/getting-started/faq.mdx deleted file mode 100644 index 8468daa..0000000 --- a/develop/getting-started/faq.mdx +++ /dev/null @@ -1,75 +0,0 @@ ---- -title: "FAQ" -description: "Find answers to the most frequently asked questions about ZenML." -icon: circle-question -mode: wide ---- - - - - We built it because we scratched our own itch while deploying multiple machine-learning models in production over the past three years. Our team struggled to find a simple yet production-ready solution whilst developing large-scale ML pipelines. We built a solution for it that we are now proud to share with all of you! Read more about this backstory [on our blog here](https://blog.zenml.io/why-zenml/). - - - Not really! An orchestrator in MLOps is the system component that is responsible for executing and managing the execution of an ML pipeline. ZenML is a framework that allows you to run your pipelines on whatever orchestrator you like, and we coordinate with all the other parts of an ML system in production. There are [standard orchestrators](/develop/stack-components/orchestrators) that ZenML supports out-of-the-box, but you are encouraged to [write your own orchestrator](/develop/stack-components/orchestrators/custom) in order to gain more control as to exactly how your pipelines are executed! - - - -Take a look at our [documentation](https://docs.zenml.io) (in particular the [component guide](/develop/stack-components/component-guide)), which contains instructions and sample code to support each integration that ZenML supports out-of-the-box. You can also check out [our integration test code](https://github.com/zenml-io/zenml/tree/main/tests/integration/examples) to see active examples of many of our integrations in action. - -The ZenML team and community are constantly working to include more tools and integrations to the above list (check out the [roadmap](https://zenml.io/roadmap) for more details). You can [upvote features](https://zenml.io/discussion) you'd like and add your ideas to the roadmap. - -Most importantly, ZenML is extensible, and we encourage you to use it with whatever other tools you require as part of your ML process and system(s). Check out [our documentation on how to get started](/develop/) with extending ZenML to learn more! - - - ZenML officially supports Windows if you're using WSL. Much of ZenML will also work on Windows outside a WSL environment, but we don't officially support it and some features don't work (notably anything that requires spinning up a server process). - - - Yes, ZenML does support Macs running on Apple Silicon. You just need to make sure that you set the following environment variable: - -```Bash -export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES -``` -This is a known issue with how forking works on Macs running on Apple Silicon and it will enable you to use ZenML and the server. This environment variable is needed if you are working with a local server on your Mac, but if you're just using ZenML as a client / CLI and connecting to a deployed server then you don't need to set it. - - - - This depends on the tool and its respective MLOps category. 
We have a full guide on this over [here](/develop/stack-components/stack-deployment/implement-a-custom-stack-component)! - - - We develop ZenML together with our community! To get involved, the best way to get started is to select any issue from the [good-first-issue label](https://github.com/zenml-io/zenml/labels/good%20first%20issue). If you would like to contribute, please review our [Contributing Guide](https://github.com/zenml-io/zenml/blob/main/CONTRIBUTING.md) for all relevant details. - - - The first point of call should be [our Slack group](https://zenml.io/slack/). Ask your questions about bugs or specific use cases and someone from the core team will respond. - - - ZenML is distributed under the terms of the Apache License Version 2.0. A complete version of the license is available in the [LICENSE.md](https://github.com/zenml-io/zenml/blob/main/LICENSE) file in this repository. Any contribution made to this project will be licensed under the Apache License Version 2.0. - - - Check out [our dedicated documentation page]() on some ways you can try to solve these dependency and versioning issues. - - - - Our dedicated documentation guide on implementing this is the place to learn more. - - - - To reset your ZenML client, you can run `zenml clean`, which will wipe your local metadata database and reset your client. Note that this is a destructive action, so feel free to reach out to us on Slack before doing this if you are unsure. - - - - Please read our general information on how to compose steps + pipelines together to start with. You might also find the code examples in our guide to implementing hyperparameter tuning, which is related to this topic. - - - - Project templates allow you to get going quickly with ZenML. We recommend the Starter template (`starter`) for most use cases, as it gives you a basic scaffold and structure around which you can write your own code. You can also build templates for others inside a Git repository and use them with ZenML's templates functionality. - - - - diff --git a/develop/getting-started/installation.mdx b/develop/getting-started/installation.mdx deleted file mode 100644 index 6b3984f..0000000 --- a/develop/getting-started/installation.mdx +++ /dev/null @@ -1,99 +0,0 @@ ---- -title: "Installation" -description: "Installing ZenML and getting started." -icon: user-astronaut ---- - -**ZenML** is a Python package that can be installed directly via `pip`: - - -```Bash -pip install zenml -``` - - - Note that ZenML currently supports **Python 3.8, 3.9, 3.10, and 3.11**. Please make sure that you are using a supported Python version. - - -## Install with the dashboard - -ZenML comes bundled with a web dashboard that lives inside a [sister repository](https://github.com/zenml-io/zenml-dashboard). In order to get access to the dashboard **locally**, you need to launch the [ZenML Server and Dashboard locally](/develop/getting-started/deploying-zenml). For this, you need to install the optional dependencies for the ZenML Server: - - -```Bash
pip install "zenml[server]" -``` - - - We highly encourage you to install ZenML in a virtual environment. 
At ZenML, we like to use [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/) or [pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv) to manage our Python virtual environments. - - -## Installing onto MacOS with Apple Silicon (M1, M2) - -A change in how forking works on Macs running on Apple Silicon means that you should set the following environment variable, which will ensure that your connections to the server remain unbroken: - - -```Bash -export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES -``` - -You can read more about this [here](http://sealiesoftware.com/blog/archive/2017/6/5/Objective-C%5Fand%5Ffork%5Fin%5FmacOS%5F1013.html). This environment variable is needed if you are working with a local server on your Mac, but if you're just using ZenML as a client / CLI and connecting to a deployed server then you don't need to set it. - -## Nightly builds - -ZenML also publishes nightly builds under the [zenml-nightly package name](https://pypi.org/project/zenml-nightly/). These are built from the latest [develop branch](https://github.com/zenml-io/zenml/tree/develop) (to which work ready for release is published) and are not guaranteed to be stable. To install the nightly build, run: - - -```Bash -pip install zenml-nightly -``` - -## Verifying installations - -Once the installation is completed, you can check whether it was successful either through Bash: - - -```Bash -zenml version -``` - -or through Python: - - -```python -import zenml - -print(zenml.__version__) -``` - -If you would like to learn more about the current release, please visit our [PyPi package page](https://pypi.org/project/zenml). - -## Running with Docker - -`zenml` is also available as a Docker image hosted publicly on [DockerHub](https://hub.docker.com/r/zenmldocker/zenml). Use the following command to get started in a bash environment with `zenml` available: - -```Bash -docker run -it zenmldocker/zenml /bin/bash -``` - -If you would like to run the ZenML server with Docker: - - -```Bash -docker run -it -d -p 8080:8080 zenmldocker/zenml-server -``` - -## Deploying the server - -ZenML can run entirely as a pip package on a local system, complete with the dashboard. You can do this easily: - - -```Bash -pip install "zenml[server]" - -zenml up # opens the dashboard locally -``` - -However, advanced ZenML features are dependent on a centrally-deployed ZenML server accessible to other MLOps stack components. You can read more about it [here](/develop/getting-started/deploying-zenml). - -For the deployment of ZenML, you have the option to either [self-host](/develop/getting-started/deploying-zenml) it or register for a free [ZenML Pro](https://cloud.zenml.io/signup?utm%5Fsource=docs&utm%5Fmedium=referral%5Flink&utm%5Fcampaign=cloud%5Fpromotion&utm%5Fcontent=signup%5Flink) account. diff --git a/develop/getting-started/introduction.mdx b/develop/getting-started/introduction.mdx deleted file mode 100644 index b82bbb4..0000000 --- a/develop/getting-started/introduction.mdx +++ /dev/null @@ -1,172 +0,0 @@ ---- -title: "Introduction" -description: "Welcome to ZenML!" -icon: star -mode: wide ---- - -**ZenML** is an extensible, open-source MLOps framework for creating portable, production-ready machine learning pipelines. By decoupling infrastructure from code, ZenML enables developers across your organization to collaborate more effectively as they develop to production. - - - 
- - -ZenML enables MLOps infrastructure experts to define, deploy, and manage sophisticated production environments that are easy to share with colleagues. - -* **ZenML Pro** **:** [ZenML Pro](/develop/getting-started/deploying-zenml/zenml-pro/zenml-pro) provides a control plane that allows you to deploy a managed ZenML instance and get access to exciting new features such as CI/CD, Model Control Plane, and RBAC. - - - -* **Self-hosted deployment:** ZenML can be deployed on any cloud provider and provides many Terraform-based utility functions to deploy other MLOps tools or even entire MLOps stacks: - -```Bash -# Deploy ZenML to any cloud -zenml deploy --provider aws -# Connect cloud resources with a simple wizard -zenml stack register --provider aws -# Deploy entire MLOps stacks at once -zenml stack deploy --provider gcp -``` -* **Standardization:** With ZenML, you can standardize MLOps infrastructure and tooling across your organization. Simply register your staging and production environments as ZenML stacks and invite your colleagues to run ML workflows on them. - -```Bash -# Register MLOps tools and infrastructure -zenml orchestrator register kfp_orchestrator -f kubeflow -# Register your production environment -zenml stack register production --orchestrator kubeflow ... -``` -* Registering your environments as ZenML stacks also enables you to browse and explore them in a convenient user interface. Try it out at [https://www.zenml.io/live-demo](https://www.zenml.io/live-demo)! -* **No Vendor Lock-In:** Since infrastructure is decoupled from code, ZenML gives you the freedom to switch to a different tooling stack whenever it suits you. By avoiding vendor lock-in, you have the flexibility to transition between cloud providers or services, ensuring that you receive the best performance and pricing available in the market at any time. - -```Bash -zenml stack set gcp -python run.py # Run your ML workflows in GCP -zenml stack set aws -python run.py # Now your ML workflow runs in AWS -``` - - **Learn More** - -Ready to deploy and manage your MLOps infrastructure with ZenML? Here is a collection of pages you can take a look at next: - - - - Set up and manage production-ready infrastructure with ZenML. - - - Explore the existing infrastructure and tooling integrations of ZenML. - - -Find answers to the most frequently asked questions. - - - - -ZenML gives data scientists the freedom to fully focus on modeling and experimentation while writing code that is production-ready from the get-go. - -* **Develop Locally:** ZenML allows you to develop ML models in any environment using your favorite tools. This means you can start developing locally, and simply switch to a production environment once you are satisfied with your results. - -```py -python run.py # develop your code locally with all your favorite tools -zenml stack set production -python run.py # run on production infrastructure without any code changes -``` -* **Pythonic SDK:** ZenML is designed to be as unintrusive as possible. 
Adding a ZenML `@step` or `@pipeline` decorator to your Python functions is enough to turn your existing code into ZenML pipelines: - -```py -from zenml import pipeline, step -@step -def step_1() -> str: - return "world" -@step -def step_2(input_one: str, input_two: str) -> None: - combined_str = input_one + ' ' + input_two - print(combined_str) -@pipeline -def my_pipeline(): - output_step_one = step_1() - step_2(input_one="hello", input_two=output_step_one) -my_pipeline() -``` -* **Automatic Metadata Tracking:** ZenML automatically tracks the metadata of all your runs and saves all your datasets and models to disk and versions them. Using the ZenML dashboard, you can see detailed visualizations of all your experiments. Try it out at [https://www.zenml.io/live-demo](https://www.zenml.io/live-demo)! - -ZenML integrates seamlessly with many popular open-source tools, so you can also combine ZenML with other popular experiment tracking tools like [Weights & Biases](/develop/stack-components/experiment-trackers/wandb), [MLflow](/develop/stack-components/experiment-trackers/mlflow), or [Neptune](/develop/stack-components/experiment-trackers/neptune) for even better reproducibility. - - **Learn More** - -Ready to develop production-ready code with ZenML? Here is a collection of pages you can take a look at next: - - - -Understand the core concepts behind ZenML. - - - -Get started with ZenML and learn how to build your first pipeline and stack. - - - -Build your first ZenML pipeline and deploy it in the cloud. - - - - - -ZenML empowers ML engineers to take ownership of the entire ML lifecycle end-to-end. Adopting ZenML means fewer handover points and more visibility on what is happening in your organization. - -* **ML Lifecycle Management:** ZenML's abstractions enable you to manage sophisticated ML setups with ease. After you define your ML workflows as [Pipelines](/develop/getting-started/core-concepts#1-development) and your development, staging, and production infrastructures as [Stacks](/develop/getting-started/core-concepts#2-execution), you can move entire ML workflows to different environments in seconds. -```Bash -zenml stack set staging -python run.py # test your workflows on staging infrastructure -zenml stack set production -python run.py # run your workflows in production -``` -* **Reproducibility:** ZenML enables you to painlessly reproduce previous results by automatically tracking and versioning all stacks, pipelines, artifacts, and source code. In the ZenML dashboard, you can get an overview of everything that has happened and drill down into detailed lineage visualizations. Try it out at [https://www.zenml.io/live-demo](https://www.zenml.io/live-demo)! - - - - - -* **Automated Deployments:** With ZenML, you no longer need to upload custom Docker images to the cloud whenever you want to deploy a new model to production. Simply define your ML workflow as a ZenML pipeline, let ZenML handle the containerization, and have your model automatically deployed to a highly scalable Kubernetes deployment service like [Seldon](/develop/stack-components/model-deployers/seldon). - -```Bash -from zenml.integrations.seldon.steps import seldon_model_deployer_step -from my_organization.steps import data_loader_step, model_trainer_step -@pipeline -def my_pipeline(): - data = data_loader_step() - model = model_trainer_step(data) - seldon_model_deployer_step(model) -``` - - **Learn More** - -Ready to manage your ML lifecycles end-to-end with ZenML? 
Here is a collection of pages you can take a look at next: - - - - -Get started with ZenML and learn how to build your first pipeline and stack. - - - -Discover advanced ZenML features like config management and containerization. - - - -Explore ZenML through practical use-case examples. - - - - - - - - - - - - - - diff --git a/develop/getting-started/migration-guide/migration-guide.mdx b/develop/getting-started/migration-guide/migration-guide.mdx deleted file mode 100644 index fba2142..0000000 --- a/develop/getting-started/migration-guide/migration-guide.mdx +++ /dev/null @@ -1,26 +0,0 @@ ---- -title: "Overview" -description: "How to migrate your ZenML code to the newest version." -icon: recycle ---- - -Migrations are necessary for ZenML releases that include breaking changes, which are currently all releases that increment the minor version of the release, e.g., `0.X` \-> `0.Y`. Furthermore, all releases that increment the first non-zero digit of the version contain major breaking changes or paradigm shifts that are explained in separate migration guides below. - -## Release Type Examples - -* `0.40.2` to `0.40.3` contains _no breaking changes_ and requires no migration whatsoever, -* `0.40.3` to `0.41.0` contains _minor breaking changes_ that need to be taken into account when upgrading ZenML, -* `0.39.1` to `0.40.0` contains _major breaking changes_ that introduce major shifts in how ZenML code is written or used. - -## Major Migration Guides - -The following guides contain detailed instructions on how to migrate between ZenML versions that introduced major breaking changes or paradigm shifts. The migration guides are sequential, meaning if there is more than one migration guide between your current version and the latest release, follow each guide in order. - -* [Migration guide 0.13.2 to 0.20.0](/develop/getting-started/migration-guide/migration-zero-twenty) -* [Migration guide 0.23.0 to 0.30.0](/develop/getting-started/migration-guide/migration-zero-thirty) -* [Migration guide 0.39.1 to 0.41.0](/develop/getting-started/migration-guide/migration-zero-forty) - -## Release Notes - -For releases with minor breaking changes, e.g., `0.40.3` to `0.41.0`, check out the official [ZenML Release Notes](https://github.com/zenml-io/zenml/releases) to see which breaking changes were introduced. - diff --git a/develop/getting-started/migration-guide/migration-zero-forty.mdx b/develop/getting-started/migration-guide/migration-zero-forty.mdx deleted file mode 100644 index 2d94b61..0000000 --- a/develop/getting-started/migration-guide/migration-zero-forty.mdx +++ /dev/null @@ -1,660 +0,0 @@ ---- -title: "Migration guide 0.39.1 → 0.41.0" -description: "How to migrate your ZenML pipelines and steps from version up to and including 0.39.1 to 0.41.0." -icon: circle-chevron-right ---- - -ZenML versions 0.40.0 to 0.41.0 introduced a new and more flexible syntax to define ZenML steps and pipelines. This page contains code samples that show you how to upgrade your steps and pipelines to the new syntax. - - - Newer versions of ZenML still work with pipelines and steps defined using the old syntax, but the old syntax is deprecated and will be removed in the future. 
- - -## Overview - - - ```py -from typing import Optional - -from zenml.steps import BaseParameters, Output, StepContext, step - -from zenml.pipelines import pipeline - -# Define a Step - -class MyStepParameters(BaseParameters): - - param_1: int - - param_2: Optional[float] = None - -@step - -def my_step( - - params: MyStepParameters, context: StepContext, - -) -> Output(int_output=int, str_output=str): - - result = int(params.param_1 * (params.param_2 or 1)) - - result_uri = context.get_output_artifact_uri() - - return result, result_uri - -# Run the Step separately - -my_step.entrypoint() - -# Define a Pipeline - -@pipeline - -def my_pipeline(my_step): - - my_step() - -step_instance = my_step(params=MyStepParameters(param_1=17)) - -pipeline_instance = my_pipeline(my_step=step_instance) - -# Configure and run the Pipeline - -pipeline_instance.configure(enable_cache=False) - -schedule = Schedule(...) - -pipeline_instance.run(schedule=schedule) - -# Fetch the Pipeline Run - -last_run = pipeline_instance.get_runs()[0] - -int_output = last_run.get_step["my_step"].outputs["int_output"].read() -``` - - - ```py -from typing import Annotated, Optional, Tuple - -from zenml import get_step_context, pipeline, step - -from zenml.client import Client - -# Define a Step - -@step - -def my_step( - - param_1: int, param_2: Optional[float] = None - -) -> Tuple[Annotated[int, "int_output"], Annotated[str, "str_output"]]: - - result = int(param_1 * (param_2 or 1)) - - result_uri = get_step_context().get_output_artifact_uri() - - return result, result_uri - -# Run the Step separately - -my_step() - -# Define a Pipeline - -@pipeline - -def my_pipeline(): - - my_step(param_1=17) - -# Configure and run the Pipeline - -my_pipeline = my_pipeline.with_options(enable_cache=False, schedule=schedule) - -my_pipeline() - -# Fetch the Pipeline Run - -last_run = my_pipeline.last_run - -int_output = last_run.steps["my_step"].outputs["int_output"].load() -``` - - - - - -## Defining steps - - - - ```py -from zenml.steps import step, BaseParameters - -from zenml.pipelines import pipeline - -# Old: Subclass `BaseParameters` to define parameters for a step - -class MyStepParameters(BaseParameters): - - param_1: int - - param_2: Optional[float] = None - -@step - -def my_step(params: MyStepParameters) -> None: - - ... - -@pipeline - -def my_pipeline(my_step): - - my_step() - -step_instance = my_step(params=MyStepParameters(param_1=17)) - -pipeline_instance = my_pipeline(my_step=step_instance) -``` - - -```py -# New: Directly define the parameters as arguments of your step function. - -# In case you still want to group your parameters in a separate class, - -# you can subclass `pydantic.BaseModel` and use that as an argument of your - -# step function - -from zenml import pipeline, step - -@step - -def my_step(param_1: int, param_2: Optional[float] = None) -> None: - - ... - -@pipeline - -def my_pipeline(): - - my_step(param_1=17) -``` - - - - - -Check out [this page](/develop/usage/pipelines/build-pipelines/use-pipeline-step-parameters) for more information on how to parameterize your steps. - -## Calling a step outside of a pipeline - - -```PY -from zenml.steps import step - -@step - -def my_step() -> None: - - ... - -my_step.entrypoint() # Old: Call `step.entrypoint(...)` -``` - - -```PY -from zenml import step - -@step - -def my_step() -> None: - - ... 
- -my_step() # New: Call the step directly `step(...)` -``` - - - - - - - - -## Defining pipelines - - - -```py -from zenml.pipelines import pipeline - -@pipeline - -def my_pipeline(my_step): # Old: steps are arguments of the pipeline function - - my_step() -``` - - -```py -from zenml import pipeline, step - -@step - -def my_step() -> None: - - ... - -@pipeline - -def my_pipeline(): - - my_step() # New: The pipeline function calls the step directly -``` - - - - - -## Configuring pipelines - - - - ```py -from zenml.pipelines import pipeline - -from zenml.steps import step - -@step - -def my_step() -> None: - - ... - -@pipeline - -def my_pipeline(my_step): - - my_step() - -# Old: Create an instance of the pipeline and then call `pipeline_instance.configure(...)` - -pipeline_instance = my_pipeline(my_step=my_step()) - -pipeline_instance.configure(enable_cache=False) -``` - - -```py -from zenml import pipeline, step - -@step - -def my_step() -> None: - - ... - -@pipeline - -def my_pipeline(): - - my_step() - -# New: Call the `with_options(...)` method on the pipeline - -my_pipeline = my_pipeline.with_options(enable_cache=False) -``` - - - - - - - -## Running pipelines - - - -```py -from zenml.pipelines import pipeline - -from zenml.steps import step - -@step - -def my_step() -> None: - - ... - -@pipeline - -def my_pipeline(my_step): - - my_step() - -# Old: Create an instance of the pipeline and then call `pipeline_instance.run(...)` - -pipeline_instance = my_pipeline(my_step=my_step()) - -pipeline_instance.run(...) -``` - - -```py -from zenml import pipeline, step - -@step - -def my_step() -> None: - - ... - -@pipeline - -def my_pipeline(): - - my_step() - -my_pipeline() # New: Call the pipeline -``` - - - - - -## Scheduling pipelines - - - -```py -from zenml.pipelines import pipeline, Schedule - -from zenml.steps import step - -@step - -def my_step() -> None: - - ... - -@pipeline - -def my_pipeline(my_step): - - my_step() - -# Old: Create an instance of the pipeline and then call `pipeline_instance.run(schedule=...)` - -schedule = Schedule(...) - -pipeline_instance = my_pipeline(my_step=my_step()) - -pipeline_instance.run(schedule=schedule) -``` - - -```py -from zenml.pipelines import Schedule - -from zenml import pipeline, step - -@step - -def my_step() -> None: - - ... - -@pipeline - -def my_pipeline(): - - my_step() - -# New: Set the schedule using the `pipeline.with_options(...)` method and then run it - -schedule = Schedule(...) - -my_pipeline = my_pipeline.with_options(schedule=schedule) - -my_pipeline() -``` - - - - - -Check out [this page](/develop/usage/pipelines/build-pipelines/schedule-a-pipeline) for more information on how to schedule your pipelines. 
- -## Fetching pipelines after execution - - - -```py -pipeline: PipelineView = zenml.post_execution.get_pipeline("first_pipeline") - -last_run: PipelineRunView = pipeline.runs[0] - -# OR: last_run = my_pipeline.get_runs()[0] - -model_trainer_step: StepView = last_run.get_step("model_trainer") - -model: ArtifactView = model_trainer_step.output - -loaded_model = model.read() -``` - - -```py -pipeline: PipelineResponseModel = zenml.client.Client().get_pipeline("first_pipeline") - -# OR: pipeline = pipeline_instance.model - -last_run: PipelineRunResponseModel = pipeline.last_run - -# OR: last_run = pipeline.runs[0] - -# OR: last_run = pipeline.get_runs(custom_filters)[0] - -# OR: last_run = pipeline.last_successful_run - -model_trainer_step: StepRunResponseModel = last_run.steps["model_trainer"] - -model: ArtifactResponseModel = model_trainer_step.output - -loaded_model = model.load() -``` - - - - -Check out [this page](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps) for more information on how to programmatically fetch information about previous pipeline runs. - -## Controlling the step execution order - - - -```py -from zenml.pipelines import pipeline - -@pipeline - -def my_pipeline(step_1, step_2, step_3): - - step_1() - - step_2() - - step_3() - - step_3.after(step_1) # Old: Use the `step.after(...)` method - - step_3.after(step_2) -``` - - -```py -from zenml import pipeline - -@pipeline - -def my_pipeline(): - - step_1() - - step_2() - - step_3(after=["step_1", "step_2"]) # New: Pass the `after` argument when calling a step -``` - - - - -Check out [this page](/develop/usage/pipelines/build-pipelines/control-execution-order-of-steps) for more information on how to control the step execution order. - -## Defining steps with multiple outputs - - -```py -# Old: Use the `Output` class - -from zenml.steps import step, Output - -@step - -def my_step() -> Output(int_output=int, str_output=str): - - ... -``` - - -```py -# New: Use a `Tuple` annotation and optionally assign custom output names - -from typing_extensions import Annotated - -from typing import Tuple - -from zenml import step - -# Default output names `output_0`, `output_1` - -@step - -def my_step() -> Tuple[int, str]: - - ... - -# Custom output names - -@step - -def my_step() -> Tuple[ - - Annotated[int, "int_output"], - - Annotated[str, "str_output"], - -]: - - ... -``` - - - - -Check out [this page](/develop/usage/pipelines/build-pipelines/step-output-typing-and-annotation) for more information on how to annotate your step outputs. - -## Accessing run information inside steps - - - -```py -from zenml.steps import StepContext, step - -from zenml.environment import Environment - -@step - -def my_step(context: StepContext) -> Any: # Old: `StepContext` class defined as arg - - env = Environment().step_environment - - output_uri = context.get_output_artifact_uri() - - step_name = env.step_name # Old: Run info accessible via `StepEnvironment` - - ... -``` - - -```py -from zenml import get_step_context, step - -@step - -def my_step() -> Any: # New: StepContext is no longer an argument of the step - - context = get_step_context() - - output_uri = context.get_output_artifact_uri() - - step_name = context.step_name # New: StepContext now has ALL run/step info - - ... -``` - - - - - -Check out [this page](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps) for more information on how to fetch run information inside your steps using `get_step_context()`. 
- diff --git a/develop/getting-started/migration-guide/migration-zero-sixty.mdx b/develop/getting-started/migration-guide/migration-zero-sixty.mdx deleted file mode 100644 index 07d8210..0000000 --- a/develop/getting-started/migration-guide/migration-zero-sixty.mdx +++ /dev/null @@ -1,74 +0,0 @@ ---- -title: "Migration guide 0.58.2 → 0.60.0" -description: "How to migrate from ZenML 0.58.2 to 0.60.0 (Pydantic 2 edition)." -icon: circle-chevron-right ---- - -ZenML now uses Pydantic v2\. - -This upgrade comes with a set of critical updates. While your user experience mostly remains unaffected, you might see unexpected behavior due to the changes in our dependencies. Moreover, since Pydantic v2 provides a slightly stricter validation process, you might end up bumping into some validation errors which were not caught before, but it is all for the better. If you run into any other errors, please let us know either on [GitHub](https://github.com/zenml-io/zenml) or on our [Slack](https://zenml.io/slack-invite). - -## Changes in some of the critical dependencies - -* SQLModel is one of the core dependencies of ZenML and prior to this upgrade, we were utilizing version `0.0.8`. However, this version is relatively outdated and incompatible with Pydantic v2\. Within the scope of this upgrade, we upgraded it to `0.0.18`. -* Due to the change in the SQLModel version, we also had to upgrade our SQLAlchemy dependency from V1 to v2\. While this does not affect the way that you are using ZenML, if you are using SQLAlchemy in your environment, you might have to migrate your code as well. For a detailed list of changes, feel free to check [their migration guide](https://docs.sqlalchemy.org/en/20/changelog/migration%5F20.html). - -## Changes in `pydantic` - -Pydantic v2 brings a lot of new and exciting changes to the table. The core logic now uses Rust and it is much faster and more efficient in terms of performance. On top of it, the main concepts like model design, configuration, validation, or serialization now include a lot of new cool features. If you are using `pydantic` in your workflow and are interested in the new changes, you can check [the brilliant migration guide](https://docs.pydantic.dev/2.7/migration/) provided by the `pydantic` team to see the full list of changes. - -## Changes in our integrations - -Much like ZenML, `pydantic` is an important dependency in many other Python packages. That’s why conducting this upgrade helped us unlock a new version for several ZenML integration dependencies. Additionally, in some instances, we had to adapt the functionality of the integration to keep it compatible with `pydantic`. So, if you are using any of these integrations, please go through the changes. - -### Airflow - -As mentioned above, upgrading our `pydantic` dependency meant we had to upgrade our `sqlmodel` dependency. Upgrading our `sqlmodel` dependency meant we had to upgrade our `sqlalchemy` dependency as well. Unfortunately, `apache-airflow` is still using `sqlalchemy` v1 and is incompatible with pydantic v2\. As a solution, we have removed the dependencies of the `airflow` integration. Now, you can use ZenML to create your Airflow pipelines and use a separate environment to run them with Airflow. You can check the updated docs [right here](https://docs.zenml.io/stack-components/orchestrators/airflow). - -### AWS - -Some of our integrations now require `protobuf` 4\. 
Since our previous `sagemaker` version (`2.117.0`) did not support `protobuf` 4, we could not pair it with these new integrations. Thankfully `sagemaker` started supporting `protobuf` 4 with version `2.172.0` and relaxing its dependency solved the compatibility issue. - -### Evidently - -The old version of our `evidently` integration was not compatible with Pydantic v2\. They started supporting it from version `0.4.16`. As their latest version is `0.4.22`, the new dependency of the integration is limited between these two versions. - -### Feast - -Our previous implementation of the `feast` integration was not compatible with Pydantic v2 due to the extra `redis` dependency we were using. This extra dependency is now removed and the `feast` integration is working as intended. - -### GCP - -The previous version of the Kubeflow dependency (`kfp==1.8.22`) in our GCP integration required Pydantic V1 to be installed. While we were upgrading our Pydantic dependency, we saw this as an opportunity to upgrade the `kfp` dependency to v2 (which has no dependencies on the Pydantic library). This is why you may see some functional changes in the vertex step operator and orchestrator. If you would like to go through the changes in the `kfp` library, you can find [the migration guide here](https://www.kubeflow.org/docs/components/pipelines/v2/migration/). - -### Great Expectations - -Great Expectations started supporting Pydantic v2 starting from version `0.17.15` and they are closing in on their `1.0` release. Since this release might include a lot of big changes, we adjusted the dependency in our integration to `great-expectations>=0.17.15,<1.0`. We will try to keep it updated in the future once they release the `1.0` version. - -### Kubeflow - -Similar to the GCP integration, the previous version of the kubeflow dependency (`kfp==1.8.22`) in our `kubeflow` integration required Pydantic V1 to be installed. While we were upgrading our Pydantic dependency, we saw this as an opportunity to upgrade the `kfp` dependency to v2 (which has no dependencies on the Pydantic library). If you would like to go through the changes in the `kfp` library, you can find [the migration guide here](https://www.kubeflow.org/docs/components/pipelines/v2/migration/). (We are also considering adding an alternative version of this integration so our users can keep using `kfp` V1 in their environment. Stay tuned for any updates.) - -### MLflow - -`mlflow` is compatible with both Pydantic V1 and v2\. However, due to a known issue, if you install `zenml` first and then do `zenml integration install mlflow -y`, it downgrades `pydantic` to V1\. This is why we manually added the same duplicated `pydantic` requirement in the integration definition as well. Keep in mind that the `mlflow` library is still using some features of `pydantic` V1 which are deprecated. So, if the integration is installed in your environment, you might run into some deprecation warnings. - -### Label Studio - -While we were working on updating our `pydantic` dependency, the `label-studio-sdk` released its 1.0 version. In this new version, `pydantic` v2 is also supported. The implementation and documentation of our Label Studio integration have been updated accordingly. - -### Skypilot - -With the switch to `pydantic` v2, the implementation of our `skypilot` integration mostly remained untouched. 
However, due to an incompatibility between the new version `pydantic` and the `azurecli`, the `skypilot[azure]` flavor can not be installed at the same time, thus our `skypilot_azure` integration is currently deactivated. We are working on fixing this issue and if you are using this integration in your workflows, we recommend staying on the previous version of ZenML until we can solve this issue. - -### Tensorflow - -The new version of `pydantic` creates a drift between `tensorflow` and `typing_extensions` packages and relaxing the dependencies here resolves the issue. At the same time, the upgrade to `kfp` v2 (in integrations like `kubeflow`, `tekton`, or `gcp`) bumps our `protobuf` dependency from `3.X` to `4.X`. To stay compatible with this requirement, the installed version of `tensorflow` needs to be `>=2.12.0`. While this change solves the dependency issues in most settings, we have bumped into some errors while using `tensorflow` 2.12.0 on Python 3.8 on Ubuntu. If you would like to use this integration, please consider using a higher Python version. - -### Tekton - -Similar to the `gcp` and `kubeflow` integrations, the old version of our `tekton` integration was not compatible with `pydantic` V1 due to its `kfp` dependency. With the switch from `kfp` V1 to v2, we have adapted our implementation to use the new version of `kfp` library and updated our documentation accordingly. - - - Due to all aforementioned changes, when you upgrade ZenML to 0.60.0, you might run into some dependency issues, especially if you were previously using an integration which was not supporting Pydantic v2 before. In such cases, we highly recommend setting up a fresh Python environment. - diff --git a/develop/getting-started/migration-guide/migration-zero-thirty.mdx b/develop/getting-started/migration-guide/migration-zero-thirty.mdx deleted file mode 100644 index 33ed365..0000000 --- a/develop/getting-started/migration-guide/migration-zero-thirty.mdx +++ /dev/null @@ -1,17 +0,0 @@ ---- -title: "Migration guide 0.23.0 → 0.30.0" -description: "How to migrate from ZenML 0.20.0-0.23.0 to 0.30.0-0.39.1." -icon: circle-chevron-right ---- - - - Migrating to `0.30.0` performs non-reversible database changes so downgrading to `<=0.23.0` is not possible afterwards. If you are running on an older ZenML version, please follow the [0.20.0 Migration Guide](/develop/getting-started/migration-guide/migration-zero-twenty) first to prevent unexpected database migration failures. - -The ZenML 0.30.0 release removed the `ml-pipelines-sdk` dependency in favor of natively storing pipeline runs and artifacts in the ZenML database. The corresponding database migration will happen automatically as soon as you run any `zenml ...` CLI command after installing the new ZenML version, e.g.: - - -```shell -pip install zenml==0.30.0 - -zenml version # 0.30.0 -``` diff --git a/develop/getting-started/migration-guide/migration-zero-twenty.mdx b/develop/getting-started/migration-guide/migration-zero-twenty.mdx deleted file mode 100644 index f979b51..0000000 --- a/develop/getting-started/migration-guide/migration-zero-twenty.mdx +++ /dev/null @@ -1,602 +0,0 @@ ---- -title: "Migration guide 0.13.2 → 0.20.0" -description: "How to migrate from ZenML versions up to 0.13.2 to 0.20.0." -icon: circle-chevron-right ---- - -_Last updated: 2023-07-24_ - -The ZenML 0.20.0 release brings a number of big changes to its architecture and its features, some of which are not backwards compatible with previous versions. 
This guide walks you through these changes and offers instructions on how to migrate your existing ZenML stacks and pipelines to the new version with minimal effort and disruption to your existing workloads. - - -Updating to ZenML 0.20.0 needs to be followed by a migration of your existing ZenML Stacks and you may also need to make changes to your current ZenML pipeline code. Please read this guide carefully and follow the migration instructions to ensure a smooth transition. - -If you have updated to ZenML 0.20.0 by mistake or are experiencing issues with the new version, you can always go back to the previous version by using `pip install zenml==0.13.2` instead of `pip install zenml` when installing ZenML manually or in your scripts. - - -High-level overview of the changes: - -* [ZenML takes over the Metadata Store](/develop/getting-started/migration-guide/migration-zero-twenty#zenml-takes-over-the-metadata-store-role) role. All information about your ZenML Stacks, pipelines, and artifacts is tracked by ZenML itself directly. If you are currently using remote Metadata Stores (e.g. deployed in cloud) in your stacks, you will probably need to replace them with a [ZenML server deployment](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md). -* the [new ZenML Dashboard](/develop/getting-started/migration-guide/migration-zero-twenty#the-zenml-dashboard-is-now-available) is now available with all ZenML deployments. -* [ZenML Profiles have been removed](/develop/getting-started/migration-guide/migration-zero-twenty#removal-of-profiles-and-the-local-yaml-database) in favor of ZenML Projects. You need to [manually migrate your existing ZenML Profiles](/develop/getting-started/migration-guide/migration-zero-twenty#-how-to-migrate-your-profiles) after the update. -* the [configuration of Stack Components is now decoupled from their implementation](/develop/getting-started/migration-guide/migration-zero-twenty#decoupling-stack-component-configuration-from-implementation). If you extended ZenML with custom stack component implementations, you may need to update the way they are registered in ZenML. -* the updated ZenML server provides a new and improved collaborative experience. When connected to a ZenML server, you can now [share your ZenML Stacks and Stack Components](/develop/getting-started/migration-guide/migration-zero-twenty#shared-zenml-stacks-and-stack-components) with other users. If you were previously using the ZenML Profiles or the ZenML server to share your ZenML Stacks, you should switch to the new ZenML server and Dashboard and update your existing workflows to reflect the new features. - -## ZenML takes over the Metadata Store role - -ZenML can now run [as a server](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/core-concepts.md#zenml-server-and-dashboard) that can be accessed via a REST API and also comes with a visual user interface (called the ZenML Dashboard). This server can be deployed in arbitrary environments (local, on-prem, via Docker, on AWS, GCP, Azure etc.) and supports user management, workspace scoping, and more. 
- -The release introduces a series of commands to facilitate managing the lifecycle of the ZenML server and to access the pipeline and pipeline run information: - -* `zenml connect / disconnect / down / up / logs / status` can be used to configure your client to connect to a ZenML server, to start a local ZenML Dashboard, or to deploy a ZenML server to a cloud environment. For more information on how to use these commands, see [the ZenML deployment documentation](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md). -* `zenml pipeline list / runs / delete` can be used to display information about and manage your pipelines and pipeline runs. - -In ZenML 0.13.2 and earlier versions, information about pipelines and pipeline runs used to be stored in a separate stack component called the Metadata Store. Starting with 0.20.0, the role of the Metadata Store is now taken over by ZenML itself. This means that the Metadata Store is no longer a separate component in the ZenML architecture, but rather a part of the ZenML core, located wherever ZenML is deployed: locally on your machine or running remotely as a server. - -All metadata is now stored, tracked, and managed by ZenML itself. The Metadata Store stack component type and all its implementations have been deprecated and removed. It is no longer possible to register them or include them in ZenML stacks. This is a key architectural change in ZenML 0.20.0 that further improves usability and reproducibility, and makes it possible to visualize and manage all your pipelines and pipeline runs in the new ZenML Dashboard. - -The architecture changes for the local case are shown in the diagram below: - - -ZenML local metadata before 0.20.0 - - -ZenML local metadata after 0.20.0 - -The architecture changes for the remote case are shown in the diagram below: - - -ZenML remote metadata before 0.20.0 - - -ZenML remote metadata after 0.20.0 - -If you're already using ZenML, aside from the above limitation, this change will impact you differently, depending on the flavor of Metadata Stores you have in your stacks: - -* if you're using the default `sqlite` Metadata Store flavor in your stacks, you don't need to do anything. ZenML will automatically switch to using its local database instead of your `sqlite` Metadata Stores when you update to 0.20.0 (also see how to [migrate your stacks](/develop/getting-started/migration-guide/migration-zero-twenty#-how-to-migrate-your-profiles)). -* if you're using the `kubeflow` Metadata Store flavor _only as a way to connect to the local Kubeflow Metadata Service_ (i.e. the one installed by the `kubeflow` Orchestrator in a local k3d Kubernetes cluster), you also don't need to do anything explicitly. When you [migrate your stacks](/develop/getting-started/migration-guide/migration-zero-twenty#-how-to-migrate-your-profiles) to ZenML 0.20.0, ZenML will automatically switch to using its local database. -* if you're using the `kubeflow` Metadata Store flavor to connect to a remote Kubeflow Metadata Service such as those provided by a Kubeflow installation running in AWS, Google or Azure, there is currently no equivalent in ZenML 0.20.0. You'll need to [deploy a ZenML Server](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md) instance close to where your Kubeflow service is running (e.g. in the same cloud region).
-* if you're using the `mysql` Metadata Store flavor to connect to a remote MySQL database service (e.g. a managed AWS, GCP or Azure MySQL service), you'll have to [deploy a ZenML Server](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md) instance connected to that same database. -* if you deployed a `kubernetes` Metadata Store flavor (i.e. a MySQL database service deployed in Kubernetes), you can [deploy a ZenML Server](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md) in the same Kubernetes cluster and connect it to that same database. However, ZenML will no longer provide the `kubernetes` Metadata Store flavor and you'll have to manage the Kubernetes MySQL database service deployment yourself going forward. - - -The ZenML Server inherits the same limitations that the Metadata Store had prior to ZenML 0.20.0: - -* it is not possible to use a local ZenML Server to track pipelines and pipeline runs that are running remotely in the cloud, unless the ZenML server is explicitly configured to be reachable from the cloud (e.g. by using a public IP address or a VPN connection). -* using a remote ZenML Server to track pipelines and pipeline runs that are running locally is possible, but can have significant performance issues due to the network latency. - -It is therefore recommended that you always use a ZenML deployment that is located as close as possible to and reachable from where your pipelines and step operators are running. This will ensure the best possible performance and usability. - - -### How to migrate pipeline runs from your old metadata stores - - - The `zenml pipeline runs migrate` CLI command is only available under ZenML versions \[0.21.0, 0.21.1, 0.22.0\]. If you want to migrate your existing ZenML runs from `zenml<0.20.0` to `zenml>0.22.0`, please first upgrade to `zenml==0.22.0` and migrate your runs as shown below, then upgrade to the newer version. - -To migrate the pipeline run information already stored in an existing metadata store to the new ZenML paradigm, you can use the `zenml pipeline runs migrate` CLI command. - -1. Before upgrading ZenML, make a backup of all metadata stores you want to migrate, then upgrade ZenML. -2. Decide the ZenML deployment model that you want to follow for your projects. See the [ZenML deployment documentation](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md) for available deployment scenarios. If you decide on using a local or remote ZenML server to manage your pipelines, make sure that you first connect your client to it by running `zenml connect`. -3. 
Use the `zenml pipeline runs migrate` CLI command to migrate your old pipeline runs: -* If you want to migrate from a local SQLite metadata store, you only need to pass the path to the metadata store to the command, e.g.: - - -```Bash -zenml pipeline runs migrate PATH/TO/LOCAL/STORE/metadata.db -``` - -* If you would like to migrate any other store, you will need to set `--database_type=mysql` and provide the MySQL host, username, and password in addition to the database name, e.g.: - - -```Bash -zenml pipeline runs migrate DATABASE_NAME \ - - --database_type=mysql \ - - --mysql_host=URL/TO/MYSQL \ - - --mysql_username=MYSQL_USERNAME \ - - --mysql_password=MYSQL_PASSWORD -``` - -### The New Way (CLI Command Cheat Sheet) - -**Deploy the server** - -`zenml deploy --aws` (maybe don’t do this :) since it spins up infrastructure on AWS…) - -**Spin up a local ZenML Server** - -`zenml up` - -**Connect to a pre-existing server** - -`zenml connect` (pass in URL / etc, or zenml connect --config + yaml file) - -**List your deployed server details** - -`zenml status` - -## The ZenML Dashboard is now available - -The new ZenML Dashboard is now bundled into the ZenML Python package and can be launched directly from Python. The source code lives in the [ZenML Dashboard repository](https://github.com/zenml-io/zenml-dashboard). - -To launch it locally, simply run `zenml up` on your machine and follow the instructions: - - -``` -$ zenml up - -Deploying a local ZenML server with name 'local'. - -Connecting ZenML to the 'local' local ZenML server (http://127.0.0.1:8237). - -Updated the global store configuration. - -Connected ZenML to the 'local' local ZenML server (http://127.0.0.1:8237). - -The local ZenML dashboard is available at 'http://127.0.0.1:8237'. You can - -connect to it using the 'default' username and an empty password. -``` - -The Dashboard will be available at `http://localhost:8237` by default: - - -ZenML Dashboard Preview - - - For more details on other possible deployment options, see the [ZenML deployment documentation](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md), and/or follow the [starter guide](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/pipelines/pipelines.md) to learn more. - - -## Removal of Profiles and the local YAML database - -Prior to 0.20.0, ZenML used a set of local YAML files to store information about the Stacks and Stack Components that were registered on your machine. In addition to that, these Stacks could be grouped together and organized under individual Profiles. - -Profiles and the local YAML database have both been deprecated and removed in ZenML 0.20.0. Stacks, Stack Components, as well as all other information that ZenML tracks, such as Pipelines and Pipeline Runs, are now stored in a single SQL database. These entities are no longer organized into Profiles, but they can be scoped into different Projects instead. - - - Since the local YAML database is no longer used by ZenML 0.20.0, you will lose all the Stacks and Stack Components that you currently have configured when you update to ZenML 0.20.0. If you still want to use these Stacks, you will need to [manually migrate](/develop/getting-started/migration-guide/migration-zero-twenty#-how-to-migrate-your-profiles) them after the update.
- - -### How to migrate your Profiles - -If you're already using ZenML, you can migrate your existing Profiles to the new ZenML 0.20.0 paradigm by following these steps: - -1. first, update ZenML to 0.20.0\. This will automatically invalidate all your existing Profiles. -2. decide the ZenML deployment model that you want to follow for your projects. See the [ZenML deployment documentation](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/getting-started/deploying-zenml/deploying-zenml.md) for available deployment scenarios. If you decide on using a local or remote ZenML server to manage your pipelines, make sure that you first connect your client to it by running `zenml connect`. -3. use the `zenml profile list` and `zenml profile migrate` CLI commands to import the Stacks and Stack Components from your Profiles into your new ZenML deployment. If you have multiple Profiles that you would like to migrate, you can either use a prefix for the names of your imported Stacks and Stack Components, or you can use a different ZenML Project for each Profile. - - - The ZenML Dashboard is currently limited to showing only information that is available in the `default` Project. If you wish to migrate your Profiles to a different Project, you will not be able to visualize the migrated Stacks and Stack Components in the Dashboard. This will be fixed in a future release. - -Once you've migrated all your Profiles, you can delete the old YAML files. - -Example of migrating a `default` profile into the `default` project: - - -``` -$ zenml profile list - -ZenML profiles have been deprecated and removed in this version of ZenML. All - -stacks, stack components, flavors etc. are now stored and managed globally, - -either in a local database or on a remote ZenML server (see the `zenml up` and - -`zenml connect` commands). As an alternative to profiles, you can use projects - -as a scoping mechanism for stacks, stack components and other ZenML objects. - -The information stored in legacy profiles is not automatically migrated. You can - -do so manually by using the `zenml profile list` and `zenml profile migrate` commands. - -Found profile with 1 stacks, 3 components and 0 flavors at: /home/stefan/.config/zenml/profiles/default - -Found profile with 3 stacks, 6 components and 0 flavors at: /home/stefan/.config/zenml/profiles/zenprojects - -Found profile with 3 stacks, 7 components and 0 flavors at: /home/stefan/.config/zenml/profiles/zenbytes - -$ zenml profile migrate /home/stefan/.config/zenml/profiles/default - -No component flavors to migrate from /home/stefan/.config/zenml/profiles/default/stacks.yaml... - -Migrating stack components from /home/stefan/.config/zenml/profiles/default/stacks.yaml... - -Created artifact_store 'cloud_artifact_store' with flavor 's3'. - -Created container_registry 'cloud_registry' with flavor 'aws'. - -Created container_registry 'local_registry' with flavor 'default'. - -Created model_deployer 'eks_seldon' with flavor 'seldon'. - -Created orchestrator 'cloud_orchestrator' with flavor 'kubeflow'. - -Created orchestrator 'kubeflow_orchestrator' with flavor 'kubeflow'. - -Created secrets_manager 'aws_secret_manager' with flavor 'aws'. - -Migrating stacks from /home/stefan/.config/zenml/profiles/v/stacks.yaml... - -Created stack 'cloud_kubeflow_stack'. - -Created stack 'local_kubeflow_stack'. - -$ zenml stack list - -Using the default local database. 
- -Running with active project: 'default' (global) - -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓ - -┃ ACTIVE │ STACK NAME │ STACK ID │ SHARED │ OWNER │ CONTAINER_REGISTRY │ ARTIFACT_STORE │ ORCHESTRATOR │ MODEL_DEPLOYER │ SECRETS_MANAGER ┃ - -┠────────┼──────────────────────┼──────────────────────────────────────┼────────┼─────────┼────────────────────┼──────────────────────┼───────────────────────┼────────────────┼────────────────────┨ - -┃ │ local_kubeflow_stack │ 067cc6ee-b4da-410d-b7ed-06da4c983145 │ │ default │ local_registry │ default │ kubeflow_orchestrator │ │ ┃ - -┠────────┼──────────────────────┼──────────────────────────────────────┼────────┼─────────┼────────────────────┼──────────────────────┼───────────────────────┼────────────────┼────────────────────┨ - -┃ │ cloud_kubeflow_stack │ 054f5efb-9e80-48c0-852e-5114b1165d8b │ │ default │ cloud_registry │ cloud_artifact_store │ cloud_orchestrator │ eks_seldon │ aws_secret_manager ┃ - -┠────────┼──────────────────────┼──────────────────────────────────────┼────────┼─────────┼────────────────────┼──────────────────────┼───────────────────────┼────────────────┼────────────────────┨ - -┃ 👉 │ default │ fe913bb5-e631-4d4e-8c1b-936518190ebb │ │ default │ │ default │ default │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛ -``` - -Example of migrating a profile into the `default` project using a name prefix: - - -``` -$ zenml profile migrate /home/stefan/.config/zenml/profiles/zenbytes --prefix zenbytes_ - -No component flavors to migrate from /home/stefan/.config/zenml/profiles/zenbytes/stacks.yaml... - -Migrating stack components from /home/stefan/.config/zenml/profiles/zenbytes/stacks.yaml... - -Created artifact_store 'zenbytes_s3_store' with flavor 's3'. - -Created container_registry 'zenbytes_ecr_registry' with flavor 'default'. - -Created experiment_tracker 'zenbytes_mlflow_tracker' with flavor 'mlflow'. - -Created experiment_tracker 'zenbytes_mlflow_tracker_local' with flavor 'mlflow'. - -Created model_deployer 'zenbytes_eks_seldon' with flavor 'seldon'. - -Created model_deployer 'zenbytes_mlflow' with flavor 'mlflow'. - -Created orchestrator 'zenbytes_eks_orchestrator' with flavor 'kubeflow'. - -Created secrets_manager 'zenbytes_aws_secret_manager' with flavor 'aws'. - -Migrating stacks from /home/stefan/.config/zenml/profiles/zenbytes/stacks.yaml... - -Created stack 'zenbytes_aws_kubeflow_stack'. - -Created stack 'zenbytes_local_with_mlflow'. - -$ zenml stack list - -Using the default local database. 
- -Running with active project: 'default' (global) - -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ ACTIVE │ STACK NAME │ STACK ID │ SHARED │ OWNER │ ORCHESTRATOR │ ARTIFACT_STORE │ CONTAINER_REGISTRY │ SECRETS_MANAGER │ MODEL_DEPLOYER │ EXPERIMENT_TRACKER ┃ - -┠────────┼──────────────────────┼──────────────────────┼────────┼─────────┼───────────────────────┼───────────────────┼──────────────────────┼───────────────────────┼─────────────────────┼──────────────────────┨ - -┃ │ zenbytes_aws_kubeflo │ 9fe90f0b-2a79-47d9-8 │ │ default │ zenbytes_eks_orchestr │ zenbytes_s3_store │ zenbytes_ecr_registr │ zenbytes_aws_secret_m │ zenbytes_eks_seldon │ ┃ - -┃ │ w_stack │ f80-04e45ff02cdb │ │ │ ator │ │ y │ anager │ │ ┃ - -┠────────┼──────────────────────┼──────────────────────┼────────┼─────────┼───────────────────────┼───────────────────┼──────────────────────┼───────────────────────┼─────────────────────┼──────────────────────┨ - -┃ 👉 │ default │ 7a587e0c-30fd-402f-a │ │ default │ default │ default │ │ │ │ ┃ - -┃ │ │ 3a8-03651fe1458f │ │ │ │ │ │ │ │ ┃ - -┠────────┼──────────────────────┼──────────────────────┼────────┼─────────┼───────────────────────┼───────────────────┼──────────────────────┼───────────────────────┼─────────────────────┼──────────────────────┨ - -┃ │ zenbytes_local_with_ │ c2acd029-8eed-4b6e-a │ │ default │ default │ default │ │ │ zenbytes_mlflow │ zenbytes_mlflow_trac ┃ - -┃ │ mlflow │ d19-91c419ce91d4 │ │ │ │ │ │ │ │ ker ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -Example of migrating a profile into a new project: - - -``` -$ zenml profile migrate /home/stefan/.config/zenml/profiles/zenprojects --project zenprojects - -Unable to find ZenML repository in your current working directory (/home/stefan/aspyre/src/zenml) or any parent directories. If you want to use an existing repository which is in a different location, set the environment variable 'ZENML_REPOSITORY_PATH'. If you want to create a new repository, run zenml init. - -Running without an active repository root. - -Creating project zenprojects - -Creating default stack for user 'default' in project zenprojects... - -No component flavors to migrate from /home/stefan/.config/zenml/profiles/zenprojects/stacks.yaml... - -Migrating stack components from /home/stefan/.config/zenml/profiles/zenprojects/stacks.yaml... - -Created artifact_store 'cloud_artifact_store' with flavor 's3'. - -Created container_registry 'cloud_registry' with flavor 'aws'. - -Created container_registry 'local_registry' with flavor 'default'. - -Created model_deployer 'eks_seldon' with flavor 'seldon'. - -Created orchestrator 'cloud_orchestrator' with flavor 'kubeflow'. - -Created orchestrator 'kubeflow_orchestrator' with flavor 'kubeflow'. - -Created secrets_manager 'aws_secret_manager' with flavor 'aws'. - -Migrating stacks from /home/stefan/.config/zenml/profiles/zenprojects/stacks.yaml... - -Created stack 'cloud_kubeflow_stack'. - -Created stack 'local_kubeflow_stack'. - -$ zenml project set zenprojects - -Currently the concept of `project` is not supported within the Dashboard. The Project functionality will be completed in the coming weeks. 
For the time being it is recommended to stay within the `default` - -project. - -Using the default local database. - -Running with active project: 'default' (global) - -Set active project 'zenprojects'. - -$ zenml stack list - -Using the default local database. - -Running with active project: 'zenprojects' (global) - -The current global active stack is not part of the active project. Resetting the active stack to default. - -You are running with a non-default project 'zenprojects'. Any stacks, components, pipelines and pipeline runs produced in this project will currently not be accessible through the dashboard. However, this will be possible in the near future. - -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓ - -┃ ACTIVE │ STACK NAME │ STACK ID │ SHARED │ OWNER │ ARTIFACT_STORE │ ORCHESTRATOR │ MODEL_DEPLOYER │ CONTAINER_REGISTRY │ SECRETS_MANAGER ┃ - -┠────────┼──────────────────────┼──────────────────────────────────────┼────────┼─────────┼──────────────────────┼───────────────────────┼────────────────┼────────────────────┼────────────────────┨ - -┃ 👉 │ default │ 3ea77330-0c75-49c8-b046-4e971f45903a │ │ default │ default │ default │ │ │ ┃ - -┠────────┼──────────────────────┼──────────────────────────────────────┼────────┼─────────┼──────────────────────┼───────────────────────┼────────────────┼────────────────────┼────────────────────┨ - -┃ │ cloud_kubeflow_stack │ b94df4d2-5b65-4201-945a-61436c9c5384 │ │ default │ cloud_artifact_store │ cloud_orchestrator │ eks_seldon │ cloud_registry │ aws_secret_manager ┃ - -┠────────┼──────────────────────┼──────────────────────────────────────┼────────┼─────────┼──────────────────────┼───────────────────────┼────────────────┼────────────────────┼────────────────────┨ - -┃ │ local_kubeflow_stack │ 8d9343ac-d405-43bd-ab9c-85637e479efe │ │ default │ default │ kubeflow_orchestrator │ │ local_registry │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛ -``` - -The `zenml profile migrate` CLI command also provides command line flags for cases in which the user wants to overwrite existing components or stacks, or ignore errors. - -## Decoupling Stack Component configuration from implementation - -Stack components can now be registered without having the required integrations installed. As part of this change, we split all existing stack component definitions into three classes: an implementation class that defines the logic of the stack component, a config class that defines the attributes and performs input validations, and a flavor class that links implementation and config classes together. See [**component flavor models #895**](https://github.com/zenml-io/zenml/pull/895) for more details. - -If you are only using stack component flavors that are shipped with the zenml Python distribution, this change has no impact on the configuration of your existing stacks. However, if you are currently using custom stack component implementations, you will need to update them to the new format. See the [documentation on writing custom stack component flavors](/develop/stack-components/stack-deployment/implement-a-custom-stack-component) for updated information on how to do this. 
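To make the new three-class split more concrete, here is a rough, illustrative sketch of how a custom stack component could be laid out after the change. The `MyAlerter*` names are invented for this example, and the exact base classes, required properties, and registration flow can vary between ZenML versions, so treat this as an orientation aid and refer to the custom flavor documentation linked above for the authoritative structure:

```py
from typing import Optional, Type

from zenml.enums import StackComponentType
from zenml.stack import Flavor, StackComponent, StackComponentConfig


class MyAlerterConfig(StackComponentConfig):
    """Config class: declares the component's attributes and validates input."""

    webhook_url: str
    default_channel: Optional[str] = None


class MyAlerter(StackComponent):
    """Implementation class: contains the actual logic of the component."""

    def post(self, message: str) -> bool:
        # The validated config values are available on the component instance.
        print(f"Posting '{message}' to {self.config.webhook_url}")
        return True


class MyAlerterFlavor(Flavor):
    """Flavor class: links the config and implementation classes together."""

    @property
    def name(self) -> str:
        return "my_alerter"

    @property
    def type(self) -> StackComponentType:
        return StackComponentType.ALERTER

    @property
    def config_class(self) -> Type[MyAlerterConfig]:
        return MyAlerterConfig

    @property
    def implementation_class(self) -> Type[MyAlerter]:
        return MyAlerter
```

Because the configuration now lives in its own class, a flavor laid out like this can be registered and referenced in stack registrations without the implementation's dependencies being installed on the client machine; the implementation class is only loaded when the component is actually used.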
- -## Shared ZenML Stacks and Stack Components - -With collaboration being a key part of ZenML, the 0.20.0 release puts the concept of Users front and center and introduces the possibility to share stacks and stack components with other users by means of the ZenML server. - -When your client is connected to a ZenML server, entities such as Stacks, Stack Components, Stack Component Flavors, Pipelines, Pipeline Runs, and artifacts are scoped to a Project and owned by the User that creates them. Only the objects owned by the user currently authenticated to the ZenML server and that are part of the current project are available to the client. - -Stacks and Stack Components can also be shared within the same project with other users. To share an object, either set it as shared during creation time (e.g. `zenml stack register mystack ... --share`) or afterwards (e.g. through `zenml stack share mystack`). - -To differentiate between shared and private Stacks and Stack Components, these can now be addressed by name, ID, or the first few letters of the ID in the CLI. E.g. for a stack `default` with id `179ebd25-4c5b-480f-a47c-d4f04e0b6185` you can now run `zenml stack describe default` or `zenml stack describe 179` or `zenml stack describe 179ebd25-4c5b-480f-a47c-d4f04e0b6185`. - -We also introduce the notion of `local` vs `non-local` stack components. Local stack components are stack components that are configured to run locally, while non-local stack components are configured to run remotely or in a cloud environment. Consequently: - -* stacks made up of local stack components should not be shared on a central ZenML Server, even though this is not enforced by the system. -* stacks made up of non-local stack components are only functional if they are shared through a remotely deployed ZenML Server. - -Read more about shared stacks in the new [starter guide](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/stacks/managing-stacks.md#sharing-stacks-over-a-zenml-server). - -## Other changes - -### The `Repository` class is now called `Client` - -The `Repository` object has been renamed to `Client` to better capture its functionality. You can continue to use the `Repository` object for backwards compatibility, but it will be removed in a future release. - -**How to migrate**: Rename all references to `Repository` in your code to `Client`. - -### The `BaseStepConfig` class is now called `BaseParameters` - -The `BaseStepConfig` object has been renamed to `BaseParameters` to better capture its functionality. You can NOT continue to use `BaseStepConfig`. - -This is part of a broader configuration rehaul which is discussed next. - -**How to migrate**: Rename all references to `BaseStepConfig` in your code to `BaseParameters`. - -### Configuration Rework - -Alongside the architectural shift, Pipeline configuration has been completely rethought. This video gives an overview of how configuration has changed with ZenML in the post ZenML 0.20.0 world. - - - - -**What changed?** - -ZenML pipelines and steps could previously be configured in many different ways: - -* On the `@pipeline` and `@step` decorators (e.g. the `requirements` variable) -* In the `__init__` method of the pipeline and step class -* Using `@enable_xxx` decorators, e.g. `@enable_mlflow`.
-* Using specialized methods like `pipeline.with_config(...)` or `step.with_return_materializer(...)` - -Some of these configuration options were quite hidden, difficult to access, and not tracked in any way by the ZenML metadata store. - -With ZenML 0.20.0, we introduce the `BaseSettings` class, a broad class that serves as a central object to represent all runtime configuration of a pipeline run (apart from the `BaseParameters`). - -Pipelines and steps now allow all configurations on their decorators as well as the `.configure(...)` method. This includes configurations for stack components that are not infrastructure-related (which was previously done using the `@enable_xxx` decorators). The same configurations can also be defined in a YAML file. - -Read more about this paradigm in the [new docs section about settings](/develop/usage/project-setup/use-configuration-files/what-can-be-configured). - -Here is a list of the most obvious changes that follow from the above. Please note that this list is not exhaustive, and if we have missed something let us know via [Slack](https://zenml.io/slack). - -**Deprecating the `enable_xxx` decorators** - -With the above changes, we are deprecating the much-loved `enable_xxx` decorators, like `enable_mlflow` and `enable_wandb`. - -**How to migrate**: Simply remove the decorator and pass something like this to the step directly instead: - -```py -@step( - - experiment_tracker="mlflow_stack_comp_name", # name of registered component - - settings={ # settings of registered component - - "experiment_tracker.mlflow": { # this is `category`.`flavor`, so another example is `step_operator.spark` - - "experiment_name": "name", - - "nested": False - - } - - } - -) -``` - -**Deprecating `pipeline.with_config(...)`** - -**How to migrate**: Replaced with the new `pipeline.run(config_path=...)`. - -**Deprecating `step.with_return_materializer(...)`** - -**How to migrate**: Simply remove the `with_return_materializer` method and pass something like this to the step directly instead: - -```py -@step( - - output_materializers=materializer_or_dict_of_materializers_mapped_to_outputs - -) -``` - -**`DockerConfiguration` is now renamed to `DockerSettings`** - -**How to migrate**: Rename `DockerConfiguration` to `DockerSettings` and instead of passing it in the decorator directly with `docker_configuration`, you can use: - - -```py -from zenml.config import DockerSettings - -@step(settings={"docker": DockerSettings(...)}) - -def my_step() -> None: - - ... -``` - -With this change, all stack components (e.g. Orchestrators and Step Operators) that accepted a `docker_parent_image` as part of their Stack Configuration should now pass it through the `DockerSettings` object. - -Read more [here](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/production-fundamentals/containerization.md). - -**`ResourceConfiguration` is now renamed to `ResourceSettings`** - -**How to migrate**: Rename `ResourceConfiguration` to `ResourceSettings` and instead of passing it in the decorator directly with `resource_configuration`, you can use: - - -```py -from zenml.config import ResourceSettings - -@step(settings={"resources": ResourceSettings(...)}) - -def my_step() -> None: - - ...
-``` - -**Deprecating the `requirements` and `required_integrations` parameters** - -Users used to be able to pass `requirements` and `required_integrations` directly in the `@pipeline` decorator, but now need to pass them through settings: - -**How to migrate**: Simply remove the parameters and use `DockerSettings` instead: - - -```py -from zenml.config import DockerSettings - -@step(settings={"docker": DockerSettings(requirements=[...], required_integrations=[...])}) - -def my_step() -> None: - - ... -``` - -Read more [here](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/production-fundamentals/containerization.md). - -**A new pipeline intermediate representation** - -All the aforementioned configurations as well as additional information required to run a ZenML pipeline are now combined into an intermediate representation called `PipelineDeployment`. Instead of the user-facing `BaseStep` and `BasePipeline` classes, all the ZenML orchestrators and step operators now use this intermediate representation to run pipelines and steps. - -**How to migrate**: If you have written a [custom orchestrator](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/component-gallery/orchestrators/custom.md) or [step operator](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/component-gallery/step-operators/custom.md), then you should take a look at the new base abstractions (see the links). You can adjust your stack component implementations accordingly. - -### `PipelineSpec` now uniquely defines pipelines - -Once a pipeline has been executed, it is represented by a `PipelineSpec` that uniquely identifies it. Therefore, users are no longer able to edit a pipeline once it has been run. There are now three options to get around this: - -* Pipeline runs can be created without being associated with a pipeline explicitly: We call these `unlisted` runs. Read more about unlisted runs [here](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/pipelines/pipelines.md#unlisted-runs). -* Pipelines can be deleted and created again. -* Pipelines can be given unique names each time they are run to uniquely identify them. - -**How to migrate**: No code changes, but rather keep in mind the behavior (e.g. in a notebook setting) when quickly [iterating over pipelines as experiments](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/pipelines/parameters-and-caching.md). - -### New post-execution workflow - -The post-execution workflow has changed as follows: - -* The `get_pipelines` and `get_pipeline` methods have been moved out of the `Repository` (i.e. the new `Client`) class and now live directly in the `post_execution` module. To use them, the user has to do: - -```py -from zenml.post_execution import get_pipelines, get_pipeline -``` - -* New methods to directly get runs have been introduced: `get_run` to fetch a single run and `get_unlisted_runs` to fetch unlisted runs. - -Usage remains largely similar. Please read the [new docs for post-execution](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/pipelines/fetching-pipelines.md) to inform yourself of what further has changed.
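For orientation, here is a minimal sketch of what the new module-level access might look like; the pipeline and run names below are placeholders, and the exact attributes exposed by the returned views may vary between versions:

```py
from zenml.post_execution import get_pipeline, get_pipelines, get_run

# Fetch every pipeline, or a single pipeline by name (placeholder name below)
all_pipelines = get_pipelines()
pipeline = get_pipeline("my_pipeline")
print([run.name for run in pipeline.runs])  # run views still hang off the pipeline view

# Runs can now also be fetched directly, without going through Repository/Client
run = get_run("my_pipeline_run")
print(run.status)
```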
- -**How to migrate**: Replace all post-execution workflows from the paradigm of `Repository.get_pipelines` or `Repository.get_pipeline_run` to the corresponding post\_execution methods. - -## Future Changes - -While this rehaul is big and will break previous releases, we do have some more work left to do. However we also expect this to be the last big rehaul of ZenML before our 1.0.0 release, and no other release will be so hard breaking as this one. Currently planned future breaking changes are: - -* Following the metadata store, the secrets manager stack component might move out of the stack. -* ZenML `StepContext` might be deprecated. - -## Reporting Bugs - -While we have tried our best to document everything that has changed, we realize that mistakes can be made and smaller changes overlooked. If this is the case, or you encounter a bug at any time, the ZenML core team and community are available around the clock on the growing [Slack community](https://zenml.io/slack). - -For bug reports, please also consider submitting a [GitHub Issue](https://github.com/zenml-io/zenml/issues/new/choose). - -Lastly, if the new changes have left you desiring a feature, then consider adding it to our [public feature voting board](https://zenml.io/discussion). Before doing so, do check what is already on there and consider upvoting the features you desire the most. diff --git a/develop/getting-started/reference/community-and-content.mdx b/develop/getting-started/reference/community-and-content.mdx deleted file mode 100644 index 3df6766..0000000 --- a/develop/getting-started/reference/community-and-content.mdx +++ /dev/null @@ -1,36 +0,0 @@ ---- -title: "Community & content" -description: "All possible ways for our community to get in touch with ZenML." -icon: heart ---- - -The ZenML team and community have put together a list of references that can be used to get in touch with the development team of ZenML and develop a deeper understanding of the framework. - -### Slack Channel: Get help from the community - -The ZenML [Slack channel](https://zenml.io/slack) is the main gathering point for the community. Not only is it the best place to get in touch with the core team of ZenML, but it is also a great way to discuss new ideas and share your ZenML projects with the community. If you have a question, there is a high chance someone else might have already answered it on Slack! - -### Social Media: Bite-sized updates - -We are active on [LinkedIn](https://www.linkedin.com/company/zenml) and [Twitter](https://twitter.com/zenml%5Fio) where we post bite-sized updates on releases, events, and MLOps in general. Follow us to interact and stay up to date! We would appreciate it if you could comment on and share our posts so more people can benefit from our work at ZenML! - -### YouTube Channel: Video tutorials, workshops, and more - -Our [YouTube channel](https://www.youtube.com/c/ZenML) features a growing set of videos that take you through the entire framework. Go here if you are a visual learner, and follow along with some tutorials. - -### Public roadmap - -The feedback from our community plays a significant role in the development of ZenML. That's why we have a [public roadmap](https://zenml.io/roadmap) that serves as a bridge between our users and our development team. If you have ideas regarding any new features or want to prioritize one over the other, feel free to share your thoughts here or vote on existing ideas. 
- -### Blog - -On our [Blog](https://zenml.io/blog/) page, you can find various articles written by our team. We use it as a platform to share our thoughts and explain the implementation process of our tool, its new features, and the thought process behind them. - -### Podcast - -We also have a [Podcast](https://podcast.zenml.io/) series that brings you interviews and discussions with industry leaders, top technology professionals, and others. We discuss the latest developments in machine learning, deep learning, and artificial intelligence, with a particular focus on MLOps, or how trained models are used in production. - -### Newsletter - -You can also subscribe to our [Newsletter](https://zenml.io/newsletter-signup) where we share what we learn as we develop open-source tooling for production machine learning. You will also get all the exciting news about ZenML in general. - diff --git a/develop/mint.json b/develop/mint.json deleted file mode 100644 index 4640a10..0000000 --- a/develop/mint.json +++ /dev/null @@ -1,869 +0,0 @@ -{ - "$schema": "https://mintlify.com/schema.json", - "name": "ZenML", - "logo": { - "dark": "/_assets/logo/dark.svg", - "light": "/_assets/logo/light.svg" - }, - "favicon": "/_assets/favicon.png", - "colors": { - "primary": "#7a3ef4", - "light": "#7a3ef4", - "dark": "#7a3ef4", - "anchors": { - "from": "#7a3ef4", - "to": "#7a3ef4" - } - }, - "font": { - "family": "Plus Jakarta Sans" - }, - "background": { "style": "grid" }, - "layout": "sidenav", - "topbarLinks": [ - { - "name": "Support", - "url": "https://zenml.io/slack" - } - ], - "topbarCtaButton": { - "name": "Dashboard", - "url": "https://dashboard.mintlify.com" - }, - "tabs": [ - { - "name": "Usage", - "url": "usage" - }, - { - "name": "Guides", - "url": "user-guide" - }, - { - "name": "Stacks", - "url": "stack-components" - }, - { - "name": "SDK", - "url": "sdk" - }, - { - "name": "API", - "url": "api", - "openapi": "/api/openapi.json" - } - ], - "anchors": [ - { - "name": "About", - "icon": "users", - "url": "https://www.zenml.io" - }, - { - "name": "Community", - "icon": "slack", - "url": "https://mintlify.com/community" - }, - { - "name": "Blog", - "icon": "newspaper", - "url": "https://mintlify.com/blog" - } - ], - "versions": ["0.66.0", "0.65.0"], - "navigation": [ - { - "group": "🚀 Getting Started", - "pages": [ - "getting-started/introduction", - "getting-started/installation", - "getting-started/core-concepts", - "getting-started/faq" - ] - }, - { - "group": "🖥️ Deploying ZenML", - "pages": [ - "getting-started/deploying-zenml/deploying-zenml", - "getting-started/deploying-zenml/deploy-with-zenml-cli", - "getting-started/deploying-zenml/deploy-with-docker", - "getting-started/deploying-zenml/deploy-with-helm", - "getting-started/deploying-zenml/deploy-using-huggingface-spaces", - "getting-started/deploying-zenml/deploy-with-custom-image", - { - "group": "Manage deployed services", - "icon": "gear", - "iconType": "solid", - "pages": [ - "getting-started/deploying-zenml/manage-the-deployed-services", - "getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server", - "getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-your-deployed-server", - "getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-stack-components", - "getting-started/deploying-zenml/manage-the-deployed-services/custom-secret-stores" - ] - }, - "getting-started/deploying-zenml/common-questions-answered" - ] - }, - { - "group": "ZenML Pro", - "icon": 
"cloud", - "iconType": "solid", - "pages": [ - "getting-started/deploying-zenml/zenml-pro/zenml-pro", - "getting-started/deploying-zenml/zenml-pro/system-architectures", - "getting-started/deploying-zenml/zenml-pro/zenml-saas", - "getting-started/deploying-zenml/zenml-pro/user-management" - ] - }, - { - "group": "Reference", - "pages": [ - { - "group": "Migration Guide", - "icon": "arrows-spin", - "iconType": "solid", - "pages": [ - "getting-started/migration-guide/migration-guide", - "getting-started/migration-guide/migration-zero-twenty", - "getting-started/migration-guide/migration-zero-thirty", - "getting-started/migration-guide/migration-zero-forty", - "getting-started/migration-guide/migration-zero-sixty" - ] - }, - "getting-started/reference/community-and-content" - ] - }, - { - "group": "🛠️ Pipelines", - "icon": "timeline-arrow", - "iconType": "solid", - "pages": [ - { - "group": " Build a pipeline", - "icon": "link", - "iconType": "solid", - "pages": [ - "usage/pipelines/build-pipelines/build-pipelines", - "usage/pipelines/build-pipelines/use-pipeline-step-parameters", - "usage/pipelines/build-pipelines/configuring-a-pipeline-at-runtime", - "usage/pipelines/build-pipelines/step-output-typing-and-annotation", - "usage/pipelines/build-pipelines/control-caching-behavior", - "usage/pipelines/build-pipelines/schedule-a-pipeline", - "usage/pipelines/build-pipelines/compose-pipelines", - "usage/pipelines/build-pipelines/run-pipelines-asynchronously", - "usage/pipelines/build-pipelines/run-an-individual-step" - ] - }, - { - "group": " Pipeline lifecycle", - "icon": "life-ring", - "iconType": "solid", - "pages": [ - "usage/pipelines/build-pipelines/delete-a-pipeline", - "usage/pipelines/build-pipelines/fetching-pipelines", - "usage/pipelines/build-pipelines/get-past-pipeline-step-runs", - "usage/pipelines/build-pipelines/use-failure-success-hooks" - ] - }, - { - "group": " Advanced pipeline features", - "icon": "gear", - "iconType": "solid", - "pages": [ - "usage/pipelines/build-pipelines/access-secrets-in-a-step", - "usage/pipelines/build-pipelines/control-execution-order-of-steps", - "usage/pipelines/build-pipelines/hyper-parameter-tuning", - "usage/pipelines/build-pipelines/retry-steps", - "usage/pipelines/build-pipelines/using-a-custom-step-invocation-id", - "usage/pipelines/build-pipelines/name-your-pipeline-and-runs" - ] - }, - { - "group": " Trigger a pipeline", - "icon": "bullhorn", - "iconType": "solid", - "pages": [ - "usage/pipelines/trigger-pipelines/trigger-pipelines", - "usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-client", - "usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another", - "usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-rest-api" - ] - }, - { - "group": " Create and run templates", - "icon": "circle-play", - "iconType": "solid", - "pages": [ - "usage/pipelines/create-and-run-templates/create-and-run-templates", - "usage/pipelines/create-and-run-templates/create-a-run-template", - "usage/pipelines/create-and-run-templates/run-a-template" - ] - } - ] - }, - { - "group": "🧰 Project Setup & Config", - "icon": "folder-open", - "iconType": "solid", - "pages": [ - { - "group": " Set up a project repository", - "icon": "cat", - "iconType": "solid", - "pages": [ - "usage/project-setup/setting-up-a-project-repository/setting-up-a-project-repository", - "usage/project-setup/setting-up-a-project-repository/connect-your-git-repository", - "usage/project-setup/setting-up-a-project-repository/using-project-templates", - 
"usage/project-setup/setting-up-a-project-repository/best-practices" - ] - }, - { - "group": " Use configuration files", - "icon": "files", - "iconType": "solid", - "pages": [ - "usage/project-setup/use-configuration-files/use-configuration-files", - "usage/project-setup/use-configuration-files/how-to-use-config", - "usage/project-setup/use-configuration-files/what-can-be-configured", - "usage/project-setup/use-configuration-files/runtime-configuration", - "usage/project-setup/use-configuration-files/configuration-hierarchy", - "usage/project-setup/use-configuration-files/retrieve-used-configuration-of-a-run", - "usage/project-setup/use-configuration-files/autogenerate-a-template-yaml-file" - ] - }, - { - "group": " Configure Python environments", - "icon": "python", - "iconType": "solid", - "pages": [ - "usage/project-setup/configure-python-environments/configure-python-environments", - "usage/project-setup/configure-python-environments/handling-dependencies", - "usage/project-setup/configure-python-environments/configure-the-server-environment" - ] - }, - "usage/project-setup/global-settings", - { - "group": " Use secrets", - "icon": "lock", - "iconType": "solid", - "pages": [ - "usage/project-setup/use-secrets/interact-with-secrets", - "usage/project-setup/use-secrets/reference-secrets-in-stack-configuration", - "usage/project-setup/use-secrets/secret-management" - ] - }, - { - "group": " Customize Docker builds", - "icon": "docker", - "iconType": "solid", - "pages": [ - "usage/project-setup/customize-docker-builds/customize-docker-builds", - "usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline", - "usage/project-setup/customize-docker-builds/docker-settings-on-a-step", - "usage/project-setup/customize-docker-builds/specify-pip-dependencies-and-apt-packages", - "usage/project-setup/customize-docker-builds/use-your-own-docker-files", - "usage/project-setup/customize-docker-builds/which-files-are-built-into-the-image", - "usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times", - "usage/project-setup/customize-docker-builds/define-where-an-image-is-built" - ] - } - ] - }, - { - "group": "🗂️ Resource & Data Management", - "icon": "database", - "iconType": "solid", - "pages": [ - { - "group": " Train with GPUs", - "icon": "car", - "iconType": "solid", - "pages": [ - "usage/resource-data-management/training-with-gpus/training-with-gpus", - "usage/resource-data-management/training-with-gpus/accelerate-distributed-training" - ] - }, - { - "group": " Connect to a server", - "icon": "bolt", - "iconType": "solid", - "pages": [ - "usage/resource-data-management/connecting-to-zenml/connecting-to-zenml", - "usage/resource-data-management/connecting-to-zenml/connect-in-with-your-user-interactive", - "usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account" - ] - }, - { - "group": " Connect services (AWS, GCP, Azure, K8s etc)", - "icon": "bolt", - "iconType": "solid", - "pages": [ - "usage/resource-data-management/auth-management/auth-management", - "usage/resource-data-management/auth-management/service-connectors-guide", - "usage/resource-data-management/auth-management/best-security-practices", - "usage/resource-data-management/auth-management/docker-service-connector", - "usage/resource-data-management/auth-management/kubernetes-service-connector", - "usage/resource-data-management/auth-management/aws-service-connector", - "usage/resource-data-management/auth-management/gcp-service-connector", - 
"usage/resource-data-management/auth-management/azure-service-connector", - "usage/resource-data-management/auth-management/hyperai-service-connector" - ] - }, - { - "group": " Handle Data/Artifacts", - "icon": "server", - "iconType": "solid", - "pages": [ - "usage/resource-data-management/handle-data-artifacts/handle-data-artifacts", - "usage/resource-data-management/handle-data-artifacts/artifact-versioning", - "usage/resource-data-management/handle-data-artifacts/return-multiple-outputs-from-a-step", - "usage/resource-data-management/handle-data-artifacts/delete-an-artifact", - "usage/resource-data-management/handle-data-artifacts/tagging", - "usage/resource-data-management/handle-data-artifacts/get-arbitrary-artifacts-in-a-step", - "usage/resource-data-management/handle-data-artifacts/handle-custom-data-types", - "usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory", - "usage/resource-data-management/handle-data-artifacts/unmaterialized-artifacts", - "usage/resource-data-management/handle-data-artifacts/passing-artifacts-between-pipelines" - ] - }, - { - "group": " Visualizing artifacts", - "icon": "chart-simple", - "iconType": "solid", - "pages": [ - "usage/resource-data-management/visualize-artifacts/visualize-artifacts", - "usage/resource-data-management/visualize-artifacts/types-of-visualizations", - "usage/resource-data-management/visualize-artifacts/creating-custom-visualizations", - "usage/resource-data-management/visualize-artifacts/visualizations-in-dashboard", - "usage/resource-data-management/visualize-artifacts/disabling-visualizations" - ] - }, - { - "group": " Track metrics and metadata", - "icon": "chart-line", - "iconType": "solid", - "pages": [ - "usage/resource-data-management/track-metrics-metadata/track-metrics-metadata", - "usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model", - "usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact", - "usage/resource-data-management/track-metrics-metadata/attach-metadata-to-steps", - "usage/resource-data-management/track-metrics-metadata/grouping-metadata", - "usage/resource-data-management/track-metrics-metadata/logging-metadata", - "usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps", - "usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-pipeline" - ] - } - ] - }, - { - "group": "🔎 Development & Monitoring", - "icon": "chart-line", - "iconType": "solid", - "pages": [ - { - "group": " Run remote pipelines from notebooks", - "icon": "book", - "iconType": "solid", - "pages": [ - "usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-remote-steps-and-pipelines-from-notebooks", - "usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/limitations-of-defining-steps-in-notebook-cells", - "usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-a-single-step-from-a-notebook" - ] - }, - "usage/development-monitoring/debug-and-solve-issues", - { - "group": " Control logging", - "icon": "tree", - "iconType": "solid", - "pages": [ - "usage/development-monitoring/control-logging/control-logging", - "usage/development-monitoring/control-logging/view-logs-on-the-dasbhoard", - "usage/development-monitoring/control-logging/enable-or-disable-logs-storing", - "usage/development-monitoring/control-logging/set-logging-verbosity", - "usage/development-monitoring/control-logging/disable-rich-traceback", - 
"usage/development-monitoring/control-logging/disable-colorful-logging" - ] - }, - { - "group": " Use the Model Control Plane", - "icon": "plane", - "iconType": "solid", - "pages": [ - "usage/development-monitoring/use-the-model-control-plane/use-the-model-control-plane", - "usage/development-monitoring/use-the-model-control-plane/register-a-model", - "usage/development-monitoring/use-the-model-control-plane/delete-a-model", - "usage/development-monitoring/use-the-model-control-plane/associate-a-pipeline-with-a-model", - "usage/development-monitoring/use-the-model-control-plane/connecting-artifacts-via-a-model", - "usage/development-monitoring/use-the-model-control-plane/model-versions", - "usage/development-monitoring/use-the-model-control-plane/load-a-model-in-code", - "usage/development-monitoring/use-the-model-control-plane/promote-a-model", - "usage/development-monitoring/use-the-model-control-plane/linking-model-binaries-data-to-models", - "usage/development-monitoring/use-the-model-control-plane/load-artifacts-from-model" - ] - }, - { - "group": " Popular integrations", - "icon": "link-slash", - "iconType": "solid", - "pages": [ - "usage/development-monitoring/popular-integrations/popular-integrations", - "usage/development-monitoring/popular-integrations/aws-guide", - "usage/development-monitoring/popular-integrations/gcp-guide", - "usage/development-monitoring/popular-integrations/azure-guide", - "usage/development-monitoring/popular-integrations/kubeflow", - "usage/development-monitoring/popular-integrations/kubernetes", - "usage/development-monitoring/popular-integrations/mlflow", - "usage/development-monitoring/popular-integrations/skypilot" - ] - } - ] - }, - { - "group": "🧩 Stacks", - "pages": [ - "stack-components/component-guide", - "stack-components/stack-deployment/deploying-stacks", - { - "group": " Orchestrators", - "icon": "battery-full", - "iconType": "solid", - "pages": [ - "stack-components/orchestrators/orchestrators", - "stack-components/orchestrators/local", - "stack-components/orchestrators/local-docker", - "stack-components/orchestrators/kubeflow", - "stack-components/orchestrators/kubernetes", - "stack-components/orchestrators/vertex", - "stack-components/orchestrators/sagemaker", - "stack-components/orchestrators/azureml", - "stack-components/orchestrators/databricks", - "stack-components/orchestrators/tekton", - "stack-components/orchestrators/airflow", - "stack-components/orchestrators/skypilot-vm", - "stack-components/orchestrators/hyperai", - "stack-components/orchestrators/lightning", - "stack-components/orchestrators/custom" - ] - }, - { - "group": " Artifact Stores", - "icon": "store", - "iconType": "solid", - "pages": [ - "stack-components/artifact-stores/artifact-stores", - "stack-components/artifact-stores/local", - "stack-components/artifact-stores/s3", - "stack-components/artifact-stores/gcp", - "stack-components/artifact-stores/azure", - "stack-components/artifact-stores/custom" - ] - }, - { - "group": " Container Registries", - "icon": "docker", - "iconType": "solid", - "pages": [ - "stack-components/container-registries/container-registries", - "stack-components/container-registries/default", - "stack-components/container-registries/dockerhub", - "stack-components/container-registries/aws", - "stack-components/container-registries/gcp", - "stack-components/container-registries/azure", - "stack-components/container-registries/github", - "stack-components/container-registries/custom" - ] - }, - { - "group": " Data Validators", - "icon": "vial", - 
"iconType": "solid", - "pages": [ - "stack-components/data-validators/data-validators", - "stack-components/data-validators/great-expectations", - "stack-components/data-validators/deepchecks", - "stack-components/data-validators/evidently", - "stack-components/data-validators/whylogs", - "stack-components/data-validators/custom" - ] - }, - { - "group": " Experiment Trackers", - "icon": "chart-line", - "iconType": "solid", - "pages": [ - "stack-components/experiment-trackers/experiment-trackers", - "stack-components/experiment-trackers/comet", - "stack-components/experiment-trackers/mlflow", - "stack-components/experiment-trackers/neptune", - "stack-components/experiment-trackers/wandb", - "stack-components/experiment-trackers/custom" - ] - }, - { - "group": " Model Deployers", - "icon": "person-running", - "iconType": "solid", - "pages": [ - "stack-components/model-deployers/model-deployers", - "stack-components/model-deployers/mlflow", - "stack-components/model-deployers/seldon", - "stack-components/model-deployers/bentoml", - "stack-components/model-deployers/huggingface", - "stack-components/model-deployers/databricks", - "stack-components/model-deployers/custom" - ] - }, - { - "group": " Step Operators", - "icon": "stairs", - "iconType": "solid", - "pages": [ - "stack-components/step-operators/step-operators", - "stack-components/step-operators/sagemaker", - "stack-components/step-operators/vertex", - "stack-components/step-operators/azureml", - "stack-components/step-operators/kubernetes", - "stack-components/step-operators/spark-kubernetes", - "stack-components/step-operators/custom" - ] - }, - { - "group": " Alerters", - "icon": "exclamation", - "iconType": "solid", - "pages": [ - "stack-components/alerters/alerters", - "stack-components/alerters/discord", - "stack-components/alerters/slack", - "stack-components/alerters/custom" - ] - }, - { - "group": " Image Builders", - "icon": "image", - "iconType": "solid", - "pages": [ - "stack-components/image-builders/image-builders", - "stack-components/image-builders/local", - "stack-components/image-builders/kaniko", - "stack-components/image-builders/gcp", - "stack-components/image-builders/custom" - ] - }, - { - "group": " Annotators", - "icon": "tag", - "iconType": "solid", - "pages": [ - "stack-components/annotators/annotators", - "stack-components/annotators/argilla", - "stack-components/annotators/label-studio", - "stack-components/annotators/pigeon", - "stack-components/annotators/prodigy", - "stack-components/annotators/custom" - ] - }, - { - "group": " Model Registries", - "icon": "book", - "iconType": "solid", - "pages": [ - "stack-components/model-registries/model-registries", - "stack-components/model-registries/mlflow", - "stack-components/model-registries/custom" - ] - }, - { - "group": " Feature Stores", - "icon": "shop", - "iconType": "solid", - "pages": [ - "stack-components/feature-stores/feature-stores", - "stack-components/feature-stores/feast", - "stack-components/feature-stores/custom" - ] - } - ] - }, - { - "group": " Manage stacks & components", - "icon": "hammer", - "iconType": "solid", - "pages": [ - "stack-components/stack-deployment/deploy-a-cloud-stack", - "stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform", - "stack-components/stack-deployment/register-a-cloud-stack", - "stack-components/stack-deployment/deploy-a-stack-using-mlstacks" - ] - }, - { - "group": "🛠️ Custom stack components", - "icon": "hammer", - "iconType": "solid", - "pages": [ - 
"stack-components/stack-deployment/implement-a-custom-stack-component", - "stack-components/stack-deployment/implement-a-custom-integration" - ] - }, - { - "group": "📚 User Guides", - "pages": [ - { - "group": " Starter guide", - "icon": "egg", - "iconType": "solid", - "pages": [ - "user-guide/guides/starter-guide", - "user-guide/guides/starter-guide/create-an-ml-pipeline", - "user-guide/guides/starter-guide/cache-previous-executions", - "user-guide/guides/starter-guide/manage-artifacts", - "user-guide/guides/starter-guide/track-ml-models", - "user-guide/guides/starter-guide/starter-project" - ] - }, - { - "group": " Production guide", - "icon": "the-red-yeti", - "iconType": "solid", - "pages": [ - "user-guide/guides/production-guide", - "user-guide/guides/production-guide/deploying-zenml", - "user-guide/guides/production-guide/understand-stacks", - "user-guide/guides/production-guide/remote-storage", - "user-guide/guides/production-guide/cloud-orchestration", - "user-guide/guides/production-guide/configure-pipeline", - "user-guide/guides/production-guide/connect-code-repository", - "user-guide/guides/production-guide/ci-cd", - "user-guide/guides/production-guide/end-to-end" - ] - }, - { - "group": " LLMOps guide", - "icon": "earlybirds", - "iconType": "solid", - "pages": [ - "user-guide/guides/llmops-guide", - { - "group": "RAG with ZenML", - "icon": "robot", - "iconType": "solid", - "pages": [ - "user-guide/guides/llmops-guide/rag-with-zenml", - "user-guide/guides/llmops-guide/rag-with-zenml/rag-85-loc", - "user-guide/guides/llmops-guide/rag-with-zenml/understanding-rag", - "user-guide/guides/llmops-guide/rag-with-zenml/data-ingestion", - "user-guide/guides/llmops-guide/rag-with-zenml/embeddings-generation", - "user-guide/guides/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database", - "user-guide/guides/llmops-guide/rag-with-zenml/basic-rag-inference-pipeline" - ] - }, - { - "group": "Evaluation and metrics", - "icon": "chart-line", - "iconType": "solid", - "pages": [ - "user-guide/guides/llmops-guide/evaluation", - "user-guide/guides/llmops-guide/evaluation/evaluation-in-65-loc", - "user-guide/guides/llmops-guide/evaluation/retrieval", - "user-guide/guides/llmops-guide/evaluation/generation", - "user-guide/guides/llmops-guide/evaluation/evaluation-in-practice" - ] - }, - { - "group": "Reranking for better retrieval", - "icon": "ranking-star", - "iconType": "solid", - "pages": [ - "user-guide/guides/llmops-guide/reranking", - "user-guide/guides/llmops-guide/reranking/understanding-reranking", - "user-guide/guides/llmops-guide/reranking/implementing-reranking", - "user-guide/guides/llmops-guide/reranking/evaluating-reranking-performance" - ] - }, - { - "group": "Improve retrieval by finetuning embeddings", - "icon": "bullseye-arrow", - "iconType": "solid", - "pages": [ - "user-guide/guides/llmops-guide/finetuning-embeddings", - "user-guide/guides/llmops-guide/finetuning-embeddings/synthetic-data-generation", - "user-guide/guides/llmops-guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers", - "user-guide/guides/llmops-guide/finetuning-embeddings/evaluating-finetuned-embeddings" - ] - }, - "user-guide/guides/llmops-guide/finetuning-llms" - ] - } - ] - }, - { - "group": "🧑‍💻 Examples", - "pages": [ - "user-guide/examples/quickstart", - "user-guide/examples/end-to-end-batch", - "user-guide/examples/basic-nlp-with-bert", - "user-guide/examples/computer-vision-with-yolov8", - "user-guide/examples/llm-finetuning", - "user-guide/examples/more-projects" - ] - 
}, - { - "group": "📚 Code Docs", - "pages": [ - "sdk/python-client", - { - "group": "Integration Docs", - "pages": [ - { - "group": "v1", - "pages": [ - "sdk/integration-docs/v1/integrations-sklearn", - "sdk/integration-docs/v1/integrations-kubernetes", - "sdk/integration-docs/v1/integrations-great_expectations", - "sdk/integration-docs/v1/integrations-comet", - "sdk/integration-docs/v1/integrations-pandas", - "sdk/integration-docs/v1/integrations-bitbucket", - "sdk/integration-docs/v1/integrations-pillow", - "sdk/integration-docs/v1/integrations-whylogs", - "sdk/integration-docs/v1/integrations-github", - "sdk/integration-docs/v1/integrations-facets", - "sdk/integration-docs/v1/integrations-utils", - "sdk/integration-docs/v1/integrations-databricks", - "sdk/integration-docs/v1/integrations-pycaret", - "sdk/integration-docs/v1/integrations-gitlab", - "sdk/integration-docs/v1/integrations-label_studio", - "sdk/integration-docs/v1/integrations-evidently", - "sdk/integration-docs/v1/integrations-tensorflow", - "sdk/integration-docs/v1/integrations-numpy", - "sdk/integration-docs/v1/integrations-neptune", - "sdk/integration-docs/v1/integrations-registry", - "sdk/integration-docs/v1/integrations-mlflow", - "sdk/integration-docs/v1/integrations-pytorch", - "sdk/integration-docs/v1/integrations-skypilot_azure", - "sdk/integration-docs/v1/integrations-lightning", - "sdk/integration-docs/v1/integrations-skypilot", - "sdk/integration-docs/v1/integrations-tensorboard", - "sdk/integration-docs/v1/integrations-pytorch_lightning", - "sdk/integration-docs/v1/integrations-constants", - "sdk/integration-docs/v1/integrations-feast", - "sdk/integration-docs/v1/integrations-kaniko", - "sdk/integration-docs/v1/integrations-lightgbm", - "sdk/integration-docs/v1/integrations-kubeflow", - "sdk/integration-docs/v1/integrations-pigeon", - "sdk/integration-docs/v1/integrations-argilla", - "sdk/integration-docs/v1/integrations-scipy", - "sdk/integration-docs/v1/integrations-llama_index", - "sdk/integration-docs/v1/integrations-tekton", - "sdk/integration-docs/v1/integrations-skypilot_lambda", - "sdk/integration-docs/v1/integrations-aws", - "sdk/integration-docs/v1/integrations-airflow", - "sdk/integration-docs/v1/integrations-xgboost", - "sdk/integration-docs/v1/integrations-integration", - "sdk/integration-docs/v1/integrations-gcp", - "sdk/integration-docs/v1/integrations-bentoml", - "sdk/integration-docs/v1/integrations-skypilot_aws", - "sdk/integration-docs/v1/integrations-prodigy", - "sdk/integration-docs/v1/integrations-s3", - "sdk/integration-docs/v1/integrations-hyperai", - "sdk/integration-docs/v1/integrations-langchain", - "sdk/integration-docs/v1/integrations-neural_prophet", - "sdk/integration-docs/v1/integrations-seldon", - "sdk/integration-docs/v1/integrations-polars", - "sdk/integration-docs/v1/integrations-skypilot_gcp", - "sdk/integration-docs/v1/integrations-huggingface", - "sdk/integration-docs/v1/integrations-spark", - "sdk/integration-docs/v1/integrations-wandb", - "sdk/integration-docs/v1/integrations-azure", - "sdk/integration-docs/v1/integrations-discord", - "sdk/integration-docs/v1/integrations-openai", - "sdk/integration-docs/v1/integrations-deepchecks", - "sdk/integration-docs/v1/integrations-slack" - ] - } - ] - }, - { - "group": "Core Code Docs", - "pages": [ - { - "group": "v1", - "pages": [ - "sdk/core-sdk/v1/core-hooks", - "sdk/core-sdk/v1/core-annotators", - "sdk/core-sdk/v1/core-zen_stores", - "sdk/core-sdk/v1/core-constants", - "sdk/core-sdk/v1/core-plugins", - 
"sdk/core-sdk/v1/core-experiment_trackers", - "sdk/core-sdk/v1/core-types", - "sdk/core-sdk/v1/core-logger", - "sdk/core-sdk/v1/core-logging", - "sdk/core-sdk/v1/core-entrypoints", - "sdk/core-sdk/v1/core-environment", - "sdk/core-sdk/v1/core-enums", - "sdk/core-sdk/v1/core-lineage_graph", - "sdk/core-sdk/v1/core-assistant", - "sdk/core-sdk/v1/core-code_repositories", - "sdk/core-sdk/v1/core-analytics", - "sdk/core-sdk/v1/core-exceptions", - "sdk/core-sdk/v1/core-artifacts", - "sdk/core-sdk/v1/core-client_lazy_loader", - "sdk/core-sdk/v1/core-event_hub", - "sdk/core-sdk/v1/core-feature_stores", - "sdk/core-sdk/v1/core-new", - "sdk/core-sdk/v1/core-model", - "sdk/core-sdk/v1/core-io", - "sdk/core-sdk/v1/core-utils", - "sdk/core-sdk/v1/core-services", - "sdk/core-sdk/v1/core-_hub", - "sdk/core-sdk/v1/core-metadata", - "sdk/core-sdk/v1/core-secret", - "sdk/core-sdk/v1/core-console", - "sdk/core-sdk/v1/core-api", - "sdk/core-sdk/v1/core-service_connectors", - "sdk/core-sdk/v1/core-data_validators", - "sdk/core-sdk/v1/core-actions", - "sdk/core-sdk/v1/core-orchestrators", - "sdk/core-sdk/v1/core-steps", - "sdk/core-sdk/v1/core-image_builders", - "sdk/core-sdk/v1/core-models", - "sdk/core-sdk/v1/core-model_registries", - "sdk/core-sdk/v1/core-artifact_stores", - "sdk/core-sdk/v1/core-post_execution", - "sdk/core-sdk/v1/core-alerter", - "sdk/core-sdk/v1/core-pipelines", - "sdk/core-sdk/v1/core-stack_deployments", - "sdk/core-sdk/v1/core-stack", - "sdk/core-sdk/v1/core-materializers", - "sdk/core-sdk/v1/core-model_deployers", - "sdk/core-sdk/v1/core-event_sources", - "sdk/core-sdk/v1/core-config", - "sdk/core-sdk/v1/core-client", - "sdk/core-sdk/v1/core-zen_server", - "sdk/core-sdk/v1/core-step_operators", - "sdk/core-sdk/v1/core-container_registries" - ] - } - ] - } - ] - }, - { - "group": "📘 API Documentation", - "pages": ["api/api-reference"] - } - ], - "footerSocials": { - "x": "https://x.com/zenml_io", - "github": "https://github.com/zenml_io", - "linkedin": "https://www.linkedin.com/company/zenml", - "slack": "https://zenml.io/slack-invite", - "youtube": "https://www.youtube.com/@ZenML" - }, - "search": { - "prompt": "Chat with the docs..." 
- }, - "feedback": { - "thumbsRating": true - } -} diff --git a/develop/sdk/core-sdk/v1/core-_hub.md b/develop/sdk/core-sdk/v1/core-_hub.md deleted file mode 100644 index f0a9a47..0000000 --- a/develop/sdk/core-sdk/v1/core-_hub.md +++ /dev/null @@ -1,7 +0,0 @@ -# Hub - -::: zenml._hub - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-actions.md b/develop/sdk/core-sdk/v1/core-actions.md deleted file mode 100644 index c019816..0000000 --- a/develop/sdk/core-sdk/v1/core-actions.md +++ /dev/null @@ -1,7 +0,0 @@ -# Actions - -::: zenml.actions - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-alerter.md b/develop/sdk/core-sdk/v1/core-alerter.md deleted file mode 100644 index 8eb9146..0000000 --- a/develop/sdk/core-sdk/v1/core-alerter.md +++ /dev/null @@ -1,7 +0,0 @@ -# Alerter - -::: zenml.alerter - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-analytics.md b/develop/sdk/core-sdk/v1/core-analytics.md deleted file mode 100644 index 454c3de..0000000 --- a/develop/sdk/core-sdk/v1/core-analytics.md +++ /dev/null @@ -1,7 +0,0 @@ -# Analytics - -::: zenml.analytics - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-annotators.md b/develop/sdk/core-sdk/v1/core-annotators.md deleted file mode 100644 index 54f462c..0000000 --- a/develop/sdk/core-sdk/v1/core-annotators.md +++ /dev/null @@ -1,7 +0,0 @@ -# Annotators - -::: zenml.annotators - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-api.md b/develop/sdk/core-sdk/v1/core-api.md deleted file mode 100644 index 5b18400..0000000 --- a/develop/sdk/core-sdk/v1/core-api.md +++ /dev/null @@ -1,7 +0,0 @@ -# Api - -::: zenml.api - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-artifact_stores.md b/develop/sdk/core-sdk/v1/core-artifact_stores.md deleted file mode 100644 index 46d000e..0000000 --- a/develop/sdk/core-sdk/v1/core-artifact_stores.md +++ /dev/null @@ -1,7 +0,0 @@ -# Artifact Stores - -::: zenml.artifact_stores - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-artifacts.md b/develop/sdk/core-sdk/v1/core-artifacts.md deleted file mode 100644 index d9ca0d9..0000000 --- a/develop/sdk/core-sdk/v1/core-artifacts.md +++ /dev/null @@ -1,7 +0,0 @@ -# Artifacts - -::: zenml.artifacts - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-assistant.md b/develop/sdk/core-sdk/v1/core-assistant.md deleted file mode 100644 index abd998a..0000000 --- a/develop/sdk/core-sdk/v1/core-assistant.md +++ /dev/null @@ -1,7 +0,0 @@ -# Assistant - -::: zenml.assistant - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-client.md b/develop/sdk/core-sdk/v1/core-client.md deleted file mode 100644 index c618af8..0000000 --- a/develop/sdk/core-sdk/v1/core-client.md +++ /dev/null @@ -1,7 +0,0 @@ -# Client - -::: zenml.client - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-client_lazy_loader.md b/develop/sdk/core-sdk/v1/core-client_lazy_loader.md deleted file mode 100644 index c100f9b..0000000 --- 
a/develop/sdk/core-sdk/v1/core-client_lazy_loader.md +++ /dev/null @@ -1,7 +0,0 @@ -# Client Lazy Loader - -::: zenml.client_lazy_loader - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-code_repositories.md b/develop/sdk/core-sdk/v1/core-code_repositories.md deleted file mode 100644 index 7a19e8a..0000000 --- a/develop/sdk/core-sdk/v1/core-code_repositories.md +++ /dev/null @@ -1,7 +0,0 @@ -# Code Repositories - -::: zenml.code_repositories - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-config.md b/develop/sdk/core-sdk/v1/core-config.md deleted file mode 100644 index d96259b..0000000 --- a/develop/sdk/core-sdk/v1/core-config.md +++ /dev/null @@ -1,7 +0,0 @@ -# Config - -::: zenml.config - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-console.md b/develop/sdk/core-sdk/v1/core-console.md deleted file mode 100644 index 4232c8f..0000000 --- a/develop/sdk/core-sdk/v1/core-console.md +++ /dev/null @@ -1,7 +0,0 @@ -# Console - -::: zenml.console - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-constants.md b/develop/sdk/core-sdk/v1/core-constants.md deleted file mode 100644 index ce9657b..0000000 --- a/develop/sdk/core-sdk/v1/core-constants.md +++ /dev/null @@ -1,7 +0,0 @@ -# Constants - -::: zenml.constants - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-container_registries.md b/develop/sdk/core-sdk/v1/core-container_registries.md deleted file mode 100644 index c116294..0000000 --- a/develop/sdk/core-sdk/v1/core-container_registries.md +++ /dev/null @@ -1,7 +0,0 @@ -# Container Registries - -::: zenml.container_registries - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-data_validators.md b/develop/sdk/core-sdk/v1/core-data_validators.md deleted file mode 100644 index a6dcff5..0000000 --- a/develop/sdk/core-sdk/v1/core-data_validators.md +++ /dev/null @@ -1,7 +0,0 @@ -# Data Validators - -::: zenml.data_validators - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-entrypoints.md b/develop/sdk/core-sdk/v1/core-entrypoints.md deleted file mode 100644 index 741aeb2..0000000 --- a/develop/sdk/core-sdk/v1/core-entrypoints.md +++ /dev/null @@ -1,7 +0,0 @@ -# Entrypoints - -::: zenml.entrypoints - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-enums.md b/develop/sdk/core-sdk/v1/core-enums.md deleted file mode 100644 index 87dc007..0000000 --- a/develop/sdk/core-sdk/v1/core-enums.md +++ /dev/null @@ -1,7 +0,0 @@ -# Enums - -::: zenml.enums - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-environment.md b/develop/sdk/core-sdk/v1/core-environment.md deleted file mode 100644 index baef7e5..0000000 --- a/develop/sdk/core-sdk/v1/core-environment.md +++ /dev/null @@ -1,7 +0,0 @@ -# Environment - -::: zenml.environment - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-event_hub.md b/develop/sdk/core-sdk/v1/core-event_hub.md deleted file mode 100644 index 0a4ca12..0000000 --- a/develop/sdk/core-sdk/v1/core-event_hub.md +++ 
/dev/null @@ -1,7 +0,0 @@ -# Event Hub - -::: zenml.event_hub - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-event_sources.md b/develop/sdk/core-sdk/v1/core-event_sources.md deleted file mode 100644 index ba31ac9..0000000 --- a/develop/sdk/core-sdk/v1/core-event_sources.md +++ /dev/null @@ -1,7 +0,0 @@ -# Event Sources - -::: zenml.event_sources - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-exceptions.md b/develop/sdk/core-sdk/v1/core-exceptions.md deleted file mode 100644 index f817b21..0000000 --- a/develop/sdk/core-sdk/v1/core-exceptions.md +++ /dev/null @@ -1,7 +0,0 @@ -# Exceptions - -::: zenml.exceptions - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-experiment_trackers.md b/develop/sdk/core-sdk/v1/core-experiment_trackers.md deleted file mode 100644 index 040b220..0000000 --- a/develop/sdk/core-sdk/v1/core-experiment_trackers.md +++ /dev/null @@ -1,7 +0,0 @@ -# Experiment Trackers - -::: zenml.experiment_trackers - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-feature_stores.md b/develop/sdk/core-sdk/v1/core-feature_stores.md deleted file mode 100644 index 8786f2f..0000000 --- a/develop/sdk/core-sdk/v1/core-feature_stores.md +++ /dev/null @@ -1,7 +0,0 @@ -# Feature Stores - -::: zenml.feature_stores - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-hooks.md b/develop/sdk/core-sdk/v1/core-hooks.md deleted file mode 100644 index 2367fdf..0000000 --- a/develop/sdk/core-sdk/v1/core-hooks.md +++ /dev/null @@ -1,3 +0,0 @@ -# zenml package - -## Subpackages \ No newline at end of file diff --git a/develop/sdk/core-sdk/v1/core-image_builders.md b/develop/sdk/core-sdk/v1/core-image_builders.md deleted file mode 100644 index 236d42e..0000000 --- a/develop/sdk/core-sdk/v1/core-image_builders.md +++ /dev/null @@ -1,7 +0,0 @@ -# Image Builders - -::: zenml.image_builders - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-io.md b/develop/sdk/core-sdk/v1/core-io.md deleted file mode 100644 index 156a111..0000000 --- a/develop/sdk/core-sdk/v1/core-io.md +++ /dev/null @@ -1,7 +0,0 @@ -# Io - -::: zenml.io - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-lineage_graph.md b/develop/sdk/core-sdk/v1/core-lineage_graph.md deleted file mode 100644 index 4ffc838..0000000 --- a/develop/sdk/core-sdk/v1/core-lineage_graph.md +++ /dev/null @@ -1,7 +0,0 @@ -# Lineage Graph - -::: zenml.lineage_graph - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-logger.md b/develop/sdk/core-sdk/v1/core-logger.md deleted file mode 100644 index 7c9eb30..0000000 --- a/develop/sdk/core-sdk/v1/core-logger.md +++ /dev/null @@ -1,7 +0,0 @@ -# Logger - -::: zenml.logger - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-logging.md b/develop/sdk/core-sdk/v1/core-logging.md deleted file mode 100644 index dab4510..0000000 --- a/develop/sdk/core-sdk/v1/core-logging.md +++ /dev/null @@ -1,7 +0,0 @@ -# Logging - -::: zenml.logging - handler: python - rendering: - show_root_heading: true - show_source: true diff --git 
a/develop/sdk/core-sdk/v1/core-materializers.md b/develop/sdk/core-sdk/v1/core-materializers.md deleted file mode 100644 index e6a3097..0000000 --- a/develop/sdk/core-sdk/v1/core-materializers.md +++ /dev/null @@ -1,7 +0,0 @@ -# Materializers - -::: zenml.materializers - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-metadata.md b/develop/sdk/core-sdk/v1/core-metadata.md deleted file mode 100644 index 2bb9198..0000000 --- a/develop/sdk/core-sdk/v1/core-metadata.md +++ /dev/null @@ -1,7 +0,0 @@ -# Metadata - -::: zenml.metadata - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-model.md b/develop/sdk/core-sdk/v1/core-model.md deleted file mode 100644 index 81baf7b..0000000 --- a/develop/sdk/core-sdk/v1/core-model.md +++ /dev/null @@ -1,7 +0,0 @@ -# Model - -::: zenml.model - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-model_deployers.md b/develop/sdk/core-sdk/v1/core-model_deployers.md deleted file mode 100644 index 88f2c2e..0000000 --- a/develop/sdk/core-sdk/v1/core-model_deployers.md +++ /dev/null @@ -1,7 +0,0 @@ -# Model Deployers - -::: zenml.model_deployers - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-model_registries.md b/develop/sdk/core-sdk/v1/core-model_registries.md deleted file mode 100644 index 4ecc018..0000000 --- a/develop/sdk/core-sdk/v1/core-model_registries.md +++ /dev/null @@ -1,7 +0,0 @@ -# Model Registries - -::: zenml.model_registries - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-models.md b/develop/sdk/core-sdk/v1/core-models.md deleted file mode 100644 index f337595..0000000 --- a/develop/sdk/core-sdk/v1/core-models.md +++ /dev/null @@ -1,7 +0,0 @@ -# Models - -::: zenml.models - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-new.md b/develop/sdk/core-sdk/v1/core-new.md deleted file mode 100644 index ad2e065..0000000 --- a/develop/sdk/core-sdk/v1/core-new.md +++ /dev/null @@ -1,7 +0,0 @@ -# New - -::: zenml.new - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-orchestrators.md b/develop/sdk/core-sdk/v1/core-orchestrators.md deleted file mode 100644 index e2042f1..0000000 --- a/develop/sdk/core-sdk/v1/core-orchestrators.md +++ /dev/null @@ -1,7 +0,0 @@ -# Orchestrators - -::: zenml.orchestrators - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-pipelines.md b/develop/sdk/core-sdk/v1/core-pipelines.md deleted file mode 100644 index c4ae9ad..0000000 --- a/develop/sdk/core-sdk/v1/core-pipelines.md +++ /dev/null @@ -1,7 +0,0 @@ -# Pipelines - -::: zenml.pipelines - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-plugins.md b/develop/sdk/core-sdk/v1/core-plugins.md deleted file mode 100644 index 06edce4..0000000 --- a/develop/sdk/core-sdk/v1/core-plugins.md +++ /dev/null @@ -1,7 +0,0 @@ -# Plugins - -::: zenml.plugins - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-post_execution.md b/develop/sdk/core-sdk/v1/core-post_execution.md deleted file mode 100644 index 788963f..0000000 --- 
a/develop/sdk/core-sdk/v1/core-post_execution.md +++ /dev/null @@ -1,7 +0,0 @@ -# Post Execution - -::: zenml.post_execution - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-secret.md b/develop/sdk/core-sdk/v1/core-secret.md deleted file mode 100644 index 2ac7981..0000000 --- a/develop/sdk/core-sdk/v1/core-secret.md +++ /dev/null @@ -1,7 +0,0 @@ -# Secret - -::: zenml.secret - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-service_connectors.md b/develop/sdk/core-sdk/v1/core-service_connectors.md deleted file mode 100644 index c4c861a..0000000 --- a/develop/sdk/core-sdk/v1/core-service_connectors.md +++ /dev/null @@ -1,7 +0,0 @@ -# Service Connectors - -::: zenml.service_connectors - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-services.md b/develop/sdk/core-sdk/v1/core-services.md deleted file mode 100644 index 6773c5e..0000000 --- a/develop/sdk/core-sdk/v1/core-services.md +++ /dev/null @@ -1,7 +0,0 @@ -# Services - -::: zenml.services - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-stack.md b/develop/sdk/core-sdk/v1/core-stack.md deleted file mode 100644 index 5bbad25..0000000 --- a/develop/sdk/core-sdk/v1/core-stack.md +++ /dev/null @@ -1,7 +0,0 @@ -# Stack - -::: zenml.stack - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-stack_deployments.md b/develop/sdk/core-sdk/v1/core-stack_deployments.md deleted file mode 100644 index a15b1f1..0000000 --- a/develop/sdk/core-sdk/v1/core-stack_deployments.md +++ /dev/null @@ -1,7 +0,0 @@ -# Stack Deployments - -::: zenml.stack_deployments - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-step_operators.md b/develop/sdk/core-sdk/v1/core-step_operators.md deleted file mode 100644 index 0513014..0000000 --- a/develop/sdk/core-sdk/v1/core-step_operators.md +++ /dev/null @@ -1,7 +0,0 @@ -# Step Operators - -::: zenml.step_operators - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-steps.md b/develop/sdk/core-sdk/v1/core-steps.md deleted file mode 100644 index 782190a..0000000 --- a/develop/sdk/core-sdk/v1/core-steps.md +++ /dev/null @@ -1,7 +0,0 @@ -# Steps - -::: zenml.steps - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-types.md b/develop/sdk/core-sdk/v1/core-types.md deleted file mode 100644 index cb6fa7f..0000000 --- a/develop/sdk/core-sdk/v1/core-types.md +++ /dev/null @@ -1,7 +0,0 @@ -# Types - -::: zenml.types - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-utils.md b/develop/sdk/core-sdk/v1/core-utils.md deleted file mode 100644 index 290307f..0000000 --- a/develop/sdk/core-sdk/v1/core-utils.md +++ /dev/null @@ -1,7 +0,0 @@ -# Utils - -::: zenml.utils - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-zen_server.md b/develop/sdk/core-sdk/v1/core-zen_server.md deleted file mode 100644 index d16aef9..0000000 --- a/develop/sdk/core-sdk/v1/core-zen_server.md +++ /dev/null @@ -1,7 +0,0 @@ -# Zen Server - -::: zenml.zen_server - handler: python - rendering: - 
show_root_heading: true - show_source: true diff --git a/develop/sdk/core-sdk/v1/core-zen_stores.md b/develop/sdk/core-sdk/v1/core-zen_stores.md deleted file mode 100644 index 09db142..0000000 --- a/develop/sdk/core-sdk/v1/core-zen_stores.md +++ /dev/null @@ -1,7 +0,0 @@ -# Zen Stores - -::: zenml.zen_stores - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-airflow.md b/develop/sdk/integration-docs/v1/integrations-airflow.md deleted file mode 100644 index 7ae2108..0000000 --- a/develop/sdk/integration-docs/v1/integrations-airflow.md +++ /dev/null @@ -1,7 +0,0 @@ -# Airflow - -::: zenml.integrations.airflow - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-argilla.md b/develop/sdk/integration-docs/v1/integrations-argilla.md deleted file mode 100644 index 93a3c3a..0000000 --- a/develop/sdk/integration-docs/v1/integrations-argilla.md +++ /dev/null @@ -1,7 +0,0 @@ -# Argilla - -::: zenml.integrations.argilla - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-aws.md b/develop/sdk/integration-docs/v1/integrations-aws.md deleted file mode 100644 index 3c8ba48..0000000 --- a/develop/sdk/integration-docs/v1/integrations-aws.md +++ /dev/null @@ -1,7 +0,0 @@ -# Aws - -::: zenml.integrations.aws - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-azure.md b/develop/sdk/integration-docs/v1/integrations-azure.md deleted file mode 100644 index 3a5c6a8..0000000 --- a/develop/sdk/integration-docs/v1/integrations-azure.md +++ /dev/null @@ -1,7 +0,0 @@ -# Azure - -::: zenml.integrations.azure - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-bentoml.md b/develop/sdk/integration-docs/v1/integrations-bentoml.md deleted file mode 100644 index ef91f73..0000000 --- a/develop/sdk/integration-docs/v1/integrations-bentoml.md +++ /dev/null @@ -1,7 +0,0 @@ -# Bentoml - -::: zenml.integrations.bentoml - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-bitbucket.md b/develop/sdk/integration-docs/v1/integrations-bitbucket.md deleted file mode 100644 index dec6980..0000000 --- a/develop/sdk/integration-docs/v1/integrations-bitbucket.md +++ /dev/null @@ -1,7 +0,0 @@ -# Bitbucket - -::: zenml.integrations.bitbucket - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-comet.md b/develop/sdk/integration-docs/v1/integrations-comet.md deleted file mode 100644 index aeb6f06..0000000 --- a/develop/sdk/integration-docs/v1/integrations-comet.md +++ /dev/null @@ -1,7 +0,0 @@ -# Comet - -::: zenml.integrations.comet - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-constants.md b/develop/sdk/integration-docs/v1/integrations-constants.md deleted file mode 100644 index 667eeee..0000000 --- a/develop/sdk/integration-docs/v1/integrations-constants.md +++ /dev/null @@ -1,7 +0,0 @@ -# Constants - -::: zenml.integrations.constants - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-databricks.md 
b/develop/sdk/integration-docs/v1/integrations-databricks.md deleted file mode 100644 index b81cbee..0000000 --- a/develop/sdk/integration-docs/v1/integrations-databricks.md +++ /dev/null @@ -1,7 +0,0 @@ -# Databricks - -::: zenml.integrations.databricks - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-deepchecks.md b/develop/sdk/integration-docs/v1/integrations-deepchecks.md deleted file mode 100644 index 03071fd..0000000 --- a/develop/sdk/integration-docs/v1/integrations-deepchecks.md +++ /dev/null @@ -1,7 +0,0 @@ -# Deepchecks - -::: zenml.integrations.deepchecks - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-discord.md b/develop/sdk/integration-docs/v1/integrations-discord.md deleted file mode 100644 index 3fbe081..0000000 --- a/develop/sdk/integration-docs/v1/integrations-discord.md +++ /dev/null @@ -1,7 +0,0 @@ -# Discord - -::: zenml.integrations.discord - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-evidently.md b/develop/sdk/integration-docs/v1/integrations-evidently.md deleted file mode 100644 index 3bb2efa..0000000 --- a/develop/sdk/integration-docs/v1/integrations-evidently.md +++ /dev/null @@ -1,7 +0,0 @@ -# Evidently - -::: zenml.integrations.evidently - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-facets.md b/develop/sdk/integration-docs/v1/integrations-facets.md deleted file mode 100644 index 5f3906e..0000000 --- a/develop/sdk/integration-docs/v1/integrations-facets.md +++ /dev/null @@ -1,7 +0,0 @@ -# Facets - -::: zenml.integrations.facets - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-feast.md b/develop/sdk/integration-docs/v1/integrations-feast.md deleted file mode 100644 index 00bdec3..0000000 --- a/develop/sdk/integration-docs/v1/integrations-feast.md +++ /dev/null @@ -1,7 +0,0 @@ -# Feast - -::: zenml.integrations.feast - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-gcp.md b/develop/sdk/integration-docs/v1/integrations-gcp.md deleted file mode 100644 index 8c17ecb..0000000 --- a/develop/sdk/integration-docs/v1/integrations-gcp.md +++ /dev/null @@ -1,7 +0,0 @@ -# Gcp - -::: zenml.integrations.gcp - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-github.md b/develop/sdk/integration-docs/v1/integrations-github.md deleted file mode 100644 index 6489297..0000000 --- a/develop/sdk/integration-docs/v1/integrations-github.md +++ /dev/null @@ -1,7 +0,0 @@ -# Github - -::: zenml.integrations.github - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-gitlab.md b/develop/sdk/integration-docs/v1/integrations-gitlab.md deleted file mode 100644 index 39985d6..0000000 --- a/develop/sdk/integration-docs/v1/integrations-gitlab.md +++ /dev/null @@ -1,7 +0,0 @@ -# Gitlab - -::: zenml.integrations.gitlab - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-great_expectations.md 
b/develop/sdk/integration-docs/v1/integrations-great_expectations.md deleted file mode 100644 index b549645..0000000 --- a/develop/sdk/integration-docs/v1/integrations-great_expectations.md +++ /dev/null @@ -1,7 +0,0 @@ -# Great Expectations - -::: zenml.integrations.great_expectations - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-huggingface.md b/develop/sdk/integration-docs/v1/integrations-huggingface.md deleted file mode 100644 index 67a9a5c..0000000 --- a/develop/sdk/integration-docs/v1/integrations-huggingface.md +++ /dev/null @@ -1,7 +0,0 @@ -# Huggingface - -::: zenml.integrations.huggingface - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-hyperai.md b/develop/sdk/integration-docs/v1/integrations-hyperai.md deleted file mode 100644 index a8ba14d..0000000 --- a/develop/sdk/integration-docs/v1/integrations-hyperai.md +++ /dev/null @@ -1,7 +0,0 @@ -# Hyperai - -::: zenml.integrations.hyperai - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-integration.md b/develop/sdk/integration-docs/v1/integrations-integration.md deleted file mode 100644 index ecb630b..0000000 --- a/develop/sdk/integration-docs/v1/integrations-integration.md +++ /dev/null @@ -1,7 +0,0 @@ -# Integration - -::: zenml.integrations.integration - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-kaniko.md b/develop/sdk/integration-docs/v1/integrations-kaniko.md deleted file mode 100644 index 3e0b67f..0000000 --- a/develop/sdk/integration-docs/v1/integrations-kaniko.md +++ /dev/null @@ -1,7 +0,0 @@ -# Kaniko - -::: zenml.integrations.kaniko - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-kubeflow.md b/develop/sdk/integration-docs/v1/integrations-kubeflow.md deleted file mode 100644 index ec0c992..0000000 --- a/develop/sdk/integration-docs/v1/integrations-kubeflow.md +++ /dev/null @@ -1,7 +0,0 @@ -# Kubeflow - -::: zenml.integrations.kubeflow - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-kubernetes.md b/develop/sdk/integration-docs/v1/integrations-kubernetes.md deleted file mode 100644 index 8901d3d..0000000 --- a/develop/sdk/integration-docs/v1/integrations-kubernetes.md +++ /dev/null @@ -1,7 +0,0 @@ -# Kubernetes - -::: zenml.integrations.kubernetes - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-label_studio.md b/develop/sdk/integration-docs/v1/integrations-label_studio.md deleted file mode 100644 index 1dc37f1..0000000 --- a/develop/sdk/integration-docs/v1/integrations-label_studio.md +++ /dev/null @@ -1,7 +0,0 @@ -# Label Studio - -::: zenml.integrations.label_studio - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-langchain.md b/develop/sdk/integration-docs/v1/integrations-langchain.md deleted file mode 100644 index c42b0be..0000000 --- a/develop/sdk/integration-docs/v1/integrations-langchain.md +++ /dev/null @@ -1,7 +0,0 @@ -# Langchain - -::: zenml.integrations.langchain - handler: python - rendering: - show_root_heading: true - 
show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-lightgbm.md b/develop/sdk/integration-docs/v1/integrations-lightgbm.md deleted file mode 100644 index f70707a..0000000 --- a/develop/sdk/integration-docs/v1/integrations-lightgbm.md +++ /dev/null @@ -1,7 +0,0 @@ -# Lightgbm - -::: zenml.integrations.lightgbm - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-lightning.md b/develop/sdk/integration-docs/v1/integrations-lightning.md deleted file mode 100644 index f2daf89..0000000 --- a/develop/sdk/integration-docs/v1/integrations-lightning.md +++ /dev/null @@ -1,7 +0,0 @@ -# Lightning - -::: zenml.integrations.lightning - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-llama_index.md b/develop/sdk/integration-docs/v1/integrations-llama_index.md deleted file mode 100644 index f6c72b9..0000000 --- a/develop/sdk/integration-docs/v1/integrations-llama_index.md +++ /dev/null @@ -1,7 +0,0 @@ -# Llama Index - -::: zenml.integrations.llama_index - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-mlflow.md b/develop/sdk/integration-docs/v1/integrations-mlflow.md deleted file mode 100644 index f95ea92..0000000 --- a/develop/sdk/integration-docs/v1/integrations-mlflow.md +++ /dev/null @@ -1,7 +0,0 @@ -# Mlflow - -::: zenml.integrations.mlflow - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-neptune.md b/develop/sdk/integration-docs/v1/integrations-neptune.md deleted file mode 100644 index 75b1661..0000000 --- a/develop/sdk/integration-docs/v1/integrations-neptune.md +++ /dev/null @@ -1,7 +0,0 @@ -# Neptune - -::: zenml.integrations.neptune - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-neural_prophet.md b/develop/sdk/integration-docs/v1/integrations-neural_prophet.md deleted file mode 100644 index c3ab05d..0000000 --- a/develop/sdk/integration-docs/v1/integrations-neural_prophet.md +++ /dev/null @@ -1,7 +0,0 @@ -# Neural Prophet - -::: zenml.integrations.neural_prophet - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-numpy.md b/develop/sdk/integration-docs/v1/integrations-numpy.md deleted file mode 100644 index f414e98..0000000 --- a/develop/sdk/integration-docs/v1/integrations-numpy.md +++ /dev/null @@ -1,7 +0,0 @@ -# Numpy - -::: zenml.integrations.numpy - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-openai.md b/develop/sdk/integration-docs/v1/integrations-openai.md deleted file mode 100644 index de2924d..0000000 --- a/develop/sdk/integration-docs/v1/integrations-openai.md +++ /dev/null @@ -1,7 +0,0 @@ -# Openai - -::: zenml.integrations.openai - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-pandas.md b/develop/sdk/integration-docs/v1/integrations-pandas.md deleted file mode 100644 index 7adbb36..0000000 --- a/develop/sdk/integration-docs/v1/integrations-pandas.md +++ /dev/null @@ -1,7 +0,0 @@ -# Pandas - -::: zenml.integrations.pandas - handler: python - rendering: - show_root_heading: true - 
show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-pigeon.md b/develop/sdk/integration-docs/v1/integrations-pigeon.md deleted file mode 100644 index a904e62..0000000 --- a/develop/sdk/integration-docs/v1/integrations-pigeon.md +++ /dev/null @@ -1,7 +0,0 @@ -# Pigeon - -::: zenml.integrations.pigeon - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-pillow.md b/develop/sdk/integration-docs/v1/integrations-pillow.md deleted file mode 100644 index 2374f26..0000000 --- a/develop/sdk/integration-docs/v1/integrations-pillow.md +++ /dev/null @@ -1,7 +0,0 @@ -# Pillow - -::: zenml.integrations.pillow - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-polars.md b/develop/sdk/integration-docs/v1/integrations-polars.md deleted file mode 100644 index 4c687e9..0000000 --- a/develop/sdk/integration-docs/v1/integrations-polars.md +++ /dev/null @@ -1,7 +0,0 @@ -# Polars - -::: zenml.integrations.polars - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-prodigy.md b/develop/sdk/integration-docs/v1/integrations-prodigy.md deleted file mode 100644 index 56be04d..0000000 --- a/develop/sdk/integration-docs/v1/integrations-prodigy.md +++ /dev/null @@ -1,7 +0,0 @@ -# Prodigy - -::: zenml.integrations.prodigy - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-pycaret.md b/develop/sdk/integration-docs/v1/integrations-pycaret.md deleted file mode 100644 index b550c03..0000000 --- a/develop/sdk/integration-docs/v1/integrations-pycaret.md +++ /dev/null @@ -1,7 +0,0 @@ -# Pycaret - -::: zenml.integrations.pycaret - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-pytorch.md b/develop/sdk/integration-docs/v1/integrations-pytorch.md deleted file mode 100644 index 2b4cd41..0000000 --- a/develop/sdk/integration-docs/v1/integrations-pytorch.md +++ /dev/null @@ -1,7 +0,0 @@ -# Pytorch - -::: zenml.integrations.pytorch - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-pytorch_lightning.md b/develop/sdk/integration-docs/v1/integrations-pytorch_lightning.md deleted file mode 100644 index 7b3bca5..0000000 --- a/develop/sdk/integration-docs/v1/integrations-pytorch_lightning.md +++ /dev/null @@ -1,7 +0,0 @@ -# Pytorch Lightning - -::: zenml.integrations.pytorch_lightning - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-registry.md b/develop/sdk/integration-docs/v1/integrations-registry.md deleted file mode 100644 index 3487bc2..0000000 --- a/develop/sdk/integration-docs/v1/integrations-registry.md +++ /dev/null @@ -1,7 +0,0 @@ -# Registry - -::: zenml.integrations.registry - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-s3.md b/develop/sdk/integration-docs/v1/integrations-s3.md deleted file mode 100644 index a5f0832..0000000 --- a/develop/sdk/integration-docs/v1/integrations-s3.md +++ /dev/null @@ -1,7 +0,0 @@ -# S3 - -::: zenml.integrations.s3 - handler: python - rendering: - show_root_heading: true - show_source: true diff --git 
a/develop/sdk/integration-docs/v1/integrations-scipy.md b/develop/sdk/integration-docs/v1/integrations-scipy.md deleted file mode 100644 index e02b4c2..0000000 --- a/develop/sdk/integration-docs/v1/integrations-scipy.md +++ /dev/null @@ -1,7 +0,0 @@ -# Scipy - -::: zenml.integrations.scipy - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-seldon.md b/develop/sdk/integration-docs/v1/integrations-seldon.md deleted file mode 100644 index 13d96c1..0000000 --- a/develop/sdk/integration-docs/v1/integrations-seldon.md +++ /dev/null @@ -1,7 +0,0 @@ -# Seldon - -::: zenml.integrations.seldon - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-sklearn.md b/develop/sdk/integration-docs/v1/integrations-sklearn.md deleted file mode 100644 index 7522ba8..0000000 --- a/develop/sdk/integration-docs/v1/integrations-sklearn.md +++ /dev/null @@ -1,7 +0,0 @@ -# Sklearn - -::: zenml.integrations.sklearn - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-skypilot.md b/develop/sdk/integration-docs/v1/integrations-skypilot.md deleted file mode 100644 index 4058970..0000000 --- a/develop/sdk/integration-docs/v1/integrations-skypilot.md +++ /dev/null @@ -1,7 +0,0 @@ -# Skypilot - -::: zenml.integrations.skypilot - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-skypilot_aws.md b/develop/sdk/integration-docs/v1/integrations-skypilot_aws.md deleted file mode 100644 index 84723a0..0000000 --- a/develop/sdk/integration-docs/v1/integrations-skypilot_aws.md +++ /dev/null @@ -1,7 +0,0 @@ -# Skypilot Aws - -::: zenml.integrations.skypilot_aws - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-skypilot_azure.md b/develop/sdk/integration-docs/v1/integrations-skypilot_azure.md deleted file mode 100644 index e43bc6a..0000000 --- a/develop/sdk/integration-docs/v1/integrations-skypilot_azure.md +++ /dev/null @@ -1,7 +0,0 @@ -# Skypilot Azure - -::: zenml.integrations.skypilot_azure - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-skypilot_gcp.md b/develop/sdk/integration-docs/v1/integrations-skypilot_gcp.md deleted file mode 100644 index 879b0dc..0000000 --- a/develop/sdk/integration-docs/v1/integrations-skypilot_gcp.md +++ /dev/null @@ -1,7 +0,0 @@ -# Skypilot Gcp - -::: zenml.integrations.skypilot_gcp - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-skypilot_lambda.md b/develop/sdk/integration-docs/v1/integrations-skypilot_lambda.md deleted file mode 100644 index 307b252..0000000 --- a/develop/sdk/integration-docs/v1/integrations-skypilot_lambda.md +++ /dev/null @@ -1,7 +0,0 @@ -# Skypilot Lambda - -::: zenml.integrations.skypilot_lambda - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-slack.md b/develop/sdk/integration-docs/v1/integrations-slack.md deleted file mode 100644 index dd1faff..0000000 --- a/develop/sdk/integration-docs/v1/integrations-slack.md +++ /dev/null @@ -1,7 +0,0 @@ -# Slack - -::: zenml.integrations.slack - handler: python - rendering: - 
show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-spark.md b/develop/sdk/integration-docs/v1/integrations-spark.md deleted file mode 100644 index e72edff..0000000 --- a/develop/sdk/integration-docs/v1/integrations-spark.md +++ /dev/null @@ -1,7 +0,0 @@ -# Spark - -::: zenml.integrations.spark - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-tekton.md b/develop/sdk/integration-docs/v1/integrations-tekton.md deleted file mode 100644 index eafde3b..0000000 --- a/develop/sdk/integration-docs/v1/integrations-tekton.md +++ /dev/null @@ -1,7 +0,0 @@ -# Tekton - -::: zenml.integrations.tekton - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-tensorboard.md b/develop/sdk/integration-docs/v1/integrations-tensorboard.md deleted file mode 100644 index a48cb71..0000000 --- a/develop/sdk/integration-docs/v1/integrations-tensorboard.md +++ /dev/null @@ -1,7 +0,0 @@ -# Tensorboard - -::: zenml.integrations.tensorboard - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-tensorflow.md b/develop/sdk/integration-docs/v1/integrations-tensorflow.md deleted file mode 100644 index e85f4c6..0000000 --- a/develop/sdk/integration-docs/v1/integrations-tensorflow.md +++ /dev/null @@ -1,7 +0,0 @@ -# Tensorflow - -::: zenml.integrations.tensorflow - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-utils.md b/develop/sdk/integration-docs/v1/integrations-utils.md deleted file mode 100644 index 3e55bae..0000000 --- a/develop/sdk/integration-docs/v1/integrations-utils.md +++ /dev/null @@ -1,7 +0,0 @@ -# Utils - -::: zenml.integrations.utils - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-wandb.md b/develop/sdk/integration-docs/v1/integrations-wandb.md deleted file mode 100644 index 12bf116..0000000 --- a/develop/sdk/integration-docs/v1/integrations-wandb.md +++ /dev/null @@ -1,7 +0,0 @@ -# Wandb - -::: zenml.integrations.wandb - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-whylogs.md b/develop/sdk/integration-docs/v1/integrations-whylogs.md deleted file mode 100644 index 8687bb6..0000000 --- a/develop/sdk/integration-docs/v1/integrations-whylogs.md +++ /dev/null @@ -1,7 +0,0 @@ -# Whylogs - -::: zenml.integrations.whylogs - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/integration-docs/v1/integrations-xgboost.md b/develop/sdk/integration-docs/v1/integrations-xgboost.md deleted file mode 100644 index 01159d8..0000000 --- a/develop/sdk/integration-docs/v1/integrations-xgboost.md +++ /dev/null @@ -1,7 +0,0 @@ -# Xgboost - -::: zenml.integrations.xgboost - handler: python - rendering: - show_root_heading: true - show_source: true diff --git a/develop/sdk/python-client.mdx b/develop/sdk/python-client.mdx deleted file mode 100644 index 249708b..0000000 --- a/develop/sdk/python-client.mdx +++ /dev/null @@ -1,143 +0,0 @@ ---- -title: "Python Client" -description: "Interacting with your ZenML instance through the ZenML Client." 
-icon: staff-snake ---- - -Pipelines, runs, stacks, and many other ZenML resources are stored and versioned in a database within your ZenML instance behind the scenes. The ZenML Python `Client` allows you to fetch, update, or even create any of these resources programmatically in Python. - - -In all other programming languages and environments, you can interact with ZenML resources through the REST API endpoints of your ZenML server instead. Checkout the `/docs/` page of your server for an overview of all available endpoints. - - -### Usage Example - -The following example shows how to use the ZenML Client to fetch the last 10 pipeline runs that you ran yourself on the stack that you have currently set: - - -```py -from zenml.client import Client - -client = Client() - -my_runs_on_current_stack = client.list_pipeline_runs( - - stack_id=client.active_stack_model.id, # on current stack - - user_id=client.active_user.id, # ran by you - - sort_by="desc:start_time", # last 10 - - size=10, - -) - -for pipeline_run in my_runs_on_current_stack: - - print(pipeline_run.name) -``` - -### List of Resources - -These are the main ZenML resources that you can interact with via the ZenML Client: - -#### Pipelines, Runs, Artifacts - -* **Pipelines**: The pipelines that were implicitly tracked when running ZenML pipelines. -* **Pipeline Runs**: Information about all pipeline runs that were executed on your ZenML instance. -* **Run Templates**: Templates to run pipelines from the server or dashboard. -* **Step Runs**: The steps of all pipeline runs. Mainly useful for directly fetching a specific step of a run by its ID. -* **Artifacts**: Information about all artifacts that were written to your artifact stores as part of pipeline runs. -* **Schedules**: Metadata about the schedules that you have used to [schedule pipeline runs](/develop/usage/pipelines/build-pipelines/schedule-a-pipeline). -* **Builds**: The pipeline-specific Docker images that were created when [containerizing your pipeline](/develop/usage/project-setup/customize-docker-builds). -* **Code Repositories**: The git code repositories that you have connected with your ZenML instance. See [here](/develop/user-guide/guides/production-guide/connect-code-repository) for more information. - - -Checkout the [documentation on fetching runs](/develop/usage/pipelines/build-pipelines/fetching-pipelines) for more information on the various ways how you can fetch and use the pipeline, pipeline run, step run, and artifact resources in code. - - -#### Stacks, Infrastructure, Authentication - -* **Stack**: The stacks registered in your ZenML instance. -* **Stack Components**: The stack components registered in your ZenML instance, e.g., all orchestrators, artifact stores, model deployers, ... -* **Flavors**: The [stack component flavors](/develop/getting-started/core-concepts#flavor) available to you, including: - * Built-in flavors like the [local orchestrator](/develop/stack-components/orchestrators/local), - * Integration-enabled flavors like the [Kubeflow orchestrator](/develop/stack-components/orchestrators/kubeflow), - * Custom flavors that you have [created yourself](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). -* **User**: The users registered in your ZenML instance. If you are running locally, there will only be a single `default` user. -* **Secrets**: The infrastructure authentication secrets that you have registered in the [ZenML Secret Store](/develop/usage/project-setup/use-secrets/interact-with-secrets). 
-* **Service Connectors**: The service connectors that you have set up to [connect ZenML to your infrastructure](/develop/usage/resource-data-management/auth-management). - -### Client Methods - -#### Reading and Writing Resources - -**List Methods** - -Get a list of resources, e.g.: - - -```py -client.list_pipeline_runs( - - stack_id=client.active_stack_model.id, # filter by stack - - user_id=client.active_user.id, # filter by user - - sort_by="desc:start_time", # sort by start time descending - - size=10, # limit page size to 10 - -) -``` - -These methods always return a [Page](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-models/#zenml.models.page%5Fmodel) of resources, which behaves like a standard Python list and contains, by default, the first 50 results. You can modify the page size by passing the `size` argument or fetch a subsequent page by passing the `page` argument to the list method. - -You can further restrict your search by passing additional arguments that will be used to filter the results. E.g., most resources have a `user_id` associated with them that can be set to only list resources created by that specific user. The available filter argument options are different for each list method; check out the method declaration in the [Client SDK documentation](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-client/) to find out which exact arguments are supported or have a look at the fields of the corresponding filter model class. - -Except for pipeline runs, all other resources will by default be ordered by creation time ascending. E.g., `client.list_artifacts()` would return the first 50 artifacts ever created. You can change the ordering by specifying the `sort_by` argument when calling list methods. - -**Get Methods** - -Fetch a specific instance of a resource by either resource ID, name, or name prefix, e.g.: - -```shell -client.get_pipeline_run("413cfb42-a52c-4bf1-a2fd-78af2f7f0101") # ID - -client.get_pipeline_run("first_pipeline-2023_06_20-16_20_13_274466") # Name - -client.get_pipeline_run("first_pipeline-2023_06_20-16") # Name prefix -``` - -**Create, Update, and Delete Methods** - -Methods for creating / updating / deleting resources are only available for some of the resources and the required arguments are different for each resource. Checkout the [Client SDK Documentation](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-client/) to find out whether a specific resource supports write operations through the Client and which arguments are required. - -#### Active User and Active Stack - -For some use cases you might need to know information about the user that you are authenticated as or the stack that you have currently set as active. You can fetch this information via the `client.active_user` and `client.active_stack_model` properties respectively, e.g.: - - -```py -my_runs_on_current_stack = client.list_pipeline_runs( - - stack_id=client.active_stack_model.id, # on current stack - - user_id=client.active_user.id, # ran by you - -) -``` - -### Resource Models - -The methods of the ZenML Client all return **Response Models**, which are [Pydantic Models](https://docs.pydantic.dev/latest/usage/models/) that allow ZenML to validate that the returned data always has the correct attributes and types. E.g., the `client.list_pipeline_runs` method always returns type `Page[PipelineRunResponseModel]`. - - -You can think of these models as similar to types in strictly-typed languages, or as the requirements of a single endpoint in an API. 
In particular, they are **not related to machine learning models** like decision trees, neural networks, etc. - - -ZenML also has similar models that define which information is required to create, update, or search resources, named **Request Models**, **Update Models**, and **Filter Models** respectively. However, these models are only used for the server API endpoints, and not for the Client methods. - - -To find out which fields a specific resource model contains, checkout the [ZenML Models SDK Documentation](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-models/#zenml.models) and expand the source code to see a list of all fields of the respective model. Note that all resources have **Base Models** that define fields that response, request, update, and filter models have in common, so you need to take a look at the base model source code as well. - diff --git a/develop/snippets/snippet-intro.mdx b/develop/snippets/snippet-intro.mdx deleted file mode 100644 index c57e7c7..0000000 --- a/develop/snippets/snippet-intro.mdx +++ /dev/null @@ -1,4 +0,0 @@ -One of the core principles of software development is DRY (Don't Repeat -Yourself). This is a principle that apply to documentation as -well. If you find yourself repeating the same content in multiple places, you -should consider creating a custom snippet to keep your content in sync. diff --git a/develop/stack-components/alerters/alerters.mdx b/develop/stack-components/alerters/alerters.mdx deleted file mode 100644 index 6512fe5..0000000 --- a/develop/stack-components/alerters/alerters.mdx +++ /dev/null @@ -1,41 +0,0 @@ ---- -title: "Overview" -description: "Sending automated alerts to chat services." -icon: exclamation ---- - -**Alerters** allow you to send messages to chat services (like Slack, Discord, Mattermost, etc.) from within your pipelines. This is useful to immediately get notified when failures happen, for general monitoring/reporting, and also for building human-in-the-loop ML. - -## Alerter Flavors - -Currently, the [SlackAlerter](/develop/stack-components/alerters/slack) and [DiscordAlerter](/develop/stack-components/alerters/discord) are the available alerter integrations. However, it is straightforward to extend ZenML and [build an alerter for other chat services](/develop/stack-components/alerters/custom). - -| Alerter | Flavor | Integration | Notes | -| ---------------------------------------------------------- | -------- | ------------------------------------------------------------------ | -------------------------------- | -| [Slack](/develop/stack-components/alerters/slack) | slack | slack | Interacts with a Slack channel | -| [Discord](/develop/stack-components/alerters/discord) | discord | discord | Interacts with a Discord channel | -| [Custom Implementation](/develop/stack-components/alerters/custom) | _custom_ | Extend the alerter abstraction and provide your own implementation | | - -If you would like to see the available flavors of alerters in your terminal, you can use the following command: - -```bash -zenml alerter flavor list -``` - -## How to use Alerters with ZenML - -Each alerter integration comes with specific standard steps that you can use out of the box. - -However, you first need to register an alerter component in your terminal: - -```bash -zenml alerter register ... -``` - -Then you can add it to your stack using - -```bash -zenml stack register ... 
-al
-```
-
-Afterward, you can import the alerter standard steps provided by the respective integration and directly use them in your pipelines.
diff --git a/develop/stack-components/alerters/custom.mdx b/develop/stack-components/alerters/custom.mdx
deleted file mode 100644
index 85226df..0000000
--- a/develop/stack-components/alerters/custom.mdx
+++ /dev/null
@@ -1,142 +0,0 @@
----
-title: "Develop a Custom Alerter"
-description: "Learning how to develop a custom alerter."
-icon: sim-card
----
-
-
- Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts.
-
-
-### Base Abstraction
-
-The base abstraction for alerters is very basic, as it only defines two abstract methods that subclasses should implement:
-
-* `post()` takes a string, posts it to the desired chat service, and returns `True` if the operation succeeded, else `False`.
-* `ask()` does the same as `post()`, but after sending the message, it waits until someone approves or rejects the operation from within the chat service (e.g., by sending "approve" / "reject" to the bot as a response). `ask()` then only returns `True` if the operation succeeded and was approved, else `False`.
-
-The base abstraction looks something like this:
-
-```py
-class BaseAlerter(StackComponent, ABC):
-    """Base class for all ZenML alerters."""
-
-    def post(
-        self, message: str, params: Optional[BaseAlerterStepParameters]
-    ) -> bool:
-        """Post a message to a chat service."""
-        return True
-
-    def ask(
-        self, question: str, params: Optional[BaseAlerterStepParameters]
-    ) -> bool:
-        """Post a message to a chat service and wait for approval."""
-        return True
-```
-
-
- This is a slimmed-down version of the base implementation. To see the full docstrings and imports, please check [the source code on GitHub](https://github.com/zenml-io/zenml/blob/main/src/zenml/alerter/base%5Falerter.py).
-
-### Building your own custom alerter
-
-Creating your own custom alerter can be done in three steps:
-
-1. Create a class that inherits from `BaseAlerter` and implements the `post()` and `ask()` methods.
-
-```py
-from typing import Optional
-
-from zenml.alerter import BaseAlerter, BaseAlerterStepParameters
-
-
-class MyAlerter(BaseAlerter):
-    """My alerter class."""
-
-    def post(
-        self, message: str, config: Optional[BaseAlerterStepParameters]
-    ) -> bool:
-        """Post a message to a chat service."""
-        ...  # send `message` to your chat service here
-        return True
-
-    def ask(
-        self, question: str, config: Optional[BaseAlerterStepParameters]
-    ) -> bool:
-        """Post a message to a chat service and wait for approval."""
-        ...  # send `question` and wait for an approve/reject response
-        return True
-```
-2. If you need to configure your custom alerter, you can also implement a config object.
-```py
-from zenml.alerter.base_alerter import BaseAlerterConfig
-
-
-class MyAlerterConfig(BaseAlerterConfig):
-    my_param: str
-```
-
-3. Finally, you can bring the implementation and the configuration together in a new flavor object.
- -```py - -from typing import Type, TYPE_CHECKING - -from zenml.alerter import BaseAlerterFlavor - -if TYPE_CHECKING: - from zenml.stack import StackComponent, StackComponentConfig - - -class MyAlerterFlavor(BaseAlerterFlavor): - @property - def name(self) -> str: - return "my_alerter" - - @property - def config_class(self) -> Type[StackComponentConfig]: - from my_alerter_config import MyAlerterConfig - - return MyAlerterConfig - - @property - def implementation_class(self) -> Type[StackComponent]: - from my_alerter import MyAlerter - - return MyAlerter - -``` -Once you are done with the implementation, you can register your new flavor through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml alerter flavor register -``` - -For example, if your flavor class `MyAlerterFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml alerter flavor register flavors.my_flavor.MyAlerterFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. - - -Afterward, you should see the new custom alerter flavor in the list of available alerter flavors: - -```bash -zenml alerter flavor list -``` - - -It is important to draw attention to when and how these abstractions are coming into play in a ZenML workflow. - -* The **MyAlerterFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **MyAlerterConfig** class is imported when someone tries to register/update a stack component with the `my_alerter` flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` objects are inherently `pydantic` objects, you can also add your own custom validators here. -* The **MyAlerter** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `MyAlerterFlavor` and the `MyAlerterConfig` are implemented in a different module/path than the actual `MyAlerter`). - - diff --git a/develop/stack-components/alerters/discord.mdx b/develop/stack-components/alerters/discord.mdx deleted file mode 100644 index 250d839..0000000 --- a/develop/stack-components/alerters/discord.mdx +++ /dev/null @@ -1,112 +0,0 @@ ---- -title: "Discord Alerter" -description: "Sending automated alerts to a Discord channel." -icon: discord ---- - -The `DiscordAlerter` enables you to send messages to a dedicated Discord channel directly from within your ZenML pipelines. 
- -The `discord` integration contains the following two standard steps: - -* [discord\_alerter\_post\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-discord/#zenml.integrations.discord.steps.discord%5Falerter%5Fpost%5Fstep.discord%5Falerter%5Fpost%5Fstep) takes a string message, posts it to a Discord channel, and returns whether the operation was successful. -* [discord\_alerter\_ask\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-discord/#zenml.integrations.discord.steps.discord%5Falerter%5Fask%5Fstep.discord%5Falerter%5Fask%5Fstep) also posts a message to a Discord channel, but waits for user feedback, and only returns `True` if a user explicitly approved the operation from within Discord (e.g., by sending "approve" / "reject" to the bot in response). - -Interacting with Discord from within your pipelines can be very useful in practice: - -* The `discord_alerter_post_step` allows you to get notified immediately when failures happen (e.g., model performance degradation, data drift, ...), -* The `discord_alerter_ask_step` allows you to integrate a human-in-the-loop into your pipelines before executing critical steps, such as deploying new models. - -## How to use it - -### Requirements - -Before you can use the `DiscordAlerter`, you first need to install ZenML's `discord` integration: - -```bash -zenml integration install discord -y -``` - - - See the [Integrations](/develop/stack-components/component-guide) page for more details on ZenML integrations and how to install and use them. - -### Setting Up a Discord Bot - -In order to use the `DiscordAlerter`, you first need to have a Discord workspace set up with a channel that you want your pipelines to post to. This is the `` you will need when registering the discord alerter component. - -Then, you need to [create a Discord App with a bot in your server](https://discordpy.readthedocs.io/en/latest/discord.html) . - - - Note in the bot token step, if you don't find the button then click on reset token to reset the bot and you will get a new token which you can use. Also, make sure you give necessary permissions to the bot required for sending and receiving messages. - - -### Registering a Discord Alerter in ZenML - -Next, you need to register a `discord` alerter in ZenML and link it to the bot you just created. You can do this with the following command: - -```Bash -zenml alerter register discord_alerter \ - --flavor=discord \ - --discord_token= \ - --default_discord_channel_id= -``` - -After you have registered the `discord_alerter`, you can add it to your stack like this: - -```bash -zenml stack register ... -al discord_alerter -``` - -Here is where you can find the required parameters: - -#### DISCORD\_CHANNEL\_ID - -Open the discord server, then right-click on the text channel and click on the ' Channel ID' option. - - - If you don't see any ' Channel ID' option for your channel, go to "User Settings" > "Advanced" and make sure "Developer Mode" is active. - - -#### DISCORD\_TOKEN - -This is the Discord token of your bot. You can find the instructions on how to set up a bot, invite it to your channel, and find its token [here](https://discordpy.readthedocs.io/en/latest/discord.html). 
- - -When inviting the bot to your channel, make sure it has at least the following permissions: - -* Read Messages/View Channels -* Send Messages -* Send Messages in Threads - - -### How to Use the Discord Alerter - -After you have a `DiscordAlerter` configured in your stack, you can directly import the [discord\_alerter\_post\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-discord/#zenml.integrations.discord.steps.discord%5Falerter%5Fpost%5Fstep.discord%5Falerter%5Fpost%5Fstep) and [discord\_alerter\_ask\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-discord/#zenml.integrations.discord.steps.discord%5Falerter%5Fask%5Fstep.discord%5Falerter%5Fask%5Fstep) steps and use them in your pipelines. - -Since these steps expect a string message as input (which needs to be the output of another step), you typically also need to define a dedicated formatter step that takes whatever data you want to communicate and generates the string message that the alerter should post. - -As an example, adding `discord_alerter_ask_step()` to your pipeline could look like this: - -```py -from zenml.integrations.discord.steps.discord_alerter_ask_step import discord_alerter_ask_step -from zenml import step, pipeline - - -@step -def my_formatter_step(artifact_to_be_communicated) -> str: - return f"Here is my artifact {artifact_to_be_communicated}!" - - -@pipeline -def my_pipeline(...): - ... - artifact_to_be_communicated = ... - message = my_formatter_step(artifact_to_be_communicated) - approved = discord_alerter_ask_step(message) - ... # Potentially have different behavior in subsequent steps if `approved` - -if __name__ == "__main__": - my_pipeline() -``` - -For more information and a full list of configurable attributes of the Discord alerter, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-discord/#zenml.integrations.discord.alerters.discord%5Falerter.DiscordAlerter) . - diff --git a/develop/stack-components/alerters/slack.mdx b/develop/stack-components/alerters/slack.mdx deleted file mode 100644 index c680b7e..0000000 --- a/develop/stack-components/alerters/slack.mdx +++ /dev/null @@ -1,150 +0,0 @@ ---- -title: "Slack Alerter" -description: "Sending automated alerts to a Slack channel." -icon: slack ---- - -The `SlackAlerter` enables you to send messages to a dedicated Slack channel directly from within your ZenML pipelines. - -The `slack` integration contains the following two standard steps: - -* [slack\_alerter\_post\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-slack/#zenml.integrations.slack.steps.slack%5Falerter%5Fpost%5Fstep.slack%5Falerter%5Fpost%5Fstep) takes a string message or a custom [Slack block](https://api.slack.com/block-kit/building), posts it to a Slack channel, and returns whether the operation was successful. -* [slack\_alerter\_ask\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-slack/#zenml.integrations.slack.steps.slack%5Falerter%5Fask%5Fstep.slack%5Falerter%5Fask%5Fstep) also posts a message or a custom [Slack block](https://api.slack.com/block-kit/building) to a Slack channel, but waits for user feedback, and only returns `True` if a user explicitly approved the operation from within Slack (e.g., by sending "approve" / "reject" to the bot in response). 
- -Interacting with Slack from within your pipelines can be very useful in practice: - -* The `slack_alerter_post_step` allows you to get notified immediately when failures happen (e.g., model performance degradation, data drift, ...), -* The `slack_alerter_ask_step` allows you to integrate a human-in-the-loop into your pipelines before executing critical steps, such as deploying new models. - -## How to use it - -### Requirements - -Before you can use the `SlackAlerter`, you first need to install ZenML's `slack` integration: - -```bash -zenml integration install slack -y -``` - - - See the [Integrations](/develop/stack-components/component-guide) page for more details on ZenML integrations and how to install and use them. - - -### Setting Up a Slack Bot - -In order to use the `SlackAlerter`, you first need to have a Slack workspace set up with a channel that you want your pipelines to post to. - -Then, you need to [create a Slack App](https://api.slack.com/apps?new%5Fapp=1) with a bot in your workspace. - - -Make sure to give your Slack bot the following permissions in the `OAuth & Permissions` tab under `Scopes`: - -* `chat:write`, -* `chat:write.public` -* `channels:read` -* `groups:read` -* `im:read` -* `mpim:read` - - - - - - - -### Registering a Slack Alerter in ZenML - -Next, you need to register a `slack` alerter in ZenML and link it to the bot you just created. You can do this with the following commands: - -```bash -zenml secret create slack_token --oauth_token= -zenml alerter register slack_alerter \ - --flavor=slack \ - --slack_token='{{ slack_token:oauth_token }}' \ - --default_slack_channel_id= -``` - -Here is where you can find the required parameters: - -* ``: Open your desired Slack channel in a browser, and out the last part of the URL starting with `C....`. -* ``: This is the Slack token of your bot. You can find it in the Slack app settings under `OAuth & Permissions`. **IMPORTANT**: Please make sure that the token is the `Bot User OAuth Token` not the `User OAuth Token`. - - - - - -After you have registered the `slack_alerter`, you can add it to your stack like this: - -```bash -zenml stack register ... -al slack_alerter -``` - -### How to Use the Slack Alerter - -After you have a `SlackAlerter` configured in your stack, you can directly import the [slack\_alerter\_post\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-slack/#zenml.integrations.slack.steps.slack%5Falerter%5Fpost%5Fstep.slack%5Falerter%5Fpost%5Fstep) and [slack\_alerter\_ask\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-slack/#zenml.integrations.slack.steps.slack%5Falerter%5Fask%5Fstep.slack%5Falerter%5Fask%5Fstep) steps and use them in your pipelines. - -Since these steps expect a string message as input (which needs to be the output of another step), you typically also need to define a dedicated formatter step that takes whatever data you want to communicate and generates the string message that the alerter should post. - -As an example, adding `slack_alerter_ask_step()` to your pipeline could look like this: - -```py -from zenml.integrations.slack.steps.slack_alerter_ask_step import slack_alerter_ask_step -from zenml import step, pipeline - - -@step -def my_formatter_step(artifact_to_be_communicated) -> str: - return f"Here is my artifact {artifact_to_be_communicated}!" - - -@pipeline -def my_pipeline(...): - ... - artifact_to_be_communicated = ... 
-    message = my_formatter_step(artifact_to_be_communicated)
-    approved = slack_alerter_ask_step(message)
-    ...  # Potentially have different behavior in subsequent steps if `approved`
-
-if __name__ == "__main__":
-    my_pipeline()
-```
-
-An example of adding a custom Slack block as part of any alerter logic for your pipeline could look like this:
-
-```py
-from zenml.integrations.slack.steps.slack_alerter_post_step import slack_alerter_post_step
-from zenml.integrations.slack.alerters.slack_alerter import SlackAlerterParameters
-from zenml import step, pipeline
-
-
-@step
-def my_custom_block_step(block_message) -> SlackAlerterParameters:
-    my_custom_block = [
-        {
-            "type": "header",
-            "text": {
-                "type": "plain_text",
-                "text": f":tada: {block_message}",
-                "emoji": True,
-            },
-        }
-    ]
-    return SlackAlerterParameters(blocks=my_custom_block)
-
-
-@pipeline
-def my_pipeline(...):
-    ...
-    message_blocks = my_custom_block_step("my custom block!")
-    post_message = slack_alerter_post_step(params=message_blocks)
-    return post_message
-
-if __name__ == "__main__":
-    my_pipeline()
-```
-
-For more information and a full list of configurable attributes of the Slack alerter, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-slack/#zenml.integrations.slack.alerters.slack%5Falerter.SlackAlerter).
-
diff --git a/develop/stack-components/annotators/annotators.mdx b/develop/stack-components/annotators/annotators.mdx
deleted file mode 100644
index 0ca31c4..0000000
--- a/develop/stack-components/annotators/annotators.mdx
+++ /dev/null
@@ -1,67 +0,0 @@
----
-title: "Overview"
-description: "Annotating the data in your workflow."
-icon: tag
----
-
-Annotators are a stack component that enables the use of data annotation as part of your ZenML stack and pipelines. You can use the associated CLI command to launch annotation, configure your datasets and get stats on how many labeled tasks you have ready for use.
-
-Data annotation/labeling is a core part of MLOps that is frequently left out of the conversation. ZenML will incrementally start to build features that support an iterative annotation workflow that sees the people doing labeling (and their workflows/behaviors) as integrated parts of their ML process(es).
-
-There are a number of different places in the ML lifecycle where this can happen:
-
-* **At the start**: You might be starting out without any data, or with a ton of data but no clear sense of which parts of it are useful to your particular problem. It’s not uncommon to have a lot of data but to be lacking accurate labels for that data. So you can start and get great value from bootstrapping your model: label some data, train your model, and use your model to suggest labels, allowing you to speed up your labeling and iterate on and on in this way. Labeling data early on in the process also helps clarify and condense down your specific rules and standards. For example, you might realize that you need to have specific definitions for certain concepts so that your labeling efforts are consistent across your team.
-* **As new data comes in**: New data will likely continue to come in, and you might want to check in with the labeling process at regular intervals to expose yourself to this new data.
(You’ll probably also want to have some kind of automation around detecting data or concept drift, but for certain kinds of unstructured data you probably can never completely abandon the instant feedback of actual contact with the raw data.) -* **Samples generated for inference**: Your model will be making predictions on real-world data being passed in. If you store and label this data, you’ll gain a valuable set of data that you can use to compare your labels with what the model was predicting, another possible way to flag drifts of various kinds. This data can then (subject to privacy/user consent) be used in retraining or fine-tuning your model. -* **Other ad hoc interventions**: You will probably have some kind of process to identify bad labels, or to find the kinds of examples that your model finds really difficult to make correct predictions. For these, and for areas where you have clear class imbalances, you might want to do ad hoc annotation to supplement the raw materials your model has to learn from. - -ZenML currently offers standard steps that help you tackle the above use cases, but the stack component and abstraction will continue to be developed to make it easier to use. - -### When to use it - -The annotator is an optional stack component in the ZenML Stack. We designed our abstraction to fit into the larger ML use cases, particularly the training and deployment parts of the lifecycle. - -The core parts of the annotation workflow include: - -* using labels or annotations in your training steps in a seamless way -* handling the versioning of annotation data -* allow for the conversion of annotation data to and from custom formats -* handle annotator-specific tasks, for example, the generation of UI config files that Label Studio requires for the web annotation interface - -### List of available annotators - -For production use cases, some more flavors can be found in specific `integrations` modules. In terms of annotators, ZenML features integrations with `label_studio` and `pigeon`. - -| Annotator | Flavor | Integration | Notes | -| ----------------------------------------------------------------- | ------------- | -------------------------------------------------------------------- | ----------------------------------------------------------------------------------- | -| [ArgillaAnnotator](/develop/stack-components/annotators/argilla) | argilla | argilla | Connect ZenML with Argilla | -| [LabelStudioAnnotator](/develop/stack-components/annotators/label-studio) | label\_studio | label\_studio | Connect ZenML with Label Studio | -| [PigeonAnnotator](/develop/stack-components/annotators/pigeon) | pigeon | pigeon | Connect ZenML with Pigeon. Notebook only & for image and text classification tasks. | -| [ProdigyAnnotator](/develop/stack-components/annotators/prodigy) | prodigy | prodigy | Connect ZenML with [Prodigy](https://prodi.gy/) | -| [Custom Implementation](/develop/stack-components/annotators/custom) | _custom_ | Extend the annotator abstraction and provide your own implementation | | - -If you would like to see the available flavors for annotators, you can use the command: - -```bash -zenml annotator flavor list -``` - -### How to use it - -The available implementation of the annotator is built on top of the Label Studio integration, which means that using an annotator currently is no different from what's described on the [Label Studio page: How to use it?](/develop/stack-components/annotators/label-studio#how-do-you-use-it). 
([Pigeon](/develop/stack-components/annotators/pigeon) is also supported, but has a very limited functionality and only works within Jupyter notebooks.) - -### A note on names - -The various annotation tools have mostly standardized around the naming of key concepts as part of how they build their tools. Unfortunately, this hasn't been completely unified so ZenML takes an opinion on which names we use for our stack components and integrations. Key differences to note: - -* Label Studio refers to the grouping of a set of annotations/tasks as a 'Project', whereas most other tools use the term 'Dataset', so ZenML also calls this grouping a 'Dataset'. -* The individual meta-unit for 'an annotation + the source data' is referred to in different ways, but at ZenML (and with Label Studio) we refer to them as 'tasks'. - -The remaining core concepts ('annotation' and 'prediction', in particular) are broadly used among annotation tools. - diff --git a/develop/stack-components/annotators/argilla.mdx b/develop/stack-components/annotators/argilla.mdx deleted file mode 100644 index 7b82dfd..0000000 --- a/develop/stack-components/annotators/argilla.mdx +++ /dev/null @@ -1,94 +0,0 @@ ---- -title: "Argilla" -description: "Annotating data using Argilla." -icon: snowflake ---- - -[Argilla](https://github.com/argilla-io/argilla) is an open-source data curation platform designed to enhance the development of both small and large language models (LLMs) and NLP tasks in general. It enables users to build robust language models through faster data curation using both human and machine feedback, providing support for each step in the MLOps cycle, from data labeling to model monitoring. - - - - - -Argilla distinguishes itself for its focus on specific use cases and human-in-the-loop approaches. While it does offer programmatic features, Argilla's core value lies in actively involving human experts in the tool-building process, setting it apart from other competitors. - -### When would you want to use it? - -If you need to label textual data as part of your ML workflow, that is the point at which you could consider adding the Argilla annotator stack component as part of your ZenML stack. - -We currently support the use of annotation at the various stages described in [the main annotators docs page](/develop/stack-components/annotators). The Argilla integration currently is built to support annotation using a local (Docker-backed) instance of Argilla as well as a deployed instance of Argilla. There is an easy way to deploy Argilla as a [Hugging Face Space](https://huggingface.co/docs/hub/spaces-sdks-docker-argilla), for instance, which is documented in the [Argilla documentation](https://docs.argilla.io/en/latest/getting%5Fstarted/installation/deployments/huggingface-spaces.html). - -### How to deploy it? - -The Argilla Annotator flavor is provided by the Argilla ZenML integration. You need to install it to be able to register it as an Annotator and add it to your stack: - -```bash -zenml integration install argilla -``` -You can either pass the `api_key` directly into the `zenml annotator register` command or you can register it as a secret and pass the secret name into the command. We recommend the latter approach for security reasons. If you want to take the latter approach, be sure to register a secret for whichever artifact store you choose, and then you should make sure to pass the name of that secret into the annotator as the `--authentication_secret`. 
For example, you'd run: - -```bash -zenml secret create argilla_secrets --api_key="" -``` - -(Visit the Argilla documentation and interface to obtain your API key.) - -Then register your annotator with ZenML: - -```bash -zenml annotator register argilla --flavor argilla --authentication_secret=argilla_secrets -``` - -When using a deployed instance of Argilla, the instance URL must be specified without any trailing `/` at the end. If you are using a Hugging Face Spaces instance and its visibility is set to private, you must also set the `extra_headers` parameter which would include a Hugging Face token. For example: - -```bash -zenml annotator register argilla --flavor argilla --authentication_secret=argilla_secrets --instance_url="https://[your-owner-name]-[your_space_name].hf.space" --extra_headers="{"Authorization": f"Bearer {}"}" -``` - -Finally, add all these components to a stack and set it as your active stack. For example: - -```bash -zenml stack copy default annotation -# this must be done separately so that the other required stack components are first registered -zenml stack update annotation -an -zenml stack set annotation -# optionally also -zenml stack describe -``` - -Now if you run a simple CLI command like `zenml annotator dataset list` this should work without any errors. You're ready to use your annotator in your ML workflow! - -### How do you use it? - -ZenML supports access to your data and annotations via the `zenml annotator ...` CLI command. We have also implemented an interface to some of the common Argilla functionality via the ZenML SDK. - -You can access information about the datasets you're using with the `zenml annotator dataset list`. To work on annotation for a particular dataset, you can run `zenml annotator dataset annotate `. What follows is an overview of some key components to the Argilla integration and how it can be used. - -#### Argilla Annotator Stack Component - -Our Argilla annotator component inherits from the `BaseAnnotator` class. There are some methods that are core methods that must be defined, like being able to register or get a dataset. Most annotators handle things like the storage of state and have their own custom features, so there are quite a few extra methods specific to Argilla. - -The core Argilla functionality that's currently enabled includes a way to register your datasets, export any annotations for use in separate steps as well as start the annotator daemon process. (Argilla requires a server to be running in order to use the web interface, and ZenML handles the connection to this server using the details you passed in when registering the component.) - -#### Argilla Annotator SDK - -Visit [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-argilla/) to learn more about the methods that ZenML exposes for the Argilla annotator. To access the SDK through Python, you would first get the client object and then call the methods you need. For example: - -```bash -from zenml.client import Client - -client = Client() -annotator = client.active_stack.annotator - -# list dataset names -dataset_names = annotator.get_dataset_names() - -# get a specific dataset -dataset = annotator.get_dataset("dataset_name") - -# get the annotations for a dataset -annotations = annotator.get_labeled_data(dataset_name="dataset_name") -``` - -For more detailed information on how to use the Argilla annotator and the functionality it provides, visit the [Argilla documentation](https://docs.argilla.io/en/latest/). 
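-As an additional illustrative sketch, the same SDK calls can be used from inside a pipeline step to pull exported annotations into your workflow (the dataset name below is a placeholder, and the exact record format returned depends on your Argilla version):
-
-```py
-from typing import Any
-
-from zenml import step
-from zenml.client import Client
-
-
-@step
-def load_argilla_annotations() -> Any:
-    # Grab the Argilla annotator from the active stack and export its labeled data
-    annotator = Client().active_stack.annotator
-    return annotator.get_labeled_data(dataset_name="my_dataset")
-```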
- diff --git a/develop/stack-components/annotators/custom.mdx b/develop/stack-components/annotators/custom.mdx deleted file mode 100644 index 03f1f5e..0000000 --- a/develop/stack-components/annotators/custom.mdx +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: "Develop a Custom Annotator" -description: "Learning how to develop a custom annotator." -icon: shuffle ---- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - -Annotators are a stack component that enables the use of data annotation as part of your ZenML stack and pipelines. You can use the associated CLI command to launch annotation, configure your datasets and get stats on how many labeled tasks you have ready for use. - - -**Base abstraction in progress!** - -We are actively working on the base abstraction for the annotators, which will be available soon. As a result, their extension is not possible at the moment. If you would like to use an annotator in your stack, please check the list of already available feature stores down below. - - diff --git a/develop/stack-components/annotators/label-studio.mdx b/develop/stack-components/annotators/label-studio.mdx deleted file mode 100644 index 85a8034..0000000 --- a/develop/stack-components/annotators/label-studio.mdx +++ /dev/null @@ -1,103 +0,0 @@ ---- -title: "Label Studio" -description: "Annotating data using Label Studio." -icon: ribbon ---- - -Label Studio is one of the leading open-source annotation platforms available to data scientists and ML practitioners. It is used to create or edit datasets that you can then use as part of training or validation workflows. It supports a broad range of annotation types, including: - -* Computer Vision (image classification, object detection, semantic segmentation) -* Audio & Speech (classification, speaker diarization, emotion recognition, audio transcription) -* Text / NLP (classification, NER, question answering, sentiment analysis) -* Time Series (classification, segmentation, event recognition) -* Multi-Modal / Domain (dialogue processing, OCR, time series with reference) - -### When would you want to use it? - -If you need to label data as part of your ML workflow, that is the point at which you could consider adding the optional annotator stack component as part of your ZenML stack. - -We currently support the use of annotation at the various stages described in [the main annotators docs page](/develop/stack-components/annotators), and also offer custom utility functions to generate Label Studio label config files for image classification and object detection. (More will follow in due course.) - -The Label Studio integration currently is built to support workflows using the following three cloud artifact stores: AWS S3, GCP/GCS, and Azure Blob Storage. Purely local stacks will currently _not_ work if you want to do add the annotation stack component as part of your stack. - -### How to deploy it? - -The Label Studio Annotator flavor is provided by the Label Studio ZenML integration, you need to install it, to be able to register it as an Annotator and add it to your stack: - -```bash -zenml integration install label_studio -``` - -You will then need to obtain your Label Studio API key. This will give you access to the web annotation interface. 
(The following steps apply to a local instance of Label Studio, but feel free to obtain your API key directly from your deployed instance if that's what you are using.) - -```bash -git clone https://github.com/HumanSignal/label-studio.git - -cd label-studio - -docker-compose up -d # starts label studio at http://localhost:8080 -``` - -Then visit [http://localhost:8080/](http://localhost:8080/) to log in, and then visit [http://localhost:8080/user/account](http://localhost:8080/user/account) and get your Label Studio API key (from the upper right-hand corner). You will need it for the next step. Keep the Label Studio server running, because the ZenML Label Studio annotator will use it as the backend. - -At this point you should register the API key under a custom secret name, making sure to replace the two parts in `<>` with whatever you choose: - -```bash -zenml secret create label_studio_secrets --api_key="" -``` -Then register your annotator with ZenML: - -```bash -zenml annotator register label_studio --flavor label_studio --authentication_secret="label_studio_secrets" --port=8080 - -# for deployed instances of Label Studio, you can also pass in the URL as follows, for example: - -# zenml annotator register label_studio --flavor label_studio --authentication_secret="" --instance_url="" --port=80 -``` - -When using a deployed instance of Label Studio, the instance URL must be specified without any trailing `/` at the end. You should specify the port, for example, port 80 for a standard HTTP connection. For a Hugging Face deployment (the easiest way to get going with Label Studio), please read the [Hugging Face deployment documentation](https://huggingface.co/docs/hub/spaces-sdks-docker-label-studio). - -Finally, add all these components to a stack and set it as your active stack. For example: - -```bash -zenml stack copy default annotation -zenml stack update annotation -a -# this must be done separately so that the other required stack components are first registered -zenml stack update annotation -an -zenml stack set annotation -# optionally also -zenml stack describe -``` - -Now if you run a simple CLI command like `zenml annotator dataset list` this should work without any errors. You're ready to use your annotator in your ML workflow! - -### How do you use it? - -ZenML assumes that users have registered a cloud artifact store and an annotator as described above. ZenML currently only supports this setup, but we will add in the fully local stack option in the future. - -ZenML supports access to your data and annotations via the `zenml annotator ...` CLI command. - -You can access information about the datasets you're using with the `zenml annotator dataset list`. To work on annotation for a particular dataset, you can run `zenml annotator dataset annotate `. - -[Our computer vision end to end example](https://github.com/zenml-io/zenml-projects/tree/main/end-to-end-computer-vision) is the best place to see how all the pieces of making this integration work fit together. What follows is an overview of some key components to the Label Studio integration and how it can be used. - -#### Label Studio Annotator Stack Component - -Our Label Studio annotator component inherits from the `BaseAnnotator` class. There are some methods that are core methods that must be defined, like being able to register or get a dataset. Most annotators handle things like the storage of state and have their own custom features, so there are quite a few extra methods specific to Label Studio. 
- -The core Label Studio functionality that's currently enabled includes a way to register your datasets, export any annotations for use in separate steps as well as start the annotator daemon process. (Label Studio requires a server to be running in order to use the web interface, and ZenML handles the provisioning of this server locally using the details you passed in when registering the component unless you've specified that you want to use a deployed instance.) - -#### Standard Steps - -ZenML offers some standard steps (and their associated config objects) which will get you up and running with the Label Studio integration quickly. These include: - -* `LabelStudioDatasetRegistrationConfig` \- a step config object to be used when registering a dataset with Label studio using the `get_or_create_dataset` step -* `LabelStudioDatasetSyncConfig` \- a step config object to be used when registering a dataset with Label studio using the `sync_new_data_to_label_studio` step. Note that this requires a ZenML secret to have been pre-registered with your artifact store as being the one that holds authentication secrets specific to your particular cloud provider. (Label Studio provides some documentation on what permissions these secrets require [here](https://labelstud.io/guide/tasks.html).) -* `get_or_create_dataset` step - This takes a `LabelStudioDatasetRegistrationConfig` config object which includes the name of the dataset. If it exists, this step will return the name, but if it doesn't exist then ZenML will register the dataset along with the appropriate label config with Label Studio. -* `get_labeled_data` step - This step will get all labeled data available for a particular dataset. Note that these are output in a Label Studio annotation format, which will subsequently be converted into a format appropriate for your specific use case. -* `sync_new_data_to_label_studio` step - This step is for ensuring that ZenML is handling the annotations and that the files being used are stored and synced with the ZenML cloud artifact store. This is an important step as part of a continuous annotation workflow since you want all the subsequent steps of your workflow to remain in sync with whatever new annotations are being made or have been created. - -#### Helper Functions - -Label Studio requires the use of what it calls 'label config' when you are creating/registering your dataset. These are strings containing HTML-like syntax that allow you to define a custom interface for your annotation. ZenML provides three helper functions that will construct these label config strings in the case of object detection, image classification, and OCR. See the [integrations.label\_studio.label\_config\_generators](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/label%5Fstudio/label%5Fconfig%5Fgenerators/label%5Fconfig%5Fgenerators.py) module for those three functions. - diff --git a/develop/stack-components/annotators/pigeon.mdx b/develop/stack-components/annotators/pigeon.mdx deleted file mode 100644 index 35d7e14..0000000 --- a/develop/stack-components/annotators/pigeon.mdx +++ /dev/null @@ -1,106 +0,0 @@ ---- -title: "Pigeon" -description: "Annotating data using Pigeon." -icon: kiwi-bird ---- - -Pigeon is a lightweight, open-source annotation tool designed for quick and easy labeling of data directly within Jupyter notebooks. 
It provides a simple and intuitive interface for annotating various types of data, including: - -* Text Classification -* Image Classification -* Text Captioning - -### When would you want to use it? - - - - - -If you need to label a small to medium-sized dataset as part of your ML workflow and prefer the convenience of doing it directly within your Jupyter notebook, Pigeon is a great choice. It is particularly useful for: - -* Quick labeling tasks that don't require a full-fledged annotation platform -* Iterative labeling during the exploratory phase of your ML project -* Collaborative labeling within a Jupyter notebook environment - -### How to deploy it? - -To use the Pigeon annotator, you first need to install the ZenML Pigeon integration: - -```bash -zenml integration install pigeon -``` - -Next, register the Pigeon annotator with ZenML, specifying the output directory where the annotation files will be stored: - -```bash -zenml annotator register pigeon --flavor pigeon --output_dir="path/to/dir" -``` - -Note that the `output_dir` is relative to the repository or notebook root. - -Finally, add the Pigeon annotator to your stack and set it as the active stack: - -```bash -zenml stack update --annotator pigeon -``` - -Now you're ready to use the Pigeon annotator in your ML workflow! - -### How do you use it? - -With the Pigeon annotator registered and added to your active stack, you can easily access it using the ZenML client within your Jupyter notebook. - -For text classification tasks, you can launch the Pigeon annotator as follows: - -```py -from zenml.client import Client - -annotator = Client().active_stack.annotator - -annotations = annotator.annotate( - data=[ - 'I love this movie', - 'I was really disappointed by the book' - ], - options=[ - 'positive', - 'negative' - ] -) -``` - -For image classification tasks, you can provide a custom display function to render the images: - -```py -from zenml.client import Client -from IPython.display import display, Image - -annotator = Client().active_stack.annotator - -annotations = annotator.annotate( - data=[ - '/path/to/image1.png', - '/path/to/image2.png' - ], - options=[ - 'cat', - 'dog' - ], - display_fn=lambda filename: display(Image(filename)) -) -``` - -The `launch` method returns the annotations as a list of tuples, where each tuple contains the data item and its corresponding label. - -You can also use the `zenml annotator dataset` commands to manage your datasets: - -* `zenml annotator dataset list` \- List all available datasets -* `zenml annotator dataset delete ` \- Delete a specific dataset -* `zenml annotator dataset stats ` \- Get statistics for a specific dataset - -Annotation files are saved as JSON files in the specified output directory. Each annotation file represents a dataset, with the filename serving as the dataset name. - -## Acknowledgements - -Pigeon was created by [Anastasis Germanidis](https://github.com/agermanidis) and released as a [Python package](https://pypi.org/project/pigeon-jupyter/) and [Github repository](https://github.com/agermanidis/pigeon). It is licensed under the Apache License. It has been updated to work with more recent `ipywidgets` versions and some small UI improvements were added. We are grateful to Anastasis for creating this tool and making it available to the community. 
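-Because the annotations come back as plain `(data item, label)` tuples, no special tooling is needed downstream. A minimal sketch, assuming an `annotations` list like the one returned by the image-classification example above (the stand-in values here are hypothetical):
-
-```py
-from collections import Counter
-
-# Stand-in for the list returned by `annotator.annotate(...)`
-annotations = [("/path/to/image1.png", "cat"), ("/path/to/image2.png", "dog")]
-
-# Count how many items were assigned to each label
-label_counts = Counter(label for _, label in annotations)
-print(label_counts)
-
-# Keep only the items labeled as "cat"
-cat_images = [item for item, label in annotations if label == "cat"]
-```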
-
diff --git a/develop/stack-components/annotators/prodigy.mdx b/develop/stack-components/annotators/prodigy.mdx
deleted file mode 100644
index d974b8f..0000000
--- a/develop/stack-components/annotators/prodigy.mdx
+++ /dev/null
@@ -1,94 +0,0 @@
----
-title: "Prodigy"
-description: "Annotating data using Prodigy."
-icon: feather-pointed
----
-
-[Prodigy](https://prodi.gy/) is a modern annotation tool for creating training and evaluation data for machine learning models. You can also use Prodigy to help you inspect and clean your data, do error analysis and develop rule-based systems to use in combination with your statistical models.
-
- Prodigy is a paid annotation tool. A license is required to download and use it with ZenML.
-
-The Prodigy Python library includes a range of pre-built workflows and command-line commands for various tasks, and well-documented components for implementing your own workflow scripts. Your scripts can specify how the data is loaded and saved, change which questions are asked in the annotation interface, and can even define custom HTML and JavaScript to change the behavior of the front-end. The web application is optimized for fast, intuitive and efficient annotation.
-
-### When would you want to use it?
-
-If you need to label data as part of your ML workflow, that is the point at which you could consider adding the optional annotator stack component as part of your ZenML stack.
-
-### How to deploy it?
-
-The Prodigy Annotator flavor is provided by the Prodigy ZenML integration. You need to install it to be able to register it as an Annotator and add it to your stack:
-
-```bash
-zenml integration export-requirements --output-file prodigy-requirements.txt prodigy
-```
-
-Note that you'll need to install Prodigy separately since it requires a license. Please [visit the Prodigy docs](https://prodi.gy/docs/install) for information on how to install it. Currently Prodigy also requires the `urllib3<2` dependency, so make sure to install that.
-
-Then register your annotator with ZenML:
-
-```bash
-zenml annotator register prodigy --flavor prodigy
-
-# optionally also pass in --custom_config_path=""
-```
-
-See https://prodi.gy/docs/install#config for more on custom Prodigy config files. Passing a `custom_config_path` allows you to override the default Prodigy config.
-
-Finally, add all these components to a stack and set it as your active stack. For example:
-
-```bash
-zenml stack copy default annotation
-zenml stack update annotation -an prodigy
-zenml stack set annotation
-# optionally also
-zenml stack describe
-```
-
-Now if you run a simple CLI command like `zenml annotator dataset list` this should work without any errors. You're ready to use your annotator in your ML workflow!
-
-### How do you use it?
-
-With Prodigy, there is no need to specially start the annotator ahead of time like with [Label Studio](/develop/stack-components/annotators/label-studio). Instead, just use Prodigy as per the [Prodigy docs](https://prodi.gy) and then you can use the ZenML wrapper / API to get your labeled data etc. using our Python methods.
-
-ZenML supports access to your data and annotations via the `zenml annotator ...` CLI command.
-
-You can access information about the datasets you're using with the `zenml annotator dataset list` command. To work on annotation for a particular dataset, you can run `zenml annotator dataset annotate `. This is the equivalent of running `prodigy ` in the terminal. For example, you might run:
-
-```bash
-zenml annotator dataset annotate your_dataset --command="textcat.manual news_topics ./news_headlines.jsonl --label Technology,Politics,Economy,Entertainment"
-```
-
-This would launch the Prodigy interface for [the textcat.manual recipe](https://prodi.gy/docs/recipes#textcat-manual) with the `news_topics` dataset and the labels `Technology`, `Politics`, `Economy`, and `Entertainment`. The data would be loaded from the `news_headlines.jsonl` file.
-
-A common workflow for Prodigy is to annotate data as you would usually do, and then use the connection into ZenML to import those annotations within a step in your pipeline (if running locally). For example, within a ZenML step:
-
-```py
-from typing import List, Dict, Any
-
-from zenml import step
-from zenml.client import Client
-
-@step
-def import_annotations() -> List[Dict[str, Any]]:
-    zenml_client = Client()
-    annotations = zenml_client.active_stack.annotator.get_labeled_data(dataset_name="my_dataset")
-    # Do something with the annotations
-    return annotations
-```
-
-If you're running in a cloud environment, you can manually export the annotations, store them somewhere in a cloud environment and then reference or use those within ZenML. The precise way you do this will be very case-dependent, however, so it's difficult to provide a one-size-fits-all solution.
-
-#### Prodigy Annotator Stack Component
-
-Our Prodigy annotator component inherits from the `BaseAnnotator` class. There are some core methods that must be defined, like being able to register or get a dataset. Most annotators handle things like the storage of state and have their own custom features, so there are quite a few extra methods specific to Prodigy.
-
-The core Prodigy functionality that's currently enabled from within the `annotator` stack component interface includes a way to register your datasets and export any annotations for use in separate steps.
-
diff --git a/develop/stack-components/artifact-stores/artifact-stores.mdx b/develop/stack-components/artifact-stores/artifact-stores.mdx
deleted file mode 100644
index de8b816..0000000
--- a/develop/stack-components/artifact-stores/artifact-stores.mdx
+++ /dev/null
@@ -1,166 +0,0 @@
----
-title: "Overview"
-description: "Setting up a persistent storage for your artifacts."
-icon: store
----
-
-The Artifact Store is a central component in any MLOps stack. As the name suggests, it acts as a data persistence layer where artifacts (e.g. datasets, models) ingested or generated by the machine learning pipelines are stored.
-
-ZenML automatically serializes and saves the data circulated through your pipelines in the Artifact Store: datasets, models, data profiles, data and model validation reports, and generally any object that is returned by a pipeline step. This is coupled with tracking in ZenML to provide extremely useful features such as caching and provenance/lineage tracking and pipeline reproducibility.
-
-
-Not all objects returned by pipeline steps are physically stored in the Artifact Store, nor do they have to be. How artifacts are serialized and deserialized and where their contents are stored are determined by the particular implementation of the [Materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) associated with the artifact data type. The majority of Materializers shipped with ZenML use the Artifact Store which is part of the active Stack as the location where artifacts are kept.
- -If you need to store _a particular type of pipeline artifact_ in a different medium (e.g. use an external model registry to store model artifacts, or an external data lake or data warehouse to store dataset artifacts), you can write your own [Materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) to implement the custom logic required for it. In contrast, if you need to use an entirely different storage backend to store artifacts, one that isn't already covered by one of the ZenML integrations, you can [extend the Artifact Store abstraction](/develop/stack-components/artifact-stores/custom) to provide your own Artifact Store implementation. - - -In addition to pipeline artifacts, the Artifact Store may also be used as storage backed by other specialized stack components that need to store their data in the form of persistent object storage. The [Great Expectations Data Validator](/develop/stack-components/data-validators/great-expectations) is such an example. - -Related concepts: - -* the Artifact Store is a type of Stack Component that needs to be registered as part of your ZenML [Stack](/develop/user-guide/guides/production-guide/understand-stacks). -* the objects circulated through your pipelines are serialized and stored in the Artifact Store using [Materializers](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types). Materializers implement the logic required to serialize and deserialize the artifact contents and to store them and retrieve their contents to/from the Artifact Store. - -### When to use it - -The Artifact Store is a mandatory component in the ZenML stack. It is used to store all artifacts produced by pipeline runs, and you are required to configure it in all of your stacks. - -#### Artifact Store Flavors - -Out of the box, ZenML comes with a `local` artifact store already part of the default stack that stores artifacts on your local filesystem. Additional Artifact Stores are provided by integrations: - -| Artifact Store | Flavor | Integration | URI Schema(s) | Notes | -| ----------------------------------------------------------------- | -------- | ----------- | ------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | -| [Local](/develop/stack-components/artifact-stores/local) | local | _built-in_ | None | This is the default Artifact Store. It stores artifacts on your local filesystem. Should be used only for running ZenML locally. | -| [Amazon S3](/develop/stack-components/artifact-stores/s3) | s3 | s3 | s3:// | Uses AWS S3 as an object store backend | -| [Google Cloud Storage](/develop/stack-components/artifact-stores/gcp) | gcp | gcp | gs:// | Uses Google Cloud Storage as an object store backend | -| [Azure](/develop/stack-components/artifact-stores/azure) | azure | azure | abfs://, az:// | Uses Azure Blob Storage as an object store backend | -| [Custom Implementation](/develop/stack-components/artifact-stores/custom) | _custom_ | _custom_ | Extend the Artifact Store abstraction and provide your own implementation | | - -If you would like to see the available flavors of Artifact Stores, you can use the command: - -```bash -zenml artifact-store flavor list -``` - - -Every Artifact Store has a `path` attribute that must be configured when it is registered with ZenML. 
This is a URI pointing to the root path where all objects are stored in the Artifact Store. It must use a URI schema that is supported by the Artifact Store flavor. For example, the S3 Artifact Store will need a URI that contains the `s3://` schema: - -```bash -zenml artifact-store register s3_store -f s3 --path s3://my_bucket -``` - - -### How to use it - -The Artifact Store provides low-level object storage services for other ZenML mechanisms. When you develop ZenML pipelines, you normally don't even have to be aware of its existence or interact with it directly. ZenML provides higher-level APIs that can be used as an alternative to store and access artifacts: - -* return one or more objects from your pipeline steps to have them automatically saved in the active Artifact Store as pipeline artifacts. -* [retrieve pipeline artifacts](/develop/usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory) from the active Artifact Store after a pipeline run is complete. - -You will probably need to interact with the [low-level Artifact Store API](/develop/stack-components/artifact-stores#the-artifact-store-api) directly: - -* if you implement custom [Materializers](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) for your artifact data types -* if you want to store custom objects in the Artifact Store - -#### The Artifact Store API - -All ZenML Artifact Stores implement [the same IO API](/develop/stack-components/artifact-stores/custom) that resembles a standard file system. This allows you to access and manipulate the objects stored in the Artifact Store in the same manner you would normally handle files on your computer and independently of the particular type of Artifact Store that is configured in your ZenML stack. - -Accessing the low-level Artifact Store API can be done through the following Python modules: - -* `zenml.io.fileio` provides low-level utilities for manipulating Artifact Store objects (e.g. `open`, `copy`, `rename` , `remove`, `mkdir`). These functions work seamlessly across Artifact Stores types. They have the same signature as the [Artifact Store abstraction methods](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-artifact%5Fstores/#zenml.artifact%5Fstores.base%5Fartifact%5Fstore.BaseArtifactStore) ( in fact, they are one and the same under the hood). -* [zenml.utils.io\_utils](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-utils/#zenml.utils.io%5Futils) includes some higher-level helper utilities that make it easier to find and transfer objects between the Artifact Store and the local filesystem or memory. - - - When calling the Artifact Store API, you should always use URIs that are relative to the Artifact Store root path, otherwise, you risk using an unsupported protocol or storing objects outside the store. You can use the `Repository` singleton to retrieve the root path of the active Artifact Store and then use it as a base path for artifact URIs, e.g.: - -```py -import os -from zenml.client import Client -from zenml.io import fileio - -root_path = Client().active_stack.artifact_store.path - -artifact_contents = "example artifact" -artifact_path = os.path.join(root_path, "artifacts", "examples") -artifact_uri = os.path.join(artifact_path, "test.txt") -fileio.makedirs(artifact_path) -with fileio.open(artifact_uri, "w") as f: - f.write(artifact_contents) -``` - -When using the Artifact Store API to write custom Materializers, the base artifact URI path is already provided. 
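As an illustration, here is a minimal Materializer sketch (the `StringMaterializer` name and the `data.txt` file name are made up for this example, and the snippet assumes the current `BaseMaterializer` interface) that reads and writes its payload relative to the `self.uri` it is handed; the Materializer documentation referenced below covers the full picture:

```py
import os
from typing import Type

from zenml.enums import ArtifactType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer


class StringMaterializer(BaseMaterializer):
    """Illustrative materializer that persists a string under the artifact URI."""

    ASSOCIATED_TYPES = (str,)
    ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA

    def save(self, data: str) -> None:
        # self.uri already points to a folder inside the active Artifact Store
        with fileio.open(os.path.join(self.uri, "data.txt"), "w") as f:
            f.write(data)

    def load(self, data_type: Type[str]) -> str:
        with fileio.open(os.path.join(self.uri, "data.txt"), "r") as f:
            return f.read()
```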
See the documentation on [Materializers](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) for an example. - - -The following are some code examples showing how to use the Artifact Store API for various operations: - -* creating folders, writing and reading data directly to/from an artifact store object - -```py -import os -from zenml.utils import io_utils -from zenml.io import fileio - -from zenml.client import Client - -root_path = Client().active_stack.artifact_store.path - -artifact_contents = "example artifact" -artifact_path = os.path.join(root_path, "artifacts", "examples") -artifact_uri = os.path.join(artifact_path, "test.txt") -fileio.makedirs(artifact_path) -io_utils.write_file_contents_as_string(artifact_uri, artifact_contents) -``` - -```py -import os -from zenml.utils import io_utils - -from zenml.client import Client - -root_path = Client().active_stack.artifact_store.path - -artifact_path = os.path.join(root_path, "artifacts", "examples") -artifact_uri = os.path.join(artifact_path, "test.txt") -artifact_contents = io_utils.read_file_contents_as_string(artifact_uri) -``` - -* using a temporary local file/folder to serialize and copy in-memory objects to/from the artifact store (heavily used in Materializers to transfer information between the Artifact Store and external libraries that don't support writing/reading directly to/from the artifact store backend): - -```py -import os -import tempfile -import external_lib - -from zenml.client import Client -from zenml.io import fileio - -root_path = Client().active_stack.artifact_store.path - -artifact_path = os.path.join(root_path, "artifacts", "examples") -artifact_uri = os.path.join(artifact_path, "test.json") -fileio.makedirs(artifact_path) - -with tempfile.NamedTemporaryFile( - mode="w", suffix=".json", delete=True -) as f: - external_lib.external_object.save_to_file(f.name) - # Copy it into artifact store - fileio.copy(f.name, artifact_uri) -``` - -```py -import os -import tempfile -import external_lib - -from zenml.client import Client -from zenml.io import fileio - -root_path = Client().active_stack.artifact_store.path - -artifact_path = os.path.join(root_path, "artifacts", "examples") -artifact_uri = os.path.join(artifact_path, "test.json") - -with tempfile.NamedTemporaryFile( - mode="w", suffix=".json", delete=True -) as f: - # Copy the serialized object from the artifact store - fileio.copy(artifact_uri, f.name) - external_lib.external_object.load_from_file(f.name) -``` diff --git a/develop/stack-components/artifact-stores/azure.mdx b/develop/stack-components/artifact-stores/azure.mdx deleted file mode 100644 index 1b1edde..0000000 --- a/develop/stack-components/artifact-stores/azure.mdx +++ /dev/null @@ -1,234 +0,0 @@ ---- -title: "Azure Blob Storage" -description: "Storing artifacts using Azure Blob Storage" -icon: warehouse ---- - -The Azure Artifact Store is an [Artifact Store](/develop/stack-components/artifact-stores) flavor provided with the Azure ZenML integration that uses [the Azure Blob Storage managed object storage service](https://azure.microsoft.com/en-us/services/storage/blobs/) to store ZenML artifacts in an Azure Blob Storage container. - -### When would you want to use it? - -Running ZenML pipelines with [the local Artifact Store](/develop/stack-components/artifact-stores/local) is usually sufficient if you just want to evaluate ZenML or get started quickly without incurring the trouble and the cost of employing cloud storage services in your stack.
However, the local Artifact Store becomes insufficient or unsuitable if you have more elaborate needs for your project: - -* if you want to share your pipeline run results with other team members or stakeholders inside or outside your organization -* if you have other components in your stack that are running remotely (e.g. a Kubeflow or Kubernetes Orchestrator running in a public cloud). -* if you outgrow what your local machine can offer in terms of storage space and need to use some form of private or public storage service that is shared with others -* if you are running pipelines at scale and need an Artifact Store that can handle the demands of production-grade MLOps - -In all these cases, you need an Artifact Store that is backed by a form of public cloud or self-hosted shared object storage service. - -You should use the Azure Artifact Store when you decide to keep your ZenML artifacts in a shared object storage and if you have access to the Azure Blob Storage managed service. You should consider one of the other [Artifact Store flavors](/develop/stack-components/artifact-stores#artifact-store-flavors) if you don't have access to the Azure Blob Storage service. - -### How do you deploy it? - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including an Azure Artifact Store? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML Azure Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -The Azure Artifact Store flavor is provided by the Azure ZenML integration; you need to install it on your local machine to be able to register an Azure Artifact Store and add it to your stack: - -```bash -zenml integration install azure -y -``` - -The only configuration parameter mandatory for registering an Azure Artifact Store is the root path URI, which needs to point to an Azure Blob Storage container and take the form `az://container-name` or `abfs://container-name`. Please read [the Azure Blob Storage documentation](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-portal) on how to configure an Azure Blob Storage container. - -With the URI to your Azure Blob Storage container known, registering an Azure Artifact Store can be done as follows: - -```bash -# Register the Azure artifact store -zenml artifact-store register az_store -f azure --path=az://container-name - -# Register and set a stack with the new artifact store -zenml stack register custom_stack -a az_store ... --set -``` - -Depending on your use case, however, you may also need to provide additional configuration parameters pertaining to [authentication](/develop/stack-components/artifact-stores/azure#authentication-methods) to match your deployment scenario. - -#### Authentication Methods - -Integrating and using an Azure Artifact Store in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Implicit Authentication_ method. However, the recommended way to authenticate to the Azure cloud platform is through [an Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector).
This is particularly useful if you are configuring ZenML stacks that combine the Azure Artifact Store with other remote stack components also running in Azure. - -You will need the following information to configure Azure credentials for ZenML, depending on which type of Azure credentials you want to use: - -* an Azure connection string -* an Azure account key -* the client ID, client secret and tenant ID of the Azure service principal - -For more information on how to retrieve information about your Azure Storage Account and Access Key or connection string, please refer to this [Azure guide](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python?tabs=environment-variable-windows#copy-your-credentials-from-the-azure-portal). - -For information on how to configure an Azure service principal, please consult the [Azure documentation](https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal). - - -This method uses the implicit Azure authentication available _in the environment where the ZenML code is running_. On your local machine, this is the quickest way to configure an Azure Artifact Store. You don't need to supply credentials explicitly when you register the Azure Artifact Store, instead, you have to set one of the following sets of environment variables: - -* to use [an Azure storage account key](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage) , set `AZURE_STORAGE_ACCOUNT_NAME` to your account name and one of `AZURE_STORAGE_ACCOUNT_KEY` or `AZURE_STORAGE_SAS_TOKEN` to the Azure key value. -* to use [an Azure storage account key connection string](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage) , set `AZURE_STORAGE_CONNECTION_STRING` to your Azure Storage Key connection string -* to use [Azure Service Principal credentials](https://learn.microsoft.com/en-us/azure/active-directory/develop/app-objects-and-service-principals) , [create an Azure Service Principal](https://learn.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal) and then set `AZURE_STORAGE_ACCOUNT_NAME` to your account name and `AZURE_STORAGE_CLIENT_ID` , `AZURE_STORAGE_CLIENT_SECRET` and `AZURE_STORAGE_TENANT_ID` to the client ID, secret and tenant ID of your service principal - -Certain dashboard functionality, such as visualizing or deleting artifacts, is not available when using an implicitly authenticated artifact store together with a deployed ZenML server because the ZenML server will not have permission to access the filesystem. - - -The implicit authentication method also needs to be coordinated with other stack components that are highly dependent on the Artifact Store and need to interact with it directly to the function. 
If these components are not running on your machine, they do not have access to the local environment variables and will encounter authentication failures while trying to access the Azure Artifact Store: - -* [Orchestrators](/develop/stack-components/orchestrators) need to access the Artifact Store to manage pipeline artifacts -* [Step Operators](/develop/stack-components/step-operators) need to access the Artifact Store to manage step-level artifacts -* [Model Deployers](/develop/stack-components/model-deployers) need to access the Artifact Store to load served models - -To enable these use cases, it is recommended to use [an Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector) to link your Azure Artifact Store to the remote Azure Blob storage container. - - - -To set up the Azure Artifact Store to authenticate to Azure and access an Azure Blob storage container, it is recommended to leverage the many features provided by [the Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector) such as auto-configuration, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components. - -If you don't already have an Azure Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure an Azure Service Connector that can be used to access more than one Azure blob storage container or even more than one type of Azure resource: - -```bash -zenml service-connector register --type azure -i -``` - -A non-interactive CLI example that uses [Azure Service Principal credentials](https://learn.microsoft.com/en-us/azure/active-directory/develop/app-objects-and-service-principals) to configure an Azure Service Connector targeting a single Azure Blob storage container is: - -```bash -zenml service-connector register --type azure --auth-method service-principal --tenant_id= --client_id= --client_secret= --resource-type blob-container --resource-id -``` - -```bash Example Command Output -$ zenml service-connector register azure-blob-demo --type azure --auth-method service-principal --tenant_id=a79f3633-8f45-4a74-a42e-68871c17b7fb --client_id=8926254a-8c3f-430a-a2fd-bdab234d491e --client_secret=AzureSuperSecret --resource-type blob-container --resource-id az://demo-zenmlartifactstore - -Successfully registered service connector `azure-blob-demo` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────┼──────────────────────────────┨ - -┃ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -> **Note**: Please remember to grant the Azure service principal permissions to read and write to your Azure Blob storage container as well as to list accessible storage accounts and Blob containers. For a full list of permissions required to use an AWS Service Connector to access one or more S3 buckets, please refer to the [Azure Service Connector Blob storage container resource type documentation](/develop/usage/resource-data-management/auth-management/azure-service-connector#azure-blob-storage-container) or read the documentation available in the interactive CLI commands and dashboard. 
The Azure Service Connector supports [many different authentication methods](/develop/usage/resource-data-management/auth-management/azure-service-connector#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use-case. - -If you already have one or more Azure Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the Azure Blob storage container you want to use for your Azure Artifact Store by running e.g.: - -```bash -zenml service-connector list-resources --resource-type blob-container -``` - -```bash Example Command Output -The following 'blob-container' resources can be accessed by service connectors: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────────┼────────────────┼───────────────────┼──────────────────────────────┨ - -┃ 273d2812-2643-4446-82e6-6098b8ccdaa4 │ azure-service-principal │ 🇦 azure │ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┠──────────────────────────────────────┼─────────────────────────┼────────────────┼───────────────────┼──────────────────────────────┨ - -┃ f6b329e1-00f7-4392-94c9-264119e672d0 │ azure-blob-demo │ 🇦 azure │ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on an Azure Service Connector to use to connect to the target Azure Blob storage container, you can register the Azure Artifact Store as follows: - -```bash -# Register the Azure artifact-store and reference the target blob storage container -zenml artifact-store register -f azure \ - - --path='az://your-container' - -# Connect the Azure artifact-store to the target container via an Azure Service Connector -zenml artifact-store connect -i -``` - -A non-interactive version that connects the Azure Artifact Store to a target blob storage container through an Azure Service Connector: - -```bash -zenml artifact-store connect --connector -``` - -```bash Example Command Output -$ zenml artifact-store connect azure-blob-demo --connector azure-blob-demo - -Successfully connected artifact store `azure-blob-demo` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────┼────────────────┼───────────────────┼──────────────────────────────┨ - -┃ f6b329e1-00f7-4392-94c9-264119e672d0 │ azure-blob-demo │ 🇦 azure │ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the Azure Artifact Store in a ZenML Stack: - -```bash -# Register and set a stack with the new artifact store -zenml stack register -a ... 
--set -``` - - -When you register the Azure Artifact Store, you can create a [ZenML Secret](/develop/usage/project-setup/use-secrets/secret-management) to store a variety of Azure credentials and then reference it in the Artifact Store configuration: - -* to use [an Azure storage account key](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage) , set `account_name` to your account name and one of `account_key` or `sas_token` to the Azure key or SAS token value as attributes in the ZenML secret -* to use [an Azure storage account key connection string](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage) , configure the `connection_string` attribute in the ZenML secret to your Azure Storage Key connection string -* to use [Azure Service Principal credentials](https://learn.microsoft.com/en-us/azure/active-directory/develop/app-objects-and-service-principals) , [create an Azure Service Principal](https://learn.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal) and then set `account_name` to your account name and `client_id`, `client_secret` and `tenant_id` to the client ID, secret and tenant ID of your service principal in the ZenML secret - -This method has some advantages over the implicit authentication method: - -* you don't need to install and configure the Azure CLI on your host -* you don't need to care about enabling your other stack components (orchestrators, step operators and model deployers) to have access to the artifact store through Azure Managed Identities -* you can combine the Azure artifact store with other stack components that are not running in Azure - -Configuring Azure credentials in a ZenML secret and then referencing them in the Artifact Store configuration could look like this: - -```bash -# Store the Azure storage account key in a ZenML secret -zenml secret create az_secret \ - --account_name='' \ - --account_key='' - -# or if you want to use a connection string -zenml secret create az_secret \ - --connection_string='' - -# or if you want to use Azure ServicePrincipal credentials -zenml secret create az_secret \ - --account_name='' \ - --tenant_id='' \ - --client_id='' \ - --client_secret='' - -# Alternatively for providing key-value pairs, you can utilize the '--values' option by specifying a file path containing -# key-value pairs in either JSON or YAML format. -# File content example: {"account_name":"",...} -zenml secret create az_secret \ - --values=@path/to/file.txt - -# Register the Azure artifact store and reference the ZenML secret -zenml artifact-store register az_store -f azure \ - --path='az://your-container' \ - --authentication_secret=az_secret - -# Register and set a stack with the new artifact store -zenml stack register custom_stack -a az_store ... --set -``` - - - - -For more, up-to-date information on the Azure Artifact Store implementation and its configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-azure/#zenml.integrations.azure.artifact%5Fstores) . - -### How do you use it? - -Aside from the fact that the artifacts are stored in Azure Blob Storage, using the Azure Artifact Store is no different from [using any other flavor of Artifact Store](/develop/stack-components/artifact-stores#how-to-use-it). 
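As a quick sanity check of the new stack, a minimal sketch along the following lines (hypothetical step and pipeline names) should result in the step output being materialized into the configured Azure Blob Storage container rather than on your local filesystem:

```py
from zenml import pipeline, step


@step
def produce_greeting() -> str:
    # The returned value is serialized by its materializer and written
    # to the active Artifact Store, i.e. the registered az:// container.
    return "Hello from the Azure Artifact Store!"


@pipeline
def azure_artifact_demo_pipeline():
    produce_greeting()


if __name__ == "__main__":
    azure_artifact_demo_pipeline()
```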
- diff --git a/develop/stack-components/artifact-stores/custom.mdx b/develop/stack-components/artifact-stores/custom.mdx deleted file mode 100644 index da067d8..0000000 --- a/develop/stack-components/artifact-stores/custom.mdx +++ /dev/null @@ -1,187 +0,0 @@ ---- -title: "Develop a custom artifact store" -description: "Learning how to develop a custom artifact store." -icon: sim-card ---- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - -ZenML comes equipped with [Artifact Store implementations](/develop/stack-components/artifact-stores#artifact-store-flavors) that you can use to store artifacts on a local filesystem or in the managed AWS, GCP, or Azure cloud object storage services. However, if you need to use a different type of object storage service as a backend for your ZenML Artifact Store, you can extend ZenML to provide your own custom Artifact Store implementation. - -### Base Abstraction - -The Artifact Store establishes one of the main components in every ZenML stack. Now, let us take a deeper dive into the fundamentals behind its abstraction, namely [the BaseArtifactStore class](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-artifact%5Fstores/#zenml.artifact%5Fstores.base%5Fartifact%5Fstore.BaseArtifactStore): - -1. As ZenML only supports filesystem-based artifact stores, it features a configuration parameter called `path`, which will indicate the root path of the artifact store. When registering an artifact store, users will have to define this parameter. -2. Moreover, there is another variable in the config class called `SUPPORTED_SCHEMES`. This is a class variable that needs to be defined in every subclass of the base artifact store configuration. It indicates the supported file path schemes for the corresponding implementation. For instance, for the Azure artifact store, this set will be defined as `{"abfs://", "az://"}`. -3. Lastly, the base class features a set of `abstractmethod`s: `open`, `copyfile`,`exists`,`glob`,`isdir`,`listdir` ,`makedirs`,`mkdir`,`remove`, `rename`,`rmtree`,`stat`,`walk`. In the implementation of every `ArtifactStore` flavor, it is required to define these methods with respect to the flavor at hand. 
Putting all these considerations together, we end up with the following implementation: - -```py -from abc import abstractmethod -from typing import Any, Callable, ClassVar, Iterable, List, Optional, Set, Tuple, Type, Union - -from zenml.enums import StackComponentType -from zenml.stack import Flavor, StackComponent, StackComponentConfig - -PathType = Union[bytes, str] - - -class BaseArtifactStoreConfig(StackComponentConfig): - """Config class for `BaseArtifactStore`.""" - - path: str - - SUPPORTED_SCHEMES: ClassVar[Set[str]] - - -class BaseArtifactStore(StackComponent): - """Base class for all ZenML artifact stores.""" - - @abstractmethod - def open(self, name: PathType, mode: str = "r") -> Any: - """Open a file at the given path.""" - - @abstractmethod - def copyfile( - self, src: PathType, dst: PathType, overwrite: bool = False - ) -> None: - """Copy a file from the source to the destination.""" - - @abstractmethod - def exists(self, path: PathType) -> bool: - """Returns `True` if the given path exists.""" - - @abstractmethod - def glob(self, pattern: PathType) -> List[PathType]: - """Return the paths that match a glob pattern.""" - - @abstractmethod - def isdir(self, path: PathType) -> bool: - """Returns whether the given path points to a directory.""" - - @abstractmethod - def listdir(self, path: PathType) -> List[PathType]: - """Returns a list of files under a given directory in the filesystem.""" - - @abstractmethod - def makedirs(self, path: PathType) -> None: - """Make a directory at the given path, recursively creating parents.""" - - @abstractmethod - def mkdir(self, path: PathType) -> None: - """Make a directory at the given path; parent directory must exist.""" - - @abstractmethod - def remove(self, path: PathType) -> None: - """Remove the file at the given path. Dangerous operation.""" - - @abstractmethod - def rename( - self, src: PathType, dst: PathType, overwrite: bool = False - ) -> None: - """Rename source file to destination file.""" - - @abstractmethod - def rmtree(self, path: PathType) -> None: - """Deletes dir recursively. Dangerous operation.""" - - @abstractmethod - def stat(self, path: PathType) -> Any: - """Return the stat descriptor for a given file path.""" - - @abstractmethod - def walk( - self, - top: PathType, - topdown: bool = True, - onerror: Optional[Callable[..., None]] = None, - ) -> Iterable[Tuple[PathType, List[PathType], List[PathType]]]: - """Return an iterator that walks the contents of the given directory.""" - - -class BaseArtifactStoreFlavor(Flavor): - """Base class for artifact store flavors.""" - - @property - @abstractmethod - def name(self) -> str: - """Returns the name of the flavor.""" - - @property - def type(self) -> StackComponentType: - """Returns the flavor type.""" - return StackComponentType.ARTIFACT_STORE - - @property - def config_class(self) -> Type[StackComponentConfig]: - """Config class.""" - return BaseArtifactStoreConfig - - @property - @abstractmethod - def implementation_class(self) -> Type["BaseArtifactStore"]: - """Implementation class.""" -``` - - - This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. In order to see the full implementation and get the complete docstrings, please check the [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-artifact%5Fstores/#zenml.artifact%5Fstores.base%5Fartifact%5Fstore.BaseArtifactStore) .
- - -**The effect on the** `**zenml.io.fileio**` - -If you created an instance of an artifact store, added it to your stack, and activated the stack, it will create a filesystem each time you run a ZenML pipeline and make it available to the `zenml.io.fileio` module. - -This means that when you utilize a method such as `fileio.open(...)` with a file path that starts with one of the `SUPPORTED_SCHEMES` within your steps or materializers, it will be able to use the `open(...)` method that you defined within your artifact store. - -### Build your own custom artifact store - -If you want to implement your own custom Artifact Store, you can follow the following steps: - -1. Create a class that inherits from [the BaseArtifactStore class](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-artifact%5Fstores/#zenml.artifact%5Fstores.base%5Fartifact%5Fstore.BaseArtifactStore) and implements the abstract methods. -2. Create a class that inherits from [the BaseArtifactStoreConfig class](/develop/stack-components/artifact-stores/custom) and fill in the `SUPPORTED_SCHEMES` based on your file system. -3. Bring both of these classes together by inheriting from [the BaseArtifactStoreFlavor class](/develop/stack-components/artifact-stores/custom). - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml artifact-store flavor register -``` - -For example, if your flavor class `MyArtifactStoreFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml artifact-store flavor register flavors.my_flavor.MyArtifactStoreFlavor -``` - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. - - -Afterward, you should see the new custom artifact store flavor in the list of available artifact store flavors: - -```bash -zenml artifact-store flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. - -* The **CustomArtifactStoreFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomArtifactStoreConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` objects are inherently `pydantic` objects, you can also add your own custom validators here. -* The **CustomArtifactStore** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomArtifactStoreFlavor` and the `CustomArtifactStoreConfig` are implemented in a different module/path than the actual `CustomArtifactStore`). 
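To make these moving parts concrete, here is a minimal skeleton of such a custom flavor; the `myfs://` scheme and all class names are hypothetical, and the IO methods are elided, so a working store would still need to implement every abstract method listed above:

```py
from typing import ClassVar, Set, Type

from zenml.artifact_stores import (
    BaseArtifactStore,
    BaseArtifactStoreConfig,
    BaseArtifactStoreFlavor,
)


class MyArtifactStoreConfig(BaseArtifactStoreConfig):
    """Config for the hypothetical custom artifact store."""

    SUPPORTED_SCHEMES: ClassVar[Set[str]] = {"myfs://"}


class MyArtifactStore(BaseArtifactStore):
    """Custom artifact store backed by the hypothetical `myfs://` backend.

    Every abstract IO method of `BaseArtifactStore` (`open`, `copyfile`,
    `exists`, `glob`, `isdir`, `listdir`, `makedirs`, `mkdir`, `remove`,
    `rename`, `rmtree`, `stat`, `walk`) must be implemented here.
    """


class MyArtifactStoreFlavor(BaseArtifactStoreFlavor):
    """Flavor that ties the config and implementation classes together."""

    @property
    def name(self) -> str:
        return "my_artifact_store"

    @property
    def config_class(self) -> Type[BaseArtifactStoreConfig]:
        return MyArtifactStoreConfig

    @property
    def implementation_class(self) -> Type[BaseArtifactStore]:
        return MyArtifactStore
```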
- - -#### Enabling Artifact Visualizations with Custom Artifact Stores - -ZenML automatically saves visualizations for many common data types and allows you to view these visualizations in the ZenML dashboard. Under the hood, this works by saving the visualizations together with the artifacts in the artifact store. - -In order to load and display these visualizations, ZenML needs to be able to load and access the corresponding artifact store. This means that your custom artifact store needs to be configured in a way that allows authenticating to the back-end without relying on the local environment, e.g., by embedding the authentication credentials in the stack component configuration or by referencing a secret. - -Furthermore, for deployed ZenML instances, you need to install the package dependencies of your artifact store implementation in the environment where you have deployed ZenML. See the [Documentation on deploying ZenML with custom Docker images](/develop/getting-started/deploying-zenml/deploy-with-custom-image) for more information on how to do that. - diff --git a/develop/stack-components/artifact-stores/gcp.mdx b/develop/stack-components/artifact-stores/gcp.mdx deleted file mode 100644 index 393a246..0000000 --- a/develop/stack-components/artifact-stores/gcp.mdx +++ /dev/null @@ -1,223 +0,0 @@ ---- -title: "Google Cloud Storage (GCS)" -description: "Storing artifacts using GCP Cloud Storage." -icon: database ---- - -The GCS Artifact Store is an [Artifact Store](/develop/stack-components/artifact-stores) flavor provided with the GCP ZenML integration that uses [the Google Cloud Storage managed object storage service](https://cloud.google.com/storage/docs/introduction) to store ZenML artifacts in a GCP Cloud Storage bucket. - -### When would you want to use it? - -Running ZenML pipelines with [the local Artifact Store](/develop/stack-components/artifact-stores/local) is usually sufficient if you just want to evaluate ZenML or get started quickly without incurring the trouble and the cost of employing cloud storage services in your stack. However, the local Artifact Store becomes insufficient or unsuitable if you have more elaborate needs for your project: - -* if you want to share your pipeline run results with other team members or stakeholders inside or outside your organization -* if you have other components in your stack that are running remotely (e.g. a Kubeflow or Kubernetes Orchestrator running in a public cloud). -* if you outgrow what your local machine can offer in terms of storage space and need to use some form of private or public storage service that is shared with others -* if you are running pipelines at scale and need an Artifact Store that can handle the demands of production-grade MLOps - -In all these cases, you need an Artifact Store that is backed by a form of public cloud or self-hosted shared object storage service. - -You should use the GCS Artifact Store when you decide to keep your ZenML artifacts in a shared object storage and if you have access to the Google Cloud Storage managed service. You should consider one of the other [Artifact Store flavors](/develop/stack-components/artifact-stores#artifact-store-flavors) if you don't have access to the GCP Cloud Storage service. - -### How do you deploy it? - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including a GCS Artifact Store? 
Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML GCP Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -The GCS Artifact Store flavor is provided by the GCP ZenML integration, you need to install it on your local machine to be able to register a GCS Artifact Store and add it to your stack: - -```bash -zenml integration install gcp -y -``` - -The only configuration parameter mandatory for registering a GCS Artifact Store is the root path URI, which needs to point to a GCS bucket and take the form `gs://bucket-name`. Please read [the Google Cloud Storage documentation](https://cloud.google.com/storage/docs/creating-buckets) on how to configure a GCS bucket. - -With the URI to your GCS bucket known, registering a GCS Artifact Store can be done as follows: - -```bash -# Register the GCS artifact store -zenml artifact-store register gs_store -f gcp --path=gs://bucket-name - -# Register and set a stack with the new artifact store -zenml stack register custom_stack -a gs_store ... --set -``` - -Depending on your use case, however, you may also need to provide additional configuration parameters pertaining to [authentication](/develop/stack-components/artifact-stores/gcp#authentication-methods) to match your deployment scenario. - -#### Infrastructure Deployment - -A GCS Artifact Store can be deployed directly from the ZenML CLI: - -```bash -zenml artifact-store deploy gcs_artifact_store --flavor=gcp --provider=gcp ... -``` - -You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to work use the CLI for this, please refer to the [dedicated documentation section](/develop/stack-components/stack-deployment). - -#### Authentication Methods - -Integrating and using a GCS Artifact Store in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Implicit Authentication_ method. However, the recommended way to authenticate to the GCP cloud platform is through [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector). This is particularly useful if you are configuring ZenML stacks that combine the GCS Artifact Store with other remote stack components also running in GCP. - - -This method uses the implicit GCP authentication available _in the environment where the ZenML code is running_. On your local machine, this is the quickest way to configure a GCS Artifact Store. You don't need to supply credentials explicitly when you register the GCS Artifact Store, as it leverages the local credentials and configuration that the Google Cloud CLI stores on your local machine. However, you will need to install and set up the Google Cloud CLI on your machine as a prerequisite, as covered in [the Google Cloud documentation](https://cloud.google.com/sdk/docs/install-sdk) , before you register the GCS Artifact Store. 
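For reference, the local credentials that the implicit authentication method picks up are typically created along these lines (the project ID is a placeholder):

```bash
# Log in and create application-default credentials on the local machine
gcloud auth login
gcloud auth application-default login

# Point the gcloud CLI at the project that owns the GCS bucket
gcloud config set project <YOUR_GCP_PROJECT_ID>
```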
- -Certain dashboard functionality, such as visualizing or deleting artifacts, is not available when using an implicitly authenticated artifact store together with a deployed ZenML server because the ZenML server will not have permission to access the filesystem. - - -The implicit authentication method also needs to be coordinated with other stack components that are highly dependent on the Artifact Store and need to interact with it directly to the function. If these components are not running on your machine, they do not have access to the local Google Cloud CLI configuration and will encounter authentication failures while trying to access the GCS Artifact Store: - -* [Orchestrators](/develop/stack-components/orchestrators) need to access the Artifact Store to manage pipeline artifacts -* [Step Operators](/develop/stack-components/step-operators) need to access the Artifact Store to manage step-level artifacts -* [Model Deployers](/develop/stack-components/model-deployers) need to access the Artifact Store to load served models - -To enable these use cases, it is recommended to use [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) to link your GCS Artifact Store to the remote GCS bucket. - - - -To set up the GCS Artifact Store to authenticate to GCP and access a GCS bucket, it is recommended to leverage the many features provided by [the GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) such as auto-configuration, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components. - -If you don't already have a GCP Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure a GCP Service Connector that can be used to access more than one GCS bucket or even more than one type of GCP resource: - -```bash -zenml service-connector register --type gcp -i -``` - -A non-interactive CLI example that leverages [the Google Cloud CLI configuration](https://cloud.google.com/sdk/docs/install-sdk) on your local machine to auto-configure a GCP Service Connector targeting a single GCS bucket is: - -```bash -zenml service-connector register --type gcp --resource-type gcs-bucket --resource-name --auto-configure -``` - -```bash Example Command Output -$ zenml service-connector register gcs-zenml-bucket-sl --type gcp --resource-type gcs-bucket --resource-id gs://zenml-bucket-sl --auto-configure - -⠸ Registering service connector 'gcs-zenml-bucket-sl'... - -Successfully registered service connector `gcs-zenml-bucket-sl` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼──────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -> **Note**: Please remember to grant the entity associated with your GCP credentials permissions to read and write to your GCS bucket as well as to list accessible GCS buckets. For a full list of permissions required to use a GCP Service Connector to access one or more GCS buckets, please refer to the [GCP Service Connector GCS bucket resource type documentation](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcs-bucket) or read the documentation available in the interactive CLI commands and dashboard. 
The GCP Service Connector supports [many different authentication methods](/develop/usage/resource-data-management/auth-management/gcp-service-connector#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use case. - -If you already have one or more GCP Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the GCS bucket you want to use for your GCS Artifact Store by running e.g.: - -```bash -zenml service-connector list-resources --resource-type gcs-bucket -``` - -```bash Example Command Output -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────┼────────────────┼───────────────┼─────────────────────────────────────────────────┨ - -┃ 7f0c69ba-9424-40ae-8ea6-04f35c2eba9d │ gcp-user-account │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ │ │ │ gs://zenml-core.appspot.com ┃ - -┃ │ │ │ │ gs://zenml-core_cloudbuild ┃ - -┃ │ │ │ │ gs://zenml-datasets ┃ - -┃ │ │ │ │ gs://zenml-internal-artifact-store ┃ - -┃ │ │ │ │ gs://zenml-kubeflow-artifact-store ┃ - -┠──────────────────────────────────────┼─────────────────────┼────────────────┼───────────────┼─────────────────────────────────────────────────┨ - -┃ 2a0bec1b-9787-4bd7-8d4a-9a47b6f61643 │ gcs-zenml-bucket-sl │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on a GCP Service Connector to use to connect to the target GCS bucket, you can register the GCS Artifact Store as follows: - -```bash -# Register the GCS artifact-store and reference the target GCS bucket -zenml artifact-store register -f gcp \ - - --path='gs://your-bucket' - -# Connect the GCS artifact-store to the target bucket via a GCP Service Connector -zenml artifact-store connect -i -``` - -A non-interactive version that connects the GCS Artifact Store to a target GCP Service Connector: - -```bash -zenml artifact-store connect --connector -``` - -```bash Example Command Output - -$ zenml artifact-store connect gcs-zenml-bucket-sl --connector gcs-zenml-bucket-sl - -Successfully connected artifact store `gcs-zenml-bucket-sl` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────┼────────────────┼───────────────┼──────────────────────┨ - -┃ 2a0bec1b-9787-4bd7-8d4a-9a47b6f61643 │ gcs-zenml-bucket-sl │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the GCS Artifact Store in a ZenML Stack: - -```bash -# Register and set a stack with the new artifact store -zenml stack register -a ... 
--set -``` - - -When you register the GCS Artifact Store, you can [generate a GCP Service Account Key](https://cloud.google.com/docs/authentication/application-default-credentials#attached-sa), store it in a [ZenML Secret](/develop/usage/project-setup/use-secrets/secret-management) and then reference it in the Artifact Store configuration. - -This method has some advantages over the implicit authentication method: - -* you don't need to install and configure the GCP CLI on your host -* you don't need to care about enabling your other stack components (orchestrators, step operators and model deployers) to have access to the artifact store through GCP Service Accounts and Workload Identity -* you can combine the GCS artifact store with other stack components that are not running in GCP - -For this method, you need to [create a user-managed GCP service account](https://cloud.google.com/iam/docs/service-accounts-create), grant it privileges to read and write to your GCS bucket (i.e. use the `Storage Object Admin` role) and then [create a service account key](https://cloud.google.com/iam/docs/keys-create-delete#creating). - -With the service account key downloaded to a local file, you can register a ZenML secret and reference it in the GCS Artifact Store configuration as follows: - - -```bash -# Store the GCP credentials in a ZenML -zenml secret create gcp_secret \ - --token=@path/to/service_account_key.json - -# Register the GCS artifact store and reference the ZenML secret -zenml artifact-store register gcs_store -f gcp \ - --path='gs://your-bucket' \ - --authentication_secret=gcp_secret - -# Register and set a stack with the new artifact store -zenml stack register custom_stack -a gs_store ... --set - -``` - - - - -For more, up-to-date information on the GCS Artifact Store implementation and its configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-gcp/#zenml.integrations.gcp.artifact%5Fstores.gcp%5Fartifact%5Fstore) . - -### How do you use it? - -Aside from the fact that the artifacts are stored in GCP Cloud Storage, using the GCS Artifact Store is no different from [using any other flavor of Artifact Store](/develop/stack-components/artifact-stores#how-to-use-it). - diff --git a/develop/stack-components/artifact-stores/local.mdx b/develop/stack-components/artifact-stores/local.mdx deleted file mode 100644 index e992b48..0000000 --- a/develop/stack-components/artifact-stores/local.mdx +++ /dev/null @@ -1,82 +0,0 @@ ---- -title: "Local Artifact Store" -description: "Storing artifacts on your local filesystem." -icon: file ---- - -The local Artifact Store is a built-in ZenML [Artifact Store](/develop/stack-components/artifact-stores) flavor that uses a folder on your local filesystem to store artifacts. - -### When would you want to use it? - -The local Artifact Store is a great way to get started with ZenML, as it doesn't require you to provision additional local resources or to interact with managed object-store services like Amazon S3 and Google Cloud Storage. All you need is the local filesystem. You should use the local Artifact Store if you're just evaluating or getting started with ZenML, or if you are still in the experimental phase and don't need to share your pipeline artifacts (dataset, models, etc.) with others. - - - The local Artifact Store is not meant to be utilized in production. The local filesystem cannot be shared across your team and the artifacts stored in it cannot be accessed from other machines. 
This also means that [artifact visualizations](/develop/usage/resource-data-management/visualize-artifacts) will not be available when using a local Artifact Store through a [ZenML instance deployed in the cloud](/develop/getting-started/deploying-zenml). - -Furthermore, the local Artifact Store doesn't cover services like high-availability, scalability, backup and restore and other features that are expected from a production grade MLOps system. - -The fact that it stores artifacts on your local filesystem also means that not all stack components can be used in the same stack as a local Artifact Store: - -* only [Orchestrators](/develop/stack-components/orchestrators) running on the local machine, such as the [local Orchestrator](/develop/stack-components/orchestrators/local), a [local Kubeflow Orchestrator](/develop/stack-components/orchestrators/kubeflow), or a [local Kubernetes Orchestrator](/develop/stack-components/orchestrators/kubernetes) can be combined with a local Artifact Store -* only [Model Deployers](/develop/stack-components/model-deployers) that are running locally, such as the [MLflow Model Deployer](/develop/stack-components/model-deployers/mlflow), can be used in combination with a local Artifact Store -* [Step Operators](/develop/stack-components/step-operators): none of the Step Operators can be used in the same stack as a local Artifact Store, given that their very purpose is to run ZenML steps in remote specialized environments - -As you transition to a team setting or a production setting, you can replace the local Artifact Store in your stack with one of the other flavors that are better suited for these purposes, with no changes required in your code. - - -### How do you deploy it? - -The `default` stack that comes pre-configured with ZenML already contains a local Artifact Store: - -```bash -$ zenml stack list -Running without an active repository root. -Using the default local database. -┏━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┓ -┃ ACTIVE │ STACK NAME │ ARTIFACT_STORE │ ORCHESTRATOR ┃ -┠────────┼────────────┼────────────────┼──────────────┨ -┃ 👉 │ default │ default │ default ┃ -┗━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┛ - -$ zenml artifact-store describe -Running without an active repository root. -Using the default local database. -Running with active stack: 'default' -No component name given; using `default` from active stack. 
- ARTIFACT_STORE Component Configuration (ACTIVE) -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ COMPONENT_PROPERTY │ VALUE ┃ -┠────────────────────┼──────────────────────────────────────────────────────────────────────────────┨ -┃ TYPE │ artifact_store ┃ -┠────────────────────┼──────────────────────────────────────────────────────────────────────────────┨ -┃ FLAVOR │ local ┃ -┠────────────────────┼──────────────────────────────────────────────────────────────────────────────┨ -┃ NAME │ default ┃ -┠────────────────────┼──────────────────────────────────────────────────────────────────────────────┨ -┃ UUID │ 2b7773eb-d371-4f24-96f1-fad15e74fd6e ┃ -┠────────────────────┼──────────────────────────────────────────────────────────────────────────────┨ -┃ PATH │ /home/stefan/.config/zenml/local_stores/2b7773eb-d371-4f24-96f1-fad15e74fd6e ┃ -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -As shown by the `PATH` value in the `zenml artifact-store describe` output, the artifacts are stored inside a folder on your local filesystem. - -You can create additional instances of local Artifact Stores and use them in your stacks as you see fit, e.g.: - -```bash -# Register the local artifact store -zenml artifact-store register custom_local --flavor local - -# Register and set a stack with the new artifact store -zenml stack register custom_stack -o default -a custom_local --set -``` - - - Same as all other Artifact Store flavors, the local Artifact Store does take in a `path` configuration parameter that can be set during registration to point to a custom path on your machine. However, it is highly recommended that you rely on the default `path` value, otherwise, it may lead to unexpected results. Other local stack components depend on the convention used for the default path to be able to access the local Artifact Store. - -For more, up-to-date information on the local Artifact Store implementation and its configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-artifact%5Fstores/#zenml.artifact%5Fstores.local%5Fartifact%5Fstore) . - -### How do you use it? - -Aside from the fact that the artifacts are stored locally, using the local Artifact Store is no different from [using any other flavor of Artifact Store](/develop/stack-components/artifact-stores#how-to-use-it). - diff --git a/develop/stack-components/artifact-stores/s3.mdx b/develop/stack-components/artifact-stores/s3.mdx deleted file mode 100644 index e6e4fe6..0000000 --- a/develop/stack-components/artifact-stores/s3.mdx +++ /dev/null @@ -1,230 +0,0 @@ ---- -title: "Amazon Simple Cloud Storage (S3)" -description: "Storing artifacts in an AWS S3 bucket." -icon: amazon ---- - -The S3 Artifact Store is an [Artifact Store](/develop/stack-components/artifact-stores) flavor provided with the S3 ZenML integration that uses [the AWS S3 managed object storage service](https://aws.amazon.com/s3/) or one of the self-hosted S3 alternatives, such as [MinIO](https://min.io/) or [Ceph RGW](https://ceph.io/en/discover/technology/#object), to store artifacts in an S3 compatible object storage backend. - -### When would you want to use it? 
- -Running ZenML pipelines with [the local Artifact Store](/develop/stack-components/artifact-stores/local) is usually sufficient if you just want to evaluate ZenML or get started quickly without incurring the trouble and the cost of employing cloud storage services in your stack. However, the local Artifact Store becomes insufficient or unsuitable if you have more elaborate needs for your project: - -* if you want to share your pipeline run results with other team members or stakeholders inside or outside your organization -* if you have other components in your stack that are running remotely (e.g. a Kubeflow or Kubernetes Orchestrator running in a public cloud). -* if you outgrow what your local machine can offer in terms of storage space and need to use some form of private or public storage service that is shared with others -* if you are running pipelines at scale and need an Artifact Store that can handle the demands of production-grade MLOps - -In all these cases, you need an Artifact Store that is backed by a form of public cloud or self-hosted shared object storage service. - -You should use the S3 Artifact Store when you decide to keep your ZenML artifacts in a shared object storage and if you have access to the AWS S3 managed service or one of the S3 compatible alternatives (e.g. Minio, Ceph RGW). You should consider one of the other [Artifact Store flavors](/develop/stack-components/artifact-stores#artifact-store-flavors) if you don't have access to an S3-compatible service. - -### How do you deploy it? - - -Would you like to skip ahead and deploy a full ZenML cloud stack already, including an S3 Artifact Store? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML AWS Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -The S3 Artifact Store flavor is provided by the S3 ZenML integration, you need to install it on your local machine to be able to register an S3 Artifact Store and add it to your stack: - -```bash -zenml integration install s3 -y -``` - -The only configuration parameter mandatory for registering an S3 Artifact Store is the root path URI, which needs to point to an S3 bucket and take the form `s3://bucket-name`. Please read the documentation relevant to the S3 service that you are using on how to create an S3 bucket. For example, the AWS S3 documentation is available [here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html). - -With the URI to your S3 bucket known, registering an S3 Artifact Store and using it in a stack can be done as follows: - -```bash - -# Register the S3 artifact-store -zenml artifact-store register s3_store -f s3 --path=s3://bucket-name - -# Register and set a stack with the new artifact store -zenml stack register custom_stack -a s3_store ... --set -``` - -Depending on your use case, however, you may also need to provide additional configuration parameters pertaining to [authentication](/develop/stack-components/artifact-stores/s3#authentication-methods) or [pass advanced configuration parameters](/develop/stack-components/artifact-stores/s3#advanced-configuration) to match your S3-compatible service or deployment scenario. 
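-
-Before moving on, you may want to sanity-check the registration. The same `describe` commands shown earlier for the local Artifact Store also work here; the component and stack names below are simply the ones used in the example registration above:
-
-```bash
-# Inspect the newly registered S3 artifact store
-zenml artifact-store describe s3_store
-
-# Inspect the stack that uses it
-zenml stack describe custom_stack
-```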
- -#### Infrastructure Deployment - -An S3 Artifact Store can be deployed directly from the ZenML CLI: - -```bash -zenml artifact-store deploy s3-artifact-store --flavor=s3 --provider=aws ... -``` - -You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to use the CLI for this, please refer to the [dedicated documentation section](/develop/stack-components/stack-deployment). - -#### Authentication Methods - -Integrating and using an S3-compatible Artifact Store in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Implicit Authentication_ method. However, the recommended way to authenticate to the AWS cloud platform is through [an AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector). This is particularly useful if you are configuring ZenML stacks that combine the S3 Artifact Store with other remote stack components also running in AWS. - - -This method uses the implicit AWS authentication available _in the environment where the ZenML code is running_. On your local machine, this is the quickest way to configure an S3 Artifact Store. You don't need to supply credentials explicitly when you register the S3 Artifact Store, as it leverages the local credentials and configuration that the AWS CLI stores on your local machine. However, you will need to install and set up the AWS CLI on your machine as a prerequisite, as covered in [the AWS CLI documentation](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html), before you register the S3 Artifact Store. - -Certain dashboard functionality, such as visualizing or deleting artifacts, is not available when using an implicitly authenticated artifact store together with a deployed ZenML server, because the ZenML server will not have the credentials needed to access the S3 bucket. - -The implicit authentication method also needs to be coordinated with other stack components that are highly dependent on the Artifact Store and need to interact with it directly to work. If these components are not running on your machine, they do not have access to the local AWS CLI configuration and will encounter authentication failures while trying to access the S3 Artifact Store: - -* [Orchestrators](/develop/stack-components/orchestrators) need to access the Artifact Store to manage pipeline artifacts -* [Step Operators](/develop/stack-components/step-operators) need to access the Artifact Store to manage step-level artifacts -* [Model Deployers](/develop/stack-components/model-deployers) need to access the Artifact Store to load served models - -To enable these use cases, it is recommended to use [an AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector) to link your S3 Artifact Store to the remote S3 bucket. - - - To set up the S3 Artifact Store to authenticate to AWS and access an S3 bucket, it is recommended to leverage the many features provided by [the AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector), such as auto-configuration, best security practices regarding long-lived credentials and fine-grained access control, and reusing the same credentials across multiple stack components.
- -If you don't already have an AWS Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure an AWS Service Connector that can be used to access more than one S3 bucket or even more than one type of AWS resource: - -```bash -zenml service-connector register --type aws -i -``` - -A non-interactive CLI example that leverages [the AWS CLI configuration](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) on your local machine to auto-configure an AWS Service Connector targeting a single S3 bucket is: - -```bash -zenml service-connector register --type aws --resource-type s3-bucket --resource-name --auto-configure -``` - - -```bash Example Command Output -$ zenml service-connector register s3-zenfiles --type aws --resource-type s3-bucket --resource-id s3://zenfiles --auto-configure - -⠸ Registering service connector 's3-zenfiles'... - -Successfully registered service connector `s3-zenfiles` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -> **Note**: Please remember to grant the entity associated with your AWS credentials permissions to read and write to your S3 bucket as well as to list accessible S3 buckets. For a full list of permissions required to use an AWS Service Connector to access one or more S3 buckets, please refer to the [AWS Service Connector S3 bucket resource type documentation](/develop/usage/resource-data-management/auth-management/aws-service-connector#s3-bucket) or read the documentation available in the interactive CLI commands and dashboard. The AWS Service Connector supports [many different authentication methods](/develop/usage/resource-data-management/auth-management/aws-service-connector#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use case. 
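-
-As an optional sanity check, you can ask the Service Connector to verify its access to the bucket before wiring it into a stack. The connector name below is just the `s3-zenfiles` connector from the example above:
-
-```bash
-# Verify that the connector can reach the target S3 bucket
-zenml service-connector verify s3-zenfiles --resource-type s3-bucket
-```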
- -If you already have one or more AWS Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the S3 bucket you want to use for your S3 Artifact Store by running e.g.: - -```bash -zenml service-connector list-resources --resource-type s3-bucket -``` - -```bash Example Command Output - -The following 's3-bucket' resources can be accessed by service connectors that you have configured: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────┼────────────────────────────────────────────────┨ -┃ aeed6507-f94c-4329-8bc2-52b85cd8d94d │ aws-s3 │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────┼────────────────────────────────────────────────┨ -┃ 9a810521-ef41-4e45-bb48-8569c5943dc6 │ aws-implicit │ 🔶 aws │ 📦 s3-bucket │ s3://sagemaker-studio-907999144431-m11qlsdyqr8 ┃ -┃ │ │ │ │ s3://sagemaker-studio-d8a14tvjsmb ┃ -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────┼────────────────────────────────────────────────┨ -┃ 37c97fa0-fa47-4d55-9970-e2aa6e1b50cf │ aws-secret-key │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ -┃ │ │ │ │ s3://zenml-demos ┃ -┃ │ │ │ │ s3://zenml-generative-chat ┃ -┃ │ │ │ │ s3://zenml-public-datasets ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on an AWS Service Connector to use to connect to the target S3 bucket, you can register the S3 Artifact Store as follows: - -```bash -# Register the S3 artifact-store and reference the target S3 bucket -zenml artifact-store register -f s3 \ - --path='s3://your-bucket' - -# Connect the S3 artifact-store to the target bucket via an AWS Service Connector -zenml artifact-store connect -i -``` - -A non-interactive version that connects the S3 Artifact Store to a target S3 bucket through an AWS Service Connector: - -```bash -zenml artifact-store connect --connector -``` - -```bash Example Command Output -$ zenml artifact-store connect s3-zenfiles --connector s3-zenfiles - -Successfully connected artifact store `s3-zenfiles` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ c4ee3f0a-bc69-4c79-9a74-297b2dd47d50 │ s3-zenfiles │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the S3 Artifact Store in a ZenML Stack: - -```bash -# Register and set a stack with the new artifact store - -zenml stack register -a ... --set -``` - - -When you register the S3 Artifact Store, you can [generate an AWS access key](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key/), store it in a [ZenML Secret](/develop/usage/project-setup/use-secrets/secret-management) and then reference it in the Artifact Store configuration. 
- -This method has some advantages over the implicit authentication method: - -* you don't need to install and configure the AWS CLI on your host -* you don't need to care about enabling your other stack components (orchestrators, step operators, and model deployers) to have access to the artifact store through IAM roles and policies -* you can combine the S3 artifact store with other stack components that are not running in AWS - -> **Note**: When you create the IAM user for your AWS access key, please remember to grant the created IAM user permissions to read and write to your S3 bucket (i.e. at a minimum: `s3:PutObject`, `s3:GetObject`, `s3:ListBucket`, `s3:DeleteObject`) - -After having set up the IAM user and generated the access key, as described in the [AWS documentation](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key/), you can register the S3 Artifact Store as follows: - -```bash -# Store the AWS access key in a ZenML secret -zenml secret create s3_secret \ - --aws_access_key_id='' \ - --aws_secret_access_key='' - -# Register the S3 artifact-store and reference the ZenML secret -zenml artifact-store register s3_store -f s3 \ - --path='s3://your-bucket' \ - --authentication_secret=s3_secret - -# Register and set a stack with the new artifact store -zenml stack register custom_stack -a s3_store ... --set -``` - - - - -#### Advanced Configuration - -The S3 Artifact Store accepts a range of advanced configuration options that can be used to further customize how ZenML connects to the S3 storage service that you are using. These are accessible via the `client_kwargs`, `config_kwargs` and `s3_additional_kwargs` configuration attributes and are passed transparently to [the underlying S3Fs library](https://s3fs.readthedocs.io/en/latest/#s3-compatible-storage): - -* `client_kwargs`: arguments that will be transparently passed to [the botocore client](https://boto3.amazonaws.com/v1/documentation/api/latestversions/0.66.0/reference/core/session.html#boto3.session.Session.client) . You can use it to configure parameters like `endpoint_url` and `region_name` when connecting to an S3-compatible endpoint (e.g. Minio). -* `config_kwargs`: advanced parameters passed to [botocore.client.Config](https://botocore.amazonaws.com/v1/documentation/api/latestversions/0.66.0/reference/config.html). -* `s3_additional_kwargs`: advanced parameters that are used when calling S3 API, typically used for things like `ServerSideEncryption` and `ACL`. - -To include these advanced parameters in your Artifact Store configuration, pass them using JSON format during registration, e.g.: - -```bash -zenml artifact-store register minio_store -f s3 \ - --path='s3://minio_bucket' \ - --authentication_secret=s3_secret \ - --client_kwargs='{"endpoint_url": "http://minio.cluster.local:9000", "region_name": "us-east-1"}' -``` - -For more, up-to-date information on the S3 Artifact Store implementation and its configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-s3/#zenml.integrations.s3.artifact%5Fstores.s3%5Fartifact%5Fstore) . - -### How do you use it? - -Aside from the fact that the artifacts are stored in an S3 compatible backend, using the S3 Artifact Store is no different than [using any other flavor of Artifact Store](/develop/stack-components/artifact-stores#how-to-use-it). 
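-
-If you are curious where the artifacts physically end up after a pipeline run, you can simply list the bucket contents with the AWS CLI. This is just an illustrative check and assumes the bucket name used in the examples above and local credentials that can read it:
-
-```bash
-# List the objects that ZenML pipeline runs have written to the bucket
-aws s3 ls s3://your-bucket/ --recursive | head -n 20
-```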
- diff --git a/develop/stack-components/component-guide.mdx b/develop/stack-components/component-guide.mdx deleted file mode 100644 index 2003f14..0000000 --- a/develop/stack-components/component-guide.mdx +++ /dev/null @@ -1,118 +0,0 @@ ---- -title: "Overview" -description: "Overview of categories of MLOps components and third-party integrations." -icon: file-lines ---- - -If you are new to the world of MLOps, it is often daunting to be immediately faced with a sea of tools that seemingly all promise and do the same things. It is useful in this case to try to categorize tools in various groups in order to understand their value in your toolchain in a more precise manner. - -ZenML tackles this problem by introducing the concept of [Stacks and Stack Components](/develop/user-guide/guides/production-guide/understand-stacks). These stack components represent categories, each of which has a particular function in your MLOps pipeline. ZenML realizes these stack components as base abstractions that standardize the entire workflow for your team. In order to then realize the benefit, one can write a concrete implementation of the [abstraction](/develop/stack-components/stack-deployment/implement-a-custom-stack-component), or use one of the many built-in [integrations](/develop/stack-components/component-guide) that implement these abstractions for you. - -Here is a full list of all stack components currently supported in ZenML, with a description of the role of that component in the MLOps process: - - - - Orchestrating the runs of your pipeline - - - Storage for the artifacts created by your pipelines - - - Store for your containers - - - Data and model validation - - - Tracking your ML experiments - - Services/platforms responsible for online model serving - - - Execution of individual steps in specialized runtime environments - - - Sending alerts through specified channels - - Builds container images - - Labeling and annotating data - - - Manage and interact with ML Models - - - Management of your data/features - - - -Each pipeline run that you execute with ZenML will require a **stack** and each **stack** will be required to include at least an orchestrator and an artifact store. Apart from these two, the other components are optional and to be added as your pipeline evolves in MLOps maturity. - -## Writing custom component flavors - -You can take control of how ZenML behaves by creating your own components. This is done by writing custom component `flavors`. To learn more, head over to [the general guide on writing component flavors](/develop/stack-components/stack-deployment/implement-a-custom-stack-component), or read more specialized guides for specific component types (e.g. the [custom orchestrator guide](/develop/stack-components/orchestrators/custom)). - -## Integrations - -Categorizing the MLOps stack is a good way to write abstractions for an MLOps pipeline and standardize your processes. But ZenML goes further and also provides concrete implementations of these categories by **integrating** with various tools for each category. Once code is organized into a ZenML pipeline, you can supercharge your ML workflows with the best-in-class solutions from various MLOps areas. 
- -For example, you can orchestrate your ML pipeline workflows using [Airflow](/develop/stack-components/orchestrators/airflow) or [Kubeflow](/develop/stack-components/orchestrators/kubeflow), track experiments using [MLflow Tracking](/develop/stack-components/experiment-trackers/mlflow) or [Weights & Biases](/develop/stack-components/experiment-trackers/wandb), and transition seamlessly from a local [MLflow deployment](/develop/stack-components/model-deployers/mlflow) to a deployed model on Kubernetes using [Seldon Core](/develop/stack-components/model-deployers/seldon). - -There are lots of moving parts for all the MLOps tooling and infrastructure you require for ML in production, and ZenML brings them all together and enables you to manage them in one place. This also allows you to delay the decision of which MLOps tool to use in your stack as you have no vendor lock-in with ZenML and can easily switch out tools as soon as your requirements change. - - - - - - -### Available integrations - -We have a [dedicated webpage](https://zenml.io/integrations) that indexes all supported ZenML integrations and their categories. - -Another easy way of seeing a list of integrations is to browse the directories in the [integrations directory](https://github.com/zenml-io/zenml/tree/main/src/zenml/integrations) on our GitHub. - -### Installing ZenML integrations - -Before you can use integrations, you first need to install them using `zenml integration install`, e.g., you can install [Kubeflow](/develop/stack-components/orchestrators/kubeflow), [MLflow Tracking](/develop/stack-components/experiment-trackers/mlflow), and [Seldon Core](/develop/stack-components/model-deployers/seldon), using: - -```bash -zenml integration install kubeflow mlflow seldon -y -``` - -Under the hood, this simply installs the preferred versions of all integrations using pip, i.e., it executes in a sub-process call: - -```bash -pip install kubeflow== mlflow== seldon== -``` - - -* The `-y` flag confirms all `pip install` commands without asking you for confirmation. - -You can run `zenml integration --help` to see a full list of CLI commands that ZenML provides for interacting with integrations. - - -Note that you can also install your dependencies directly, but there is no guarantee that ZenML internals will work with any arbitrary version of any external library. - -#### Experimental: Use `uv` for package installation - -You can use [uv](https://github.com/astral-sh/uv) as a package manager if you want. Simply pass the `--uv` flag to the `zenml integration ...` command and it'll use `uv` for installation, upgrades and uninstallations. Note that `uv` must be installed for this to work. This is an experimental option that we've added for users wishing to use `uv`; since it is still relatively new as an option, there might be certain packages that don't work well with it. We will monitor how this performs and update as `uv` becomes more stable. - -### Upgrade ZenML integrations - -You can upgrade integrations to their latest possible versions using: - -```bash -zenml integration upgrade mlflow pytorch -y -``` - - -* The `-y` flag confirms all `pip install --upgrade` commands without asking you for confirmation. -* If no integrations are specified, all installed integrations will be upgraded. - - -### Help us with integrations! - -There are countless tools in the ML / MLOps field.
We have made an initial prioritization of which tools to support with integrations that are visible on our public [roadmap](https://zenml.io/roadmap). - -We also welcome community contributions. Check our [Contribution Guide](https://github.com/zenml-io/zenml/blob/main/CONTRIBUTING.md) and [External Integration Guide](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/README.md) for more details on how to best contribute to new integrations. - diff --git a/develop/stack-components/container-registries/aws.mdx b/develop/stack-components/container-registries/aws.mdx deleted file mode 100644 index 656cbb2..0000000 --- a/develop/stack-components/container-registries/aws.mdx +++ /dev/null @@ -1,230 +0,0 @@ ---- -title: "Amazon Elastic Container Registry (ECR)" -description: "Storing container images in Amazon ECR." -icon: aws ---- - -The AWS container registry is a [container registry](/develop/stack-components/container-registries) flavor provided with the ZenML `aws` integration and uses [Amazon ECR](https://aws.amazon.com/ecr/) to store container images. - -### When to use it - -You should use the AWS container registry if: - -* one or more components of your stack need to pull or push container images. -* you have access to AWS ECR. If you're not using AWS, take a look at the other [container registry flavors](/develop/stack-components/container-registries#container-registry-flavors). - -### How to deploy it - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including an AWS ECR container registry? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML AWS Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -The ECR registry is automatically activated once you create an AWS account. However, you'll need to create a `Repository` in order to push container images to it: - -* Go to the [ECR website](https://console.aws.amazon.com/ecr). -* Make sure the correct region is selected on the top right. -* Click on `Create repository`. -* Create a private repository. The name of the repository depends on the [orchestrator](/develop/stack-components/orchestrators) or [step operator](/develop/stack-components/step-operators) you're using in your stack. - -### URI format - -The AWS container registry URI should have the following format: - -```bash - -.dkr.ecr..amazonaws.com -# Examples: -123456789.dkr.ecr.eu-west-2.amazonaws.com -987654321.dkr.ecr.ap-south-1.amazonaws.com -135792468.dkr.ecr.af-south-1.amazonaws.com -``` - -To figure out the URI for your registry: - -* Go to the [AWS console](https://console.aws.amazon.com/) and click on your user account in the top right to see the `Account ID`. -* Go [here](https://docs.aws.amazon.com/general/latest/gr/rande.html#regional-endpoints) and choose the region in which you would like to store your container images. Make sure to choose a nearby region for faster access. -* Once you have both these values, fill in the values in this template `.dkr.ecr..amazonaws.com` to get your container registry URI. - -#### Infrastructure Deployment - -An AWS ECR Container Registry can be deployed directly from the ZenML CLI: - -```bash -zenml container-registry deploy ecr_container_registry --flavor=aws --provider=aws ... 
-``` - -You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to use the CLI for this, please refer to the [dedicated documentation section](/develop/stack-components/stack-deployment). - -### How to use it - -To use the AWS container registry, we need: - -* The ZenML `aws` integration installed. If you haven't done so, run - -```bash -zenml integration install aws -``` -* [Docker](https://www.docker.com) installed and running. -* The registry URI. Check out the [previous section](/develop/stack-components/container-registries/aws#how-to-deploy-it) on the URI format and how to get the URI for your registry. - -We can then register the container registry and use it in our active stack: - -```bash -zenml container-registry register \ - --flavor=aws \ - --uri= - -# Add the container registry to the active stack -zenml stack update -c -``` - -You also need to set up the [authentication](/develop/stack-components/container-registries/aws#authentication-methods) required to log in to the container registry. - -#### Authentication Methods - -Integrating and using an AWS Container Registry in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Local Authentication_ method. However, the recommended way to authenticate to the AWS cloud platform is through [an AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector). This is particularly useful if you are configuring ZenML stacks that combine the AWS Container Registry with other remote stack components also running in AWS. - - -This method uses the Docker client authentication available _in the environment where the ZenML code is running_. On your local machine, this is the quickest way to configure an AWS Container Registry. You don't need to supply credentials explicitly when you register the AWS Container Registry, as it leverages the local credentials and configuration that the AWS CLI and Docker client store on your local machine. However, you will need to install and set up the AWS CLI on your machine as a prerequisite, as covered in [the AWS CLI documentation](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html), before you register the AWS Container Registry. - -With the AWS CLI installed and set up with credentials, we'll need to log in to the container registry so Docker can pull and push images: - -```bash -# Fill your REGISTRY_URI and REGION in the placeholders in the following command. -# You can find the REGION as part of your REGISTRY_URI: `.dkr.ecr..amazonaws.com` -aws ecr get-login-password --region | docker login --username AWS --password-stdin -``` - - - Stacks using the AWS Container Registry set up with local authentication are not portable across environments. To make ZenML pipelines fully portable, it is recommended to use [an AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector) to link your AWS Container Registry to the remote ECR registry.
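-
-If you want to confirm that the login worked before running any pipelines, one option is to tag and push a small test image by hand. This is purely illustrative: the registry URI below is one of the example URIs from above, and `zenml-test` is a hypothetical repository that you would need to create in ECR first (ECR does not create repositories on push):
-
-```bash
-# Adapt this to your own registry URI
-REGISTRY_URI=123456789.dkr.ecr.eu-west-2.amazonaws.com
-
-# Tag and push a small test image to the pre-created `zenml-test` repository
-docker pull busybox
-docker tag busybox ${REGISTRY_URI}/zenml-test:latest
-docker push ${REGISTRY_URI}/zenml-test:latest
-```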
- - - -To set up the AWS Container Registry to authenticate to AWS and access an ECR registry, it is recommended to leverage the many features provided by [the AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector) such as auto-configuration, local login, best security practices regarding long-lived credentials and fine-grained access control and reusing the same credentials across multiple stack components. - -If you don't already have an AWS Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure an AWS Service Connector that can be used to access an ECR registry or even more than one type of AWS resource: - -```bash -zenml service-connector register --type aws -i -``` - -A non-interactive CLI example that leverages [the AWS CLI configuration](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) on your local machine to auto-configure an AWS Service Connector targeting an ECR registry is: - -```bash -zenml service-connector register --type aws --resource-type docker-registry --auto-configure -``` - -```bash Example Command Output -$ zenml service-connector register aws-us-east-1 --type aws --resource-type docker-registry --auto-configure - -⠸ Registering service connector 'aws-us-east-1'... - -Successfully registered service connector `aws-us-east-1` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -> **Note**: Please remember to grant the entity associated with your AWS credentials permissions to read and write to one or more ECR repositories as well as to list accessible ECR repositories. For a full list of permissions required to use an AWS Service Connector to access an ECR registry, please refer to the [AWS Service Connector ECR registry resource type documentation](/develop/usage/resource-data-management/auth-management/aws-service-connector#ecr-container-registry) or read the documentation available in the interactive CLI commands and dashboard. The AWS Service Connector supports [many different authentication methods](/develop/usage/resource-data-management/auth-management/aws-service-connector#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use case. 
- -If you already have one or more AWS Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the ECR registry you want to use for your AWS Container Registry by running e.g.: - -```bash -zenml service-connector list-resources --connector-type aws --resource-type docker-registry -``` - -```bash Example Command Output - -The following 'docker-registry' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────────┼────────────────┼────────────────────┼──────────────────────────────────────────────┨ - -┃ 37c97fa0-fa47-4d55-9970-e2aa6e1b50cf │ aws-secret-key │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┠──────────────────────────────────────┼─────────────────────────┼────────────────┼────────────────────┼──────────────────────────────────────────────┨ - -┃ d400e0c6-a8e7-4b95-ab34-0359229c5d36 │ aws-us-east-1 │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on an AWS Service Connector to use to connect to the target ECR registry, you can register the AWS Container Registry as follows: - - -```bash -# Register the AWS container registry and reference the target ECR registry URI - -zenml container-registry register -f aws \ - - --uri= - -# Connect the AWS container registry to the target ECR registry via an AWS Service Connector - -zenml container-registry connect -i -``` - -A non-interactive version that connects the AWS Container Registry to a target ECR registry through an AWS Service Connector: - -```bash -zenml container-registry connect --connector -``` -```bash Example Command Output -$ zenml container-registry connect aws-us-east-1 --connector aws-us-east-1 - -Successfully connected container registry `aws-us-east-1` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼──────────────────────────────────────────────┨ - -┃ d400e0c6-a8e7-4b95-ab34-0359229c5d36 │ aws-us-east-1 │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the AWS Container Registry in a ZenML Stack: - -```bash -# Register and set a stack with the new container registry - -zenml stack register -c ... --set -``` - - - Linking the AWS Container Registry to a Service Connector means that your local Docker client is no longer authenticated to access the remote registry. 
If you need to manually interact with the remote registry via the Docker CLI, you can use the [local login Service Connector feature](/develop/usage/resource-data-management/auth-management/service-connectors-guide#configure-local-clients) to temporarily authenticate your local Docker client to the remote registry: - - ```bash - zenml service-connector login --resource-type docker-registry - ``` - - ```bash Example Command Output - $ zenml service-connector login aws-us-east-1 --resource-type docker-registry - - ⠼ Attempting to configure local client using service connector 'aws-us-east-1'... - - WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - - Configure a credential helper to remove this warning. See - - https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - - The 'aws-us-east-1' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. - ``` - - - - -For more information and a full list of configurable attributes of the AWS container registry, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.container%5Fregistries.aws%5Fcontainer%5Fregistry.AWSContainerRegistry). diff --git a/develop/stack-components/container-registries/azure.mdx b/develop/stack-components/container-registries/azure.mdx deleted file mode 100644 index c6e9ecf..0000000 --- a/develop/stack-components/container-registries/azure.mdx +++ /dev/null @@ -1,206 +0,0 @@ ---- -title: "Azure Container Registry" -description: "Storing container images in Azure." -icon: warehouse ---- - -The Azure container registry is a [container registry](/develop/stack-components/container-registries) flavor that comes built-in with ZenML and uses the [Azure Container Registry](https://azure.microsoft.com/en-us/services/container-registry/) to store container images. - -### When to use it - -You should use the Azure container registry if: - -* one or more components of your stack need to pull or push container images. -* you have access to Azure. If you're not using Azure, take a look at the other [container registry flavors](/develop/stack-components/container-registries#container-registry-flavors). - -### How to deploy it - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including an Azure container registry? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML Azure Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -Go [here](https://portal.azure.com/#create/Microsoft.ContainerRegistry) and choose a subscription, resource group, location, and registry name. Then click on `Review + Create` and to create your container registry. - -### How to find the registry URI - -The Azure container registry URI should have the following format: - -```bash -.azurecr.io -# Examples: -zenmlregistry.azurecr.io -myregistry.azurecr.io -``` - -To figure out the URI for your registry: - -* Go to the [Azure portal](https://portal.azure.com/#home). -* In the search bar, enter `container registries` and select the container registry you want to use. 
If you don't have any container registries yet, check out the [deployment section](/develop/stack-components/container-registries/azure#how-to-deploy-it) on how to create one. -* Use the name of your registry to fill the template `.azurecr.io` and get your URI. - -### How to use it - -To use the Azure container registry, we need: - -* [Docker](https://www.docker.com) installed and running. -* The registry URI. Check out the [previous section](/develop/stack-components/container-registries/azure#how-to-find-the-registry-uri) on the URI format and how to get the URI for your registry. - -We can then register the container registry and use it in our active stack: - -```bash -zenml container-registry register \ - --flavor=azure \ - --uri= - -# Add the container registry to the active stack -zenml stack update -c -``` - -You also need to set up [authentication](/develop/stack-components/container-registries/azure#authentication-methods) required to log in to the container registry. - -#### Authentication Methods - -Integrating and using an Azure Container Registry in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Local Authentication_ method. However, the recommended way to authenticate to the Azure cloud platform is through [an Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector). This is particularly useful if you are configuring ZenML stacks that combine the Azure Container Registry with other remote stack components also running in Azure. - - -This method uses the Docker client authentication available _in the environment where the ZenML code is running_. On your local machine, this is the quickest way to configure an Azure Container Registry. You don't need to supply credentials explicitly when you register the Azure Container Registry, as it leverages the local credentials and configuration that the Azure CLI and Docker client store on your local machine. However, you will need to install and set up the Azure CLI on your machine as a prerequisite, as covered in [the Azure CLI documentation](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli), before you register the Azure Container Registry. - -With the Azure CLI installed and set up with credentials, you need to login to the container registry so Docker can pull and push images: - -```bash -# Fill your REGISTRY_NAME in the placeholder in the following command. - -# You can find the REGISTRY_NAME as part of your registry URI: `.azurecr.io` -az acr login --name= -``` - - - Stacks using the Azure Container Registry set up with local authentication are not portable across environments. To make ZenML pipelines fully portable, it is recommended to use [an Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector) to link your Azure Container Registry to the remote ACR registry. - - - - - To set up the Azure Container Registry to authenticate to Azure and access an ACR registry, it is recommended to leverage the many features provided by [the Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector) such as auto-configuration, local login, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components. 
- -If you don't already have an Azure Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure an Azure Service Connector that can be used to access a ACR registry or even more than one type of Azure resource: - -```bash -zenml service-connector register --type azure -i -``` - -A non-interactive CLI example that uses [Azure Service Principal credentials](https://learn.microsoft.com/en-us/azure/active-directory/develop/app-objects-and-service-principals) to configure an Azure Service Connector targeting a single ACR registry is: - -```bash -zenml service-connector register --type azure --auth-method service-principal --tenant_id= --client_id= --client_secret= --resource-type docker-registry --resource-id -``` - -```bash Example Command Output -$ zenml service-connector register azure-demo --type azure --auth-method service-principal --tenant_id=a79f3633-8f45-4a74-a42e-68871c17b7fb --client_id=8926254a-8c3f-430a-a2fd-bdab234d491e --client_secret=AzureSuperSecret --resource-type docker-registry --resource-id demozenmlcontainerregistry.azurecr.io - -⠸ Registering service connector 'azure-demo'... - -Successfully registered service connector `azure-demo` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼───────────────────────────────────────┨ - -┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -> **Note**: Please remember to grant the entity associated with your Azure credentials permissions to read and write to your ACR registry as well as to list accessible ACR registries. For a full list of permissions required to use an Azure Service Connector to access a ACR registry, please refer to the [Azure Service Connector ACR registry resource type documentation](/develop/usage/resource-data-management/auth-management/azure-service-connector#acr-container-registry) or read the documentation available in the interactive CLI commands and dashboard. The Azure Service Connector supports [many different authentication methods](/develop/usage/resource-data-management/auth-management/azure-service-connector#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use case. 
- -If you already have one or more Azure Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the ACR registry you want to use for your Azure Container Registry by running e.g.: - -```bash -zenml service-connector list-resources --connector-type azure --resource-type docker-registry -``` - -```bash Example Command Output -The following 'docker-registry' resources can be accessed by 'azure' service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼───────────────────────────────────────┨ - -┃ db5821d0-a658-4504-ae96-04c3302d8f85 │ azure-demo │ 🇦 azure │ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on an Azure Service Connector to use to connect to the target ACR registry, you can register the Azure Container Registry as follows: - -```bash -# Register the Azure container registry and reference the target ACR registry URI - -zenml container-registry register -f azure \ - - --uri= - -# Connect the Azure container registry to the target ACR registry via an Azure Service Connector - -zenml container-registry connect -i -``` - -A non-interactive version that connects the Azure Container Registry to a target ACR registry through an Azure Service Connector: - -```bash -zenml container-registry connect --connector -``` - -```bash Example Command Output -$ zenml container-registry connect azure-demo --connector azure-demo - -Successfully connected container registry `azure-demo` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼───────────────────────────────────────┨ - -┃ db5821d0-a658-4504-ae96-04c3302d8f85 │ azure-demo │ 🇦 azure │ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the Azure Container Registry in a ZenML Stack: - -```bash -# Register and set a stack with the new container registry -zenml stack register -c ... --set -``` - - - Linking the Azure Container Registry to a Service Connector means that your local Docker client is no longer authenticated to access the remote registry. 
If you need to manually interact with the remote registry via the Docker CLI, you can use the [local login Service Connector feature](/develop/usage/resource-data-management/auth-management/service-connectors-guide#configure-local-clients) to temporarily authenticate your local Docker client to the remote registry: - -```bash -zenml service-connector login --resource-type docker-registry --resource-id -``` - -```bash Example Command Output -$ zenml service-connector login azure-demo --resource-type docker-registry --resource-id demozenmlcontainerregistry.azurecr.io - -⠹ Attempting to configure local client using service connector 'azure-demo'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'azure-demo' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. -``` - - - - -For more information and a full list of configurable attributes of the Azure container registry, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-container%5Fregistries/#zenml.container%5Fregistries.azure%5Fcontainer%5Fregistry.AzureContainerRegistry) . - - diff --git a/develop/stack-components/container-registries/container-registries.mdx b/develop/stack-components/container-registries/container-registries.mdx deleted file mode 100644 index 334ee3d..0000000 --- a/develop/stack-components/container-registries/container-registries.mdx +++ /dev/null @@ -1,37 +0,0 @@ ---- -title: "Overview" -description: "Setting up a storage for Docker images." -icon: docker ---- - -The container registry is an essential part of most remote MLOps stacks. It is used to store container images that are built to run machine learning pipelines in remote environments. Containerization of the pipeline code creates a portable environment that allows code to run in an isolated manner. - -### When to use it - -The container registry is needed whenever other components of your stack need to push or pull container images. Currently, this is the case for most of ZenML's remote [orchestrators](/develop/stack-components/orchestrators) , [step operators](/develop/stack-components/step-operators), and some [model deployers](/develop/stack-components/model-deployers). These containerize your pipeline code and therefore require a container registry to store the resulting [Docker](https://www.docker.com/) images. Take a look at the documentation page of the component you want to use in your stack to see if it requires a container registry or even a specific container registry flavor. - -### Container Registry Flavors - -ZenML comes with a few container registry flavors that you can use: - -* Default flavor: Allows any URI without validation. Use this if you want to use a local container registry or when using a remote container registry that is not covered by other flavors. -* Specific flavors: Validates your container registry URI and performs additional checks to ensure you're able to push to the registry. - - - We highly suggest using the specific container registry flavors in favor of the `default` one to make use of the additional URI validations. 
- - -| Container Registry | Flavor | Integration | URI example | -| ------------------------------------------------------------------------------ | --------- | ----------- | ----------------------------------------- | -| [DefaultContainerRegistry](/develop/stack-components/container-registries/default) | default | _built-in_ | \- | -| [DockerHubContainerRegistry](/develop/stack-components/container-registries/dockerhub) | dockerhub | _built-in_ | docker.io/zenml | -| [GCPContainerRegistry](/develop/stack-components/container-registries/gcp) | gcp | _built-in_ | gcr.io/zenml | -| [AzureContainerRegistry](/develop/stack-components/container-registries/azure) | azure | _built-in_ | zenml.azurecr.io | -| [GitHubContainerRegistry](/develop/stack-components/container-registries/github) | github | _built-in_ | ghcr.io/zenml | -| [AWSContainerRegistry](/develop/stack-components/container-registries/aws) | aws | aws | 123456789.dkr.ecr.us-east-1.amazonaws.com | - -If you would like to see the available flavors of container registries, you can use the command: - -```bash -zenml container-registry flavor list -``` diff --git a/develop/stack-components/container-registries/custom.mdx b/develop/stack-components/container-registries/custom.mdx deleted file mode 100644 index 9b5336d..0000000 --- a/develop/stack-components/container-registries/custom.mdx +++ /dev/null @@ -1,121 +0,0 @@ ---- -title: "Develop a custom container registry" -description: "Learning how to develop a custom container registry." -icon: sim-card ---- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - -### Base Abstraction - -In the current version of ZenML, container registries have a rather basic base abstraction. In essence, their base configuration only features a `uri` and their implementation features a non-abstract `prepare_image_push` method for validation. - -```py -from abc import abstractmethod -from typing import Type - -from zenml.enums import StackComponentType -from zenml.stack import Flavor -from zenml.stack.authentication_mixin import ( - AuthenticationConfigMixin, - AuthenticationMixin, -) -from zenml.utils import docker_utils - - -class BaseContainerRegistryConfig(AuthenticationConfigMixin): - """Base config for a container registry.""" - - uri: str - - -class BaseContainerRegistry(AuthenticationMixin): - """Base class for all ZenML container registries.""" - - def prepare_image_push(self, image_name: str) -> None: - """Conduct necessary checks/preparations before an image gets pushed.""" - - def push_image(self, image_name: str) -> str: - """Pushes a Docker image.""" - if not image_name.startswith(self.config.uri): - raise ValueError( - f"Docker image `{image_name}` does not belong to container " - f"registry `{self.config.uri}`." 
- ) - - self.prepare_image_push(image_name) - return docker_utils.push_image(image_name) - - -class BaseContainerRegistryFlavor(Flavor): - """Base flavor for container registries.""" - - @property - @abstractmethod - def name(self) -> str: - """Returns the name of the flavor.""" - - @property - def type(self) -> StackComponentType: - """Returns the flavor type.""" - return StackComponentType.CONTAINER_REGISTRY - - @property - def config_class(self) -> Type[BaseContainerRegistryConfig]: - """Config class for this flavor.""" - return BaseContainerRegistryConfig - - @property - def implementation_class(self) -> Type[BaseContainerRegistry]: - """Implementation class.""" - return BaseContainerRegistry -``` - - - This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. In order to see the full implementation and get the complete docstrings, please check the [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-container%5Fregistries/#zenml.container%5Fregistries.base%5Fcontainer%5Fregistry.BaseContainerRegistry) . - - -### Building your own container registry - -If you want to create your own custom flavor for a container registry, you can follow the following steps: - -1. Create a class that inherits from the `BaseContainerRegistry` class and if you need to execute any checks/validation before the image gets pushed, you can define these operations in the `prepare_image_push` method. As an example, you can check the `AWSContainerRegistry`. -2. If you need further configuration, you can create a class which inherits from the `BaseContainerRegistryConfig` class. -3. Bring both the implementation and the configuration together by inheriting from the `BaseContainerRegistryFlavor` class. - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml container-registry flavor register -``` - -For example, your flavor class `MyContainerRegistryFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml container-registry flavor register flavors.my_flavor.MyContainerRegistryFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root. - - -Afterward, you should see the new flavor in the list of available flavors: - -```bash -zenml container-registry flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. - -* The **CustomContainerRegistryFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomContainerRegistryConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` object are inherently `pydantic` objects, you can also add your own custom validators here. 
-* The **CustomContainerRegistry** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomContainerRegistryFlavor` and the `CustomContainerRegistryConfig` are implemented in a different module/path than the actual `CustomContainerRegistry`). - - diff --git a/develop/stack-components/container-registries/default.mdx b/develop/stack-components/container-registries/default.mdx deleted file mode 100644 index ed4e089..0000000 --- a/develop/stack-components/container-registries/default.mdx +++ /dev/null @@ -1,190 +0,0 @@ ---- -title: "Default Container Registry" -description: "Storing container images locally." -icon: file ---- - -The Default container registry is a [container registry](/develop/stack-components/container-registries) flavor that comes built-in with ZenML and allows container registry URIs of any format. - -### When to use it - -You should use the Default container registry if you want to use a **local** container registry or when using a remote container registry that is not covered by other [container registry flavors](/develop/stack-components/container-registries#container-registry-flavors). - -### Local registry URI format - -To specify a URI for a local container registry, use the following format: - -```bash -localhost: - -# Examples: -localhost:5000 -localhost:8000 -localhost:9999 -``` - -### How to use it - -To use the Default container registry, we need: - -* [Docker](https://www.docker.com) installed and running. -* The registry URI. If you're using a local container registry, check out -* the [previous section](/develop/stack-components/container-registries/default#local-registry-uri-format) on the URI format. - -We can then register the container registry and use it in our active stack: - -```bash -zenml container-registry register \ - --flavor=default \ - --uri= - -# Add the container registry to the active stack -zenml stack update -c -``` - -You may also need to set up [authentication](/develop/stack-components/container-registries/default#authentication-methods) required to log in to the container registry. - -#### Authentication Methods - -If you are using a private container registry, you will need to configure some form of authentication to login to the registry. If you're looking for a quick way to get started locally, you can use the _Local Authentication_ method. However, the recommended way to authenticate to a remote private container registry is through [a Docker Service Connector](/develop/usage/resource-data-management/auth-management/docker-service-connector). - -If your target private container registry comes from a cloud provider like AWS, GCP or Azure, you should use the [container registry flavor](/develop/stack-components/container-registries#container-registry-flavors) targeted at that cloud provider. For example, if you're using AWS, you should use the [AWS Container Registry](/develop/stack-components/container-registries/aws) flavor. These cloud provider flavors also use specialized cloud provider Service Connectors to authenticate to the container registry. - - -This method uses the Docker client authentication available _in the environment where the ZenML code is running_. 
On your local machine, this is the quickest way to configure a Default Container Registry. You don't need to supply credentials explicitly when you register the Default Container Registry, as it leverages the local credentials and configuration that the Docker client stores on your local machine. - -To log in to the container registry so Docker can pull and push images, you'll need to run the `docker login` command and supply your credentials, e.g.: - -```bash -docker login --username --password-stdin -``` - - - Stacks using the Default Container Registry set up with local authentication are not portable across environments. To make ZenML pipelines fully portable, it is recommended to use [a Docker Service Connector](/develop/usage/resource-data-management/auth-management/docker-service-connector) to link your Default Container Registry to the remote private container registry. - - - - To set up the Default Container Registry to authenticate to and access a private container registry, it is recommended to leverage the features provided by [the Docker Service Connector](/develop/usage/resource-data-management/auth-management/docker-service-connector) such as local login and reusing the same credentials across multiple stack components. - -If you don't already have a Docker Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command: - -```bash -zenml service-connector register --type docker -i -``` - -A non-interactive CLI example is: - -```bash -zenml service-connector register --type docker --username= --password= -``` - -```bash Example Command Output -$ zenml service-connector register dockerhub --type docker --username=username --password=password - -Successfully registered service connector `dockerhub` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼────────────────┨ - -┃ 🐳 docker-registry │ docker.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -If you already have one or more Docker Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the container registry you want to use for your Default Container Registry by running e.g.: - -```bash -zenml service-connector list-resources --connector-type docker --resource-id -``` - -```bash Example Command Output - -$ zenml service-connector list-resources --connector-type docker --resource-id docker.io - -The resource with name 'docker.io' can be accessed by 'docker' service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼────────────────┨ - -┃ cf55339f-dbc8-4ee6-862e-c25aff411292 │ dockerhub │ 🐳 docker │ 🐳 docker-registry │ docker.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on a Docker Service Connector to use to connect to the target container registry, you can register the Docker Container Registry as follows: - -```bash -# Register the container registry and reference the target registry URI - -zenml container-registry register -f default \ - - --uri= - -# Connect the container registry to the target registry via a Docker Service 
Connector - -zenml container-registry connect -i -``` - -A non-interactive version that connects the Default Container Registry to a target registry through a Docker Service Connector: - -```bash -zenml container-registry connect --connector -``` - -```bash Example Command Output -$ zenml container-registry connect dockerhub --connector dockerhub - -Successfully connected container registry `dockerhub` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼────────────────┨ - -┃ cf55339f-dbc8-4ee6-862e-c25aff411292 │ dockerhub │ 🐳 docker │ 🐳 docker-registry │ docker.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the Default Container Registry in a ZenML Stack: - -```bash -# Register and set a stack with the new container registry - -zenml stack register -c ... --set -``` - - - Linking the Default Container Registry to a Service Connector means that your local Docker client is no longer authenticated to access the remote registry. If you need to manually interact with the remote registry via the Docker CLI, you can use the [local login Service Connector feature](/develop/usage/resource-data-management/auth-management/service-connectors-guide#configure-local-clients) to temporarily authenticate your local Docker client to the remote registry: - -```bash -zenml service-connector login -``` - -```bash Example Command Output -$ zenml service-connector login dockerhub - -⠹ Attempting to configure local client using service connector 'dockerhub'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'dockerhub' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. -``` - - - - - -For more information and a full list of configurable attributes of the Default container registry, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.container%5Fregistries.default%5Fcontainer%5Fregistry.DefaultContainerRegistry) . - diff --git a/develop/stack-components/container-registries/dockerhub.mdx b/develop/stack-components/container-registries/dockerhub.mdx deleted file mode 100644 index 6dd1649..0000000 --- a/develop/stack-components/container-registries/dockerhub.mdx +++ /dev/null @@ -1,69 +0,0 @@ ---- -title: "DockerHub" -description: "Storing container images in DockerHub." -icon: docker ---- - -The DockerHub container registry is a [container registry](/develop/stack-components/container-registries) flavor that comes built-in with ZenML and uses [DockerHub](https://hub.docker.com/) to store container images. - -### When to use it - -You should use the DockerHub container registry if: - -* one or more components of your stack need to pull or push container images. -* you have a DockerHub account. If you're not using DockerHub, take a look at the other [container registry flavors](/develop/stack-components/container-registries#container-registry-flavors). 
### How to deploy it

To use the DockerHub container registry, all you need to do is create a [DockerHub](https://hub.docker.com/) account.

When this container registry is used in a ZenML stack, the Docker images that are built will be published in a **public** repository and everyone will be able to pull your images. If you want to use a **private** repository instead, you'll have to [create a private repository](https://docs.docker.com/docker-hub/repos/#creating-repositories) on the website before running the pipeline. The repository name depends on the remote [orchestrator](/develop/stack-components/orchestrators) or [step operator](/develop/stack-components/step-operators) that you're using in your stack.

### How to find the registry URI

The DockerHub container registry URI should have one of the two following formats:

```bash
<ACCOUNT_NAME>

# or
docker.io/<ACCOUNT_NAME>

# Examples:
zenml
my-username
docker.io/zenml
docker.io/my-username
```

To figure out the URI for your registry:

* Find out the account name of your [DockerHub](https://hub.docker.com/) account.
* Use the account name to fill the template `docker.io/<ACCOUNT_NAME>` and get your URI.

### How to use it

To use the DockerHub container registry, we need:

* [Docker](https://www.docker.com) installed and running.
* The registry URI. Check out the [previous section](/develop/stack-components/container-registries/dockerhub#how-to-find-the-registry-uri) on the URI format and how to get the URI for your registry.

We can then register the container registry and use it in our active stack:

```bash
zenml container-registry register <NAME> \
    --flavor=dockerhub \
    --uri=<REGISTRY_URI>

# Add the container registry to the active stack
zenml stack update -c <NAME>
```

Additionally, we'll need to log in to the container registry so Docker can pull and push images. This will require your DockerHub account name and either your password or preferably a [personal access token](https://docs.docker.com/docker-hub/access-tokens/).

```bash
docker login
```

For more information and a full list of configurable attributes of the `dockerhub` container registry, check out the [SDK Docs](https://apidocs.zenml.io/latest/core%5Fcode%5Fdocs/core-container%5Fregistries/#zenml.container%5Fregistries.dockerhub%5Fcontainer%5Fregistry.DockerHubContainerRegistry).

diff --git a/develop/stack-components/container-registries/gcp.mdx b/develop/stack-components/container-registries/gcp.mdx
deleted file mode 100644
index 76929fe..0000000
--- a/develop/stack-components/container-registries/gcp.mdx
+++ /dev/null
@@ -1,302 +0,0 @@
---
title: "Google Cloud Container Registry"
description: "Storing container images in GCP."
icon: database
---

The GCP container registry is a [container registry](/develop/stack-components/container-registries) flavor that comes built-in with ZenML and uses the [Google Artifact Registry](https://cloud.google.com/artifact-registry).

**Important Notice: Google Container Registry** [**is being replaced by Artifact Registry**](https://cloud.google.com/artifact-registry/docs/transition/transition-from-gcr)**. Please start using Artifact Registry for your containers. As per Google's documentation, "after May 15, 2024, Artifact Registry will host images for the gcr.io domain in Google Cloud projects without previous Container Registry usage. After March 18, 2025, Container Registry will be shut down."** The terms `container registry` and `artifact registry` will be used interchangeably throughout this document.
### When to use it

You should use the GCP container registry if:

* one or more components of your stack need to pull or push container images.
* you have access to GCP. If you're not using GCP, take a look at the other [container registry flavors](/develop/stack-components/container-registries#container-registry-flavors).

### How to deploy it

Would you like to skip ahead and deploy a full ZenML cloud stack already, including a Google Artifact Registry? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML GCP Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component.

When using the Google Artifact Registry, you need to:

* enable it [here](https://console.cloud.google.com/marketplace/product/google/artifactregistry.googleapis.com)
* go [here](https://console.cloud.google.com/artifacts) and create a `Docker` repository.

### Infrastructure Deployment

A GCP Container Registry can be deployed directly from the ZenML CLI:

```bash
zenml container-registry deploy gcp_container_registry --flavor=gcp --provider=gcp ...
```

You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to use the CLI for this, please refer to the [dedicated documentation section](/develop/stack-components/stack-deployment/deploy-a-stack-using-mlstacks).

### How to find the registry URI

When using the Google Artifact Registry, the GCP container registry URI should have the following format:

```bash
<REGION>-docker.pkg.dev/<PROJECT_ID>/<REPOSITORY_NAME>

# Examples:
europe-west1-docker.pkg.dev/zenml/my-repo
southamerica-east1-docker.pkg.dev/zenml/zenml-test
asia-docker.pkg.dev/my-project/another-repo
```

To figure out the URI for your registry:

* Go [here](https://console.cloud.google.com/artifacts) and select the repository that you want to use to store Docker images. If you don't have a repository yet, take a look at the [deployment section](/develop/stack-components/container-registries/gcp#how-to-deploy-it).
* On the top, click the copy button to copy the full repository URL.

### How to use it

To use the GCP container registry, we need:

* [Docker](https://www.docker.com) installed and running.
* The registry URI. Check out the [previous section](/develop/stack-components/container-registries/gcp#how-to-find-the-registry-uri) on the URI format and how to get the URI for your registry.
- -We can then register the container registry and use it in our active stack: - -```bash -zenml container-registry register \ - - --flavor=gcp \ - - --uri= - -# Add the container registry to the active stack - -zenml stack update -c -``` - -You also need to set up [authentication](/develop/stack-components/container-registries/gcp#authentication-methods) required to log in to the container registry. - -#### Authentication Methods - -Integrating and using a GCP Container Registry in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Local Authentication_ method. However, the recommended way to authenticate to the GCP cloud platform is through [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector). This is particularly useful if you are configuring ZenML stacks that combine the GCP Container Registry with other remote stack components also running in GCP. - - -This method uses the Docker client authentication available _in the environment where the ZenML code is running_. On your local machine, this is the quickest way to configure a GCP Container Registry. You don't need to supply credentials explicitly when you register the GCP Container Registry, as it leverages the local credentials and configuration that the GCP CLI and Docker client store on your local machine. However, you will need to install and set up the GCP CLI on your machine as a prerequisite, as covered in [the GCP CLI documentation](https://cloud.google.com/sdk/docs/install-sdk), before you register the GCP Container Registry. - -With the GCP CLI installed and set up with credentials, we'll need to configure Docker, so it can pull and push images: - -* for a Google Container Registry: - -```bash -gcloud auth configure-docker -``` -* for a Google Artifact Registry: - -```bash -gcloud auth configure-docker -docker.pkg.dev -``` - - - Stacks using the GCP Container Registry set up with local authentication are not portable across environments. To make ZenML pipelines fully portable, it is recommended to use [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) to link your GCP Container Registry to the remote GCR registry. - - - - -To set up the GCP Container Registry to authenticate to GCP and access a GCR registry, it is recommended to leverage the many features provided by [the GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) such as auto-configuration, local login, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components. - - - The GCP Service Connector does not support the Google Artifact Registry yet. If you need to connect your GCP Container Registry to a Google Artifact Registry, you can use the _Local Authentication_ method instead. - - -If you don't already have a GCP Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. 
You have the option to configure a GCP Service Connector that can be used to access a GCR registry or even more than one type of GCP resource: - -```bash -zenml service-connector register --type gcp -i -``` - -A non-interactive CLI example that leverages [the GCP CLI configuration](https://docs.gcp.amazon.com/cli/latest/userguide/getting-started-install.html) on your local machine to auto-configure a GCP Service Connector targeting a GCR registry is: - - -```bash -zenml service-connector register --type gcp --resource-type docker-registry --auto-configure -``` - -```bash Example Command Output - -$ zenml service-connector register gcp-zenml-core --type gcp --resource-type docker-registry --auto-configure - -⠸ Registering service connector 'gcp-zenml-core'... - -Successfully registered service connector `gcp-zenml-core` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -> **Note**: Please remember to grant the entity associated with your GCP credentials permissions to read and write to your GCR registry. For a full list of permissions required to use a GCP Service Connector to access a GCR registry, please refer to the [GCP Service Connector GCR registry resource type documentation](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcr-container-registry) or read the documentation available in the interactive CLI commands and dashboard. The GCP Service Connector supports [many different authentication methods](/develop/usage/resource-data-management/auth-management/gcp-service-connector#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use-case. 
- -If you already have one or more GCP Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the GCR registry you want to use for your GCP Container Registry by running e.g.: - -```bash -zenml service-connector list-resources --connector-type gcp --resource-type docker-registry -``` - -```bash Example Command Output -The following 'docker-registry' resources can be accessed by 'gcp' service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼──────────────────┼────────────────┼────────────────────┼─────────────────────────────────────────────────┨ - -┃ ffc01795-0c0a-4f1d-af80-b84aceabcfcf │ gcp-implicit │ 🔵 gcp │ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ │ │ │ us.gcr.io/zenml-core ┃ - -┃ │ │ │ │ eu.gcr.io/zenml-core ┃ - -┃ │ │ │ │ asia.gcr.io/zenml-core ┃ - -┃ │ │ │ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ │ │ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ │ │ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ │ │ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ │ │ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┠──────────────────────────────────────┼──────────────────┼────────────────┼────────────────────┼─────────────────────────────────────────────────┨ - -┃ 561b776a-af8b-491c-a4ed-14349b440f30 │ gcp-zenml-core │ 🔵 gcp │ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ │ │ │ us.gcr.io/zenml-core ┃ - -┃ │ │ │ │ eu.gcr.io/zenml-core ┃ - -┃ │ │ │ │ asia.gcr.io/zenml-core ┃ - -┃ │ │ │ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ │ │ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ │ │ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ │ │ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ │ │ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on a GCP Service Connector to use to connect to the target GCR registry, you can register the GCP Container Registry as follows: - -```bash -# Register the GCP container registry and reference the target GCR registry URI - -zenml container-registry register -f gcp \ - - --uri= - -# Connect the GCP container registry to the target GCR registry via a GCP Service Connector - -zenml container-registry connect -i -``` - -A non-interactive version that connects the GCP Container Registry to a target GCR registry through a GCP Service Connector: - -```bash -zenml container-registry connect --connector -``` - - Linking the GCP Container Registry to a Service Connector means that your local Docker client is no longer authenticated to access the remote registry. 
If you need to manually interact with the remote registry via the Docker CLI, you can use the [local login Service Connector feature](/develop/usage/resource-data-management/auth-management/service-connectors-guide#configure-local-clients) to temporarily authenticate your local Docker client to the remote registry: - - -```bash -zenml service-connector login --resource-type docker-registry -``` - -```bash Example Command Output - -$ zenml service-connector login gcp-zenml-core --resource-type docker-registry - -⠋ Attempting to configure local client using service connector 'gcp-zenml-core'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'gcp-zenml-core' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. -``` - -```bash Example Command Output - -$ zenml container-registry connect gcp-zenml-core --connector gcp-zenml-core - -Successfully connected container registry `gcp-zenml-core` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼─────────────────────────────────────────────┨ - -┃ 561b776a-af8b-491c-a4ed-14349b440f30 │ gcp-zenml-core │ 🔵 gcp │ 🐳 docker-registry │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the GCP Container Registry in a ZenML Stack: - - -```bash -# Register and set a stack with the new container registry -zenml stack register -c ... --set -``` - - - - -For more information and a full list of configurable attributes of the GCP container registry, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-container%5Fregistries/#zenml.container%5Fregistries.gcp%5Fcontainer%5Fregistry.GCPContainerRegistry) . - diff --git a/develop/stack-components/container-registries/github.mdx b/develop/stack-components/container-registries/github.mdx deleted file mode 100644 index d98f397..0000000 --- a/develop/stack-components/container-registries/github.mdx +++ /dev/null @@ -1,58 +0,0 @@ ---- -title: "GitHub Container Registry" -description: "Storing container images in GitHub." -icon: github ---- - -The GitHub container registry is a [container registry](/develop/stack-components/container-registries) flavor that comes built-in with ZenML and uses the [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry) to store container images. - -### When to use it - -You should use the GitHub container registry if: - -* one or more components of your stack need to pull or push container images. -* you're using GitHub for your projects. If you're not using GitHub, take a look at the other [container registry flavors](/develop/stack-components/container-registries#container-registry-flavors). - -### How to deploy it - -The GitHub container registry is enabled by default when you create a GitHub account. 
### How to find the registry URI

The GitHub container registry URI should have the following format:

```bash
ghcr.io/<USER_OR_ORGANIZATION_NAME>

# Examples:
ghcr.io/zenml
ghcr.io/my-username
ghcr.io/my-organization
```

To figure out the URI for your registry:

* Use the GitHub user or organization name to fill the template `ghcr.io/<USER_OR_ORGANIZATION_NAME>` and get your URI.

### How to use it

To use the GitHub container registry, we need:

* [Docker](https://www.docker.com) installed and running.
* The registry URI. Check out the [previous section](/develop/stack-components/container-registries/github#how-to-find-the-registry-uri) on the URI format and how to get the URI for your registry.
* Our Docker client configured, so it can pull and push images. Follow [this guide](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#authenticating-to-the-container-registry) to create a personal access token and log in to the container registry.

We can then register the container registry and use it in our active stack:

```bash
zenml container-registry register <NAME> \
    --flavor=github \
    --uri=<REGISTRY_URI>

# Add the container registry to the active stack
zenml stack update -c <NAME>
```

For more information and a full list of configurable attributes of the GitHub container registry, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-container%5Fregistries/#zenml.container%5Fregistries.github%5Fcontainer%5Fregistry.GitHubContainerRegistry).

diff --git a/develop/stack-components/data-validators/custom.mdx b/develop/stack-components/data-validators/custom.mdx
deleted file mode 100644
index 1d6a2d9..0000000
--- a/develop/stack-components/data-validators/custom.mdx
+++ /dev/null
@@ -1,62 +0,0 @@
---
title: "Develop a custom data validator"
description: "How to develop a custom data validator"
icon: file
---

Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts.

**Base abstraction in progress!**

We are actively working on the base abstraction for the Data Validators, which will be available soon. As a result, their extension is not recommended at the moment. When you are selecting a data validator for your stack, you can use one of [the existing flavors](/develop/stack-components/data-validators#data-validator-flavors).

If you need to implement your own Data Validator flavor, you can still do so, but keep in mind that you may have to refactor it when the base abstraction is updated.

ZenML comes equipped with [Data Validator implementations](/develop/stack-components/data-validators#data-validator-flavors) that integrate a variety of data logging and validation libraries, frameworks and platforms. However, if you need to use a different library or service as a backend for your ZenML Data Validator, you can extend ZenML to provide your own custom Data Validator implementation.

### Build your own custom data validator

If you want to implement your own custom Data Validator, you can follow these steps:

1.
Create a class which inherits from [the BaseDataValidator class](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-data%5Fvalidators/#zenml.data%5Fvalidators.base%5Fdata%5Fvalidator.BaseDataValidator) and override one or more of the abstract methods, depending on the capabilities of the underlying library/service that you want to integrate. -2. If you need any configuration, you can create a class which inherits from the `BaseDataValidatorConfig` class. -3. Bring both of these classes together by inheriting from the `BaseDataValidatorFlavor`. -4. (Optional) You should also provide some standard steps that others can easily insert into their pipelines for instant access to data validation features. - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml data-validator flavor register -``` - -For example, if your flavor class `MyDataValidatorFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml data-validator flavor register flavors.my_flavor.MyDataValidatorFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root. - - -Afterwards, you should see the new flavor in the list of available flavors: - -```bash -zenml data-validator flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. - -* The **CustomDataValidatorFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomDataValidatorConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` object are inherently `pydantic` objects, you can also add your own custom validators here. -* The **CustomDataValidator** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomDataValidatorFlavor` and the `CustomDataValidatorConfig` are implemented in a different module/path than the actual `CustomDataValidator`). - diff --git a/develop/stack-components/data-validators/data-validators.mdx b/develop/stack-components/data-validators/data-validators.mdx deleted file mode 100644 index 823abec..0000000 --- a/develop/stack-components/data-validators/data-validators.mdx +++ /dev/null @@ -1,50 +0,0 @@ ---- -title: "Overview" -description: "How to enhance and maintain the quality of your data and the performance of your models with data profiling and validation" -icon: vial ---- - -Without good data, even the best machine learning models will yield questionable results. 
A lot of effort goes into ensuring and maintaining data quality not only in the initial stages of model development, but throughout the entire machine learning project lifecycle. Data Validators are a category of ML libraries, tools and frameworks that provide a wide range of features and best practices that should be employed in the ML pipelines to keep data quality in check and to monitor model performance to keep it from degrading over time.

Data profiling, data integrity testing, data and model drift detection are all ways of employing data validation techniques at different points in your ML pipelines where data is concerned: data ingestion, model training and evaluation and online or batch inference. Data profiles and model performance evaluation results can be visualized and analyzed to detect problems and take preventive or corrective actions.

Related concepts:

* the Data Validator is an optional type of Stack Component that needs to be registered as part of your ZenML [Stack](/develop/user-guide/guides/production-guide/understand-stacks).
* Data Validators used in ZenML pipelines usually generate data profiles and data quality check reports that are versioned and stored in the [Artifact Store](/develop/stack-components/artifact-stores) and can be [retrieved and visualized](/develop/usage/resource-data-management/visualize-artifacts) later.

### When to use it

[Data-centric AI practices](https://blog.zenml.io/data-centric-mlops/) are quickly becoming mainstream and using Data Validators is an easy way to incorporate them into your workflow. These are some common cases where you may consider employing the use of Data Validators in your pipelines:

* early on, even if it's just to keep a log of the quality state of your data and the performance of your models at different stages of development.
* if you have pipelines that regularly ingest new data, you should use data validation to run regular data integrity checks to signal problems before they are propagated downstream.
* in continuous training pipelines, you should use data validation techniques to compare new training data against a data reference and to compare the performance of newly trained models against previous ones.
* when you have pipelines that automate batch inference or if you regularly collect data used as input in online inference, you should use data validation to run data drift analyses and detect training-serving skew, data drift and model drift.

#### Data Validator Flavors

Data Validators are optional stack components provided by integrations.
The following table lists the currently available Data Validators and summarizes their features and the data types and model types that they can be used with in ZenML pipelines:

| Data Validator | Validation Features | Data Types | Model Types | Notes | Flavor/Integration |
| --- | --- | --- | --- | --- | --- |
| [Deepchecks](/develop/stack-components/data-validators/deepchecks) | data quality, data drift, model drift, model performance | tabular: pandas.DataFrame, CV: torch.utils.data.dataloader.DataLoader | tabular: sklearn.base.ClassifierMixin, CV: torch.nn.Module | Add Deepchecks data and model validation tests to your pipelines | deepchecks |
| [Evidently](/develop/stack-components/data-validators/evidently) | data quality, data drift, model drift, model performance | tabular: pandas.DataFrame | N/A | Use Evidently to generate a variety of data quality and data/model drift reports and visualizations | evidently |
| [Great Expectations](/develop/stack-components/data-validators/great-expectations) | data profiling, data quality | tabular: pandas.DataFrame | N/A | Perform data testing, documentation and profiling with Great Expectations | great\_expectations |
| [Whylogs/WhyLabs](/develop/stack-components/data-validators/whylogs) | data drift | tabular: pandas.DataFrame | N/A | Generate data profiles with whylogs and upload them to WhyLabs | whylogs |

If you would like to see the available flavors of Data Validator, you can use the command:

```bash
zenml data-validator flavor list
```

### How to use it

Every Data Validator has different data profiling and testing capabilities and uses a slightly different way of analyzing your data and your models, but it generally works as follows:

* first, you have to configure and add a Data Validator to your ZenML stack
* every integration includes one or more builtin data validation steps that you can add to your pipelines. Of course, you can also use the libraries directly in your own custom pipeline steps and simply return the results (e.g. data profiles, test reports) as artifacts that are versioned and stored by ZenML in its Artifact Store.
* you can access the data validation artifacts in subsequent pipeline steps, or [fetch them afterwards](/develop/usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory) to process them or visualize them as needed.

Consult the documentation for the particular [Data Validator flavor](/develop/stack-components/data-validators#data-validator-flavors) that you plan on using or are using in your stack for detailed information about how to use it in your ZenML pipelines.
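To illustrate the last bullet above, here is a minimal sketch of fetching a data validation artifact after a pipeline run has finished. The pipeline name `validation_pipeline` and the step name `data_validator` are placeholders for your own names, not something ZenML provides out of the box:

```py
from zenml.client import Client

# Placeholder names -- replace with your own pipeline and step names.
run = Client().get_pipeline("validation_pipeline").last_run
validation_step = run.steps["data_validator"]

# Load the versioned report/profile artifact from the Artifact Store.
report = validation_step.output.load()
print(type(report))
```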
diff --git a/develop/stack-components/data-validators/deepchecks.mdx b/develop/stack-components/data-validators/deepchecks.mdx deleted file mode 100644 index 7a8628d..0000000 --- a/develop/stack-components/data-validators/deepchecks.mdx +++ /dev/null @@ -1,419 +0,0 @@ ---- -title: "Deepchecks" -description: "How to test the data and models used in your pipelines with Deepchecks test suites" -icon: microscope ---- - -The Deepchecks [Data Validator](/develop/stack-components/data-validators) flavor provided with the ZenML integration uses [Deepchecks](https://deepchecks.com/) to run data integrity, data drift, model drift and model performance tests on the datasets and models circulated in your ZenML pipelines. The test results can be used to implement automated corrective actions in your pipelines or to render interactive representations for further visual interpretation, evaluation and documentation. - -### When would you want to use it? - -[Deepchecks](https://deepchecks.com/) is an open-source library that you can use to run a variety of data and model validation tests, from data integrity tests that work with a single dataset to model evaluation tests to data drift analyses and model performance comparison tests. All this can be done with minimal configuration input from the user, or customized with specialized conditions that the validation tests should perform. - -Deepchecks works with both tabular data and computer vision data (currently in beta). For tabular, the supported dataset format is `pandas.DataFrame` and the supported model format is `sklearn.base.ClassifierMixin`. For computer vision, the supported dataset format is `torch.utils.data.dataloader.DataLoader` and supported model format is `torch.nn.Module`. - -You should use the Deepchecks Data Validator when you need the following data and/or model validation features that are possible with Deepchecks: - -* Data Integrity Checks [for tabular](https://docs.deepchecks.com/en/stable/checks%5Fgallery/tabular.html#data-integrity) or [computer vision](https://docs.deepchecks.com/en/stable/checks%5Fgallery/vision.html#data-integrity) data: detect data integrity problems within a single dataset (e.g. missing values, conflicting labels, mixed data types etc.). -* Data Drift Checks [for tabular](https://docs.deepchecks.com/en/stable/checks%5Fgallery/tabular.html#train-test-validation) or [computer vision](https://docs.deepchecks.com/en/stable/checks%5Fgallery/vision.html#train-test-validation) data: detect data skew and data drift problems by comparing a target dataset against a reference dataset (e.g. feature drift, label drift, new labels etc.). -* Model Performance Checks [for tabular](https://docs.deepchecks.com/en/stable/checks%5Fgallery/tabular.html#model-evaluation) or [computer vision](https://docs.deepchecks.com/en/stable/checks%5Fgallery/vision.html#model-evaluation) data: evaluate a model and detect problems with its performance (e.g. confusion matrix, boosting overfit, model error analysis) - -You should consider one of the other [Data Validator flavors](/develop/stack-components/data-validators#data-validator-flavors) if you need a different set of data validation features. - -### How do you deploy it? 
- -The Deepchecks Data Validator flavor is included in the Deepchecks ZenML integration, you need to install it on your local machine to be able to register a Deepchecks Data Validator and add it to your stack: - -```bash -zenml integration install deepchecks -y -``` - -The Data Validator stack component does not have any configuration parameters. Adding it to a stack is as simple as running e.g.: - -```bash -# Register the Deepchecks data validator - -zenml data-validator register deepchecks_data_validator --flavor=deepchecks - -# Register and set a stack with the new data validator - -zenml stack register custom_stack -dv deepchecks_data_validator ... --set -``` - -### How do you use it? - -The ZenML integration restructures the way Deepchecks validation checks are organized in four categories, based on the type and number of input parameters that they expect as input. This makes it easier to reason about them when you decide which tests to use in your pipeline steps: - -* **data integrity checks** expect a single dataset as input. These correspond one-to-one to the set of Deepchecks data integrity checks [for tabular](https://docs.deepchecks.com/en/stable/checks%5Fgallery/tabular.html#data-integrity) and [computer vision](https://docs.deepchecks.com/en/stable/checks%5Fgallery/vision.html#data-integrity) data -* **data drift checks** require two datasets as input: target and reference. These correspond one-to-one to the set of Deepchecks train-test checks [for tabular data](https://docs.deepchecks.com/stable/checks%5Fgallery/tabular.html#train-test-validation) and [for computer vision](https://docs.deepchecks.com/stable/checks%5Fgallery/vision.html#train-test-validation). -* **model validation checks** require a single dataset and a mandatory model as input. This list includes a subset of the model evaluation checks provided by Deepchecks [for tabular data](https://docs.deepchecks.com/en/stable/checks%5Fgallery/tabular.html#model-evaluation) and [for computer vision](https://docs.deepchecks.com/stable/checks%5Fgallery/vision.html#model-evaluation) that expect a single dataset as input. -* **model drift checks** require two datasets and a mandatory model as input. This list includes a subset of the model evaluation checks provided by Deepchecks [for tabular data](https://docs.deepchecks.com/en/stable/checks%5Fgallery/tabular.html#model-evaluation) and [for computer vision](https://docs.deepchecks.com/stable/checks%5Fgallery/vision.html#model-evaluation) that expect two datasets as input: target and reference. - -This structure is directly reflected in how Deepchecks can be used with ZenML: there are four different Deepchecks standard steps and four different [ZenML enums for Deepchecks checks](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.validation%5Fchecks) . [The Deepchecks Data Validator API](/develop/stack-components/data-validators/deepchecks#the-deepchecks-data-validator) is also modeled to reflect this same structure. - -A notable characteristic of Deepchecks is that you don't need to customize the set of Deepchecks tests that are part of a test suite. Both ZenML and Deepchecks provide sane defaults that will run all available Deepchecks tests in a given category with their default conditions if a custom list of tests and conditions are not provided. 
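To make the four categories concrete, the sketch below shows the validation check enums that the ZenML Deepchecks integration exposes for them. Only `DeepchecksDataIntegrityCheck` is used later in this document; the other three class names follow the same naming pattern and should be treated as assumptions to double-check against the SDK docs linked above:

```py
from zenml.integrations.deepchecks.validation_checks import (
    DeepchecksDataIntegrityCheck,    # data integrity checks: single dataset
    DeepchecksDataDriftCheck,        # data drift checks: target + reference datasets (assumed name)
    DeepchecksModelValidationCheck,  # model validation checks: dataset + model (assumed name)
    DeepchecksModelDriftCheck,       # model drift checks: two datasets + model (assumed name)
)

# Each enum member identifies one Deepchecks check, e.g.:
print(DeepchecksDataIntegrityCheck.TABULAR_MIXED_DATA_TYPES)
```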
- -There are three ways you can use Deepchecks in your ZenML pipelines that allow different levels of flexibility: - -* instantiate, configure and insert one or more of [the standard Deepchecks steps](/develop/stack-components/data-validators/deepchecks#the-deepchecks-standard-steps) shipped with ZenML into your pipelines. This is the easiest way and the recommended approach, but can only be customized through the supported step configuration parameters. -* call the data validation methods provided by [the Deepchecks Data Validator](/develop/stack-components/data-validators/deepchecks#the-deepchecks-data-validator) in your custom step implementation. This method allows for more flexibility concerning what can happen in the pipeline step, but you are still limited to the functionality implemented in the Data Validator. -* [use the Deepchecks library directly](/develop/stack-components/data-validators/deepchecks#call-deepchecks-directly) in your custom step implementation. This gives you complete freedom in how you are using Deepchecks' features. - -You can visualize Deepchecks results in Jupyter notebooks or view them directly in the ZenML dashboard. - -### Warning! Usage in remote orchestrators - -The current ZenML version has a limitation in its base Docker image that requires a workaround for _all_ pipelines using Deepchecks with a remote orchestrator (e.g. [Kubeflow](/develop/stack-components/orchestrators/kubeflow) , [Vertex](/develop/stack-components/orchestrators/vertex)). The limitation being that the base Docker image needs to be extended to include binaries that are required by `opencv2`, which is a package that Deepchecks requires. - -While these binaries might be available on most operating systems out of the box (and therefore not a problem with the default local orchestrator), we need to tell ZenML to add them to the containerization step when running in remote settings. Here is how: - -First, create a file called `deepchecks-zenml.Dockerfile` and place it on the same level as your runner script (commonly called `run.py`). The contents of the Dockerfile are as follows: - -```bash -ARG ZENML_VERSION=0.20.0 -FROM zenmldocker/zenml:${ZENML_VERSION} AS base - -RUN apt-get update -RUN apt-get install ffmpeg libsm6 libxext6 -y -``` - -Then, place the following snippet above your pipeline definition. Note that the path of the `dockerfile` are relative to where the pipeline definition file is. Read [the containerization guide](/develop/usage/project-setup/customize-docker-builds) for more details: - -```py -import zenml -from zenml import pipeline -from zenml.config import DockerSettings -from pathlib import Path -import sys - -docker_settings = DockerSettings( - dockerfile="deepchecks-zenml.Dockerfile", - build_options={ - "buildargs": { - "ZENML_VERSION": f"{zenml.__version__}" - }, - }, -) - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - # same code as always - ... -``` - -From here on, you can continue to use the deepchecks integration as is explained below. 
#### The Deepchecks standard steps

ZenML wraps the Deepchecks functionality for tabular data in the form of four standard steps:

* [DeepchecksDataIntegrityCheckStep](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.steps.deepchecks%5Fdata%5Fintegrity.DeepchecksDataIntegrityCheckStep): use it in your pipelines to run data integrity tests on a single dataset
* [DeepchecksDataDriftCheckStep](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.steps.deepchecks%5Fdata%5Fdrift.DeepchecksDataDriftCheckStep): use it in your pipelines to run data drift tests on two datasets as input: target and reference.
* [DeepchecksModelValidationCheckStep](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.steps.deepchecks%5Fmodel%5Fvalidation.DeepchecksModelValidationCheckStep): use it in your pipelines to run model performance tests using a single dataset and a mandatory model artifact as input
* [DeepchecksModelDriftCheckStep](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.steps.deepchecks%5Fmodel%5Fdrift.DeepchecksModelDriftCheckStep): use it in your pipelines to run model comparison/drift tests using a mandatory model artifact and two datasets as input: target and reference.

The integration doesn't yet include standard steps for computer vision, but you can still write your own custom steps that call [the Deepchecks Data Validator API](/develop/stack-components/data-validators/deepchecks#the-deepchecks-data-validator) or even [call the Deepchecks library directly](/develop/stack-components/data-validators/deepchecks#call-deepchecks-directly).

All four standard steps behave similarly regarding the configuration parameters and returned artifacts, with the following differences:

* the type and number of input artifacts are different, as mentioned above
* each step expects a different enum data type to be used when explicitly listing the checks to be performed via the `check_list` configuration attribute. See the [zenml.integrations.deepchecks.validation\_checks](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.validation%5Fchecks) module for more details about these enums (e.g. the data integrity step expects a list of `DeepchecksDataIntegrityCheck` values).

This section will only cover how you can use the data integrity step, with a similar usage to be easily inferred for the other three steps.
- -To instantiate a data integrity step that will run all available Deepchecks data integrity tests with their default configuration, e.g.: - - -```py - -from zenml.integrations.deepchecks.steps import ( - deepchecks_data_integrity_check_step, -) - -data_validator = deepchecks_data_integrity_check_step.with_options( - parameters=dict( - dataset_kwargs=dict(label="target", cat_features=[]), - ), -) -``` - -The step can then be inserted into your pipeline where it can take in a dataset, e.g.: - -```py -docker_settings = DockerSettings(required_integrations=[DEEPCHECKS, SKLEARN]) - -@pipeline(settings={"docker": docker_settings}) -def data_validation_pipeline(): - df_train, df_test = data_loader() - data_validator(dataset=df_train) - - -data_validation_pipeline() -``` - -As can be seen from the [step definition](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.steps.deepchecks%5Fdata%5Fintegrity.deepchecks%5Fdata%5Fintegrity%5Fcheck%5Fstep) , the step takes in a dataset and it returns a Deepchecks `SuiteResult` object that contains the test results: - -```py -@step -def deepchecks_data_integrity_check_step( - dataset: pd.DataFrame, - check_list: Optional[Sequence[DeepchecksDataIntegrityCheck]] = None, - dataset_kwargs: Optional[Dict[str, Any]] = None, - check_kwargs: Optional[Dict[str, Any]] = None, - run_kwargs: Optional[Dict[str, Any]] = None, -) -> SuiteResult: - ... -``` - -If needed, you can specify a custom list of data integrity Deepchecks tests to be executed by supplying a `check_list` argument: - -```py -from zenml.integrations.deepchecks.validation_checks import DeepchecksDataIntegrityCheck -from zenml.integrations.deepchecks.steps import deepchecks_data_integrity_check_step - - -@pipeline -def validation_pipeline(): - deepchecks_data_integrity_check_step( - check_list=[ - DeepchecksDataIntegrityCheck.TABULAR_MIXED_DATA_TYPES, - DeepchecksDataIntegrityCheck.TABULAR_DATA_DUPLICATES, - DeepchecksDataIntegrityCheck.TABULAR_CONFLICTING_LABELS, - ], - dataset=... - ) -``` - -You should consult [the official Deepchecks documentation](https://docs.deepchecks.com/en/stable/checks%5Fgallery/tabular.html) for more information on what each test is useful for. - -For more customization, the data integrity step also allows for additional keyword arguments to be supplied to be passed transparently to the Deepchecks library: - -* `dataset_kwargs`: Additional keyword arguments to be passed to the Deepchecks `tabular.Dataset` or `vision.VisionData` constructor. This is used to pass additional information about how the data is structured, e.g.: - -```py -deepchecks_data_integrity_check_step( - dataset_kwargs=dict(label='class', cat_features=['country', 'state']), - ... -) -``` -* `check_kwargs`: Additional keyword arguments to be passed to the Deepchecks check object constructors. 
Arguments are grouped for each check and indexed using the full check class name or check enum value as dictionary keys, e.g.:

```py
deepchecks_data_integrity_check_step(
    check_list=[
        DeepchecksDataIntegrityCheck.TABULAR_OUTLIER_SAMPLE_DETECTION,
        DeepchecksDataIntegrityCheck.TABULAR_STRING_LENGTH_OUT_OF_BOUNDS,
        DeepchecksDataIntegrityCheck.TABULAR_STRING_MISMATCH,
    ],
    check_kwargs={
        DeepchecksDataIntegrityCheck.TABULAR_OUTLIER_SAMPLE_DETECTION: dict(
            nearest_neighbors_percent=0.01,
            extent_parameter=3,
        ),
        DeepchecksDataIntegrityCheck.TABULAR_STRING_LENGTH_OUT_OF_BOUNDS: dict(
            num_percentiles=1000,
            min_unique_values=3,
        ),
    },
    ...
)
```
* `run_kwargs`: Additional keyword arguments to be passed to the Deepchecks Suite `run` method.

The `check_kwargs` attribute can also be used to customize [the conditions](https://docs.deepchecks.com/en/stable/versions/0.66.0/user-guide/general/deepchecks%5Fhierarchy.html#condition) configured for each Deepchecks test. ZenML attaches a special meaning to all check arguments that start with `condition_` and have a dictionary as their value. This is required because there is no declarative way to specify conditions for Deepchecks checks. For example, the following step configuration:

```py
deepchecks_data_integrity_check_step(
    check_list=[
        DeepchecksDataIntegrityCheck.TABULAR_OUTLIER_SAMPLE_DETECTION,
        DeepchecksDataIntegrityCheck.TABULAR_STRING_LENGTH_OUT_OF_BOUNDS,
    ],
    dataset_kwargs=dict(label='class', cat_features=['country', 'state']),
    check_kwargs={
        DeepchecksDataIntegrityCheck.TABULAR_OUTLIER_SAMPLE_DETECTION: dict(
            nearest_neighbors_percent=0.01,
            extent_parameter=3,
            condition_outlier_ratio_less_or_equal=dict(
                max_outliers_ratio=0.007,
                outlier_score_threshold=0.5,
            ),
            condition_no_outliers=dict(
                outlier_score_threshold=0.6,
            )
        ),
        DeepchecksDataIntegrityCheck.TABULAR_STRING_LENGTH_OUT_OF_BOUNDS: dict(
            num_percentiles=1000,
            min_unique_values=3,
            condition_number_of_outliers_less_or_equal=dict(
                max_outliers=3,
            )
        ),
    },
    ...
)
```

is equivalent to running the following Deepchecks tests:

```py
import deepchecks.tabular.checks as tabular_checks
from deepchecks.tabular import Suite
from deepchecks.tabular import Dataset

# `reference_dataset` stands for the dataframe passed to the step as input
train_dataset = Dataset(
    reference_dataset,
    label='class',
    cat_features=['country', 'state']
)

suite = Suite(name="custom")
check = tabular_checks.OutlierSampleDetection(
    nearest_neighbors_percent=0.01,
    extent_parameter=3,
)
check.add_condition_outlier_ratio_less_or_equal(
    max_outliers_ratio=0.007,
    outlier_score_threshold=0.5,
)
check.add_condition_no_outliers(
    outlier_score_threshold=0.6,
)
suite.add(check)
check = tabular_checks.StringLengthOutOfBounds(
    num_percentiles=1000,
    min_unique_values=3,
)
check.add_condition_number_of_outliers_less_or_equal(
    max_outliers=3,
)
suite.add(check)
suite.run(train_dataset=train_dataset)
```

You can view [the complete list of configuration parameters](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.steps.deepchecks%5Fdata%5Fintegrity.DeepchecksDataIntegrityCheckStepParameters) in the SDK docs.

#### The Deepchecks Data Validator

The Deepchecks Data Validator implements the same interface as all the other Data Validators, so calling it from your own steps keeps you compatible with the overall Data Validator abstraction, which guarantees an easier migration in case you decide to switch to another Data Validator.

All you have to do is call the Deepchecks Data Validator methods when you need to interact with Deepchecks to run tests, e.g.:

```py
import pandas as pd
from deepchecks.core.suite import SuiteResult
from zenml.integrations.deepchecks.data_validators import DeepchecksDataValidator
from zenml.integrations.deepchecks.validation_checks import DeepchecksDataIntegrityCheck
from zenml import step


@step
def data_integrity_check(
    dataset: pd.DataFrame,
) -> SuiteResult:
    """Custom data integrity check step with Deepchecks

    Args:
        dataset: input Pandas DataFrame

    Returns:
        Deepchecks test suite execution result
    """

    # validation pre-processing (e.g. dataset preparation) can take place here

    data_validator = DeepchecksDataValidator.get_active_data_validator()
    suite = data_validator.data_validation(
        dataset=dataset,
        check_list=[
            DeepchecksDataIntegrityCheck.TABULAR_OUTLIER_SAMPLE_DETECTION,
            DeepchecksDataIntegrityCheck.TABULAR_STRING_LENGTH_OUT_OF_BOUNDS,
        ],
    )

    # validation post-processing (e.g. interpret results, take actions) can happen here

    return suite
```

The arguments that the Deepchecks Data Validator methods can take in are the same as those used for [the Deepchecks standard steps](/develop/stack-components/data-validators/deepchecks#the-deepchecks-standard-steps).

Have a look at [the complete list of methods and parameters available in the DeepchecksDataValidator API](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.data%5Fvalidators.deepchecks%5Fdata%5Fvalidator.DeepchecksDataValidator) in the SDK docs.

#### Call Deepchecks directly

You can use the Deepchecks library directly in your custom pipeline steps, and only leverage ZenML's capability of serializing, versioning and storing the `SuiteResult` objects in its Artifact Store, e.g.:

```py
import pandas as pd
import deepchecks.tabular.checks as tabular_checks

from deepchecks.core.suite import SuiteResult
from deepchecks.tabular import Suite
from deepchecks.tabular import Dataset
from zenml import step


@step
def data_integrity_check(
    dataset: pd.DataFrame,
) -> SuiteResult:
    """Custom data integrity check step with Deepchecks

    Args:
        dataset: a Pandas DataFrame

    Returns:
        Deepchecks test suite execution result
    """

    # validation pre-processing (e.g. dataset preparation) can take place here

    train_dataset = Dataset(
        dataset,
        label='class',
        cat_features=['country', 'state']
    )

    suite = Suite(name="custom")
    check = tabular_checks.OutlierSampleDetection(
        nearest_neighbors_percent=0.01,
        extent_parameter=3,
    )
    check.add_condition_outlier_ratio_less_or_equal(
        max_outliers_ratio=0.007,
        outlier_score_threshold=0.5,
    )
    suite.add(check)
    check = tabular_checks.StringLengthOutOfBounds(
        num_percentiles=1000,
        min_unique_values=3,
    )
    check.add_condition_number_of_outliers_less_or_equal(
        max_outliers=3,
    )
    suite.add(check)
    results = suite.run(train_dataset=train_dataset)

    # validation post-processing (e.g. interpret results, take actions) can happen here

    return results
```

#### Visualizing Deepchecks Suite Results

You can view visualizations of the suites and results generated by your pipeline steps directly in the ZenML dashboard by clicking on the respective artifact in the pipeline run DAG.

Alternatively, if you are running inside a Jupyter notebook, you can load and render the suites and results using the [artifact.visualize() method](/develop/usage/resource-data-management/visualize-artifacts), e.g.:

```py
from zenml.client import Client


def visualize_results(pipeline_name: str, step_name: str) -> None:
    pipeline = Client().get_pipeline(pipeline=pipeline_name)
    last_run = pipeline.last_run
    step = last_run.steps[step_name]
    step.visualize()


if __name__ == "__main__":
    visualize_results("data_validation_pipeline", "data_integrity_check")
```
diff --git a/develop/stack-components/data-validators/evidently.mdx b/develop/stack-components/data-validators/evidently.mdx
deleted file mode 100644
index e5a0312..0000000
--- a/develop/stack-components/data-validators/evidently.mdx
+++ /dev/null
@@ -1,632 +0,0 @@
---
title: "Evidently"
description: "How to keep your data quality in check and guard against data and model drift with Evidently profiling"
icon: clipboard-check
---

The Evidently [Data Validator](/develop/stack-components/data-validators) flavor provided with the ZenML integration uses [Evidently](https://evidentlyai.com/) to perform data quality, data drift, model drift and model performance analyses, to generate reports and run checks. The reports and check results can be used to implement automated corrective actions in your pipelines or to render interactive representations for further visual interpretation, evaluation and documentation.

### When would you want to use it?

[Evidently](https://evidentlyai.com/) is an open-source library that you can use to monitor and debug machine learning models by analyzing the data that they use through a powerful set of data profiling and visualization features, or to run a variety of data and model validation reports and tests, from data integrity tests that work with a single dataset to model evaluation tests, data drift analyses and model performance comparison tests. All this can be done with minimal configuration input from the user, or customized with specialized conditions for the validation tests.

Evidently currently works with tabular data in `pandas.DataFrame` or CSV file formats and can handle both regression and classification tasks.

You should use the Evidently Data Validator when you need the following data and/or model validation features that are possible with Evidently:

* [Data Quality](https://docs.evidentlyai.com/presets/data-quality) reports and tests: provides detailed feature statistics and a feature behavior overview for a single dataset. It can also compare any two datasets. For example, you can use it to compare train and test data, reference and current data, or two subgroups of one dataset.
* [Data Drift](https://docs.evidentlyai.com/presets/data-drift) reports and tests: helps detect and explore feature distribution changes in the input data by comparing two datasets with identical schema.
* [Target Drift](https://docs.evidentlyai.com/presets/target-drift) reports and tests: helps detect and explore changes in the target function and/or model predictions by comparing two datasets where the target and/or prediction columns are available.
* [Regression Performance](https://docs.evidentlyai.com/presets/reg-performance) or [Classification Performance](https://docs.evidentlyai.com/presets/class-performance) reports and tests: evaluate the performance of a model by analyzing a single dataset where both the target and prediction columns are available.
It can also compare that performance to the past performance of the same model, or to the performance of an alternative model, by providing a second dataset.

You should consider one of the other [Data Validator flavors](/develop/stack-components/data-validators#data-validator-flavors) if you need a different set of data validation features.

### How do you deploy it?

The Evidently Data Validator flavor is included in the Evidently ZenML integration, so you need to install it on your local machine to be able to register an Evidently Data Validator and add it to your stack:

```bash
zenml integration install evidently -y
```

The Data Validator stack component does not have any configuration parameters. Adding it to a stack is as simple as running, e.g.:

```bash
# Register the Evidently data validator
zenml data-validator register evidently_data_validator --flavor=evidently

# Register and set a stack with the new data validator
zenml stack register custom_stack -dv evidently_data_validator ... --set
```

### How do you use it?

#### Data Profiling

Evidently's profiling functions take in a `pandas.DataFrame` dataset or a pair of datasets and generate results in the form of a `Report` object.

One of Evidently's notable characteristics is that it only requires datasets as input. Even when running model performance comparison analyses, no model needs to be present. However, that does mean that the input data needs to include additional `target` and `prediction` columns for some profiling reports, and you have to include additional information about the dataset columns in the form of [column mappings](https://docs.evidentlyai.com/versions/0.66.0/user-guide/tests-and-reports/column-mapping). Depending on how your data is structured, you may also need to include additional steps in your pipeline before the data validation step to insert the additional `target` and `prediction` columns into your data. This may also require interacting with one or more models.

There are three ways you can use Evidently to generate data reports in your ZenML pipelines that allow different levels of flexibility:

* instantiate, configure and insert the standard Evidently report step shipped with ZenML into your pipelines. This is the easiest way and the recommended approach.
* call the data validation methods provided by [the Evidently Data Validator](/develop/stack-components/data-validators/evidently#the-evidently-data-validator) in your custom step implementation. This method allows for more flexibility concerning what can happen in the pipeline step.
* [use the Evidently library directly](/develop/stack-components/data-validators/evidently#call-evidently-directly) in your custom step implementation. This gives you complete freedom in how you are using Evidently's features.

You can [visualize Evidently reports](/develop/stack-components/data-validators/evidently#visualizing-evidently-reports) in Jupyter notebooks or view them directly in the ZenML dashboard by clicking on the respective artifact in the pipeline run DAG.

**The Evidently Report step**

ZenML wraps the Evidently data profiling functionality in the form of a standard Evidently report pipeline step that you can simply instantiate and insert in your pipeline.
Here you can see how instantiating and configuring the standard Evidently report step can be done: - - -```py -from zenml.integrations.evidently.metrics import EvidentlyMetricConfig -from zenml.integrations.evidently.steps import ( - EvidentlyColumnMapping, - evidently_report_step, -) - -text_data_report = evidently_report_step.with_options( - parameters=dict( - column_mapping=EvidentlyColumnMapping( - target="Rating", - numerical_features=["Age", "Positive_Feedback_Count"], - categorical_features=[ - "Division_Name", - "Department_Name", - "Class_Name", - ], - text_features=["Review_Text", "Title"], - ), - metrics=[ - EvidentlyMetricConfig.metric("DataQualityPreset"), - EvidentlyMetricConfig.metric( - "TextOverviewPreset", column_name="Review_Text" - ), - EvidentlyMetricConfig.metric_generator( - "ColumnRegExpMetric", - columns=["Review_Text", "Title"], - reg_exp=r"[A-Z][A-Za-z0-9 ]*", - ), - ], - # We need to download the NLTK data for the TextOverviewPreset - download_nltk_data=True, - ), -) -``` - -The configuration shown in the example is the equivalent of running the following Evidently code inside the step: - -```py -from evidently.metrics import ColumnRegExpMetric -from evidently.metric_preset import DataQualityPreset, TextOverviewPreset -from evidently import ColumnMapping -from evidently.report import Report -from evidently.metrics.base_metric import generate_column_metrics -import nltk - -nltk.download("words") -nltk.download("wordnet") -nltk.download("omw-1.4") - -column_mapping = ColumnMapping( - target="Rating", - numerical_features=["Age", "Positive_Feedback_Count"], - categorical_features=[ - "Division_Name", - "Department_Name", - "Class_Name", - ], - text_features=["Review_Text", "Title"], -) - -report = Report( - metrics=[ - DataQualityPreset(), - TextOverviewPreset(column_name="Review_Text"), - generate_column_metrics( - ColumnRegExpMetric, - columns=["Review_Text", "Title"], - parameters={"reg_exp": r"[A-Z][A-Za-z0-9 ]*"} - ) - ] -) - -# The datasets are those that are passed to the Evidently step -# as input artifacts -report.run( - current_data=current_dataset, - reference_data=reference_dataset, - column_mapping=column_mapping, -) -``` - -Let's break this down... - -We configure the `evidently_report_step` using parameters that you would normally pass to the Evidently `Report` object to [configure and run an Evidently report](https://docs.evidentlyai.com/versions/0.66.0/user-guide/tests-and-reports/custom-report). It consists of the following fields: - -* `column_mapping`: This is an `EvidentlyColumnMapping` object that is the exact equivalent of [the ColumnMapping object in Evidently](https://docs.evidentlyai.com/versions/0.66.0/user-guide/input-data/column-mapping). It is used to describe the columns in the dataset and how they should be treated (e.g. as categorical, numerical, or text features). -* `metrics`: This is a list of `EvidentlyMetricConfig` objects that are used to configure the metrics that should be used to generate the report in a declarative way. This is the same as configuring the `metrics` that go in the Evidently `Report`. -* `download_nltk_data`: This is a boolean that is used to indicate whether the NLTK data should be downloaded. This is only needed if you are using Evidently reports that handle text data, which require the NLTK data to be downloaded ahead of time. 

There are several ways you can reference the Evidently metrics when configuring `EvidentlyMetricConfig` items:

* by class name: this is the easiest way to reference an Evidently metric. You can use the name of a metric or metric preset class as it appears in the Evidently documentation (e.g. `"DataQualityPreset"`, `"DatasetDriftMetric"`).
* by full class path: you can also use the full Python class path of the metric or metric preset class (e.g. `"evidently.metric_preset.DataQualityPreset"`, `"evidently.metrics.DatasetDriftMetric"`). This is useful if you want to use metrics or metric presets that are not included in the Evidently library.
* by passing in the class itself: you can also import and pass in an Evidently metric or metric preset class itself, e.g.:

```py
from evidently.metrics import DatasetDriftMetric
...
evidently_report_step.with_options(
    parameters=dict(
        metrics=[EvidentlyMetricConfig.metric(DatasetDriftMetric)]
    ),
)
```

As can be seen in the example, there are two basic ways of adding metrics to your Evidently report step configuration:

* to add a single metric or metric preset: call `EvidentlyMetricConfig.metric` with an Evidently metric or metric preset class name (or class path or class). The rest of the parameters are the same ones that you would usually pass to the Evidently metric or metric preset class constructor.
* to generate multiple metrics, similar to calling [the Evidently column metric generator](https://docs.evidentlyai.com/versions/0.66.0/user-guide/tests-and-reports/test-metric-generator#column-metric-generator): call `EvidentlyMetricConfig.metric_generator` with an Evidently metric or metric preset class name (or class path or class) and a list of column names. The rest of the parameters are the same ones that you would usually pass to the Evidently metric or metric preset class constructor.

The ZenML Evidently report step can then be inserted into your pipeline where it can take in two datasets and output the Evidently report generated in both JSON and HTML formats, e.g.:

```py
from zenml import pipeline

# docker_settings, data_loader, data_splitter, text_data_test and text_analyzer
# are assumed to be defined elsewhere in your project
@pipeline(enable_cache=False, settings={"docker": docker_settings})
def text_data_report_test_pipeline():
    """Links all the steps together in a pipeline."""
    data = data_loader()
    reference_dataset, comparison_dataset = data_splitter(data)
    report, _ = text_data_report(
        reference_dataset=reference_dataset,
        comparison_dataset=comparison_dataset,
    )
    test_report, _ = text_data_test(
        reference_dataset=reference_dataset,
        comparison_dataset=comparison_dataset,
    )
    text_analyzer(report)


text_data_report_test_pipeline()
```

The `text_analyzer` step in this example stands for any custom step that acts on the generated report; a sketch of one possible implementation is shown below.

For a version of the same step that works with a single dataset, simply don't pass any comparison dataset:

```py
text_data_report(reference_dataset=reference_dataset)
```

You should consult [the official Evidently documentation](https://docs.evidentlyai.com/versions/0.66.0/reference/all-metrics) for more information on what each metric is useful for and what data columns it requires as input.
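The following is a minimal sketch of such a custom analysis step. It assumes that the JSON output of the report step is passed in as a string and that the serialized report lists its results under a top-level `metrics` key; verify both assumptions against the reports produced by your own Evidently version.

```py
import json
from typing import List

from zenml import step


@step
def text_analyzer(report: str) -> List[str]:
    """Illustrative step that inspects the JSON output of the Evidently report step.

    Treat the key names below as placeholders: the exact JSON layout depends on
    the Evidently version and the metrics configured in the report.
    """
    parsed = json.loads(report)
    # Collect the names of the metrics that were computed in the report
    return [entry.get("metric", "") for entry in parsed.get("metrics", [])]
```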

The `evidently_report_step` step also allows for additional Report [options](https://docs.evidentlyai.com/versions/0.66.0/user-guide/customization) to be passed to the `Report` constructor, e.g.:

```py
from zenml.integrations.evidently.steps import evidently_report_step

text_data_report = evidently_report_step.with_options(
    parameters=dict(
        report_options=[
            (
                "evidently.options.ColorOptions", {
                    "primary_color": "#5a86ad",
                    "fill_color": "#fff4f2",
                    "zero_line_color": "#016795",
                    "current_data_color": "#c292a1",
                    "reference_data_color": "#017b92",
                }
            ),
        ],
    )
)
```

You can view [the complete list of configuration parameters](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-evidently/#zenml.integrations.evidently.steps.evidently%5Freport.evidently%5Freport%5Fstep) in the SDK docs.

#### Data Validation

Aside from data profiling, Evidently can also be used to configure and run automated data validation tests on your data.

Similar to using Evidently through ZenML to run data profiling, there are three ways you can use Evidently to run data validation tests in your ZenML pipelines that allow different levels of flexibility:

* instantiate, configure and insert [the standard Evidently test step](/develop/stack-components/data-validators/evidently) shipped with ZenML into your pipelines. This is the easiest way and the recommended approach.
* call the data validation methods provided by [the Evidently Data Validator](/develop/stack-components/data-validators/evidently#the-evidently-data-validator) in your custom step implementation. This method allows for more flexibility concerning what can happen in the pipeline step.
* [use the Evidently library directly](/develop/stack-components/data-validators/evidently#call-evidently-directly) in your custom step implementation. This gives you complete freedom in how you are using Evidently's features.

You can [visualize Evidently reports](/develop/stack-components/data-validators/evidently#visualizing-evidently-reports) in Jupyter notebooks or view them directly in the ZenML dashboard by clicking on the respective artifact in the pipeline run DAG.

ZenML wraps the Evidently data validation functionality in the form of a standard Evidently test pipeline step that you can simply instantiate and insert in your pipeline.
Here you can see how instantiating and configuring the standard Evidently test step can be done using our included `evidently_test_step` utility function: - -```py -from zenml.integrations.evidently.steps import ( - EvidentlyColumnMapping, - evidently_test_step, -) -from zenml.integrations.evidently.tests import EvidentlyTestConfig - - -text_data_test = evidently_test_step.with_options( - parameters=dict( - column_mapping=EvidentlyColumnMapping( - target="Rating", - numerical_features=["Age", "Positive_Feedback_Count"], - categorical_features=[ - "Division_Name", - "Department_Name", - "Class_Name", - ], - text_features=["Review_Text", "Title"], - ), - tests=[ - EvidentlyTestConfig.test("DataQualityTestPreset"), - EvidentlyTestConfig.test_generator( - "TestColumnRegExp", - columns=["Review_Text", "Title"], - reg_exp=r"[A-Z][A-Za-z0-9 ]*", - ), - ], - # We need to download the NLTK data for the TestColumnRegExp test - download_nltk_data=True, - ), -) -``` - -The configuration shown in the example is the equivalent of running the following Evidently code inside the step: - - -```py -from evidently.tests import TestColumnRegExp -from evidently.test_preset import DataQualityTestPreset -from evidently import ColumnMapping -from evidently.test_suite import TestSuite -from evidently.tests.base_test import generate_column_tests -import nltk - -nltk.download("words") -nltk.download("wordnet") -nltk.download("omw-1.4") - -column_mapping = ColumnMapping( - target="Rating", - numerical_features=["Age", "Positive_Feedback_Count"], - categorical_features=[ - "Division_Name", - "Department_Name", - "Class_Name", - ], - text_features=["Review_Text", "Title"], -) - -test_suite = TestSuite( - tests=[ - DataQualityTestPreset(), - generate_column_tests( - TestColumnRegExp, - columns=["Review_Text", "Title"], - parameters={"reg_exp": r"[A-Z][A-Za-z0-9 ]*"} - ) - ] -) - -# The datasets are those that are passed to the Evidently step -# as input artifacts -test_suite.run( - current_data=current_dataset, - reference_data=reference_dataset, - column_mapping=column_mapping, -) -``` - -Let's break this down... - -We configure the `evidently_test_step` using parameters that you would normally pass to the Evidently `TestSuite` object to [configure and run an Evidently test suite](https://docs.evidentlyai.com/versions/0.66.0/user-guide/tests-and-reports/custom-test-suite) . It consists of the following fields: - -* `column_mapping`: This is an `EvidentlyColumnMapping` object that is the exact equivalent of [the ColumnMapping object in Evidently](https://docs.evidentlyai.com/versions/0.66.0/user-guide/input-data/column-mapping). It is used to describe the columns in the dataset and how they should be treated (e.g. as categorical, numerical, or text features). -* `tests`: This is a list of `EvidentlyTestConfig` objects that are used to configure the tests that will be run as part of your test suite in a declarative way. This is the same as configuring the `tests` that go in the Evidently `TestSuite`. -* `download_nltk_data`: This is a boolean that is used to indicate whether the NLTK data should be downloaded. This is only needed if you are using Evidently tests or test presets that handle text data, which require the NLTK data to be downloaded ahead of time. - -There are several ways you can reference the Evidently tests when configuring `EvidentlyTestConfig` items, similar to how you reference them in an `EvidentlyMetricConfig` object: - -* by class name: this is the easiest way to reference an Evidently test. 
You can use the name of a test or test preset class as it appears in the Evidently documentation (e.g. `"DataQualityTestPreset"`, `"TestColumnRegExp"`).
* by full class path: you can also use the full Python class path of the test or test preset class (e.g. `"evidently.test_preset.DataQualityTestPreset"`, `"evidently.tests.TestColumnRegExp"`). This is useful if you want to use tests or test presets that are not included in the Evidently library.
* by passing in the class itself: you can also import and pass in an Evidently test or test preset class itself, e.g.:

```py
from evidently.tests import TestColumnRegExp
...
evidently_test_step.with_options(
    parameters=dict(
        tests=[EvidentlyTestConfig.test(TestColumnRegExp)]
    ),
)
```

As can be seen in the example, there are two basic ways of adding tests to your Evidently test step configuration:

* to add a single test or test preset: call `EvidentlyTestConfig.test` with an Evidently test or test preset class name (or class path or class). The rest of the parameters are the same ones that you would usually pass to the Evidently test or test preset class constructor.
* to generate multiple tests, similar to calling [the Evidently column test generator](https://docs.evidentlyai.com/versions/0.66.0/user-guide/tests-and-reports/test-metric-generator#column-test-generator): call `EvidentlyTestConfig.test_generator` with an Evidently test or test preset class name (or class path or class) and a list of column names. The rest of the parameters are the same ones that you would usually pass to the Evidently test or test preset class constructor.

The ZenML Evidently test step can then be inserted into your pipeline where it can take in two datasets and output the Evidently test suite results generated in both JSON and HTML formats, e.g.:

```py
from zenml import pipeline

# docker_settings, data_loader and data_splitter are assumed to be defined
# elsewhere in your project
@pipeline(enable_cache=False, settings={"docker": docker_settings})
def text_data_test_pipeline():
    """Links all the steps together in a pipeline."""
    data = data_loader()
    reference_dataset, comparison_dataset = data_splitter(data)
    json_report, html_report = text_data_test(
        reference_dataset=reference_dataset,
        comparison_dataset=comparison_dataset,
    )


text_data_test_pipeline()
```

For a version of the same step that works with a single dataset, simply don't pass any comparison dataset:

```py
text_data_test(reference_dataset=reference_dataset)
```

You should consult [the official Evidently documentation](https://docs.evidentlyai.com/versions/0.66.0/reference/all-tests) for more information on what each test is useful for and what data columns it requires as input.

The `evidently_test_step` step also allows for additional Test [options](https://docs.evidentlyai.com/versions/0.66.0/user-guide/customization) to be passed to the `TestSuite` constructor, e.g.:

```py
from zenml.integrations.evidently.steps import evidently_test_step

text_data_test = evidently_test_step.with_options(
    parameters=dict(
        test_options=[
            (
                "evidently.options.ColorOptions", {
                    "primary_color": "#5a86ad",
                    "fill_color": "#fff4f2",
                    "zero_line_color": "#016795",
                    "current_data_color": "#c292a1",
                    "reference_data_color": "#017b92",
                }
            ),
        ],
    ),
)
```

You can view [the complete list of configuration parameters](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-evidently/#zenml.integrations.evidently.steps.evidently%5Ftest.evidently%5Ftest%5Fstep) in the SDK docs.
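If you want to turn the test results into an automated quality gate, you can feed the JSON output of the test step into a downstream step that fails the run when tests do not pass. The sketch below is only an illustration: the `quality_gate` step and the `tests` / `status` keys it reads are assumptions about the `TestSuite` JSON serialization and should be checked against the output of your own Evidently version.

```py
import json

from zenml import step


@step
def quality_gate(test_json: str) -> None:
    """Illustrative quality gate that fails the pipeline if any Evidently test failed."""
    results = json.loads(test_json)
    # The serialized test suite is assumed to list individual tests under a
    # "tests" key, each carrying a "status" field such as "SUCCESS" or "FAIL".
    failed = [
        t for t in results.get("tests", [])
        if t.get("status", "").upper() == "FAIL"
    ]
    if failed:
        raise RuntimeError(
            f"{len(failed)} Evidently data validation test(s) failed."
        )
```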
- -#### The Evidently Data Validator - -The Evidently Data Validator implements the same interface as do all Data Validators, so this method forces you to maintain some level of compatibility with the overall Data Validator abstraction, which guarantees an easier migration in case you decide to switch to another Data Validator. - -All you have to do is call the Evidently Data Validator methods when you need to interact with Evidently to generate data reports or to run test suites, e.g.: - -```py -from typing_extensions import Annotated # or `from typing import Annotated on Python 3.9+ -from typing import Tuple -import pandas as pd -from evidently.pipeline.column_mapping import ColumnMapping -from zenml.integrations.evidently.data_validators import EvidentlyDataValidator -from zenml.integrations.evidently.metrics import EvidentlyMetricConfig -from zenml.integrations.evidently.tests import EvidentlyTestConfig -from zenml.types import HTMLString -from zenml import step - - -@step -def data_profiling( - reference_dataset: pd.DataFrame, - comparison_dataset: pd.DataFrame, -) -> Tuple[ - Annotated[str, "report_json"], - Annotated[HTMLString, "report_html"] -]: - """Custom data profiling step with Evidently. - - Args: - reference_dataset: a Pandas DataFrame - comparison_dataset: a Pandas DataFrame of new data you wish to - compare against the reference data - - Returns: - The Evidently report rendered in JSON and HTML formats. - """ - # pre-processing (e.g. dataset preparation) can take place here - - data_validator = EvidentlyDataValidator.get_active_data_validator() - report = data_validator.data_profiling( - dataset=reference_dataset, - comparison_dataset=comparison_dataset, - profile_list=[ - EvidentlyMetricConfig.metric("DataQualityPreset"), - EvidentlyMetricConfig.metric( - "TextOverviewPreset", column_name="Review_Text" - ), - EvidentlyMetricConfig.metric_generator( - "ColumnRegExpMetric", - columns=["Review_Text", "Title"], - reg_exp=r"[A-Z][A-Za-z0-9 ]*", - ), - ], - column_mapping = ColumnMapping( - target="Rating", - numerical_features=["Age", "Positive_Feedback_Count"], - categorical_features=[ - "Division_Name", - "Department_Name", - "Class_Name", - ], - text_features=["Review_Text", "Title"], - ), - download_nltk_data = True, - ) - - # post-processing (e.g. interpret results, take actions) can happen here - - return report.json(), HTMLString(report.show(mode="inline").data) - - -@step -def data_validation( - reference_dataset: pd.DataFrame, - comparison_dataset: pd.DataFrame, -) -> Tuple[ - Annotated[str, "test_json"], - Annotated[HTMLString, "test_html"] -]: - """Custom data validation step with Evidently. - - Args: - reference_dataset: a Pandas DataFrame - comparison_dataset: a Pandas DataFrame of new data you wish to - compare against the reference data - - Returns: - The Evidently test suite results rendered in JSON and HTML formats. - """ - # pre-processing (e.g. 
dataset preparation) can take place here

    data_validator = EvidentlyDataValidator.get_active_data_validator()
    test_suite = data_validator.data_validation(
        dataset=reference_dataset,
        comparison_dataset=comparison_dataset,
        check_list=[
            EvidentlyTestConfig.test("DataQualityTestPreset"),
            EvidentlyTestConfig.test_generator(
                "TestColumnRegExp",
                columns=["Review_Text", "Title"],
                reg_exp=r"[A-Z][A-Za-z0-9 ]*",
            ),
        ],
        column_mapping = ColumnMapping(
            target="Rating",
            numerical_features=["Age", "Positive_Feedback_Count"],
            categorical_features=[
                "Division_Name",
                "Department_Name",
                "Class_Name",
            ],
            text_features=["Review_Text", "Title"],
        ),
        download_nltk_data = True,
    )

    # post-processing (e.g. interpret results, take actions) can happen here

    return test_suite.json(), HTMLString(test_suite.show(mode="inline").data)
```

Have a look at [the complete list of methods and parameters available in the EvidentlyDataValidator API](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-evidently/#zenml.integrations.evidently.data%5Fvalidators.evidently%5Fdata%5Fvalidator.EvidentlyDataValidator) in the SDK docs.

#### Call Evidently directly

You can use the Evidently library directly in your custom pipeline steps, e.g.:

```py
from typing_extensions import Annotated  # or `from typing import Annotated` on Python 3.9+
from typing import Tuple
import pandas as pd
from evidently.report import Report
import evidently.metric_preset as metric_preset
from evidently.test_suite import TestSuite
import evidently.test_preset as test_preset
from evidently.pipeline.column_mapping import ColumnMapping
from zenml.types import HTMLString
from zenml import step


@step
def data_profiler(
    dataset: pd.DataFrame,
) -> Tuple[
    Annotated[str, "report_json"],
    Annotated[HTMLString, "report_html"]
]:
    """Custom data profiler step with Evidently

    Args:
        dataset: a Pandas DataFrame

    Returns:
        Evidently report generated for the dataset in JSON and HTML format.
    """

    # pre-processing (e.g. dataset preparation) can take place here

    report = Report(metrics=[metric_preset.DataQualityPreset()])
    report.run(
        current_data=dataset,
        reference_data=dataset,
    )

    # post-processing (e.g. interpret results, take actions) can happen here

    return report.json(), HTMLString(report.show(mode="inline").data)


@step
def data_tester(
    dataset: pd.DataFrame,
) -> Tuple[
    Annotated[str, "test_json"],
    Annotated[HTMLString, "test_html"]
]:
    """Custom data tester step with Evidently

    Args:
        dataset: a Pandas DataFrame

    Returns:
        Evidently test results generated for the dataset in JSON and HTML format.
    """

    # pre-processing (e.g. dataset preparation) can take place here

    test_suite = TestSuite(tests=[test_preset.DataQualityTestPreset()])
    test_suite.run(
        current_data=dataset,
        reference_data=dataset,
    )

    # post-processing (e.g. interpret results, take actions) can happen here

    return test_suite.json(), HTMLString(test_suite.show(mode="inline").data)
```

### Visualizing Evidently Reports

You can view visualizations of the Evidently reports generated by your pipeline steps directly in the ZenML dashboard by clicking on the respective artifact in the pipeline run DAG.
- -Alternatively, if you are running inside a Jupyter notebook, you can load and render the reports using the [artifact.visualize() method](/develop/usage/resource-data-management/visualize-artifacts), e.g.: - -```py -from zenml.client import Client - - -def visualize_results(pipeline_name: str, step_name: str) -> None: - pipeline = Client().get_pipeline(pipeline=pipeline_name) - evidently_step = pipeline.last_run.steps[step_name] - evidently_step.visualize() - - -if __name__ == "__main__": - visualize_results("text_data_report_pipeline", "text_report") - visualize_results("text_data_test_pipeline", "text_test") -``` - - - - - - - - - diff --git a/develop/stack-components/data-validators/great-expectations.mdx b/develop/stack-components/data-validators/great-expectations.mdx deleted file mode 100644 index 6349c65..0000000 --- a/develop/stack-components/data-validators/great-expectations.mdx +++ /dev/null @@ -1,310 +0,0 @@ ---- -title: "Great Expectations" -description: "How to use Great Expectations to run data quality checks in your pipelines and document the results" -icon: folder ---- - -The Great Expectations [Data Validator](/develop/stack-components/data-validators) flavor provided with the ZenML integration uses [Great Expectations](https://greatexpectations.io/) to run data profiling and data quality tests on the data circulated through your pipelines. The test results can be used to implement automated corrective actions in your pipelines. They are also automatically rendered into documentation for further visual interpretation and evaluation. - -### When would you want to use it? - -[Great Expectations](https://greatexpectations.io/) is an open-source library that helps keep the quality of your data in check through data testing, documentation, and profiling, and to improve communication and observability. Great Expectations works with tabular data in a variety of formats and data sources, of which ZenML currently supports only `pandas.DataFrame` as part of its pipelines. - -You should use the Great Expectations Data Validator when you need the following data validation features that are possible with Great Expectations: - -* [Data Profiling](https://docs.greatexpectations.io/docs/oss/guides/expectations/creating%5Fcustom%5Fexpectations/how%5Fto%5Fadd%5Fsupport%5Ffor%5Fthe%5Fauto%5Finitializing%5Fframework%5Fto%5Fa%5Fcustom%5Fexpectation/#build-a-custom-profiler-for-your-expectation): generates a set of validation rules (Expectations) automatically by inferring them from the properties of an input dataset. -* [Data Quality](https://docs.greatexpectations.io/docs/oss/guides/validation/checkpoints/how%5Fto%5Fpass%5Fan%5Fin%5Fmemory%5Fdataframe%5Fto%5Fa%5Fcheckpoint/): runs a set of predefined or inferred validation rules (Expectations) against an in-memory dataset. -* [Data Docs](https://docs.greatexpectations.io/docs/reference/learn/terms/data%5Fdocs%5Fstore/): generate and maintain human-readable documentation of all your data validation rules, data quality checks and their results. - -You should consider one of the other [Data Validator flavors](/develop/stack-components/data-validators#data-validator-flavors) if you need a different set of data validation features. - -### How do you deploy it? 
- -The Great Expectations Data Validator flavor is included in the Great Expectations ZenML integration, you need to install it on your local machine to be able to register a Great Expectations Data Validator and add it to your stack: - -```bash -zenml integration install great_expectations -y -``` - -Depending on how you configure the Great Expectations Data Validator, it can reduce or even completely eliminate the complexity associated with setting up the store backends for Great Expectations. If you're only looking for a quick and easy way of adding Great Expectations to your stack and are not concerned with the configuration details, you can simply run: - -```bash - -# Register the Great Expectations data validator -zenml data-validator register ge_data_validator --flavor=great_expectations - -# Register and set a stack with the new data validator -zenml stack register custom_stack -dv ge_data_validator ... --set -``` - -If you already have a Great Expectations deployment, you can configure the Great Expectations Data Validator to reuse or even replace your current configuration. You should consider the pros and cons of every deployment use-case and choose the one that best fits your needs: - -1. let ZenML initialize and manage the Great Expectations configuration. The Artifact Store will serve as a storage backend for all the information that Great Expectations needs to persist (e.g. Expectation Suites, Validation Results). However, you will not be able to setup new Data Sources, Metadata Stores or Data Docs sites. Any changes you try and make to the configuration through code will not be persisted and will be lost when your pipeline completes or your local process exits. -2. use ZenML with your existing Great Expectations configuration. You can tell ZenML to replace your existing Metadata Stores with the active ZenML Artifact Store by setting the `configure_zenml_stores` attribute in the Data Validator. The downside is that you will only be able to run pipelines locally with this setup, given that the Great Expectations configuration is a file on your local machine. -3. migrate your existing Great Expectations configuration to ZenML. This is a compromise between 1\. and 2\. that allows you to continue to use your existing Data Sources, Metadata Stores and Data Docs sites even when running pipelines remotely. - -Some Great Expectations CLI commands will not work well with the deployment methods that puts ZenML in charge of your Great Expectations configuration (i.e. 1\. and 3.). You will be required to use Python code to manage your Expectations and you will have to edit the Jupyter notebooks generated by the Great Expectations CLI to connect them to your ZenML managed configuration. . - - -The default Data Validator setup plugs Great Expectations directly into the [Artifact Store](/develop/stack-components/artifact-stores) component that is part of the same stack. As a result, the Expectation Suites, Validation Results and Data Docs are stored in the ZenML Artifact Store and you don't have to configure Great Expectations at all, ZenML takes care of that for you: - -```bash -# Register the Great Expectations data validator -zenml data-validator register ge_data_validator --flavor=great_expectations - -# Register and set a stack with the new data validator -zenml stack register custom_stack -dv ge_data_validator ... --set -``` - - -If you have an existing Great Expectations configuration that you would like to reuse with your ZenML pipelines, the Data Validator allows you to do so. 
All you need is to point it to the folder where your local `great_expectations.yaml` configuration file is located: - - -```bash -# Register the Great Expectations data validator -zenml data-validator register ge_data_validator --flavor=great_expectations \ - --context_root_dir=/path/to/my/great_expectations - -# Register and set a stack with the new data validator -zenml stack register custom_stack -dv ge_data_validator ... --set -``` - -You can continue to edit your local Great Expectations configuration (e.g. add new Data Sources, update the Metadata Stores etc.) and these changes will be visible in your ZenML pipelines. You can also use the Great Expectations CLI as usual to manage your configuration and your Expectations. - - -This deployment method migrates your existing Great Expectations configuration to ZenML and allows you to use it with local as well as remote orchestrators. You have to load the Great Expectations configuration contents in one of the Data Validator configuration parameters using the `@` operator, e.g.: - -```bash -# Register the Great Expectations data validator -zenml data-validator register ge_data_validator --flavor=great_expectations \ - --context_config=@/path/to/my/great_expectations/great_expectations.yaml - -# Register and set a stack with the new data validator -zenml stack register custom_stack -dv ge_data_validator ... --set -``` - -When you are migrating your existing Great Expectations configuration to ZenML, keep in mind that the Metadata Stores that you configured there will also need to be accessible from the location where pipelines are running. For example, you cannot use a non-local orchestrator with a Great Expectations Metadata Store that is located on your filesystem. - - - - -#### Advanced Configuration - -The Great Expectations Data Validator has a few advanced configuration attributes that might be useful for your particular use-case: - -* `configure_zenml_stores`: if set, ZenML will automatically update the Great Expectation configuration to include Metadata Stores that use the Artifact Store as a backend. If neither `context_root_dir` nor `context_config` are set, this is the default behavior. You can set this flag to use the ZenML Artifact Store as a backend for Great Expectations with any of the deployment methods described above. Note that ZenML will not copy the information in your existing Great Expectations stores (e.g. Expectation Suites, Validation Results) in the ZenML Artifact Store. This is something that you will have to do yourself. -* `configure_local_docs`: set this flag to configure a local Data Docs site where Great Expectations docs are generated and can be visualized locally. Use this in case you don't already have a local Data Docs site in your existing Great Expectations configuration. - -For more, up-to-date information on the Great Expectations Data Validator configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-great%5Fexpectations/#zenml.integrations.great%5Fexpectations.data%5Fvalidators.ge%5Fdata%5Fvalidator.GreatExpectationsDataValidator) . - -### How do you use it? - -The core Great Expectations concepts that you should be aware of when using it within ZenML pipelines are Expectations / Expectation Suites, Validations and Data Docs. 
- -ZenML wraps the Great Expectations' functionality in the form of two standard steps: - -* a Great Expectations data profiler that can be used to automatically generate Expectation Suites from an input `pandas.DataFrame` dataset -* a Great Expectations data validator that uses an existing Expectation Suite to validate an input `pandas.DataFrame` dataset - -You can visualize Great Expectations Suites and Results in Jupyter notebooks or view them directly in the ZenML dashboard. - -#### The Great Expectation's data profiler step - -The standard Great Expectation's data profiler step builds an Expectation Suite automatically by running a [UserConfigurableProfiler](https://docs.greatexpectations.io/docs/guides/expectations/how%5Fto%5Fcreate%5Fand%5Fedit%5Fexpectations%5Fwith%5Fa%5Fprofiler) on an input `pandas.DataFrame` dataset. The generated Expectation Suite is saved in the Great Expectations Expectation Store, but also returned as an `ExpectationSuite` artifact that is versioned and saved in the ZenML Artifact Store. The step automatically rebuilds the Data Docs. - -At a minimum, the step configuration expects a name to be used for the Expectation Suite: - - -```py -from zenml.integrations.great_expectations.steps import ( - great_expectations_profiler_step, -) - -ge_profiler_step = great_expectations_profiler_step.with_options( - parameters={ - "expectation_suite_name": "steel_plates_suite", - "data_asset_name": "steel_plates_train_df", - } -) -``` - -The step can then be inserted into your pipeline where it can take in a pandas dataframe, e.g.: - -```py -from zenml import pipeline - -docker_settings = DockerSettings(required_integrations=[SKLEARN, GREAT_EXPECTATIONS]) - -@pipeline(settings={"docker": docker_settings}) -def profiling_pipeline(): - """Data profiling pipeline for Great Expectations. - - The pipeline imports a reference dataset from a source then uses the builtin - Great Expectations profiler step to generate an expectation suite (i.e. - validation rules) inferred from the schema and statistical properties of the - reference dataset. - - Args: - importer: reference data importer step - profiler: data profiler step - """ - dataset, _ = importer() - ge_profiler_step(dataset) - - -profiling_pipeline() -``` - -As can be seen from the [step definition](https://apidocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-great%5Fexpectations/#zenml.integrations.great%5Fexpectations.steps.ge%5Fprofiler.great%5Fexpectations%5Fprofiler%5Fstep) , the step takes in a `pandas.DataFrame` dataset, and it returns a Great Expectations `ExpectationSuite` object: - -```py -@step -def great_expectations_profiler_step( - dataset: pd.DataFrame, - expectation_suite_name: str, - data_asset_name: Optional[str] = None, - profiler_kwargs: Optional[Dict[str, Any]] = None, - overwrite_existing_suite: bool = True, -) -> ExpectationSuite: - ... -``` - -You can view [the complete list of configuration parameters](https://apidocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-great%5Fexpectations/#zenml.integrations.great%5Fexpectations.steps.ge%5Fprofiler.great%5Fexpectations%5Fprofiler%5Fstep) in the SDK docs. - -#### The Great Expectations data validator step - -The standard Great Expectations data validator step validates an input `pandas.DataFrame` dataset by running an existing Expectation Suite on it. The validation results are saved in the Great Expectations Validation Store, but also returned as an `CheckpointResult` artifact that is versioned and saved in the ZenML Artifact Store. 
The step automatically rebuilds the Data Docs. - -At a minimum, the step configuration expects the name of the Expectation Suite to be used for the validation: - -```py -from zenml.integrations.great_expectations.steps import ( - great_expectations_validator_step, -) - -ge_validator_step = great_expectations_validator_step.with_options( - parameters={ - "expectation_suite_name": "steel_plates_suite", - "data_asset_name": "steel_plates_train_df", - } -) -``` - -The step can then be inserted into your pipeline where it can take in a pandas dataframe and a bool flag used solely for order reinforcement purposes, e.g.: - -```py -docker_settings = DockerSettings(required_integrations=[SKLEARN, GREAT_EXPECTATIONS]) - -@pipeline(settings={"docker": docker_settings}) -def validation_pipeline(): - """Data validation pipeline for Great Expectations. - - The pipeline imports a test data from a source, then uses the builtin - Great Expectations data validation step to validate the dataset against - the expectation suite generated in the profiling pipeline. - - Args: - importer: test data importer step - validator: dataset validation step - checker: checks the validation results - """ - dataset, condition = importer() - results = ge_validator_step(dataset, condition) - message = checker(results) - - -validation_pipeline() -``` - -As can be seen from the [step definition](https://apidocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-great%5Fexpectations/#zenml.integrations.great%5Fexpectations.steps.ge%5Fvalidator.great%5Fexpectations%5Fvalidator%5Fstep) , the step takes in a `pandas.DataFrame` dataset and a boolean `condition` and it returns a Great Expectations `CheckpointResult` object. The boolean `condition` is only used as a means of ordering steps in a pipeline (e.g. if you must force it to run only after the data profiling step generates an Expectation Suite): - -```py -@step -def great_expectations_validator_step( - dataset: pd.DataFrame, - expectation_suite_name: str, - data_asset_name: Optional[str] = None, - action_list: Optional[List[Dict[str, Any]]] = None, - exit_on_error: bool = False, -) -> CheckpointResult: -``` - -You can view [the complete list of configuration parameters](https://apidocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-great%5Fexpectations/#zenml.integrations.great%5Fexpectations.steps.ge%5Fvalidator.great%5Fexpectations%5Fvalidator%5Fstep) in the SDK docs. - -#### Call Great Expectations directly - -You can use the Great Expectations library directly in your custom pipeline steps, while leveraging ZenML's capability of serializing, versioning and storing the `ExpectationSuite` and `CheckpointResult` objects in its Artifact Store. 
To use the Great Expectations configuration managed by ZenML while interacting with the Great Expectations library directly, you need to use the Data Context managed by ZenML instead of the default one provided by Great Expectations, e.g.: - -```py -import great_expectations as ge -from zenml.integrations.great_expectations.data_validators import ( - GreatExpectationsDataValidator -) - -import pandas as pd -from great_expectations.core import ExpectationSuite -from zenml import step - - -@step -def create_custom_expectation_suite( -) -> ExpectationSuite: - """Custom step that creates an Expectation Suite - - Returns: - An Expectation Suite - """ - context = GreatExpectationsDataValidator.get_data_context() - # instead of: - # context = ge.get_context() - - expectation_suite_name = "custom_suite" - suite = context.create_expectation_suite( - expectation_suite_name=expectation_suite_name - ) - expectation_configuration = ExpectationConfiguration(...) - suite.add_expectation(expectation_configuration=expectation_configuration) - ... - context.save_expectation_suite( - expectation_suite=suite, - expectation_suite_name=expectation_suite_name, - ) - context.build_data_docs() - return suite -``` - -The same approach must be used if you are using a Great Expectations configuration managed by ZenML and are using the Jupyter notebooks generated by the Great Expectations CLI. - -#### Visualizing Great Expectations Suites and Results - -You can view visualizations of the suites and results generated by your pipeline steps directly in the ZenML dashboard by clicking on the respective artifact in the pipeline run DAG. - -Alternatively, if you are running inside a Jupyter notebook, you can load and render the suites and results using the [artifact.visualize() method](/develop/usage/resource-data-management/visualize-artifacts), e.g.: - -```py -from zenml.client import Client - - -def visualize_results(pipeline_name: str, step_name: str) -> None: - pipeline = Client().get_pipeline(pipeline_name) - last_run = pipeline.last_run - validation_step = last_run.steps[step_name] - validation_step.visualize() - - -if __name__ == "__main__": - visualize_results("validation_pipeline", "profiler") - visualize_results("validation_pipeline", "train_validator") - visualize_results("validation_pipeline", "test_validator") -``` - - - - - - - - - - diff --git a/develop/stack-components/data-validators/whylogs.mdx b/develop/stack-components/data-validators/whylogs.mdx deleted file mode 100644 index 424b714..0000000 --- a/develop/stack-components/data-validators/whylogs.mdx +++ /dev/null @@ -1,285 +0,0 @@ ---- -title: "Whylogs" -description: "How to collect and visualize statistics to track changes in your pipelines' data with whylogs/WhyLabs profiling." -icon: clipboard-question ---- - -The whylogs/WhyLabs [Data Validator](/develop/stack-components/data-validators) flavor provided with the ZenML integration uses [whylogs](https://whylabs.ai/whylogs) and [WhyLabs](https://whylabs.ai) to generate and track data profiles, highly accurate descriptive representations of your data. The profiles can be used to implement automated corrective actions in your pipelines, or to render interactive representations for further visual interpretation, evaluation and documentation. - -### When would you want to use it? - -[Whylogs](https://whylabs.ai/whylogs) is an open-source library that analyzes your data and creates statistical summaries called whylogs profiles. 
Whylogs profiles can be processed in your pipelines and visualized locally or uploaded to the [WhyLabs platform](https://whylabs.ai/), where more in depth analysis can be carried out. Even though [whylogs also supports other data types](https://github.com/whylabs/whylogs#data-types), the ZenML whylogs integration currently only works with tabular data in `pandas.DataFrame` format. - -You should use the whylogs/WhyLabs Data Validator when you need the following data validation features that are possible with whylogs and WhyLabs: - -* Data Quality: validate data quality in model inputs or in a data pipeline -* Data Drift: detect data drift in model input features -* Model Drift: Detect training-serving skew, concept drift, and model performance degradation - -You should consider one of the other [Data Validator flavors](/develop/stack-components/data-validators#data-validator-flavors) if you need a different set of data validation features. - -### How do you deploy it? - -The whylogs Data Validator flavor is included in the whylogs ZenML integration, you need to install it on your local machine to be able to register a whylogs Data Validator and add it to your stack: - -```bash -zenml integration install whylogs -y -``` - -If you don't need to connect to the WhyLabs platform to upload and store the generated whylogs data profiles, the Data Validator stack component does not require any configuration parameters. Adding it to a stack is as simple as running e.g.: - -```bash -# Register the whylogs data validator -zenml data-validator register whylogs_data_validator --flavor=whylogs - -# Register and set a stack with the new data validator -zenml stack register custom_stack -dv whylogs_data_validator ... --set -``` - -Adding WhyLabs logging capabilities to your whylogs Data Validator is just slightly more complicated, as you also need to create a [ZenML Secret](/develop/usage/project-setup/use-secrets/secret-management) to store the sensitive WhyLabs authentication information in a secure location and then reference the secret in the Data Validator configuration. To generate a WhyLabs access token, you can follow [the official WhyLabs instructions documented here](https://docs.whylabs.ai/docs/whylabs-api/#creating-an-api-token) . - -Then, you can register the whylogs Data Validator with WhyLabs logging capabilities as follows: - -```bash -# Create the secret referenced in the data validator -zenml secret create whylabs_secret \ - --whylabs_default_org_id= \ - --whylabs_api_key= - -# Register the whylogs data validator -zenml data-validator register whylogs_data_validator --flavor=whylogs \ - --authentication_secret=whylabs_secret -``` - -You'll also need to enable whylabs logging for your custom pipeline steps if you want to upload the whylogs data profiles that they return as artifacts to the WhyLabs platform. This is enabled by default for the standard whylogs step. 
For custom steps, you can enable WhyLabs logging by setting `enable_whylabs=True` in the `WhylogsDataValidatorSettings` passed to the step, e.g.:
-
-```py
-from typing_extensions import Annotated  # or `from typing import Annotated` on Python 3.9+
-from typing import Tuple
-import pandas as pd
-import whylogs as why
-from sklearn import datasets
-from whylogs.core import DatasetProfileView
-
-from zenml.integrations.whylogs.flavors.whylogs_data_validator_flavor import (
-    WhylogsDataValidatorSettings,
-)
-from zenml import step
-
-
-@step(
-    settings={
-        "data_validator": WhylogsDataValidatorSettings(
-            enable_whylabs=True, dataset_id="model-1"
-        )
-    }
-)
-def data_loader() -> Tuple[
-    Annotated[pd.DataFrame, "data"],
-    Annotated[DatasetProfileView, "profile"]
-]:
-    """Load the diabetes dataset."""
-    X, y = datasets.load_diabetes(return_X_y=True, as_frame=True)
-
-    # merge X and y together
-    df = pd.merge(X, y, left_index=True, right_index=True)
-
-    profile = why.log(pandas=df).profile().view()
-    return df, profile
-```
-
-### How do you use it?
-
-Whylogs's profiling functions take in a `pandas.DataFrame` dataset and generate a `DatasetProfileView` object containing all the relevant information extracted from the dataset.
-
-There are three ways you can use whylogs in your ZenML pipelines that allow different levels of flexibility:
-
-* instantiate, configure and insert [the standard WhylogsProfilerStep](/develop/stack-components/data-validators/whylogs#the-whylogs-standard-step) shipped with ZenML into your pipelines. This is the easiest way and the recommended approach, but it can only be customized through the supported step configuration parameters.
-* call the data validation methods provided by [the whylogs Data Validator](/develop/stack-components/data-validators/whylogs#the-whylogs-data-validator) in your custom step implementation. This method allows for more flexibility concerning what can happen in the pipeline step, but you are still limited to the functionality implemented in the Data Validator.
-* [use the whylogs library directly](/develop/stack-components/data-validators/whylogs#call-whylogs-directly) in your custom step implementation. This gives you complete freedom in how you use whylogs's features.
-
-You can [visualize whylogs profiles](/develop/stack-components/data-validators/whylogs#visualizing-whylogs-profiles) in Jupyter notebooks or view them directly in the ZenML dashboard.
-
-#### The whylogs standard step
-
-ZenML wraps the whylogs/WhyLabs functionality in the form of a standard `WhylogsProfilerStep` step. The only field in the step config is a `dataset_timestamp` attribute, which is only relevant when you upload the profiles to WhyLabs, where it is used to group and merge together profiles belonging to the same dataset.
The helper function `get_whylogs_profiler_step` used to create an instance of this standard step takes in an optional `dataset_id` parameter that is also used only in the context of WhyLabs upload to identify the model in the context of which the profile is uploaded, e.g.: - -```py -from zenml.integrations.whylogs.steps import get_whylogs_profiler_step - -train_data_profiler = get_whylogs_profiler_step(dataset_id="model-2") -test_data_profiler = get_whylogs_profiler_step(dataset_id="model-3") -``` - -The step can then be inserted into your pipeline where it can take in a `pandas.DataFrame` dataset, e.g.: - -```py -from zenml import pipeline - -@pipeline -def data_profiling_pipeline(): - data, _ = data_loader() - train, test = data_splitter(data) - train_data_profiler(train) - test_data_profiler(test) - - -data_profiling_pipeline() -``` - -As can be seen from the [step definition](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-whylogs/#zenml.integrations.whylogs.steps.whylogs%5Fprofiler.whylogs%5Fprofiler%5Fstep) , the step takes in a dataset and returns a whylogs `DatasetProfileView` object: - -```py -@step -def whylogs_profiler_step( - dataset: pd.DataFrame, - dataset_timestamp: Optional[datetime.datetime] = None, -) -> DatasetProfileView: - ... -``` - -You should consult [the official whylogs documentation](https://whylogs.readthedocs.io/en/latest/index.html) for more information on what you can do with the collected profiles. - -You can view [the complete list of configuration parameters](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-whylogs/#zenml.integrations.whylogs.steps.whylogs%5Fprofiler.WhylogsProfilerConfig) in the SDK docs. - -#### The whylogs Data Validator - -The whylogs Data Validator implements the same interface as do all Data Validators, so this method forces you to maintain some level of compatibility with the overall Data Validator abstraction, which guarantees an easier migration in case you decide to switch to another Data Validator. - -All you have to do is call the whylogs Data Validator methods when you need to interact with whylogs to generate data profiles. You may optionally enable whylabs logging to automatically upload the returned whylogs profile to WhyLabs, e.g.: - -```py -import pandas as pd -from whylogs.core import DatasetProfileView -from zenml.integrations.whylogs.data_validators.whylogs_data_validator import ( - WhylogsDataValidator, -) -from zenml.integrations.whylogs.flavors.whylogs_data_validator_flavor import ( - WhylogsDataValidatorSettings, -) -from zenml import step - -whylogs_settings = WhylogsDataValidatorSettings( - enable_whylabs=True, dataset_id="" -) - - -@step( - settings={ - "data_validator": whylogs_settings - } -) -def data_profiler( - dataset: pd.DataFrame, -) -> DatasetProfileView: - """Custom data profiler step with whylogs - - Args: - dataset: a Pandas DataFrame - - Returns: - Whylogs profile generated for the data - """ - - # validation pre-processing (e.g. dataset preparation) can take place here - - data_validator = WhylogsDataValidator.get_active_data_validator() - profile = data_validator.data_profiling( - dataset, - ) - # optionally upload the profile to WhyLabs, if WhyLabs credentials are configured - data_validator.upload_profile_view(profile) - - # validation post-processing (e.g. 
interpret results, take actions) can happen here - - return profile -``` - -Have a look at [the complete list of methods and parameters available in the WhylogsDataValidator API](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-whylogs/#zenml.integrations.whylogs.data%5Fvalidators.whylogs%5Fdata%5Fvalidator.WhylogsDataValidator) in the SDK docs. - -#### Call whylogs directly - -You can use the whylogs library directly in your custom pipeline steps, and only leverage ZenML's capability of serializing, versioning and storing the `DatasetProfileView` objects in its Artifact Store. You may optionally enable whylabs logging to automatically upload the returned whylogs profile to WhyLabs, e.g.: - -```py - -import pandas as pd -from whylogs.core import DatasetProfileView -import whylogs as why -from zenml import step -from zenml.integrations.whylogs.flavors.whylogs_data_validator_flavor import ( - WhylogsDataValidatorSettings, -) - -whylogs_settings = WhylogsDataValidatorSettings( - enable_whylabs=True, dataset_id="" -) - - -@step( - settings={ - "data_validator": whylogs_settings - } -) -def data_profiler( - dataset: pd.DataFrame, -) -> DatasetProfileView: - """Custom data profiler step with whylogs - - Args: - dataset: a Pandas DataFrame - - Returns: - Whylogs Profile generated for the dataset - """ - - # validation pre-processing (e.g. dataset preparation) can take place here - - results = why.log(dataset) - profile = results.profile() - - # validation post-processing (e.g. interpret results, take actions) can happen here - - return profile.view() -``` - -### Visualizing whylogs Profiles - -You can view visualizations of the whylogs profiles generated by your pipeline steps directly in the ZenML dashboard by clicking on the respective artifact in the pipeline run DAG. - -Alternatively, if you are running inside a Jupyter notebook, you can load and render the whylogs profiles using the [artifact.visualize() method](/develop/usage/resource-data-management/visualize-artifacts), e.g.: - -```py -from zenml.client import Client - - -def visualize_statistics( - step_name: str, reference_step_name: Optional[str] = None -) -> None: - """Helper function to visualize whylogs statistics from step artifacts. - - Args: - step_name: step that generated and returned a whylogs profile - reference_step_name: an optional second step that generated a whylogs - profile to use for data drift visualization where two whylogs - profiles are required. - """ - pipe = Client().get_pipeline(pipeline="data_profiling_pipeline") - whylogs_step = pipe.last_run.steps[step_name] - whylogs_step.visualize() - - -if __name__ == "__main__": - visualize_statistics("data_loader") - visualize_statistics("train_data_profiler", "test_data_profiler") -``` - - - - - - - - - - diff --git a/develop/stack-components/experiment-trackers/comet.mdx b/develop/stack-components/experiment-trackers/comet.mdx deleted file mode 100644 index e8f8368..0000000 --- a/develop/stack-components/experiment-trackers/comet.mdx +++ /dev/null @@ -1,300 +0,0 @@ ---- -title: "Comet" -description: "Logging and visualizing experiments with Comet." -icon: meteor ---- - -The Comet Experiment Tracker is an [Experiment Tracker](/develop/stack-components/experiment-trackers) flavor provided with the Comet ZenML integration that uses [the Comet experiment tracking platform](https://www.comet.com/site/products/ml-experiment-tracking/) to log and visualize information from your pipeline steps (e.g., models, parameters, metrics). 
- - - - - -### When would you want to use it? - -[Comet](https://www.comet.com/site/products/ml-experiment-tracking/) is a popular platform that you would normally use in the iterative ML experimentation phase to track and visualize experiment results. That doesn't mean that it cannot be repurposed to track and visualize the results produced by your automated pipeline runs, as you make the transition towards a more production-oriented workflow. - -You should use the Comet Experiment Tracker: - -* if you have already been using Comet to track experiment results for your project and would like to continue doing so as you are incorporating MLOps workflows and best practices in your project through ZenML. -* if you are looking for a more visually interactive way of navigating the results produced from your ZenML pipeline runs (e.g., models, metrics, datasets) -* if you would like to connect ZenML to Comet to share the artifacts and metrics logged by your pipelines with your team, organization, or external stakeholders - -You should consider one of the other [Experiment Tracker flavors](/develop/stack-components/experiment-trackers#experiment-tracker-flavors) if you have never worked with Comet before and would rather use another experiment tracking tool that you are more familiar with. - -### How do you deploy it? - -The Comet Experiment Tracker flavor is provided by the Comet ZenML integration. You need to install it on your local machine to be able to register a Comet Experiment Tracker and add it to your stack: - -```bash -zenml integration install comet -y -``` - -The Comet Experiment Tracker needs to be configured with the credentials required to connect to the Comet platform using one of the available authentication methods. - -#### Authentication Methods - -You need to configure the following credentials for authentication to the Comet platform: - -* `api_key`: Mandatory API key token of your Comet account. -* `project_name`: The name of the project where you're sending the new experiment. If the project is not specified, the experiment is put in the default project associated with your API key. -* `workspace`: Optional. The name of the workspace where your project is located. If not specified, the default workspace associated with your API key will be used. - - - This method requires you to [configure a ZenML secret](/develop/usage/project-setup/use-secrets/secret-management) to store the Comet tracking service credentials securely. - -You can create the secret using the `zenml secret create` command: - -```bash -zenml secret create comet_secret \ - --workspace= \ - --project_name= \ - --api_key= -``` - -Once the secret is created, you can use it to configure the Comet Experiment Tracker: - -```bash -# Reference the workspace, project, and api-key in our experiment tracker component - -zenml experiment-tracker register comet_tracker \ - --flavor=comet \ - --workspace={{comet_secret.workspace}} \ - --project_name={{comet_secret.project_name}} \ - --api_key={{comet_secret.api_key}} - ... - -# Register and set a stack with the new experiment tracker - -zenml stack register custom_stack -e comet_experiment_tracker ... --set -``` - - - Read more about [ZenML Secrets](/develop/usage/project-setup/use-secrets/secret-management) in the ZenML documentation. - - - - - This option configures the credentials for the Comet platform directly as stack component attributes. 
- -This is not recommended for production settings as the credentials won't be stored securely and will be clearly visible in the stack configuration. - -```bash -# Register the Comet experiment tracker - -zenml experiment-tracker register comet_experiment_tracker --flavor=comet \ - - --workspace= --project_name= --api_key= - -# Register and set a stack with the new experiment tracker - -zenml stack register custom_stack -e comet_experiment_tracker ... --set -``` - - - - - - - -For more up-to-date information on the Comet Experiment Tracker implementation and its configuration, you can have a look at [the SDK docs for our Comet integration](https://sdkdocs.zenml.io/0.66.0/integration%5Fcode%5Fdocs/integrations-comet/#zenml.integrations.comet.flavors.comet%5Fexperiment%5Ftracker%5Fflavor.CometExperimentTrackerConfig). - -### How do you use it? - -To be able to log information from a ZenML pipeline step using the Comet Experiment Tracker component in the active stack, you need to enable an experiment tracker using the `@step` decorator. Then use Comet logging capabilities as you would normally do, e.g.: - -```py -from zenml.client import Client - -experiment_tracker = Client().active_stack.experiment_tracker - -@step(experiment_tracker=experiment_tracker.name) -def my_step(): - ... - # go through some experiment tracker methods - experiment_tracker.log_metrics({"my_metric": 42}) - experiment_tracker.log_params({"my_param": "hello"}) - - # or use the Experiment object directly - experiment_tracker.experiment.log_model(...) - - # or pass the Comet Experiment object into helper methods - from comet_ml.integration.sklearn import log_model - log_model( - experiment=experiment_tracker.experiment, - model_name="SVC", - model=model, - ) - ... -``` - - - Instead of hardcoding an experiment tracker name, you can also use the [Client](/develop/sdk/python-client) to dynamically use the experiment tracker of your active stack, as shown in the example above. - - -### Comet UI - -Comet comes with a web-based UI that you can use to find further details about your tracked experiments. - -Every ZenML step that uses Comet should create a separate experiment which you can inspect in the Comet UI. - - - - - - - - - - -You can find the URL of the Comet experiment linked to a specific ZenML run via the metadata of the step in which the experiment tracker was used: - -```py -from zenml.client import Client - -last_run = client.get_pipeline("").last_run -trainer_step = last_run.get_step("") -tracking_url = trainer_step.run_metadata["experiment_tracker_url"].value -print(tracking_url) -``` - - - - - - -Alternatively, you can see an overview of all experiments at `https://www.comet.com/{WORKSPACE_NAME}/{PROJECT_NAME}/experiments/`. - - - The naming convention of each Comet experiment is `{pipeline_run_name}_{step_name}` (e.g., `comet_example_pipeline-25_Apr_22-20_06_33_535737_my_step`), and each experiment will be tagged with both `pipeline_name` and `pipeline_run_name`, which you can use to group and filter experiments. 
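-
-If you prefer to retrieve those experiments programmatically instead of through the UI, the tags that ZenML applies give you a convenient handle. The snippet below is only a rough sketch: it assumes you have the `comet_ml` package installed and an API key configured, it uses Comet's Python API (`API` and the `Tag` query helper), and the workspace, project, and pipeline name values are placeholders you would need to replace:
-
-```py
-from comet_ml.api import API
-from comet_ml.query import Tag
-
-# Assumes the COMET_API_KEY environment variable is set
-# (alternatively, pass api_key=... explicitly)
-api = API()
-
-# Fetch all experiments tagged by ZenML with a given pipeline name
-experiments = api.query(
-    "<WORKSPACE_NAME>",      # placeholder workspace
-    "<PROJECT_NAME>",        # placeholder project
-    Tag("<PIPELINE_NAME>"),  # placeholder pipeline name tag
-)
-
-for experiment in experiments:
-    print(experiment.url)
-```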
- - -## Full Code Example - -This section combines all the code from this section into one simple script that you can use to run easily: - -```py -from comet_ml.integration.sklearn import log_model - -import numpy as np -from sklearn.datasets import load_iris -from sklearn.model_selection import train_test_split -from sklearn.preprocessing import StandardScaler -from sklearn.svm import SVC -from sklearn.metrics import accuracy_score -from typing import Tuple - -from zenml import pipeline, step -from zenml.client import Client -from zenml.integrations.comet.flavors.comet_experiment_tracker_flavor import ( - CometExperimentTrackerSettings, -) -from zenml.integrations.comet.experiment_trackers import CometExperimentTracker - -# Get the experiment tracker from the active stack -experiment_tracker: CometExperimentTracker = Client().active_stack.experiment_tracker - - -@step -def load_data() -> Tuple[np.ndarray, np.ndarray]: - iris = load_iris() - X = iris.data - y = iris.target - return X, y - - -@step -def preprocess_data( - X: np.ndarray, y: np.ndarray -) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]: - X_train, X_test, y_train, y_test = train_test_split( - X, y, test_size=0.2, random_state=42 - ) - scaler = StandardScaler() - X_train_scaled = scaler.fit_transform(X_train) - X_test_scaled = scaler.transform(X_test) - return X_train_scaled, X_test_scaled, y_train, y_test - - -@step(experiment_tracker=experiment_tracker.name) -def train_model(X_train: np.ndarray, y_train: np.ndarray) -> SVC: - model = SVC(kernel="rbf", C=1.0) - model.fit(X_train, y_train) - log_model( - experiment=experiment_tracker.experiment, - model_name="SVC", - model=model, - ) - return model - - -@step(experiment_tracker=experiment_tracker.name) -def evaluate_model(model: SVC, X_test: np.ndarray, y_test: np.ndarray) -> float: - y_pred = model.predict(X_test) - accuracy = accuracy_score(y_test, y_pred) - - # Log metrics using Comet - experiment_tracker.log_metrics({"accuracy": accuracy}) - experiment_tracker.experiment.log_confusion_matrix(y_test, y_pred) - return accuracy - - -@pipeline(enable_cache=False) -def iris_classification_pipeline(): - X, y = load_data() - X_train, X_test, y_train, y_test = preprocess_data(X, y) - model = train_model(X_train, y_train) - accuracy = evaluate_model(model, X_test, y_test) - - -if __name__ == "__main__": - # Configure Comet settings - comet_settings = CometExperimentTrackerSettings(tags=["iris_classification", "svm"]) - - # Run the pipeline - last_run = iris_classification_pipeline.with_options( - settings={"experiment_tracker": comet_settings} - )() - - # Get the URLs for the trainer and evaluator steps - trainer_step, evaluator_step = ( - last_run.steps["train_model"], - last_run.steps["evaluate_model"], - ) - trainer_url = trainer_step.run_metadata["experiment_tracker_url"].value - evaluator_url = evaluator_step.run_metadata["experiment_tracker_url"].value - print(f"URL for trainer step: {trainer_url}") - print(f"URL for evaluator step: {evaluator_url}") -``` - - - -#### Additional configuration - -For additional configuration of the Comet experiment tracker, you can pass `CometExperimentTrackerSettings` to provide additional tags for your experiments: - -```py -from zenml.integrations.comet.flavors.comet_experiment_tracker_flavor import ( - CometExperimentTrackerSettings, -) - -comet_settings = CometExperimentTrackerSettings( - tags=["some_tag"], - run_name="", - settings={}, -) - -@step( - experiment_tracker="", - settings={ - "experiment_tracker": comet_settings - } 
-) -def my_step(): - ... -``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-comet/#zenml.integrations.comet.flavors.comet%5Fexperiment%5Ftracker%5Fflavor.CometExperimentTrackerSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. - diff --git a/develop/stack-components/experiment-trackers/custom.mdx b/develop/stack-components/experiment-trackers/custom.mdx deleted file mode 100644 index 29de4b8..0000000 --- a/develop/stack-components/experiment-trackers/custom.mdx +++ /dev/null @@ -1,59 +0,0 @@ ---- -title: "Develop a custom experiment tracker" -description: "Learning how to develop a custom experiment tracker." -icon: shuffle ---- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - -**Base abstraction in progress!** - -We are actively working on the base abstraction for the Experiment Tracker, which will be available soon. As a result, their extension is not recommended at the moment. When you are selecting an Experiment Tracker for your stack, you can use one of [the existing flavors](/develop/stack-components/experiment-trackers#experiment-tracker-flavors). - - -If you need to implement your own Experiment Tracker flavor, you can still do so, but keep in mind that you may have to refactor it when the base abstraction is released. - -### Build your own custom experiment tracker - -If you want to create your own custom flavor for an experiment tracker, you can follow the following steps: - -1. Create a class that inherits from the `BaseExperimentTracker` class and implements the abstract methods. -2. If you need any configuration, create a class that inherits from the `BaseExperimentTrackerConfig` class and add your configuration parameters. -3. Bring both the implementation and the configuration together by inheriting from the `BaseExperimentTrackerFlavor` class. - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml experiment-tracker flavor register -``` - -For example, if your flavor class `MyExperimentTrackerFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml experiment-tracker flavor register flavors.my_flavor.MyExperimentTrackerFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. - - -Afterward, you should see the new flavor in the list of available flavors: - -```bash -zenml experiment-tracker flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. 
-
-* The **CustomExperimentTrackerFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI.
-* The **CustomExperimentTrackerConfig** class is imported when someone tries to register/update a stack component with this custom flavor. In particular, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` objects are inherently `pydantic` objects, you can also add your own custom validators here.
-* The **CustomExperimentTracker** only comes into play when the component is ultimately in use.
-
-The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomExperimentTrackerFlavor` and the `CustomExperimentTrackerConfig` are implemented in a different module/path than the actual `CustomExperimentTracker`).
-
-
diff --git a/develop/stack-components/experiment-trackers/experiment-trackers.mdx b/develop/stack-components/experiment-trackers/experiment-trackers.mdx
deleted file mode 100644
index 1ac6ef8..0000000
--- a/develop/stack-components/experiment-trackers/experiment-trackers.mdx
+++ /dev/null
@@ -1,72 +0,0 @@
----
-title: "Overview"
-description: "Logging and visualizing ML experiments."
-icon: chart-line
----
-
-Experiment trackers let you track your ML experiments by logging extended information about your models, datasets, metrics, and other parameters and allowing you to browse them, visualize them and compare them between runs. In the ZenML world, every pipeline run is considered an experiment, and ZenML facilitates the storage of experiment results through Experiment Tracker stack components. This establishes a clear link between pipeline runs and experiments.
-
-Related concepts:
-
-* the Experiment Tracker is an optional type of Stack Component that needs to be registered as part of your ZenML [Stack](/develop/user-guide/guides/production-guide/understand-stacks).
-* ZenML already provides versioning and tracking for the pipeline artifacts by storing artifacts in the [Artifact Store](/develop/stack-components/artifact-stores).
-
-### When to use it
-
-ZenML already records information about the artifacts circulated through your pipelines by means of the mandatory [Artifact Store](/develop/stack-components/artifact-stores).
-
-However, these ZenML mechanisms are meant to be used programmatically and can be more difficult to work with without a visual interface.
-
-Experiment Trackers on the other hand are tools designed with usability in mind. They include extensive UIs providing users with an interactive and intuitive interface that allows them to browse and visualize the information logged during the ML pipeline runs.
-
-You should add an Experiment Tracker to your ZenML stack and use it when you want to augment ZenML with the visual features provided by experiment tracking tools.
-
-### How experiment trackers slot into the stack
-
-Here is an architecture diagram that shows how experiment trackers fit into the overall story of a remote stack.
- - - - - -#### Experiment Tracker Flavors - -Experiment Trackers are optional stack components provided by integrations: - -| Experiment Tracker | Flavor | Integration | Notes | -| --------------------------------------------------------------------- | -------- | ----------- | ----------------------------------------------------------------------------------------------- | -| [Comet](/develop/stack-components/experiment-trackers/comet) | comet | comet | Add Comet experiment tracking and visualization capabilities to your ZenML pipelines | -| [MLflow](/develop/stack-components/experiment-trackers/mlflow) | mlflow | mlflow | Add MLflow experiment tracking and visualization capabilities to your ZenML pipelines | -| [Neptune](/develop/stack-components/experiment-trackers/neptune) | neptune | neptune | Add Neptune experiment tracking and visualization capabilities to your ZenML pipelines | -| [Weights & Biases](/develop/stack-components/experiment-trackers/wandb) | wandb | wandb | Add Weights & Biases experiment tracking and visualization capabilities to your ZenML pipelines | -| [Custom Implementation](/develop/stack-components/experiment-trackers/custom) | _custom_ | _custom_ | | - -If you would like to see the available flavors of Experiment Tracker, you can use the command: - -```bash -zenml experiment-tracker flavor list -``` - -### How to use it - -Every Experiment Tracker has different capabilities and uses a different way of logging information from your pipeline steps, but it generally works as follows: - -* first, you have to configure and add an Experiment Tracker to your ZenML stack -* next, you have to explicitly enable the Experiment Tracker for individual steps in your pipeline by decorating them with the included decorator -* in your steps, you have to explicitly log information (e.g. models, metrics, data) to the Experiment Tracker same as you would if you were using the tool independently of ZenML -* finally, you can access the Experiment Tracker UI to browse and visualize the information logged during your pipeline runs. You can use the following code snippet to get the URL of the experiment tracker UI for the experiment linked to a certain step of your pipeline run: - -```bash -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("") - -step = pipeline_run.steps[""] - -experiment_tracker_url = step.run_metadata["experiment_tracker_url"].value -``` - - Experiment trackers will automatically declare runs as failed if the corresponding ZenML pipeline step fails. - - -Consult the documentation for the particular [Experiment Tracker flavor](/develop/stack-components/experiment-trackers#experiment-tracker-flavors) that you plan on using or are using in your stack for detailed information about how to use it in your ZenML pipelines. diff --git a/develop/stack-components/experiment-trackers/mlflow.mdx b/develop/stack-components/experiment-trackers/mlflow.mdx deleted file mode 100644 index f438329..0000000 --- a/develop/stack-components/experiment-trackers/mlflow.mdx +++ /dev/null @@ -1,213 +0,0 @@ ---- -title: "MLflow" -description: "Logging and visualizing experiments with MLflow." -icon: vial ---- - -The MLflow Experiment Tracker is an [Experiment Tracker](/develop/stack-components/experiment-trackers) flavor provided with the MLflow ZenML integration that uses [the MLflow tracking service](https://mlflow.org/docs/latest/tracking.html) to log and visualize information from your pipeline steps (e.g. models, parameters, metrics). 
- -## When would you want to use it? - -[MLflow Tracking](https://www.mlflow.org/docs/latest/tracking.html) is a very popular tool that you would normally use in the iterative ML experimentation phase to track and visualize experiment results. That doesn't mean that it cannot be repurposed to track and visualize the results produced by your automated pipeline runs, as you make the transition toward a more production-oriented workflow. - -You should use the MLflow Experiment Tracker: - -* if you have already been using MLflow to track experiment results for your project and would like to continue doing so as you are incorporating MLOps workflows and best practices in your project through ZenML. -* if you are looking for a more visually interactive way of navigating the results produced from your ZenML pipeline runs (e.g. models, metrics, datasets) -* if you or your team already have a shared MLflow Tracking service deployed somewhere on-premise or in the cloud, and you would like to connect ZenML to it to share the artifacts and metrics logged by your pipelines - -You should consider one of the other [Experiment Tracker flavors](/develop/stack-components/experiment-trackers#experiment-tracker-flavors) if you have never worked with MLflow before and would rather use another experiment tracking tool that you are more familiar with. - -## How do you configure it? - -The MLflow Experiment Tracker flavor is provided by the MLflow ZenML integration, you need to install it on your local machine to be able to register an MLflow Experiment Tracker and add it to your stack: - -```bash -zenml integration install mlflow -y -``` - -The MLflow Experiment Tracker can be configured to accommodate the following [MLflow deployment scenarios](https://mlflow.org/docs/latest/tracking.html#common-setups): - -* [Localhost (default)](https://mlflow.org/docs/latest/tracking.html#common-setups) and [Local Tracking with Local Database](https://mlflow.org/docs/latest/tracking/tutorials/local-database.html): This scenario requires that you use a [local Artifact Store](/develop/stack-components/artifact-stores/local) alongside the MLflow Experiment Tracker in your ZenML stack. The local Artifact Store comes with limitations regarding what other types of components you can use in the same stack. This scenario should only be used to run ZenML locally and is not suitable for collaborative and production settings. No parameters need to be supplied when configuring the MLflow Experiment Tracker, e.g: - -```bash -# Register the MLflow experiment tracker -zenml experiment-tracker register mlflow_experiment_tracker --flavor=mlflow - -# Register and set a stack with the new experiment tracker -zenml stack register custom_stack -e mlflow_experiment_tracker ... --set -``` - -* [Remote Experiment Tracking with MLflow Tracking Server](https://mlflow.org/docs/latest/tracking/tutorials/remote-server.html): This scenario assumes that you have already deployed an MLflow Tracking Server enabled with proxied artifact storage access. There is no restriction regarding what other types of components it can be combined with. This option requires [authentication-related parameters](/develop/stack-components/experiment-trackers/mlflow#authentication-methods) to be configured for the MLflow Experiment Tracker. - - - Due to a [critical severity vulnerability](https://github.com/advisories/GHSA-xg73-94fp-g449) found in older versions of MLflow, we recommend using MLflow version 2.2.1 or higher. 
- - -* [Databricks scenario](https://www.databricks.com/product/managed-mlflow): This scenario assumes that you have a Databricks workspace, and you want to use the managed MLflow Tracking server it provides. This option requires [authentication-related parameters](/develop/stack-components/experiment-trackers/mlflow#authentication-methods) to be configured for the MLflow Experiment Tracker. - -### Authentication Methods - -You need to configure the following credentials for authentication to a remote MLflow tracking server: - -* `tracking_uri`: The URL pointing to the MLflow tracking server. If using an MLflow Tracking Server managed by Databricks, then the value of this attribute should be `"databricks"`. -* `tracking_username`: Username for authenticating with the MLflow tracking server. -* `tracking_password`: Password for authenticating with the MLflow tracking server. -* `tracking_token` (in place of `tracking_username` and `tracking_password`): Token for authenticating with the MLflow tracking server. -* `tracking_insecure_tls` (optional): Set to skip verifying the MLflow tracking server SSL certificate. -* `databricks_host`: The host of the Databricks workspace with the MLflow-managed server to connect to. This is only required if the `tracking_uri` value is set to `"databricks"`. More information: [Access the MLflow tracking server from outside Databricks](https://docs.databricks.com/applications/mlflow/access-hosted-tracking-server.html) - -Either `tracking_token` or `tracking_username` and `tracking_password` must be specified. - - -This option configures the credentials for the MLflow tracking service directly as stack component attributes. - - - This is not recommended for production settings as the credentials won't be stored securely and will be clearly visible in the stack configuration. - - -```bash -# Register the MLflow experiment tracker -zenml experiment-tracker register mlflow_experiment_tracker --flavor=mlflow \ - --tracking_uri= --tracking_token= - -# You can also register it like this: -# zenml experiment-tracker register mlflow_experiment_tracker --flavor=mlflow \ -# --tracking_uri= --tracking_username= --tracking_password= - -# Register and set a stack with the new experiment tracker -zenml stack register custom_stack -e mlflow_experiment_tracker ... --set -``` - - -This method requires you to [configure a ZenML secret](/develop/usage/project-setup/use-secrets/interact-with-secrets) to store the MLflow tracking service credentials securely. - -You can create the secret using the `zenml secret create` command: - -```bash -# Create a secret called `mlflow_secret` with key-value pairs for the -# username and password to authenticate with the MLflow tracking server -zenml secret create mlflow_secret \ - --username= \ - --password= -``` - -Once the secret is created, you can use it to configure the MLflow Experiment Tracker: - -```bash -# Reference the username and password in our experiment tracker component -zenml experiment-tracker register mlflow \ - --flavor=mlflow \ - --tracking_username={{mlflow_secret.username}} \ - --tracking_password={{mlflow_secret.password}} \ - ... -``` - - - Read more about [ZenML Secrets](/develop/usage/project-setup/use-secrets/interact-with-secrets) in the ZenML documentation. 
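-
-If you prefer to manage credentials from Python rather than the CLI, the same secret can also be created programmatically. This is just a sketch that assumes the `Client.create_secret` method of the ZenML Python client and uses placeholder credential values:
-
-```py
-from zenml.client import Client
-
-# Placeholder values -- replace with your real MLflow credentials
-Client().create_secret(
-    name="mlflow_secret",
-    values={
-        "username": "<USERNAME>",
-        "password": "<PASSWORD>",
-    },
-)
-```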
- - - - - -For more, up-to-date information on the MLflow Experiment Tracker implementation and its configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-mlflow/#zenml.integrations.mlflow.experiment%5Ftrackers.mlflow%5Fexperiment%5Ftracker) . - -## How do you use it? - -To be able to log information from a ZenML pipeline step using the MLflow Experiment Tracker component in the active stack, you need to enable an experiment tracker using the `@step` decorator. Then use MLflow's logging or auto-logging capabilities as you would normally do, e.g.: - -```py -import mlflow - - -@step(experiment_tracker="") -def tf_trainer( - x_train: np.ndarray, - y_train: np.ndarray, -) -> tf.keras.Model: - """Train a neural net from scratch to recognize MNIST digits return our - model or the learner""" - - # compile model - - mlflow.tensorflow.autolog() - - # train model - - # log additional information to MLflow explicitly if needed - - mlflow.log_param(...) - mlflow.log_metric(...) - mlflow.log_artifact(...) - - return model -``` - - -Instead of hardcoding an experiment tracker name, you can also use the [Client](/develop/sdk/python-client) to dynamically use the experiment tracker of your active stack: - -```py -from zenml.client import Client - -experiment_tracker = Client().active_stack.experiment_tracker - -@step(experiment_tracker=experiment_tracker.name) -def tf_trainer(...): - ... -``` - - -### MLflow UI - -MLflow comes with its own UI that you can use to find further details about your tracked experiments. - -You can find the URL of the MLflow experiment linked to a specific ZenML run via the metadata of the step in which the experiment tracker was used: - -```bash -from zenml.client import Client - -last_run = client.get_pipeline("").last_run -trainer_step = last_run.get_step("") -tracking_url = trainer_step.run_metadata["experiment_tracker_url"].value -print(tracking_url) -``` - -This will be the URL of the corresponding experiment in your deployed MLflow instance, or a link to the corresponding mlflow experiment file if you are using local MLflow. - - - If you are using local MLflow, you can use the `mlflow ui` command to start MLflow at [localhost:5000](http://localhost:5000/) where you can then explore the UI in your browser. - -```bash -mlflow ui --backend-store-uri -``` - - -### Additional configuration - -For additional configuration of the MLflow experiment tracker, you can pass `MLFlowExperimentTrackerSettings` to create nested runs or add additional tags to your MLflow runs: - -```py - -import mlflow -from zenml.integrations.mlflow.flavors.mlflow_experiment_tracker_flavor import MLFlowExperimentTrackerSettings - -mlflow_settings = MLFlowExperimentTrackerSettings( - nested=True, - tags={"key": "value"} -) - - -@step( - experiment_tracker="", - settings={ - "experiment_tracker": mlflow_settings - } -) -def step_one( - data: np.ndarray, -) -> np.ndarray: - ... -``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-mlflow/#zenml.integrations.mlflow.flavors.mlflow%5Fexperiment%5Ftracker%5Fflavor.MLFlowExperimentTrackerSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. 
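-
-If you want to query the tracked runs from outside a pipeline step (for example, in a notebook or an evaluation script), you can point the regular MLflow client at the same backend that the tracker uses. The following is a minimal sketch, assuming your active stack contains the MLflow Experiment Tracker and that you have MLflow 2.x installed; it relies on the tracker's `get_tracking_uri()` method and the standard `mlflow.search_runs` API:
-
-```py
-import mlflow
-
-from zenml.client import Client
-from zenml.integrations.mlflow.experiment_trackers import MLFlowExperimentTracker
-
-experiment_tracker = Client().active_stack.experiment_tracker
-
-if isinstance(experiment_tracker, MLFlowExperimentTracker):
-    # Point the MLflow client at the backend used by the tracker
-    mlflow.set_tracking_uri(experiment_tracker.get_tracking_uri())
-
-    # Use the regular MLflow APIs to inspect the logged runs
-    runs = mlflow.search_runs(search_all_experiments=True, max_results=10)
-    print(runs[["run_id", "experiment_id", "status"]])
-```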
- diff --git a/develop/stack-components/experiment-trackers/neptune.mdx b/develop/stack-components/experiment-trackers/neptune.mdx deleted file mode 100644 index 2a2a4d6..0000000 --- a/develop/stack-components/experiment-trackers/neptune.mdx +++ /dev/null @@ -1,187 +0,0 @@ ---- -title: "Neptune" -description: "Logging and visualizing experiments with neptune.ai" -icon: star ---- - -The Neptune Experiment Tracker is an [Experiment Tracker](/develop/stack-components/experiment-trackers) flavor provided with the Neptune-ZenML integration that uses [neptune.ai](https://neptune.ai/product/experiment-tracking) to log and visualize information from your pipeline steps (e.g. models, parameters, metrics). - -### When would you want to use it? - -[Neptune](https://neptune.ai/product/experiment-tracking) is a popular tool that you would normally use in the iterative ML experimentation phase to track and visualize experiment results or as a model registry for your production-ready models. Neptune can also track and visualize the results produced by your automated pipeline runs, as you make the transition towards a more production-oriented workflow. - -You should use the Neptune Experiment Tracker: - -* if you have already been using neptune.ai to track experiment results for your project and would like to continue doing so as you are incorporating MLOps workflows and best practices in your project through ZenML. -* if you are looking for a more visually interactive way of navigating the results produced from your ZenML pipeline runs (e.g. models, metrics, datasets) -* if you would like to connect ZenML to neptune.ai to share the artifacts and metrics logged by your pipelines with your team, organization, or external stakeholders - -You should consider one of the other [Experiment Tracker flavors](/develop/stack-components/experiment-trackers#experiment-tracker-flavors) if you have never worked with neptune.ai before and would rather use another experiment tracking tool that you are more familiar with. - -### How do you deploy it? - -The Neptune Experiment Tracker flavor is provided by the Neptune-ZenML integration. You need to install it on your local machine to be able to register the Neptune Experiment Tracker and add it to your stack: - -```bash -zenml integration install neptune -y -``` - -The Neptune Experiment Tracker needs to be configured with the credentials required to connect to Neptune using an API token. - -#### Authentication Methods - -You need to configure the following credentials for authentication to Neptune: - -* `api_token`: [API key token](https://docs.neptune.ai/setup/setting%5Fapi%5Ftoken) of your Neptune account. You can create a free Neptune account [here](https://app.neptune.ai/register). If left blank, Neptune will attempt to retrieve the token from your environment variables. -* `project`: The name of the project where you're sending the new run, in the form "workspace-name/project-name". If the project is not specified, Neptune will attempt to retrieve it from your environment variables. - - -This option configures the credentials for neptune.ai directly as stack component attributes. - - This is not recommended for production settings as the credentials won't be stored securely and will be clearly visible in the stack configuration. 
- - - - -```bash -# Register the Neptune experiment tracker - -zenml experiment-tracker register neptune_experiment_tracker --flavor=neptune \ - - --project= --api_token= - -# Register and set a stack with the new experiment tracker - -zenml stack register custom_stack -e neptune_experiment_tracker ... --set -``` - - -This method requires you to [configure a ZenML secret](/develop/usage/project-setup/use-secrets/interact-with-secrets) to store the Neptune tracking service credentials securely. - -You can create the secret using the `zenml secret create` command: - -```bash -zenml secret create neptune_secret \ - --project= - --api_token= -``` - -Once the secret is created, you can use it to configure the `neptune` Experiment Tracker: - -```bash -# Reference the project and api-token in our experiment tracker component -zenml experiment-tracker register neptune_secret \ - --flavor=neptune \ - --project={{neptune_secret.project}} \ - --api_token={{neptune_secret.api_token}} - ... -``` - - - Read more about [ZenML Secrets](/develop/usage/project-setup/use-secrets/interact-with-secrets) in the ZenML documentation. - - - - -For more, up-to-date information on the Neptune Experiment Tracker implementation and its configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-neptune/#zenml.integrations.neptune.experiment%5Ftrackers.neptune%5Fexperiment%5Ftracker) . - -### How do you use it? - -To log information from a ZenML pipeline step using the Neptune Experiment Tracker component in the active stack, you need to enable an experiment tracker using the `@step` decorator. Then fetch the Neptune run object and use logging capabilities as you would normally do. For example: - -```py -import numpy as np - -import tensorflow as tf - -from neptune_tensorflow_keras import NeptuneCallback - -from zenml.integrations.neptune.experiment_trackers.run_state import ( - get_neptune_run, -) -from zenml import step - - -@step(experiment_tracker="") -def tf_trainer( - x_train: np.ndarray, - y_train: np.ndarray, - x_val: np.ndarray, - y_val: np.ndarray, - epochs: int = 5, - lr: float = 0.001 -) -> tf.keras.Model: - ... - neptune_run = get_neptune_run() - model.fit( - x_train, - y_train, - epochs=epochs, - validation_data=(x_val, y_val), - callbacks=[ - NeptuneCallback(run=neptune_run), - ], - ) - - metric = ... - - neptune_run[""] = metric -``` - - -Instead of hardcoding an experiment tracker name, you can also use the [Client](/develop/sdk/python-client) to dynamically use the experiment tracker of your active stack: - -``` -from zenml.client import Client - -experiment_tracker = Client().active_stack.experiment_tracker - -@step(experiment_tracker=experiment_tracker.name) -def tf_trainer(...): - ... 
-``` - - -#### Additional configuration - -You can pass a set of tags to the Neptune run by using the `NeptuneExperimentTrackerSettings` class, like in the example below: - -```py -import numpy as np - -import tensorflow as tf - -from zenml import step -from zenml.integrations.neptune.experiment_trackers.run_state import ( - get_neptune_run, -) -from zenml.integrations.neptune.flavors import NeptuneExperimentTrackerSettings - -neptune_settings = NeptuneExperimentTrackerSettings(tags={"keras", "mnist"}) - - -@step( - experiment_tracker="", - settings={ - "experiment_tracker": neptune_settings - } -) -def my_step( - x_test: np.ndarray, - y_test: np.ndarray, - model: tf.keras.Model, -) -> float: - """Log metadata to Neptune run""" - neptune_run = get_neptune_run() - ... -``` - -### Neptune UI - -Neptune comes with a web-based UI that you can use to find further details about your tracked experiments. Each pipeline run will be logged as a separate experiment run in Neptune, which you can inspect in the Neptune UI. - -You can find the URL of the Neptune run linked to a specific ZenML run printed on the console whenever a Neptune run is initialized. - -### Further reading - -Check [Neptune's docs](https://docs.neptune.ai/integrations/zenml/) for further information on how to use this integration and Neptune in general. diff --git a/develop/stack-components/experiment-trackers/wandb.mdx b/develop/stack-components/experiment-trackers/wandb.mdx deleted file mode 100644 index 70f6835..0000000 --- a/develop/stack-components/experiment-trackers/wandb.mdx +++ /dev/null @@ -1,197 +0,0 @@ ---- -title: "Weights & Biases" -description: "Logging and visualizing experiments with Weights & Biases." -icon: database ---- - -The Weights & Biases Experiment Tracker is an [Experiment Tracker](/develop/stack-components/experiment-trackers) flavor provided with the Weights & Biases ZenML integration that uses [the Weights & Biases experiment tracking platform](https://wandb.ai/site/experiment-tracking) to log and visualize information from your pipeline steps (e.g. models, parameters, metrics). - -### When would you want to use it? - -[Weights & Biases](https://wandb.ai/site/experiment-tracking) is a very popular platform that you would normally use in the iterative ML experimentation phase to track and visualize experiment results. That doesn't mean that it cannot be repurposed to track and visualize the results produced by your automated pipeline runs, as you make the transition towards a more production-oriented workflow. - -You should use the Weights & Biases Experiment Tracker: - -* if you have already been using Weights & Biases to track experiment results for your project and would like to continue doing so as you are incorporating MLOps workflows and best practices in your project through ZenML. -* if you are looking for a more visually interactive way of navigating the results produced from your ZenML pipeline runs (e.g. models, metrics, datasets) -* if you would like to connect ZenML to Weights & Biases to share the artifacts and metrics logged by your pipelines with your team, organization, or external stakeholders - -You should consider one of the other [Experiment Tracker flavors](/develop/stack-components/experiment-trackers#experiment-tracker-flavors) if you have never worked with Weights & Biases before and would rather use another experiment tracking tool that you are more familiar with. - -### How do you deploy it? 
-
-The Weights & Biases Experiment Tracker flavor is provided by the Weights & Biases ZenML integration. You need to install it on your local machine to be able to register a Weights & Biases Experiment Tracker and add it to your stack:
-
-```bash
-zenml integration install wandb -y
-```
-
-The Weights & Biases Experiment Tracker needs to be configured with the credentials required to connect to the Weights & Biases platform using one of the [available authentication methods](/develop/stack-components/experiment-trackers/wandb#authentication-methods).
-
-#### Authentication Methods
-
-You need to configure the following credentials for authentication to the Weights & Biases platform:
-
-* `api_key`: Mandatory API key token of your Weights & Biases account.
-* `project_name`: The name of the project where you're sending the new run. If the project is not specified, the run is put in an "Uncategorized" project.
-* `entity`: An entity is a username or team name where you're sending runs. This entity must exist before you can send runs there, so make sure to create your account or team in the UI before starting to log runs. If you don't specify an entity, the run will be sent to your default entity, which is usually your username.
-
-
-This option configures the credentials for the Weights & Biases platform directly as stack component attributes.
-
-
- This is not recommended for production settings as the credentials won't be stored securely and will be clearly visible in the stack configuration.
-
-
-
-```bash
-# Register the Weights & Biases experiment tracker
-zenml experiment-tracker register wandb_experiment_tracker --flavor=wandb \
-    --entity= --project_name= --api_key=
-
-# Register and set a stack with the new experiment tracker
-zenml stack register custom_stack -e wandb_experiment_tracker ... --set
-```
-
-
- This method requires you to [configure a ZenML secret](/develop/usage/project-setup/use-secrets/interact-with-secrets) to store the Weights & Biases tracking service credentials securely.
-
-You can create the secret using the `zenml secret create` command:
-```bash
-zenml secret create wandb_secret \
-    --entity= \
-    --project_name= \
-    --api_key=
-```
-
-Once the secret is created, you can use it to configure the wandb Experiment Tracker:
-
-```shell
-# Reference the entity, project and api-key in our experiment tracker component
-zenml experiment-tracker register wandb_tracker \
-    --flavor=wandb \
-    --entity={{wandb_secret.entity}} \
-    --project_name={{wandb_secret.project_name}} \
-    --api_key={{wandb_secret.api_key}}
-    ...
-```
-
-
- Read more about [ZenML Secrets](/develop/usage/project-setup/use-secrets/interact-with-secrets) in the ZenML documentation.
-
-
-
-
-For more up-to-date information on the Weights & Biases Experiment Tracker implementation and its configuration, you can have a look at [the SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-wandb/#zenml.integrations.wandb.experiment%5Ftrackers.wandb%5Fexperiment%5Ftracker) .
-
-### How do you use it?
-
-To be able to log information from a ZenML pipeline step using the Weights & Biases Experiment Tracker component in the active stack, you need to enable an experiment tracker using the `@step` decorator.
Then use Weights & Biases logging or auto-logging capabilities as you would normally do, e.g.: - -```py -import wandb -from wandb.integration.keras import WandbCallback - - -@step(experiment_tracker="") -def tf_trainer( - config: TrainerConfig, - x_train: np.ndarray, - y_train: np.ndarray, - x_val: np.ndarray, - y_val: np.ndarray, -) -> tf.keras.Model: - ... - - model.fit( - x_train, - y_train, - epochs=config.epochs, - validation_data=(x_val, y_val), - callbacks=[ - WandbCallback( - log_evaluation=True, - validation_steps=16, - validation_data=(x_val, y_val), - ) - ], - ) - - metric = ... - - wandb.log({"": metric}) -``` - - -Instead of hardcoding an experiment tracker name, you can also use the [Client](/develop/sdk/python-client) to dynamically use the experiment tracker of your active stack: - -```py -from zenml.client import Client - -experiment_tracker = Client().active_stack.experiment_tracker - -@step(experiment_tracker=experiment_tracker.name) -def tf_trainer(...): - ... -``` - - -### Weights & Biases UI - -Weights & Biases comes with a web-based UI that you can use to find further details about your tracked experiments. - -Every ZenML step that uses Weights & Biases should create a separate experiment run which you can inspect in the Weights & Biases UI: - - - - - -You can find the URL of the Weights & Biases experiment linked to a specific ZenML run via the metadata of the step in which the experiment tracker was used: - -```py -from zenml.client import Client - -last_run = client.get_pipeline("").last_run -trainer_step = last_run.get_step("") -tracking_url = trainer_step.run_metadata["experiment_tracker_url"].value -print(tracking_url) -``` - -Alternatively, you can see an overview of all experiment runs at https://wandb.ai/{ENTITY\_NAME}/{PROJECT\_NAME}/runs/. - - - The naming convention of each Weights & Biases experiment run is `{pipeline_run_name}_{step_name}` (e.g. `wandb_example_pipeline-25_Apr_22-20_06_33_535737_tf_evaluator`) and each experiment run will be tagged with both `pipeline_name` and `pipeline_run_name`, which you can use to group and filter experiment runs. - - -#### Additional configuration - -For additional configuration of the Weights & Biases experiment tracker, you can pass `WandbExperimentTrackerSettings` to overwrite the [wandb.Settings](https://github.com/wandb/client/blob/master/wandb/sdk/wandb%5Fsettings.py#L353) or pass additional tags for your runs: - -```py -import wandb -from zenml.integrations.wandb.flavors.wandb_experiment_tracker_flavor import WandbExperimentTrackerSettings - -wandb_settings = WandbExperimentTrackerSettings( - settings=wandb.Settings(magic=True), - tags=["some_tag"] -) - - -@step( - experiment_tracker="", - settings={ - "experiment_tracker": wandb_settings - } -) -def my_step( - x_test: np.ndarray, - y_test: np.ndarray, - model: tf.keras.Model, -) -> float: - """Everything in this step is auto-logged""" - ... -``` - -Doing the above auto-magically logs all the data, metrics, and results within the step, no further action is required! - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-wandb/#zenml.integrations.wandb.flavors.wandb%5Fexperiment%5Ftracker%5Fflavor.WandbExperimentTrackerSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. 
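-
-Since each experiment run is tagged with the pipeline name and pipeline run name, you can also look the runs up programmatically via the Weights & Biases public API instead of the UI. The snippet below is only a sketch: it assumes the `wandb` client is installed and authenticated, and the entity, project, and pipeline name values are placeholders:
-
-```py
-import wandb
-
-api = wandb.Api()
-
-# Fetch all runs that carry the tag ZenML applied for a given pipeline
-runs = api.runs(
-    "<ENTITY_NAME>/<PROJECT_NAME>",
-    filters={"tags": {"$in": ["<PIPELINE_NAME>"]}},
-)
-
-for run in runs:
-    print(run.name, run.url)
-```
-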
diff --git a/develop/stack-components/feature-stores/custom.mdx b/develop/stack-components/feature-stores/custom.mdx deleted file mode 100644 index 91ef256..0000000 --- a/develop/stack-components/feature-stores/custom.mdx +++ /dev/null @@ -1,17 +0,0 @@ ---- -title: "Develop a Custom Feature Store" -description: "Learning how to develop a custom feature store." -icon: store ---- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - -Feature stores allow data teams to serve data via an offline store, and an online low-latency store where data is kept in sync between the two. It also offers a centralized registry where features (and feature schemas) are stored for use within a team or wider organization. - - -**Base abstraction in progress!** - -We are actively working on the base abstraction for the feature stores, which will be available soon. As a result, their extension is not possible at the moment. If you would like to use a feature store in your stack, please check the list of already available feature stores down below. - diff --git a/develop/stack-components/feature-stores/feast.mdx b/develop/stack-components/feature-stores/feast.mdx deleted file mode 100644 index d6141c2..0000000 --- a/develop/stack-components/feature-stores/feast.mdx +++ /dev/null @@ -1,118 +0,0 @@ ---- -title: "Feast" -description: "Managing data in Feast feature stores." -icon: road ---- - -Feast (Feature Store) is an operational data system for managing and serving machine learning features to models in production. Feast is able to serve feature data to models from a low-latency online store (for real-time prediction) or from an offline store (for scale-out batch scoring or model training). - -### When would you want to use it? - -There are two core functions that feature stores enable: - -* access to data from an offline / batch store for training. -* access to online data at inference time. - -Feast integration currently supports your choice of offline data sources for your online feature serving. We encourage users to check out [Feast's documentation](https://docs.feast.dev/) and [guides](https://docs.feast.dev/how-to-guides/) on how to set up your offline and online data sources via the configuration `yaml` file. - - - COMING SOON: While the ZenML integration has an interface to access online feature store data, it currently is not usable in production settings with deployed models. We will update the docs when we enable this functionality. - - -### How to deploy it? - -ZenML assumes that users already have a Feast feature store that they just need to connect with. If you don't have a feature store yet, follow the [Feast Documentation](https://docs.feast.dev/how-to-guides/feast-snowflake-gcp-aws/deploy-a-feature-store) to deploy one first. - -To use the feature store as a ZenML stack component, you also need to install the corresponding `feast` integration in ZenML: - -```bash -zenml integration install feast -``` - -Now you can register your feature store as a ZenML stack component and add it into a corresponding stack: - -```bash -zenml feature-store register feast_store --flavor=feast --feast_repo="" - -zenml stack register ... -f feast_store -``` - -### How do you use it? 
- - - Online data retrieval is possible in a local setting, but we don't currently support using the online data serving in the context of a deployed model or as part of model deployment. We will update this documentation as we develop this feature. - - -Getting features from a registered and active feature store is possible by creating your own step that interfaces into the feature store: - -```py -from datetime import datetime -from typing import Any, Dict, List, Union -import pandas as pd - -from zenml import pipeline, step -from zenml.client import Client -from zenml.exceptions import DoesNotExistException - - -@step -def get_historical_features( - entity_dict: Union[Dict[str, Any], str], - features: List[str], - full_feature_names: bool = False -) -> pd.DataFrame: - """Feast Feature Store historical data step - - Returns: - The historical features as a DataFrame. - """ - feature_store = Client().active_stack.feature_store - if not feature_store: - raise DoesNotExistException( - "The Feast feature store component is not available. " - "Please make sure that the Feast stack component is registered as part of your current active stack." - ) - - entity_dict["event_timestamp"] = [ - datetime.fromisoformat(val) - for val in entity_dict["event_timestamp"] - ] - entity_df = pd.DataFrame.from_dict(entity_dict) - - return feature_store.get_historical_features( - entity_df=entity_df, - features=features, - full_feature_names=full_feature_names, - ) - - -entity_dict = { - "driver_id": [1001, 1002, 1003], - "label_driver_reported_satisfaction": [1, 5, 3], - "event_timestamp": [ - datetime(2021, 4, 12, 10, 59, 42).isoformat(), - datetime(2021, 4, 12, 8, 12, 10).isoformat(), - datetime(2021, 4, 12, 16, 40, 26).isoformat(), - ], - "val_to_add": [1, 2, 3], - "val_to_add_2": [10, 20, 30], -} - - -features = [ - "driver_hourly_stats:conv_rate", - "driver_hourly_stats:acc_rate", - "driver_hourly_stats:avg_daily_trips", - "transformed_conv_rate:conv_rate_plus_val1", - "transformed_conv_rate:conv_rate_plus_val2", -] - -@pipeline -def my_pipeline(): - my_features = get_historical_features(entity_dict, features) - ... -``` - - -Note that ZenML's use of Pydantic to serialize and deserialize inputs stored in the ZenML metadata means that we are limited to basic data types. Pydantic cannot handle Pandas `DataFrame`s, for example, or `datetime` values, so in the above code you can see that we have to convert them at various points. - -For more information and a full list of configurable attributes of the Feast feature store, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-feast/#zenml.integrations.feast.feature%5Fstores.feast%5Ffeature%5Fstore.FeastFeatureStore) . diff --git a/develop/stack-components/feature-stores/feature-stores.mdx b/develop/stack-components/feature-stores/feature-stores.mdx deleted file mode 100644 index 4e0a6b8..0000000 --- a/develop/stack-components/feature-stores/feature-stores.mdx +++ /dev/null @@ -1,39 +0,0 @@ ---- -title: "Overview" -description: "Managing data in feature stores." -icon: shop ---- - -Feature stores allow data teams to serve data via an offline store and an online low-latency store where data is kept in sync between the two. They also offer a centralized registry where features (and feature schemas) are stored for use within a team or wider organization.
- -As a data scientist working on training your model, your requirements for how you access your batch / 'offline' data will almost certainly be different from how you access that data as part of a real-time or online inference setting. Feast solves the problem of developing [train-serve skew](https://ploomber.io/blog/train-serve-skew/) where those two sources of data diverge from each other. - -Feature stores are a relatively recent addition to commonly-used machine learning stacks. - -### When to use it - -The feature store is an optional stack component in the ZenML Stack. The feature store as a technology should be used to store the features and inject them into the process on the server side. This includes - -* Productionalize new features -* Reuse existing features across multiple pipelines and models -* Achieve consistency between training and serving data (Training Serving Skew) -* Provide a central registry of features and feature schemas - -### List of available feature stores - -For production use cases, some more flavors can be found in specific `integrations` modules. In terms of features stores, ZenML features an integration of `feast`. - -| Feature Store | Flavor | Integration | Notes | -| ---------------------------------------------------------------- | -------- | ------------------------------------------------------------------------ | ----------------------------------------- | -| [FeastFeatureStore](/develop/stack-components/feature-stores/feast) | feast | feast | Connect ZenML with already existing Feast | -| [Custom Implementation](/develop/stack-components/feature-stores/custom) | _custom_ | Extend the feature store abstraction and provide your own implementation | | - -If you would like to see the available flavors for feature stores, you can use the command: - -```bash -zenml feature-store flavor list -``` - -### How to use it - -The available implementation of the feature store is built on top of the feast integration, which means that using a feature store is no different from what's described on the [feast page: How to use it?](/develop/stack-components/feature-stores/feast#how-do-you-use-it). diff --git a/develop/stack-components/image-builders/custom.mdx b/develop/stack-components/image-builders/custom.mdx deleted file mode 100644 index c3bd559..0000000 --- a/develop/stack-components/image-builders/custom.mdx +++ /dev/null @@ -1,114 +0,0 @@ ---- -title: "Develop a Custom Image Builder" -description: "Learning how to develop a custom image builder." -icon: sim-card ---- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - -### Base Abstraction - -The `BaseImageBuilder` is the abstract base class that needs to be subclassed in order to create a custom component that can be used to build Docker images. 
As image builders can come in many shapes and forms, the base class exposes a deliberately basic and generic interface: - -```py -from abc import ABC, abstractmethod -from typing import Any, Dict, Optional, Type, cast - -from zenml.container_registries import BaseContainerRegistry -from zenml.enums import StackComponentType -from zenml.image_builders import BuildContext -from zenml.stack import Flavor, StackComponent -from zenml.stack.stack_component import StackComponentConfig - - -class BaseImageBuilder(StackComponent, ABC): - """Base class for all ZenML image builders.""" - - @property - def build_context_class(self) -> Type["BuildContext"]: - """Build context class to use. - - The default build context class creates a build context that works - for the Docker daemon. Override this method if your image builder - requires a custom context. - - Returns: - The build context class. - """ - return BuildContext - - @abstractmethod - def build( - self, - image_name: str, - build_context: "BuildContext", - docker_build_options: Dict[str, Any], - container_registry: Optional["BaseContainerRegistry"] = None, - ) -> str: - """Builds a Docker image. - - If a container registry is passed, the image will be pushed to that - registry. - - Args: - image_name: Name of the image to build. - build_context: The build context to use for the image. - docker_build_options: Docker build options. - container_registry: Optional container registry to push to. - - Returns: - The Docker image repo digest or name. - """ -``` - - - This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. In order to see the full implementation and get the complete docstrings, please check [the source code on GitHub](https://github.com/zenml-io/zenml/blob/main/src/zenml/image%5Fbuilders/base%5Fimage%5Fbuilder.py) . - - -### Build your own custom image builder - -If you want to create your own custom flavor for an image builder, you can follow the following steps: - -1. Create a class that inherits from the `BaseImageBuilder` class and implement the abstract `build` method. This method should use the given build context and build a Docker image with it. If additionally a container registry is passed to the `build` method, the image builder is also responsible for pushing the image there. -2. If you need to provide any configuration, create a class that inherits from the `BaseImageBuilderConfig` class and adds your configuration parameters. -3. Bring both the implementation and the configuration together by inheriting from the `BaseImageBuilderFlavor` class. Make sure that you give a `name` to the flavor through its abstract property. - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml image-builder flavor register -``` - -For example, if your flavor class `MyImageBuilderFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml image-builder flavor register flavors.my_flavor.MyImageBuilderFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. 
- -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root. - - -Afterward, you should see the new flavor in the list of available flavors: - -```bash -zenml image-builder flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. - -* The **CustomImageBuilderFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomImageBuilderConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` objects are inherently `pydantic` objects, you can also add your own custom validators here. -* The **CustomImageBuilder** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomImageBuilderFlavor` and the `CustomImageBuilderConfig` are implemented in a different module/path than the actual `CustomImageBuilder`). - - -#### Using a custom-build context - -The `BaseImageBuilder` abstraction uses the `build_context_class` to provide a class that should be used as the build context. In case your custom image builder requires a different build context than the default Docker build context, you can subclass the `BuildContext` class to customize the structure of your build context. In your image builder implementation, you can then overwrite the `build_context_class` property to specify your build context subclass. diff --git a/develop/stack-components/image-builders/gcp.mdx b/develop/stack-components/image-builders/gcp.mdx deleted file mode 100644 index 6f10658..0000000 --- a/develop/stack-components/image-builders/gcp.mdx +++ /dev/null @@ -1,188 +0,0 @@ ---- -title: "Google Cloud Image Builder" -description: "Building container images with Google Cloud Build" -icon: cloud ---- - -The Google Cloud image builder is an [image builder](/develop/stack-components/image-builders) flavor provided by the ZenML `gcp` integration that uses [Google Cloud Build](https://cloud.google.com/build) to build container images. - -### When to use it - -You should use the Google Cloud image builder if: - -* you're **unable** to install or use [Docker](https://www.docker.com) on your client machine. -* you're already using GCP. -* your stack is mainly composed of other Google Cloud components such as the [GCS Artifact Store](/develop/stack-components/artifact-stores/gcp) or the [Vertex Orchestrator](/develop/stack-components/orchestrators/vertex). - -### How to deploy it - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including the Google Cloud image builder? 
Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML GCP Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -In order to use the ZenML Google Cloud image builder you need to enable Google Cloud Build relevant APIs on the Google Cloud project. - -### How to use it - -To use the Google Cloud image builder, we need: - -* The ZenML `gcp` integration installed. If you haven't done so, run: -```bash -zenml integration install gcp -``` -* A [GCP Artifact Store](/develop/stack-components/artifact-stores/gcp) where the build context will be uploaded, so Google Cloud Build can access it. -* A [GCP container registry](/develop/stack-components/container-registries/gcp) where the built image will be pushed. -* Optionally, the GCP project ID in which you want to run the build and a service account with the needed permissions to run the build. If not provided, then the project ID and credentials will be inferred from the environment. -* Optionally, you can change: - * the Docker image used by Google Cloud Build to execute the steps to build and push the Docker image. By default, the builder image will be `'gcr.io/cloud-builders/docker'`. - * The network to which the container used to build the ZenML pipeline Docker image will be attached. More information: [Cloud build network](https://cloud.google.com/build/docs/build-config-file-schema#network). - * The build timeout for the build, and for the blocking operation waiting for the build to finish. More information: [Build Timeout](https://cloud.google.com/build/docs/build-config-file-schema#timeout%5F2). - -We can register the image builder and use it in our active stack: - -```bash -zenml image-builder register \ - --flavor=gcp \ - --cloud_builder_image= \ - --network= \ - --build_timeout= - -# Register and activate a stack with the new image builder -zenml stack register -i ... --set -``` - -You also need to set up [authentication](/develop/stack-components/image-builders/gcp#authentication-methods) required to access the Cloud Build GCP services. - -#### Authentication Methods - -Integrating and using a GCP Image Builder in your pipelines is not possible without employing some form of authentication. If you're looking for a quick way to get started locally, you can use the _Local Authentication_ method. However, the recommended way to authenticate to the GCP cloud platform is through [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector). This is particularly useful if you are configuring ZenML stacks that combine the GCP Image Builder with other remote stack components also running in GCP. - - - This method uses the implicit GCP authentication available _in the environment where the ZenML code is running_. On your local machine, this is the quickest way to configure a GCP Image Builder. You don't need to supply credentials explicitly when you register the GCP Image Builder, as it leverages the local credentials and configuration that the Google Cloud CLI stores on your local machine. 
However, you will need to install and set up the Google Cloud CLI on your machine as a prerequisite, as covered in [the Google Cloud documentation](https://cloud.google.com/sdk/docs/install-sdk) , before you register the GCP Image Builder. - - - Stacks using the GCP Image Builder set up with local authentication are not portable across environments. To make ZenML pipelines fully portable, it is recommended to use [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) to authenticate your GCP Image Builder to the GCP cloud platform. - - - -To set up the GCP Image Builder to authenticate to GCP and access the GCP Cloud Build services, it is recommended to leverage the many features provided by [the GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) such as auto-configuration, best security practices regarding long-lived credentials and reusing the same credentials across multiple stack components. - -If you don't already have a GCP Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You also have the option to configure a GCP Service Connector that can be used to access more than just the GCP Cloud Build service: - -```bash -zenml service-connector register --type gcp -i -``` - -A non-interactive CLI example that leverages [the Google Cloud CLI configuration](https://cloud.google.com/sdk/docs/install-sdk) on your local machine to auto-configure a GCP Service Connector for the GCP Cloud Build service: - -```bash -zenml service-connector register --type gcp --resource-type gcp-generic --resource-name --auto-configure -``` - -```bash Example Command Output - -$ zenml service-connector register gcp-generic --type gcp --resource-type gcp-generic --auto-configure -Successfully registered service connector `gcp-generic` with access to the following resources: -┏━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠────────────────┼────────────────┨ -┃ 🔵 gcp-generic │ zenml-core ┃ -┗━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -> **Note**: Please remember to grant the entity associated with your GCP credentials permissions to access the Cloud Build API and to run Cloud Builder jobs (e.g. the [Cloud Build Editor IAM role](https://cloud.google.com/build/docs/iam-roles-permissions#predefined%5Froles)). The GCP Service Connector supports [many different authentication methods](/develop/usage/resource-data-management/auth-management/gcp-service-connector#authentication-methods) with different levels of security and convenience. You should pick the one that best fits your use case. 
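For example, such a role binding might be granted with the gcloud CLI; the project ID and service account e-mail below are placeholders, and the exact principal depends on which credentials your Service Connector uses:

```bash
# Allow the principal behind the connector's credentials to run Cloud Build jobs
gcloud projects add-iam-policy-binding <PROJECT_ID> \
  --member="serviceAccount:<SA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com" \
  --role="roles/cloudbuild.builds.editor"
```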
- -If you already have one or more GCP Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the generic GCP resources required by your GCP Image Builder by running e.g.: - -```bash -zenml service-connector list-resources --resource-type gcp-generic -``` - -```bash Example Command Output -The following 'gcp-generic' resources can be accessed by service connectors that you have configured: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────┼────────────────┨ -┃ bfdb657d-d808-47e7-9974-9ba6e4919d83 │ gcp-generic │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on a GCP Service Connector to use to authenticate to GCP, you can register the GCP Image Builder as follows: - -```bash -zenml image-builder register \ - --flavor=gcp \ - --cloud_builder_image= \ - --network= \ - --build_timeout= - -# Connect the GCP Image Builder to GCP via a GCP Service Connector -zenml image-builder connect -i -``` - -A non-interactive version that connects the GCP Image Builder to a target GCP Service Connector: - -```bash -zenml image-builder connect --connector -``` - -```bash Example Command Output -$ zenml image-builder connect gcp-image-builder --connector gcp-generic -Successfully connected image builder `gcp-image-builder` to the following resources: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────┼────────────────┨ -┃ bfdb657d-d808-47e7-9974-9ba6e4919d83 │ gcp-generic │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -As a final step, you can use the GCP Image Builder in a ZenML Stack: - -```bash -# Register and set a stack with the new image builder -zenml stack register -i ... --set -``` - - - -When you register the GCP Image Builder, you can [generate a GCP Service Account Key](https://cloud.google.com/docs/authentication/application-default-credentials#attached-sa), save it to a local file and then reference it in the Image Builder configuration. - -This method has the advantage that you don't need to install and configure the GCP CLI on your host, but it's still not as secure as using a GCP Service Connector and the stack component configuration is not portable to other hosts. - -For this method, you need to [create a user-managed GCP service account](https://cloud.google.com/iam/docs/service-accounts-create), and grant it privileges to access the Cloud Build API and to run Cloud Builder jobs (e.g. the [Cloud Build Editor IAM role](https://cloud.google.com/build/docs/iam-roles-permissions#predefined%5Froles)).
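As a rough sketch of that preparation with the gcloud CLI (the service account name, project ID, and key path are placeholders; the role grant works as in the note above):

```bash
# Create a dedicated, user-managed service account for the image builder
gcloud iam service-accounts create zenml-cloud-build --project=<PROJECT_ID>

# Grant it the Cloud Build Editor role (see the note above), then download a
# key file that the image builder configuration can reference
gcloud iam service-accounts keys create /path/to/gcp-key.json \
  --iam-account=zenml-cloud-build@<PROJECT_ID>.iam.gserviceaccount.com
```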
- -With the service account key downloaded to a local file, you can register the GCP Image Builder as follows: - -```bash -zenml image-builder register \ - --flavor=gcp \ - --project= \ - --service_account_path= \ - --cloud_builder_image= \ - --network= \ - --build_timeout= - -# Register and set a stack with the new image builder -zenml stack register -i ... --set -``` - - - - -### Caveats - -As described in this [Google Cloud Build documentation page](https://cloud.google.com/build/docs/build-config-file-schema#network), Google Cloud Build uses containers to execute the build steps which are automatically attached to a network called `cloudbuild` that provides some Application Default Credentials (ADC), that allow the container to be authenticated and therefore use other GCP services. - -By default, the GCP Image Builder is executing the build command of the ZenML Pipeline Docker image with the option `--network=cloudbuild`, so the ADC provided by the `cloudbuild` network can also be used in the build. This is useful if you want to install a private dependency from a GCP Artifact Registry, but you will also need to use a [custom base parent image](/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline) with the [keyrings.google-artifactregistry-auth](https://pypi.org/project/keyrings.google-artifactregistry-auth/) installed, so `pip` can connect and authenticate in the private artifact registry to download the dependency. - -```bash -FROM zenmldocker/zenml:latest - -RUN pip install keyrings.google-artifactregistry-auth -``` - -The above `Dockerfile` uses `zenmldocker/zenml:latest` as a base image, but is recommended to change the tag to specify the ZenML version and Python version like `0.33.0-py3.10`. diff --git a/develop/stack-components/image-builders/image-builders.mdx b/develop/stack-components/image-builders/image-builders.mdx deleted file mode 100644 index c9bd89f..0000000 --- a/develop/stack-components/image-builders/image-builders.mdx +++ /dev/null @@ -1,32 +0,0 @@ ---- -title: "Overview" -description: "Building container images for your ML workflow." -icon: image ---- - -The image builder is an essential part of most remote MLOps stacks. It is used to build container images such that your machine-learning pipelines and steps can be executed in remote environments. - -### When to use it - -The image builder is needed whenever other components of your stack need to build container images. Currently, this is the case for most of ZenML's remote [orchestrators](/develop/stack-components/orchestrators) , [step operators](/develop/stack-components/step-operators), and some [model deployers](/develop/stack-components/model-deployers). These containerize your pipeline code and therefore require an image builder to build [Docker](https://www.docker.com/) images. - -### Image Builder Flavors - -Out of the box, ZenML comes with a `local` image builder that builds Docker images on your client machine. Additional image builders are provided by integrations: - -| Image Builder | Flavor | Integration | Notes | -| ---------------------------------------------------------------- | -------- | ------------------------------------------------------------------------ | ----------------------------------------------------- | -| [LocalImageBuilder](/develop/stack-components/image-builders/local) | local | _built-in_ | Builds your Docker images locally. 
| -| [KanikoImageBuilder](/develop/stack-components/image-builders/kaniko) | kaniko | kaniko | Builds your Docker images in Kubernetes using Kaniko. | -| [GCPImageBuilder](/develop/stack-components/image-builders/gcp) | gcp | gcp | Builds your Docker images using Google Cloud Build. | -| [Custom Implementation](/develop/stack-components/image-builders/custom) | _custom_ | Extend the image builder abstraction and provide your own implementation | | - -If you would like to see the available flavors of image builders, you can use the command: - -```bash -zenml image-builder flavor list -``` - -### How to use it - -You don't need to directly interact with any image builder in your code. As long as the image builder that you want to use is part of your active [ZenML stack](/develop/user-guide/guides/production-guide/understand-stacks), it will be used automatically by any component that needs to build container images. diff --git a/develop/stack-components/image-builders/kaniko.mdx b/develop/stack-components/image-builders/kaniko.mdx deleted file mode 100644 index 8d8912b..0000000 --- a/develop/stack-components/image-builders/kaniko.mdx +++ /dev/null @@ -1,151 +0,0 @@ ---- -title: "Kaniko Image Builder" -description: "Building container images with Kaniko." -icon: hippo ---- - -The Kaniko image builder is an [image builder](/develop/stack-components/image-builders) flavor provided by the ZenML `kaniko` integration that uses [Kaniko](https://github.com/GoogleContainerTools/kaniko) to build container images. - -### When to use it - -You should use the Kaniko image builder if: - -* you're **unable** to install or use [Docker](https://www.docker.com) on your client machine. -* you're familiar with/already using Kubernetes. - -### How to deploy it - -In order to use the Kaniko image builder, you need a deployed Kubernetes cluster. - -### How to use it - -To use the Kaniko image builder, we need: - -* The ZenML `kaniko` integration installed. If you haven't done so, run - -```bash -zenml integration install kaniko -``` -* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed. -* A [remote container registry](/develop/stack-components/container-registries) as part of your stack. -* By default, the Kaniko image builder transfers the build context using the Kubernetes API. If you instead want to transfer the build context by storing it in the artifact store, you need to register it with the `store_context_in_artifact_store` attribute set to `True`. In this case, you also need a [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. -* Optionally, you can change the timeout (in seconds) until the Kaniko pod is running in the orchestrator using the `pod_running_timeout` attribute. - -We can then register the image builder and use it in our active stack: - -```bash -zenml image-builder register \ - --flavor=kaniko \ - --kubernetes_context= - [ --pod_running_timeout= ] - -# Register and activate a stack with the new image builder -zenml stack register -i ... --set -``` - -For more information and a full list of configurable attributes of the Kaniko image builder, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kaniko/#zenml.integrations.kaniko.image%5Fbuilders.kaniko%5Fimage%5Fbuilder.KanikoImageBuilder) . - -#### Authentication for the container registry and artifact store - -The Kaniko image builder will create a Kubernetes pod that is running the build. 
This build pod needs to be able to pull from/push to certain container registries, and depending on the stack component configuration also needs to be able to read from the artifact store: - -* The pod needs to be authenticated to push to the container registry in your active stack. -* In case the [parent image](/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline#using-a-custom-parent-image) you use in your `DockerSettings` is stored in a private registry, the pod needs to be authenticated to pull from this registry. -* If you configured your image builder to store the build context in the artifact store, the pod needs to be authenticated to read files from the artifact store storage. - -ZenML is not yet able to handle setting all of the credentials of the various combinations of container registries and artifact stores on the Kaniko build pod, which is you're required to set this up yourself for now. The following section outlines how to handle it in the most straightforward (and probably also most common) scenario, when the Kubernetes cluster you're using for the Kaniko build is hosted on the same cloud provider as your container registry (and potentially the artifact store). For all other cases, check out the [official Kaniko repository](https://github.com/GoogleContainerTools/kaniko) for more information. - - - * Add permissions to push to ECR by attaching the `EC2InstanceProfileForImageBuilderECRContainerBuilds` policy to your [EKS node IAM role](https://docs.aws.amazon.com/eks/latest/userguide/create-node-role.html). -* Configure the image builder to set some required environment variables on the Kaniko build pod: - -```bash -# register a new image builder with the environment variables - -zenml image-builder register \ - --flavor=kaniko \ - --kubernetes_context= \ - --env='[{"name": "AWS_SDK_LOAD_CONFIG", "value": "true"}, {"name": "AWS_EC2_METADATA_DISABLED", "value": "true"}]' - -# or update an existing one - -zenml image-builder update \ - --env='[{"name": "AWS_SDK_LOAD_CONFIG", "value": "true"}, {"name": "AWS_EC2_METADATA_DISABLED", "value": "true"}]' -``` - -Check out [the Kaniko docs](https://github.com/GoogleContainerTools/kaniko#pushing-to-amazon-ecr) for more information. - - -* [Enable workload identity](https://cloud.google.com/kubernetes-engine/docs/versions/0.66.0/how-to/workload-identity#enable%5Fon%5Fcluster) for your cluster -* Follow the steps described [here](https://cloud.google.com/kubernetes-engine/docs/versions/0.66.0/how-to/workload-identity#authenticating%5Fto) to create a Google service account, a Kubernetes service account as well as an IAM policy binding between them. -* Grant the Google service account permissions to push to your GCR registry and read from your GCP bucket. -* Configure the image builder to run in the correct namespace and use the correct service account: - -```bash -# register a new image builder with namespace and service account -zenml image-builder register \ - --flavor=kaniko \ - --kubernetes_context= \ - --kubernetes_namespace= \ - --service_account_name= - # --executor_args='["--compressed-caching=false", "--use-new-run=true"]' - -# or update an existing one -zenml image-builder update \ - --kubernetes_namespace= \ - --service_account_name= -``` - -Check out [the Kaniko docs](https://github.com/GoogleContainerTools/kaniko#pushing-to-google-gcr) for more information. 
- - -* Create a Kubernetes `configmap` for a Docker config that uses the Azure credentials helper: - -```bash -kubectl create configmap docker-config --from-literal='config.json={ "credHelpers": { "mycr.azurecr.io": "acr-env" } }' -``` - -* Follow [these steps](https://learn.microsoft.com/en-us/azure/aks/use-managed-identity) to configure your cluster to use a managed identity -* Configure the image builder to mount the `configmap` in the Kaniko build pod: - -```bash -# register a new image builder with the mounted configmap -zenml image-builder register \ - --flavor=kaniko \ - --kubernetes_context= \ - --volume_mounts='[{"name": "docker-config", "mountPath": "/kaniko/.docker/"}]' \ - --volumes='[{"name": "docker-config", "configMap": {"name": "docker-config"}}]' - # --executor_args='["--compressed-caching=false", "--use-new-run=true"]' - -# or update an existing one -zenml image-builder update \ - --volume_mounts='[{"name": "docker-config", "mountPath": "/kaniko/.docker/"}]' \ - --volumes='[{"name": "docker-config", "configMap": {"name": "docker-config"}}]' -``` - -Check out [the Kaniko docs](https://github.com/GoogleContainerTools/kaniko#pushing-to-azure-container-registry) for more information. - - - -#### Passing additional parameters to the Kaniko build - -You can pass additional parameters to the Kaniko build by setting the `executor_args` attribute of the image builder. - -```bash -zenml image-builder register \ - --flavor=kaniko \ - --kubernetes_context= \ - --executor_args='["--label", "key=value"]' # Adds a label to the final image -``` - -List of some possible additional flags: - -* `--cache`: Set to `false` to disable caching. Defaults to `true`. -* `--cache-dir`: Set the directory where to store cached layers. Defaults to `/cache`. -* `--cache-repo`: Set the repository where to store cached layers. Defaults to `gcr.io/kaniko-project/executor`. -* `--cache-ttl`: Set the cache expiration time. Defaults to `24h`. -* `--cleanup`: Set to `false` to disable cleanup of the working directory. Defaults to `true`. -* `--compressed-caching`: Set to `false` to disable compressed caching. Defaults to `true`. - -For a full list of possible flags, check out the [Kaniko additional flags](https://github.com/GoogleContainerTools/kaniko#additional-flags) - diff --git a/develop/stack-components/image-builders/local.mdx b/develop/stack-components/image-builders/local.mdx deleted file mode 100644 index ae5b823..0000000 --- a/develop/stack-components/image-builders/local.mdx +++ /dev/null @@ -1,47 +0,0 @@ ---- -title: "Local Image Builder" -description: "Building container images locally." -icon: folder ---- - -The local image builder is an [image builder](/develop/stack-components/image-builders) flavor that comes built-in with ZenML and uses the local Docker installation on your client machine to build container images. - - - ZenML uses the official Docker Python library to build and push your images. This library loads its authentication credentials to push images from the default config location: `$HOME/.docker/config.json`. If your Docker configuration is stored in a different directory, you can use the environment variable `DOCKER_CONFIG` to override this behavior: - -```bash -export DOCKER_CONFIG=/path/to/config_dir -``` - -The directory that you specify here must contain your Docker configuration in a file called `config.json`. - - -### When to use it - -You should use the local image builder if: - -* you're able to install and use [Docker](https://www.docker.com) on your client machine. 
-* you want to use remote components that require containerization without the additional hassle of configuring infrastructure for an additional component. - -### How to deploy it - -The local image builder comes with ZenML and works without any additional setup. - -### How to use it - -To use the Local image builder, we need: - -* [Docker](https://www.docker.com) installed and running. -* The Docker client authenticated to push to the container registry that you intend to use in the same stack. - -We can then register the image builder and use it to create a new stack: - -```bash -zenml image-builder register --flavor=local - -# Register and activate a stack with the new image builder - -zenml stack register -i ... --set -``` - -For more information and a full list of configurable attributes of the local image builder, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-image%5Fbuilders/#zenml.image%5Fbuilders.local%5Fimage%5Fbuilder.LocalImageBuilder) . diff --git a/develop/stack-components/model-deployers/bentoml.mdx b/develop/stack-components/model-deployers/bentoml.mdx deleted file mode 100644 index 61a1d68..0000000 --- a/develop/stack-components/model-deployers/bentoml.mdx +++ /dev/null @@ -1,255 +0,0 @@ ---- -title: "BentoML" -description: "Deploying your models locally with BentoML." -icon: laptop-code ---- - -BentoML is an open-source framework for machine learning model serving. it can be used to deploy models locally, in a cloud environment, or in a Kubernetes environment. - - - The BentoML Model Deployer is one of the available flavors of the [Model Deployer](/develop/stack-components/model-deployers) stack component. Provided with the BentoML integration it can be used to deploy and manage [BentoML models](https://docs.bentoml.org/en/latest/concepts/model.html) or [Bento](https://docs.bentoml.org/en/latest/concepts/bento.html) on a local running HTTP server. - -The BentoML Model Deployer can be used to deploy models for local development and production use cases. While the integration mainly works in a local environment where pipelines are run, the used [Bento](https://docs.bentoml.org/en/latest/concepts/bento.html) can be exported and containerized, and deployed in a remote environment. Within the BentoML ecosystem, [Yatai](https://github.com/bentoml/Yatai) and [bentoctl](https://github.com/bentoml/bentoctl) are the tools responsible for deploying the Bentos into the Kubernetes cluster and Cloud Platforms. Full support for these advanced tools is in progress and will be available soon. - -## When to use it? - -You should use the BentoML Model Deployer to: - -* Standardize the way you deploy your models to production within your organization. -* if you are looking to deploy your models in a simple way, while you are still able to transform your model into a production-ready solution when that time comes. - -If you are looking to deploy your models with other Kubernetes-based solutions, you can take a look at one of the other [Model Deployer Flavors](/develop/stack-components/model-deployers#model-deployers-flavors) available in ZenML. - -BentoML also allows you to deploy your models in a more complex production-grade setting. [Bentoctl](https://github.com/bentoml/bentoctl) is one of the tools that can help you get there. Bentoctl takes your built Bento from a ZenML pipeline and deploys it with `bentoctl` into a cloud environment such as AWS Lambda, AWS SageMaker, Google Cloud Functions, Google Cloud AI Platform, or Azure Functions. 
Read more about this in the [From Local to Cloud with bentoctl section](/develop/stack-components/model-deployers/bentoml#from-local-to-cloud-with-bentoctl). - - - The `bentoctl` integration implementation is still in progress and will be available soon. The integration will allow you to deploy your models to a specific cloud provider with just a few lines of code using ZenML built-in steps. - - -## How do you deploy it? - -Within ZenML you can quickly get started with BentoML by simply creating Model Deployer Stack Component with the BentoML flavor. To do so you'll need to install the required Python packages on your local machine to be able to deploy your models: - -```bash -zenml integration install bentoml -y -``` - -To register the BentoML model deployer with ZenML you need to run the following command: - -```bash -zenml model-deployer register bentoml_deployer --flavor=bentoml -``` - -The ZenML integration will provision a local HTTP deployment server as a daemon process that will continue to run in the background to serve the latest models and Bentos. - -## How do you use it? - -The recommended flow to use the BentoML model deployer is to first [create a BentoML Service](/develop/stack-components/model-deployers/bentoml#bentoml-service-and-runner), then [use the bento\_builder\_step](/develop/stack-components/model-deployers/bentoml#zenml-bento-builder-step) to build the model and service into a bento bundle, and finally [deploy the bundle with the bentoml\_model\_deployer\_step](/develop/stack-components/model-deployers/bentoml#zenml-bentoml-deployer-step). - -### BentoML Service and Runner - -The first step to being able to deploy your models and use BentoML is to create a [bento service](https://docs.bentoml.org/en/latest/concepts/service.html) which is the main logic that defines how your model will be served, and a [bento runner](https://docs.bentoml.org/en/latest/concepts/runner.html) which represents a unit of execution for your model on a remote Python worker. - -The following example shows how to create a basic bento service and runner that will be used to serve a basic scikit-learn model. - -```py -import numpy as np -import bentoml -from bentoml.io import NumpyNdarray - -iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner() - -svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner]) - - -@svc.api(input=NumpyNdarray(), output=NumpyNdarray()) -def classify(input_series: np.ndarray) -> np.ndarray: - result = iris_clf_runner.predict.run(input_series) - return result - -``` - -### ZenML Bento Builder step - -Once you have your bento service and runner defined, we can use the built-in bento builder step to build the bento bundle that will be used to serve the model. The following example shows how can call the built-in bento builder step within a ZenML pipeline. - -```py -from zenml import pipeline, step -from zenml.integrations.bentoml.steps import bento_builder_step - -@pipeline -def bento_builder_pipeline(): - model = ... - bento = bento_builder_step( - model=model, - model_name="pytorch_mnist", # Name of the model - model_type="pytorch", # Type of the model (pytorch, tensorflow, sklearn, xgboost..) 
- service="service.py:svc", # Path to the service file within zenml repo - labels={ # Labels to be added to the bento bundle - "framework": "pytorch", - "dataset": "mnist", - "zenml_version": "0.21.1", - }, - exclude=["data"], # Exclude files from the bento bundle - python={ - "packages": ["zenml", "torch", "torchvision"], - }, # Python package requirements of the model - ) -``` - -The Bento Builder step can be used in any orchestration pipeline that you create with ZenML. The step will build the bento bundle and save it to the active artifact store, which can then be used to serve the model in a local setting using the BentoML Model Deployer step, or in a remote setting using `bentoctl` or Yatai. This gives you the flexibility to package your model in a way that is ready for different deployment scenarios. - -### ZenML BentoML Deployer step - -We have now built our bento bundle, and we can use the built-in `bentoml_model_deployer_step` to deploy the bento bundle to our local HTTP server. The following example shows how to call the built-in bento deployer step within a ZenML pipeline. - -Note: the `bentoml_model_deployer_step` can only be used in a local environment. - - - -```py -from zenml import pipeline, step -from zenml.integrations.bentoml.steps import bentoml_model_deployer_step - -@pipeline -def bento_deployer_pipeline(): - bento = ... - deployed_model = bentoml_model_deployer_step( - bento=bento, # Bento bundle built in a previous step - model_name="pytorch_mnist", # Name of the model - port=3001, # Port to be used by the http server - ) -``` - -### ZenML BentoML Pipeline examples - -Once all the steps have been defined, we can create a ZenML pipeline and run it. The bento builder step expects to get the trained model as an input, so we need to make sure that we either have a previous step that trains the model and outputs it, or that we load the model from a previous run. The deployer step expects to get the bento bundle as an input, so we likewise need a previous step that builds the bento bundle and outputs it, or we need to load the bento bundle from a previous run or an external source. - -The following example shows how to create a ZenML pipeline that trains a model, builds a bento bundle, and deploys it to a local HTTP server. - -```py -# Import the pipeline to use the pipeline decorator -from zenml.pipelines import pipeline - - -# Pipeline definition -@pipeline -def bentoml_pipeline( - importer, - trainer, - evaluator, - deployment_trigger, - bento_builder, - deployer, -): - """Link all the steps and artifacts together""" - train_dataloader, test_dataloader = importer() - model = trainer(train_dataloader) - accuracy = evaluator(test_dataloader=test_dataloader, model=model) - decision = deployment_trigger(accuracy=accuracy) - bento = bento_builder(model=model) - deployer(deploy_decision=decision, bento=bento) - -``` - -In more complex scenarios, you might want to build a pipeline that trains a model and builds a bento bundle in a remote environment, and then create a second pipeline that retrieves the bento bundle and deploys it to a local HTTP server or to a cloud provider. The following example shows a pipeline that does exactly that.
- -```py -# Import the pipeline to use the pipeline decorator -from zenml.pipelines import pipeline - - -# Pipeline definition -@pipeline -def remote_train_pipeline( - importer, - trainer, - evaluator, - bento_builder, -): - """Link all the steps and artifacts together""" - train_dataloader, test_dataloader = importer() - model = trainer(train_dataloader) - accuracy = evaluator(test_dataloader=test_dataloader, model=model) - bento = bento_builder(model=model) - - -@pipeline -def local_deploy_pipeline( - bento_loader, - deployer, -): - """Link all the steps and artifacts together""" - bento = bento_loader() - deployer(deploy_decision=decision, bento=bento) - -``` - -### Predicting with the local deployed model - -Once the model has been deployed we can use the BentoML client to send requests to the deployed model. ZenML will automatically create a BentoML client for you and you can use it to send requests to the deployed model by simply calling the service to predict the method and passing the input data and the API function name. - -The following example shows how to use the BentoML client to send requests to the deployed model. - -```py -@step -def predictor( - inference_data: Dict[str, List], - service: BentoMLDeploymentService, -) -> None: - """Run an inference request against the BentoML prediction service. - - Args: - service: The BentoML service. - data: The data to predict. - """ - - service.start(timeout=10) # should be a NOP if already started - for img, data in inference_data.items(): - prediction = service.predict("predict_ndarray", np.array(data)) - result = to_labels(prediction[0]) - rich_print(f"Prediction for {img} is {result}") -``` - -Deploying and testing locally is a great way to get started and test your model. However, a real-world scenario will most likely require you to deploy your model to a remote environment. The next section will show you how to deploy the Bento you built with ZenML pipelines to a cloud environment using the `bentoctl` CLI. - -### From Local to Cloud with `bentoctl` - -Bentoctl helps deploy any machine learning models as production-ready API endpoints into the cloud. It is a command line tool that provides a simple interface to manage your BentoML bundles. - -The `bentoctl` CLI provides a list of operators which are plugins that interact with cloud services, some of these operators are: - -* [AWS Lambda](https://github.com/bentoml/aws-lambda-deploy) -* [AWS SageMaker](https://github.com/bentoml/aws-sagemaker-deploy) -* [AWS EC2](https://github.com/bentoml/aws-ec2-deploy) -* [Google Cloud Run](https://github.com/bentoml/google-cloud-run-deploy) -* [Google Compute Engine](https://github.com/bentoml/google-compute-engine-deploy) -* [Azure Container Instances](https://github.com/bentoml/azure-container-instances-deploy) -* [Heroku](https://github.com/bentoml/heroku-deploy) - -To deploy your BentoML bundle to the cloud, you need to install the `bentoctl` CLI and the operator plugin for the cloud service you want to deploy to. - -```bash -# Install bentoctl CLI -pip install bentoctl -# Install a choose operator -bentoctl operator install $OPERATOR # example: aws-lambda -``` - -Once you have the `bentoctl` CLI and the operator plugin installed, you can use the `bentoctl` CLI to deploy your BentoML bundle to the cloud. 
- -```bash -# Let's get the name of the BentoML bundle we want to deploy -bentoml list - -# Generate deployment configuration file -bentoctl init - -# Build and push the Docker image to the cloud -bentoctl build -b $BENTO_TAG -f deployment_config.yaml - -# Deploy to the cloud -bentoctl apply -f deployment_config.yaml -``` - -For more information and a full list of configurable attributes of the BentoML Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-bentoml/#zenml.integrations.bentoml.model%5Fdeployers.bentoml%5Fmodel%5Fdeployer) . diff --git a/develop/stack-components/model-deployers/custom.mdx b/develop/stack-components/model-deployers/custom.mdx deleted file mode 100644 index e5094d8..0000000 --- a/develop/stack-components/model-deployers/custom.mdx +++ /dev/null @@ -1,166 +0,0 @@ ---- -title: "Develop a Custom Model Deployer" -description: "Learning how to develop a custom model deployer." -icon: shuffle ---- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - -To deploy and manage your trained machine-learning models, ZenML provides a stack component called `Model Deployer`. This component is responsible for interacting with the deployment tool, framework, or platform. - -When present in a stack, the model deployer can also act as a registry for models that are served with ZenML. You can use the model deployer to list all models that are currently deployed for online inference or filtered according to a particular pipeline run or step, or to suspend, resume or delete an external model server managed through ZenML. - -### Base Abstraction - -In ZenML, the base abstraction of the model deployer is built on top of three major criteria: - -1. It needs to ensure efficient deployment and management of models in accordance with the specific requirements of the serving infrastructure, by holding all the stack-related configuration attributes required to interact with the remote model serving tool, service, or platform. -2. It needs to implement the continuous deployment logic necessary to deploy models in a way that updates an existing model server that is already serving a previous version of the same model instead of creating a new model server for every new model version (see the `deploy_model` abstract method). This functionality can be consumed directly from ZenML pipeline steps, but it can also be used outside the pipeline to deploy ad-hoc models. It is also usually coupled with a standard model deployer step, implemented by each integration, that hides the details of the deployment process from the user. -3. It needs to act as a ZenML BaseService registry, where every BaseService instance is used as an internal representation of a remote model server (see the `find_model_server` abstract method). To achieve this, it must be able to re-create the configuration of a BaseService from information that is persisted externally, alongside, or even as part of the remote model server configuration itself. For example, for model servers that are implemented as Kubernetes resources, the BaseService instances can be serialized and saved as Kubernetes resource annotations. 
This allows the model deployer to keep track of all externally running model servers and to re-create their corresponding BaseService instance representations at any given time. The model deployer also defines methods that implement basic life-cycle management on remote model servers outside the coverage of a pipeline (see `stop_model_server` , `start_model_server` and `delete_model_server`). - -Putting all these considerations together, we end up with the following interface: - -```py -from abc import ABC, abstractmethod -from typing import Dict, List, Optional, Type -from uuid import UUID - -from zenml.enums import StackComponentType -from zenml.services import BaseService, ServiceConfig -from zenml.stack import StackComponent, StackComponentConfig, Flavor - -DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT = 300 - - -class BaseModelDeployerConfig(StackComponentConfig): - """Base class for all ZenML model deployer configurations.""" - - -class BaseModelDeployer(StackComponent, ABC): - """Base class for all ZenML model deployers.""" - - @abstractmethod - def perform_deploy_model( - self, - id: UUID, - config: ServiceConfig, - timeout: int = DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT, - ) -> BaseService: - """Abstract method to deploy a model.""" - - @staticmethod - @abstractmethod - def get_model_server_info( - service: BaseService, - ) -> Dict[str, Optional[str]]: - """Give implementation-specific way to extract relevant model server - properties for the user.""" - - @abstractmethod - def perform_stop_model( - self, - service: BaseService, - timeout: int = DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT, - force: bool = False, - ) -> BaseService: - """Abstract method to stop a model server.""" - - @abstractmethod - def perform_start_model( - self, - service: BaseService, - timeout: int = DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT, - ) -> BaseService: - """Abstract method to start a model server.""" - - @abstractmethod - def perform_delete_model( - self, - service: BaseService, - timeout: int = DEFAULT_DEPLOYMENT_START_STOP_TIMEOUT, - force: bool = False, - ) -> None: - """Abstract method to delete a model server.""" - - -class BaseModelDeployerFlavor(Flavor): - """Base class for model deployer flavors.""" - - @property - @abstractmethod - def name(self): - """Returns the name of the flavor.""" - - @property - def type(self) -> StackComponentType: - """Returns the flavor type. - - Returns: - The flavor type. - """ - return StackComponentType.MODEL_DEPLOYER - - @property - def config_class(self) -> Type[BaseModelDeployerConfig]: - """Returns `BaseModelDeployerConfig` config class. - - Returns: - The config class. - """ - return BaseModelDeployerConfig - - @property - @abstractmethod - def implementation_class(self) -> Type[BaseModelDeployer]: - """The class that implements the model deployer.""" -``` - - - This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. In order to see the full implementation and get the complete docstrings, please check the [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-model%5Fdeployers/#zenml.model%5Fdeployers.base%5Fmodel%5Fdeployer.BaseModelDeployer) . - - -### Building your own model deployers - -If you want to create your own custom flavor for a model deployer, you can follow the following steps: - -1. Create a class that inherits from the `BaseModelDeployer` class and implements the abstract methods. -2. 
If you need to provide any configuration, create a class that inherits from the `BaseModelDeployerConfig` class and add your configuration parameters. -3. Bring both the implementation and the configuration together by inheriting from the `BaseModelDeployerFlavor` class. Make sure that you give a `name` to the flavor through its abstract property. -4. Create a service class that inherits from the `BaseService` class and implements the abstract methods. This class will be used to represent the deployed model server in ZenML. - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: -```bash -zenml model-deployer flavor register -``` - -For example, if your flavor class `MyModelDeployerFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml model-deployer flavor register flavors.my_flavor.MyModelDeployerFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. - - -Afterward, you should see the new flavor in the list of available flavors: - - -```bash -zenml model-deployer flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. - -* The **CustomModelDeployerFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomModelDeployerConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` objects are inherently `pydantic` objects, you can also add your own custom validators here. -* The **CustomModelDeployer** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomModelDeployerFlavor` and the `CustomModelDeployerConfig` are implemented in a different module/path than the actual `CustomModelDeployer`). - - diff --git a/develop/stack-components/model-deployers/databricks.mdx b/develop/stack-components/model-deployers/databricks.mdx deleted file mode 100644 index 52c6d11..0000000 --- a/develop/stack-components/model-deployers/databricks.mdx +++ /dev/null @@ -1,142 +0,0 @@ ---- -title: "Databricks" -description: "Deploying models to Databricks Inference Endpoints with Databricks" -icon: database ---- - -Databricks Model Serving or Mosaic AI Model Serving provides a unified interface to deploy, govern, and query AI models. Each model you serve is available as a REST API that you can integrate into your web or client application. 
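For illustration, querying such a served model is typically a plain authenticated HTTPS call against the endpoint's invocations URL; the workspace URL, endpoint name, token, and feature names below are placeholders:

```bash
curl -X POST "https://<WORKSPACE_URL>/serving-endpoints/<ENDPOINT_NAME>/invocations" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"dataframe_records": [{"feature_a": 1.0, "feature_b": 2.0}]}'
```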
-
-This service provides dedicated and autoscaling infrastructure managed by Databricks, allowing you to deploy models without dealing with containers and GPUs.
-
-
- The Databricks Model Deployer can be considered a managed service for deploying models using MLflow. This means you can switch between the MLflow and Databricks Model Deployers without changing your pipeline code, even for custom complex models.
-
-## When to use it?
-
-You should use the Databricks Model Deployer if:
-
-* You are already using Databricks for your data and ML workloads.
-* You want to deploy AI models without dealing with containers and GPUs; the Databricks Model Deployer provides a unified interface to deploy, govern, and query models.
-* You want dedicated and autoscaling infrastructure managed by Databricks, making it easier to deploy models at scale.
-* Enterprise security is a priority, and you need to deploy models into secure offline endpoints accessible only via a direct connection to your Virtual Private Cloud (VPC).
-* Your goal is to turn your models into production-ready APIs with minimal infrastructure or MLOps involvement.
-
-If you are looking for an easier way to deploy your models locally, you can use the [MLflow Model Deployer](/develop/stack-components/model-deployers/mlflow) flavor.
-
-## How to deploy it?
-
-The Databricks Model Deployer flavor is provided by the Databricks ZenML integration, so you need to install it on your local machine to be able to deploy your models. You can do this by running the following command:
-
-```bash
-zenml integration install databricks -y
-```
-
-To register the Databricks model deployer with ZenML, you need to run the following command:
-
-```bash
-zenml model-deployer register --flavor=databricks --host= --client_id={{databricks.client_id}} --client_secret={{databricks.client_secret}}
-```
-
-
- We recommend creating a Databricks service account with the necessary permissions to create and run jobs. You can find more information on how to create a service account [here](https://docs.databricks.com/dev-tools/api/latest/authentication.html). You can generate a client\_id and client\_secret for the service account and use them to authenticate with Databricks.
-
-
-We can now use the model deployer in our stack.
-
-```bash
-zenml stack update --model-deployer=
-```
-
-See the [databricks\_model\_deployer\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-databricks/#zenml.integrations.databricks.steps.databricks%5Fdeployer.databricks%5Fmodel%5Fdeployer%5Fstep) for an example of using the Databricks Model Deployer to deploy a model inside a ZenML pipeline step.
-
-## Configuration
-
-Within the `DatabricksServiceConfig` you can configure:
-
-* `model_name`: The name of the model that will be served; this is used to identify the model in the Databricks Model Registry.
-* `model_version`: The version of the model that will be served; this is used to identify the model in the Databricks Model Registry.
-* `workload_size`: The size of the workload that the model will be serving. This can be `Small`, `Medium`, or `Large`.
-* `scale_to_zero_enabled`: A boolean flag to enable or disable the scale-to-zero feature.
-* `env_vars`: A dictionary of environment variables to be passed to the model serving container.
-* `workload_type`: The type of workload that the model will be serving. This can be `CPU`, `GPU_LARGE`, `GPU_MEDIUM`, `GPU_SMALL`, or `MULTIGPU_MEDIUM`.
-* `endpoint_secret_name`: The name of the secret that will be used to secure the endpoint and authenticate requests. - -For more information and a full list of configurable attributes of the Databricks Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-databricks/#zenml.integrations.databricks.model%5Fdeployers) and Databricks endpoint [code](https://github.com/databricks/databricks%5Fhub/blob/5e3b603ccc7cd6523d998e75f82848215abf9415/src/databricks%5Fhub/hf%5Fapi.py#L6957). - -### Run inference on a provisioned inference endpoint - -The following code example shows how to run inference against a provisioned inference endpoint: - -```py - -from typing import Annotated -from zenml import step, pipeline -from zenml.integrations.databricks.model_deployers import DatabricksModelDeployer -from zenml.integrations.databricks.services import DatabricksDeploymentService - - -# Load a prediction service deployed in another pipeline -@step(enable_cache=False) -def prediction_service_loader( - pipeline_name: str, - pipeline_step_name: str, - running: bool = True, - model_name: str = "default", -) -> DatabricksDeploymentService: - """Get the prediction service started by the deployment pipeline. - - Args: - pipeline_name: name of the pipeline that deployed the MLflow prediction - server - step_name: the name of the step that deployed the MLflow prediction - server - running: when this flag is set, the step only returns a running service - model_name: the name of the model that is deployed - """ - # get the Databricks model deployer stack component - model_deployer = DatabricksModelDeployer.get_active_model_deployer() - - # fetch existing services with same pipeline name, step name and model name - existing_services = model_deployer.find_model_server( - pipeline_name=pipeline_name, - pipeline_step_name=pipeline_step_name, - model_name=model_name, - running=running, - ) - - if not existing_services: - raise RuntimeError( - f"No Databricks inference endpoint deployed by step " - f"'{pipeline_step_name}' in pipeline '{pipeline_name}' with name " - f"'{model_name}' is currently running." - ) - - return existing_services[0] - - -# Use the service for inference -@step -def predictor( - service: DatabricksDeploymentService, - data: str -) -> Annotated[str, "predictions"]: - """Run a inference request against a prediction service""" - - prediction = service.predict(data) - return prediction - - -@pipeline -def databricks_deployment_inference_pipeline( - pipeline_name: str, pipeline_step_name: str = "databricks_model_deployer_step", -): - inference_data = ... - model_deployment_service = prediction_service_loader( - pipeline_name=pipeline_name, - pipeline_step_name=pipeline_step_name, - ) - predictions = predictor(model_deployment_service, inference_data) -``` - -For more information and a full list of configurable attributes of the Databricks Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-databricks/#zenml.integrations.databricks.model%5Fdeployers). 
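To tie the example together, here is a minimal sketch of how the inference pipeline defined above might be invoked from a Python entrypoint. The pipeline name passed in is an assumption and must match the name of the pipeline that actually ran the `databricks_model_deployer_step`, and the import assumes the pipeline code above lives in a hypothetical local `pipelines.py` module:

```py
# Hypothetical entrypoint script -- a sketch, not part of the integration itself.
# Assumes the pipeline defined above is saved in a local `pipelines.py` module.
from pipelines import databricks_deployment_inference_pipeline

if __name__ == "__main__":
    databricks_deployment_inference_pipeline(
        # Assumed name of the deployment pipeline that ran the
        # databricks_model_deployer_step; replace with your own.
        pipeline_name="databricks_deployment_pipeline",
        pipeline_step_name="databricks_model_deployer_step",
    )
```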
-
diff --git a/develop/stack-components/model-deployers/huggingface.mdx b/develop/stack-components/model-deployers/huggingface.mdx
deleted file mode 100644
index bb738a9..0000000
--- a/develop/stack-components/model-deployers/huggingface.mdx
+++ /dev/null
@@ -1,150 +0,0 @@
----
-title: "Hugging Face"
-description: "Deploying models to Huggingface Inference Endpoints with Hugging Face"
-icon: face-laugh
----
-
-Hugging Face Inference Endpoints provides a secure production solution to easily deploy any `transformers`, `sentence-transformers`, and `diffusers` models on a dedicated and autoscaling infrastructure managed by Hugging Face. An Inference Endpoint is built from a model from the [Hub](https://huggingface.co/models).
-
-This service provides dedicated and autoscaling infrastructure managed by Hugging Face, allowing you to deploy models without dealing with containers and GPUs.
-
-## When to use it?
-
-You should use the Hugging Face Model Deployer if:
-
-* You want to deploy [Transformers, Sentence-Transformers, or Diffusion models](https://huggingface.co/docs/inference-endpoints/supported%5Ftasks) on dedicated and secure infrastructure.
-* You prefer a fully-managed production solution for inference without the need to handle containers and GPUs.
-* Your goal is to turn your models into production-ready APIs with minimal infrastructure or MLOps involvement.
-* Cost-effectiveness is crucial, and you want to pay only for the raw compute resources you use.
-* Enterprise security is a priority, and you need to deploy models into secure offline endpoints accessible only via a direct connection to your Virtual Private Cloud (VPC).
-
-If you are looking for an easier way to deploy your models locally, you can use the [MLflow Model Deployer](/develop/stack-components/model-deployers/mlflow) flavor.
-
-## How to deploy it?
-
-The Hugging Face Model Deployer flavor is provided by the Hugging Face ZenML integration, so you need to install it on your local machine to be able to deploy your models. You can do this by running the following command:
-
-```bash
-zenml integration install huggingface -y
-```
-
-To register the Hugging Face model deployer with ZenML, you need to run the following command:
-
-```bash
-zenml model-deployer register --flavor=huggingface --token= --namespace=
-```
-
-Here,
-
-* The `token` parameter is the Hugging Face authentication token. It can be managed through the [Hugging Face settings](https://huggingface.co/settings/tokens).
-* The `namespace` parameter is used for listing and creating the inference endpoints. It can be a username, an organization name, or `*`, depending on where the inference endpoint should be created.
-
-We can now use the model deployer in our stack.
-
-```bash
-zenml stack update --model-deployer=
-```
-
-See the [huggingface\_model\_deployer\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-huggingface/#zenml.integrations.huggingface.steps.huggingface%5Fdeployer.huggingface%5Fmodel%5Fdeployer%5Fstep) for an example of using the Hugging Face Model Deployer to deploy a model inside a ZenML pipeline step.
-
-## Configuration
-
-Within the `HuggingFaceServiceConfig` you can configure:
-
-* `model_name`: the name of the model in ZenML.
-* `endpoint_name`: the name of the inference endpoint. We add a `zenml-` prefix and the first 8 characters of the service UUID as a suffix to the endpoint name.
-* `repository`: The repository name in the user’s namespace (`{username}/{model_id}`) or in the organization namespace (`{organization}/{model_id}`) from the Hugging Face hub. -* `framework`: The machine learning framework used for the model (e.g. `"custom"`, `"pytorch"` ) -* `accelerator`: The hardware accelerator to be used for inference. (e.g. `"cpu"`, `"gpu"`) -* `instance_size`: The size of the instance to be used for hosting the model (e.g. `"large"`, `"xxlarge"`) -* `instance_type`: Inference Endpoints offers a selection of curated CPU and GPU instances. (e.g. `"c6i"`, `"g5.12xlarge"`) -* `region`: The cloud region in which the Inference Endpoint will be created (e.g. `"us-east-1"`, `"eu-west-1"` for `vendor = aws` and `"eastus"` for Microsoft Azure vendor.). -* `vendor`: The cloud provider or vendor where the Inference Endpoint will be hosted (e.g. `"aws"`). -* `token`: The Hugging Face authentication token. It can be managed through [huggingface settings](https://huggingface.co/settings/tokens). The same token can be passed used while registering the Hugging Face model deployer. -* `account_id`: (Optional) The account ID used to link a VPC to a private Inference Endpoint (if applicable). -* `min_replica`: (Optional) The minimum number of replicas (instances) to keep running for the Inference Endpoint. Defaults to `0`. -* `max_replica`: (Optional) The maximum number of replicas (instances) to scale to for the Inference Endpoint. Defaults to `1`. -* `revision`: (Optional) The specific model revision to deploy on the Inference Endpoint for the Hugging Face repository . -* `task`: Select a supported [Machine Learning Task](https://huggingface.co/docs/inference-endpoints/supported%5Ftasks). (e.g. `"text-classification"`, `"text-generation"`) -* `custom_image`: (Optional) A custom Docker image to use for the Inference Endpoint. -* `namespace`: The namespace where the Inference Endpoint will be created. The same namespace can be passed used while registering the Hugging Face model deployer. -* `endpoint_type`: (Optional) The type of the Inference Endpoint, which can be `"protected"`, `"public"` (default) or `"private"`. - -For more information and a full list of configurable attributes of the Hugging Face Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-huggingface/#zenml.integrations.huggingface.model%5Fdeployers) and Hugging Face endpoint [code](https://github.com/huggingface/huggingface%5Fhub/blob/5e3b603ccc7cd6523d998e75f82848215abf9415/src/huggingface%5Fhub/hf%5Fapi.py#L6957). - -### Run inference on a provisioned inference endpoint - -The following code example shows how to run inference against a provisioned inference endpoint: - -```py -from typing import Annotated -from zenml import step, pipeline -from zenml.integrations.huggingface.model_deployers import HuggingFaceModelDeployer -from zenml.integrations.huggingface.services import HuggingFaceDeploymentService - - -# Load a prediction service deployed in another pipeline -@step(enable_cache=False) -def prediction_service_loader( - pipeline_name: str, - pipeline_step_name: str, - running: bool = True, - model_name: str = "default", -) -> HuggingFaceDeploymentService: - """Get the prediction service started by the deployment pipeline. 
- - Args: - pipeline_name: name of the pipeline that deployed the MLflow prediction - server - step_name: the name of the step that deployed the MLflow prediction - server - running: when this flag is set, the step only returns a running service - model_name: the name of the model that is deployed - """ - # get the Hugging Face model deployer stack component - model_deployer = HuggingFaceModelDeployer.get_active_model_deployer() - - # fetch existing services with same pipeline name, step name and model name - existing_services = model_deployer.find_model_server( - pipeline_name=pipeline_name, - pipeline_step_name=pipeline_step_name, - model_name=model_name, - running=running, - ) - - if not existing_services: - raise RuntimeError( - f"No Hugging Face inference endpoint deployed by step " - f"'{pipeline_step_name}' in pipeline '{pipeline_name}' with name " - f"'{model_name}' is currently running." - ) - - return existing_services[0] - - -# Use the service for inference -@step -def predictor( - service: HuggingFaceDeploymentService, - data: str -) -> Annotated[str, "predictions"]: - """Run a inference request against a prediction service""" - - prediction = service.predict(data) - return prediction - - -@pipeline -def huggingface_deployment_inference_pipeline( - pipeline_name: str, pipeline_step_name: str = "huggingface_model_deployer_step", -): - inference_data = ... - model_deployment_service = prediction_service_loader( - pipeline_name=pipeline_name, - pipeline_step_name=pipeline_step_name, - ) - predictions = predictor(model_deployment_service, inference_data) -``` - -For more information and a full list of configurable attributes of the Hugging Face Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-huggingface/#zenml.integrations.huggingface.model%5Fdeployers). - diff --git a/develop/stack-components/model-deployers/mlflow.mdx b/develop/stack-components/model-deployers/mlflow.mdx deleted file mode 100644 index d374d77..0000000 --- a/develop/stack-components/model-deployers/mlflow.mdx +++ /dev/null @@ -1,233 +0,0 @@ ---- -title: "MLflow" -description: "Deploying your models locally with MLflow." -icon: vial ---- - -The MLflow Model Deployer is one of the available flavors of the [Model Deployer](/develop/stack-components/model-deployers) stack component. Provided with the MLflow integration it can be used to deploy and manage [MLflow models](https://www.mlflow.org/docs/latest/python%5Fapi/mlflow.deployments.html) on a local running MLflow server. - - - The MLflow Model Deployer is not yet available for use in production. This is a work in progress and will be available soon. At the moment it is only available for use in a local development environment. - - -## When to use it? - -MLflow is a popular open-source platform for machine learning. It's a great tool for managing the entire lifecycle of your machine learning. One of the most important features of MLflow is the ability to package your model and its dependencies into a single artifact that can be deployed to a variety of deployment targets. - -You should use the MLflow Model Deployer: - -* if you want to have an easy way to deploy your models locally and perform real-time predictions using the running MLflow prediction server. -* if you are looking to deploy your models in a simple way without the need for a dedicated deployment environment like Kubernetes or advanced infrastructure configuration. 
-
-If you are looking to deploy your models in a more complex way, you should use one of the other [Model Deployer Flavors](/develop/stack-components/model-deployers#model-deployers-flavors) available in ZenML.
-
-## How do you deploy it?
-
-The MLflow Model Deployer flavor is provided by the MLflow ZenML integration, so you need to install it on your local machine to be able to deploy your models. You can do this by running the following command:
-
-```bash
-zenml integration install mlflow -y
-```
-
-To register the MLflow model deployer with ZenML, you need to run the following command:
-
-```bash
-zenml model-deployer register mlflow_deployer --flavor=mlflow
-```
-
-The ZenML integration will provision a local MLflow deployment server as a daemon process that will continue to run in the background to serve the latest MLflow model.
-
-## How do you use it?
-
-### Deploy a logged model
-
-Following [MLflow's documentation](https://mlflow.org/docs/latest/deployment/deploy-model-locally.html#deploy-mlflow-model-as-a-local-inference-server), if we want to deploy a model as a local inference server, we need the model to be logged in the MLflow experiment tracker first. Once the model is logged, we can use the model URI either from the artifact path saved with the MLflow run, or using the model name and version if the model is registered in the MLflow model registry.
-
-In the following examples, we will show how to deploy a model using the MLflow Model Deployer in two different scenarios:
-
-1. We already know the logged model URI and we want to deploy it as a local inference server.
-
-```py
-from typing import Optional
-
-from zenml import pipeline, step, get_step_context
-from zenml.client import Client
-from zenml.integrations.mlflow.services import (
-    MLFlowDeploymentConfig,
-    MLFlowDeploymentService,
-)
-from zenml.logger import get_logger
-
-logger = get_logger(__name__)
-
-
-@step
-def deploy_model() -> Optional[MLFlowDeploymentService]:
-    # Deploy a model using the MLflow Model Deployer
-    zenml_client = Client()
-    model_deployer = zenml_client.active_stack.model_deployer
-    mlflow_deployment_config = MLFlowDeploymentConfig(
-        name="mlflow-model-deployment-example",
-        description="An example of deploying a model using the MLflow Model Deployer",
-        pipeline_name=get_step_context().pipeline_name,
-        pipeline_step_name=get_step_context().step_name,
-        # Either a run artifact URI ("runs:/<run_id>/model") or a registered
-        # model URI ("models:/<model_name>/<model_version>")
-        model_uri="runs:/<run_id>/model",
-        model_name="model",
-        workers=1,
-        mlserver=False,
-        timeout=300,
-    )
-    service = model_deployer.deploy_model(
-        config=mlflow_deployment_config,
-        service_type=MLFlowDeploymentService.SERVICE_TYPE,
-    )
-    logger.info(f"The deployed service info: {model_deployer.get_model_server_info(service)}")
-    return service
-```
-
-2. We don't know the logged model URI, since the model was logged in a previous step, and we want to deploy the model as a local inference server. ZenML provides a set of functionalities that make it easier to get the model URI from the current run and deploy it.
-
-```py
-from typing import Optional
-
-from zenml import pipeline, step, get_step_context
-from zenml.client import Client
-from zenml.integrations.mlflow.services import (
-    MLFlowDeploymentConfig,
-    MLFlowDeploymentService,
-)
-from mlflow.tracking import MlflowClient, artifact_utils
-
-
-@step
-def deploy_model() -> Optional[MLFlowDeploymentService]:
-    # Deploy a model using the MLflow Model Deployer
-    zenml_client = Client()
-    model_deployer = zenml_client.active_stack.model_deployer
-    experiment_tracker = zenml_client.active_stack.experiment_tracker
-    # Let's get the run id of the current pipeline
-    mlflow_run_id = experiment_tracker.get_run_id(
-        experiment_name=get_step_context().pipeline_name,
-        run_name=get_step_context().run_name,
-    )
-    # Once we have the run id, we can get the model URI using the MLflow client
-    experiment_tracker.configure_mlflow()
-    client = MlflowClient()
-    model_name = "model"  # set the model name that was logged
-    model_uri = artifact_utils.get_artifact_uri(
-        run_id=mlflow_run_id, artifact_path=model_name
-    )
-    mlflow_deployment_config = MLFlowDeploymentConfig(
-        name="mlflow-model-deployment-example",
-        description="An example of deploying a model using the MLflow Model Deployer",
-        pipeline_name=get_step_context().pipeline_name,
-        pipeline_step_name=get_step_context().step_name,
-        model_uri=model_uri,
-        model_name=model_name,
-        workers=1,
-        mlserver=False,
-        timeout=300,
-    )
-    service = model_deployer.deploy_model(
-        config=mlflow_deployment_config,
-        service_type=MLFlowDeploymentService.SERVICE_TYPE,
-    )
-    return service
-```
-
-#### Configuration
-
-Within the `MLFlowDeploymentConfig` you can configure:
-
-* `name`: The name of the deployment.
-* `description`: The description of the deployment.
-* `pipeline_name`: The name of the pipeline that deployed the MLflow prediction server.
-* `pipeline_step_name`: The name of the step that deployed the MLflow prediction server.
-* `model_name`: The name of the model that is deployed; when using the model registry, the name must be a valid registered model name.
-* `model_version`: The version of the model that is deployed; when using the model registry, the version must be a valid registered model version.
-* `silent_daemon`: Set to True to suppress the output of the daemon (i.e., redirect stdout and stderr to /dev/null). If False, the daemon output will be redirected to a log file.
-* `blocking`: Set to True to run the service in the context of the current process and block until the service is stopped, instead of running the service as a daemon process. Useful for operating systems that do not support daemon processes.
-* `model_uri`: The URI of the model to be deployed. This can be a local file path, a run ID, or a model name and version.
-* `workers`: The number of workers to be used by the MLflow prediction server.
-* `mlserver`: If True, the MLflow prediction server will be started as an MLServer instance.
-* `timeout`: The timeout in seconds to wait for the MLflow prediction server to start or stop.
-
-### Run inference on a deployed model
-
-The following code example shows how you can load a deployed model in Python and run inference against it:
-
-1. 
Load a prediction service deployed in another pipeline - -```py -import json -import requests -from zenml import step -from zenml.integrations.mlflow.model_deployers.mlflow_model_deployer import ( - MLFlowModelDeployer, -) -from zenml.integrations.mlflow.services import MLFlowDeploymentService - - -# Load a prediction service deployed in another pipeline -@step(enable_cache=False) -def prediction_service_loader( - pipeline_name: str, - pipeline_step_name: str, - model_name: str = "model", -) -> None: - """Get the prediction service started by the deployment pipeline. - - Args: - pipeline_name: name of the pipeline that deployed the MLflow prediction - server - step_name: the name of the step that deployed the MLflow prediction - server - running: when this flag is set, the step only returns a running service - model_name: the name of the model that is deployed - """ - # get the MLflow model deployer stack component - model_deployer = MLFlowModelDeployer.get_active_model_deployer() - - # fetch existing services with same pipeline name, step name and model name - existing_services = model_deployer.find_model_server( - pipeline_name=pipeline_name, - pipeline_step_name=pipeline_step_name, - model_name=model_name, - ) - - if not existing_services: - raise RuntimeError( - f"No MLflow prediction service deployed by step " - f"'{pipeline_step_name}' in pipeline '{pipeline_name}' with name " - f"'{model_name}' is currently running." - ) - - service = existing_services[0] - - # Let's try run a inference request against the prediction service - - payload = json.dumps( - { - "inputs": {"messages": [{"role": "user", "content": "Tell a joke!"}]}, - "params": { - "temperature": 0.5, - "max_tokens": 20, - }, - } - ) - response = requests.post( - url=service.get_prediction_url(), - data=payload, - headers={"Content-Type": "application/json"}, - ) - - response.json() -``` - -2. Within the same pipeline, use the service from previous step to run inference this time using pre-built predict method - -```py -from typing_extensions import Annotated -import numpy as np -from zenml import step -from zenml.integrations.mlflow.services import MLFlowDeploymentService - -# Use the service for inference -@step -def predictor( - service: MLFlowDeploymentService, - data: np.ndarray, -) -> Annotated[np.ndarray, "predictions"]: - """Run a inference request against a prediction service""" - - prediction = service.predict(data) - prediction = prediction.argmax(axis=-1) - - return prediction -``` - -For more information and a full list of configurable attributes of the MLflow Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-mlflow/#zenml.integrations.mlflow.model%5Fdeployers) . diff --git a/develop/stack-components/model-deployers/model-deployers.mdx b/develop/stack-components/model-deployers/model-deployers.mdx deleted file mode 100644 index 1e1463b..0000000 --- a/develop/stack-components/model-deployers/model-deployers.mdx +++ /dev/null @@ -1,174 +0,0 @@ ---- -title: "Overview" -description: "Deploying your models and serve real-time predictions." -icon: person-running ---- - -Model Deployment is the process of making a machine learning model available to make predictions and decisions on real-world data. 
Getting predictions from trained models can be done in different ways depending on the use case: a batch prediction is used to generate predictions for a large amount of data at once, while a real-time prediction is used to generate predictions for a single data point at a time.
-
-Model deployers are stack components responsible for serving models on a real-time or batch basis.
-
-Online serving is the process of hosting and loading machine-learning models as part of a managed web service and providing access to the models through an API endpoint like HTTP or gRPC. Once deployed, model inference can be triggered at any time, and you can send inference requests to the model through the web service's API and receive fast, low-latency responses.
-
-Batch inference or offline inference is the process of making a machine learning model make predictions on a batch of observations. This is useful for generating predictions for a large amount of data at once. The predictions are usually stored as files or in a database for end users or business applications.
-
-### When to use it?
-
-The model deployers are optional components in the ZenML stack. They are used to deploy machine learning models to a target environment, either a development (local) or a production (Kubernetes or cloud) environment. The model deployers are mainly used to deploy models for real-time inference use cases. With the model deployers and other stack components, you can build pipelines that are continuously trained and deployed to production.
-
-### How model deployers slot into the stack
-
-Here is an architecture diagram that shows how model deployers fit into the overall story of a remote stack.
-
-
-
-#### Model Deployers Flavors
-
-ZenML comes with a `local` MLflow model deployer which is a simple model deployer that deploys models to a local MLflow server. Additional model deployers that can be used to deploy models on production environments are provided by integrations:
-
-| Model Deployer | Flavor | Integration | Notes |
-| -------------- | ------ | ----------- | ----- |
-| [MLflow](/develop/stack-components/model-deployers/mlflow) | mlflow | mlflow | Deploys ML models locally |
-| [BentoML](/develop/stack-components/model-deployers/bentoml) | bentoml | bentoml | Build and deploy ML models locally or for production-grade environments (cloud, K8s) |
-| [Seldon Core](/develop/stack-components/model-deployers/seldon) | seldon | seldon | Built on top of Kubernetes to deploy models for production-grade environments |
-| [Hugging Face](/develop/stack-components/model-deployers/huggingface) | huggingface | huggingface | Deploys ML models on Hugging Face Inference Endpoints |
-| [Custom Implementation](/develop/stack-components/model-deployers/custom) | _custom_ | | Extend the Model Deployer abstraction and provide your own implementation |
-
-
-Every model deployer may have different attributes that must be configured in order to interact with the model serving tool, framework, or platform (e.g. hostnames, URLs, references to credentials, and other client-related configuration parameters).
The following example shows the configuration of the MLflow and Seldon Core model deployers:
-
-```bash
-# Configure MLflow model deployer
-zenml model-deployer register mlflow --flavor=mlflow
-
-# Configure Seldon Core model deployer
-zenml model-deployer register seldon --flavor=seldon \
---kubernetes_context=zenml-eks --kubernetes_namespace=zenml-workloads \
---base_url=http://abb84c444c7804aa98fc8c097896479d-377673393.us-east-1.elb.amazonaws.com
-...
-```
-
-
-#### The role that a model deployer plays in a ZenML Stack
-
-* Seamless Model Deployment: Facilitates the deployment of machine learning models to various serving environments, such as local servers, Kubernetes clusters, or cloud platforms, ensuring that models can be deployed and managed efficiently in accordance with the specific requirements of the serving infrastructure. The model deployer holds all the stack-related configuration attributes required to interact with the remote model serving tool, service, or platform (e.g. hostnames, URLs, references to credentials, and other client-related configuration parameters). The following are examples of configuring the MLflow and Seldon Core Model Deployers and registering them as a Stack component:
-
-```bash
-zenml integration install mlflow
-zenml model-deployer register mlflow --flavor=mlflow
-zenml stack register local_with_mlflow -m default -a default -o default -d mlflow --set
-```
-
-```bash
-zenml integration install seldon
-zenml model-deployer register seldon --flavor=seldon \
---kubernetes_context=zenml-eks --kubernetes_namespace=zenml-workloads \
---base_url=http://abb84c444c7804aa98fc8c097896479d-377673393.us-east-1.elb.amazonaws.com
-...
-zenml stack register seldon_stack -m default -a aws -o default -d seldon
-```
-* Lifecycle Management: Provides mechanisms for comprehensive lifecycle management of model servers, including the ability to start, stop, and delete model servers, as well as to update existing servers with new model versions, thereby optimizing resource utilization and facilitating continuous delivery of model updates. Some core methods that can be used to interact with the remote model server include:
-
-`deploy_model` \- Deploys a model to the serving environment and returns a Service object that represents the deployed model server. `find_model_server` \- Finds and returns a list of Service objects that represent model servers that have been deployed to the serving environment; the services are stored in the DB and can be used as a reference to know what and where the model is deployed. `stop_model_server` \- Stops a model server that is currently running in the serving environment. `start_model_server` \- Starts a model server that has been stopped in the serving environment. `delete_model_server` \- Deletes a model server from the serving environment and from the DB.
-
-
- ZenML uses the Service object to represent a model server that has been deployed to a serving environment. The Service object is saved in the DB and can be used as a reference to know what and where the model is deployed. The Service object consists of two main attributes: `config` and `status`. The `config` attribute holds all the deployment configuration attributes required to create a new deployment, while the `status` attribute holds the operational status of the deployment, such as the last error message, the prediction URL, and the deployment status.
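For illustration, here is a minimal sketch of inspecting the `config` and `status` of an already-deployed Service from Python. It assumes the active stack has a model deployer registered and that at least one model server was previously deployed; the model name used for filtering is an assumed example:

```py
from zenml.client import Client

# A sketch, assuming the active stack contains a model deployer and that a
# model server was already deployed by a previous pipeline run.
model_deployer = Client().active_stack.model_deployer

# Filtering by model name is optional; "LLAMA-7B" is just an assumed example.
services = model_deployer.find_model_server(model_name="LLAMA-7B")

for service in services:
    # `config` holds the deployment configuration used to create the server,
    # `status` holds its operational state (e.g. prediction URL, last error).
    print(service.config)
    print(service.status)
```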
- - -```py -from zenml.integrations.huggingface.model_deployers import HuggingFaceModelDeployer - -model_deployer = HuggingFaceModelDeployer.get_active_model_deployer() -services = model_deployer.find_model_server( - pipeline_name="LLM_pipeline", - pipeline_step_name="huggingface_model_deployer_step", - model_name="LLAMA-7B", -) -if services: - if services[0].is_running: - print( - f"Model server {services[0].config['model_name']} is running at {services[0].status['prediction_url']}" - ) - else: - print(f"Model server {services[0].config['model_name']} is not running") - model_deployer.start_model_server(services[0]) - else: - print("No model server found") - service = model_deployer.deploy_model( - pipeline_name="LLM_pipeline", - pipeline_step_name="huggingface_model_deployer_step", - model_name="LLAMA-7B", - model_uri="s3://zenprojects/huggingface_model_deployer_step/output/884/huggingface", - revision="main", - task="text-classification", - region="us-east-1", - vendor="aws", - token="huggingface_token", - namespace="zenml-workloads", - endpoint_type="public", - ) - print(f"Model server {service.config['model_name']} is deployed at {service.status['prediction_url']}") -``` - -#### How to Interact with a model deployer after deployment? - -When a Model Deployer is part of the active ZenML Stack, it is also possible to interact with it from the CLI to list, start, stop, or delete the model servers that is managed: - -```bash -$ zenml model-deployer models list -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ STATUS │ UUID │ PIPELINE_NAME │ PIPELINE_STEP_NAME ┃ -┠────────┼──────────────────────────────────────┼────────────────────────────────┼────────────────────────────┨ -┃ ✅ │ 8cbe671b-9fce-4394-a051-68e001f92765 │ seldon_deployment_pipeline │ seldon_model_deployer_step ┃ -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -$ zenml model-deployer models describe 8cbe671b-9fce-4394-a051-68e001f92765 - Properties of Served Model 8cbe671b-9fce-4394-a051-68e001f92765 -┏━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ MODEL SERVICE PROPERTY │ VALUE ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ MODEL_NAME │ mnist ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ MODEL_URI │ s3://zenprojects/seldon_model_deployer_step/output/884/seldon ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ PIPELINE_NAME │ seldon_deployment_pipeline ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ RUN_NAME │ seldon_deployment_pipeline-11_Apr_22-09_39_27_648527 ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ PIPELINE_STEP_NAME │ seldon_model_deployer_step ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ PREDICTION_URL │ http://abb84c444c7804aa98fc8c097896479d-377673393.us-east-1.elb.amazonaws.com/seldon/… ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ SELDON_DEPLOYMENT │ zenml-8cbe671b-9fce-4394-a051-68e001f92765 ┃ 
-┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ STATUS │ ✅ ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ STATUS_MESSAGE │ Seldon Core deployment 'zenml-8cbe671b-9fce-4394-a051-68e001f92765' is available ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┨ -┃ UUID │ 8cbe671b-9fce-4394-a051-68e001f92765 ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -$ zenml model-deployer models get-url 8cbe671b-9fce-4394-a051-68e001f92765 - Prediction URL of Served Model 8cbe671b-9fce-4394-a051-68e001f92765 is: - http://abb84c444c7804aa98fc8c097896479d-377673393.us-east-1.elb.amazonaws.com/seldon/zenml-workloads/zenml-8cbe67 -1b-9fce-4394-a051-68e001f92765/api/v0.1/predictions - -$ zenml model-deployer models delete 8cbe671b-9fce-4394-a051-68e001f92765 -``` - -In Python, you can alternatively discover the prediction URL of a deployed model by inspecting the metadata of the step that deployed the model: - -```bash -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("") -deployer_step = pipeline_run.steps[""] -deployed_model_url = deployer_step.run_metadata["deployed_model_url"].value -``` - -The ZenML integrations that provide Model Deployer stack components also include standard pipeline steps that can directly be inserted into any pipeline to achieve a continuous model deployment workflow. These steps take care of all the aspects of continuously deploying models to an external server and saving the Service configuration into the Artifact Store, where they can be loaded at a later time and re-create the initial conditions used to serve a particular model. - diff --git a/develop/stack-components/model-deployers/seldon.mdx b/develop/stack-components/model-deployers/seldon.mdx deleted file mode 100644 index d1d3787..0000000 --- a/develop/stack-components/model-deployers/seldon.mdx +++ /dev/null @@ -1,520 +0,0 @@ ---- -title: "Seldon" -description: "Deploying models to Kubernetes with Seldon Core." -icon: battle-net ---- - -[Seldon Core](https://github.com/SeldonIO/seldon-core) is a production grade source-available model serving platform. It packs a wide range of features built around deploying models to REST/GRPC microservices that include monitoring and logging, model explainers, outlier detectors and various continuous deployment strategies such as A/B testing, canary deployments and more. - -Seldon Core also comes equipped with a set of built-in model server implementations designed to work with standard formats for packaging ML models that greatly simplify the process of serving models for real-time inference. - - - The Seldon Core model deployer integration is currently not supported under **MacOS**. - - -## When to use it? - -[Seldon Core](https://github.com/SeldonIO/seldon-core) is a production-grade source-available model serving platform. It packs a wide range of features built around deploying models to REST/GRPC microservices that include monitoring and logging, model explainers, outlier detectors, and various continuous deployment strategies such as A/B testing, canary deployments, and more. 
- -Seldon Core also comes equipped with a set of built-in model server implementations designed to work with standard formats for packaging ML models that greatly simplify the process of serving models for real-time inference. - -You should use the Seldon Core Model Deployer: - -* If you are looking to deploy your model on a more advanced infrastructure like Kubernetes. -* If you want to handle the lifecycle of the deployed model with no downtime, including updating the runtime graph, scaling, monitoring, and security. -* Looking for more advanced API endpoints to interact with the deployed model, including REST and GRPC endpoints. -* If you want more advanced deployment strategies like A/B testing, canary deployments, and more. -* if you have a need for a more complex deployment process that can be customized by the advanced inference graph that includes custom [TRANSFORMER](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/overview.html) and [ROUTER](https://docs.seldon.io/projects/seldon-core/en/latest/analytics/routers.html?highlight=routers). - -If you are looking for a more easy way to deploy your models locally, you can use the [MLflow Model Deployer](/develop/stack-components/model-deployers/mlflow) flavor. - -## How to deploy it? - -ZenML provides a Seldon Core flavor build on top of the Seldon Core Integration to allow you to deploy and use your models in a production-grade environment. In order to use the integration you need to install it on your local machine to be able to register a Seldon Core Model deployer with ZenML and add it to your stack: - -```bash -zenml integration install seldon -y -``` - -To deploy and make use of the Seldon Core integration we need to have the following prerequisites: - -1. access to a Kubernetes cluster. This can be configured using the `kubernetes_context` configuration attribute to point to a local `kubectl` context or an in-cluster configuration, but the recommended approach is to [use a Service Connector](/develop/stack-components/model-deployers/seldon#using-a-service-connector) to link the Seldon Deployer Stack Component to a Kubernetes cluster. -2. Seldon Core needs to be preinstalled and running in the target Kubernetes cluster. Check out the [official Seldon Core installation instructions](https://github.com/SeldonIO/seldon-core/tree/master/examples/auth#demo-setup) or the [EKS installation example below](/develop/stack-components/model-deployers/seldon#installing-seldon-core-eg-in-an-eks-cluster). -3. models deployed with Seldon Core need to be stored in some form of persistent shared storage that is accessible from the Kubernetes cluster where Seldon Core is installed (e.g. AWS S3, GCS, Azure Blob Storage, etc.). You can use one of the supported [remote artifact store flavors](/develop/stack-components/artifact-stores) to store your models as part of your stack. For a smoother experience running Seldon Core with a cloud artifact store, we also recommend configuring explicit credentials for the artifact store. The Seldon Core model deployer knows how to automatically convert those credentials in the format needed by Seldon Core model servers to authenticate to the storage back-end where models are stored. - -Since the Seldon Model Deployer is interacting with the Seldon Core model server deployed on a Kubernetes cluster, you need to provide a set of configuration parameters. These parameters are: - -* kubernetes\_context: the Kubernetes context to use to contact the remote Seldon Core installation. 
If not specified, the active Kubernetes context is used or the in-cluster configuration is used if the model deployer is running in a Kubernetes cluster. The recommended approach is to [use a Service Connector](/develop/stack-components/model-deployers/seldon#using-a-service-connector) to link the Seldon Deployer Stack Component to a Kubernetes cluster and to skip this parameter. -* kubernetes\_namespace: the Kubernetes namespace where the Seldon Core deployment servers are provisioned and managed by ZenML. If not specified, the namespace set in the current configuration is used. -* base\_url: the base URL of the Kubernetes ingress used to expose the Seldon Core deployment servers. - -In addition to these parameters, the Seldon Core Model Deployer may also require additional configuration to be set up to allow it to authenticate to the remote artifact store or persistent storage service where model artifacts are located. This is covered in the [Managing Seldon Core Authentication](/develop/stack-components/model-deployers/seldon#managing-seldon-core-authentication) section. - - - Configuring Seldon Core in a Kubernetes cluster can be a complex and error-prone process, so we have provided a set of Terraform-based recipes to quickly provision popular combinations of MLOps tools. More information about these recipes can be found in the [MLOps Stack Recipes](https://github.com/zenml-io/mlstacks). - -### Infrastructure Deployment - -The Seldon Model Deployer can be deployed directly from the ZenML CLI: - -```bash -zenml model-deployer deploy seldon_deployer --flavor=seldon --provider= ... -``` - -You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to work use the CLI for this, please refer to the [dedicated documentation section](/develop/stack-components/stack-deployment). - -### Seldon Core Installation Example - -The following example briefly shows how you can install Seldon in an EKS Kubernetes cluster. It assumes that the EKS cluster itself is already set up and configured with IAM access. For more information or tutorials for other clouds, check out the [official Seldon Core installation instructions](https://github.com/SeldonIO/seldon-core/tree/master/examples/auth#demo-setup). - -1. Configure EKS cluster access locally, e.g: - -```bash -aws eks --region us-east-1 update-kubeconfig --name zenml-cluster --alias zenml-eks -``` - -2. Install Istio 1.5.0 (required for the latest Seldon Core version): - -```bash -curl -L [https://istio.io/downloadIstio](https://istio.io/downloadIstio) | ISTIO_VERSION=1.5.0 sh - - -cd istio-1.5.0/ - -bin/istioctl manifest apply --set profile=demo -``` - -3. Set up an Istio gateway for Seldon Core: - -```bash -curl https://raw.githubusercontent.com/SeldonIO/seldon-core/master/notebooks/resources/seldon-gateway.yaml | kubectl apply -f - -``` - -4. Install Seldon Core: - -```bash -helm install seldon-core seldon-core-operator \ - - --repo https://storage.googleapis.com/seldon-charts \ - - --set usageMetrics.enabled=true \ - - --set istio.enabled=true \ - - --namespace seldon-system -``` - -5. 
Test that the installation is functional - - -```bash -kubectl apply -f iris.yaml -``` - -with `iris.yaml` defined as follows: - - -```yaml -apiVersion: machinelearning.seldon.io/v1 -kind: SeldonDeployment -metadata: - name: iris-model - namespace: default -spec: - name: iris - predictors: - - graph: - implementation: SKLEARN_SERVER - modelUri: gs://seldon-models/v1.14.0-dev/sklearn/iris - name: classifier - name: default - replicas: 1 -``` - -Then extract the URL where the model server exposes its prediction API: - - -```bash -export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].hostname}') -``` - -And use curl to send a test prediction API request to the server: - -```bash -curl -X POST http://$INGRESS_HOST/seldon/default/iris-model/api/v1.0/predictions \ - -H 'Content-Type: application/json' \ - -d '{ "data": { "ndarray": [[1,2,3,4]] } }' -``` - -### Using a Service Connector - -To set up the Seldon Core Model Deployer to authenticate to a remote Kubernetes cluster, it is recommended to leverage the many features provided by [the Service Connectors](/develop/usage/resource-data-management/auth-management) such as auto-configuration, local client login, best security practices regarding long-lived credentials and fine-grained access control and reusing the same credentials across multiple stack components. - -Depending on where your target Kubernetes cluster is running, you can use one of the following Service Connectors: - -* [the AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector), if you are using an AWS EKS cluster. -* [the GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector), if you are using a GKE cluster. -* [the Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector), if you are using an AKS cluster. -* [the generic Kubernetes Service Connector](/develop/usage/resource-data-management/auth-management/kubernetes-service-connector) for any other Kubernetes cluster. - -If you don't already have a Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure a Service Connector that can be used to access more than one Kubernetes cluster or even more than one type of cloud resource: - -```bash -zenml service-connector register -i -``` - -A non-interactive CLI example that leverages [the AWS CLI configuration](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) on your local machine to auto-configure an AWS Service Connector targeting a single EKS cluster is: - -```bash -zenml service-connector register --type aws --resource-type kubernetes-cluster --resource-name --auto-configure -``` - -```bash Example Command Output -$ zenml service-connector register eks-zenhacks --type aws --resource-type kubernetes-cluster --resource-id zenhacks-cluster --auto-configure -⠼ Registering service connector 'eks-zenhacks'... 
-Successfully registered service connector `eks-zenhacks` with access to the following resources: -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠───────────────────────┼──────────────────┨ -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -``` - -Alternatively, you can configure a Service Connector through the ZenML dashboard: - -| | | -|---|---| -| | | - - - -> **Note**: Please remember to grant the entity associated with your cloud credentials permissions to access the Kubernetes cluster and to list accessible Kubernetes clusters. For a full list of permissions required to use a AWS Service Connector to access one or more Kubernetes cluster, please refer to the [documentation for your Service Connector of choice](/develop/usage/resource-data-management/auth-management) or read the documentation available in the interactive CLI commands and dashboard. The Service Connectors supports many different authentication methods with different levels of security and convenience. You should pick the one that best fits your use-case. - -If you already have one or more Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the Kubernetes cluster that you want to use for your Seldon Core Model Deployer by running e.g.: - -```bash -zenml service-connector list-resources --resource-type kubernetes-cluster -``` - -```bash Example Command Output -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────┨ - -┃ bdf1dc76-e36b-4ab4-b5a6-5a9afea4822f │ eks-zenhacks │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────┨ - -┃ b57f5f5c-0378-434c-8d50-34b492486f30 │ gcp-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────┨ - -┃ d6fc6004-eb76-4fd7-8fa1-ec600cced680 │ azure-multi │ 🇦 azure │ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -After having set up or decided on a Service Connector to use to connect to the target Kubernetes cluster where Seldon Core is installed, you can register the Seldon Core Model Deployer as follows: - -```bash -# Register the Seldon Core Model Deployer - -zenml model-deployer register --flavor=seldon \ - - --kubernetes_namespace= \ - - --base_url=http://$INGRESS_HOST - -# Connect the Seldon Core Model Deployer to the target cluster via a Service Connector - -zenml model-deployer connect -i -``` - -A non-interactive version that connects the Seldon Core Model Deployer to a target Kubernetes cluster through a Service Connector: - -```bash -zenml model-deployer connect --connector --resource-id -``` - -```bash Example Command Output -$ zenml model-deployer 
connect seldon-test --connector gcp-multi --resource-id zenml-test-cluster - -Successfully connected model deployer `seldon-test` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼────────────────────┨ - -┃ b57f5f5c-0378-434c-8d50-34b492486f30 │ gcp-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛ - -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -A similar experience is available when you configure the Seldon Core Model Deployer through the ZenML dashboard: - - - - - -### Managing Seldon Core Authentication - -The Seldon Core Model Deployer requires access to the persistent storage where models are located. In most cases, you will use the Seldon Core model deployer to serve models that are trained through ZenML pipelines and stored in the ZenML Artifact Store, which implies that the Seldon Core model deployer needs to access the Artifact Store. - -If Seldon Core is already running in the same cloud as the Artifact Store (e.g. S3 and an EKS cluster for AWS, or GCS and a GKE cluster for GCP), there are ways of configuring cloud workloads to have implicit access to other cloud resources like persistent storage without requiring explicit credentials. However, if Seldon Core is running in a different cloud, or on-prem, or if implicit in-cloud workload authentication is not enabled, then you need to configure explicit credentials for the Artifact Store to allow other components like the Seldon Core model deployer to authenticate to it. Every cloud Artifact Store flavor supports some way of configuring explicit credentials and this is documented for each individual flavor in the [Artifact Store documentation](/develop/stack-components/artifact-stores). - -When explicit credentials are configured in the Artifact Store, the Seldon Core Model Deployer doesn't need any additional configuration and will use those credentials automatically to authenticate to the same persistent storage service used by the Artifact Store. If the Artifact Store doesn't have explicit credentials configured, then Seldon Core will default to using whatever implicit authentication method is available in the Kubernetes cluster where it is running. For example, in AWS this means using the IAM role attached to the EC2 or EKS worker nodes, and in GCP this means using the service account attached to the GKE worker nodes. - - -If the Artifact Store used in combination with the Seldon Core Model Deployer in the same ZenML stack does not have explicit credentials configured, then the Seldon Core Model Deployer might not be able to authenticate to the Artifact Store which will cause the deployed model servers to fail. - -To avoid this, we recommend that you use Artifact Stores with explicit credentials in the same stack as the Seldon Core Model Deployer. Alternatively, if you're running Seldon Core in one of the cloud providers, you should configure implicit authentication for the Kubernetes nodes. 
- - -If you want to use a custom persistent storage with Seldon Core, or if you prefer to manually manage the authentication credentials attached to the Seldon Core model servers, you can use the approach described in the next section. - -**Advanced: Configuring a Custom Seldon Core Secret** - -The Seldon Core model deployer stack component allows configuring an additional `secret` attribute that can be used to specify custom credentials that Seldon Core should use to authenticate to the persistent storage service where models are located. This is useful if you want to connect Seldon Core to a persistent storage service that is not supported as a ZenML Artifact Store, or if you don't want to configure or use the same credentials configured for your Artifact Store. The `secret` attribute must be set to the name of [a ZenML secret](/develop/usage/project-setup/use-secrets/interact-with-secrets) containing credentials configured in the format supported by Seldon Core. - - - This method is not recommended, because it limits the Seldon Core model deployer to a single persistent storage service, whereas using the Artifact Store credentials gives you more flexibility in combining the Seldon Core model deployer with any Artifact Store in the same ZenML stack. - -Seldon Core model servers use [rclone](https://rclone.org/) to connect to persistent storage services and the credentials that can be configured in the ZenML secret must also be in the configuration format supported by `rclone`. This section covers a few common use cases and provides examples of how to configure the ZenML secret to support them, but for more information on supported configuration options, you can always refer to the [rclone documentation for various providers](https://rclone.org/). - -Example of configuring a Seldon Core secret for AWS S3: - -```bash -zenml secret create s3-seldon-secret \ ---rclone_config_s3_type="s3" \ # set to 's3' for S3 storage. ---rclone_config_s3_provider="aws" \ # the S3 provider (e.g. aws, Ceph, Minio). ---rclone_config_s3_env_auth=False \ # set to true to use implicit AWS authentication from EC2/ECS meta data -# (i.e. with IAM roles configuration). Only applies if access_key_id and secret_access_key are blank. ---rclone_config_s3_access_key_id="" \ # AWS Access Key ID. ---rclone_config_s3_secret_access_key="" \ # AWS Secret Access Key. ---rclone_config_s3_session_token="" \ # AWS Session Token. ---rclone_config_s3_region="" \ # region to connect to. ---rclone_config_s3_endpoint="" \ # S3 API endpoint. - -# Alternatively for providing key-value pairs, you can utilize the '--values' option by specifying a file path containing -# key-value pairs in either JSON or YAML format. -# File content example: {"rclone_config_s3_type":"s3",...} -zenml secret create s3-seldon-secret \ - --values=@path/to/file.json -``` - -Example of configuring a Seldon Core secret for GCS: - -```bash -zenml secret create gs-seldon-secret \ ---rclone_config_gs_type="google cloud storage" \ # set to 'google cloud storage' for GCS storage. ---rclone_config_gs_client_secret="" \ # OAuth client secret. ---rclone_config_gs_token="" \ # OAuth Access Token as a JSON blob. ---rclone_config_gs_project_number="" \ # project number. ---rclone_config_gs_service_account_credentials="" \ #service account credentials JSON blob. ---rclone_config_gs_anonymous=False \ # Access public buckets and objects without credentials. -# Set to True if you just want to download files and don't configure credentials. 
---rclone_config_gs_auth_url="" \ # auth server URL. - -# Alternatively for providing key-value pairs, you can utilize the '--values' option by specifying a file path containing -# key-value pairs in either JSON or YAML format. -# File content example: {"rclone_config_gs_type":"google cloud storage",...} -zenml secret create gs-seldon-secret \ - --values=@path/to/file.json -``` - -Example of configuring a Seldon Core secret for Azure Blob Storage: - -```bash -zenml secret create az-seldon-secret \ ---rclone_config_az_type="azureblob" \ # set to 'azureblob' for Azure Blob Storage. ---rclone_config_az_account="" \ # storage Account Name. Leave blank to -# use SAS URL or MSI. ---rclone_config_az_key="" \ # storage Account Key. Leave blank to -# use SAS URL or MSI. ---rclone_config_az_sas_url="" \ # SAS URL for container level access -# only. Leave blank if using account/key or MSI. ---rclone_config_az_use_msi="" \ # use a managed service identity to -# authenticate (only works in Azure). ---rclone_config_az_client_id="" \ # client ID of the service principal -# to use for authentication. ---rclone_config_az_client_secret="" \ # client secret of the service -# principal to use for authentication. ---rclone_config_az_tenant="" \ # tenant ID of the service principal -# to use for authentication. - -# Alternatively for providing key-value pairs, you can utilize the '--values' option by specifying a file path containing -# key-value pairs in either JSON or YAML format. -# File content example: {"rclone_config_az_type":"azureblob",...} -zenml secret create az-seldon-secret \ - --values=@path/to/file.json -``` - - - -## How do you use it? - -### Requirements - -To run pipelines that deploy models to Seldon, you need the following tools installed locally: - -* [Docker](https://www.docker.com) -* [K3D](https://k3d.io/v5.2.1/#installation) (can be installed by running `curl -s https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash`). - -### Stack Component Registration - -For registering the model deployer, we need the URL of the Istio Ingress Gateway deployed on the Kubernetes cluster. We can get this URL by running the following command (assuming that the service name is `istio-ingressgateway`, deployed in the `istio-system` namespace): - -```bash -# For GKE clusters, the host is the GKE cluster IP address. - -export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}') - -# For EKS clusters, the host is the EKS cluster IP hostname. 
- -export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].hostname}') -``` - -Now register the model deployer: - -> **Note**: If you chose to configure your own custom credentials to authenticate to the persistent storage service where models are stored, as covered in the [Advanced: Configuring a Custom Seldon Core Secret](/develop/stack-components/model-deployers/seldon#managing-seldon-core-authentication) section, you will need to specify a ZenML secret reference when you configure the Seldon Core model deployer below: -> -> -> -> ```bash -> zenml model-deployer register seldon_deployer --flavor=seldon \ -> --kubernetes_context=<KUBERNETES_CONTEXT> \ -> --kubernetes_namespace=<KUBERNETES_NAMESPACE> \ -> --base_url=http://$INGRESS_HOST \ -> --secret=<ZENML_SECRET_NAME> -> ``` - - - -```bash -# Register the Seldon Core Model Deployer -zenml model-deployer register seldon_deployer --flavor=seldon \ - --kubernetes_context=<KUBERNETES_CONTEXT> \ - --kubernetes_namespace=<KUBERNETES_NAMESPACE> \ - --base_url=http://$INGRESS_HOST -``` - -We can now use the model deployer in our stack. - -```bash -zenml stack update seldon_stack --model-deployer=seldon_deployer -``` - -See the [seldon\_model\_deployer\_step](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-seldon/#zenml.integrations.seldon.steps.seldon%5Fdeployer.seldon%5Fmodel%5Fdeployer%5Fstep) for an example of using the Seldon Core Model Deployer to deploy a model inside a ZenML pipeline step. - -### Configuration - -Within the `SeldonDeploymentConfig` you can configure: - -* `model_name`: the name of the model in the Seldon cluster and in ZenML. -* `replicas`: the number of replicas with which to deploy the model. -* `implementation`: the type of Seldon inference server to use for the model. The implementation type can be one of the following: `TENSORFLOW_SERVER`, `SKLEARN_SERVER`, `XGBOOST_SERVER`, `custom`. -* `parameters`: an optional list of parameters (`SeldonDeploymentPredictorParameter`) to pass to the deployment predictor in the form of: - * `name` - * `type` - * `value` -* `resources`: the resources to be allocated to the model. This can be configured by passing a `SeldonResourceRequirements` object with the `requests` and `limits` properties. The values for these properties can be a dictionary with the `cpu` and `memory` keys. The values for these keys can be a string with the amount of CPU and memory to be allocated to the model. -* `serviceAccount`: the name of the Service Account applied to the deployment. - -For more information and a full list of configurable attributes of the Seldon Core Model Deployer, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-seldon/#zenml.integrations.seldon.model%5Fdeployers) . - -### Custom Code Deployment - -ZenML enables you to deploy your pre- and post-processing code into the deployment environment together with the model by defining a custom predict function that will be wrapped in a Docker container and executed on the model deployment server, e.g.: - -```py -def custom_predict( - model: Any, - request: Array_Like, -) -> Array_Like: - """Custom Prediction function. - - The custom predict function is the core of the custom deployment, the - function is called by the custom deployment class defined for the serving - tool. The current implementation requires the function to get the model - loaded in memory and a request with the data to predict. - - Args: - model: The model to use for prediction.
- request: The prediction request, with the data to predict, in an array-like format. - Returns: - The prediction in an array-like format. - """ - inputs = [] - for instance in request: - input = np.array(instance) - if not isinstance(input, np.ndarray): - raise Exception("The request must be a NumPy array") - processed_input = pre_process(input) - prediction = model.predict(processed_input) - postprocessed_prediction = post_process(prediction) - inputs.append(postprocessed_prediction) - return inputs - - -def pre_process(input: np.ndarray) -> np.ndarray: - """Pre-process the data to be used for prediction.""" - input = input / 255.0 - return input[None, :, :] - - -def post_process(prediction: np.ndarray) -> str: - """Post-process the model prediction.""" - classes = [str(i) for i in range(10)] - prediction = tf.nn.softmax(prediction, axis=-1) - maxindex = np.argmax(prediction.numpy()) - return classes[maxindex] -``` - - - The custom predict function should get the model and the input data as arguments and return the model predictions. ZenML will automatically take care of loading the model into memory and starting the `seldon-core-microservice` that will be responsible for serving the model and running the predict function. - - -After defining your custom predict function in code, you can use the `seldon_custom_model_deployer_step` to automatically build your function into a Docker image and deploy it as a model server by setting the `predict_function` argument to the path of your `custom_predict` function: - -```py -from zenml.integrations.seldon.steps import seldon_custom_model_deployer_step -from zenml.integrations.seldon.services import SeldonDeploymentConfig -# SeldonResourceRequirements is provided by the Seldon integration; adjust the -# import path to your ZenML version if needed -from zenml.integrations.seldon.seldon_client import SeldonResourceRequirements -from zenml import pipeline - -@pipeline -def seldon_deployment_pipeline(): - model = ... - seldon_custom_model_deployer_step( - model=model, - predict_function="", # TODO: path to custom code - service_config=SeldonDeploymentConfig( - model_name="", # TODO: name of the deployed model - replicas=1, - implementation="custom", - resources=SeldonResourceRequirements( - limits={"cpu": "200m", "memory": "250Mi"} - ), - serviceAccountName="kubernetes-service-account", - ), - ) -``` - -#### Advanced Custom Code Deployment with Seldon Core Integration - - - Before creating your custom model class, you should take a look at the [custom Python model](https://docs.seldon.io/projects/seldon-core/en/latest/python/python%5Fwrapping%5Fdocker.html) section of the Seldon Core documentation. - - -The built-in Seldon Core custom deployment step is a good starting point for deploying your custom models. However, if you want to deploy more than the trained model, you can create your own custom class and a custom step to achieve this. - -See the [ZenML custom Seldon model class](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-seldon/#zenml.integrations.seldon.custom%5Fdeployer.zenml%5Fcustom%5Fmodel.ZenMLCustomModel) as a reference. diff --git a/develop/stack-components/model-registries/custom.mdx b/develop/stack-components/model-registries/custom.mdx deleted file mode 100644 index 2dd8958..0000000 --- a/develop/stack-components/model-registries/custom.mdx +++ /dev/null @@ -1,191 +0,0 @@ ---- -title: "Develop a Custom Model Registry" -description: "Learning how to develop a custom model registry."
-icon: shuffle --- - - - Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts. - - - -**Base abstraction in progress!** - -The Model registry stack component is relatively new in ZenML. While it is fully functional, it can be challenging to cover all the ways ML systems deal with model versioning. This means that the API might change in the future. We will keep this page up-to-date with the latest changes. - -If you are writing a custom model registry flavor, and you find that the base abstraction is lacking or not flexible enough, please let us know by messaging us on [Slack](https://zenml.io/slack), or by opening an issue on [GitHub](https://github.com/zenml-io/zenml/issues/new/choose). - - -### Base Abstraction - -The `BaseModelRegistry` is the abstract base class that needs to be subclassed in order to create a custom component that can be used to register and retrieve models. As model registries can come in many shapes and forms, the base class exposes a deliberately basic and generic interface: - -```py -from abc import ABC, abstractmethod -from enum import Enum -from typing import Any, Dict, List, Optional, Type, cast - -from pydantic import BaseModel, Field, root_validator - -from zenml.enums import StackComponentType -from zenml.stack import Flavor, StackComponent -from zenml.stack.stack_component import StackComponentConfig - - -class BaseModelRegistryConfig(StackComponentConfig): - """Base config for model registries.""" - - -class BaseModelRegistry(StackComponent, ABC): - """Base class for all ZenML model registries.""" - - @property - def config(self) -> BaseModelRegistryConfig: - """Returns the config of the model registry.""" - return cast(BaseModelRegistryConfig, self._config) - - # --------- - # Model Registration Methods - # --------- - - @abstractmethod - def register_model( - self, - name: str, - description: Optional[str] = None, - tags: Optional[Dict[str, str]] = None, - ) -> RegisteredModel: - """Registers a model in the model registry.""" - - @abstractmethod - def delete_model( - self, - name: str, - ) -> None: - """Deletes a registered model from the model registry.""" - - @abstractmethod - def update_model( - self, - name: str, - description: Optional[str] = None, - tags: Optional[Dict[str, str]] = None, - ) -> RegisteredModel: - """Updates a registered model in the model registry.""" - - @abstractmethod - def get_model(self, name: str) -> RegisteredModel: - """Gets a registered model from the model registry.""" - - @abstractmethod - def list_models( - self, - name: Optional[str] = None, - tags: Optional[Dict[str, str]] = None, - ) -> List[RegisteredModel]: - """Lists all registered models in the model registry.""" - - # --------- - # Model Version Methods - # --------- - - @abstractmethod - def register_model_version( - self, - name: str, - description: Optional[str] = None, - tags: Optional[Dict[str, str]] = None, - model_source_uri: Optional[str] = None, - version: Optional[str] = None, - metadata: Optional[Dict[str, str]] = None, - zenml_version: Optional[str] = None, - zenml_run_name: Optional[str] = None, - zenml_pipeline_name: Optional[str] = None, - zenml_step_name: Optional[str] = None, - **kwargs: Any,
- ) -> RegistryModelVersion: - """Registers a model version in the model registry.""" - - @abstractmethod - def delete_model_version( - self, - name: str, - version: str, - ) -> None: - """Deletes a model version from the model registry.""" - - @abstractmethod - def update_model_version( - self, - name: str, - version: str, - description: Optional[str] = None, - tags: Optional[Dict[str, str]] = None, - stage: Optional[ModelVersionStage] = None, - ) -> RegistryModelVersion: - """Updates a model version in the model registry.""" - - @abstractmethod - def list_model_versions( - self, - name: Optional[str] = None, - model_source_uri: Optional[str] = None, - tags: Optional[Dict[str, str]] = None, - **kwargs: Any, - ) -> List[RegistryModelVersion]: - """Lists all model versions for a registered model.""" - - @abstractmethod - def get_model_version(self, name: str, version: str) -> RegistryModelVersion: - """Gets a model version for a registered model.""" - - @abstractmethod - def load_model_version( - self, - name: str, - version: str, - **kwargs: Any, - ) -> Any: - """Loads a model version from the model registry.""" - - @abstractmethod - def get_model_uri_artifact_store( - self, - model_version: RegistryModelVersion, - ) -> str: - """Gets the URI artifact store for a model version.""" -``` - - - This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. To see the full implementation and get the complete docstrings, please check [the source code on GitHub](https://github.com/zenml-io/zenml/blob/main/src/zenml/model%5Fregistries/base%5Fmodel%5Fregistry.py) . - - -### Build your own custom model registry - -If you want to create your own custom flavor for a model registry, you can follow the following steps: - -1. Learn more about the core concepts for the model registry [here](/develop/stack-components/model-registries#model-registry-concepts-and-terminology). Your custom model registry will be built on top of these concepts so it helps to be aware of them. -2. Create a class that inherits from `BaseModelRegistry` and implements the abstract methods. -3. Create a `ModelRegistryConfig` class that inherits from `BaseModelRegistryConfig` and adds any additional configuration parameters that you need. -4. Bring the implementation and the configuration together by inheriting from the `BaseModelRegistryFlavor` class. Make sure that you give a `name` to the flavor through its abstract property. - -Once you are done with the implementation, you can register it through the CLI with the following command: - -```bash -zenml model-registry flavor register -``` - - -It is important to draw attention to how and when these base abstractions are coming into play in a ZenML workflow. - -* The **CustomModelRegistryFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomModelRegistryConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Most of all, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` objects are `pydantic` objects under the hood, you can also add your own custom validators here. -* The **CustomModelRegistry** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. 
This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomModelRegistryFlavor` and the `CustomModelRegistryConfig` are implemented in a different module/path than the actual `CustomModelRegistry`). - - -For a full implementation example, please check out the [MLFlowModelRegistry](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-mlflow/#zenml.integrations.mlflow.model%5Fregistry.MLFlowModelRegistry) - diff --git a/develop/stack-components/model-registries/mlflow.mdx b/develop/stack-components/model-registries/mlflow.mdx deleted file mode 100644 index c6e829c..0000000 --- a/develop/stack-components/model-registries/mlflow.mdx +++ /dev/null @@ -1,184 +0,0 @@ ---- -title: "MLflow Model Registry" -description: "Managing MLFlow logged models and artifacts" -icon: leaf ---- - -[MLflow](https://www.mlflow.org/docs/latest/tracking.html) is a popular tool that helps you track experiments, manage models and even deploy them to different environments. ZenML already provides a [MLflow Experiment Tracker](/develop/stack-components/experiment-trackers/mlflow) that you can use to track your experiments, and an [MLflow Model Deployer](/develop/stack-components/model-deployers/mlflow) that you can use to deploy your models locally. - -The MLflow model registry uses [the MLflow model registry service](https://mlflow.org/docs/latest/model-registry.html) to manage and track ML models and their artifacts and provides a user interface to browse them: - - - - - - - -## When would you want to use it? - -You can use the MLflow model registry throughout your experimentation, QA, and production phases to manage and track machine learning model versions. It is designed to help teams collaborate on model development and deployment, and keep track of which models are being used in which environments. With the MLflow model registry, you can store and manage models, deploy them to different environments, and track their performance over time. - -This is particularly useful in the following scenarios: - -* If you are working on a machine learning project and want to keep track of different model versions as they are developed and deployed. -* If you need to deploy machine learning models to different environments and want to keep track of which version is being used in each environment. -* If you want to monitor and compare the performance of different model versions over time and make data-driven decisions about which models to use in production. -* If you want to simplify the process of deploying models either to a production environment or to a staging environment for testing. - -## How do you deploy it? - -The MLflow Experiment Tracker flavor is provided by the MLflow ZenML integration, so you need to install it on your local machine to be able to register an MLflow model registry component. Note that the MLFlow model registry requires [MLFlow Experiment Tracker](/develop/stack-components/experiment-trackers/mlflow) to be present in the stack. - -```bash -zenml integration install mlflow -y -``` - -Once the MLflow integration is installed, you can register an MLflow model registry component in your stack: - -```bash -zenml model-registry register mlflow_model_registry --flavor=mlflow -# Register and set a stack with the new model registry as the active stack -zenml stack register custom_stack -r mlflow_model_registry ... 
--set -``` - - - The MLFlow model registry will automatically use the same configuration as the MLFlow Experiment Tracker. So if you have a remote MLFlow tracking server configured in your stack, the MLFlow model registry will also use the same configuration. - - -Due to a [critical severity vulnerability](https://github.com/advisories/GHSA-xg73-94fp-g449) found in older versions of MLflow, we recommend using MLflow version 2.2.1 or higher. - - -## How do you use it? - -There are different ways to use the MLflow model registry. You can use it in your ZenML pipelines with the built-in step, or you can use the ZenML CLI to register your model manually or call the model registry API within a custom step in your pipeline. The following sections show you how to use the MLflow model registry in your ZenML pipelines and with the ZenML CLI: - -### Register models inside a pipeline - -ZenML provides a predefined `mlflow_register_model_step` that you can use to register a model in the MLflow model registry which you have previously logged to MLflow: - -```py -from zenml import pipeline -from zenml.integrations.mlflow.steps.mlflow_registry import ( - mlflow_register_model_step, -) - -@pipeline -def mlflow_registry_training_pipeline(): - model = ... - mlflow_register_model_step( - model=model, - name="tensorflow-mnist-model", - ) -``` - - The `mlflow_register_model_step` expects that the `model` it receives has already been logged to MLflow in a previous step. E.g., for a scikit-learn model, you would need to have used `mlflow.sklearn.autolog()` or `mlflow.sklearn.log_model(model)` in a previous step. See the [MLflow experiment tracker documentation](/develop/stack-components/experiment-trackers/mlflow) for more information on how to log models to MLflow from your ZenML steps. - - -#### List of available parameters - -When using the `mlflow_register_model_step`, you can set a variety of parameters for fine-grained control over which information is logged with your model: - -* `name`: The name of the model. This is a required parameter. -* `version`: The version of the model. -* `trained_model_name`: Name of the model artifact in MLflow. -* `model_source_uri`: The path to the model. If not provided, the model will be fetched from the MLflow tracking server via the `trained_model_name`. -* `description`: A description of the model version. -* `metadata`: Metadata to associate with the model version. - - -The `model_source_uri` parameter is the path to the model within the MLflow tracking server. - -If you are using a local MLflow tracking server, the path will be something like `file:///.../mlruns/667102566783201219/3973eabc151c41e6ab98baeb20c5323b/artifacts/model`. - -If you are using a remote MLflow tracking server, the path will be something like `s3://.../mlruns/667102566783201219/3973eabc151c41e6ab98baeb20c5323b/artifacts/model`. - -You can find the path of the model in the MLflow UI. Go to the `Artifacts` tab of the run that produced the model and click on the model. The path will be displayed in the URL: - - -### Register models via the CLI - -Sometimes adding the `mlflow_register_model_step` to your pipeline might not be the best option for you, as it will register a model in the MLflow model registry every time you run the pipeline.
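As mentioned above, you can also call the model registry API from within a custom step instead of using the built-in step or the CLI. A minimal sketch (assuming the active stack contains an MLflow model registry; the model name and description are illustrative) could look like this:

```py
from zenml import step
from zenml.client import Client


@step
def register_model_version_step(model_source_uri: str) -> None:
    """Registers a model version that was previously logged to MLflow."""
    # Fetch the model registry from the currently active ZenML stack
    model_registry = Client().active_stack.model_registry

    # Register a new version of the model; the keyword arguments mirror the
    # base model registry abstraction described in these docs
    model_registry.register_model_version(
        name="tensorflow-mnist-model",
        model_source_uri=model_source_uri,
        description="Model version registered from a custom ZenML step",
    )
```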
- -If you want to register your models manually, you can use the `zenml model-registry models register-version` CLI command instead: - -```bash -zenml model-registry models register-version Tensorflow-model \ - --description="A new version of the tensorflow model with accuracy 98.88%" \ - -v 1 \ - --model-uri="file:///.../mlruns/667102566783201219/3973eabc151c41e6ab98baeb20c5323b/artifacts/model" \ - -m key1 value1 -m key2 value2 \ - --zenml-pipeline-name="mlflow_training_pipeline" \ - --zenml-step-name="trainer" -``` - -### Deploy a registered model - -After you have registered a model in the MLflow model registry, you can also easily deploy it as a prediction service. Checkout the [MLflow model deployer documentation](/develop/stack-components/model-deployers/mlflow#deploy-from-model-registry) for more information on how to do that. - -### Interact with registered models - -You can also use the ZenML CLI to interact with registered models and their versions. - -The `zenml model-registry models list` command will list all registered models in the model registry: - -```bash - -$ zenml model-registry models list - -┏━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┯━━━━━━━━━━┓ -┃ NAME │ DESCRIPTION │ METADATA ┃ -┠────────────────────────┼─────────────┼──────────┨ -┃ tensorflow-mnist-model │ │ ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┷━━━━━━━━━━┛ -``` - -To list all versions of a specific model, you can use the `zenml model-registry models list-versions REGISTERED_MODEL_NAME` command: - -``` -$ zenml model-registry models list-versions tensorflow-mnist-model -┏━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ NAME │ MODEL_VERSION │ VERSION_DESCRIPTION │ METADATA ┃ -┠────────────────────────┼───────────────┼─────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ tensorflow-mnist-model │ 3 │ Run #3 of the mlflow_training_pipeline. │ {'zenml_version': '0.34.0', 'zenml_run_name': 'mlflow_training_pipeline-2023_03_01-08_09_23_672599', 'zenml_pipeline_name': 'mlflow_training_pipeline', ┃ -┃ │ │ │ 'zenml_pipeline_run_uuid': 'a5d4faae-ef70-48f2-9893-6e65d5e51e98', 'epochs': '5', 'optimizer': 'Adam', 'lr': '0.005'} ┃ -┠────────────────────────┼───────────────┼─────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ tensorflow-mnist-model │ 2 │ Run #2 of the mlflow_training_pipeline. │ {'zenml_version': '0.34.0', 'zenml_run_name': 'mlflow_training_pipeline-2023_03_01-08_09_08_467212', 'zenml_pipeline_name': 'mlflow_training_pipeline', ┃ -┃ │ │ │ 'zenml_pipeline_run_uuid': '11858dcf-3e47-4b1a-82c5-6fa25ba4e037', 'epochs': '5', 'optimizer': 'Adam', 'lr': '0.003'} ┃ -┠────────────────────────┼───────────────┼─────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ tensorflow-mnist-model │ 1 │ Run #1 of the mlflow_training_pipeline. 
│ {'zenml_version': '0.34.0', 'zenml_run_name': 'mlflow_training_pipeline-2023_03_01-08_08_52_398499', 'zenml_pipeline_name': 'mlflow_training_pipeline', ┃ -┃ │ │ │ 'zenml_pipeline_run_uuid': '29fb22c1-6e0b-4431-9e04-226226506d16', 'epochs': '5', 'optimizer': 'Adam', 'lr': '0.001'} ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -For more details on a specific model version, you can use the `zenml model-registry models get-version REGISTERED_MODEL_NAME -v VERSION` command: - - -```bash -$ zenml model-registry models get-version tensorflow-mnist-model -v 1 - -┏━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ MODEL VERSION PROPERTY │ VALUE ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ REGISTERED_MODEL_NAME │ tensorflow-mnist-model ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ VERSION │ 1 ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ VERSION_DESCRIPTION │ Run #1 of the mlflow_training_pipeline. 
┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ CREATED_AT │ 2023-03-01 09:09:06.899000 ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ UPDATED_AT │ 2023-03-01 09:09:06.899000 ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ METADATA │ {'zenml_version': '0.34.0', 'zenml_run_name': 'mlflow_training_pipeline-2023_03_01-08_08_52_398499', 'zenml_pipeline_name': 'mlflow_training_pipeline', 'zenml_pipeline_run_uuid': '29fb22c1-6e0b-4431-9e04-226226506d16', ┃ -┃ │ 'lr': '0.001', 'epochs': '5', 'optimizer': 'Adam'} ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ MODEL_SOURCE_URI │ file:///Users/safoine-zenml/Library/Application Support/zenml/local_stores/0902a511-117d-4152-a098-b2f1124c4493/mlruns/489728212459131640/293a0d2e71e046999f77a79639f6eac2/artifacts/model ┃ -┠────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ -┃ STAGE │ None ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -Finally, to delete a registered model or a specific model version, you can use the `zenml model-registry models delete REGISTERED_MODEL_NAME` and `zenml model-registry models delete-version REGISTERED_MODEL_NAME -v VERSION` commands respectively. - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-mlflow/#zenml.integrations.mlflow.model%5Fregistry.MLFlowModelRegistry) to see more about the interface and implementation. - diff --git a/develop/stack-components/model-registries/model-registries.mdx b/develop/stack-components/model-registries/model-registries.mdx deleted file mode 100644 index b0ffeba..0000000 --- a/develop/stack-components/model-registries/model-registries.mdx +++ /dev/null @@ -1,58 +0,0 @@ ---- -title: "Overview" -description: "Tracking and managing ML models." -icon: book ---- - -Model registries are centralized storage solutions for managing and tracking machine learning models across various stages of development and deployment. They help track the different versions and configurations of each model and enable reproducibility. By storing metadata such as version, configuration, and metrics, model registries help streamline the management of trained models. 
In ZenML, model registries are Stack Components that allow for the easy retrieval, loading, and deployment of trained models. They also provide information on the pipeline in which the model was trained and how to reproduce it. - -### Model Registry Concepts and Terminology - -ZenML provides a unified abstraction for model registries through which it is possible to handle and manage the concepts of model groups, versions, and stages in a consistent manner regardless of the underlying registry tool or platform being used. The following concepts are useful to be aware of for this abstraction: - -* **RegisteredModel**: A logical grouping of models that can be used to track different versions of a model. It holds information about the model, such as its name, description, and tags, and can be created by the user or automatically created by the model registry when a new model is logged. -* **RegistryModelVersion**: A specific version of a model identified by a unique version number or string. It holds information about the model, such as its name, description, tags, and metrics, and a reference to the model artifact logged to the model registry. In ZenML, it also holds a reference to the pipeline name, pipeline run ID, and step name. Each model version is associated with a model registration. -* **ModelVersionStage**: A model version stage is a state that a model version can be in. It can be one of the following: `None`, `Staging`, `Production`, `Archived`. The model version stage is used to track the lifecycle of a model version. For example, a model version can be in the `Staging` stage while it is being tested and then moved to the `Production` stage once it is ready for deployment. - -### When to use it - -ZenML provides a built-in mechanism for storing and versioning pipeline artifacts through its mandatory Artifact Store. While this is a powerful way to manage artifacts programmatically, it can be challenging to use without a visual interface. - -Model registries, on the other hand, offer a visual way to manage and track model metadata, particularly when using a remote orchestrator. They make it easy to retrieve and load models from storage, thanks to built-in integrations. A model registry is an excellent choice for interacting with all the models in your pipeline and managing their state in a centralized way. - -Using a model registry in your stack is particularly useful if you want to interact with all the logged models in your pipeline, or if you need to manage the state of your models in a centralized way and make it easy to retrieve, load, and deploy these models. - -### How model registries fit into the ZenML stack - -Here is an architecture diagram that shows how a model registry fits into the overall story of a remote stack.
- - - - - - -#### Model Registry Flavors - -Model Registries are optional stack components provided by integrations: - -| Model Registry | Flavor | Integration | Notes | -| ------------------------------------------------------------------ | -------- | ----------- | ------------------------------------------ | -| [MLflow](/develop/stack-components/model-registries/mlflow) | mlflow | mlflow | Add MLflow as Model Registry to your stack | -| [Custom Implementation](/develop/stack-components/model-registries/custom) | _custom_ | _custom_ | | - -If you would like to see the available flavors of Model Registry, you can use the command: - -```bash -zenml model-registry flavor list -``` - -### How to use it - -Model registries are an optional component in the ZenML stack that is tied to the experiment tracker. This means that a model registry can only be used if you are also using an experiment tracker. If you're not using an experiment tracker, you can still store your models in ZenML, but you will need to manually retrieve model artifacts from the artifact store. More information on this can be found in the [documentation on the fetching runs](/develop/usage/pipelines/build-pipelines/fetching-pipelines). - -To use model registries, you first need to register a model registry in your stack with the same flavor as your experiment tracker. Then, you can register your trained model in the model registry using one of three methods: - -* (1) using the built-in step in the pipeline. -* (2) using the ZenML CLI to register the model from the command line. -* (3) registering the model from the model registry UI. Finally, you can use the model registry to retrieve and load your models for deployment or further experimentation. diff --git a/develop/stack-components/orchestrators/airflow.mdx b/develop/stack-components/orchestrators/airflow.mdx deleted file mode 100644 index 139a295..0000000 --- a/develop/stack-components/orchestrators/airflow.mdx +++ /dev/null @@ -1,222 +0,0 @@ ---- -title: "Airflow Orchestrator" -description: "Orchestrating your pipelines to run on Airflow." -icon: plane ---- - -ZenML pipelines can be executed natively as [Airflow](https://airflow.apache.org/) DAGs. This brings together the power of the Airflow orchestration with the ML-specific benefits of ZenML pipelines. Each ZenML step runs in a separate Docker container which is scheduled and started using Airflow. - - - If you're going to use a remote deployment of Airflow, you'll also need a [remote ZenML deployment](/develop/getting-started/deploying-zenml). - - -### When to use it - -You should use the Airflow orchestrator if - -* you're looking for a proven production-grade orchestrator. -* you're already using Airflow. -* you want to run your pipelines locally. -* you're willing to deploy and maintain Airflow. - -### How to deploy it - -The Airflow orchestrator can be used to run pipelines locally as well as remotely. In the local case, no additional setup is necessary. - -There are many options to use a deployed Airflow server: - -* Use one of [ZenML's Airflow stack recipes](https://github.com/zenml-io/mlstacks). This is the simplest solution to get ZenML working with Airflow, as the recipe also takes care of additional steps such as installing required Python dependencies in your Airflow server environment. 
-* Use a managed deployment of Airflow such as [Google Cloud Composer](https://cloud.google.com/composer) , [Amazon MWAA](https://aws.amazon.com/managed-workflows-for-apache-airflow/), or [Astronomer](https://www.astronomer.io/). -* Deploy Airflow manually. Check out the official [Airflow docs](https://airflow.apache.org/docs/apache-airflow/stable/production-deployment.html) for more information. - -If you're not using `mlstacks` to deploy Airflow, there are some additional Python packages that you'll need to install in the Python environment of your Airflow server: - -* `pydantic~=2.7.1`: The Airflow DAG files that ZenML creates for you require Pydantic to parse and validate configuration files. -* `apache-airflow-providers-docker` or `apache-airflow-providers-cncf-kubernetes`, depending on which Airflow operator you'll be using to run your pipeline steps. Check out [this section](/develop/stack-components/orchestrators/airflow#using-different-airflow-operators) for more information on supported operators. - -### How to use it - -To use the Airflow orchestrator, we need: - -* The ZenML `airflow` integration installed. If you haven't done so, run - -```bash -zenml integration install airflow -``` -* [Docker](https://docs.docker.com/get-docker/) installed and running. -* The orchestrator registered and part of our active stack: - -```bash -zenml orchestrator register \ - --flavor=airflow \ - --local=True # set this to `False` if using a remote Airflow deployment - -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - - -Due to dependency conflicts, we need to install the Python packages to start a local Airflow server in a separate Python environment. - -```bash -# Create a fresh virtual environment in which we install the Airflow server dependencies -python -m venv airflow_server_environment -source airflow_server_environment/bin/activate - -# Install the Airflow server dependencies -pip install "apache-airflow==2.4.0" "apache-airflow-providers-docker<3.8.0" "pydantic~=2.7.1" -``` - -Before starting the local Airflow server, we can set a few environment variables to configure it: - -* `AIRFLOW_HOME`: This variable defines the location where the Airflow server stores its database and configuration files. The default value is `~/airflow`. -* `AIRFLOW__CORE__DAGS_FOLDER`: This variable defines the location where the Airflow server looks for DAG files. The default value is `/dags`. -* `AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL`: This variable controls how often the Airflow scheduler checks for new or updated DAGs. By default, the scheduler will check for new DAGs every 30 seconds. This variable can be used to increase or decrease the frequency of the checks, depending on the specific needs of your pipeline. - - -When running this on MacOS, you might need to set the `no_proxy` environment variable to prevent crashes due to a bug in Airflow (see [this page](https://github.com/apache/airflow/issues/28487) for more information): - -```bash -export no_proxy=* -``` - - -We can now start the local Airflow server by running the following command: - -```bash -# Switch to the Python environment that has Airflow installed before running this command - -airflow standalone -``` - -This command will start up an Airflow server on your local machine. During the startup, it will print a username and password which you can use to log in to the Airflow UI [here](http://0.0.0.0:8080). 
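If you want the local Airflow server to store its files somewhere else or to pick up new DAGs faster, you can export the environment variables described above before running `airflow standalone`. A small sketch with illustrative values:

```bash
# Run these in the Airflow Python environment before `airflow standalone`
export AIRFLOW_HOME=~/airflow                           # where Airflow keeps its database and config
export AIRFLOW__CORE__DAGS_FOLDER="$AIRFLOW_HOME/dags"  # where Airflow looks for DAG files
export AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL=10     # check for new DAGs every 10 seconds
```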
- -We can now switch back to the Python environment in which ZenML is installed and run a pipeline: - -```bash -# Switch to the Python environment that has ZenML installed before running this command - -python file_that_runs_a_zenml_pipeline.py -``` - -This call will produce a `.zip` file containing a representation of your ZenML pipeline for Airflow. The location of this `.zip` file will be in the logs of the command above. We now need to copy this file to the Airflow DAGs directory, from where the local Airflow server will load it and run your pipeline (it might take a few seconds until the pipeline shows up in the Airflow UI). To figure out the DAGs directory, we can run `airflow config get-value core DAGS_FOLDER` while having our Python environment with the Airflow installation active. - -To make this process easier, we can configure our ZenML Airflow orchestrator to automatically copy the `.zip` file to this directory for us. To do so, run the following command: - -```bash -# Switch to the Python environment that has ZenML installed before running this command - -zenml orchestrator update --dag_output_dir= -``` - -Now that we've set this up, running a pipeline in Airflow is as simple as just running the Python file: - -```bash -# Switch to the Python environment that has ZenML installed before running this command - -python file_that_runs_a_zenml_pipeline.py -``` - - -When using the Airflow orchestrator with a remote deployment, you'll additionally need: - -* A remote ZenML server deployed to the cloud. See the [deployment guide](/develop/getting-started/deploying-zenml) for more information. -* A deployed Airflow server. See the [deployment section](/develop/stack-components/orchestrators/airflow#how-to-deploy-it) for more information. -* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. -* A [remote container registry](/develop/stack-components/container-registries) as part of your stack. - -In the remote case, the Airflow orchestrator works differently than other ZenML orchestrators. Executing a Python file which runs a pipeline by calling `pipeline.run()` will not actually run the pipeline, but instead will create a `.zip` file containing an Airflow representation of your ZenML pipeline. In one additional step, you need to make sure this zip file ends up in the [DAGs directory](https://airflow.apache.org/docs/apache-airflow/stable/concepts/overview.html#architecture-overview) of your Airflow deployment. - - - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your pipeline steps in Airflow. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - -#### Scheduling - -You can [schedule pipeline runs](/develop/usage/pipelines/build-pipelines/schedule-a-pipeline) on Airflow similarly to other orchestrators.
However, note that **Airflow schedules always need to be set in the past**, e.g.,: - -```py -from datetime import datetime, timedelta - -from zenml.pipelines import Schedule - -scheduled_pipeline = fashion_mnist_pipeline.with_options( - schedule=Schedule( - start_time=datetime.now() - timedelta(hours=1), # start in the past - end_time=datetime.now() + timedelta(hours=1), - interval_second=timedelta(minutes=15), # run every 15 minutes - catchup=False, - ) -) -scheduled_pipeline() -``` - -#### Airflow UI - -Airflow comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps. For local Airflow, you can find the Airflow UI at [http://localhost:8080](http://localhost:8080) by default. - - - If you cannot see the Airflow UI credentials in the console, you can find the password in `/standalone_admin_password.txt`. `AIRFLOW_HOME` will usually be `~/airflow` unless you've manually configured it with the `AIRFLOW_HOME` environment variable. You can always run `airflow info` to figure out the directory for the active environment. - -The username will always be `admin`. - -#### Additional configuration - -For additional configuration of the Airflow orchestrator, you can pass `AirflowOrchestratorSettings` when defining or running your pipeline. Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-airflow/#zenml.integrations.airflow.flavors.airflow%5Forchestrator%5Fflavor.AirflowOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files) for more information on how to specify settings. - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. - -#### Using different Airflow operators - -Airflow operators specify how a step in your pipeline gets executed. As ZenML relies on Docker images to run pipeline steps, only operators that support executing a Docker image work in combination with ZenML. Airflow comes with two operators that support this: - -* the `DockerOperator` runs the Docker images for executing your pipeline steps on the same machine that your Airflow server is running on. For this to work, the server environment needs to have the `apache-airflow-providers-docker` package installed. -* the `KubernetesPodOperator` runs the Docker image on a pod in the Kubernetes cluster that the Airflow server is deployed to. For this to work, the server environment needs to have the `apache-airflow-providers-cncf-kubernetes` package installed. - -You can specify which operator to use and additional arguments to it as follows: - -```py -from zenml.integrations.airflow.flavors.airflow_orchestrator_flavor import AirflowOrchestratorSettings - -airflow_settings = AirflowOrchestratorSettings( - # This could also be a reference to one of your custom classes. - # e.g. 
`my_module.MyCustomOperatorClass` as long as the class - # is importable in your Airflow server environment - operator="airflow.providers.docker.operators.docker.DockerOperator", - # Dictionary of arguments to pass to the operator __init__ method - operator_args={} -) -``` - -**Custom operators** - -If you want to use any other operator to run your steps, you can specify the `operator` in your `AirflowOrchestratorSettings` as a path to the Python operator class: - -```py -from zenml.integrations.airflow.flavors.airflow_orchestrator_flavor import AirflowOrchestratorSettings - -airflow_settings = AirflowOrchestratorSettings( - # This could also be a reference to one of your custom classes. - # e.g. `my_module.MyCustomOperatorClass` as long as the class - # is importable in your Airflow server environment - operator="airflow.providers.docker.operators.docker.DockerOperator", - # Dictionary of arguments to pass to the operator __init__ method - operator_args={} -) -``` - -**Custom DAG generator file** - -To run a pipeline in Airflow, ZenML creates a Zip archive that contains two files: - -* A JSON configuration file that the orchestrator creates. This file contains all the information required to create the Airflow DAG to run the pipeline. -* A Python file that reads this configuration file and actually creates the Airflow DAG. We call this file the `DAG generator` and you can find the implementation [here](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/airflow/orchestrators/dag%5Fgenerator.py) . - -If you need more control over how the Airflow DAG is generated, you can provide a custom DAG generator file using the setting `custom_dag_generator`. This setting will need to reference a Python module that can be imported into your active Python environment. It will additionally need to contain the same classes (`DagConfiguration` and `TaskConfiguration`) and constants (`ENV_ZENML_AIRFLOW_RUN_ID`, `ENV_ZENML_LOCAL_STORES_PATH` and `CONFIG_FILENAME`) as the [original module](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/airflow/orchestrators/dag%5Fgenerator.py) . For this reason, we suggest starting by copying the original module and modifying it according to your needs. - -Check out our docs on how to apply settings to your pipelines [here](/develop/usage/project-setup/use-configuration-files). - -For more information and a full list of configurable attributes of the Airflow orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-airflow/#zenml.integrations.airflow.orchestrators.airflow%5Forchestrator.AirflowOrchestrator) . diff --git a/develop/stack-components/orchestrators/azureml.mdx b/develop/stack-components/orchestrators/azureml.mdx deleted file mode 100644 index 07726ac..0000000 --- a/develop/stack-components/orchestrators/azureml.mdx +++ /dev/null @@ -1,189 +0,0 @@ ---- -title: "AzureML Orchestrator" -description: "Orchestrating your pipelines to run on AzureML." -icon: grunt ---- - -[AzureML](https://azure.microsoft.com/en-us/products/machine-learning) is a cloud-based orchestration service provided by Microsoft that enables data scientists, machine learning engineers, and developers to build, train, deploy, and manage machine learning models. It offers a comprehensive and integrated environment that supports the entire machine learning lifecycle, from data preparation and model development to deployment and monitoring.
- -## When to use it - -You should use the AzureML orchestrator if: - -* you're already using Azure. -* you're looking for a proven production-grade orchestrator. -* you're looking for a UI in which you can track your pipeline runs. -* you're looking for a managed solution for running your pipelines. - -## How it works - -The ZenML AzureML orchestrator implementation uses [the Python SDK v2 of AzureML](https://learn.microsoft.com/en-gb/python/api/overview/azure/ai-ml-readme?view=azure-python) to allow our users to build their Machine Learning pipelines. For each ZenML step, it creates an AzureML [CommandComponent](https://learn.microsoft.com/en-us/python/api/azure-ai-ml/azure.ai.ml.entities.commandcomponent?view=azure-python) and brings them together in a pipeline. - -## How to deploy it - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including an AzureML orchestrator? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML Azure Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -In order to use an AzureML orchestrator, you need to first deploy [ZenML to the cloud](/develop/getting-started/deploying-zenml). It would be recommended to deploy ZenML in the same region as you plan on using for AzureML, but it is not necessary to do so. You must ensure that you are [connected to the remote ZenML server](/develop/usage/resource-data-management/connecting-to-zenml/connect-in-with-your-user-interactive) before using this stack component. - -## How to use it - -In order to use the AzureML orchestrator, you need: - -* The ZenML `azure` integration installed. If you haven't done so, run: - -```bash -zenml integration install azure -``` - -* [Docker](https://www.docker.com) installed and running or a remote image builder in your stack. -* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. -* A [remote container registry](/develop/stack-components/container-registries) as part of your stack. -* An [Azure resource group equipped with an AzureML workspace](https://learn.microsoft.com/en-us/azure/machine-learning/quickstart-create-resources?view=azureml-api-2) to run your pipeline on. - -There are two ways of authenticating your orchestrator with AzureML: - -1. **Default Authentication** simplifies the authentication process while developing your workflows that deploy to Azure by combining credentials used in Azure hosting environments and credentials used in local development. -2. **Service Principal Authentication (recommended)** is using the concept of service principals on Azure to allow you to connect your cloud components with proper authentication. For this method, you will need to [create a service principal on Azure](https://learn.microsoft.com/en-us/azure/developer/python/sdk/authentication-on-premises-apps?tabs=azure-portal), assign it the correct permissions and use it to [register a ZenML Azure Service Connector](https://docs.zenml.io/usage/resource-data-management/auth-management/azure-service-connector). 
- -```bash -zenml service-connector register --type azure -i -zenml orchestrator connect -c -``` - -## Docker - -For each pipeline run, ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your pipeline steps in AzureML. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - -## AzureML UI - -Each AzureML workspace comes equipped with an Azure Machine Learning studio. Here you can inspect, manage, and debug your pipelines and steps. - - - - - -Double-clicking any of the steps on this view will open up the overview page for that specific step. Here you can check the configuration of the component and its execution logs. - -## Settings - -The ZenML AzureML orchestrator comes with a dedicated class called `AzureMLOrchestratorSettings` for configuring its settings, and it controls the compute resources used for pipeline execution in AzureML. - -Currently, it supports three different modes of operation. - -### 1\. Serverless Compute (Default) - -* Set `mode` to `serverless`. -* Other parameters are ignored. - -**Example:** - -```py -from zenml import step, pipeline -from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings - -azureml_settings = AzureMLOrchestratorSettings( - mode="serverless" # It's the default behavior -) - -@step -def example_step() -> int: - return 3 - - -@pipeline(settings={"orchestrator": azureml_settings}) -def pipeline(): - example_step() - -pipeline() -``` - -### 2\. Compute Instance - -* Set `mode` to `compute-instance`. -* Requires a `compute_name`. - * If a compute instance with the same name exists, it uses the existing compute instance and ignores other parameters. (It will throw a warning if the provided configuration does not match the existing instance.) - * If a compute instance with the same name doesn't exist, it creates a new compute instance with the `compute_name`. For this process, you can specify `size` and `idle_time_before_shutdown_minutes`. - -**Example:** - -```py -from zenml import step, pipeline -from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings - -azureml_settings = AzureMLOrchestratorSettings( - mode="compute-instance", - compute_name="my-gpu-instance", # Will fetch or create this instance - size="Standard_NC6s_v3", # Using a NVIDIA Tesla V100 GPU - idle_time_before_shutdown_minutes=20, -) - -@step -def example_step() -> int: - return 3 - - -@pipeline(settings={"orchestrator": azureml_settings}) -def pipeline(): - example_step() - -pipeline() -``` - -### 3\. Compute Cluster - -* Set `mode` to `compute-cluster`. -* Requires a `compute_name`. - * If a compute cluster with the same name exists, it uses the existing cluster and ignores other parameters. (It will throw a warning if the provided configuration does not match the existing cluster.) - * If a compute cluster with the same name doesn't exist, it creates a new compute cluster. Additional parameters can be used for configuring this process.
-
-**Example:**
-
-```py
-from zenml import step, pipeline
-from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings
-
-azureml_settings = AzureMLOrchestratorSettings(
-    mode="compute-cluster",
-    compute_name="my-gpu-cluster",  # Will fetch or create this cluster
-    size="Standard_NC6s_v3",  # Using an NVIDIA Tesla V100 GPU
-    tier="Dedicated",  # Can be set to either "Dedicated" or "LowPriority"
-    min_instances=2,
-    max_instances=10,
-    idle_time_before_scaledown_minutes=60,
-)
-
-@step
-def example_step() -> int:
-    return 3
-
-
-@pipeline(settings={"orchestrator": azureml_settings})
-def my_pipeline():
-    example_step()
-
-my_pipeline()
-```
-
-
- In order to learn more about the supported sizes for compute instances and clusters, you can check [the AzureML documentation](https://learn.microsoft.com/en-us/azure/machine-learning/concept-compute-target?view=azureml-api-2#supported-vm-series-and-sizes).
-
-
-### Run pipelines on a schedule
-
-The AzureML orchestrator supports running pipelines on a schedule using its [JobSchedules](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-schedule-pipeline-job?view=azureml-api-2&tabs=python). Both cron expressions and intervals are supported.
-
-```py
-from zenml.config.schedule import Schedule
-
-# Run a pipeline every 5th minute
-pipeline.run(schedule=Schedule(cron_expression="*/5 * * * *"))
-```
-
-Once you run the pipeline with a schedule, you can find the schedule and the corresponding run under the `All schedules` tab of the `Jobs` page on AzureML.
-
-
- Note that ZenML only gets involved to schedule a run, but maintaining the lifecycle of the schedule is the responsibility of the user. That means that if you want to cancel a schedule that you created on AzureML, you will have to do it through the Azure UI.
-
diff --git a/develop/stack-components/orchestrators/custom.mdx b/develop/stack-components/orchestrators/custom.mdx
deleted file mode 100644
index cd0ed1c..0000000
--- a/develop/stack-components/orchestrators/custom.mdx
+++ /dev/null
@@ -1,215 +0,0 @@
----
-title: "Develop a custom orchestrator"
-description: "Learning how to develop a custom orchestrator."
-icon: house
----
-
-
- Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts.
-
-
-### Base Implementation
-
-ZenML aims to enable orchestration with any orchestration tool. This is where the `BaseOrchestrator` comes into play. It abstracts away many of the ZenML-specific details from the actual implementation and exposes a simplified interface:
-
-```py
-from abc import ABC, abstractmethod
-from typing import Any, Dict, Type
-
-from zenml.models import PipelineDeploymentResponseModel
-from zenml.enums import StackComponentType
-from zenml.stack import StackComponent, StackComponentConfig, Stack, Flavor
-
-
-class BaseOrchestratorConfig(StackComponentConfig):
-    """Base class for all ZenML orchestrator configurations."""
-
-
-class BaseOrchestrator(StackComponent, ABC):
-    """Base class for all ZenML orchestrators."""
-
-    @abstractmethod
-    def prepare_or_run_pipeline(
-        self,
-        deployment: PipelineDeploymentResponseModel,
-        stack: Stack,
-        environment: Dict[str, str],
-    ) -> Any:
-        """Prepares and runs the pipeline outright or returns an intermediate
-        pipeline representation that gets deployed.
- """ - - @abstractmethod - def get_orchestrator_run_id(self) -> str: - """Returns the run id of the active orchestrator run. - - Important: This needs to be a unique ID and return the same value for - all steps of a pipeline run. - - Returns: - The orchestrator run id. - """ - - -class BaseOrchestratorFlavor(Flavor): - """Base orchestrator for all ZenML orchestrator flavors.""" - - @property - @abstractmethod - def name(self): - """Returns the name of the flavor.""" - - @property - def type(self) -> StackComponentType: - """Returns the flavor type.""" - return StackComponentType.ORCHESTRATOR - - @property - def config_class(self) -> Type[BaseOrchestratorConfig]: - """Config class for the base orchestrator flavor.""" - return BaseOrchestratorConfig - - @property - @abstractmethod - def implementation_class(self) -> Type["BaseOrchestrator"]: - """Implementation class for this flavor.""" -``` - - - This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. In order to see the full implementation and get the complete docstrings, please check [the source code on GitHub](https://github.com/zenml-io/zenml/blob/main/src/zenml/orchestrators/base%5Forchestrator.py) . - - -### Build your own custom orchestrator - -If you want to create your own custom flavor for an orchestrator, you can follow the following steps: - -1. Create a class that inherits from the `BaseOrchestrator` class and implement the abstract `prepare_or_run_pipeline(...)` and `get_orchestrator_run_id()` methods. -2. If you need to provide any configuration, create a class that inherits from the `BaseOrchestratorConfig` class and add your configuration parameters. -3. Bring both the implementation and the configuration together by inheriting from the `BaseOrchestratorFlavor` class. Make sure that you give a `name` to the flavor through its abstract property. - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml orchestrator flavor register -``` - -For example, if your flavor class `MyOrchestratorFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml orchestrator flavor register flavors.my_flavor.MyOrchestratorFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. - - -Afterward, you should see the new flavor in the list of available flavors: - -```bash -zenml orchestrator flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. - -* The **CustomOrchestratorFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomOrchestratorConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. 
As `Config` objects are inherently `pydantic` objects, you can also add your own custom validators here (a short sketch of such a config is shown just before the code sample below).
-* The **CustomOrchestrator** only comes into play when the component is ultimately in use.
-
-The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomOrchestratorFlavor` and the `CustomOrchestratorConfig` are implemented in a different module/path than the actual `CustomOrchestrator`).
-
-
-## Implementation guide
-
-1. **Create your orchestrator class:** This class should either inherit from `BaseOrchestrator`, or more commonly from `ContainerizedOrchestrator`. If your orchestrator uses container images to run code, you should inherit from `ContainerizedOrchestrator`, which handles building all Docker images required for the pipeline to be executed. If your orchestrator does not use container images, you'll be responsible for ensuring that the execution environment contains all the necessary requirements and code files to run the pipeline.
-2. **Implement the** `**prepare_or_run_pipeline(...)**` **method:** This method is responsible for running or scheduling the pipeline. In most cases, this means converting the pipeline into a format that your orchestration tool understands and running it. To do so, you should:
-   * Loop over all steps of the pipeline and configure your orchestration tool to run the correct command and arguments in the correct Docker image
-   * Make sure the passed environment variables are set when the container is run
-   * Make sure the containers are running in the correct order
-Check out the [code sample](/develop/stack-components/orchestrators/custom#code-sample) below for more details on how to fetch the Docker image, command, arguments, and step order.
-3. **Implement the** `**get_orchestrator_run_id()**` **method:** This must return an ID that is different for each pipeline run, but identical if called from within Docker containers running different steps of the same pipeline run. If your orchestrator is based on an external tool like Kubeflow or Airflow, it is usually best to use a unique ID provided by this tool.
-
-
- To see a full end-to-end worked example of a custom orchestrator, [see here](https://github.com/zenml-io/zenml-plugins/tree/main/how%5Fto%5Fcustom%5Forchestrator).
-
-
-### Optional features
-
-There are some additional optional features that your orchestrator can implement:
-
-* **Running pipelines on a schedule**: if your orchestrator supports running pipelines on a schedule, make sure to handle `deployment.schedule` if it exists. If your orchestrator does not support schedules, you should either log a warning or raise an exception if the user tries to schedule a pipeline.
-* **Specifying hardware resources**: If your orchestrator supports setting resources like CPUs, GPUs, or memory for the pipeline or specific steps, make sure to handle the values defined in `step.config.resource_settings`. See the code sample below for additional helper methods to check whether any resources are required from your orchestrator.
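-
-Since flavor `Config` classes are `pydantic` models (as noted above), flavor-specific validation can live directly on the config class. The following is a minimal sketch of such a config; the class name, the fields, the `zenml.orchestrators` import path, and the pydantic v2 `field_validator` API are illustrative assumptions rather than part of the base interface shown earlier:
-
-```py
-from pydantic import field_validator
-
-from zenml.orchestrators import BaseOrchestratorConfig
-
-
-class MyOrchestratorConfig(BaseOrchestratorConfig):
-    """Config for a hypothetical `MyOrchestrator` flavor."""
-
-    # "Static" values supplied when the stack component is registered.
-    namespace: str = "default"
-    max_parallel_steps: int = 1
-
-    @field_validator("max_parallel_steps")
-    @classmethod
-    def _validate_parallelism(cls, value: int) -> int:
-        # Reject obviously invalid values at registration time instead of
-        # failing later, when the orchestrator is actually used.
-        if value < 1:
-            raise ValueError("max_parallel_steps must be at least 1.")
-        return value
-```
-
-Because this validation runs when the stack component is registered or updated, configuration mistakes surface before any pipeline is built or executed.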
-
-### Code sample
-
-```py
-from typing import Dict, cast
-
-from zenml.entrypoints import StepEntrypointConfiguration
-from zenml.models import PipelineDeploymentResponseModel
-from zenml.orchestrators import ContainerizedOrchestrator
-from zenml.stack import Stack
-
-
-class MyOrchestrator(ContainerizedOrchestrator):
-
-    def get_orchestrator_run_id(self) -> str:
-        # Return an ID that is different each time a pipeline is run, but the
-        # same for all steps being executed as part of the same pipeline run.
-        # If you're using some external orchestration tool like Kubeflow, you
-        # can usually use the run ID of that tool here.
-        ...
-
-    def prepare_or_run_pipeline(
-        self,
-        deployment: "PipelineDeploymentResponseModel",
-        stack: "Stack",
-        environment: Dict[str, str],
-    ) -> None:
-        # If your orchestrator supports scheduling, you should handle the schedule
-        # configured by the user. Otherwise you might raise an exception or log a warning
-        # that the orchestrator doesn't support scheduling
-        if deployment.schedule:
-            ...
-
-        for step_name, step in deployment.step_configurations.items():
-            image = self.get_image(deployment=deployment, step_name=step_name)
-            command = StepEntrypointConfiguration.get_entrypoint_command()
-            arguments = StepEntrypointConfiguration.get_entrypoint_arguments(
-                step_name=step_name, deployment_id=deployment.id
-            )
-            # Your orchestration tool should run this command and arguments
-            # in the Docker image fetched above. Additionally, the container which
-            # is running the command must contain the environment variables specified
-            # in the `environment` dictionary.
-
-            # If your orchestrator supports parallel execution of steps, make sure
-            # each step only runs after all its upstream steps finished
-            upstream_steps = step.spec.upstream_steps
-
-            # You can get the settings of your orchestrator like so.
-            # The settings are the "dynamic" part of your orchestrator's config,
-            # optionally defined when you register your orchestrator but can be
-            # overridden at runtime.
-            # In contrast, the "static" part of your orchestrator's config is
-            # always defined when you register the orchestrator and can be
-            # accessed via `self.config`.
-            step_settings = cast(
-                MyOrchestratorSettings, self.get_settings(step)
-            )
-
-            # If your orchestrator supports setting resources like CPUs, GPUs or
-            # memory for the pipeline or specific steps, you can find out whether
-            # specific resources were specified for this step:
-            if self.requires_resources_in_orchestration_environment(step):
-                resources = step.config.resource_settings
-```
-
-
- To see a full end-to-end worked example of a custom orchestrator, [see here](https://github.com/zenml-io/zenml-plugins/tree/main/how%5Fto%5Fcustom%5Forchestrator).
-
-### Enabling CUDA for GPU-backed hardware
-
-Note that if you wish to use your custom orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration.
diff --git a/develop/stack-components/orchestrators/databricks.mdx b/develop/stack-components/orchestrators/databricks.mdx
deleted file mode 100644
index f7ca64c..0000000
--- a/develop/stack-components/orchestrators/databricks.mdx
+++ /dev/null
@@ -1,193 +0,0 @@
----
-title: "Databricks Orchestrator"
-description: "Orchestrating your pipelines to run on Databricks."
-icon: database
----
-
-[Databricks](https://www.databricks.com/) is a unified data analytics platform that combines the best of data warehouses and data lakes to offer an integrated solution for big data processing and machine learning. It provides a collaborative environment for data scientists, data engineers, and business analysts to work together on data projects. Databricks offers optimized performance and scalability for big data workloads.
-
-The Databricks orchestrator is an orchestrator flavor provided by the ZenML `databricks` integration that allows you to run your pipelines on Databricks. This integration enables you to leverage Databricks' powerful distributed computing capabilities and optimized environment for your ML pipelines within the ZenML framework.
-
-
- The following features are currently in Alpha and may be subject to change. We recommend using them in a controlled environment and providing feedback to the ZenML team.
-
-
-### When to use it
-
-You should use the Databricks orchestrator if:
-
-* you're already using Databricks for your data and ML workloads.
-* you want to leverage Databricks' powerful distributed computing capabilities for your ML pipelines.
-* you're looking for a managed solution that integrates well with other Databricks services.
-* you want to take advantage of Databricks' optimization for big data processing and machine learning.
-
-### Prerequisites
-
-You will need to do the following to start using the Databricks orchestrator:
-
-* An active Databricks workspace. Depending on the cloud provider you are using, you can find more information on how to create one here:
-  * [AWS](https://docs.databricks.com/en/versions/0.66.0/getting-started/onboarding-account.html)
-  * [Azure](https://learn.microsoft.com/en-us/azure/databricks/versions/0.66.0/getting-started/#--create-an-azure-databricks-workspace)
-  * [GCP](https://docs.gcp.databricks.com/en/versions/0.66.0/getting-started/index.html)
-* An active Databricks account or service account with sufficient permissions to create and run jobs.
-
-## How it works
-
-
-
-
-
-The Databricks orchestrator in ZenML leverages the concept of Wheel Packages. When you run a pipeline with the Databricks orchestrator, ZenML creates a Python wheel package from your project. This wheel package contains all the necessary code and dependencies for your pipeline.
-
-Once the wheel package is created, ZenML uploads it to Databricks. ZenML then leverages the Databricks SDK to create a job definition. This job definition includes information about the pipeline steps and ensures that each step is executed only after its upstream steps have successfully completed.
-
-The Databricks job is also configured with the cluster settings it needs to run. This includes specifying the version of Spark to use, the number of workers, the node type, and other configuration options.
-
-When the Databricks job is executed, it retrieves the wheel package from Databricks and runs the pipeline using the specified cluster configuration. The job ensures that the steps are executed in the correct order based on their dependencies.
-
-Once the job is completed, ZenML retrieves the logs and status of the job and updates the pipeline run accordingly. This allows you to monitor the progress of your pipeline and view the logs of each step.
-
-### How to use it
-
-To use the Databricks orchestrator, you first need to register it and add it to your stack. 
Before registering the orchestrator, you need to install the Databricks integration by running the following command:
-
-```bash
-zenml integration install databricks
-```
-
-This command will install the necessary dependencies, including the `databricks-sdk` package, which is required for authentication with Databricks. Once the integration is installed, you can proceed with registering the orchestrator and configuring the necessary authentication details.
-
-Then, we can register the orchestrator and use it in our active stack:
-
-```bash
-zenml orchestrator register databricks_orchestrator --flavor=databricks --host="https://xxxxx.x.azuredatabricks.net" --client_id={{databricks.client_id}} --client_secret={{databricks.client_secret}}
-```
-
-We recommend creating a Databricks service account with the necessary permissions to create and run jobs. You can find more information on how to create a service account [here](https://docs.databricks.com/dev-tools/api/latest/authentication.html). You can generate a client\_id and client\_secret for the service account and use them to authenticate with Databricks.
-
-
-
-
-
-
-```bash
-# Add the orchestrator to your stack
-zenml stack register databricks_stack -o databricks_orchestrator ... --set
-```
-
-You can now run any ZenML pipeline using the Databricks orchestrator:
-
-```bash
-python run.py
-```
-
-### Databricks UI
-
-Databricks comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps.
-
-
-
-
-
-For any runs executed on Databricks, you can get the URL to the Databricks UI in Python using the following code snippet:
-
-```py
-from zenml.client import Client
-
-pipeline_run = Client().get_pipeline_run("")
-orchestrator_url = pipeline_run.run_metadata["orchestrator_url"].value
-```
-
-
-
-
-
-### Run pipelines on a schedule
-
-The Databricks orchestrator supports running pipelines on a schedule using its [native scheduling capability](https://docs.databricks.com/en/workflows/jobs/schedule-jobs.html).
-
-**How to schedule a pipeline**
-
-```py
-from zenml.config.schedule import Schedule
-
-# Run a pipeline every 5th minute
-pipeline_instance.run(
-    schedule=Schedule(
-        cron_expression="*/5 * * * *"
-    )
-)
-```
-
-
- The Databricks orchestrator only supports the `cron_expression` field of the `Schedule` object and will ignore all other parameters supplied to define the schedule.
-
-
-
- The Databricks orchestrator requires Java Timezone IDs to be used in the `cron_expression`. You can find a list of supported timezones [here](https://docs.oracle.com/middleware/1221/wcs/tag-ref/MISC/TimeZones.html). The timezone ID must be set in the settings of the orchestrator (see below for more information on how to set settings for the orchestrator).
-
-**How to delete a scheduled pipeline**
-
-Note that ZenML only gets involved to schedule a run, but maintaining the lifecycle of the schedule is the responsibility of the user.
-
-In order to cancel a scheduled Databricks pipeline, you need to manually delete the schedule in Databricks (via the UI or the CLI).
-
-### Additional configuration
-
-For additional configuration of the Databricks orchestrator, you can pass `DatabricksOrchestratorSettings`, which allows you to change the Spark version, number of workers, node type, autoscale settings, Spark configuration, Spark environment variables, and schedule timezone.
-
-```py
-from zenml.integrations.databricks.flavors.databricks_orchestrator_flavor import DatabricksOrchestratorSettings
-
-databricks_settings = DatabricksOrchestratorSettings(
-    spark_version="15.3.x-scala2.12",
-    num_workers=3,
-    node_type_id="Standard_D4s_v5",
-    policy_id=POLICY_ID,
-    autoscale=(2, 3),
-    spark_conf={},
-    spark_env_vars={},
-    schedule_timezone="America/Los_Angeles"  # or another timezone ID such as "PST"; see https://docs.oracle.com/middleware/1221/wcs/tag-ref/MISC/TimeZones.html
-)
-```
-
-These settings can then be specified on either pipeline-level or step-level:
-
-```py
-from zenml import pipeline
-
-# Either specify on pipeline-level
-@pipeline(
-    settings={
-        "orchestrator": databricks_settings,
-    }
-)
-def my_pipeline():
-    ...
-```
-
-We can also enable GPU support for the Databricks orchestrator by changing the `spark_version` and `node_type_id` to a GPU-enabled version and node type:
-
-
-```py
-from zenml.integrations.databricks.flavors.databricks_orchestrator_flavor import DatabricksOrchestratorSettings
-
-databricks_settings = DatabricksOrchestratorSettings(
-    spark_version="15.3.x-gpu-ml-scala2.12",
-    node_type_id="Standard_NC24ads_A100_v4",
-    policy_id=POLICY_ID,
-    autoscale=(1, 2),
-)
-```
-
-With these settings, the orchestrator will use a GPU-enabled Spark version and a GPU-enabled node type to run the pipeline on Databricks. The next section shows how to enable CUDA so that the GPU can deliver its full acceleration for your pipeline.
-
-#### Enabling CUDA for GPU-backed hardware
-
-Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration.
-
-Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-databricks/#zenml.integrations.databricks.flavors.databricks%5Forchestrator%5Fflavor.DatabricksOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings.
-
-For more information and a full list of configurable attributes of the Databricks orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-databricks/#zenml.integrations.databricks.orchestrators.databricks%5Forchestrator.DatabricksOrchestrator).
diff --git a/develop/stack-components/orchestrators/hyperai.mdx b/develop/stack-components/orchestrators/hyperai.mdx
deleted file mode 100644
index 58d0d49..0000000
--- a/develop/stack-components/orchestrators/hyperai.mdx
+++ /dev/null
@@ -1,82 +0,0 @@
----
-title: "HyperAI Orchestrator"
-description: "Orchestrating your pipelines to run on HyperAI.ai instances."
-icon: robot
----
-
-[HyperAI](https://www.hyperai.ai) is a cutting-edge cloud compute platform designed to make AI accessible for everyone. The HyperAI orchestrator is an [orchestrator](/develop/stack-components/orchestrators) flavor that allows you to easily deploy your pipelines on HyperAI instances.
-
-
- This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior!
- - -### When to use it - -You should use the HyperAI orchestrator if: - -* you're looking for a managed solution for running your pipelines. -* you're a HyperAI customer. - -### Prerequisites - -You will need to do the following to start using the HyperAI orchestrator: - -* Have a running HyperAI instance. It must be accessible from the internet (or at least from the IP addresses of your ZenML users) and allow SSH key based access (passwords are not supported). -* Ensure that a recent version of Docker is installed. This version must include Docker Compose, meaning that the command `docker compose` works. -* Ensure that the appropriate [NVIDIA Driver](https://www.nvidia.com/en-us/drivers/unix/) is installed on the HyperAI instance (if not already installed by the HyperAI team). -* Ensure that the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) is installed and configured on the HyperAI instance. - -Note that it is possible to omit installing the NVIDIA Driver and NVIDIA Container Toolkit. However, you will then be unable to use the GPU from within your ZenML pipeline. Additionally, you will then need to disable GPU access within the container when configuring the Orchestrator component, or the pipeline will not start correctly. - -## How it works - -The HyperAI orchestrator works with Docker Compose, which can be used to construct machine learning pipelines. Under the hood, it creates a Docker Compose file which it then deploys and executes on the configured HyperAI instance. For each ZenML pipeline step, it creates a service in this file. It uses the `service_completed_successfully` condition to ensure that pipeline steps will only run if their connected upstream steps have successfully finished. - -If configured for it, the HyperAI orchestrator will connect the HyperAI instance to the stack's container registry to ensure a smooth transfer of Docker images. - -### Scheduled pipelines - -[Scheduled pipelines](/develop/usage/pipelines/build-pipelines/schedule-a-pipeline) are supported by the HyperAI orchestrator. Currently, the HyperAI orchestrator supports the following inputs to `Schedule`: - -* Cron expressions via `cron_expression`. When pipeline runs are scheduled, they are added as a crontab entry on the HyperAI instance. Use this when you want pipelines to run in intervals. Using cron expressions assumes that `crontab` is available on your instance and that its daemon is running. -* Scheduled runs via `run_once_start_time`. When pipeline runs are scheduled this way, they are added as an `at` entry on the HyperAI instance. Use this when you want pipelines to run just once and at a specified time. This assumes that `at` is available on your instance. - -### How to deploy it - -To use the HyperAI orchestrator, you must configure a HyperAI Service Connector in ZenML and link it to the HyperAI orchestrator component. The service connector contains credentials with which ZenML connects to the HyperAI instance. - -Additionally, the HyperAI orchestrator must be used in a stack that contains a container registry and an image builder. - -### How to use it - -To use the HyperAI orchestrator, we must configure a HyperAI Service Connector first using one of its supported authentication methods. 
For example, for authentication with an RSA-based key, create the service connector as follows: - -```bash -zenml service-connector register --type=hyperai --auth-method=rsa-key --base64_ssh_key= --hostnames=,,.., --username= -``` - -Hostnames are either DNS resolvable names or IP addresses. - -For example, if you have two servers - one at `1.2.3.4` and another at `4.3.2.1`, you could provide them as `--hostnames=1.2.3.4,4.3.2.1`. - -Optionally, it is possible to provide a passphrase for the key (`--ssh_passphrase`). - -Following registering the service connector, we can register the orchestrator and use it in our active stack: - -```bash -zenml orchestrator register --flavor=hyperai - -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - -You can now run any ZenML pipeline using the HyperAI orchestrator: - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. - diff --git a/develop/stack-components/orchestrators/kubeflow.mdx b/develop/stack-components/orchestrators/kubeflow.mdx deleted file mode 100644 index 1066f97..0000000 --- a/develop/stack-components/orchestrators/kubeflow.mdx +++ /dev/null @@ -1,350 +0,0 @@ ---- -title: "Kubeflow Orchestrator" -description: "Orchestrating your pipelines to run on Kubeflow." -icon: cloud-showers-water ---- - -The Kubeflow orchestrator is an [orchestrator](/develop/stack-components/orchestrators) flavor provided by the ZenML `kubeflow` integration that uses [Kubeflow Pipelines](https://www.kubeflow.org/docs/components/pipelines/introduction/) to run your pipelines. - - - This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior! - - -### When to use it - -You should use the Kubeflow orchestrator if: - -* you're looking for a proven production-grade orchestrator. -* you're looking for a UI in which you can track your pipeline runs. -* you're already using Kubernetes or are not afraid of setting up and maintaining a Kubernetes cluster. -* you're willing to deploy and maintain Kubeflow Pipelines on your cluster. - -### How to deploy it - -To run ZenML pipelines on Kubeflow, you'll need to set up a Kubernetes cluster and deploy Kubeflow Pipelines on it. This can be done in a variety of ways, depending on whether you want to use a cloud provider or your own infrastructure: - - -* Have an existing AWS [EKS cluster](https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html) set up. -* Make sure you have the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) set up. 
-* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and [configure](https://aws.amazon.com/premiumsupport/knowledge-center/eks-cluster-connection/) it to talk to your EKS cluster using the following command: - -```bash -aws eks --region REGION update-kubeconfig --name CLUSTER_NAME -``` -* [Install](https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/#deploying-kubeflow-pipelines) Kubeflow Pipelines onto your cluster. -* ( optional) [set up an AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector) to grant ZenML Stack Components easy and secure access to the remote EKS cluster. - - - -* Have an existing GCP [GKE cluster](https://cloud.google.com/kubernetes-engine/docs/quickstart) set up. -* Make sure you have the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install-sdk) set up first. -* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and [configure](https://cloud.google.com/kubernetes-engine/docs/versions/0.66.0/how-to/cluster-access-for-kubectl) it to talk to your GKE cluster using the following command: - -```bash -gcloud container clusters get-credentials CLUSTER_NAME -``` -* [Install](https://www.kubeflow.org/docs/distributions/gke/deploy/overview/) Kubeflow Pipelines onto your cluster. -* ( optional) [set up a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) to grant ZenML Stack Components easy and secure access to the remote GKE cluster. - - -* Have an existing [AKS cluster](https://azure.microsoft.com/en-in/services/kubernetes-service/#documentation) set up. -* Make sure you have the [az CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) set up first. -* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and ensure that it talks to your AKS cluster using the following command: - -```bash -az aks get-credentials --resource-group RESOURCE_GROUP --name CLUSTER_NAME -``` -* [Install](https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/#deploying-kubeflow-pipelines) Kubeflow Pipelines onto your cluster. - -Since Kubernetes v1.19, AKS has shifted to [containerd](https://docs.microsoft.com/en-us/azure/aks/cluster-configuration#container-settings). However, the workflow controller installed with the Kubeflow installation has `Docker` set as the default runtime. In order to make your pipelines work, you have to change the value to one of the options listed [here](https://argoproj.github.io/argo-workflows/workflow-executors/#workflow-executors), preferably `k8sapi`. - -This change has to be made by editing the `containerRuntimeExecutor` property of the `ConfigMap` corresponding to the workflow controller. Run the following commands to first know what config map to change and then to edit it to reflect your new value: - -```bash -kubectl get configmap -n kubeflow -kubectl edit configmap CONFIGMAP_NAME -n kubeflow -# This opens up an editor that can be used to make the change. -``` - - -* Have an existing Kubernetes cluster set up. -* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and configure it to talk to your Kubernetes cluster. -* [Install](https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/#deploying-kubeflow-pipelines) Kubeflow Pipelines onto your cluster. 
-* ( optional) [set up a Kubernetes Service Connector](/develop/usage/resource-data-management/auth-management/kubernetes-service-connector) to grant ZenML Stack Components easy and secure access to the remote Kubernetes cluster. This is especially useful if your Kubernetes cluster is remotely accessible, as this enables other ZenML users to use it to run pipelines without needing to configure and set up `kubectl` on their local machines. - - - - - - If one or more of the deployments are not in the `Running` state, try increasing the number of nodes in your cluster. - - - - If you're installing Kubeflow Pipelines manually, make sure the Kubernetes service is called exactly `ml-pipeline`. This is a requirement for ZenML to connect to your Kubeflow Pipelines deployment. - -#### Infrastructure Deployment - -A Kubeflow orchestrator can be deployed directly from the ZenML CLI: - -```bash -zenml orchestrator deploy kubeflow_orchestrator --flavor=kubeflow --provider= ... -``` - -You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to work use the CLI for this, please refer to the dedicated documentation section. - -### How to use it - -To use the Kubeflow orchestrator, we need: - -* A Kubernetes cluster with Kubeflow pipelines installed. See the [deployment section](/develop/stack-components/orchestrators/kubeflow#how-to-deploy-it) for more information. -* A ZenML server deployed remotely where it can be accessed from the Kubernetes cluster. See the [deployment guide](/develop/getting-started/deploying-zenml) for more information. -* The ZenML `kubeflow` integration installed. If you haven't done so, run - -```bash -zenml integration install kubeflow -``` -* [Docker](https://www.docker.com) installed and running (unless you are using a remote [Image Builder](/develop/stack-components/image-builders) in your ZenML stack). -* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed (optional, see below) - - - If you are using a single-tenant Kubeflow installed in a Kubernetes cluster managed by a cloud provider like AWS, GCP or Azure, it is recommended that you set up [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) and use it to connect ZenML Stack Components to the remote Kubernetes cluster. This guarantees that your Stack is fully portable on other environments and your pipelines are fully reproducible. - - -* The name of your Kubernetes context which points to your remote cluster. Run `kubectl config get-contexts` to see a list of available contexts. **NOTE**: this is no longer required if you are using [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) to connect your Kubeflow Orchestrator Stack Component to the remote Kubernetes cluster. -* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. -* A [remote container registry](/develop/stack-components/container-registries) as part of your stack. - -We can then register the orchestrator and use it in our active stack. This can be done in two ways: - -1. If you have [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) configured to access the remote Kubernetes cluster, you no longer need to set the `kubernetes_context` attribute to a local `kubectl` context. 
In fact, you don't need the local Kubernetes CLI at all. You can [connect the stack component to the Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide#connect-stack-components-to-resources) instead: - -```bash -# List all available Kubernetes clusters that can be accessed by service connectors -zenml service-connector list-resources --resource-type kubernetes-cluster -e -# Register the Kubeflow orchestrator and connect it to the remote Kubernetes cluster -zenml orchestrator register --flavor kubeflow --connector --resource-id -# Register a new stack with the orchestrator -zenml stack register -o -a -c ... # Add other stack components as needed -``` -The following example demonstrates how to register the orchestrator and connect it to a remote Kubernetes cluster using a Service Connector: - -```bash -$ zenml service-connector list-resources --resource-type kubernetes-cluster -e -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ e33c9fac-5daa-48b2-87bb-0187d3782cde │ aws-iam-multi-eu │ 🔶 aws │ 🌀 kubernetes-cluster │ kubeflowmultitenant ┃ -┃ │ │ │ │ zenbox ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ 1c54b32a-4889-4417-abbd-42d3ace3d03a │ gcp-sa-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛ -$ zenml orchestrator register aws-kubeflow --flavor kubeflow --connector aws-iam-multi-eu --resource-id zenhacks-cluster -Successfully registered orchestrator `aws-kubeflow`. -Successfully connected orchestrator `aws-kubeflow` to the following resources: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼──────────────────┼────────────────┼───────────────────────┼──────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -# Create a new stack with the orchestrator -$ zenml stack register --set aws-kubeflow -o aws-kubeflow -a aws-s3 -c aws-ecr -Stack 'aws-kubeflow' successfully registered! - Stack Configuration -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┓ -┃ COMPONENT_TYPE │ COMPONENT_NAME ┃ -┠────────────────────┼─────────────────┨ -┃ ARTIFACT_STORE │ aws-s3 ┃ -┠────────────────────┼─────────────────┨ -┃ ORCHESTRATOR │ aws-kubeflow ┃ -┠────────────────────┼─────────────────┨ -┃ CONTAINER_REGISTRY │ aws-ecr ┃ -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┛ - 'aws-kubeflow' stack -No labels are set for this stack. 
-Stack 'aws-kubeflow' with id 'dab28f94-36ab-467a-863e-8718bbc1f060' is owned by user user. -Active global stack set to:'aws-kubeflow' -``` -2. if you don't have a Service Connector on hand and you don't want to [register one](/develop/usage/resource-data-management/auth-management/service-connectors-guide#register-service-connectors), the local Kubernetes `kubectl` client needs to be configured with a configuration context pointing to the remote cluster. The `kubernetes_context` must also be configured with the value of that context: - -```bash -zenml orchestrator register \ - --flavor=kubeflow \ - --kubernetes_context= -# Register a new stack with the orchestrator -zenml stack register -o -a -c ... # Add other stack components as needed -``` - - - ZenML will build a Docker image called `/zenml:` which includes all required software dependencies and use it to run your pipeline steps in Kubeflow. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - -You can now run any ZenML pipeline using the Kubeflow orchestrator: - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -#### Kubeflow UI - -Kubeflow comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps. For any runs executed on Kubeflow, you can get the URL to the Kubeflow UI in Python using the following code snippet: - -```py -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("") -orchestrator_url = pipeline_run.run_metadata["orchestrator_url"].value -``` - -#### Additional configuration - -For additional configuration of the Kubeflow orchestrator, you can pass `KubeflowOrchestratorSettings` which allows you to configure (among others) the following attributes: - -* `client_args`: Arguments to pass when initializing the KFP client. -* `user_namespace`: The user namespace to use when creating experiments and runs. -* `pod_settings`: Node selectors, affinity, and tolerations to apply to the Kubernetes Pods running your pipeline. These can be either specified using the Kubernetes model objects or as dictionaries. - -```py -from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import KubeflowOrchestratorSettings -from kubernetes.client.models import V1Toleration - -kubeflow_settings = KubeflowOrchestratorSettings( - client_args={}, - user_namespace="my_namespace", - pod_settings={ - "affinity": { - "nodeAffinity": { - "requiredDuringSchedulingIgnoredDuringExecution": { - "nodeSelectorTerms": [ - { - "matchExpressions": [ - { - "key": "node.kubernetes.io/name", - "operator": "In", - "values": ["my_powerful_node_group"], - } - ] - } - ] - } - } - }, - "tolerations": [ - V1Toleration( - key="node.kubernetes.io/name", - operator="Equal", - value="", - effect="NoSchedule" - ) - ] - } -) - - -@pipeline( - settings={ - "orchestrator": kubeflow_settings - } -) - - -... -``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kubeflow/#zenml.integrations.kubeflow.flavors.kubeflow%5Forchestrator%5Fflavor.KubeflowOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. 
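-
-Settings like these do not have to apply to the whole pipeline; ZenML also accepts settings scoped to a single step. The following is a short sketch that reuses the `KubeflowOrchestratorSettings` class from above; the step name and the node pool label are illustrative assumptions:
-
-```py
-from zenml import step
-from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import (
-    KubeflowOrchestratorSettings,
-)
-
-# Only this step is pinned to a dedicated node pool; all other steps keep
-# the pipeline- or component-level defaults.
-train_settings = KubeflowOrchestratorSettings(
-    pod_settings={
-        "node_selectors": {"cloud.google.com/gke-nodepool": "training-pool"},
-    },
-)
-
-
-@step(settings={"orchestrator": train_settings})
-def train_model() -> None:
-    ...
-```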
-
-#### Enabling CUDA for GPU-backed hardware
-
-Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration.
-
-### Important Note for Multi-Tenancy Deployments
-
-Kubeflow has a notion of [multi-tenancy](https://www.kubeflow.org/docs/components/multi-tenancy/overview/) built into its deployment. Kubeflow's multi-user isolation simplifies user operations because each user can only view and edit the Kubeflow components and model artifacts defined in their configuration.
-
-Using the ZenML Kubeflow orchestrator on a multi-tenant deployment without any settings will result in the following error:
-
-```yaml
-HTTP response body: {"error":"Invalid input error: Invalid resource references for experiment. ListExperiment requires filtering by namespace.","code":3,"message":"Invalid input error: Invalid resource references for experiment. ListExperiment requires filtering by
-namespace.","details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. ListExperiment requires filtering by namespace.","error_details":"Invalid input error: Invalid resource references for experiment. ListExperiment requires filtering by namespace."}]}
-```
-
-In order to get it to work, we need to leverage the `KubeflowOrchestratorSettings` referenced above. By setting the namespace option, and by passing in the right authentication credentials to the Kubeflow Pipelines Client, we can make it work.
-
-First, when registering your Kubeflow orchestrator, please make sure to include the `kubeflow_hostname` parameter. The `kubeflow_hostname` **must end with the** `**/pipeline**` **post-fix**.
-
-
-
-```shell
-zenml orchestrator register \
-    --flavor=kubeflow \
-    --kubeflow_hostname= # e.g. https://mykubeflow.example.com/pipeline
-```
-
-Then, ensure that you pass the right settings before triggering a pipeline run. The following snippet will prove useful:
-
-```py
-import requests
-
-from zenml import pipeline
-from zenml.client import Client
-from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import (
-    KubeflowOrchestratorSettings,
-)
-
-NAMESPACE = "namespace_name"  # This is the user namespace for the profile you want to use
-USERNAME = "admin"  # This is the username for the profile you want to use
-PASSWORD = "abc123"  # This is the password for the profile you want to use
-
-# Use client_username and client_password and ZenML will automatically fetch a session cookie
-kubeflow_settings = KubeflowOrchestratorSettings(
-    client_username=USERNAME,
-    client_password=PASSWORD,
-    user_namespace=NAMESPACE
-)
-
-
-# You can also pass the cookie in `client_args` directly
-# kubeflow_settings = KubeflowOrchestratorSettings(
-#     client_args={"cookies": session_cookie}, user_namespace=NAMESPACE
-# )
-
-@pipeline(
-    settings={
-        "orchestrator": kubeflow_settings
-    }
-)
-def my_pipeline():
-    ...
-
-if __name__ == "__main__":
-    # Run the pipeline
-    my_pipeline()
-```
-
-Note that the above is also currently not tested on all Kubeflow versions, so there might be further bugs with older Kubeflow versions. In this case, please reach out to us on [Slack](https://zenml.io/slack).
-
-#### Using secrets in settings
-
-The above example encoded the username and password in plain text as settings. You can also set them as secrets.
- -```bash -zenml secret create kubeflow_secret \ - --username=admin \ - --password=abc123 -``` - -And then you can use them in code: - -```bash -# Use client_username and client_password and ZenML will automatically fetch a session cookie - -kubeflow_settings = KubeflowOrchestratorSettings( - client_username="{{kubeflow_secret.username}}", # secret reference - client_password="{{kubeflow_secret.password}}", # secret reference - user_namespace="namespace_name" -) -``` - -See full documentation of using ZenML secrets [here](/develop/usage/project-setup/use-secrets/interact-with-secrets). - -For more information and a full list of configurable attributes of the Kubeflow orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kubeflow/#zenml.integrations.kubeflow.orchestrators.kubeflow%5Forchestrator.KubeflowOrchestrator) . diff --git a/develop/stack-components/orchestrators/kubernetes.mdx b/develop/stack-components/orchestrators/kubernetes.mdx deleted file mode 100644 index 74d60ba..0000000 --- a/develop/stack-components/orchestrators/kubernetes.mdx +++ /dev/null @@ -1,273 +0,0 @@ ---- -title: "Kubernetes Orchestrator" -description: "Orchestrating your pipelines to run on Kubernetes clusters." -icon: battle-net ---- - -Using the ZenML `kubernetes` integration, you can orchestrate and scale your ML pipelines on a [Kubernetes](https://kubernetes.io/) cluster without writing a single line of Kubernetes code. - -This Kubernetes-native orchestrator is a minimalist, lightweight alternative to other distributed orchestrators like Airflow or Kubeflow. - -Overall, the Kubernetes orchestrator is quite similar to the Kubeflow orchestrator in that it runs each pipeline step in a separate Kubernetes pod. However, the orchestration of the different pods is not done by Kubeflow but by a separate master pod that orchestrates the step execution via topological sort. - -Compared to Kubeflow, this means that the Kubernetes-native orchestrator is faster and much simpler to start with since you do not need to install and maintain Kubeflow on your cluster. The Kubernetes-native orchestrator is an ideal choice for teams new to distributed orchestration that do not want to go with a fully-managed offering. - -However, since Kubeflow is much more mature, you should, in most cases, aim to move your pipelines to Kubeflow in the long run. A smooth way to production-grade orchestration could be to set up a Kubernetes cluster first and get started with the Kubernetes-native orchestrator. If needed, you can then install and set up Kubeflow later and simply switch out the orchestrator of your stack as soon as your full setup is ready. - - - This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior! - - -### When to use it - -You should use the Kubernetes orchestrator if: - -* you're looking for a lightweight way of running your pipelines on Kubernetes. -* you're not willing to maintain [Kubeflow Pipelines](/develop/stack-components/orchestrators/kubeflow) on your Kubernetes cluster. -* you're not interested in paying for managed solutions like [Vertex](/develop/stack-components/orchestrators/vertex). - -### How to deploy it - -The Kubernetes orchestrator requires a Kubernetes cluster in order to run. 
There are many ways to deploy a Kubernetes cluster using different cloud providers or on your custom infrastructure, and we can't possibly cover all of them, but you can check out our cloud guide.
-
-If the above Kubernetes cluster is deployed remotely on the cloud, then another prerequisite for using this orchestrator is to deploy and connect to a [remote ZenML server](/develop/getting-started/deploying-zenml).
-
-#### Infrastructure Deployment
-
-A Kubernetes orchestrator can be deployed directly from the ZenML CLI:
-
-```bash
-zenml orchestrator deploy k8s_orchestrator --flavor=kubernetes --provider= ...
-```
-
-You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to use the CLI for this, please refer to the dedicated documentation section.
-
-### How to use it
-
-To use the Kubernetes orchestrator, we need:
-
-* The ZenML `kubernetes` integration installed. If you haven't done so, run:
-
-```bash
-zenml integration install kubernetes
-```
-* [Docker](https://www.docker.com) installed and running.
-* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed.
-* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack.
-* A [remote container registry](/develop/stack-components/container-registries) as part of your stack.
-* A Kubernetes cluster [deployed](/develop/stack-components/orchestrators/kubernetes#how-to-deploy-it).
-* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed and the name of the Kubernetes configuration context which points to the target cluster (i.e. run `kubectl config get-contexts` to see a list of available contexts). This is optional (see below).
-
-
- It is recommended that you set up [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) and use it to connect ZenML Stack Components to the remote Kubernetes cluster, especially if you are using a Kubernetes cluster managed by a cloud provider like AWS, GCP, or Azure. This guarantees that your Stack is fully portable on other environments and your pipelines are fully reproducible.
-
-
-We can then register the orchestrator and use it in our active stack. This can be done in two ways:
-
-1. If you have [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) configured to access the remote Kubernetes cluster, you no longer need to set the `kubernetes_context` attribute to a local `kubectl` context. In fact, you don't need the local Kubernetes CLI at all. You can [connect the stack component to the Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide#connect-stack-components-to-resources) instead:
-
-```bash
-$ zenml orchestrator register --flavor kubernetes
-Running with active stack: 'default' (repository)
-Successfully registered orchestrator ``.
-$ zenml service-connector list-resources --resource-type kubernetes-cluster -e -The following 'kubernetes-cluster' resources can be accessed by service connectors: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ e33c9fac-5daa-48b2-87bb-0187d3782cde │ aws-iam-multi-eu │ 🔶 aws │ 🌀 kubernetes-cluster │ kubeflowmultitenant ┃ -┃ │ │ │ │ zenbox ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ 1c54b32a-4889-4417-abbd-42d3ace3d03a │ gcp-sa-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛ -$ zenml orchestrator connect --connector aws-iam-multi-us -Running with active stack: 'default' (repository) -Successfully connected orchestrator `` to the following resources: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼──────────────────┼────────────────┼───────────────────────┼──────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -# Register and activate a stack with the new orchestrator -$ zenml stack register -o ... --set -``` -2. if you don't have a Service Connector on hand and you don't want to [register one](/develop/usage/resource-data-management/auth-management/service-connectors-guide#register-service-connectors) , the local Kubernetes `kubectl` client needs to be configured with a configuration context pointing to the remote cluster. The `kubernetes_context` stack component must also be configured with the value of that context: - -```bash -zenml orchestrator register \ - --flavor=kubernetes \ - --kubernetes_context= -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your pipeline steps in Kubernetes. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - - -You can now run any ZenML pipeline using the Kubernetes orchestrator: - - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -If all went well, you should now see the logs of all Kubernetes pods in your terminal, and when running `kubectl get pods -n zenml`, you should also see that a pod was created in your cluster for each pipeline step. - -#### Interacting with pods via kubectl - -For debugging, it can sometimes be handy to interact with the Kubernetes pods directly via kubectl. 
To make this easier, we have added the following labels to all pods:
-
-* `run`: the name of the ZenML run.
-* `pipeline`: the name of the ZenML pipeline associated with this run.
-
-E.g., you can use these labels to manually delete all pods related to a specific pipeline:
-
-```bash
-kubectl delete pod -n zenml -l pipeline=kubernetes_example_pipeline
-```
-
-#### Additional configuration
-
-The Kubernetes orchestrator will by default use a Kubernetes namespace called `zenml` to run pipelines. In that namespace, it will automatically create a Kubernetes service account called `zenml-service-account` and grant it the `edit` RBAC role in that namespace. To customize these settings, you can configure the following additional attributes in the Kubernetes orchestrator:
-
-* `kubernetes_namespace`: The Kubernetes namespace to use for running the pipelines. The namespace must already exist in the Kubernetes cluster.
-* `service_account_name`: The name of a Kubernetes service account to use for running the pipelines. If configured, it must point to an existing service account in the default or configured `namespace` that has associated RBAC roles granting permissions to create and manage pods in that namespace. This can also be configured as an individual pipeline setting in addition to the global orchestrator setting. (See the `kubectl` sketch below for one way to prepare such a namespace and service account.)
-
-For additional configuration of the Kubernetes orchestrator, you can pass `KubernetesOrchestratorSettings` which allows you to configure (among others) the following attributes:
-
-* `pod_settings`: Node selectors, labels, affinity, tolerations, and image pull secrets to apply to the Kubernetes Pods running the steps of your pipeline. These can be specified either using the Kubernetes model objects or as dictionaries.
-* `orchestrator_pod_settings`: Node selectors, labels, affinity, tolerations, and image pull secrets to apply to the Kubernetes Pod that is responsible for orchestrating the pipeline and starting the other Pods. These can be specified either using the Kubernetes model objects or as dictionaries.
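-
-For the `kubernetes_namespace` and `service_account_name` prerequisites described above, the namespace and service account can be prepared with plain `kubectl`. This is only a rough sketch: the names `ml-pipelines` and `zenml-pipeline-runner` are illustrative, and your cluster policies may call for a more restrictive role than the built-in `edit` role.
-
-```bash
-# Create the namespace the orchestrator should run pipelines in
-kubectl create namespace ml-pipelines
-
-# Create the service account the pipeline pods will run as
-kubectl create serviceaccount zenml-pipeline-runner -n ml-pipelines
-
-# Grant the service account the built-in "edit" role in that namespace,
-# which covers creating and managing pods
-kubectl create rolebinding zenml-pipeline-runner-edit \
-  --clusterrole=edit \
-  --serviceaccount=ml-pipelines:zenml-pipeline-runner \
-  -n ml-pipelines
-```
-
-The following example then shows how `pod_settings` and `orchestrator_pod_settings` can be configured: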
- -```py -from zenml.integrations.kubernetes.flavors.kubernetes_orchestrator_flavor import KubernetesOrchestratorSettings -from kubernetes.client.models import V1Toleration - -kubernetes_settings = KubernetesOrchestratorSettings( - pod_settings={ - "node_selectors": { - "cloud.google.com/gke-nodepool": "ml-pool", - "kubernetes.io/arch": "amd64" - }, - "affinity": { - "nodeAffinity": { - "requiredDuringSchedulingIgnoredDuringExecution": { - "nodeSelectorTerms": [ - { - "matchExpressions": [ - { - "key": "gpu-type", - "operator": "In", - "values": ["nvidia-tesla-v100", "nvidia-tesla-p100"] - } - ] - } - ] - } - } - }, - "tolerations": [ - V1Toleration( - key="gpu", - operator="Equal", - value="present", - effect="NoSchedule" - ), - V1Toleration( - key="high-priority", - operator="Exists", - effect="PreferNoSchedule" - ) - ], - "resources": { - "requests": { - "cpu": "2", - "memory": "4Gi", - "nvidia.com/gpu": "1" - }, - "limits": { - "cpu": "4", - "memory": "8Gi", - "nvidia.com/gpu": "1" - } - }, - "annotations": { - "prometheus.io/scrape": "true", - "prometheus.io/port": "8080" - }, - "volumes": [ - { - "name": "data-volume", - "persistentVolumeClaim": { - "claimName": "ml-data-pvc" - } - }, - { - "name": "config-volume", - "configMap": { - "name": "ml-config" - } - } - ], - "volume_mounts": [ - { - "name": "data-volume", - "mountPath": "/mnt/data" - }, - { - "name": "config-volume", - "mountPath": "/etc/ml-config", - "readOnly": True - } - ], - "host_ipc": True, - "image_pull_secrets": ["regcred", "gcr-secret"], - "labels": { - "app": "ml-pipeline", - "environment": "production", - "team": "data-science" - } - }, - orchestrator_pod_settings={ - "node_selectors": { - "cloud.google.com/gke-nodepool": "orchestrator-pool" - }, - "resources": { - "requests": { - "cpu": "1", - "memory": "2Gi" - }, - "limits": { - "cpu": "2", - "memory": "4Gi" - } - }, - "labels": { - "app": "zenml-orchestrator", - "component": "pipeline-runner" - } - }, - kubernetes_namespace="ml-pipelines", - service_account_name="zenml-pipeline-runner" -) - -@pipeline( - settings={ - "orchestrator": kubernetes_settings - } -) -def my_kubernetes_pipeline(): - # Your pipeline steps here - ... -``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kubernetes/#zenml.integrations.kubernetes.flavors.kubernetes%5Forchestrator%5Fflavor.KubernetesOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. - -For more information and a full list of configurable attributes of the Kubernetes orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kubernetes/#zenml.integrations.kubernetes.orchestrators.kubernetes%5Forchestrator.KubernetesOrchestrator) . - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. 
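-
-As a minimal, hedged sketch of what that customization can look like with the Kubernetes orchestrator (assuming your cluster has GPU nodes with the NVIDIA device plugin installed; the parent image, step name, and resource values below are illustrative and not prescribed by ZenML), you would typically combine a CUDA-enabled parent image with per-step resource requests:
-
-```py
-from zenml import pipeline, step
-from zenml.config import DockerSettings, ResourceSettings
-
-# Assumption: a CUDA-enabled base image that matches your framework/CUDA version
-docker_settings = DockerSettings(
-    parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime",
-    requirements=["zenml", "torch"],
-)
-
-
-@step(settings={"resources": ResourceSettings(gpu_count=1, memory="8GB")})
-def train_on_gpu() -> None:
-    # Training code that expects a CUDA device goes here
-    ...
-
-
-@pipeline(settings={"docker": docker_settings})
-def gpu_pipeline():
-    train_on_gpu()
-```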
-
diff --git a/develop/stack-components/orchestrators/lightning.mdx b/develop/stack-components/orchestrators/lightning.mdx
deleted file mode 100644
index 6a6119f..0000000
--- a/develop/stack-components/orchestrators/lightning.mdx
+++ /dev/null
@@ -1,163 +0,0 @@
----
-title: "Lightning AI Orchestrator"
-description: "Orchestrating your pipelines to run on Lightning AI."
-icon: poo-storm
----
-
-[Lightning AI Studio](https://lightning.ai/) is a platform that simplifies the development and deployment of AI applications. The Lightning AI orchestrator is an integration provided by ZenML that allows you to run your pipelines on Lightning AI's infrastructure, leveraging its scalable compute resources and managed environment.
-
-
- This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior!
-
-
-## When to use it
-
-You should use the Lightning AI orchestrator if:
-
-* you are looking for a fast and easy way to run your pipelines on GPU instances.
-* you're already using Lightning AI for your machine learning projects.
-* you want to leverage Lightning AI's managed infrastructure for running your pipelines.
-* you're looking for a solution that simplifies the deployment and scaling of your ML workflows.
-* you want to take advantage of Lightning AI's optimizations for machine learning workloads.
-
-## How to deploy it
-
-To use the [Lightning AI Studio](https://lightning.ai/) orchestrator, you need to have a Lightning AI account and the necessary credentials. You don't need to deploy any additional infrastructure, as the orchestrator will use Lightning AI's managed resources.
-
-## How it works
-
-The Lightning AI orchestrator is a ZenML orchestrator that runs your pipelines on Lightning AI's infrastructure. When you run a pipeline with the Lightning AI orchestrator, ZenML will archive your current ZenML repository and upload it to the Lightning AI studio. Once the code is archived, ZenML will use `lightning-sdk` to create a new studio in Lightning AI and upload the code to it. Then ZenML runs a list of commands via `studio.run()` to prepare for the pipeline run (e.g. installing dependencies, setting up the environment). Finally, ZenML will run the pipeline on Lightning AI's infrastructure.
-
-* You can always use an already existing studio by specifying the `main_studio_name` in the `LightningOrchestratorSettings`.
-* The orchestrator supports an async mode, which means that the pipeline will be run in the background and you can check the status of the run in the ZenML Dashboard or the Lightning AI Studio.
-* You can specify a list of custom commands that will be executed before running the pipeline. This can be useful for installing dependencies or setting up the environment.
-* The orchestrator supports both CPU and GPU machine types. You can specify the machine type in the `LightningOrchestratorSettings`.
-
-## How to use it
-
-To use the Lightning AI orchestrator, you need:
-
-* The ZenML `lightning` integration installed. If you haven't done so, run
-
-```bash
-pip install lightning-sdk
-```
-* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack.
-* [Lightning AI credentials](/develop/stack-components/orchestrators/lightning#lightning-ai-credentials) - -### Lightning AI credentials - -* `LIGHTNING_USER_ID`: Your Lightning AI user ID -* `LIGHTNING_API_KEY`: Your Lightning AI API key -* `LIGHTNING_USERNAME`: Your Lightning AI username (optional) -* `LIGHTNING_TEAMSPACE`: Your Lightning AI teamspace (optional) -* `LIGHTNING_ORG`: Your Lightning AI organization (optional) - -Alternatively, you can configure these credentials when registering the orchestrator. - -We can then register the orchestrator and use it in our active stack: - - -```bash -zenml orchestrator register lightning_orchestrator \ - --flavor=lightning \ - --user_id= \ - --api_key= \ - --username= \ - --teamspace= \ - --organization= - -# Register and activate a stack with the new orchestrator -zenml stack register lightning_stack -o lightning_orchestrator ... --set -``` - -You can also configure the orchestrator at pipeline level, using the `orchestrator` parameter. - -```py -from zenml.integrations.lightning.flavors.lightning_orchestrator_flavor import LightningOrchestratorSettings - - -lightning_settings = LightningOrchestratorSettings( - main_studio_name="my_studio", - machine_type="cpu", - async_mode=True, - custom_commands=["pip install -r requirements.txt", "do something else"] -) - -@pipeline( - settings={ - "orchestrator.lightning": lightning_settings - } -) -def my_pipeline(): - ... -``` - - - ZenML will archive the current zenml repository (The code within the path where you run `zenml init`) and upload it to the Lightning AI studio. For this reason you need make sure that you are running `zenml init` in the same directory where you are running your pipeline. - - - - - - - - The `custom_commands` attribute allows you to specify a list of shell commands that will be executed before running the pipeline. This can be useful for installing dependencies or setting up the environment, The commands will be executed in the root directory of the uploaded and extracted ZenML repository. - -You can now run any ZenML pipeline using the Lightning AI orchestrator: - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -### Lightning AI UI - -Lightning AI provides its own UI where you can monitor and manage your running applications, including the pipelines orchestrated by ZenML. - - - - - -For any runs executed on Lightning AI, you can get the URL to the Lightning AI UI in Python using the following code snippet: - -```py -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("") -orchestrator_url = pipeline_run.run_metadata["orchestrator_url"].value -``` - -### Additional configuration - -For additional configuration of the Lightning AI orchestrator, you can pass `LightningOrchestratorSettings` which allows you to configure various aspects of the Lightning AI execution environment: - -```py -# Either specify on pipeline-level -@pipeline( - settings={ - "orchestrator.lightning": lightning_settings - } -) -def my_pipeline(): - ... - -# OR specify settings on step-level -@step( - settings={ - "orchestrator.lightning": lightning_settings - } -) -def my_step(): - ... 
-``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-lightning/#zenml.integrations.lightning.flavors.lightning%5Forchestrator%5Fflavor.LightningOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. - -To use GPUs with the Lightning AI orchestrator, you need to specify a GPU-enabled machine type in your settings: - -```py -lightning_settings = LightningOrchestratorSettings( - machine_type="gpu", # or -) -``` - -Make sure to check Lightning AI's documentation for the available GPU-enabled machine types and their specifications. - diff --git a/develop/stack-components/orchestrators/local-docker.mdx b/develop/stack-components/orchestrators/local-docker.mdx deleted file mode 100644 index cf63915..0000000 --- a/develop/stack-components/orchestrators/local-docker.mdx +++ /dev/null @@ -1,73 +0,0 @@ ---- -title: "Local Docker Orchestrator" -description: "Orchestrating your pipelines to run in Docker." -icon: docker ---- - -The local Docker orchestrator is an [orchestrator](/develop/stack-components/orchestrators) flavor that comes built-in with ZenML and runs your pipelines locally using Docker. - -### When to use it - -You should use the local Docker orchestrator if: - -* you want the steps of your pipeline to run locally in isolated environments. -* you want to debug issues that happen when running your pipeline in Docker containers without waiting and paying for remote infrastructure. - -### How to deploy it - -To use the local Docker orchestrator, you only need to have [Docker](https://www.docker.com/) installed and running. - -### How to use it - -To use the local Docker orchestrator, we can register it and use it in our active stack: - - -```bash -zenml orchestrator register --flavor=local_docker - -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - -You can now run any ZenML pipeline using the local Docker orchestrator: - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -#### Additional configuration - -For additional configuration of the Local Docker orchestrator, you can pass `LocalDockerOrchestratorSettings` when defining or running your pipeline. Check out the [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-orchestrators/#zenml.orchestrators.local%5Fdocker.local%5Fdocker%5Forchestrator.LocalDockerOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. A full list of what can be passed in via the `run_args` can be found [in the Docker Python SDK documentation](https://docker-py.readthedocs.io/en/stable/containers.html). - -For more information and a full list of configurable attributes of the local Docker orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-orchestrators/#zenml.orchestrators.local%5Fdocker.local%5Fdocker%5Forchestrator.LocalDockerOrchestrator) . 
- -For example, if you wanted to specify the CPU count available for the Docker image (note: only configurable for Windows), you could write a simple pipeline like the following: - - -```py -from zenml import step, pipeline -from zenml.orchestrators.local_docker.local_docker_orchestrator import ( - LocalDockerOrchestratorSettings, -) - - -@step -def return_one() -> int: - return 1 - - -settings = { - "orchestrator": LocalDockerOrchestratorSettings( - run_args={"cpu_count": 3} - ) -} - - -@pipeline(settings=settings) -def simple_pipeline(): - return_one() -``` - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. diff --git a/develop/stack-components/orchestrators/local.mdx b/develop/stack-components/orchestrators/local.mdx deleted file mode 100644 index 84e8b50..0000000 --- a/develop/stack-components/orchestrators/local.mdx +++ /dev/null @@ -1,40 +0,0 @@ ---- -title: "Local Orchestrator" -description: "Orchestrating your pipelines to run locally." -icon: folder ---- - -The local orchestrator is an [orchestrator](/develop/stack-components/orchestrators) flavor that comes built-in with ZenML and runs your pipelines locally. - -### When to use it - -The local orchestrator is part of your default stack when you're first getting started with ZenML. Due to it running locally on your machine, it requires no additional setup and is easy to use and debug. - -You should use the local orchestrator if: - -* you're just getting started with ZenML and want to run pipelines without setting up any cloud infrastructure. -* you're writing a new pipeline and want to experiment and debug quickly - -### How to deploy it - -The local orchestrator comes with ZenML and works without any additional setup. - -### How to use it - -To use the local orchestrator, we can register it and use it in our active stack: - -```bash -zenml orchestrator register --flavor=local - -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - -You can now run any ZenML pipeline using the local orchestrator: - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -For more information and a full list of configurable attributes of the local orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-orchestrators/#zenml.orchestrators.local.local%5Forchestrator.LocalOrchestrator) . - diff --git a/develop/stack-components/orchestrators/orchestrators.mdx b/develop/stack-components/orchestrators/orchestrators.mdx deleted file mode 100644 index a6e44ad..0000000 --- a/develop/stack-components/orchestrators/orchestrators.mdx +++ /dev/null @@ -1,69 +0,0 @@ ---- -title: "Overview" -description: "Orchestrating the execution of ML pipelines." -icon: battery-full ---- - -The orchestrator is an essential component in any MLOps stack as it is responsible for running your machine learning pipelines. To do so, the orchestrator provides an environment that is set up to execute the steps of your pipeline. It also makes sure that the steps of your pipeline only get executed once all their inputs (which are outputs of previous steps of your pipeline) are available. 
-
-
- Many of ZenML's remote orchestrators build [Docker](https://www.docker.com/) images in order to transport and execute your pipeline code. If you want to learn more about how Docker images are built by ZenML, check out [this guide](/develop/usage/project-setup/customize-docker-builds).
-
-
-### When to use it
-
-The orchestrator is a mandatory component in the ZenML stack. It is used to run your machine learning pipelines, and you are required to configure it in all of your stacks.
-
-### Orchestrator Flavors
-
-Out of the box, ZenML comes with a `local` orchestrator already part of the default stack that runs pipelines locally. Additional orchestrators are provided by integrations:
-
-| Orchestrator | Flavor | Integration | Notes |
-| --- | --- | --- | --- |
-| [LocalOrchestrator](/develop/stack-components/orchestrators/local) | local | _built-in_ | Runs your pipelines locally. |
-| [LocalDockerOrchestrator](/develop/stack-components/orchestrators/local-docker) | local\_docker | _built-in_ | Runs your pipelines locally using Docker. |
-| [KubernetesOrchestrator](/develop/stack-components/orchestrators/kubernetes) | kubernetes | kubernetes | Runs your pipelines in Kubernetes clusters. |
-| [KubeflowOrchestrator](/develop/stack-components/orchestrators/kubeflow) | kubeflow | kubeflow | Runs your pipelines using Kubeflow. |
-| [VertexOrchestrator](/develop/stack-components/orchestrators/vertex) | vertex | gcp | Runs your pipelines in Vertex AI. |
-| [SagemakerOrchestrator](/develop/stack-components/orchestrators/sagemaker) | sagemaker | aws | Runs your pipelines in Sagemaker. |
-| [AzureMLOrchestrator](/develop/stack-components/orchestrators/azureml) | azureml | azure | Runs your pipelines in AzureML. |
-| [TektonOrchestrator](/develop/stack-components/orchestrators/tekton) | tekton | tekton | Runs your pipelines using Tekton. |
-| [AirflowOrchestrator](/develop/stack-components/orchestrators/airflow) | airflow | airflow | Runs your pipelines using Airflow. |
-| [SkypilotAWSOrchestrator](/develop/stack-components/orchestrators/skypilot-vm) | vm\_aws | skypilot\[aws\] | Runs your pipelines in AWS VMs using SkyPilot |
-| [SkypilotGCPOrchestrator](/develop/stack-components/orchestrators/skypilot-vm) | vm\_gcp | skypilot\[gcp\] | Runs your pipelines in GCP VMs using SkyPilot |
-| [SkypilotAzureOrchestrator](/develop/stack-components/orchestrators/skypilot-vm) | vm\_azure | skypilot\[azure\] | Runs your pipelines in Azure VMs using SkyPilot |
-| [HyperAIOrchestrator](/develop/stack-components/orchestrators/hyperai) | hyperai | hyperai | Runs your pipelines in HyperAI instances. |
-| [Custom Implementation](/develop/stack-components/orchestrators/custom) | _custom_ | | Extend the orchestrator abstraction and provide your own implementation |
-
-If you would like to see the available flavors of orchestrators, you can use the command:
-
-```bash
-zenml orchestrator flavor list
-```
-
-### How to use it
-
-You don't need to directly interact with any ZenML orchestrator in your code.
As long as the orchestrator that you want to use is part of your active [ZenML stack](/develop/user-guide/guides/production-guide/understand-stacks), using the orchestrator is as simple as executing a Python file that [runs a ZenML pipeline](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/starter-guide/starter-guide.md): - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -#### Inspecting Runs in the Orchestrator UI - -If your orchestrator comes with a separate user interface (for example Kubeflow, Airflow, Vertex), you can get the URL to the orchestrator UI of a specific pipeline run using the following code snippet: - -```bash -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("") - -orchestrator_url = pipeline_run.run_metadata["orchestrator_url"].value -``` - -#### Specifying per-step resources - -If your steps require the orchestrator to execute them on specific hardware, you can specify them on your steps as described [here](/develop/usage/project-setup/use-configuration-files/runtime-configuration). - -If your orchestrator of choice or the underlying hardware doesn't support this, you can also take a look at [step operators](/develop/stack-components/step-operators). - diff --git a/develop/stack-components/orchestrators/sagemaker.mdx b/develop/stack-components/orchestrators/sagemaker.mdx deleted file mode 100644 index 4e88945..0000000 --- a/develop/stack-components/orchestrators/sagemaker.mdx +++ /dev/null @@ -1,270 +0,0 @@ ---- -title: "AWS Sagemaker Orchestrator" -description: "Orchestrating your pipelines to run on Amazon Sagemaker." -icon: amazon ---- - -[Sagemaker Pipelines](https://aws.amazon.com/sagemaker/pipelines) is a serverless ML workflow tool running on AWS. It is an easy way to quickly run your code in a production-ready, repeatable cloud orchestrator that requires minimal setup without provisioning and paying for standby compute. - - - This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior! - -## When to use it - -You should use the Sagemaker orchestrator if: - -* you're already using AWS. -* you're looking for a proven production-grade orchestrator. -* you're looking for a UI in which you can track your pipeline runs. -* you're looking for a managed solution for running your pipelines. -* you're looking for a serverless solution for running your pipelines. - -## How it works - -The ZenML Sagemaker orchestrator works with [Sagemaker Pipelines](https://aws.amazon.com/sagemaker/pipelines), which can be used to construct machine learning pipelines. Under the hood, for each ZenML pipeline step, it creates a SageMaker `PipelineStep`, which contains a Sagemaker Processing job. Currently, other step types are not supported. - -## How to deploy it - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including a Sagemaker orchestrator? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML AWS Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. 
-
-In order to use a Sagemaker orchestrator, you need to first deploy [ZenML to the cloud](/develop/getting-started/deploying-zenml). We recommend deploying ZenML in the same region that you plan to use for Sagemaker, but it is not strictly necessary. You must ensure that you are connected to the remote ZenML server before using this stack component.
-
-The only other thing necessary to use the ZenML Sagemaker orchestrator is enabling the relevant permissions for your particular role.
-
-In order to quickly enable APIs and create the other resources necessary to use this integration, we will soon provide a Sagemaker stack recipe via [our mlstacks repository](https://github.com/zenml-io/mlstacks), which will help you set up the infrastructure with one click.
-
-### Infrastructure Deployment
-
-A Sagemaker orchestrator can be deployed directly from the ZenML CLI:
-
-```bash
-zenml orchestrator deploy sagemaker_orchestrator --flavor=sagemaker --provider=aws ...
-```
-
-You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to use the CLI for this, please refer to the dedicated documentation section.
-
-## How to use it
-
-To use the Sagemaker orchestrator, we need:
-
-* The ZenML `aws` and `s3` integrations installed. If you haven't done so, run
-
-```bash
-zenml integration install aws s3
-```
-
-* [Docker](https://www.docker.com) installed and running.
-* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack (configured with an `authentication_secret` attribute).
-* A [remote container registry](/develop/stack-components/container-registries) as part of your stack.
-* An IAM role or user with [an AmazonSageMakerFullAccess managed policy](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html) applied to it as well as `sagemaker.amazonaws.com` added as a Principal Service. Full details on these permissions can be found [here](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html), or use the ZenML recipe (when available) which will set up the necessary permissions for you.
-* The local client (whoever is running the pipeline) will also need the necessary permissions or roles to be able to launch Sagemaker jobs. (This would be covered by the `AmazonSageMakerFullAccess` policy suggested above.)
-
-There are three ways you can authenticate your orchestrator and link it to the IAM role you have created:
-
-
-
-The recommended way to authenticate your SageMaker orchestrator is by registering an [AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector) and connecting it to your SageMaker orchestrator:
-
-```bash
-zenml service-connector register --type aws -i
-zenml orchestrator register \
-    --flavor=sagemaker \
-    --execution_role=
-zenml orchestrator connect --connector
-zenml stack register -o ... --set
-```
-
-
-Instead of creating a service connector, you can also configure your AWS authentication credentials directly in the orchestrator:
-
-```bash
-zenml orchestrator register \
-    --flavor=sagemaker \
-    --execution_role= \
-    --aws_access_key_id=... \
-    --aws_secret_access_key=... \
-    --region=...
-zenml stack register -o ... \
--set -``` - -See the [SagemakerOrchestratorConfig SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.flavors.sagemaker%5Forchestrator%5Fflavor) for more information on available configuration options. - - -If you neither connect your orchestrator to a service connector nor configure credentials explicitly, ZenML will try to implicitly authenticate to AWS via the `default` profile in your local [AWS configuration file](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html). - -```bash -zenml orchestrator register \ - --flavor=sagemaker \ - --execution_role= -zenml stack register -o ... --set -python run.py # Authenticates with `default` profile in `~/.aws/config` -``` - - - - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your pipeline steps in Sagemaker. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - - -You can now run any ZenML pipeline using the Sagemaker orchestrator: - -```bash -python run.py -``` - -If all went well, you should now see the following output: - -``` -Steps can take 5-15 minutes to start running when using the Sagemaker Orchestrator. - -Your orchestrator 'sagemaker' is running remotely. Note that the pipeline run will only show up on the ZenML dashboard once the first step has started executing on the remote infrastructure. -``` - - - If it is taking more than 15 minutes for your run to show up, it might be that a setup error occurred in SageMaker before the pipeline could be started. Checkout the [Debugging SageMaker Pipelines](/develop/stack-components/orchestrators/sagemaker#debugging-sagemaker-pipelines) section for more information on how to debug this. - - -### Sagemaker UI - -Sagemaker comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps. - -To access the Sagemaker Pipelines UI, you will have to launch Sagemaker Studio via the AWS Sagemaker UI. Make sure that you are launching it from within your desired AWS region. - - - - -Once the Studio UI has launched, click on the 'Pipeline' button on the left side. From there you can view the pipelines that have been launched via ZenML: - - - - -### Debugging SageMaker Pipelines - -If your SageMaker pipeline encounters an error before the first ZenML step starts, the ZenML run will not appear in the ZenML dashboard. In such cases, use the [SageMaker UI](/develop/stack-components/orchestrators/sagemaker#sagemaker-ui) to review the error message and logs. Here's how: - -* Open the corresponding pipeline in the SageMaker UI as shown in the [SageMaker UI Section](/develop/stack-components/orchestrators/sagemaker#sagemaker-ui), -* Open the execution, -* Click on the failed step in the pipeline graph, -* Go to the 'Output' tab to see the error message or to 'Logs' to see the logs. - - - - -Alternatively, for a more detailed view of log messages during SageMaker pipeline executions, consider using [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/): - -* Search for 'CloudWatch' in the AWS console search bar. -* Navigate to 'Logs > Log groups.' -* Open the '/aws/sagemaker/ProcessingJobs' log group. -* Here, you can find log streams for each step of your SageMaker pipeline executions. 
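-
-If you prefer the terminal over the CloudWatch console, roughly the same information can be retrieved with the AWS CLI. This is only a sketch: the log group below is the default one used by SageMaker Processing jobs (as mentioned above), and the `--max-items`, `--since`, and `--follow` values are illustrative.
-
-```bash
-# List the most recent log streams for SageMaker Processing jobs
-aws logs describe-log-streams \
-  --log-group-name /aws/sagemaker/ProcessingJobs \
-  --order-by LastEventTime --descending --max-items 10
-
-# Tail the log group while a pipeline run is in progress (AWS CLI v2)
-aws logs tail /aws/sagemaker/ProcessingJobs --since 1h --follow
-```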
-
-
-### Run pipelines on a schedule
-
-The ZenML Sagemaker orchestrator doesn't currently support running pipelines on a schedule. We maintain a public roadmap for ZenML, which you can find [here](https://zenml.io/roadmap). We welcome community contributions (see more [here](https://github.com/zenml-io/zenml/blob/main/CONTRIBUTING.md)), so if you want to enable scheduling for Sagemaker, please [do let us know](https://zenml.io/slack)!
-
-### Configuration at pipeline or step level
-
-When running your ZenML pipeline with the Sagemaker orchestrator, the configuration set when configuring the orchestrator as a ZenML component will be used by default. However, it is possible to provide additional configuration at the pipeline or step level. This allows you to run whole pipelines or individual steps with alternative configurations. For example, this allows you to run the training process with a heavier, GPU-enabled instance type, while running other steps with lighter instances.
-
-Additional configuration for the Sagemaker orchestrator can be passed via `SagemakerOrchestratorSettings`. Here, it is possible to configure `processor_args`, which is a dictionary of arguments for the Processor. For available arguments, see the [Sagemaker documentation](https://sagemaker.readthedocs.io/en/stable/api/training/processing.html#sagemaker.processing.Processor). Currently, it is not possible to provide custom configuration for the following attributes:
-
-* `image_uri`
-* `instance_count`
-* `sagemaker_session`
-* `entrypoint`
-* `base_job_name`
-* `env`
-
-For example, settings can be provided in the following way:
-
-```py
-from zenml.integrations.aws.flavors.sagemaker_orchestrator_flavor import SagemakerOrchestratorSettings
-
-sagemaker_orchestrator_settings = SagemakerOrchestratorSettings(
-    instance_type="ml.m5.large",
-    volume_size_in_gb=30,
-)
-```
-
-They can then be applied to a step as follows:
-
-```py
-@step(settings={"orchestrator": sagemaker_orchestrator_settings})
-```
-
-For example, if your ZenML component is configured to use `ml.c5.xlarge` with 400GB additional storage by default, all steps will use it except for the step above, which will use `ml.m5.large` with 30GB additional storage.
-
-Check out [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings in general.
-
-For more information and a full list of configurable attributes of the Sagemaker orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.orchestrators.sagemaker%5Forchestrator.SagemakerOrchestrator).
-
-#### S3 data access in ZenML steps
-
-In Sagemaker jobs, it is possible to [access data that is located in S3](https://docs.aws.amazon.com/sagemaker/latest/dg/model-access-training-data.html). Similarly, it is possible to write data from a job to a bucket. The ZenML Sagemaker orchestrator supports this via the `SagemakerOrchestratorSettings` and hence at component, pipeline, and step levels.
-
-**Import: S3 -> job**
-
-Importing data can be useful when large datasets are available in S3 for training, for which manual copying can be cumbersome. Sagemaker supports `File` (default) and `Pipe` mode, with which data is either fully copied before the job starts or piped on the fly. See the Sagemaker documentation referenced above for more information about these modes.
-
-Note that data import and export can be used jointly with `processor_args` for maximum flexibility.
-
-A simple example of importing data from S3 to the Sagemaker job is as follows:
-
-```py
-sagemaker_orchestrator_settings = SagemakerOrchestratorSettings(
-    input_data_s3_mode="File",
-    input_data_s3_uri="s3://some-bucket-name/folder"
-)
-```
-
-In this case, data will be available at `/opt/ml/processing/input/data` within the job.
-
-It is also possible to split your input over channels. This can be useful if the dataset is already split in S3, or maybe even located in different buckets.
-
-```py
-sagemaker_orchestrator_settings = SagemakerOrchestratorSettings(
-    input_data_s3_mode="File",
-    input_data_s3_uri={
-        "train": "s3://some-bucket-name/training_data",
-        "val": "s3://some-bucket-name/validation_data",
-        "test": "s3://some-other-bucket-name/testing_data"
-    }
-)
-```
-
-Here, the data will be available in `/opt/ml/processing/input/data/train`, `/opt/ml/processing/input/data/val` and `/opt/ml/processing/input/data/test`.
-
-In the case of using `Pipe` for `input_data_s3_mode`, a file path specifying the pipe will be available as per the description written [here](https://docs.aws.amazon.com/sagemaker/latest/dg/model-access-training-data.html#model-access-training-data-input-modes). An example of using this pipe file within a Python script can be found [here](https://github.com/aws/amazon-sagemaker-examples/blob/main/advanced%5Ffunctionality/pipe%5Fbring%5Fyour%5Fown/train.py).
-
-**Export: job -> S3**
-
-Data from within the job (e.g. produced by the training process, or when preprocessing large data) can be exported as well. The structure is highly similar to that of importing data. Exporting data to S3 can be configured with `output_data_s3_mode`, which supports `EndOfJob` (default) and `Continuous`.
-
-In the simple case, data in `/opt/ml/processing/output/data` will be copied to S3 at the end of a job:
-
-```py
-sagemaker_orchestrator_settings = SagemakerOrchestratorSettings(
-    output_data_s3_mode="EndOfJob",
-    output_data_s3_uri="s3://some-results-bucket-name/results"
-)
-```
-
-In a more complex case, data in `/opt/ml/processing/output/data/metadata` and `/opt/ml/processing/output/data/checkpoints` will be written to S3 continuously:
-
-```py
-sagemaker_orchestrator_settings = SagemakerOrchestratorSettings(
-    output_data_s3_mode="Continuous",
-    output_data_s3_uri={
-        "metadata": "s3://some-results-bucket-name/metadata",
-        "checkpoints": "s3://some-results-bucket-name/checkpoints"
-    }
-)
-```
-
-### Enabling CUDA for GPU-backed hardware
-
-Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration.
diff --git a/develop/stack-components/orchestrators/skypilot-vm.mdx b/develop/stack-components/orchestrators/skypilot-vm.mdx
deleted file mode 100644
index 01fbd09..0000000
--- a/develop/stack-components/orchestrators/skypilot-vm.mdx
+++ /dev/null
@@ -1,466 +0,0 @@
----
-title: "Skypilot VM Orchestrator"
-description: "Orchestrating your pipelines to run on VMs using SkyPilot."
-icon: plane-up
----
-
-The SkyPilot VM Orchestrator is an integration provided by ZenML that allows you to provision and manage virtual machines (VMs) on any cloud provider supported by the [SkyPilot framework](https://skypilot.readthedocs.io/en/latest/index.html).
This integration is designed to simplify the process of running machine learning workloads on the cloud, offering cost savings, high GPU availability, and managed execution, We recommend using the SkyPilot VM Orchestrator if you need access to GPUs for your workloads, but don't want to deal with the complexities of managing cloud infrastructure or expensive managed solutions. - - - This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior! - - - SkyPilot VM Orchestrator is currently supported only for Python 3.8 and 3.9. - - -## When to use it - -You should use the SkyPilot VM Orchestrator if: - -* you want to maximize cost savings by leveraging spot VMs and auto-picking the cheapest VM/zone/region/cloud. -* you want to ensure high GPU availability by provisioning VMs in all zones/regions/clouds you have access to. -* you don't need a built-in UI of the orchestrator. (You can still use ZenML's Dashboard to view and monitor your pipelines/artifacts.) -* you're not willing to maintain Kubernetes-based solutions or pay for managed solutions like [Sagemaker](/develop/stack-components/orchestrators/sagemaker). - -## How it works - -The orchestrator leverages the SkyPilot framework to handle the provisioning and scaling of VMs. It automatically manages the process of launching VMs for your pipelines, with support for both on-demand and managed spot VMs. While you can select the VM type you want to use, the orchestrator also includes an optimizer that automatically selects the cheapest VM/zone/region/cloud for your workloads. Finally, the orchestrator includes an autostop feature that cleans up idle clusters, preventing unnecessary cloud costs. - - - You can configure the SkyPilot VM Orchestrator to use a specific VM type, and resources for each step of your pipeline can be configured individually. Read more about how to configure step-specific resources [here](/develop/stack-components/orchestrators/skypilot-vm#configuring-step-specific-resources). - - - - The SkyPilot VM Orchestrator does not currently support the ability to [schedule pipelines runs](/develop/usage/pipelines/build-pipelines/schedule-a-pipeline) - - - - All ZenML pipeline runs are executed using Docker containers within the VMs provisioned by the orchestrator. For that reason, you may need to configure your pipeline settings with `docker_run_args=["--gpus=all"]` to enable GPU support in the Docker container. - - -## How to deploy it - -You don't need to do anything special to deploy the SkyPilot VM Orchestrator. As the SkyPilot integration itself takes care of provisioning VMs, you can simply use the orchestrator as you would any other ZenML orchestrator. However, you will need to ensure that you have the appropriate permissions to provision VMs on your cloud provider of choice and to configure your SkyPilot orchestrator accordingly using the [service connectors](/develop/usage/resource-data-management/auth-management/service-connectors-guide) feature. - - - The SkyPilot VM Orchestrator currently only supports the AWS, GCP, and Azure cloud platforms. - - -## How to use it - -To use the SkyPilot VM Orchestrator, you need: - -* One of the SkyPilot integrations installed. 
You can install the SkyPilot integration for your cloud provider of choice using the following command: - -```bash - # For AWS - pip install "zenml[connectors-aws]" - zenml integration install aws skypilot_aws - - # for GCP - pip install "zenml[connectors-gcp]" - zenml integration install gcp skypilot_gcp # for GCP - - # for Azure - pip install "zenml[connectors-azure]" - zenml integration install azure skypilot_azure # for Azure -``` -* [Docker](https://www.docker.com) installed and running. -* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. -* A [remote container registry](/develop/stack-components/container-registries) as part of your stack. -* A [remote ZenML deployment](/develop/getting-started/deploying-zenml). -* The appropriate permissions to provision VMs on your cloud provider of choice. -* A [service connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) configured to authenticate with your cloud provider of choice. - - - -We need first to install the SkyPilot integration for AWS and the AWS connectors extra, using the following two commands: - -```bash - pip install "zenml[connectors-aws]" - - zenml integration install aws skypilot_aws -``` - -To provision VMs on AWS, your VM Orchestrator stack component needs to be configured to authenticate with [AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector). To configure the AWS Service Connector, you need to register a new service connector configured with AWS credentials that have at least the minimum permissions required by SkyPilot as documented [here](https://skypilot.readthedocs.io/en/latest/cloud-setup/cloud-permissions/aws.html). - -First, check that the AWS service connector type is available using the following command: - -```bash -zenml service-connector list-types --type aws -``` - -```bash -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠───────────────────────┼────────┼───────────────────────┼──────────────────┼───────┼────────┨ - -┃ AWS Service Connector │ 🔶 aws │ 🔶 aws-generic │ implicit │ ✅ │ ➖ ┃ - -┃ │ │ 📦 s3-bucket │ secret-key │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ sts-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ iam-role │ │ ┃ - -┃ │ │ │ session-token │ │ ┃ - -┃ │ │ │ federation-token │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -Next, configure a service connector using the CLI or the dashboard with the AWS credentials. For example, the following command uses the local AWS CLI credentials to auto-configure the service connector: - -```bash -zenml service-connector register aws-skypilot-vm --type aws --region=us-east-1 --auto-configure -``` - -This will automatically configure the service connector with the appropriate credentials and permissions to provision VMs on AWS. You can then use the service connector to configure your registered VM Orchestrator stack component using the following command: - -```bash -# Register the orchestrator -zenml orchestrator register --flavor vm_aws -# Connect the orchestrator to the service connector -zenml orchestrator connect --connector aws-skypilot-vm - -# Register and activate a stack with the new orchestrator -zenml stack register -o ... 
--set -``` - - -We need first to install the SkyPilot integration for GCP and the GCP extra for ZenML, using the following two commands: - -```bash - pip install "zenml[connectors-gcp]" - - zenml integration install gcp skypilot_gcp -``` - -To provision VMs on GCP, your VM Orchestrator stack component needs to be configured to authenticate with [GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) - -To configure the GCP Service Connector, you need to register a new service connector, but first let's check the available service connectors types using the following command: - -```bash -zenml service-connector list-types --type gcp -``` -```bash -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠───────────────────────┼────────┼───────────────────────┼─────────────────┼───────┼────────┨ - -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ➖ ┃ - -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ - -┃ │ │ 🐳 docker-registry │ oauth2-token │ │ ┃ - -┃ │ │ │ impersonation │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -For this example we will configure a service connector using the `user-account` auth method. But before we can do that, we need to login to GCP using the following command: - -```bash -gcloud auth application-default login -``` - -This will open a browser window and ask you to login to your GCP account. Once you have logged in, you can register a new service connector using the following command: - -```bash -# We want to use --auto-configure to automatically configure the service connector with the appropriate credentials and permissions to provision VMs on GCP. -zenml service-connector register gcp-skypilot-vm -t gcp --auth-method user-account --auto-configure -# using generic resource type requires disabling the generation of temporary tokens -zenml service-connector update gcp-skypilot-vm --generate_temporary_tokens=False -``` - -This will automatically configure the service connector with the appropriate credentials and permissions to provision VMs on GCP. You can then use the service connector to configure your registered VM Orchestrator stack component using the following commands: - -```bash -# Register the orchestrator -zenml orchestrator register --flavor vm_gcp -# Connect the orchestrator to the service connector -zenml orchestrator connect --connector gcp-skypilot-vm - -# Register and activate a stack with the new orchestrator -zenml stack register -o ... 
--set -``` - - - -We need first to install the SkyPilot integration for Azure and the Azure extra for ZenML, using the following two commands - -```bash - pip install "zenml[connectors-azure]" - zenml integration install azure skypilot_azure -``` - -To provision VMs on Azure, your VM Orchestrator stack component needs to be configured to authenticate with [Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector) - -To configure the Azure Service Connector, you need to register a new service connector, but first let's check the available service connectors types using the following command: - -```bash -zenml service-connector list-types --type azure -``` - -```bash -┏━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠─────────────────────────┼───────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Azure Service Connector │ 🇦 azure │ 🇦 azure-generic │ implicit │ ✅ │ ➖ ┃ - -┃ │ │ 📦 blob-container │ service-principal │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ access-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ - -zenml service-connector register azure-skypilot-vm -t azure --auth-method access-token --auto-configure -``` - -This will automatically configure the service connector with the appropriate credentials and permissions to provision VMs on Azure. You can then use the service connector to configure your registered VM Orchestrator stack component using the following commands: - -```bash -# Register the orchestrator -zenml orchestrator register --flavor vm_azure -# Connect the orchestrator to the service connector -zenml orchestrator connect --connector azure-skypilot-vm -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - - -[Lambda Labs](https://lambdalabs.com/service/gpu-cloud) is a cloud provider that offers GPU instances for machine learning workloads. Unlike the major cloud providers, with Lambda Labs we don't need to configure a service connector to authenticate with the cloud provider. Instead, we can directly use API keys to authenticate with the Lambda Labs API. - -```bash - zenml integration install skypilot_lambda -``` - -Once the integration is installed, we can register the orchestrator with the following command: - -```bash -# For more secure and recommended way, we will register the API key as a secret -zenml secret create lambda_api_key --scope user --api_key= -# Register the orchestrator -zenml orchestrator register --flavor vm_lambda --api_key={{lambda_api_key.api_key}} -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - - - The Lambda Labs orchestrator does not support some of the features like `job_recovery`, `disk_tier`, `image_id`, `zone`, `idle_minutes_to_autostop`, `disk_size`, `use_spot`. It is recommended not to use these features with the Lambda Labs orchestrator and not to use [step-specific settings](/develop/stack-components/orchestrators/skypilot-vm#configuring-step-specific-resources). - - - While testing the orchestrator, we noticed that the Lambda Labs orchestrator does not support the `down` flag. This means the orchestrator will not automatically tear down the cluster after all jobs finish. 
We recommend manually tearing down the cluster after all jobs finish to avoid unnecessary costs. - - - - -#### Additional Configuration - -For additional configuration of the Skypilot orchestrator, you can pass `Settings` depending on which cloud you are using which allows you to configure (among others) the following attributes: - -* `instance_type`: The instance type to use. -* `cpus`: The number of CPUs required for the task. If a string, must be a string of the form `'2'` or `'2+'`, where the `+` indicates that the task requires at least 2 CPUs. -* `memory`: The amount of memory in GiB required. If a string, must be a string of the form `'16'` or `'16+'`, where the `+` indicates that the task requires at least 16 GB of memory. -* `accelerators`: The accelerators required. If a string, must be a string of the form `'V100'` or `'V100:2'`, where the `:2` indicates that the task requires 2 V100 GPUs. If a dict, must be a dict of the form `{'V100': 2}` or `{'tpu-v2-8': 1}`. -* `accelerator_args`: Accelerator-specific arguments. For example, `{'tpu_vm': True, 'runtime_version': 'tpu-vm-base'}` for TPUs. -* `use_spot`: Whether to use spot instances. If None, defaults to False. -* `job_recovery`: The spot recovery strategy to use for the managed spot to recover the cluster from preemption. Read more about the available strategies [here](https://skypilot.readthedocs.io/en/latestversions/0.66.0/reference/api.html?highlight=instance%5Ftype#resources) -* `region`: The cloud region to use. -* `zone`: The cloud zone to use within the region. -* `image_id`: The image ID to use. If a string, must be a string of the image id from the cloud, such as AWS: `'ami-1234567890abcdef0'`, GCP: `'projects/my-project-id/global/_assets/my-image-name'`; Or, a image tag provided by SkyPilot, such as AWS: `'skypilot:gpu-ubuntu-2004'`. If a dict, must be a dict mapping from region to image ID. -* `disk_size`: The size of the OS disk in GiB. -* `disk_tier`: The disk performance tier to use. If None, defaults to `'medium'`. -* `cluster_name`: Name of the cluster to create/reuse. If None, auto-generate a name. SkyPilot uses term `cluster` to refer to a group or a single VM that are provisioned to execute the task. The cluster name is used to identify the cluster and to determine whether to reuse an existing cluster or create a new one. -* `retry_until_up`: Whether to retry launching the cluster until it is up. -* `idle_minutes_to_autostop`: Automatically stop the cluster after this many minutes of idleness, i.e., no running or pending jobs in the cluster's job queue. Idleness gets reset whenever setting-up/running/pending jobs are found in the job queue. Setting this flag is equivalent to running `sky.launch(..., detach_run=True, ...)` and then `sky.autostop(idle_minutes=)`. If not set, the cluster will not be autostopped. -* `down`: Tear down the cluster after all jobs finish (successfully or abnormally). If `idle_minutes_to_autostop` is also set, the cluster will be torn down after the specified idle time. Note that if errors occur during provisioning/data syncing/setting up, the cluster will not be torn down for debugging purposes. -* `stream_logs`: If True, show the logs in the terminal as they are generated while the cluster is running. -* `docker_run_args`: Additional arguments to pass to the `docker run` command. For example, `['--gpus=all']` to use all GPUs available on the VM. 
- -The following code snippets show how to configure the orchestrator settings for each cloud provider: - - - -**Code Example:** - -```py -from zenml.integrations.skypilot_aws.flavors.skypilot_orchestrator_aws_vm_flavor import SkypilotAWSOrchestratorSettings - -skypilot_settings = SkypilotAWSOrchestratorSettings( - cpus="2", - memory="16", - accelerators="V100:2", - accelerator_args={"tpu_vm": True, "runtime_version": "tpu-vm-base"}, - use_spot=True, - job_recovery="recovery_strategy", - region="us-west-1", - zone="us-west1-a", - image_id="ami-1234567890abcdef0", - disk_size=100, - disk_tier="high", - cluster_name="my_cluster", - retry_until_up=True, - idle_minutes_to_autostop=60, - down=True, - stream_logs=True - docker_run_args=["--gpus=all"] -) - - -@pipeline( - settings={ - "orchestrator": skypilot_settings - } -) -``` - - -**Code Example:** -```py - -from zenml.integrations.skypilot_gcp.flavors.skypilot_orchestrator_gcp_vm_flavor import SkypilotGCPOrchestratorSettings - - -skypilot_settings = SkypilotGCPOrchestratorSettings( - cpus="2", - memory="16", - accelerators="V100:2", - accelerator_args={"tpu_vm": True, "runtime_version": "tpu-vm-base"}, - use_spot=True, - job_recovery="recovery_strategy", - region="us-west1", - zone="us-west1-a", - image_id="ubuntu-pro-2004-focal-v20231101", - disk_size=100, - disk_tier="high", - cluster_name="my_cluster", - retry_until_up=True, - idle_minutes_to_autostop=60, - down=True, - stream_logs=True -) - - -@pipeline( - settings={ - "orchestrator": skypilot_settings - } -) -``` - - -**Code Example:** -```py - -from zenml.integrations.skypilot_azure.flavors.skypilot_orchestrator_azure_vm_flavor import SkypilotAzureOrchestratorSettings - - -skypilot_settings = SkypilotAzureOrchestratorSettings( - cpus="2", - memory="16", - accelerators="V100:2", - accelerator_args={"tpu_vm": True, "runtime_version": "tpu-vm-base"}, - use_spot=True, - job_recovery="recovery_strategy", - region="West Europe", - image_id="Canonical:0001-com-ubuntu-server-jammy:22_04-lts-gen2:latest", - disk_size=100, - disk_tier="high", - cluster_name="my_cluster", - retry_until_up=True, - idle_minutes_to_autostop=60, - down=True, - stream_logs=True -) - - -@pipeline( - settings={ - "orchestrator": skypilot_settings - } -) -``` - - -**Code Example:** -```py - -from zenml.integrations.skypilot_lambda import SkypilotLambdaOrchestratorSettings - - -skypilot_settings = SkypilotLambdaOrchestratorSettings( - instance_type="gpu_1x_h100_pcie", - cluster_name="my_cluster", - retry_until_up=True, - idle_minutes_to_autostop=60, - down=True, - stream_logs=True, - docker_run_args=["--gpus=all"] -) - - -@pipeline( - settings={ - "orchestrator": skypilot_settings - } -) -``` - - - - -One of the key features of the SkyPilot VM Orchestrator is the ability to run each step of a pipeline on a separate VM with its own specific settings. This allows for fine-grained control over the resources allocated to each step, ensuring that each part of your pipeline has the necessary compute power while optimizing for cost and efficiency. - -## Configuring Step-Specific Resources - -The SkyPilot VM Orchestrator allows you to configure resources for each step individually. This means you can specify different VM types, CPU and memory requirements, and even use spot instances for certain steps while using on-demand instances for others. - -If no step-specific settings are specified, the orchestrator will use the resources specified in the orchestrator settings for each step and run the entire pipeline in one VM. 
If step-specific settings are specified, an orchestrator VM will be spun up first, which will subsequently spin up new VMs based on the settings of each step. You can disable this behavior by setting the `disable_step_based_settings` parameter to `True` in the orchestrator configuration, using the following command: - -```bash -zenml orchestrator update --disable_step_based_settings=True -``` - -Here's an example of how to configure specific resources for a step on the AWS cloud: - -```py -from zenml.integrations.skypilot_aws.flavors.skypilot_orchestrator_aws_vm_flavor import SkypilotAWSOrchestratorSettings - -# Settings for a specific step that requires more resources -high_resource_settings = SkypilotAWSOrchestratorSettings( - instance_type='t2.2xlarge', - cpus=8, - memory=32, - use_spot=False, - region='us-east-1', - # ... other settings -) - -@step(settings={"orchestrator": high_resource_settings}) -def my_resource_intensive_step(): - # Step implementation - pass -``` - - - When configuring pipeline or step-specific resources, use the `settings` parameter to target the orchestrator flavor you are using (`orchestrator.STACK_COMPONENT_FLAVOR`), not the orchestrator component name (`orchestrator.STACK_COMPONENT_NAME`). For example, if you want to configure resources for the `vm_gcp` flavor, use `settings={"orchestrator.vm_gcp": ...}`. - - -By using the `settings` parameter, you can tailor the resources for each step according to its specific needs. This flexibility allows you to optimize your pipeline execution for both performance and cost. - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-skypilot/#zenml.integrations.skypilot.flavors.skypilot%5Forchestrator%5Fbase%5Fvm%5Fflavor.SkypilotBaseOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. - diff --git a/develop/stack-components/orchestrators/tekton.mdx b/develop/stack-components/orchestrators/tekton.mdx deleted file mode 100644 index 1a1cdfd..0000000 --- a/develop/stack-components/orchestrators/tekton.mdx +++ /dev/null @@ -1,238 +0,0 @@ ---- -title: "Tekton Orchestrator" -description: "Orchestrating your pipelines to run on Tekton." -icon: bomb ---- - -[Tekton](https://tekton.dev/) is a powerful and flexible open-source framework for creating CI/CD systems, allowing developers to build, test, and deploy across cloud providers and on-premise systems. - - - This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior! - -### When to use it - -You should use the Tekton orchestrator if: - -* you're looking for a proven production-grade orchestrator. -* you're looking for a UI in which you can track your pipeline runs. -* you're already using Kubernetes or are not afraid of setting up and maintaining a Kubernetes cluster. -* you're willing to deploy and maintain Tekton Pipelines on your cluster. - -### How to deploy it - -You'll first need to set up a Kubernetes cluster and deploy Tekton Pipelines: - - -* A remote ZenML server. See the [deployment guide](/develop/getting-started/deploying-zenml) for more information. -* Have an existing AWS [EKS cluster](https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html) set up.
-* Make sure you have the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) set up. -* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and [configure](https://aws.amazon.com/premiumsupport/knowledge-center/eks-cluster-connection/) it to talk to your EKS cluster using the following command: - -``` -aws eks --region REGION update-kubeconfig --name CLUSTER_NAME -``` -* [Install](https://tekton.dev/docs/pipelines/install/) Tekton Pipelines onto your cluster. - - -* A remote ZenML server. See the [deployment guide](/develop/getting-started/deploying-zenml) for more information. -* Have an existing GCP [GKE cluster](https://cloud.google.com/kubernetes-engine/docs/quickstart) set up. -* Make sure you have the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install-sdk) set up first. -* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and [configure](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl) it to talk to your GKE cluster using the following command: - -``` -gcloud container clusters get-credentials CLUSTER_NAME -``` -* [Install](https://tekton.dev/docs/pipelines/install/) Tekton Pipelines onto your cluster. - - -* A remote ZenML server. See the [deployment guide](/develop/getting-started/deploying-zenml) for more information. -* Have an existing [AKS cluster](https://azure.microsoft.com/en-in/services/kubernetes-service/#documentation) set up. -* Make sure you have the [az CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) set up first. -* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and configure it to talk to your AKS cluster using the following command: - -```bash -az aks get-credentials --resource-group RESOURCE_GROUP --name CLUSTER_NAME -``` -* [Install](https://tekton.dev/docs/pipelines/install/) Tekton Pipelines onto your cluster. - - - - - If one or more of the deployments are not in the `Running` state, try increasing the number of nodes in your cluster. - - - - ZenML has only been tested with Tekton Pipelines >=0.38.3 and may not work with previous versions. - - -#### Infrastructure Deployment - -A Tekton orchestrator can be deployed directly from the ZenML CLI: - -```bash -zenml orchestrator deploy tekton_orchestrator --flavor=tekton --provider= ... -``` - -You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to use the CLI for this, please refer to the dedicated documentation section. - -### How to use it - -To use the Tekton orchestrator, we need: - -* The ZenML `tekton` integration installed. If you haven't done so, run - -```bash -zenml integration install tekton -y -``` -* [Docker](https://www.docker.com) installed and running. -* Tekton Pipelines deployed on a remote cluster. See the [deployment section](/develop/stack-components/orchestrators/tekton#how-to-deploy-it) for more information. -* The name of your Kubernetes context which points to your remote cluster. Run `kubectl config get-contexts` to see a list of available contexts. -* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. -* A [remote container registry](/develop/stack-components/container-registries) as part of your stack.
-* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed and the name of the Kubernetes configuration context which points to the target cluster (i.e. run`kubectl config get-contexts` to see a list of available contexts). This is optional (see below). - - - It is recommended that you set up [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) and use it to connect ZenML Stack Components to the remote Kubernetes cluster, especially If you are using a Kubernetes cluster managed by a cloud provider like AWS, GCP or Azure, This guarantees that your Stack is fully portable on other environments and your pipelines are fully reproducible. - -We can then register the orchestrator and use it in our active stack. This can be done in two ways: - -1. If you have [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) configured to access the remote Kubernetes cluster, you no longer need to set the `kubernetes_context` attribute to a local `kubectl` context. In fact, you don't need the local Kubernetes CLI at all. You can [connect the stack component to the Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide#connect-stack-components-to-resources) instead: - -```bash -$ zenml orchestrator register --flavor tekton -Running with active stack: 'default' (repository) -Successfully registered orchestrator ``. -$ zenml service-connector list-resources --resource-type kubernetes-cluster -e -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ e33c9fac-5daa-48b2-87bb-0187d3782cde │ aws-iam-multi-eu │ 🔶 aws │ 🌀 kubernetes-cluster │ kubeflowmultitenant ┃ -┃ │ │ │ │ zenbox ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ 1c54b32a-4889-4417-abbd-42d3ace3d03a │ gcp-sa-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛ -$ zenml orchestrator connect --connector aws-iam-multi-us -Running with active stack: 'default' (repository) -Successfully connected orchestrator `` to the following resources: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼──────────────────┼────────────────┼───────────────────────┼──────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -# Register and activate a stack with the new orchestrator -$ zenml 
stack register -o ... --set -``` -2. If you don't have a Service Connector on hand and you don't want to [register one](/develop/usage/resource-data-management/auth-management/service-connectors-guide#register-service-connectors), the local Kubernetes `kubectl` client needs to be configured with a configuration context pointing to the remote cluster. The `kubernetes_context` attribute of the orchestrator must also be configured with the value of that context: - -```bash -zenml orchestrator register \ - --flavor=tekton \ - --kubernetes_context= -# Register and activate a stack with the new orchestrator -zenml stack register -o ... --set -``` - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your pipeline steps in Tekton. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - - -You can now run any ZenML pipeline using the Tekton orchestrator: - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -#### Tekton UI - -Tekton comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps. - - - - - -To find the Tekton UI endpoint, we can use the following command: - -``` -kubectl get ingress -n tekton-pipelines -o jsonpath='{.items[0].spec.rules[0].host}' -``` - -#### Additional configuration - -For additional configuration of the Tekton orchestrator, you can pass `TektonOrchestratorSettings` which allows you to configure node selectors, affinity, and tolerations to apply to the Kubernetes Pods running your pipeline. These can be either specified using the Kubernetes model objects or as dictionaries. - - -```py -from zenml.integrations.tekton.flavors.tekton_orchestrator_flavor import TektonOrchestratorSettings -from kubernetes.client.models import V1Toleration - -tekton_settings = TektonOrchestratorSettings( - pod_settings={ - "affinity": { - "nodeAffinity": { - "requiredDuringSchedulingIgnoredDuringExecution": { - "nodeSelectorTerms": [ - { - "matchExpressions": [ - { - "key": "node.kubernetes.io/name", - "operator": "In", - "values": ["my_powerful_node_group"], - } - ] - } - ] - } - } - }, - "tolerations": [ - V1Toleration( - key="node.kubernetes.io/name", - operator="Equal", - value="", - effect="NoSchedule" - ) - ] - } -) -``` - -If your pipeline steps have certain hardware requirements, you can specify them as `ResourceSettings`: - -```py -from zenml.config import ResourceSettings - -resource_settings = ResourceSettings(cpu_count=8, memory="16GB") -``` - -These settings can then be specified on either pipeline-level or step-level: - -```py -# Either specify on pipeline-level -@pipeline( - settings={ - "orchestrator": tekton_settings, - "resources": resource_settings, - } -) -def my_pipeline(): - ... - -# OR specify settings on step-level -@step( - settings={ - "orchestrator": tekton_settings, - "resources": resource_settings, - } -) -def my_step(): - ... -``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-tekton/#zenml.integrations.tekton.flavors.tekton%5Forchestrator%5Fflavor.TektonOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings.
- -For more information and a full list of configurable attributes of the Tekton orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-tekton/#zenml.integrations.tekton.orchestrators.tekton%5Forchestrator.TektonOrchestrator) . - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. diff --git a/develop/stack-components/orchestrators/vertex.mdx b/develop/stack-components/orchestrators/vertex.mdx deleted file mode 100644 index f9e4f7f..0000000 --- a/develop/stack-components/orchestrators/vertex.mdx +++ /dev/null @@ -1,297 +0,0 @@ ---- -title: "Google Cloud VertexAI Orchestrator" -description: "Orchestrating your pipelines to run on Vertex AI." -icon: tornado ---- - -[Vertex AI Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines/introduction) is a serverless ML workflow tool running on the Google Cloud Platform. It is an easy way to quickly run your code in a production-ready, repeatable cloud orchestrator that requires minimal setup without provisioning and paying for standby compute. - - - This component is only meant to be used within the context of a [remote ZenML deployment scenario](/develop/getting-started/deploying-zenml). Usage with a local ZenML deployment may lead to unexpected behavior! - - -## When to use it - -You should use the Vertex orchestrator if: - -* you're already using GCP. -* you're looking for a proven production-grade orchestrator. -* you're looking for a UI in which you can track your pipeline runs. -* you're looking for a managed solution for running your pipelines. -* you're looking for a serverless solution for running your pipelines. - -## How to deploy it - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including a Vertex AI orchestrator? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML GCP Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -In order to use a Vertex AI orchestrator, you need to first deploy [ZenML to the cloud](/develop/getting-started/deploying-zenml). It would be recommended to deploy ZenML in the same Google Cloud project as where the Vertex infrastructure is deployed, but it is not necessary to do so. You must ensure that you are connected to the remote ZenML server before using this stack component. - -The only other thing necessary to use the ZenML Vertex orchestrator is enabling Vertex-relevant APIs on the Google Cloud project. - -In order to quickly enable APIs, and create other resources necessary for using this integration, you can also consider using [mlstacks](https://mlstacks.zenml.io/vertex), which helps you set up the infrastructure with one click. - -## How to use it - -To use the Vertex orchestrator, we need: - -* The ZenML `gcp` integration installed. If you haven't done so, run - -```bash -zenml integration install gcp -``` -* [Docker](https://www.docker.com) installed and running. 
-* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. -* A [remote container registry](/develop/stack-components/container-registries) as part of your stack. -* [GCP credentials with proper permissions](/develop/stack-components/orchestrators/vertex#gcp-credentials-and-permissions) -* The GCP project ID and location in which you want to run your Vertex AI pipelines. - -### GCP credentials and permissions - -This part is without doubt the most involved part of using the Vertex orchestrator. In order to run pipelines on Vertex AI, you need to have a GCP user account and/or one or more GCP service accounts set up with proper permissions, depending on whether you wish to practice [the principle of least privilege](https://en.wikipedia.org/wiki/Principle%5Fof%5Fleast%5Fprivilege) and distribute permissions across multiple service accounts. - -You also have three different options to provide credentials to the orchestrator: - -* use the [gcloud CLI](https://cloud.google.com/sdk/gcloud) to authenticate locally with GCP -* configure the orchestrator to use a [service account key file](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) to authenticate with GCP by setting the `service_account_path` parameter in the orchestrator configuration. -* (recommended) configure [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) with GCP credentials and then link the Vertex AI Orchestrator stack component to the Service Connector. - -This section [explains the different components and GCP resources](/develop/stack-components/orchestrators/vertex#vertex-ai-pipeline-components) involved in running a Vertex AI pipeline and what permissions they need, then provides instructions for three different configuration use-cases: - -1. [use the local gcloud CLI configured with your GCP user account](/develop/stack-components/orchestrators/vertex#configuration-use-case-local-gcloud-cli-with-user-account), including the ability to schedule pipelines -2. [use a GCP Service Connector and a single service account](/develop/stack-components/orchestrators/vertex#configuration-use-case-gcp-service-connector-with-single-service-account) with all permissions, including the ability to schedule pipelines -3. [use a GCP Service Connector and multiple service accounts](/develop/stack-components/orchestrators/vertex#configuration-use-case-gcp-service-connector-with-different-service-accounts) for different permissions, including the ability to schedule pipelines - -#### Vertex AI pipeline components - -To understand what accounts you need to provision and why, let's look at the different components of the Vertex orchestrator: - -1. _the ZenML client environment_ is the environment where you run the ZenML code responsible for building the pipeline Docker image and submitting the pipeline to Vertex AI, among other things. This is usually your local machine or some other environment used to automate running pipelines, like a CI/CD job. This environment needs to be able to authenticate with GCP and needs to have the necessary permissions to create a job in Vertex Pipelines, (e.g. [the Vertex AI User role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.user)). 
If you are planning to [run pipelines on a schedule](/develop/stack-components/orchestrators/vertex#run-pipelines-on-a-schedule), _the ZenML client environment_ also needs additional permissions: - * the [Storage Object Creator Role](https://cloud.google.com/iam/docs/understanding-roles#storage.objectCreator) to be able to write the pipeline JSON file to the artifact store directly (NOTE: not needed if the Artifact Store is configured with credentials or is linked to Service Connector) -2. _the Vertex AI pipeline environment_ is the GCP environment in which the pipeline steps themselves are running in GCP. The Vertex AI pipeline runs in the context of a GCP service account which we'll call here _the workload service account_. _The workload service account_ can be explicitly configured in the orchestrator configuration via the `workload_service_account` parameter. If it is omitted, the orchestrator will use [the Compute Engine default service account](https://cloud.google.com/compute/docs/access/service-accounts#default%5Fservice%5Faccount) for the GCP project in which the pipeline is running. This service account needs to have the following permissions: - * permissions to run a Vertex AI pipeline, (e.g. [the Vertex AI Service Agent role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.serviceAgent)). - -As you can see, there can be dedicated service accounts involved in running a Vertex AI pipeline. That's two service accounts if you also use a service account to authenticate to GCP in _the ZenML client environment_. However, you can keep it simple and use the same service account everywhere. - -#### Configuration use-case: local `gcloud` CLI with user account - -This configuration use-case assumes you have configured the [gcloud CLI](https://cloud.google.com/sdk/gcloud) to authenticate locally with your GCP account (i.e. by running `gcloud auth login`). It also assumes the following: - -* your GCP account has permissions to create a job in Vertex Pipelines, (e.g. [the Vertex AI User role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.user)). -* [the Compute Engine default service account](https://cloud.google.com/compute/docs/access/service-accounts#default%5Fservice%5Faccount) for the GCP project in which the pipeline is running is updated with additional permissions required to run a Vertex AI pipeline, (e.g. [the Vertex AI Service Agent role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.serviceAgent)). - -This is the easiest way to configure the Vertex AI Orchestrator, but it has the following drawbacks: - -* the setup is not portable on other machines and reproducible by other users. -* it uses the Compute Engine default service account, which is not recommended, given that it has a lot of permissions by default and is used by many other GCP services. - -We can then register the orchestrator as follows: - -```bash -zenml orchestrator register \ - --flavor=vertex \ - --project= \ - --location= \ - --synchronous=true -``` - -#### Configuration use-case: GCP Service Connector with single service account - -This use-case assumes you have already configured a GCP service account with the following permissions: - -* permissions to create a job in Vertex Pipelines, (e.g. [the Vertex AI User role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.user)). -* permissions to run a Vertex AI pipeline, (e.g. 
[the Vertex AI Service Agent role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.serviceAgent)). -* the [Storage Object Creator Role](https://cloud.google.com/iam/docs/understanding-roles#storage.objectCreator) to be able to write the pipeline JSON file to the artifact store directly. - -It also assumes you have already created a service account key for this service account and downloaded it to your local machine (e.g. in a `connectors-vertex-ai-workload.json` file). This is not recommended if you are security-conscious: the principle of least privilege is not applied here, and the environment in which the pipeline steps are running has many permissions that it doesn't need. - -```bash -zenml service-connector register --type gcp --auth-method=service-account --project_id= --service_account_json=@connectors-vertex-ai-workload.json --resource-type gcp-generic - -zenml orchestrator register \ - --flavor=vertex \ - --location= \ - --synchronous=true \ - --workload_service_account=@.iam.gserviceaccount.com - -zenml orchestrator connect --connector -``` - -#### Configuration use-case: GCP Service Connector with different service accounts - -This setup applies the principle of least privilege by using different service accounts with the minimum permissions needed for [the different components involved in running a Vertex AI pipeline](/develop/stack-components/orchestrators/vertex#vertex-ai-pipeline-components). It also uses a GCP Service Connector to make the setup portable and reproducible. This configuration is a best-in-class setup that you would normally use in production, but it requires a lot more work to prepare. - - - This setup involves creating and configuring several GCP service accounts, which is a lot of work and can be error-prone. If you don't really need the added security, you can use [the GCP Service Connector with a single service account](/develop/stack-components/orchestrators/vertex#configuration-use-case-gcp-service-connector-with-single-service-account) instead. - -The following GCP service accounts are needed: - -1. a "client" service account that has the following permissions: - * permissions to create a job in Vertex Pipelines, (e.g. [the Vertex AI User role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.user)). - * permissions to create a Google Cloud Function (e.g. with the [Cloud Functions Developer Role](https://cloud.google.com/functions/docs/reference/iam/roles#cloudfunctions.developer)). - * the [Storage Object Creator Role](https://cloud.google.com/iam/docs/understanding-roles#storage.objectCreator) to be able to write the pipeline JSON file to the artifact store directly (NOTE: not needed if the Artifact Store is configured with credentials or is linked to Service Connector). -2. a "workload" service account that has permissions to run a Vertex AI pipeline, (e.g. [the Vertex AI Service Agent role](https://cloud.google.com/vertex-ai/docs/general/access-control#aiplatform.serviceAgent)). - -A key is also needed for the "client" service account. You can create a key for this service account and download it to your local machine (e.g. in a `connectors-vertex-ai-workload.json` file).
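- -As a rough sketch (assuming the "client" service account already exists and your local `gcloud` CLI is authenticated against the right project; `CLIENT_SA_NAME` and `PROJECT_ID` below are placeholders for your own values), the key file referenced above could be created like this: - -```bash -# Create a JSON key for the "client" service account and save it locally -gcloud iam service-accounts keys create connectors-vertex-ai-workload.json \ - --iam-account=CLIENT_SA_NAME@PROJECT_ID.iam.gserviceaccount.com -```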
- -With all the service accounts and the key ready, we can register [the GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) and Vertex AI orchestrator as follows: - -```bash -zenml service-connector register --type gcp --auth-method=service-account --project_id= --service_account_json=@connectors-vertex-ai-workload.json --resource-type gcp-generic - -zenml orchestrator register \ - --flavor=vertex \ - --location= \ - --synchronous=true \ - --workload_service_account=@.iam.gserviceaccount.com - -zenml orchestrator connect --connector -``` - -### Configuring the stack - -With the orchestrator registered, we can use it in our active stack: - -```bash -# Register and activate a stack with the new orchestrator - -zenml stack register -o ... --set -``` - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your pipeline steps in Vertex AI. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - -You can now run any ZenML pipeline using the Vertex orchestrator: - -```bash -python file_that_runs_a_zenml_pipeline.py -``` - -### Vertex UI - -Vertex comes with its own UI that you can use to find further details about your pipeline runs, such as the logs of your steps. - - - - - -For any runs executed on Vertex, you can get the URL to the Vertex UI in Python using the following code snippet: - -```py -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("") -orchestrator_url = pipeline_run.run_metadata["orchestrator_url"].value -``` - -### Run pipelines on a schedule - -The Vertex Pipelines orchestrator supports running pipelines on a schedule using its [native scheduling capability](https://cloud.google.com/vertex-ai/docs/pipelines/schedule-pipeline-run). - -**How to schedule a pipeline** - -```py -import datetime - -from zenml.config.schedule import Schedule - -# Run a pipeline every 5th minute -pipeline_instance.run( - schedule=Schedule( - cron_expression="*/5 * * * *" - ) -) - -# Run a pipeline every hour -# starting in one day from now and ending in three days from now -pipeline_instance.run( - schedule=Schedule( - cron_expression="0 * * * *", - start_time=datetime.datetime.now() + datetime.timedelta(days=1), - end_time=datetime.datetime.now() + datetime.timedelta(days=3), - ) -) -``` - - - The Vertex orchestrator only supports the `cron_expression`, `start_time` (optional) and `end_time` (optional) parameters in the `Schedule` object, and will ignore all other parameters supplied to define the schedule. - - -The `start_time` and `end_time` timestamp parameters are both optional and are to be specified in local time. They define the time window in which the pipeline runs will be triggered. If they are not specified, the pipeline will run indefinitely. - -The `cron_expression` parameter [supports timezones](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations.schedules). For example, the expression `TZ=Europe/Paris 0 10 * * *` will trigger runs at 10:00 in the Europe/Paris timezone. - -**How to delete a scheduled pipeline** - -Note that ZenML only gets involved to schedule a run, but maintaining the lifecycle of the schedule is the responsibility of the user. - -In order to cancel a scheduled Vertex pipeline, you need to manually delete the schedule in Vertex AI (via the UI or the CLI).
- -### Additional configuration - -For additional configuration of the Vertex orchestrator, you can pass `VertexOrchestratorSettings`, which allows you to configure labels for your Vertex Pipeline jobs or specify which GPU to use. - -```py -from zenml.integrations.gcp.flavors.vertex_orchestrator_flavor import VertexOrchestratorSettings - -vertex_settings = VertexOrchestratorSettings( - labels={"key": "value"} -) -``` - -If your pipeline steps have certain hardware requirements, you can specify them as `ResourceSettings`: - - - -```py -from zenml.config import ResourceSettings - -resource_settings = ResourceSettings(cpu_count=8, memory="16GB") -``` - -To run your pipeline (or some steps of it) on a GPU, you will need to set both a node selector and the GPU count as follows: - - - -```py -vertex_settings = VertexOrchestratorSettings( - pod_settings={ - "node_selectors": { - "cloud.google.com/gke-accelerator": "NVIDIA_TESLA_A100" - }, - } -) -resource_settings = ResourceSettings(gpu_count=1) -``` - -You can find available accelerator types [here](https://cloud.google.com/vertex-ai/docs/training/configure-compute#specifying%5Fgpus). - -These settings can then be specified on either pipeline-level or step-level: - -```py -# Either specify on pipeline-level -@pipeline( - settings={ - "orchestrator": vertex_settings, - "resources": resource_settings, - } -) -def my_pipeline(): - ... - -# OR specify settings on step-level -@step( - settings={ - "orchestrator": vertex_settings, - "resources": resource_settings, - } -) -def my_step(): - ... -``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-gcp/#zenml.integrations.gcp.flavors.vertex%5Forchestrator%5Fflavor.VertexOrchestratorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. - -For more information and a full list of configurable attributes of the Vertex orchestrator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-gcp/#zenml.integrations.gcp.orchestrators.vertex%5Forchestrator.VertexOrchestrator). - -### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. diff --git a/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform.mdx b/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform.mdx deleted file mode 100644 index ee85ef4..0000000 --- a/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform.mdx +++ /dev/null @@ -1,308 +0,0 @@ ---- -title: "Deploy a cloud stack with Terraform" -description: "Deploy a cloud stack using Terraform" -icon: server ---- - -ZenML maintains a collection of [Terraform modules](https://registry.terraform.io/modules/zenml-io/zenml-stack) designed to streamline the provisioning of cloud resources and seamlessly integrate them with ZenML Stacks. These modules simplify the setup process, allowing users to quickly provision cloud resources as well as configure and authorize ZenML to utilize them for running pipelines and other AI/ML operations.
- -By leveraging these Terraform modules, users can ensure a more efficient and scalable deployment of their machine learning infrastructure, ultimately enhancing their development and operational workflows. The modules' implementation can also be used as a reference for creating custom Terraform configurations tailored to specific cloud environments and requirements. - - -Terraform requires you to manage your infrastructure as code yourself. Among other things, this means that you will need to have Terraform installed on your machine and you will need to manually manage the state of your infrastructure. - -If you prefer a more automated approach, you can use [the 1-click stack deployment feature](/develop/stack-components/stack-deployment/deploy-a-cloud-stack) to deploy a cloud stack with ZenML with minimal knowledge of Terraform or cloud infrastructure for that matter. - -If you have the required infrastructure pieces already deployed on your cloud, you can also use [the stack wizard to seamlessly register your stack](/develop/stack-components/stack-deployment/register-a-cloud-stack). - - -## Pre-requisites - -To use this feature, you need a deployed ZenML server instance that is reachable from the cloud provider where you wish to have the stack provisioned (this can't be a local server started via `zenml up`). If you do not already have one set up, you [can register for a free ZenML Pro account](https://cloud.zenml.io/signup) or you can learn about self-hosting a ZenML server [here](/develop/getting-started/deploying-zenml). - -Once you are connected to your deployed ZenML server, you need to create a service account and an API key for it. You will use the API key to give the Terraform module programmatic access to your ZenML server. You can find more about service accounts and API keys [here](/develop/usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account). but the process is as simple as running the following CLI command while connected to your ZenML server: - -```Bash -zenml service-account create -``` - -Example output: - -```bash -$ zenml service-account create terraform-account - -Created service account 'terraform-account'. - -Successfully created API key `default`. - -The API key value is: 'ZENKEY_...' - -Please store it safely as it will not be shown again. - -To configure a ZenML client to use this API key, run: - -zenml connect --url https://842ed6a9-zenml.staging.cloudinfra.zenml.io --api-key \ - - 'ZENKEY_...' -``` - -Finally, you will need the following on the machine where you will be running Terraform: - -* [Terraform](https://www.terraform.io/downloads.html) installed on your machine (version at least 1.9). -* the ZenML Terraform stack modules assume you are already locally authenticated with your cloud provider through the provider's CLI or SDK tool and have permissions to create the resources that the modules will provision. This is different depending on the cloud provider you are using and is covered in the following sections. - -## How to use the Terraform stack deployment modules - -If you are already knowledgeable with using Terraform and the cloud provider where you want to deploy the stack, this process will be straightforward. In a nutshell, you will need to: - -1. 
create a new Terraform configuration file (e.g., `main.tf`), preferably in a new directory, with the content that looks like this (`` can be `aws`, `gcp`, or `azure`): - - - -```Bash -module "zenml_stack" { - source = "zenml-io/zenml-stack/" - version = "x.y.z" - - # Required inputs - zenml_server_url = "https://" - zenml_api_key = "" - # Optional inputs - zenml_stack_name = "" - orchestrator = "" # e.g., "local", "sagemaker", "vertex", "azureml", "skypilot" -} -output "zenml_stack_id" { - value = module.zenml_stack.zenml_stack_id -} -output "zenml_stack_name" { - value = module.zenml_stack.zenml_stack_name -} -``` - -There might be a few additional required or optional inputs depending on the cloud provider you are using. You can find the full list of inputs for each module in the [Terraform Registry](https://registry.terraform.io/modules/zenml-io/zenml-stack) documentation for the relevant module or you can read on in the following sections. - -1. Run the following commands in the directory where you have your Terraform configuration file: - -```Bash -terraform init -terraform apply -``` - - - The directory where you keep the Terraform configuration file and where you run the `terraform` commands is important. This is where Terraform will store the state of your infrastructure. Make sure you do not delete this directory or the state file it contains unless you are sure you no longer need to manage these resources with Terraform or after you have deprovisioned them up with `terraform destroy`. - - -1. Terraform will prompt you to confirm the changes it will make to your cloud infrastructure. If you are happy with the changes, type `yes` and hit enter. -2. Terraform will then provision the resources you have specified in your configuration file. Once the process is complete, you will see a message indicating that the resources have been successfully created and printing out the ZenML stack ID and name: - -```yaml -... -Apply complete! Resources: 15 added, 0 changed, 0 destroyed. - -Outputs: - -zenml_stack_id = "04c65b96-b435-4a39-8484-8cc18f89b991" -zenml_stack_name = "terraform-gcp-588339e64d06" -``` - -At this point, a ZenML stack has also been created and registered with your ZenML server and you can start using it to run your pipelines: - -```Bash -zenml integration install -zenml stack set -``` - -You can find more details specific to the cloud provider of your choice in the next section: - - -The [original documentation for the ZenML AWS Terraform module](https://registry.terraform.io/modules/zenml-io/zenml-stack/aws/latest) contains extensive information about required permissions, inputs, outputs and provisioned resources. This is a summary of the key points from that documentation. - -#### Authentication - -To authenticate with AWS, you need to have [the AWS CLI](https://aws.amazon.com/cli/) installed on your machine and you need to have run `aws configure` to set up your credentials. 
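- -For example (a quick local sanity check rather than part of the module itself; the profile and region values you enter are your own), you can configure and then verify your local AWS credentials like this: - -```bash -# Set up local AWS credentials interactively (access key ID, secret key, default region) -aws configure - -# Verify which AWS identity Terraform will use when provisioning resources -aws sts get-caller-identity -```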
- -#### Example Terraform Configuration - -Here is an example Terraform configuration file for deploying a ZenML stack on AWS: -```bash -module "zenml_stack" { - source = "zenml-io/zenml-stack/aws" - - # Required inputs - zenml_server_url = "https://" - zenml_api_key = "" - - # Optional inputs - region = "" - orchestrator = "" # e.g., "local", "sagemaker", "skypilot" -} -output "zenml_stack_id" { - value = module.zenml_stack.zenml_stack_id -} -output "zenml_stack_name" { - value = module.zenml_stack.zenml_stack_name -} -``` - -#### Stack Components - -The Terraform module will create a ZenML stack configuration with the following components: - -1. an S3 Artifact Store linked to an S3 bucket -2. an ECR Container Registry linked to an ECR repository -3. depending on the `orchestrator` input variable: -4. a local Orchestrator, if `orchestrator` is set to `local`. This can be used in combination with the SageMaker Step Operator to selectively run some steps locally and some on SageMaker. -5. a SageMaker Orchestrator linked to the AWS account, if `orchestrator` is set to `sagemaker` (default) -6. a SkyPilot Orchestrator linked to the AWS account, if `orchestrator` is set to `skypilot` -7. a SageMaker Step Operator linked to the AWS account -8. an AWS Service Connector configured with the IAM role credentials and used to authenticate all ZenML components with your AWS account - -To use the ZenML stack, you will need to install the required integrations: - -* for the local or SageMaker orchestrator: - -```Bash -zenml integration install aws s3 -``` - -* for the SkyPilot orchestrator: - -```Bash -zenml integration install aws s3 skypilot_aws -``` - - - -The [original documentation for the ZenML GCP Terraform module](https://registry.terraform.io/modules/zenml-io/zenml-stack/gcp/latest) contains extensive information about required permissions, inputs, outputs and provisioned resources. This is a summary of the key points from that documentation. - -#### Authentication - -To authenticate with GCP, you need to have [the gcloud CLI](https://cloud.google.com/sdk/gcloud) installed on your machine and you need to have run `gcloud init` or `gcloud auth login` to set up your credentials. - -#### Example Terraform Configuration - -Here is an example Terraform configuration file for deploying a ZenML stack on GCP: - -```bash -module "zenml_stack" { - source = "zenml-io/zenml-stack/gcp" - - # Required inputs - project_id = "" - zenml_server_url = "https://" - zenml_api_key = "" - - # Optional inputs - region = "" - orchestrator = "" # e.g., "local", "vertex", "skypilot" or "airflow" -} -output "zenml_stack_id" { - value = module.zenml_stack.zenml_stack_id -} -output "zenml_stack_name" { - value = module.zenml_stack.zenml_stack_name -} -``` - -#### Stack Components - -The Terraform module will create a ZenML stack configuration with the following components: - -1. a GCP Artifact Store linked to a GCS bucket -2. a GCP Container Registry linked to a Google Artifact Registry -3. depending on the `orchestrator` input variable: -4. a local Orchestrator, if `orchestrator` is set to `local`. This can be used in combination with the Vertex AI Step Operator to selectively run some steps locally and some on Vertex AI. -5. a Vertex AI Orchestrator linked to the GCP project, if `orchestrator` is set to `vertex` (default) -6. a SkyPilot Orchestrator linked to the GCP project, if `orchestrator` is set to `skypilot` -7.
an Airflow Orchestrator linked to the Cloud Composer environment, if `orchestrator` is set to `airflow` -8. a Google Cloud Build Image Builder linked to your GCP project -9. a Vertex AI Step Operator linked to the GCP project -10. a GCP Service Connector configured with the GCP service account credentials or the GCP Workload Identity Provider configuration and used to authenticate all ZenML components with the GCP resources - -To use the ZenML stack, you will need to install the required integrations: - -* for the local and Vertex AI orchestrators: - -```bash -zenml integration install gcp -``` - -* for the SkyPilot orchestrator: - -```bash -zenml integration install gcp skypilot_gcp -``` - -* for the Airflow orchestrator: - -```bash -zenml integration install gcp airflow -``` - - -The [original documentation for the ZenML Azure Terraform module](https://registry.terraform.io/modules/zenml-io/zenml-stack/azure/latest) contains extensive information about required permissions, inputs, outputs and provisioned resources. This is a summary of the key points from that documentation. - -#### Authentication - -To authenticate with Azure, you need to have [the Azure CLI](https://learn.microsoft.com/en-us/cli/azure/) installed on your machine and you need to have run `az login` to set up your credentials. - -#### Example Terraform Configuration - -Here is an example Terraform configuration file for deploying a ZenML stack on Azure: - -```bash -module "zenml_stack" { - source = "zenml-io/zenml-stack/azure" - - # Required inputs - zenml_server_url = "https://" - zenml_api_key = "" - - # Optional inputs - location = "" - orchestrator = "" # e.g., "local", "skypilot_azure" -} -output "zenml_stack_id" { - value = module.zenml_stack.zenml_stack_id -} -output "zenml_stack_name" { - value = module.zenml_stack.zenml_stack_name -} -``` - -#### Stack Components - -The Terraform module will create a ZenML stack configuration with the following components: - -1. an Azure Artifact Store linked to an Azure Storage Account and Blob Container -2. an ACR Container Registry linked to an Azure Container Registry -3. depending on the `orchestrator` input variable: -4. a local Orchestrator, if `orchestrator` is set to `local`. This can be used in combination with the AzureML Step Operator to selectively run some steps locally and some on AzureML. -5. an Azure SkyPilot Orchestrator linked to the Azure subscription, if `orchestrator` is set to `skypilot` (default) -6. an AzureML Orchestrator linked to an AzureML Workspace, if `orchestrator` is set to `azureml` -7. an AzureML Step Operator linked to an AzureML Workspace -8. an Azure Service Connector configured with Azure Service Principal credentials and used to authenticate all ZenML components with the Azure resources - -To use the ZenML stack, you will need to install the required integrations: - -* for the local and AzureML orchestrators: - -```bash -zenml integration install azure -``` - -* for the SkyPilot orchestrator: - -```Bash -zenml integration install azure skypilot_azure -``` - - - -## How to clean up the Terraform stack deployments - -Cleaning up the resources provisioned by Terraform is as simple as running the `terraform destroy` command in the directory where you have your Terraform configuration file. This will remove all the resources that were provisioned by the Terraform module and will also delete the ZenML stack that was registered with your ZenML server.
- -```bash -terraform destroy -``` diff --git a/develop/stack-components/stack-deployment/deploy-a-cloud-stack.mdx b/develop/stack-components/stack-deployment/deploy-a-cloud-stack.mdx deleted file mode 100644 index f7ee975..0000000 --- a/develop/stack-components/stack-deployment/deploy-a-cloud-stack.mdx +++ /dev/null @@ -1,369 +0,0 @@ ---- -title: "Deploy a cloud stack with ZenML" -description: "Deploy a cloud stack from scratch with a single click" -icon: clipboard-list ---- - -In ZenML, the [stack](/develop/user-guide/guides/production-guide/understand-stacks) is a fundamental concept that represents the configuration of your infrastructure. In a normal workflow, creating a stack requires you to first deploy the necessary pieces of infrastructure and then define them as stack components in ZenML with proper authentication. - -Especially in a remote setting, this process can be challenging and time-consuming, and it may create multi-faceted problems. This is why we implemented a feature, that allows you to **deploy the necessary pieces of infrastructure on your selected cloud provider and get you started on remote stack with a single click**. - - -If you prefer to have more control over where and how resources are provisioned in your cloud, you can [use one of our Terraform modules](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) to manage your infrastructure as code yourself. - -If you have the required infrastructure pieces already deployed on your cloud, you can also use [the stack wizard to seamlessly register your stack](/develop/stack-components/stack-deployment/register-a-cloud-stack). - - -## How to use the 1-click deployment tool? - -The first thing that you need in order to use this feature is a deployed instance of ZenML (not a local server via `zenml up`). If you do not already have it set up for you, feel free to learn how to do so [here](/develop/getting-started/deploying-zenml). - -Once you are connected to your deployed ZenML instance, you can use the 1-click deployment tool either through the dashboard or the CLI: - - -In order to create a remote stack over the dashboard go to the stacks page on the dashboard and click "+ New Stack". - - - - - -Since we will be deploying it from scratch, select "New Infrastructure" on the next page: - - - - - - - - - -#### AWS - -If you choose `aws` as your provider, you will see a page where you will have to select a region and a name for your new stack: - - - - -Once the configuration is finished, you will see a deployment page: - - - - - -Clicking on the "Deploy in AWS" button will redirect you to a Cloud Formation page on AWS Console. - - - - - -You will have to log in to your AWS account, review and confirm the pre-filled configuration and create the stack. - - - - -#### GCP - -If you choose `gcp` as your provider, you will see a page where you will have to select a region and a name for your new stack: - - - - - - - - -Once the configuration is finished, you will see a deployment page: - - - - - -Make note of the configuration values provided to you in the ZenML dashboard. You will need these in the next step. - -Clicking on the "Deploy in GCP" button will redirect you to a Cloud Shell session on GCP. - - - - - - - The Cloud Shell session will warn you that the ZenML GitHub repository is untrusted. 
We recommend that you review [the contents of the repository](https://github.com/zenml-io/zenml/tree/main/infra/gcp) and then check the `Trust repo` checkbox to proceed with the deployment, otherwise the Cloud Shell session will not be authenticated to access your GCP projects. You will also get a chance to review the scripts that will be executed in the Cloud Shell session before proceeding. - - - - - -After the Cloud Shell session starts, you will be guided through the process of authenticating with GCP, configuring your deployment, and finally provisioning the resources for your new GCP stack using Deployment Manager. - -First, you will be asked to create or choose an existing GCP project with billing enabled and to configure your terminal with the selected project: - - - - -Next, you will be asked to configure your deployment by pasting the configuration values that were provided to you earlier in the ZenML dashboard. You may need to switch back to the ZenML dashboard to copy these values if you did not do so earlier: - - - - - - - - - - -You can take this opportunity to review the script that will be executed at the next step. You will notice that this script starts by enabling some necessary GCP service APIs and configuring some basic permissions for the service accounts involved in the stack deployment, and then deploys the stack using a GCP Deployment Manager template. You can proceed with the deployment by running the script in your terminal: - - - - -The script will deploy a GCP Deployment Manager template that provisions the necessary resources for your new GCP stack and automatically registers the stack with your ZenML server. You can monitor the progress of the deployment in your GCP console: - - - - -Once the deployment is complete, you may close the Cloud Shell session and return to the ZenML dashboard to view the newly created stack: - - - - - - - - -#### Azure - - - Currently, the 1-click deployment for Azure is only supported in the CLI. We are working on bringing support to the dashboard as well. Stay in touch for further updates. - - - -In order to create a remote stack over the CLI, you can use the following command: - -```Bash -zenml stack deploy -p {aws|gcp|azure} -``` - -#### AWS - -If you choose `aws` as your provider, the command will walk you through deploying a Cloud Formation stack on AWS. It will start by showing some information about the stack that will be created: - - - - -Upon confirmation, the command will redirect you to a Cloud Formation page on AWS Console where you will have to deploy the stack: - - - - -You will have to log in to your AWS account, have permission to deploy an AWS Cloud Formation stack, review and confirm the pre-filled configuration and create the stack. - - - - -The Cloud Formation stack will provision the necessary resources for your new AWS stack and automatically register the stack with your ZenML server. You can monitor the progress of the stack in your AWS console: - - - - -Once the provisioning is complete, you may close the AWS Cloud Formation page and return to the ZenML CLI to view the newly created stack: - - - - -#### GCP - -If you choose `gcp` as your provider, the command will walk you through deploying a Deployment Manager template on GCP. It will start by showing some information about the stack that will be created: - - - - - -Upon confirmation, the command will redirect you to a Cloud Shell session on GCP. - - - - -The Cloud Shell session will warn you that the ZenML GitHub repository is untrusted. 
We recommend that you review [the contents of the repository](https://github.com/zenml-io/zenml/tree/main/infra/gcp) and then check the `Trust repo` checkbox to proceed with the deployment, otherwise the Cloud Shell session will not be authenticated to access your GCP projects. You will also get a chance to review the scripts that will be executed in the Cloud Shell session before proceeding. - - - - -After the Cloud Shell session starts, you will be guided through the process of authenticating with GCP, configuring your deployment, and finally provisioning the resources for your new GCP stack using Deployment Manager. - -First, you will be asked to create or choose an existing GCP project with billing enabled and to configure your terminal with the selected project: - - - - - - -Next, you will be asked to configure your deployment by pasting the configuration values that were provided to you in the ZenML CLI. You may need to switch back to the ZenML CLI to copy these values if you did not do so earlier: - - - - -You can take this opportunity to review the script that will be executed at the next step. You will notice that this script starts by enabling some necessary GCP service APIs and configuring some basic permissions for the service accounts involved in the stack deployment and then deploys the stack using a GCP Deployment Manager template. You can proceed with the deployment by running the script in your terminal: - - - -The script will deploy a GCP Deployment Manager template that provisions the necessary resources for your new GCP stack and automatically registers the stack with your ZenML server. You can monitor the progress of the deployment in your GCP console: - - - - -Once the deployment is complete, you may close the Cloud Shell session and return to the ZenML CLI to view the newly created stack: - - - - - - - - -#### Azure - -If you choose `azure` as your provider, the command will walk you through deploying [the ZenML Azure Stack Terraform module](https://registry.terraform.io/modules/zenml-io/zenml-stack/azure). It will start by showing some information about the stack that will be created: - - - - - -Upon confirmation, the command will redirect you to a Cloud Shell session on Azure. - - - - -After the Cloud Shell session starts, you will have to use Terraform to deploy the stack, as instructed by the CLI. - -First, you will have to open a file named `main.tf` in the Cloud Shell session using the editor of your choice (e.g. `vim`, `nano`) and paste in the Terraform configuration provided by the CLI. You may need to switch back to the ZenML CLI to copy these values if you did not do so earlier: - - - - -The Terraform file is a simple configuration that uses [the ZenML Azure Stack Terraform module](https://registry.terraform.io/modules/zenml-io/zenml-stack/azure) to deploy the necessary resources for your Azure stack and then automatically register the stack with your ZenML server. You can read more about the module and its configuration options in the module's documentation. - -You can proceed with the deployment by running the `terraform init` and `terraform apply` Terraform commands in your terminal: - - - - - - - - -Once the Terraform deployment is complete, you may close the Cloud Shell session and return to the ZenML CLI to view the newly created stack: - - - - - - - - - - - - - -## What will be deployed? 
- -Here is an overview of the infrastructure that the 1-click deployment will prepare for you based on your cloud provider: - - -#### Resources - -* An S3 bucket that will be used as a ZenML Artifact Store. -* An ECR container registry that will be used as a ZenML Container Registry. -* Permissions to use SageMaker as a ZenML Orchestrator. -* An IAM user and IAM role with the minimum necessary permissions to access the resources listed above. -* An AWS access key used to give access to ZenML to connect to the above resources through a ZenML service connector. - -#### Permissions - -The configured IAM service account and AWS access key will grant ZenML the following AWS permissions in your AWS account: - -* S3 Bucket: - * s3:ListBucket - * s3:GetObject - * s3:PutObject - * s3:DeleteObject -* ECR Repository: - * ecr:DescribeRepositories - * ecr:ListRepositories - * ecr:DescribeRegistry - * ecr:BatchGetImage - * ecr:DescribeImages - * ecr:BatchCheckLayerAvailability - * ecr:GetDownloadUrlForLayer - * ecr:InitiateLayerUpload - * ecr:UploadLayerPart - * ecr:CompleteLayerUpload - * ecr:PutImage - * ecr:GetAuthorizationToken -* SageMaker (Client): - * sagemaker:CreatePipeline - * sagemaker:StartPipelineExecution - * sagemaker:DescribePipeline - * sagemaker:DescribePipelineExecution -* SageMaker (Jobs): - * AmazonSageMakerFullAccess - - - -#### Resources - -* A GCS bucket that will be used as a ZenML Artifact Store. -* A GCP Artifact Registry that will be used as a ZenML Container Registry. -* Permissions to use Vertex AI as a ZenML Orchestrator. -* Permissions to use GCP Cloud Builder as a ZenML Image Builder. -* A GCP Service Account with the minimum necessary permissions to access the resources listed above. -* An GCP Service Account access key used to give access to ZenML to connect to the above resources through a ZenML service connector. - -#### Permissions - -The configured GCP service account and its access key will grant ZenML the following GCP permissions in your GCP project: - -* GCS Bucket: - * roles/storage.objectUser -* GCP Artifact Registry: - * roles/artifactregistry.createOnPushWriter -* Vertex AI (Client): - * roles/aiplatform.user -* Vertex AI (Jobs): - * roles/aiplatform.serviceAgent -* Cloud Build (Client): - * roles/cloudbuild.builds.editor - - - -#### Resources - -* An Azure Resource Group to contain all the resources required for the ZenML stack -* An Azure Storage Account and Blob Storage Container that will be used as a ZenML Artifact Store. -* An Azure Container Registry that will be used as a ZenML Container Registry. -* An AzureML Workspace that will be used as a ZenML Orchestrator and ZenML Step Operator. A Key Vault and Application Insights instance will also be created in the same Resource Group and used to construct the AzureML Workspace. -* An Azure Service Principal with the minimum necessary permissions to access the above resources. -* An Azure Service Principal client secret used to give access to ZenML to connect to the above resources through a ZenML service connector. - -#### Permissions - -The configured Azure service principal and its client secret will grant ZenML the following permissions in your Azure subscription: - -* permissions granted for the created Storage Account: - * Storage Blob Data Contributor -* permissions granted for the created Container Registry: - * AcrPull - * AcrPush - * Contributor -* permissions granted for the created AzureML Workspace: - * AzureML Compute Operator - * AzureML Data Scientist - - - -There you have it! 
With a single click, you just deployed a cloud stack, and you can start running your pipelines in a remote setting. - diff --git a/develop/stack-components/stack-deployment/deploy-a-stack-using-mlstacks.mdx b/develop/stack-components/stack-deployment/deploy-a-stack-using-mlstacks.mdx deleted file mode 100644 index de7c6ec..0000000 --- a/develop/stack-components/stack-deployment/deploy-a-stack-using-mlstacks.mdx +++ /dev/null @@ -1,214 +0,0 @@ ---- -title: "Deploy stack/components using mlstacks" -description: "Individually deploying different stack components." -icon: box ---- - -The first step in running your pipelines on remote infrastructure is to deploy all the components that you would need, like an [MLflow tracking server](/develop/stack-components/experiment-trackers/mlflow), [Kubeflow orchestrator](/develop/stack-components/orchestrators/kubeflow), and more to your cloud. - -This can bring plenty of benefits like scalability, reliability, and collaboration. ZenML eases the path to production by providing a seamless way for all tools to interact with each other through the use of abstractions. However, one of the most painful parts of this process, from what we see on our Slack and in general, is the deployment of these stack components. - -To make this process even easier for our users, we have created the `deploy` command in `zenml`, which allows you to quickly get started with a full-fledged MLOps stack using only a few commands. You can choose to deploy individual stack components through the stack-component CLI or deploy a stack with multiple components together (which requires a few more manual steps). - -Check out [the full documentation for the mlstacks package](https://mlstacks.zenml.io/) for more information. - -## When should I deploy something using mlstacks? - - - **MLStacks deploys resources using a Kubernetes cluster, which may be expensive and not for every user. In order to use stacks which are more basic and cheaper on the cloud, read** [**how to easily register a cloud stack**](/develop/stack-components/stack-deployment/register-a-cloud-stack) **if you have existing infrastructure, or read** [**how to deploy a cloud stack in one click**](/develop/stack-components/stack-deployment/deploy-a-cloud-stack) **or** [**how to deploy a cloud stack with Terraform**](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform/README.md)**.** - -Or simply try running one of: - -```Bash -zenml stack register --provider aws -zenml stack deploy --provider aws -``` - - -To answer this question, here are some pros and cons in comparison to the stack-component deploy method, which can help you choose what works best for you! - - -* Offers a lot of flexibility in what you deploy. -* Deploying with `mlstacks` gives you a full MLOps stack as the output. Your components and stack are automatically imported to ZenML. This saves you the effort of manually registering all the components. - - -* Currently only supports AWS, GCP, and K3D as providers. -* Most stack deployments are Kubernetes-based, which might be too heavy for your needs. -* Not all stack components are supported yet. - - - -The ZenML CLI has special subcommands that allow you to deploy individual stack components as well as whole stacks using MLStacks. These stacks will be useful for you if: - -* You are at the start of your MLOps journey, and would like to explore different tools. -* You are looking for guidelines for production-grade deployments.
- -## How does `mlstacks` work? - -MLStacks is built around the concept of a stack specification. A stack specification is a YAML file that describes the stack and includes references to component specification files. A component specification is a YAML file that describes a component. (Currently all deployments of components (in various combinations) must be defined within the context of a stack.) - -ZenML handles the creation of stack specifications for you when you run one of the `deploy` subcommands using the CLI. A valid specification is generated and used by `mlstacks` to deploy your stack using Terraform. The Terraform definitions and state are stored in your global configuration directory along with any state files generated while deploying your stack. - -Your configuration directory could be in a number of different places depending on your operating system, but read more about it in the [Click docs](https://click.palletsprojects.com/en/8.1.x/api/#click.get%5Fapp%5Fdir) to see which location applies to your situation. - -## Installing the mlstacks extra - -To install `mlstacks`, either run `pip install mlstacks` or `pip install "zenml[mlstacks]"` to install it along with ZenML. - -MLStacks uses Terraform on the backend to manage infrastructure. You will need to have Terraform installed. Please visit [the Terraform docs](https://learn.hashicorp.com/tutorials/terraform/install-cli#install-terraform) for installation instructions. - -MLStacks also uses Helm to deploy Kubernetes resources. You will need to have Helm installed. Please visit [the Helm docs](https://helm.sh/docs/intro/install/#from-script) for installation instructions. - -## Deploying a stack - -Deploying an end-to-end stack through the ZenML CLI is only possible with the [deployment wizard which does not use mlstacks](/develop/stack-components/stack-deployment/deploy-a-cloud-stack). However, you can use `mlstacks` directly to deploy various types of stacks and [import them into ZenML](https://mlstacks.zenml.ioversions/0.66.0/reference/zenml). - -```Bash -zenml stack import -f -``` - -## Deploying a stack component - -If you have used ZenML before, you must be familiar with the flow of registering new stack components. It goes something like this: - -```Bash -zenml artifact-store register my_store --flavor=s3 --path=s3://my_bucket -``` - -Commands like these assume that you already have the stack component deployed. In this case, it would mean that you must already have a bucket called `my_bucket` on AWS S3 to be able to use this component. - -We took inspiration from this design to build something that feels natural to use and is also sufficiently powerful to take care of the deployment of the respective stack components for you. This is where the \ `deploy` CLI comes in! - -The `deploy` command allows you to deploy individual components of your MLOps stack with a single command . You can also customize your components easily by passing in flags (more on that later). - - -To install `mlstacks`, either run `pip install mlstacks` or `pip install "zenml[mlstacks]"` to install it along with ZenML. - -MLStacks uses Terraform on the backend to manage infrastructure. You will need to have Terraform installed. Please visit [the Terraform docs](https://learn.hashicorp.com/tutorials/terraform/install-cli#install-terraform) for installation instructions. - -MLStacks also uses Helm to deploy Kubernetes resources. You will need to have Helm installed. 
Please visit [the Helm docs](https://helm.sh/docs/intro/install/#from-script) for installation instructions. - - -For example, to deploy an artifact store on a GCP account, you can run: - -```bash -# after installing mlstacks - -zenml artifact-store deploy -f gcp -p gcp -r us-east1 -x project_id=zenml my_store -``` - -The command above takes in the following parameters: - -* **Name**: The name of the stack component. In this case, it is `my_store`. -* **Flavor:** The flavor of the stack component to deploy. Here, we are deploying an artifact store with the `gcp` flavor. -* **Provider:** The provider to deploy this stack component on. Currently, only **GCP, AWS, and K3D** are supported as providers. -* **Region**: The region to deploy the stack component in. -* **Extra Config:** Some components can be customized by the user and these settings are passed as flags to the command. In the example above, we pass the GCP project ID to select what project to deploy the component to. - -Successful execution of this command does the following: - -* It also automatically registers the deployed stack component with your ZenML server, so you don't have to worry about manually configuring components after the deployment! - - - The command currently uses your local credentials for GCP and AWS to provision resources. Integration with your ZenML connectors might be possible soon too! - - - -The stack component deploy CLI is powered by ZenML's [mlstacks](https://github.com/zenml-io/mlstacks) in the background. This allows you to configure and deploy select stack components. - -Using the values you pass for the cloud, the CLI picks up the right modular recipe to use (one of AWS, GCP, or K3D) and then deploys that recipe with the specific stack component enabled. - - - - -### Destroying a stack component - -Destroying a stack component (i.e. deleting and destroying the underlying infrastructure) is as easy as deploying one. You can run the following command to destroy the artifact store we created above: - -```bash - -zenml artifact-store destroy -p gcp my_store -``` - -This will destroy the deployed infrastructure and prompt you if you also want to remove and deregister the component from your ZenML server. - -### Available flavors for stack components - -Here's a table of all the flavors that can be deployed through the CLI for every stack component. This is a list that will keep on growing and you can also contribute any flavor or stack component that you feel is missing. Refer to the [Contribution page](https://github.com/zenml-io/zenml/blob/release/0.66.0/CONTRIBUTING.md) for steps on how to do that - - -Whenever you pass in a flavor to any stack-component deploy function, the combination of these two parameters is used to construct a variable name in the following format: - -```bash -enable__ -``` - -This variable is then passed as input to the underlying modular recipe. If you check the [variables.tf](https://github.com/zenml-io/mlstacks/blob/main/gcp-modular/variables.tf) file for a given recipe, you can find all the supported flavor-stack component combinations there. 
- - - - - -| Component Type | Flavor(s) | -| ------------------ | ------------------------------------ | -| Artifact Store | s3, gcp, minio | -| Container Registry | aws, gcp | -| Experiment Tracker | mlflow | -| Orchestrator | kubernetes, kubeflow, tekton, vertex | -| MLOps Platform | zenml | -| Model Deployer | seldon | -| Step Operator | sagemaker, vertex | - -#### Customizing your stack components - -With simplicity, we didn't want to compromise on the flexibility that this deployment method allows. As such, we have added the option to pass configuration specific to the stack components as key-value arguments to the deploy CLI. Here is an assortment of all possible configurations that can be set. - - The flags that you pass to the deploy CLI are passed on as-is to the backing modular recipes as input variables. This means that all the flags need to be defined as variables in the respective recipe. - -For example, if you take a look at the [variables.tf](https://github.com/zenml-io/mlstacks/blob/main/gcp-modular/variables.tf) file for a modular recipe, like the `gcp-modular` recipe, you can find variables like `mlflow_bucket` that you could potentially pass in. - -Validation for these flags does not exist yet at the CLI level, so you must be careful in naming them while calling `deploy`. - -All these extra configuration options are passed in with the `-x` option. For example, we already saw this in action above when we passed in the GCP project ID to the artifact store deploy command. - -```bash -zenml artifact-store deploy -f gcp -p gcp -r us-east1 -x project_id=zenml my_store -``` - -Simply pass in as many `-x` flags as you want to customize your stack component. - - - - -**Experiment Trackers** - -You can assign an existing bucket to the MLflow experiment tracker by passing the `-x mlflow_bucket=...` configuration: - -```bash -zenml experiment-tracker deploy mlflow_tracker --flavor=mlflow -p YOUR_DESIRED_PROVIDER -r YOUR_REGION -x mlflow_bucket=gs://my_bucket -``` - -**Artifact Stores** - -For an artifact store, you can pass `bucket_name` as an argument to the command. - -```bash -zenml artifact-store deploy s3_artifact_store --flavor=s3 --provider=aws -r YOUR_REGION -x bucket_name=my_bucket -``` - -**Container Registries** - -For container registries, you can pass the repository name using `repo_name`: - -```bash -zenml container-registry deploy aws_registry --flavor=aws -p aws -r YOUR_REGION -x repo_name=my_repo -``` - -This is only useful for the AWS case since AWS requires a repository to be created before pushing images to it and the deploy command ensures that a repository with the name you provide is created. In case of GCP and other providers, you can choose the repository name at the same time as you are pushing the image via code. This is achieved through setting the `target_repo` attribute of [the DockerSettings object](/develop/usage/project-setup/customize-docker-builds). - -#### Other configuration - -* In the case of GCP components, it is _required_ that you pass a project ID to the command as extra configuration when you're creating any GCP resource. diff --git a/develop/stack-components/stack-deployment/deploying-stacks.mdx b/develop/stack-components/stack-deployment/deploying-stacks.mdx deleted file mode 100644 index 1fe94c8..0000000 --- a/develop/stack-components/stack-deployment/deploying-stacks.mdx +++ /dev/null @@ -1,45 +0,0 @@ ---- -title: "Deploying Stacks" -description: "Stacks are the configuration of your infrastructure." 
-icon: hammer ---- - -The [stack](/develop/user-guide/guides/production-guide/understand-stacks) is a fundamental component of the ZenML framework. Put simply, a stack represents the configuration of the infrastructure and tooling that defines where and how a pipeline executes. - -However, deploying and managing a MLOps stack is tricky . It is not trivial to set up all the different tools that you might need for your pipeline. - -* Each tool comes with a certain set of requirements. For example, a Kubeflow installation will require you to have a Kubernetes cluster, and so would a Seldon Core deployment. -* Figuring out the defaults for infra parameters is not easy. Even if you have identified the backing infra that you need for a stack component, setting up reasonable defaults for parameters like instance size, CPU, memory, etc., needs a lot of experimentation to figure out. -* Many times, standard tool installations don't work out of the box. For example, to run a custom pipeline in Vertex AI, it is not enough to just run an imported pipeline. You might also need a custom service account that is configured to perform tasks like reading secrets from your secret store or talking to other GCP services that your pipeline might need. -* Some tools need an additional layer of installations to enable a more secure, production-grade setup. For example, a standard MLflow tracking server deployment comes without an authentication frontend which might expose all of your tracking data to the world if deployed as-is. -* All the components that you deploy must have the right permissions to be able to talk to each other. When you run your pipeline, it is inevitable that some components would need to communicate with the others. For example, your workloads running in a Kubernetes cluster might require access to the container registry or the code repository, and so on. -* Cleaning up your resources after you're done with your experiments is super important yet very challenging. Many of the components need a range of other resources to work which might slide past your radar if you're not careful. For example, if your Kubernetes cluster has made use of Load Balancers, you might still have one lying around in your account even after deleting the cluster, costing you money and frustration. - -All of these points make taking your pipelines to production a more difficult task than it should be. We believe that the expertise in setting up these often-complex stacks shouldn't be a prerequisite to running your ML pipelines. - -This docs section consists of information that makes it easier to provision, configure, and extend stacks and components in ZenML. - - - - - Description of deploying a cloud stack with ZenML. - - - Description of registering a cloud stack. - - - Description of deploying a cloud stack with Terraform. - - - Deploying an entire stack with ZenML's `mlstacks` package. - - - Description of referencing secrets in stack configuration. - - - Creating your custom stack component solutions. - - - Description of implementing a custom integration. 
- - diff --git a/develop/stack-components/stack-deployment/implement-a-custom-integration.mdx b/develop/stack-components/stack-deployment/implement-a-custom-integration.mdx deleted file mode 100644 index 6fa1a67..0000000 --- a/develop/stack-components/stack-deployment/implement-a-custom-integration.mdx +++ /dev/null @@ -1,127 +0,0 @@ ---- -title: "Implement a custom integration" -description: "Creating an external integration and contributing to ZenML" -icon: plug ---- - - - - - -One of the main goals of ZenML is to find some semblance of order in the ever-growing MLOps landscape. ZenML already provides [numerous integrations](https://zenml.io/integrations) into many popular tools, and allows you to come up with ways to [implement your own stack component flavors](/develop/stack-components/stack-deployment/implement-a-custom-stack-component) in order to fill in any gaps that are remaining. - -_However, what if you want to make your extension of ZenML part of the main codebase, to share it with others?_ If you are such a person, e.g., a tooling provider in the ML/MLOps space, or just want to contribute a tooling integration to ZenML, this guide is intended for you. - -### Step 1: Plan out your integration - -In [the previous page](/develop/stack-components/stack-deployment/implement-a-custom-stack-component), we looked at the categories and abstractions that core ZenML defines. In order to create a new integration into ZenML, you would need to first find the categories that your integration belongs to. The list of categories can be found [here](/develop/stack-components/component-guide) as well. - -Note that one integration may belong to different categories: For example, the cloud integrations (AWS/GCP/Azure) contain [container registries](/develop/stack-components/container-registries), [artifact stores](/develop/stack-components/artifact-stores) etc. - -### Step 2: Create individual stack component flavors - -Each category selected above would correspond to a [stack component type](/develop/stack-components/component-guide). You can now start developing individual stack component flavors for this type by following the detailed instructions on the respective pages. - -Before you package your new components into an integration, you may want to use/test them as a regular custom flavor. For instance, if you are [developing a custom orchestrator](/develop/stack-components/orchestrators/custom) and your flavor class `MyOrchestratorFlavor` is defined in `flavors/my_flavor.py`, you can register it by using: - -```bash -zenml orchestrator flavor register flavors.my_flavor.MyOrchestratorFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root. 
- - -Afterward, you should see the new flavor in the list of available flavors: - -```bash -zenml orchestrator flavor list -``` - -See the docs on extensibility of the different components [here](/develop/stack-components/component-guide) or get inspired by the many integrations that are already implemented such as [the MLflow experiment tracker](/develop/stack-components/experiment-trackers/mlflow). - -### Step 3: Create an integration class - -Once you are finished with your flavor implementations, you can start the process of packaging them into your integration and ultimately the base ZenML package. Follow this checklist to prepare everything: - -**1\. Clone Repo** - -Once your stack components work as a custom flavor, you can now [clone the main zenml repository](https://github.com/zenml-io/zenml) and follow the [contributing guide](https://github.com/zenml-io/zenml/blob/main/CONTRIBUTING.md) to set up your local environment for develop. - -**2\. Create the integration directory** - -All integrations live within [src/zenml/integrations/](https://github.com/zenml-io/zenml/tree/main/src/zenml/integrations) in their own sub-folder. You should create a new folder in this directory with the name of your integration. - -An example integration directory would be structured as follows: - -```bash - -/src/zenml/integrations/ <- ZenML integration directory - <- Root integration directory - | - ├── artifact-stores <- Separated directory for - | ├── __init_.py every type - | └── <- Implementation class for the - | artifact store flavor - ├── flavors - | ├── __init_.py - | └── <- Config class and flavor - | - └── __init_.py <- Integration class -``` -**3\. Define the name of your integration in constants** - -In [zenml/integrations/constants.py](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/constants.py), add: - -```bash -EXAMPLE_INTEGRATION = "" -``` - -This will be the name of the integration when you run: - -```bash - zenml integration install -``` - -**4\. Create the integration class \_\_init\_\_.py** - -In `src/zenml/integrations//init__.py` you must now create a new class, which is a subclass of the `Integration` class, set some important attributes (`NAME` and `REQUIREMENTS`), and overwrite the `flavors` class method. - -```bash -from zenml.integrations.constants import -from zenml.integrations.integration import Integration -from zenml.stack import Flavor - -# This is the flavor that will be used when registering this stack component -# `zenml register ... -f example-orchestrator-flavor` -EXAMPLE_ORCHESTRATOR_FLAVOR = <"example-orchestrator-flavor"> - -# Create a Subclass of the Integration Class -class ExampleIntegration(Integration): - """Definition of Example Integration for ZenML.""" - - NAME = - REQUIREMENTS = [""] - - @classmethod - def flavors(cls) -> List[Type[Flavor]]: - """Declare the stack component flavors for the integration.""" - from zenml.integrations. import - - return [] - -ExampleIntegration.check_installation() # this checks if the requirements are installed -``` - -Have a look at the [MLflow Integration](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/mlflow/%5F%5Finit%5F%5F.py) as an example for how it is done. - -**5. Import in all the right places** - -The Integration itself must be imported within [src/zenml/integrations/\_\_init\_\_.py](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/%5F%5Finit%5F%5F.py). 
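As a rough sketch (the exact style should mirror the existing entries in that file, and `ExampleIntegration` is just the hypothetical integration class from the previous step), the added import could look like this:

```py
# src/zenml/integrations/__init__.py
# Importing the integration class here is what makes it discoverable by ZenML.
from zenml.integrations.example_integration import ExampleIntegration  # noqa: F401
```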
- -### Step 4: Create a PR and celebrate - -You can now [create a PR](https://github.com/zenml-io/zenml/compare) to ZenML and wait for the core maintainers to take a look. Thank you so much for your contribution to the codebase, rock on! diff --git a/develop/stack-components/stack-deployment/implement-a-custom-stack-component.mdx b/develop/stack-components/stack-deployment/implement-a-custom-stack-component.mdx deleted file mode 100644 index 24b8cf2..0000000 --- a/develop/stack-components/stack-deployment/implement-a-custom-stack-component.mdx +++ /dev/null @@ -1,286 +0,0 @@ ---- -title: "Implement a custom stack component" -description: "How to write a custom stack component flavor" -icon: puzzle-piece ---- - -When building a sophisticated MLOps Platform, you will often need to come up with custom-tailored solutions for your infrastructure or tooling. ZenML is built around the values of composability and reusability which is why the stack component flavors in ZenML are designed to be modular and straightforward to extend. - -This guide will help you understand what a flavor is, and how you can develop and use your own custom flavors in ZenML. - -## Understanding component flavors - -In ZenML, a component type is a broad category that defines the functionality of a stack component. Each type can have multiple flavors, which are specific implementations of the component type. For instance, the type `artifact_store` can have flavors like `local`, `s3`, etc. Each flavor defines a unique implementation of functionality that an artifact store brings to a stack. - -## Base Abstractions - -Before we get into the topic of creating custom stack component flavors, let us briefly discuss the three core abstractions related to stack components: the `StackComponent`, the `StackComponentConfig`, and the `Flavor`. - -### Base Abstraction 1: `StackComponent` - -The `StackComponent` is the abstraction that defines the core functionality. As an example, check out the `BaseArtifactStore` definition below: The `BaseArtifactStore` inherits from `StackComponent` and establishes the public interface of all artifact stores. Any artifact store flavor needs to follow the standards set by this base class. - -```py -from zenml.stack import StackComponent - - -class BaseArtifactStore(StackComponent): - """Base class for all ZenML artifact stores.""" - - # --- public interface --- - - @abstractmethod - def open(self, path, mode = "r"): - """Open a file at the given path.""" - - @abstractmethod - def exists(self, path): - """Checks if a path exists.""" - - ... -``` - -As each component defines a different interface, make sure to check out the base class definition of the component type that you want to implement and also check out the [documentation on how to extend specific stack components](/develop/stack-components/stack-deployment/implement-a-custom-stack-component#extending-specific-stack-components). - - - If you would like to automatically track some metadata about your custom stack component with each pipeline run, you can do so by defining some additional methods in your stack component implementation class as shown in the [Tracking Custom Stack Component Metadata](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps) section. - - -See the full code of the base `StackComponent` class [here](https://github.com/zenml-io/zenml/blob/main/src/zenml/stack/stack%5Fcomponent.py#L301). 
- -### Base Abstraction 2: `StackComponentConfig` - -As the name suggests, the `StackComponentConfig` is used to configure a stack component instance. It is separated from the actual implementation on purpose. This way, ZenML can use this class to validate the configuration of a stack component during its registration/update, without having to import heavy (or even non-installed) dependencies. - - - The `config` and `settings` of a stack component are two separate, yet related entities. The `config` is the static part of your flavor's configuration, defined when you register your flavor. The `settings` are the dynamic part of your flavor's configuration that can be overridden at runtime. - - -You can read more about the differences [here](/develop/usage/project-setup/use-configuration-files/runtime-configuration). - -Let us now continue with the base artifact store example from above and take a look at the `BaseArtifactStoreConfig`: - -```py -from zenml.stack import StackComponentConfig - -class BaseArtifactStoreConfig(StackComponentConfig): - """Config class for `BaseArtifactStore`.""" - - path: str - - SUPPORTED_SCHEMES: ClassVar[Set[str]] - - ... -``` - -Through the `BaseArtifactStoreConfig`, each artifact store will require users to define a `path` variable. Additionally, the base config requires all artifact store flavors to define a `SUPPORTED_SCHEMES` class variable that ZenML will use to check if the user-provided `path` is actually supported by the flavor. - -See the full code of the base `StackComponentConfig` class [here](https://github.com/zenml-io/zenml/blob/main/src/zenml/stack/stack%5Fcomponent.py#L44). - -### Base Abstraction 3: `Flavor` - -Finally, the `Flavor` abstraction is responsible for bringing the implementation of a `StackComponent` together with the corresponding `StackComponentConfig` definition and also defines the `name` and `type` of the flavor. As an example, check out the definition of the `local` artifact store flavor below: - -```py -from zenml.enums import StackComponentType -from zenml.stack import Flavor - - -class LocalArtifactStore(BaseArtifactStore): - ... - - -class LocalArtifactStoreConfig(BaseArtifactStoreConfig): - ... - - -class LocalArtifactStoreFlavor(Flavor): - - @property - def name(self) -> str: - """Returns the name of the flavor.""" - return "local" - - @property - def type(self) -> StackComponentType: - """Returns the flavor type.""" - return StackComponentType.ARTIFACT_STORE - - @property - def config_class(self) -> Type[LocalArtifactStoreConfig]: - """Config class of this flavor.""" - return LocalArtifactStoreConfig - - @property - def implementation_class(self) -> Type[LocalArtifactStore]: - """Implementation class of this flavor.""" - return LocalArtifactStore -``` - -See the full code of the base `Flavor` class definition [here](https://github.com/zenml-io/zenml/blob/main/src/zenml/stack/flavor.py#L29). - -## Implementing a Custom Stack Component Flavor - -Let's recap what we just learned by reimplementing the `S3ArtifactStore` from the `aws` integration as a custom flavor. - -We can start with the configuration class: here we need to define the `SUPPORTED_SCHEMES` class variable introduced by the `BaseArtifactStore`. 
We also define several additional configuration values that users can use to configure how the artifact store will authenticate with AWS: - -```py -from zenml.artifact_stores import BaseArtifactStoreConfig -from zenml.utils.secret_utils import SecretField - - -class MyS3ArtifactStoreConfig(BaseArtifactStoreConfig): - """Configuration for the S3 Artifact Store.""" - - SUPPORTED_SCHEMES: ClassVar[Set[str]] = {"s3://"} - - key: Optional[str] = SecretField(default=None) - secret: Optional[str] = SecretField(default=None) - token: Optional[str] = SecretField(default=None) - client_kwargs: Optional[Dict[str, Any]] = None - config_kwargs: Optional[Dict[str, Any]] = None - s3_additional_kwargs: Optional[Dict[str, Any]] = None -``` - - - You can pass sensitive configuration values as [secrets](/develop/usage/project-setup/use-secrets/interact-with-secrets) by defining them as type `SecretField` in the configuration class. - - -With the configuration defined, we can move on to the implementation class, which will use the S3 file system to implement the abstract methods of the `BaseArtifactStore`: - -```py -import s3fs - -from zenml.artifact_stores import BaseArtifactStore - - -class MyS3ArtifactStore(BaseArtifactStore): - """Custom artifact store implementation.""" - - _filesystem: Optional[s3fs.S3FileSystem] = None - - @property - def filesystem(self) -> s3fs.S3FileSystem: - """Get the underlying S3 file system.""" - if self._filesystem: - return self._filesystem - - self._filesystem = s3fs.S3FileSystem( - key=self.config.key, - secret=self.config.secret, - token=self.config.token, - client_kwargs=self.config.client_kwargs, - config_kwargs=self.config.config_kwargs, - s3_additional_kwargs=self.config.s3_additional_kwargs, - ) - return self._filesystem - - def open(self, path, mode: = "r"): - """Custom logic goes here.""" - return self.filesystem.open(path=path, mode=mode) - - def exists(self, path): - """Custom logic goes here.""" - return self.filesystem.exists(path=path) -``` - - - The configuration values defined in the corresponding configuration class are always available in the implementation class under `self.config`. - - -Finally, let's define a custom flavor that brings these two classes together. Make sure that you give your flavor a globally unique name here. - -```py -from zenml.artifact_stores import BaseArtifactStoreFlavor - - -class MyS3ArtifactStoreFlavor(BaseArtifactStoreFlavor): - """Custom artifact store implementation.""" - - @property - def name(self): - """The name of the flavor.""" - return 'my_s3_artifact_store' - - @property - def implementation_class(self): - """Implementation class for this flavor.""" - from ... import MyS3ArtifactStore - - return MyS3ArtifactStore - - @property - def config_class(self): - """Configuration class for this flavor.""" - from ... import MyS3ArtifactStoreConfig - - return MyS3ArtifactStoreConfig -``` - - - For flavors that require additional dependencies, you should make sure to define your implementation, config, and flavor classes in separate Python files and to only import the implementation class inside the `implementation_class` property of the flavor class. Otherwise, ZenML will not be able to load and validate your flavor configuration without the dependencies installed. 
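One possible way to honor this (a hypothetical layout, not a requirement) is to split the classes across modules so that the flavor and config modules stay importable even when the heavy dependency is missing:

```bash
flavors/
├── __init__.py
├── my_s3_config.py          # MyS3ArtifactStoreConfig - no s3fs import needed
├── my_s3_artifact_store.py  # MyS3ArtifactStore       - imports s3fs
└── my_s3_flavor.py          # MyS3ArtifactStoreFlavor - imports the other two lazily
```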
- - -## Managing a Custom Stack Component Flavor - -Once you have defined your implementation, config, and flavor classes, you can register your new flavor through the ZenML CLI: - -```bash -zenml artifact-store flavor register -``` - -Make sure to point to the flavor class via dot notation! - -For example, if your flavor class `MyS3ArtifactStoreFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml artifact-store flavor register flavors.my_flavor.MyS3ArtifactStoreFlavor -``` - -Afterwards, you should see the new custom artifact store flavor in the list of available artifact store flavors: - -```bash -zenml artifact-store flavor list -``` - -And that's it! You now have a custom stack component flavor that you can use in your stacks just like any other flavor you used before, e.g.: - -```bash -zenml artifact-store register \ - --flavor=my_s3_artifact_store \ - --path='some-path' \ - ... - -zenml stack register \ - --artifact-store \ - ... -``` - -## Tips and best practices - -* ZenML resolves the flavor classes by taking the path where you initialized ZenML (via `zenml init`) as the starting point of resolution. Therefore, you and your team should remember to execute `zenml init` in a consistent manner (usually at the root of the repository where the `.git` folder lives). If the `zenml init` command was not executed, the current working directory is used to find implementation classes, which could lead to unexpected behavior. -* You can use the ZenML CLI to find which exact configuration values a specific flavor requires. Check out [this 3-minute video](https://www.youtube.com/watch?v=CQRVSKbBjtQ) for more information. -* You can keep changing the `Config` and `Settings` of your flavor after registration. ZenML will pick up these "live" changes when running pipelines. -* Note that changing the config in a breaking way requires an update of the component (not a flavor). E.g., adding a mandatory name to flavor X field will break a registered component of that flavor. This may lead to a completely broken state where one should delete the component and re-register it. -* Always test your flavor thoroughly before using it in production. Make sure it works as expected and handles errors gracefully. -* Keep your flavor code clean and well-documented. This will make it easier for others to use and contribute to your flavor. -* Follow best practices for the language and libraries you're using. This will help ensure your flavor is efficient, reliable, and easy to maintain. -* We recommend you develop new flavors by using existing flavors as a reference. A good starting point is the flavors defined in the [official ZenML integrations](https://github.com/zenml-io/zenml/tree/main/src/zenml/integrations). 
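As mentioned in the tips above, the CLI can show you the configuration values a flavor expects. For example (assuming your ZenML version ships the `flavor describe` subcommand, and using the hypothetical flavor from this guide):

```bash
zenml artifact-store flavor describe my_s3_artifact_store
```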
- -## Extending Specific Stack Components - -If you would like to learn more about how to build a custom stack component flavor for a specific stack component type, check out the links below: - -| **Type of Stack Component** | **Description** | -| ------------------------------------------------------------------- | ----------------------------------------------------------------- | -| [Orchestrator](/develop/stack-components/orchestrators/custom) | Orchestrating the runs of your pipeline | -| [Artifact Store](/develop/stack-components/artifact-stores/custom) | Storage for the artifacts created by your pipelines | -| [Container Registry](/develop/stack-components/container-registries/custom) | Store for your containers | -| [Step Operator](/develop/stack-components/step-operators/custom) | Execution of individual steps in specialized runtime environments | -| [Model Deployer](/develop/stack-components/model-deployers/custom) | Services/platforms responsible for online model serving | -| [Feature Store](/develop/stack-components/feature-stores/custom) | Management of your data/features | -| [Experiment Tracker](/develop/stack-components/experiment-trackers/custom) | Tracking your ML experiments | -| [Alerter](/develop/stack-components/alerters/custom) | Sending alerts through specified channels | -| [Annotator](/develop/stack-components/annotators/custom) | Annotating and labeling data | -| [Data Validator](/develop/stack-components/data-validators/custom) | Validating and monitoring your data | diff --git a/develop/stack-components/stack-deployment/register-a-cloud-stack.mdx b/develop/stack-components/stack-deployment/register-a-cloud-stack.mdx deleted file mode 100644 index c452046..0000000 --- a/develop/stack-components/stack-deployment/register-a-cloud-stack.mdx +++ /dev/null @@ -1,263 +0,0 @@ ---- -title: "Register a cloud stack" -description: "Seamlessly register a cloud stack by using existing infrastructure" -icon: cloud ---- - -In ZenML, the [stack](/develop/user-guide/guides/production-guide/understand-stacks) is a fundamental concept that represents the configuration of your infrastructure. In a normal workflow, creating a stack requires you to first deploy the necessary pieces of infrastructure and then define them as stack components in ZenML with proper authentication. - -Especially in a remote setting, this process can be challenging and time-consuming, and it may create multi-faceted problems. This is why we implemented a feature called the stack wizard, that allows you to **browse through your existing infrastructure and use it to register a ZenML cloud stack**. - -If you do not have the required infrastructure pieces already deployed on your cloud, you can also use [the 1-click deployment tool to build your cloud stack](/develop/stack-components/stack-deployment/deploy-a-cloud-stack). - -Alternatively, if you prefer to have more control over where and how resources are provisioned in your cloud, you can [use one of our Terraform modules](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) to manage your infrastructure as code yourself. - -## How to use the Stack Wizard? - -The stack wizard is available to you by both our CLI and our dashboard. - - - If you are using the dashboard, the stack wizard is available through the stacks page. - - - - -Here you can click on "+ New Stack" and choose the option "Use existing Cloud". - - - - - -Next, you have to select the cloud provider that you want to work with. 
- - - - -Choose one of the possible authentication methods based on your provider and fill in the required fields. - - - - -
- -There are several different methods to authenticate with AWS: - - - - - - - - - On the dashboard, the stack wizard only works with AWS. We are working on bringing support to GCP and Azure as well. If you would like to use these providers, you can still use the CLI or stay tuned for further updates. - - -From this step forward, ZenML will show you different selections of resources that you can use from your existing infrastructure so that you can create the required stack components such as an artifact store, an orchestrator, and a container registry. - -
- - In order to register a remote stack over the CLI with the stack wizard, you can use the following command: - -```bash -zenml stack register -p {aws|gcp|azure} -``` - -To register the cloud stack, the first thing that the wizard needs is a [service connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide). You can either use an existing connector by providing its ID or name `-sc ` (CLI-Only) or the wizard will create one for you. - - - Similar to the service connector, if you use the CLI, you can also use existing stack components. However, this is only possible if these components are already configured with the same service connector that you provided through the parameter described above. - -#### Define Service Connector - -As the very first step the configuration wizard will check if the selected cloud provider credentials can be acquired automatically from the local environment. If the credentials are found, you will be offered to use them or proceed to manual configuration. - -`Example prompt for AWS auto-configuration` - - -```bash -AWS cloud service connector has detected connection -credentials in your environment. -Would you like to use these credentials or create a new -configuration by providing connection details? [y/n] (y): -``` - -If you decline auto-configuration next you might be offered the list of already created service connectors available on the server: pick one of them and proceed or pick `0` to create a new one. - - -If you select `aws` as your cloud provider, and you haven't selected a connector or declined auto-configuration, you will be prompted to select an authentication method for your cloud connector. - -Available authentication methods for AWS - - -```Available authentication methods for AWS -┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ Choice ┃ Name ┃ Required ┃ -┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ -│ [0] │ AWS Secret Key │ aws_access_key_id (AWS Access │ -│ │ │ Key ID) │ -│ │ │ aws_secret_access_key (AWS │ -│ │ │ Secret Access Key) │ -│ │ │ region (AWS Region) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [1] │ AWS STS Token │ aws_access_key_id (AWS Access │ -│ │ │ Key ID) │ -│ │ │ aws_secret_access_key (AWS │ -│ │ │ Secret Access Key) │ -│ │ │ aws_session_token (AWS │ -│ │ │ Session Token) │ -│ │ │ region (AWS Region) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [2] │ AWS IAM Role │ aws_access_key_id (AWS Access │ -│ │ │ Key ID) │ -│ │ │ aws_secret_access_key (AWS │ -│ │ │ Secret Access Key) │ -│ │ │ region (AWS Region) │ -│ │ │ role_arn (AWS IAM Role ARN) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [3] │ AWS Session Token │ aws_access_key_id (AWS Access │ -│ │ │ Key ID) │ -│ │ │ aws_secret_access_key (AWS │ -│ │ │ Secret Access Key) │ -│ │ │ region (AWS Region) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [4] │ AWS Federation Token │ aws_access_key_id (AWS Access │ -│ │ │ Key ID) │ -│ │ │ aws_secret_access_key (AWS │ -│ │ │ Secret Access Key) │ -│ │ │ region (AWS Region) │ -│ │ │ │ -└─────────┴────────────────────────────────┴────────────────────────────────┘ -``` - - - - -If you select `gcp` as your cloud provider, and you haven't selected a connector or declined auto-configuration, you will be prompted to select an authentication method for your 
cloud connector. - -Available authentication methods for GCP - - -```Bash Available authentication methods for GCP - Available authentication methods for GCP -┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ Choice ┃ Name ┃ Required ┃ -┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ -│ [0] │ GCP User Account │ user_account_json (GCP User │ -│ │ │ Account Credentials JSON │ -│ │ │ optionally base64 encoded.) │ -│ │ │ project_id (GCP Project ID │ -│ │ │ where the target resource is │ -│ │ │ located.) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [1] │ GCP Service Account │ service_account_json (GCP │ -│ │ │ Service Account Key JSON │ -│ │ │ optionally base64 encoded.) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [2] │ GCP External Account │ external_account_json (GCP │ -│ │ │ External Account JSON │ -│ │ │ optionally base64 encoded.) │ -│ │ │ project_id (GCP Project ID │ -│ │ │ where the target resource is │ -│ │ │ located.) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [3] │ GCP Oauth 2.0 Token │ token (GCP OAuth 2.0 Token) │ -│ │ │ project_id (GCP Project ID │ -│ │ │ where the target resource is │ -│ │ │ located.) │ -│ │ │ │ -├─────────┼────────────────────────────────┼────────────────────────────────┤ -│ [4] │ GCP Service Account │ service_account_json (GCP │ -│ │ Impersonation │ Service Account Key JSON │ -│ │ │ optionally base64 encoded.) │ -│ │ │ target_principal (GCP Service │ -│ │ │ Account Email to impersonate) │ -│ │ │ │ -└─────────┴────────────────────────────────┴────────────────────────────────┘ -``` - - - - - -If you select `azure` as your cloud provider, and you haven't selected a connector or declined auto-configuration, you will be prompted to select an authentication method for your cloud connector. - -Available authentication methods for Azure - - -```Bash Available authentication methods for AZURE - Available authentication methods for AZURE -┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ Choice ┃ Name ┃ Required ┃ -┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ -│ [0] │ Azure Service Principal │ client_secret (Service principal │ -│ │ │ client secret) │ -│ │ │ tenant_id (Azure Tenant ID) │ -│ │ │ client_id (Azure Client ID) │ -│ │ │ │ -├────────┼─────────────────────────┼────────────────────────────────────┤ -│ [1] │ Azure Access Token │ token (Azure Access Token) │ -│ │ │ │ -└────────┴─────────────────────────┴────────────────────────────────────┘ -``` - - -#### Defining cloud components - -Next, you will define three major components of your target stack: - -* artifact store -* orchestrator -* container registry - -All three are crucial for a basic cloud stack. Extra components can be added later if they are needed. 
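As a small, optional sketch (assuming a stack registered as `my_cloud_stack` and an already registered experiment tracker named `my_tracker`), such extra components can later be attached to the stack with:

```bash
zenml stack update my_cloud_stack -e my_tracker
```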
- -For each component, you will be asked: - -* if you would like to reuse one of the existing components connected via a defined service connector (if any) - -Example Command Output for available orchestrator - -```Bash - Available orchestrator -┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ Choice ┃ Name ┃ -┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ -│ [0] │ Create a new orchestrator │ -├──────────────────┼────────────────────────────────────────────────────┤ -│ [1] │ existing_orchestrator_1 │ -├──────────────────┼────────────────────────────────────────────────────┤ -│ [2] │ existing_orchestrator_2 │ -└──────────────────┴────────────────────────────────────────────────────┘ -``` - -* to create a new one from available to the service connector resources (if the existing not picked) - -Example Command Output for Artifact Stores - - -```Bash - Available GCP storages -┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ -┃ Choice ┃ Storage ┃ -┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ -│ [0] │ gs://*************************** │ -├───────────────┼───────────────────────────────────────────────────────┤ -│ [1] │ gs://*************************** │ -└───────────────┴───────────────────────────────────────────────────────┘ -``` - -Based on your selection, ZenML will create the stack component and ultimately register the stack for you. - -
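Once the wizard finishes, you can, for example, verify and activate the new stack from the CLI (a minimal sketch, using whatever name you gave the stack during registration):

```bash
# confirm that the new stack shows up
zenml stack list

# make it the active stack
zenml stack set <STACK_NAME>
```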
- - -There you have it! Through the wizard, you just registered a cloud stack and, you can start running your pipelines on a remote setting. diff --git a/develop/stack-components/step-operators/azureml.mdx b/develop/stack-components/step-operators/azureml.mdx deleted file mode 100644 index fb1b306..0000000 --- a/develop/stack-components/step-operators/azureml.mdx +++ /dev/null @@ -1,138 +0,0 @@ ---- -title: "AzureML" -description: "Executing individual steps in AzureML." ---- - -[AzureML](https://azure.microsoft.com/en-us/products/machine-learning/) offers specialized compute instances to run your training jobs and has a comprehensive UI to track and manage your models and logs. ZenML's AzureML step operator allows you to submit individual steps to be run on AzureML compute instances. - -### When to use it - -You should use the AzureML step operator if: - -* one or more steps of your pipeline require computing resources (CPU, GPU, memory) that are not provided by your orchestrator. -* you have access to AzureML. If you're using a different cloud provider, take a look at the [SageMaker](/develop/stack-components/step-operators/sagemaker) or [Vertex](/develop/stack-components/step-operators/vertex) step operators. - -### How to deploy it - - - Would you like to skip ahead and deploy a full ZenML cloud stack already, including an AzureML step operator? Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML Azure Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack component. - - -* Create a `Machine learning` [workspace on Azure](https://docs.microsoft.com/en-us/azure/machine-learning/quickstart-create-resources). This should include an Azure container registry and an Azure storage account that will be used as part of your stack. -* (Optional) Once your resource is created, you can head over to the `Azure Machine Learning Studio` and [create a compute instance or cluster](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-compute-instance?view=azureml-api-2&tabs=python) to run your pipelines. If omitted, the AzureML step operator will use the serverless compute target or will provision a new compute target on the fly, depending on the settings used to configure the step operator. -* (Optional) Create a [Service Principal](https://docs.microsoft.com/en-us/azure/developer/java/sdk/identity-service-principal-auth) for authentication. This is required if you intend to use a service connector to authenticate your step operator. - -### How to use it - -To use the AzureML step operator, we need: - -* The ZenML `azure` integration installed. If you haven't done so, run - -```bash -zenml integration install azure -``` -* [Docker](https://www.docker.com) installed and running. -* An [Azure container registry](/develop/stack-components/container-registries/azure) as part of your stack. Take a look [here](/develop/stack-components/container-registries/azure#how-to-deploy-it) for a guide on how to set that up. -* An [Azure artifact store](/develop/stack-components/artifact-stores/azure) as part of your stack. This is needed so that both your orchestration environment and AzureML can read and write step artifacts. 
Take a look [here](/develop/stack-components/container-registries/azure#how-to-deploy-it) for a guide on how to set that up. -* An AzureML workspace and an optional compute cluster. Note that the AzureML workspace can share the Azure container registry and Azure storage account that are required above. See the [deployment section](/develop/stack-components/step-operators/azureml#how-to-deploy-it) for detailed instructions. - -There are two ways you can authenticate your step operator to be able to run steps on Azure: - - - The recommended way to authenticate your AzureML step operator is by registering or using an existing [Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector) and connecting it to your AzureML step operator. The credentials configured for the connector must have permissions to create and manage AzureML jobs (e.g. [the AzureML Data Scientist and AzureML Compute Operator managed roles](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-assign-roles?view=azureml-api-2&tabs=team-lead)). The AzureML step operator uses the `azure-generic` resource type, so make sure to configure the connector accordingly: - - ```bash - zenml service-connector register --type azure -i - zenml step-operator register \ - --flavor=azureml \ - --subscription_id= \ - --resource_group= \ - --workspace_name= \ - # --compute_target_name= # optionally specify an existing compute target - - zenml step-operator connect --connector - zenml stack register -s ... --set - ``` - - - -If you don't connect your step operator to a service connector: - -* If using a [local orchestrator](/develop/stack-components/orchestrators/local): ZenML will try to implicitly authenticate to Azure via the local [Azure CLI configuration](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli-interactively). Make sure the Azure CLI has permissions to create and manage AzureML jobs (e.g. [the AzureML Data Scientist and AzureML Compute Operator managed roles](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-assign-roles?view=azureml-api-2&tabs=team-lead)). -* If using a remote orchestrator: the remote environment in which the orchestrator runs needs to be able to implicitly authenticate to Azure and have permissions to create and manage AzureML jobs. This is only possible if the orchestrator is also running in Azure and uses a form of implicit workload authentication like a service role. If this is not the case, you will need to use a service connector. - -```bash -zenml step-operator register \ - --flavor=azureml \ - --subscription_id= \ - --resource_group= \ - --workspace_name= \ -# --compute_target_name= # optionally specify an existing compute target - -zenml stack register -s ... --set -``` - - - - -Once you added the step operator to your active stack, you can use it to execute individual steps of your pipeline by specifying it in the `@step` decorator as follows: - -```py -from zenml import step - - -@step(step_operator=) -def trainer(...) -> ...: - """Train a model.""" - # This step will be executed in AzureML. -``` - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your steps in AzureML. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. 
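For instance (a minimal sketch, with the extra requirement and the placeholder names purely illustrative), you could customize the image built for this step via `DockerSettings`:

```py
from zenml import step
from zenml.config import DockerSettings

# Extra pip requirements to bake into the image used by the step operator.
docker_settings = DockerSettings(requirements=["scikit-learn"])


@step(
    step_operator="<STEP_OPERATOR_NAME>",
    settings={"docker": docker_settings},
)
def trainer() -> None:
    """Runs on AzureML inside the customized image."""
    ...
```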
#### Additional configuration

The ZenML AzureML step operator comes with a dedicated settings class called `AzureMLStepOperatorSettings` that controls the compute resources used for step execution in AzureML.

Currently, it supports three different modes of operation:

1. Serverless Compute (Default)
* Set `mode` to `serverless`.
* Other parameters are ignored.
2. Compute Instance
* Set `mode` to `compute-instance`.
* Requires a `compute_name`.
  * If a compute instance with the same name exists, it uses the existing compute instance and ignores other parameters.
  * If a compute instance with the same name doesn't exist, it creates a new compute instance with the `compute_name`. For this process, you can specify `compute_size` and `idle_time_before_shutdown_minutes`.
3. Compute Cluster
* Set `mode` to `compute-cluster`.
* Requires a `compute_name`.
  * If a compute cluster with the same name exists, it uses the existing cluster and ignores other parameters.
  * If a compute cluster with the same name doesn't exist, it creates a new compute cluster. Additional parameters can be used for configuring this process.

Here is an example of how you can use the `AzureMLStepOperatorSettings` to define a compute instance:

```py
from zenml import step
from zenml.integrations.azure.flavors import AzureMLStepOperatorSettings

azureml_settings = AzureMLStepOperatorSettings(
    mode="compute-instance",
    compute_name="MyComputeInstance",
    compute_size="Standard_NC6s_v3",
)

@step(
    settings={
        "step_operator": azureml_settings
    }
)
def my_azureml_step():
    # YOUR STEP CODE
    ...
```

You can check out the [AzureMLStepOperatorSettings SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-azure/#zenml.integrations.azure.flavors.azureml%5Fstep%5Foperator%5Fflavor.AzureMLStepOperatorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings.

#### Enabling CUDA for GPU-backed hardware

Note that if you wish to use this step operator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. This requires some additional settings customization and is essential for enabling CUDA so that the GPU can deliver its full acceleration.

diff --git a/develop/stack-components/step-operators/custom.mdx b/develop/stack-components/step-operators/custom.mdx
deleted file mode 100644
index 01df4be..0000000
--- a/develop/stack-components/step-operators/custom.mdx
+++ /dev/null
@@ -1,122 +0,0 @@
---
title: "Develop a Custom Step Operator"
description: "Learning how to develop a custom step operator."
---

Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our [general guide to writing custom component flavors in ZenML](/develop/stack-components/stack-deployment/implement-a-custom-stack-component). This guide provides an essential understanding of ZenML's component flavor concepts.

### Base Abstraction

The `BaseStepOperator` is the abstract base class that needs to be subclassed in order to run specific steps of your pipeline in a separate environment.
As step operators can come in many shapes and forms, the base class exposes a deliberately basic and generic interface: - -```py -from abc import ABC, abstractmethod -from typing import List, Type - -from zenml.enums import StackComponentType -from zenml.stack import StackComponent, StackComponentConfig, Flavor -from zenml.config.step_run_info import StepRunInfo - - -class BaseStepOperatorConfig(StackComponentConfig): - """Base config for step operators.""" - - -class BaseStepOperator(StackComponent, ABC): - """Base class for all ZenML step operators.""" - - @abstractmethod - def launch( - self, - info: StepRunInfo, - entrypoint_command: List[str], - ) -> None: - """Abstract method to execute a step. - - Subclasses must implement this method and launch a **synchronous** - job that executes the `entrypoint_command`. - - Args: - info: Information about the step run. - entrypoint_command: Command that executes the step. - """ - - -class BaseStepOperatorFlavor(Flavor): - """Base class for all ZenML step operator flavors.""" - - @property - @abstractmethod - def name(self) -> str: - """Returns the name of the flavor.""" - - @property - def type(self) -> StackComponentType: - """Returns the flavor type.""" - return StackComponentType.STEP_OPERATOR - - @property - def config_class(self) -> Type[BaseStepOperatorConfig]: - """Returns the config class for this flavor.""" - return BaseStepOperatorConfig - - @property - @abstractmethod - def implementation_class(self) -> Type[BaseStepOperator]: - """Returns the implementation class for this flavor.""" -``` - - - This is a slimmed-down version of the base implementation which aims to highlight the abstraction layer. In order to see the full implementation and get the complete docstrings, please check the [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-step%5Foperators/#zenml.step%5Foperators.base%5Fstep%5Foperator.BaseStepOperator) . - - -### Build your own custom step operator - -If you want to create your own custom flavor for a step operator, you can follow the following steps: - -1. Create a class that inherits from the `BaseStepOperator` class and implement the abstract `launch` method. This method has two main responsibilities: - * Preparing a suitable execution environment (e.g. a Docker image): The general environment is highly dependent on the concrete step operator implementation, but for ZenML to be able to run the step it requires you to install some `pip` dependencies. The list of requirements needed to successfully execute the step can be found via the Docker settings `info.pipeline.docker_settings` passed to the `launch()` method. Additionally, you'll have to make sure that all the source code of your ZenML step and pipeline are available within this execution environment. - * Running the entrypoint command: Actually running a single step of a pipeline requires knowledge of many ZenML internals and is implemented in the `zenml.step_operators.step_operator_entrypoint_configuration` module. As long as your environment was set up correctly (see the previous bullet point), you can run the step using the command provided via the `entrypoint_command` argument of the `launch()` method. -2. If your step operator allows the specification of per-step resources, make sure to handle the resources defined on the step (`info.config.resource_settings`) that was passed to the `launch()` method. -3. 
If you need to provide any configuration, create a class that inherits from the `BaseStepOperatorConfig` class adds your configuration parameters. -4. Bring both the implementation and the configuration together by inheriting from the `BaseStepOperatorFlavor` class. Make sure that you give a `name` to the flavor through its abstract property. - -Once you are done with the implementation, you can register it through the CLI. Please ensure you **point to the flavor class via dot notation**: - -```bash -zenml step-operator flavor register -``` - -For example, if your flavor class `MyStepOperatorFlavor` is defined in `flavors/my_flavor.py`, you'd register it by doing: - -```bash -zenml step-operator flavor register flavors.my_flavor.MyStepOperatorFlavor -``` - - -ZenML resolves the flavor class by taking the path where you initialized zenml (via `zenml init`) as the starting point of resolution. Therefore, please ensure you follow [the best practice](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) of initializing zenml at the root of your repository. - -If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually, it's better to not have to rely on this mechanism and initialize zenml at the root. - - -Afterward, you should see the new flavor in the list of available flavors: - -```py -zenml step-operator flavor list -``` - - -It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow. - -* The **CustomStepOperatorFlavor** class is imported and utilized upon the creation of the custom flavor through the CLI. -* The **CustomStepOperatorConfig** class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As `Config` objects are inherently `pydantic` objects, you can also add your own custom validators here. -* The **CustomStepOperator** only comes into play when the component is ultimately in use. - -The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the `CustomStepOperatorFlavor` and the `CustomStepOperatorConfig` are implemented in a different module/path than the actual `CustomStepOperator`). - - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use your custom step operator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. diff --git a/develop/stack-components/step-operators/kubernetes.mdx b/develop/stack-components/step-operators/kubernetes.mdx deleted file mode 100644 index 02cceba..0000000 --- a/develop/stack-components/step-operators/kubernetes.mdx +++ /dev/null @@ -1,227 +0,0 @@ ---- -title: "Kubernetes" -description: "Executing individual steps in Kubernetes Pods." ---- - -ZenML's Kubernetes step operator allows you to submit individual steps to be run on Kubernetes pods. 
- -### When to use it - -You should use the Kubernetes step operator if: - -* one or more steps of your pipeline require computing resources (CPU, GPU, memory) that are not provided by your orchestrator. -* you have access to a Kubernetes cluster. - -### How to deploy it - -The Kubernetes step operator requires a Kubernetes cluster in order to run. There are many ways to deploy a Kubernetes cluster using different cloud providers or on your custom infrastructure, and we can't possibly cover all of them, but you can check out our cloud guide. - -### How to use it - -To use the Kubernetes step operator, we need: - -* The ZenML `kubernetes` integration installed. If you haven't done so, run - -```bash -zenml integration install kubernetes -``` -* A Kubernetes cluster [deployed](/develop/stack-components/step-operators/kubernetes#how-to-deploy-it) -* Either [Docker](https://www.docker.com) installed and running or a remote [image builder](/develop/stack-components/image-builders) in your stack. -* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. This is needed so that both your orchestration environment and Kubernetes Pods can read and write step artifacts. Check out the documentation page of the artifact store you want to use for more information on how to set that up and configure authentication for it. - - - It is recommended that you set up [a Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) and use it to connect the Kubernetes step operator to the Kubernetes cluster, especially if you are using a Kubernetes cluster managed by a cloud provider like AWS, GCP or Azure. - - -We can then register the step operator and use it in our stacks. This can be done in two ways: - -1. Using a Service Connector configured to access the remote Kubernetes cluster. Depending on your cloud provider, this should be either an [AWS](/develop/usage/resource-data-management/auth-management/aws-service-connector), [Azure](/develop/usage/resource-data-management/auth-management/azure-service-connector) or [GCP](/develop/usage/resource-data-management/auth-management/gcp-service-connector) service connector. If you're using a Kubernetes cluster that is not provided by any of these, you can use the generic [Kubernetes](/develop/usage/resource-data-management/auth-management/kubernetes-service-connector) service connector. You can then [connect the stack component to the Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide#connect-stack-components-to-resources): - -```bash -$ zenml step-operator register --flavor kubernetes -Running with active stack: 'default' (repository) -Successfully registered step operator ``. 
- -$ zenml service-connector list-resources --resource-type kubernetes-cluster -e -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ e33c9fac-5daa-48b2-87bb-0187d3782cde │ aws-iam-multi-eu │ 🔶 aws │ 🌀 kubernetes-cluster │ kubeflowmultitenant ┃ -┃ │ │ │ │ zenbox ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ -┃ 1c54b32a-4889-4417-abbd-42d3ace3d03a │ gcp-sa-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛ - -$ zenml step-operator connect --connector aws-iam-multi-us -Running with active stack: 'default' (repository) -Successfully connected step_operator `` to the following resources: -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ -┠──────────────────────────────────────┼──────────────────┼────────────────┼───────────────────────┼──────────────────┨ -┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -``` -2. Using the local Kubernetes `kubectl` client. This client needs to be configured with a configuration context pointing to the remote cluster. The `kubernetes_context` configuration attribute must also be configured with the value of that context: - -```bash -zenml step-operator register \ - --flavor=kubernetes \ - --kubernetes_context= -``` - -We can then use the registered step operator in our active stack: - -```bash -# Add the step operator to the active stack -zenml stack update -s -``` - -Once you added the step operator to your active stack, you can use it to execute individual steps of your pipeline by specifying it in the `@step` decorator as follows: - -```bash -from zenml import step - - -@step(step_operator=) -def trainer(...) -> ...: - """Train a model.""" - # This step will be executed in Kubernetes. -``` - - - ZenML will build a Docker images which includes your code and use it to run your steps in Kubernetes. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - - -#### Interacting with pods via kubectl - -For debugging, it can sometimes be handy to interact with the Kubernetes pods directly via kubectl. To make this easier, we have added the following labels to all pods: - -* `run`: the name of the ZenML run. -* `pipeline`: the name of the ZenML pipeline associated with this run. 
E.g., you can use these labels to manually delete all pods related to a specific pipeline:

```bash
kubectl delete pod -n zenml -l pipeline=kubernetes_example_pipeline
```

#### Additional configuration

For additional configuration of the Kubernetes step operator, you can pass `KubernetesStepOperatorSettings`, which allows you to configure (among others) the following attributes:

* `pod_settings`: Node selectors, labels, affinity, tolerations, and image pull secrets to apply to the Kubernetes Pods. These can be specified either using the Kubernetes model objects or as dictionaries.
* `service_account_name`: The name of the service account to use for the Kubernetes Pods.

```py
from zenml import step
from zenml.integrations.kubernetes.flavors import KubernetesStepOperatorSettings
from kubernetes.client.models import V1Toleration

kubernetes_settings = KubernetesStepOperatorSettings(
    pod_settings={
        "node_selectors": {
            "cloud.google.com/gke-nodepool": "ml-pool",
            "kubernetes.io/arch": "amd64"
        },
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": "gpu-type",
                                    "operator": "In",
                                    "values": ["nvidia-tesla-v100", "nvidia-tesla-p100"]
                                }
                            ]
                        }
                    ]
                }
            }
        },
        "tolerations": [
            V1Toleration(
                key="gpu",
                operator="Equal",
                value="present",
                effect="NoSchedule"
            ),
            V1Toleration(
                key="high-priority",
                operator="Exists",
                effect="PreferNoSchedule"
            )
        ],
        "resources": {
            "requests": {
                "cpu": "2",
                "memory": "4Gi",
                "nvidia.com/gpu": "1"
            },
            "limits": {
                "cpu": "4",
                "memory": "8Gi",
                "nvidia.com/gpu": "1"
            }
        },
        "annotations": {
            "prometheus.io/scrape": "true",
            "prometheus.io/port": "8080"
        },
        "volumes": [
            {
                "name": "data-volume",
                "persistentVolumeClaim": {
                    "claimName": "ml-data-pvc"
                }
            },
            {
                "name": "config-volume",
                "configMap": {
                    "name": "ml-config"
                }
            }
        ],
        "volume_mounts": [
            {
                "name": "data-volume",
                "mountPath": "/mnt/data"
            },
            {
                "name": "config-volume",
                "mountPath": "/etc/ml-config",
                "readOnly": True
            }
        ],
        "host_ipc": True,
        "image_pull_secrets": ["regcred", "gcr-secret"],
        "labels": {
            "app": "ml-pipeline",
            "environment": "production",
            "team": "data-science"
        }
    },
    kubernetes_namespace="ml-pipelines",
    service_account_name="zenml-pipeline-runner"
)

@step(
    settings={
        "step_operator": kubernetes_settings
    }
)
def my_kubernetes_step():
    ...
```

Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kubernetes/#zenml.integrations.kubernetes.flavors.kubernetes%5Fstep%5Foperator%5Fflavor.KubernetesStepOperatorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings.

For more information and a full list of configurable attributes of the Kubernetes step operator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kubernetes/#zenml.integrations.kubernetes.step%5Foperators.kubernetes%5Fstep%5Foperator.KubernetesStepOperator).

#### Enabling CUDA for GPU-backed hardware

Note that if you wish to use this step operator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works.
This requires some additional settings customization and is essential for enabling CUDA so that the GPU can deliver its full acceleration.

diff --git a/develop/stack-components/step-operators/sagemaker.mdx b/develop/stack-components/step-operators/sagemaker.mdx
deleted file mode 100644
index 5d75a38..0000000
--- a/develop/stack-components/step-operators/sagemaker.mdx
+++ /dev/null
@@ -1,106 +0,0 @@
---
title: "Amazon SageMaker"
description: "Executing individual steps in SageMaker."
---

[SageMaker](https://aws.amazon.com/sagemaker/) offers specialized compute instances to run your training jobs and has a comprehensive UI to track and manage your models and logs. ZenML's SageMaker step operator allows you to submit individual steps to be run on SageMaker compute instances.

### When to use it

You should use the SageMaker step operator if:

* one or more steps of your pipeline require computing resources (CPU, GPU, memory) that are not provided by your orchestrator.
* you have access to SageMaker. If you're using a different cloud provider, take a look at the [Vertex](/develop/stack-components/step-operators/vertex) or [AzureML](/develop/stack-components/step-operators/azureml) step operators.

### How to deploy it

* Create a role in the IAM console that you want the jobs running in SageMaker to assume. This role should at least have the `AmazonS3FullAccess` and `AmazonSageMakerFullAccess` policies applied. Check [here](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html#sagemaker-roles-create-execution-role) for a guide on how to set up this role.

#### Infrastructure Deployment

A SageMaker step operator can be deployed directly from the ZenML CLI:

```bash
zenml orchestrator deploy sagemaker_step_operator --flavor=sagemaker --provider=aws ...
```

You can pass other configurations specific to the stack components as key-value arguments. If you don't provide a name, a random one is generated for you. For more information about how to use the CLI for this, please refer to the dedicated documentation section.

### How to use it

To use the SageMaker step operator, we need:

* The ZenML `aws` integration installed. If you haven't done so, run

```bash
zenml integration install aws
```
* [Docker](https://www.docker.com) installed and running.
* An IAM role with the correct permissions. See the [deployment section](/develop/stack-components/step-operators/sagemaker#how-to-deploy-it) for detailed instructions.
* An [AWS container registry](/develop/stack-components/container-registries/aws) as part of your stack. Take a look [here](/develop/stack-components/container-registries/aws#how-to-deploy-it) for a guide on how to set that up.
* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. This is needed so that both your orchestration environment and SageMaker can read and write step artifacts. Check out the documentation page of the artifact store you want to use for more information on how to set that up and configure authentication for it.
* An instance type that we want to execute our steps on. See [here](https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-instance-types.html) for a list of available instance types.
* (Optional) An experiment that is used to group SageMaker runs. Check [this guide](https://docs.aws.amazon.com/sagemaker/latest/dg/experiments-create.html) to see how to create an experiment.
- -There are two ways you can authenticate your orchestrator to AWS to be able to run steps on SageMaker: - - - The recommended way to authenticate your SageMaker step operator is by registering or using an existing [AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector) and connecting it to your SageMaker step operator. The credentials configured for the connector must have permissions to create and manage SageMaker runs (e.g. [the AmazonSageMakerFullAccess managed policy](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html) permissions). The SageMaker step operator uses these `aws-generic` resource type, so make sure to configure the connector accordingly: - -```bash -zenml service-connector register --type aws -i -zenml step-operator register \ - --flavor=sagemaker \ - --role= \ - --instance_type= \ -# --experiment_name= # optionally specify an experiment to assign this run to - -zenml step-operator connect --connector -zenml stack register -s ... --set -``` - - -If you don't connect your step operator to a service connector: - -* If using a [local orchestrator](/develop/stack-components/orchestrators/local): ZenML will try to implicitly authenticate to AWS via the `default` profile in your local [AWS configuration file](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html). Make sure this profile has permissions to create and manage SageMaker runs (e.g. [the AmazonSageMakerFullAccess managed policy](https://docs.aws.amazon.com/sagemaker/latest/dg/security-iam-awsmanpol.html) permissions). -* If using a remote orchestrator: the remote environment in which the orchestrator runs needs to be able to implicitly authenticate to AWS and assume the IAM role specified when registering the SageMaker step operator. This is only possible if the orchestrator is also running in AWS and uses a form of implicit workload authentication like the IAM role of an EC2 instance. If this is not the case, you will need to use a service connector. - -```bash -zenml step-operator register \ - --flavor=sagemaker \ - --role= \ - --instance_type= \ -# --experiment_name= # optionally specify an experiment to assign this run to - -zenml stack register -s ... --set -python run.py # Authenticates with `default` profile in `~/.aws/config` -``` - - - -Once you added the step operator to your active stack, you can use it to execute individual steps of your pipeline by specifying it in the `@step` decorator as follows: - -```py -from zenml import step - - -@step(step_operator= ) -def trainer(...) -> ...: - """Train a model.""" - # This step will be executed in SageMaker. -``` - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your steps in SageMaker. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - - -#### Additional configuration - -For additional configuration of the SageMaker step operator, you can pass `SagemakerStepOperatorSettings` when defining or running your pipeline. Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.flavors.sagemaker%5Fstep%5Foperator%5Fflavor.SagemakerStepOperatorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. 
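As a rough sketch of the pattern (the `environment` attribute and the import path used here are assumptions based on how the other integrations expose their settings, so verify them against the SDK docs linked above):

```py
from zenml import step
from zenml.integrations.aws.flavors import SagemakerStepOperatorSettings

# The attribute below is illustrative only; consult the SDK docs for the
# attributes that SagemakerStepOperatorSettings actually supports.
sagemaker_settings = SagemakerStepOperatorSettings(
    environment={"MY_ENV_VAR": "value"},
)

@step(
    step_operator="<step-operator-name>",  # hypothetical placeholder name
    settings={"step_operator": sagemaker_settings},
)
def trainer() -> None:
    """Train a model on SageMaker with custom step operator settings."""
    ...
```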
- -For more information and a full list of configurable attributes of the SageMaker step operator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.step%5Foperators.sagemaker%5Fstep%5Foperator.SagemakerStepOperator) . - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this step operator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. - diff --git a/develop/stack-components/step-operators/spark-kubernetes.mdx b/develop/stack-components/step-operators/spark-kubernetes.mdx deleted file mode 100644 index 8569970..0000000 --- a/develop/stack-components/step-operators/spark-kubernetes.mdx +++ /dev/null @@ -1,345 +0,0 @@ ---- -title: "Spark" -description: "Executing individual steps on Spark" ---- - -The `spark` integration brings two different step operators: - -* **Step Operator**: The `SparkStepOperator` serves as the base class for all the Spark-related step operators. -* **Step Operator**: The `KubernetesSparkStepOperator` is responsible for launching ZenML steps as Spark applications with Kubernetes as a cluster manager. - -## Step Operators: `SparkStepOperator` - -A summarized version of the implementation can be summarized in two parts. First, the configuration: - -```py -from typing import Optional, Dict, Any -from zenml.step_operators import BaseStepOperatorConfig - - -class SparkStepOperatorConfig(BaseStepOperatorConfig): - """Spark step operator config. - - Attributes: - master: is the master URL for the cluster. You might see different - schemes for different cluster managers which are supported by Spark - like Mesos, YARN, or Kubernetes. Within the context of this PR, - the implementation supports Kubernetes as a cluster manager. - deploy_mode: can either be 'cluster' (default) or 'client' and it - decides where the driver node of the application will run. - submit_kwargs: is the JSON string of a dict, which will be used - to define additional params if required (Spark has quite a - lot of different parameters, so including them, all in the step - operator was not implemented). 
- """ - - master: str - deploy_mode: str = "cluster" - submit_kwargs: Optional[Dict[str, Any]] = None -``` - -and then the implementation: - -```py -from typing import List -from pyspark.conf import SparkConf - -from zenml.step_operators import BaseStepOperator - - -class SparkStepOperator(BaseStepOperator): - """Base class for all Spark-related step operators.""" - - def _resource_configuration( - self, - spark_config: SparkConf, - resource_configuration: "ResourceSettings", - ) -> None: - """Configures Spark to handle the resource configuration.""" - - def _backend_configuration( - self, - spark_config: SparkConf, - step_config: "StepConfiguration", - ) -> None: - """Configures Spark to handle backends like YARN, Mesos or Kubernetes.""" - - def _io_configuration( - self, - spark_config: SparkConf - ) -> None: - """Configures Spark to handle different input/output sources.""" - - def _additional_configuration( - self, - spark_config: SparkConf - ) -> None: - """Appends the user-defined configuration parameters.""" - - def _launch_spark_job( - self, - spark_config: SparkConf, - entrypoint_command: List[str] - ) -> None: - """Generates and executes a spark-submit command.""" - - def launch( - self, - info: "StepRunInfo", - entrypoint_command: List[str], - ) -> None: - """Launches the step on Spark.""" -``` - -Under the base configuration, you will see the main configuration parameters: - -* `master` is the master URL for the cluster where Spark will run. You might see different schemes for this URL with varying cluster managers such as Mesos, YARN, or Kubernetes. -* `deploy_mode` can either be 'cluster' (default) or 'client' and it decides where the driver node of the application will run. -* `submit_args` is the JSON string of a dictionary, which will be used to define additional parameters if required ( Spark has a wide variety of parameters, thus including them all in a single class was deemed unnecessary.). - -In addition to this configuration, the `launch` method of the step operator gets additional configuration parameters from the `DockerSettings` and `ResourceSettings`. As a result, the overall configuration happens in 4 base methods: - -* `_resource_configuration` translates the ZenML `ResourceSettings` object to Spark's own resource configuration. -* `_backend_configuration` is responsible for cluster-manager-specific configuration. -* `_io_configuration` is a critical method. Even though we have materializers, Spark might require additional packages and configuration to work with a specific filesystem. This method is used as an interface to provide this configuration. -* `_additional_configuration` takes the `submit_args`, converts, and appends them to the overall configuration. - -Once the configuration is completed, `_launch_spark_job` comes into play. This takes the completed configuration and runs a Spark job on the given `master` URL with the specified `deploy_mode`. By default, this is achieved by creating and executing a `spark-submit` command. - -### Warning - -In its first iteration, the pre-configuration with `_io_configuration` method is only effective when it is paired with an `S3ArtifactStore` (which has an authentication secret). When used with other artifact store flavors, you might be required to provide additional configuration through the `submit_args`. 
- -## Stack Component: `KubernetesSparkStepOperator` - -The `KubernetesSparkStepOperator` is implemented by subclassing the base `SparkStepOperator` and uses the `PipelineDockerImageBuilder` class to build and push the required Docker images. - -```py -from typing import Optional - -from zenml.integrations.spark.step_operators.spark_step_operator import ( - SparkStepOperatorConfig -) - - -class KubernetesSparkStepOperatorConfig(SparkStepOperatorConfig): - """Config for the Kubernetes Spark step operator.""" - - namespace: Optional[str] = None - service_account: Optional[str] = None -``` - -```py -from pyspark.conf import SparkConf - -from zenml.utils.pipeline_docker_image_builder import PipelineDockerImageBuilder -from zenml.integrations.spark.step_operators.spark_step_operator import ( - SparkStepOperator -) - - -class KubernetesSparkStepOperator(SparkStepOperator): - """Step operator which runs Steps with Spark on Kubernetes.""" - - def _backend_configuration( - self, - spark_config: SparkConf, - step_config: "StepConfiguration", - ) -> None: - """Configures Spark to run on Kubernetes.""" - # Build and push the image - docker_image_builder = PipelineDockerImageBuilder() - image_name = docker_image_builder.build_and_push_docker_image(...) - - # Adjust the spark configuration - spark_config.set("spark.kubernetes.container.image", image_name) - ... -``` - -For Kubernetes, there are also some additional important configuration parameters: - -* `namespace` is the namespace under which the driver and executor pods will run. -* `service_account` is the service account that will be used by various Spark components (to create and watch the pods). - -Additionally, the `_backend_configuration` method is adjusted to handle the Kubernetes-specific configuration. - -## When to use it - -You should use the Spark step operator: - -* when you are dealing with large amounts of data. -* when you are designing a step that can benefit from distributed computing paradigms in terms of time and resources. - -## How to deploy it - -To use the `KubernetesSparkStepOperator` you will need to setup a few things first: - -* **Remote ZenML server:** See the [deployment guide](/develop/getting-started/deploying-zenml) for more information. -* **Kubernetes cluster:** There are many ways to deploy a Kubernetes cluster using different cloud providers or on your custom infrastructure. For AWS, you can follow the [Spark EKS Setup Guide](/develop/stack-components/step-operators/spark-kubernetes#spark-eks-setup-guide) below. - -### Spark EKS Setup Guide - -The following guide will walk you through how to spin up and configure a [Amazon Elastic Kubernetes Service](https://aws.amazon.com/eks/) with Spark on it: - -#### EKS Kubernetes Cluster - -* Follow [this guide](https://docs.aws.amazon.com/eks/latest/userguide/service%5FIAM%5Frole.html#create-service-role) to create an Amazon EKS cluster role. -* Follow [this guide](https://docs.aws.amazon.com/eks/latest/userguide/create-node-role.html#create-worker-node-role) to create an Amazon EC2 node role. -* Go to the [IAM website](https://console.aws.amazon.com/iam), and select `Roles` to edit both roles. -* Attach the `AmazonRDSFullAccess` and `AmazonS3FullAccess` policies to both roles. -* Go to the [EKS website](https://console.aws.amazon.com/eks). -* Make sure the correct region is selected on the top right. -* Click on `Add cluster` and select `Create`. -* Enter a name and select the **cluster role** for `Cluster service role`. 
-* Keep the default values for the networking and logging steps and create the cluster. -* Note down the cluster name and the API server endpoint: - -```bash -EKS_CLUSTER_NAME= - -EKS_API_SERVER_ENDPOINT= -``` - -* After the cluster is created, select it and click on `Add node group` in the `Compute` tab. -* Enter a name and select the **node role**. -* For the instance type, we recommend `t3a.xlarge`, as it provides up to 4 vCPUs and 16 GB of memory. - -#### Docker image for the Spark drivers and executors - -When you want to run your steps on a Kubernetes cluster, Spark will require you to choose a base image for the driver and executor pods. Normally, for this purpose, you can either use one of the base images in [Spark’s dockerhub](https://hub.docker.com/r/apache/spark-py/tags) or create an image using the [docker-image-tool](https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images) which will use your own Spark installation and build an image. - -When using Spark in EKS, you need to use the latter and utilize the `docker-image-tool`. However, before the build process, you also need to download the following packages - -* [hadoop-aws \= 3.3.1](https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/3.3.1) -* [aws-java-sdk-bundle \= 1.12.150](https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-bundle/1.12.150) - -and put them in the `jars` folder within your Spark installation. Once that is set up, you can build the image as follows: - -```bash -cd $SPARK_HOME # If this empty for you then you need to set the SPARK_HOME variable which points to your Spark installation - -SPARK_IMAGE_TAG= - -./bin/docker-image-tool.sh -t $SPARK_IMAGE_TAG -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile -u 0 build - -BASE_IMAGE_NAME=spark-py:$SPARK_IMAGE_TAG -``` - -If you are working on an M1 Mac, you will need to build the image for the amd64 architecture, by using the prefix `-X` on the previous command. For example: - -```bash -./bin/docker-image-tool.sh -X -t $SPARK_IMAGE_TAG -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile -u 0 build -``` - -#### Configuring RBAC - -Additionally, you may need to create the several resources in Kubernetes in order to give Spark access to edit/manage your driver executor pods. - -To do so, create a file called `rbac.yaml` with the following content: - -```yaml -apiVersion: v1 -kind: Namespace -metadata: - name: spark-namespace ---- -apiVersion: v1 -kind: ServiceAccount -metadata: - name: spark-service-account - namespace: spark-namespace ---- -apiVersion: rbac.authorization.k8s.io/v1 -kind: ClusterRoleBinding -metadata: - name: spark-role - namespace: spark-namespace -subjects: - - kind: ServiceAccount - name: spark-service-account - namespace: spark-namespace -roleRef: - kind: ClusterRole - name: edit - apiGroup: rbac.authorization.k8s.io ---- -``` - -And then execute the following command to create the resources: - -```bash -aws eks --region=$REGION update-kubeconfig --name=$EKS_CLUSTER_NAME - -kubectl create -f rbac.yaml -``` - -Lastly, note down the **namespace** and the name of the **service account** since you will need them when registering the stack component in the next step. - -## How to use it - -To use the `KubernetesSparkStepOperator`, you need: - -* the ZenML `spark` integration. If you haven't installed it already, run - -``` -zenml integration install spark -``` -* [Docker](https://www.docker.com) installed and running. 
* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack.
* A [remote container registry](/develop/stack-components/container-registries) as part of your stack.
* A Kubernetes cluster [deployed](/develop/stack-components/step-operators/spark-kubernetes#how-to-deploy-it).

We can then register the step operator and use it in our active stack:

```bash
zenml step-operator register spark_step_operator \
    --flavor=spark-kubernetes \
    --master=k8s://$EKS_API_SERVER_ENDPOINT \
    --namespace= \
    --service_account=
```

```bash
# Register the stack
zenml stack register spark_stack \
    -o default \
    -s spark_step_operator \
    -a spark_artifact_store \
    -c spark_container_registry \
    -i local_builder \
    --set
```

Once you have added the step operator to your active stack, you can use it to execute individual steps of your pipeline by specifying it in the `@step` decorator as follows:

```py
from zenml import step


@step(step_operator=)
def step_on_spark(...) -> ...:
    """Some step that should run with Spark on Kubernetes."""
    ...
```

After successfully running any step with a `KubernetesSparkStepOperator`, you should be able to see that a Spark driver pod was created in your cluster for each pipeline step when running `kubectl get pods -n $KUBERNETES_NAMESPACE`.

Instead of hardcoding a step operator name, you can also use the [Client](/develop/sdk/python-client) to dynamically use the step operator of your active stack:

```py
from zenml import step
from zenml.client import Client

step_operator = Client().active_stack.step_operator

@step(step_operator=step_operator.name)
def step_on_spark(...) -> ...:
    ...
```

### Additional configuration

For additional configuration of the Spark step operator, you can pass `SparkStepOperatorSettings` when defining or running your pipeline. Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-spark/#zenml.integrations.spark.flavors.spark%5Fstep%5Foperator%5Fflavor.SparkStepOperatorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings.
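As a very rough sketch only: assuming that `SparkStepOperatorSettings` exposes the `submit_kwargs` field shown in the base config earlier on this page (an assumption, so verify the actual attributes in the SDK docs linked above), passing extra Spark properties to a step could look like this:

```py
from zenml import step
from zenml.integrations.spark.flavors import SparkStepOperatorSettings

# Both the import path and the `submit_kwargs` attribute are assumptions
# based on the base config shown earlier; check the SDK docs before use.
spark_settings = SparkStepOperatorSettings(
    submit_kwargs={"spark.executor.instances": "4"},
)

@step(
    step_operator="spark_step_operator",
    settings={"step_operator": spark_settings},
)
def step_on_spark() -> None:
    ...
```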
- - -### When to use it - -A step operator should be used if one or more steps of a pipeline require resources that are not available in the runtime environments provided by the [orchestrator](/develop/stack-components/orchestrators). An example would be a step that trains a computer vision model and requires a GPU to run in a reasonable time, combined with a [Kubeflow orchestrator](/develop/stack-components/orchestrators/kubeflow) running on a Kubernetes cluster that does not contain any GPU nodes. In that case, it makes sense to include a step operator like [SageMaker](/develop/stack-components/step-operators/sagemaker), [Vertex](/develop/stack-components/step-operators/vertex), or [AzureML](/develop/stack-components/step-operators/azureml) to execute the training step with a GPU. - -### Step Operator Flavors - -Step operators to execute steps on one of the big cloud providers are provided by the following ZenML integrations: - -| Step Operator | Flavor | Integration | Notes | -| ---------------------------------------------------------------- | ---------- | ------------------------------------------------------------------------ | ----------------------------------------------------------------- | -| [SageMaker](/develop/stack-components/step-operators/sagemaker) | sagemaker | aws | Uses SageMaker to execute steps | -| [Vertex](/develop/stack-components/step-operators/vertex) | vertex | gcp | Uses Vertex AI to execute steps | -| [AzureML](/develop/stack-components/step-operators/azureml) | azureml | azure | Uses AzureML to execute steps | -| [Kubernetes](/develop/stack-components/step-operators/kubernetes) | kubernetes | kubernetes | Uses Kubernetes Pods to execute steps | -| [Spark](/develop/stack-components/step-operators/spark-kubernetes) | spark | spark | Uses Spark on Kubernetes to execute steps in a distributed manner | -| [Custom Implementation](/develop/stack-components/step-operators/custom) | _custom_ | Extend the step operator abstraction and provide your own implementation | | - -If you would like to see the available flavors of step operators, you can use the command: - -```bash -zenml step-operator flavor list -``` - -### How to use it - -You don't need to directly interact with any ZenML step operator in your code. As long as the step operator that you want to use is part of your active [ZenML stack](/develop/user-guide/guides/production-guide/understand-stacks), you can simply specify it in the `@step` decorator of your step. - -```bash -from zenml import step - -@step(step_operator= ) - -def my_step(...) -> ...: - - ... -``` - -#### Specifying per-step resources - -If your steps require additional hardware resources, you can specify them on your steps as described [here](/develop/usage/resource-data-management/training-with-gpus). - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use step operators to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. - diff --git a/develop/stack-components/step-operators/vertex.mdx b/develop/stack-components/step-operators/vertex.mdx deleted file mode 100644 index c172ab6..0000000 --- a/develop/stack-components/step-operators/vertex.mdx +++ /dev/null @@ -1,136 +0,0 @@ ---- -title: "Google Cloud VertexAI" -description: "Executing individual steps in Vertex AI." 
---- - -[Vertex AI](https://cloud.google.com/vertex-ai) offers specialized compute instances to run your training jobs and has a comprehensive UI to track and manage your models and logs. ZenML's Vertex AI step operator allows you to submit individual steps to be run on Vertex AI compute instances. - -### When to use it - -You should use the Vertex step operator if: - -* one or more steps of your pipeline require computing resources (CPU, GPU, memory) that are not provided by your orchestrator. -* you have access to Vertex AI. If you're using a different cloud provider, take a look at the [SageMaker](/develop/stack-components/step-operators/sagemaker) or [AzureML](/develop/stack-components/step-operators/azureml) step operators. - -### How to deploy it - -* Enable Vertex AI [here](https://console.cloud.google.com/vertex-ai). -* Create a [service account](https://cloud.google.com/iam/docs/service-accounts) with the right permissions to create Vertex AI jobs (`roles/aiplatform.admin`) and push to the container registry (`roles/storage.admin`). - -### How to use it - -To use the Vertex step operator, we need: - -* The ZenML `gcp` integration installed. If you haven't done so, run - -```bash -zenml integration install gcp -``` -* [Docker](https://www.docker.com) installed and running. -* Vertex AI enabled and a service account file. See the [deployment section](/develop/stack-components/step-operators/vertex#how-to-deploy-it) for detailed instructions. -* A [GCR container registry](/develop/stack-components/container-registries/gcp) as part of our stack. -* (Optional) A machine type that we want to execute our steps on (this defaults to `n1-standard-4`). See [here](https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types) for a list of available machine types. -* A [remote artifact store](/develop/stack-components/artifact-stores) as part of your stack. This is needed so that both your orchestration environment and VertexAI can read and write step artifacts. Check out the documentation page of the artifact store you want to use for more information on how to set that up and configure authentication for it. - -You have three different options to provide GCP credentials to the step operator: - -* use the [gcloud CLI](https://cloud.google.com/sdk/gcloud) to authenticate locally with GCP. This only works in combination with the local orchestrator. - -```bash -gcloud auth login -zenml step-operator register \ - --flavor=vertex \ - --project= \ - --region= \ -# --machine_type= # optionally specify the type of machine to run on -``` -* configure the orchestrator to use a [service account key file](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) to authenticate with GCP by setting the `service_account_path` parameter in the orchestrator configuration to point to a service account key file. This also works only in combination with the local orchestrator. - -```bash -zenml step-operator register \ - --flavor=vertex \ - --project= \ - --region= \ - --service_account_path= \ -# --machine_type= # optionally specify the type of machine to run on -``` -* (recommended) configure [a GCP Service Connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector) with GCP credentials coming from a [service account key file](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) or the local `gcloud` CLI set up with user account credentials and then link the Vertex AI Step Operator stack component to the Service Connector. 
This option works with any orchestrator. - -```bash -zenml service-connector register --type gcp --auth-method=service-account --project_id= --service_account_json=@ --resource-type gcp-generic -# Or, as an alternative, you could use the GCP user account locally set up with gcloud -# zenml service-connector register --type gcp --resource-type gcp-generic --auto-configure -zenml step-operator register \ - --flavor=vertex \ - --region= \ -# --machine_type= # optionally specify the type of machine to run on -zenml step-operator connect --connector -``` - -We can then use the registered step operator in our active stack: - - -```bash -# Add the step operator to the active stack -zenml stack update -s -``` - -Once you added the step operator to your active stack, you can use it to execute individual steps of your pipeline by specifying it in the `@step` decorator as follows: - - -```bash -from zenml import step -@step(step_operator=) -def trainer(...) -> ...: - """Train a model.""" - # This step will be executed in Vertex. -``` - - - ZenML will build a Docker image called `/zenml:` which includes your code and use it to run your steps in Vertex AI. Check out [this page](/develop/usage/project-setup/customize-docker-builds) if you want to learn more about how ZenML builds these images and how you can customize them. - - -#### Additional configuration - -You can specify the service account, network and reserved IP ranges to use for the VertexAI `CustomJob` by passing the `service_account`, `network` and `reserved_ip_ranges` parameters to the `step-operator register` command: - -```bash -zenml service-connector register --type gcp --auth-method=service-account --project_id= --service_account_json=@ --resource-type gcp-generic - -# Or, as an alternative, you could use the GCP user account locally set up with gcloud -# zenml service-connector register --type gcp --resource-type gcp-generic --auto-configure - -zenml step-operator register \ - --flavor=vertex \ - --region= \ -# --machine_type= # optionally specify the type of machine to run on - -zenml step-operator connect --connector -``` - -For additional configuration of the Vertex step operator, you can pass `VertexStepOperatorSettings` when defining or running your pipeline. - - -```bash -from zenml import step -from zenml.integrations.gcp.flavors.vertex_step_operator_flavor import VertexStepOperatorSettings - -@step(step_operator=, settings={"step_operator": VertexStepOperatorSettings( - accelerator_type= "NVIDIA_TESLA_T4", # see https://cloud.google.com/vertex-ai/docsversions/0.66.0/reference/rest/v1/MachineSpec#AcceleratorType - accelerator_count = 1, - machine_type = "n1-standard-2", # see https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types - disk_type = "pd-ssd", # see https://cloud.google.com/vertex-ai/docs/training/configure-storage#disk-types - disk_size_gb = 100, # see https://cloud.google.com/vertex-ai/docs/training/configure-storage#disk-size -)}) -def trainer(...) -> ...: - """Train a model.""" - # This step will be executed in Vertex. -``` - -Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-gcp/#zenml.integrations.gcp.flavors.vertex%5Fstep%5Foperator%5Fflavor.VertexStepOperatorSettings) for a full list of available attributes and [this docs page](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information on how to specify settings. 
- -For more information and a full list of configurable attributes of the Vertex step operator, check out the [SDK Docs](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-gcp/#zenml.integrations.gcp.step%5Foperators.vertex%5Fstep%5Foperator.VertexStepOperator) . - -#### Enabling CUDA for GPU-backed hardware - -Note that if you wish to use this step operator to run steps on a GPU, you will need to follow [the instructions on this page](/develop/usage/resource-data-management/training-with-gpus) to ensure that it works. It requires adding some extra settings customization and is essential to enable CUDA for the GPU to give its full acceleration. diff --git a/develop/usage/development-monitoring/control-logging/control-logging.mdx b/develop/usage/development-monitoring/control-logging/control-logging.mdx deleted file mode 100644 index 3442811..0000000 --- a/develop/usage/development-monitoring/control-logging/control-logging.mdx +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: "Overview" -description: "Configuring ZenML's default logging behavior" -icon: tree ---- - -ZenML produces various kinds of logs: - -* The [ZenML Server](/develop/getting-started/deploying-zenml) produces server logs (like any FastAPI server). -* The [Client or Runner](/develop/usage/project-setup/configure-python-environments#client-environment-or-the-runner-environment) environment produces logs, for example after running a pipeline. These are steps that are typically before, after, and during the creation of a pipeline run. -* The [Execution environment](/develop/usage/project-setup/configure-python-environments#execution-environments) (on the orchestrator level) produces logs when it executes each step of a pipeline. These are logs that are typically written in your steps using the python `logging` module. - -This section talks about how users can control logging behavior in these various environments. diff --git a/develop/usage/development-monitoring/control-logging/disable-colorful-logging.mdx b/develop/usage/development-monitoring/control-logging/disable-colorful-logging.mdx deleted file mode 100644 index 3f65b19..0000000 --- a/develop/usage/development-monitoring/control-logging/disable-colorful-logging.mdx +++ /dev/null @@ -1,28 +0,0 @@ ---- -title: "Disable colorful logging" -description: "How to disable colorful logging in ZenML." -icon: eye-slash ---- - -By default, ZenML uses colorful logging to make it easier to read logs. However, if you wish to disable this feature, you can do so by setting the following environment variable: - -```bash -ZENML_LOGGING_COLORS_DISABLED=true -``` - -Note that setting this on the [client environment](/develop/usage/project-setup/configure-python-environments#client-environment-or-the-runner-environment) (e.g. your local machine which runs the pipeline) will automatically disable colorful logging on remote pipeline runs. 
If you wish to only disable it locally, but turn on for remote pipeline runs, you can set the `ZENML_LOGGING_COLORS_DISABLED` environment variable in your pipeline runs environment as follows: - -```py -docker_settings = DockerSettings(environment={"ZENML_LOGGING_COLORS_DISABLED": "false"}) - -# Either add it to the decorator -@pipeline(settings={"docker": docker_settings}) -def my_pipeline() -> None: - my_step() - -# Or configure the pipelines options -my_pipeline = my_pipeline.with_options( - settings={"docker": docker_settings} -) -``` - diff --git a/develop/usage/development-monitoring/control-logging/disable-rich-traceback.mdx b/develop/usage/development-monitoring/control-logging/disable-rich-traceback.mdx deleted file mode 100644 index 23e29e9..0000000 --- a/develop/usage/development-monitoring/control-logging/disable-rich-traceback.mdx +++ /dev/null @@ -1,31 +0,0 @@ ---- -title: "Disable rich traceback output" -description: "How to disable rich traceback output in ZenML." -icon: handshake-slash ---- - -By default, ZenML uses the [rich](https://rich.readthedocs.io/en/stable/traceback.html) library to display rich traceback output. This is especially useful when debugging your pipelines. However, if you wish to disable this feature, you can do so by setting the following environment variable: - -```bash -export ZENML_ENABLE_RICH_TRACEBACK=false -``` - -This will ensure that you see only the plain text traceback output. - -Note that setting this on the [client environment](/develop/usage/project-setup/configure-python-environments#client-environment-or-the-runner-environment) (e.g. your local machine which runs the pipeline) will **not automatically disable rich tracebacks on remote pipeline runs**. That means setting this variable locally with only effect pipelines that run locally. - -If you wish to disable it also for [remote pipeline runs](/develop/user-guide/guides/production-guide/cloud-orchestration), you can set the `ZENML_ENABLE_RICH_TRACEBACK` environment variable in your pipeline runs environment as follows: - -```py -docker_settings = DockerSettings(environment={"ZENML_ENABLE_RICH_TRACEBACK": "false"}) - -# Either add it to the decorator -@pipeline(settings={"docker": docker_settings}) -def my_pipeline() -> None: - my_step() - -# Or configure the pipelines options -my_pipeline = my_pipeline.with_options( - settings={"docker": docker_settings} -) -``` diff --git a/develop/usage/development-monitoring/control-logging/enable-or-disable-logs-storing.mdx b/develop/usage/development-monitoring/control-logging/enable-or-disable-logs-storing.mdx deleted file mode 100644 index 84869e5..0000000 --- a/develop/usage/development-monitoring/control-logging/enable-or-disable-logs-storing.mdx +++ /dev/null @@ -1,54 +0,0 @@ ---- -title: "Enable or disable logs storage" -description: By default, ZenML uses a logging handler to capture the logs that occur during the execution of a step. Users are free to use the default python logging module or print statements, and ZenML's logging handler will catch these logs and store them. -icon: store-slash ---- - -```py -import logging - -from zenml import step - -@step -def my_step() -> None: - logging.warning("`Hello`") # You can use the regular `logging` module. - print("World.") # You can utilize `print` statements as well. -``` - -These logs are stored within the respective artifact store of your stack. 
You can display the logs in the dashboard as follows: - - - - - - Note that if you are not connected to a cloud artifact store with a service connector configured, then you will not be able to view your logs in the dashboard. Read more [here](/develop/usage/development-monitoring/control-logging/view-logs-on-the-dasbhoard). - -If you do not want to store the logs in your artifact store, you can: - -1. Disable it by using the `enable_step_logs` parameter either with your `@pipeline` or `@step` decorator: - -```py -from zenml import pipeline, step -@step(enable_step_logs=False) # disables logging for this step -def my_step() -> None: - ... -@pipeline(enable_step_logs=False) # disables logging for the entire pipeline -def my_pipeline(): - ... -``` -2. Disable it by using the environment variable `ZENML_DISABLE_STEP_LOGS_STORAGE` and setting it to `true`. This environment variable takes precedence over the parameters mentioned above. Note that this environment variable needs to be set on the [execution environment](/develop/usage/project-setup/configure-python-environments#execution-environments), i.e., on the orchestrator level: - -```py -from zenml import pipeline -from zenml.config import DockerSettings - -docker_settings = DockerSettings(environment={"ZENML_DISABLE_STEP_LOGS_STORAGE": "true"}) - -# Either add it to the decorator -@pipeline(settings={"docker": docker_settings}) -def my_pipeline() -> None: - my_step() - -# Or configure the pipeline's options -my_pipeline = my_pipeline.with_options( - settings={"docker": docker_settings} -) -``` diff --git a/develop/usage/development-monitoring/control-logging/set-logging-verbosity.mdx b/develop/usage/development-monitoring/control-logging/set-logging-verbosity.mdx deleted file mode 100644 index add0745..0000000 --- a/develop/usage/development-monitoring/control-logging/set-logging-verbosity.mdx +++ /dev/null @@ -1,31 +0,0 @@ ---- -title: "Set logging verbosity" -description: "How to set the logging verbosity in ZenML." -icon: gears ---- - -By default, ZenML sets the logging verbosity to `INFO`. If you wish to change this, you can do so by setting the following environment variable: - -```bash -export ZENML_LOGGING_VERBOSITY=INFO -``` - -Choose from `INFO`, `WARN`, `ERROR`, `CRITICAL`, or `DEBUG`. This will set the log output to whichever level you specify. - -Note that setting this on the [client environment](/develop/usage/project-setup/configure-python-environments#client-environment-or-the-runner-environment) (e.g. your local machine which runs the pipeline) will **not automatically set the same logging verbosity for remote pipeline runs**. That means setting this variable locally will only affect pipelines that run locally.
- -If you wish to control for [remote pipeline runs](/develop/user-guide/guides/production-guide/cloud-orchestration), you can set the `ZENML_LOGGING_VERBOSITY` environment variable in your pipeline runs environment as follows: - -```py -docker_settings = DockerSettings(environment={"ZENML_LOGGING_VERBOSITY": "DEBUG"}) - -# Either add it to the decorator -@pipeline(settings={"docker": docker_settings}) -def my_pipeline() -> None: - my_step() - -# Or configure the pipelines options -my_pipeline = my_pipeline.with_options( - settings={"docker": docker_settings} -) -``` diff --git a/develop/usage/development-monitoring/control-logging/view-logs-on-the-dasbhoard.mdx b/develop/usage/development-monitoring/control-logging/view-logs-on-the-dasbhoard.mdx deleted file mode 100644 index 7814441..0000000 --- a/develop/usage/development-monitoring/control-logging/view-logs-on-the-dasbhoard.mdx +++ /dev/null @@ -1,32 +0,0 @@ ---- -title: "View logs on the dashboard" -description: By default, ZenML uses a logging handler to capture the logs that occur during the execution of a step. -icon: clipboard ---- - -Users are free to use the default python logging module or print statements, and ZenML's logging handler will catch these logs and store them. - -```py -import logging - -from zenml import step - -@step -def my_step() -> None: - logging.warning("`Hello`") # You can use the regular `logging` module. - print("World.") # You can utilize `print` statements as well. -``` - -These logs are stored within the respective artifact store of your stack. This means that you can only view these logs in the dashboard _if the deployed ZenML server has direct access to the underlying artifact store_. There are two cases in which this will be true: - -* In case of a local ZenML server (via `zenml up`), both local and remote artifact stores may be accessible, depending on configuration of the client. -* In case of a deployed ZenML server, logs for runs on a [local artifact store](/develop/stack-components/artifact-stores/local) will not be accessible. Logs for runs using a [remote artifact store](/develop/user-guide/guides/production-guide/remote-storage) **may be** accessible, if the artifact store has been configured with a [service connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide). Please read [this chapter](/develop/user-guide/guides/production-guide/remote-storage) of the production guide to learn how to configure a remote artifact store with a service connector. - -If configured correctly, the logs are displayed in the dashboard as follows: - - - - - - If you do not want to store the logs for your pipeline (for example due to performance reduction or storage limits), you can follow [these instructions](/develop/usage/development-monitoring/control-logging/enable-or-disable-logs-storing). - diff --git a/develop/usage/development-monitoring/debug-and-solve-issues.mdx b/develop/usage/development-monitoring/debug-and-solve-issues.mdx deleted file mode 100644 index 5f1cb8c..0000000 --- a/develop/usage/development-monitoring/debug-and-solve-issues.mdx +++ /dev/null @@ -1,204 +0,0 @@ ---- -title: "Debug and solve issues" -description: "A guide to debug common issues and get help." -icon: bug ---- - -If you stumbled upon this page, chances are you're facing issues with using ZenML. This page documents suggestions and best practices to let you debug, get help, and solve issues quickly. - -### When to get help? 
- -We suggest going through the following checklist before asking for help: - -* Search on Slack using the built-in Slack search function at the top of the page. - - - - - -* Search on [GitHub issues](https://github.com/zenml-io/zenml/issues). -* Search the [docs](https://docs.zenml.io) using the search bar in the top right corner of the page. - - - - -* Check out the [common errors](/develop/usage/development-monitoring/debug-and-solve-issues#most-common-errors) section below. -* Understand the problem by studying the [additional logs](/develop/usage/development-monitoring/debug-and-solve-issues#41-additional-logs) and [client/server logs](/develop/usage/development-monitoring/debug-and-solve-issues#client-and-server-logs). - -Chances are you'd find your answers there. If you can't find any clue, then it's time to post your question on [Slack](https://zenml.io/slack). - -### How to post on Slack? - -When posting on Slack it's useful to provide the following information (when applicable) so that we get a complete picture before jumping into solutions. - -#### 1\. System Information - -Let us know relevant information about your system. We recommend running the following in your terminal and attaching the output to your question. - -```bash -zenml info -a -s -``` - -You can optionally include information about specific packages where you're having problems by using the `-p` option. For example, if you're having problems with the `tensorflow` package, you can run: - -```bash -zenml info -p tensorflow -``` - -The output should look something like this: - -```yaml -ZENML_LOCAL_VERSION: 0.40.2 -ZENML_SERVER_VERSION: 0.40.2 -ZENML_SERVER_DATABASE: mysql -ZENML_SERVER_DEPLOYMENT_TYPE: alpha -ZENML_CONFIG_DIR: /Users/my_username/Library/Application Support/zenml -ZENML_LOCAL_STORE_DIR: /Users/my_username/Library/Application Support/zenml/local_stores -ZENML_SERVER_URL: https://someserver.zenml.io -ZENML_ACTIVE_REPOSITORY_ROOT: /Users/my_username/coding/zenml/repos/zenml -PYTHON_VERSION: 3.9.13 -ENVIRONMENT: native -SYSTEM_INFO: {'os': 'mac', 'mac_version': '13.2'} -ACTIVE_STACK: default -ACTIVE_USER: some_user -TELEMETRY_STATUS: disabled -ANALYTICS_CLIENT_ID: xxxxxxx-xxxxxxx-xxxxxxx -ANALYTICS_USER_ID: xxxxxxx-xxxxxxx-xxxxxxx -ANALYTICS_SERVER_ID: xxxxxxx-xxxxxxx-xxxxxxx -INTEGRATIONS: ['airflow', 'aws', 'azure', 'dash', 'evidently', 'facets', 'feast', 'gcp', 'github', -'graphviz', 'huggingface', 'kaniko', 'kubeflow', 'kubernetes', 'lightgbm', 'mlflow', -'neptune', 'neural_prophet', 'pillow', 'plotly', 'pytorch', 'pytorch_lightning', 's3', 'scipy', -'sklearn', 'slack', 'spark', 'tensorboard', 'tensorflow', 'vault', 'wandb', 'whylogs', 'xgboost'] -``` - -System information provides more context to your issue and also eliminates the need for anyone to ask when they're trying to help. This increases the chances of your question getting answered and saves everyone's time. - -#### 2\. What happened? - -Tell us briefly: - -* What were you trying to achieve? -* What did you expect to happen? -* What actually happened? - -#### 3\. How to reproduce the error? - -Walk us through how to reproduce the same error you had step-by-step, whenever possible. Use the format you prefer. Write it in text or record a video, whichever lets you get the issue at hand across to us! - -#### 4\. Relevant log output - -As a general rule of thumb, always attach relevant log outputs and the full error traceback to help us understand what happened under the hood. 
If the full error traceback does not fit into a text message, attach a file or use a service like [Pastebin](https://pastebin.com/) or [Github's Gist](https://gist.github.com/). - -Along with the error traceback, we recommend to always share the output of the following commands: - -* `zenml status` -* `zenml stack describe` - -When applicable, also attach logs of the orchestrator. For example, if you're using the Kubeflow orchestrator, include the logs of the pod that was running the step that failed. - -Usually, the default log you see in your terminal is sufficient, in the event it's not, then it's useful to provide additional logs. Additional logs are not shown by default, you'll have to toggle an environment variable for it. Read the next section to find out how. - -**4.1 Additional logs** - -When the default logs are not helpful, ambiguous, or do not point you to the root of the issue, you can toggle the value of the `ZENML_LOGGING_VERBOSITY` environment variable to change the type of logs shown. The default value of `ZENML_LOGGING_VERBOSITY` environment variable is: - -```bash -ZENML_LOGGING_VERBOSITY=INFO -``` - -You can pick other values such as `WARN`, `ERROR`, `CRITICAL`, `DEBUG` to change what's shown in the logs. And export the environment variable in your terminal. For example in Linux: - -```bash -export ZENML_LOGGING_VERBOSITY=DEBUG -``` -Read more about how to set environment variables for: - -* For [Linux](https://linuxize.com/post/how-to-set-and-list-environment-variables-in-linux/). -* For [macOS](https://youngstone89.medium.com/setting-up-environment-variables-in-mac-os-28e5941c771c). -* For [Windows](https://www.computerhope.com/issues/ch000549.htm). - -### Client and server logs - -When facing a ZenML Server-related issue, you can view the logs of the server to introspect deeper. To achieve this, run: - -```bash -zenml logs -``` - -The logs from a healthy server should look something like this: - -```bash -INFO:asyncio:Syncing pipeline runs... -2022-10-19 09:09:18,195 - zenml.zen_stores.metadata_store - DEBUG - Fetched 4 steps for pipeline run '13'. (metadata_store.py:315) -2022-10-19 09:09:18,359 - zenml.zen_stores.metadata_store - DEBUG - Fetched 0 inputs and 4 outputs for step 'importer'. (metadata_store.py:427) -2022-10-19 09:09:18,461 - zenml.zen_stores.metadata_store - DEBUG - Fetched 0 inputs and 4 outputs for step 'importer'. (metadata_store.py:427) -2022-10-19 09:09:18,516 - zenml.zen_stores.metadata_store - DEBUG - Fetched 2 inputs and 2 outputs for step 'normalizer'. (metadata_store.py:427) -2022-10-19 09:09:18,606 - zenml.zen_stores.metadata_store - DEBUG - Fetched 0 inputs and 4 outputs for step 'importer'. (metadata_store.py:427) -``` - -### Most common errors - -This section documents frequently encountered errors among users and solutions to each. - -#### Error initializing rest store - -Typically, the error presents itself as: - -```bash -RuntimeError: Error initializing rest store with URL 'http://127.0.0.1:8237': HTTPConnectionPool(host='127.0.0.1', port=8237): Max retries exceeded with url: /api/v1/login (Caused by -NewConnectionError(': Failed to establish a new connection: [Errno 61] Connection refused')) -``` - -If you restarted your machine after deploying ZenML then you have to run `zenml up` again after each restart. Local ZenML deployments don't survive machine restarts. 
- -#### Column 'step\_configuration' cannot be null - -```bash -sqlalchemy.exc.IntegrityError: (pymysql.err.IntegrityError) (1048, "Column 'step_configuration' cannot be null") -``` - -This happens when a step configuration is too long. We changed the limit from 4K to 65K chars, but it could still happen if you have excessively long strings in your config. - -#### 'NoneType' object has no attribute 'name' - -This is also a common error you might encounter when you do not have the necessary stack components registered on the stack. For example: - -```bash -╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮ -│ /home/dnth/Documents/zenml-projects/nba-pipeline/run_pipeline.py:24 in │ -│ │ -│ 21 │ reference_data_splitter, │ -│ 22 │ TrainingSplitConfig, │ -│ 23 ) │ -│ ❱ 24 from steps.trainer import random_forest_trainer │ -│ 25 from steps.encoder import encode_columns_and_clean │ -│ 26 from steps.importer import ( │ -│ 27 │ import_season_schedule, │ -│ │ -│ /home/dnth/Documents/zenml-projects/nba-pipeline/steps/trainer.py:24 in │ -│ │ -│ 21 │ max_depth: int = 10000 │ -│ 22 │ target_col: str = "FG3M" │ -│ 23 │ -│ ❱ 24 @step(enable_cache=False, experiment_tracker=experiment_tracker.name) │ -│ 25 def random_forest_trainer( │ -│ 26 │ train_df_x: pd.DataFrame, │ -│ 27 │ train_df_y: pd.DataFrame, │ -╰──────────────────────────────────────────────────────────────────────────────────────────────────╯ -AttributeError: 'NoneType' object has no attribute 'name' -``` - -In the above error snippet, the `step` on line 24 expects an experiment tracker but could not find it on the stack. To solve it, register an experiment tracker of your choice on the stack. For instance: - -```bash -zenml experiment-tracker register mlflow_tracker --flavor=mlflow -``` - -and update your stack with the experiment tracker: - -```bash -zenml stack update -e mlflow_tracker -``` - -This also applies to all other [stack components](/develop/stack-components/component-guide). - diff --git a/develop/usage/development-monitoring/popular-integrations/aws-guide.mdx b/develop/usage/development-monitoring/popular-integrations/aws-guide.mdx deleted file mode 100644 index 6bdf056..0000000 --- a/develop/usage/development-monitoring/popular-integrations/aws-guide.mdx +++ /dev/null @@ -1,281 +0,0 @@ ---- -title: "Run on AWS" -description: "A simple guide to create an AWS stack to run your ZenML pipelines" -icon: aws ---- - -This page aims to quickly set up a minimal production stack on AWS. With just a few simple steps, you will set up an IAM role with specifically-scoped permissions that ZenML can use to authenticate with the relevant AWS resources. - - -Would you like to skip ahead and deploy a full AWS ZenML cloud stack already? - -Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML AWS Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack. - - -## 1. Set up credentials and local environment - -To follow this guide, you need: - -* An active AWS account with necessary permissions for AWS S3, SageMaker, ECR, and ECS. -* ZenML [installed](/develop/getting-started/installation) -* AWS CLI installed and configured with your AWS credentials. 
You can follow the instructions [here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html). - -Once ready, navigate to the AWS console: - -1. Choose an AWS region In the AWS console, choose the region where you want to deploy your ZenML stack resources. Make note of the region name (e.g., `us-east-1`, `eu-west-2`, etc.) as you will need it in subsequent steps. -2. Create an IAM role - -For this, you'll need to find out your AWS account ID. You can find this by running: - -```bash -aws sts get-caller-identity --query Account --output text -``` - -This will output your AWS account ID. Make a note of this as you will need it in the next steps. (If you're doing anything more esoteric with your AWS account and IAM roles, this might not work for you. The account ID here that we're trying to get is the root account ID that you use to log in to the AWS console.) - -Then create a file named `assume-role-policy.json` with the following content: - -```JSON -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Principal": { - "AWS": "arn:aws:iam:::root", - "Service": "sagemaker.amazonaws.com" - }, - "Action": "sts:AssumeRole" - } - ] -} -``` - -Make sure to replace the placeholder `` with your actual AWS account ID that we found earlier. - -Now create a new IAM role that ZenML will use to access AWS resources. We'll use `zenml-role` as a role name in this example, but you can feel free to choose something else if you prefer. Run the following command to create the role: - -```bash -aws iam create-role --role-name zenml-role --assume-role-policy-document file://assume-role-policy.json -``` - -Be sure to take note of the information that is output to the terminal, as you will need it in the next steps, especially the Role ARN. - -1. Attach policies to the role - -Attach the following policies to the role to grant access to the necessary AWS services: - -* `AmazonS3FullAccess` -* `AmazonEC2ContainerRegistryFullAccess` -* `AmazonSageMakerFullAccess` - -```bash -aws iam attach-role-policy --role-name zenml-role --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess -aws iam attach-role-policy --role-name zenml-role --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess -aws iam attach-role-policy --role-name zenml-role --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess -``` - -1. If you have not already, install the AWS and S3 ZenML integrations: - -```bash -zenml integration install aws s3 -y -``` - -## 2. Create a Service Connector within ZenML - -Create an AWS Service Connector within ZenML. The service connector will allow ZenML and other ZenML components to authenticate themselves with AWS using the IAM role. - - -```Bash -zenml service-connector register aws_connector \ - --type aws \ - --auth-method iam-role \ - --role_arn= \ - --region= \ - --aws_access_key_id= \ - --aws_secret_access_key= -``` -Replace `` with the ARN of the IAM role you created in the previous step, `` with the respective value and use your AWS access key ID and secret access key that we noted down earlier. - - - - -## 3. Create Stack Components - -### Artifact Store (S3) - -An [artifact store](/develop/user-guide/guides/production-guide/remote-storage) is used for storing and versioning data flowing through your pipelines. - -1. Before you run anything within the ZenML CLI, create an AWS S3 bucket. If you already have one, you can skip this step. (Note: the bucket name should be unique, so you might need to try a few times to find a unique name.) 
- -```bash -aws s3api create-bucket --bucket your-bucket-name -``` - -Once this is done, you can create the ZenML stack component as follows: - -2. Register an S3 Artifact Store with the connector: - -```bash -zenml artifact-store register cloud_artifact_store -f s3 --path=s3://your-bucket-name --connector aws_connector -``` - -More details [here](/develop/stack-components/artifact-stores/s3). - -### Orchestrator (SageMaker Pipelines) - -An [orchestrator](/develop/user-guide/guides/production-guide/cloud-orchestration) is the compute backend to run your pipelines. - -1. Before you run anything within the ZenML CLI, head on over to AWS and create a SageMaker domain (Skip this if you already have one). The instructions for creating a domain can be found [in the AWS core documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-quick-start.html). - -A SageMaker domain is a central management unit for all SageMaker users and resources within a region. It provides a single sign-on (SSO) experience and enables users to create and manage SageMaker resources, such as notebooks, training jobs, and endpoints, within a collaborative environment. - -When you create a SageMaker domain, you specify the configuration settings, such as the domain name, user profiles, and security settings. Each user within a domain gets their own isolated workspace, which includes a JupyterLab interface, a set of compute resources, and persistent storage. - -The SageMaker orchestrator in ZenML requires a SageMaker domain to run pipelines because it leverages the SageMaker Pipelines service, which is part of the SageMaker ecosystem. SageMaker Pipelines allows you to define, execute, and manage end-to-end machine learning workflows using a declarative approach. - -By creating a SageMaker domain, you establish the necessary environment and permissions for the SageMaker orchestrator to interact with SageMaker Pipelines and other SageMaker resources seamlessly. The domain acts as a prerequisite for using the SageMaker orchestrator in ZenML. - -Once this is done, you can create the ZenML stack component as follows: - -2. Register a SageMaker Pipelines orchestrator stack component: - -You'll need the IAM role ARN that we noted down earlier to register the orchestrator. This is the 'execution role' ARN you need to pass to the orchestrator. - -```bash -zenml orchestrator register sagemaker-orchestrator --flavor=sagemaker --region= --execution_role= -``` - -**Note**: The SageMaker orchestrator utilizes the AWS configuration for operation and does not require direct connection via a service connector for authentication, as it relies on your AWS CLI configurations or environment variables. - -More details [here](/develop/stack-components/orchestrators/sagemaker). - -### Container Registry (ECR) - -A [container registry](/develop/stack-components/container-registries) is used to store Docker images for your pipelines. - -1. You'll need to create a repository in ECR. If you already have one, you can skip this step. - -```bash -aws ecr create-repository --repository-name zenml --region -``` - -Once this is done, you can create the ZenML stack component as follows: - -2. Register an ECR container registry stack component: - -```bash -zenml container-registry register ecr-registry --flavor=aws --uri=.dkr.ecr..amazonaws.com --connector aws_connector -``` - -More details [here](/develop/stack-components/container-registries/aws). - -## 4.
Create stack - - - - ```bash -export STACK_NAME=aws_stack - -zenml stack register ${STACK_NAME} -o ${ORCHESTRATOR_NAME} \ - -a ${ARTIFACT_STORE_NAME} -c ${CONTAINER_REGISTRY_NAME} --set -``` -In case you want to also add any other stack components to this stack, feel free to do so. - - -Combine the three stack components and you have your AWS stack. Feel free to add any other component of your choice as well. - - - -## 5. And you're already done! - -Just like that, you now have a fully working AWS stack ready to go. Feel free to take it for a spin by running a pipeline on it. - -Define a ZenML pipeline: - -```py -from zenml import pipeline, step - -@step -def hello_world() -> str: - return "Hello from SageMaker!" - -@pipeline -def aws_sagemaker_pipeline(): - hello_world() - -if __name__ == "__main__": - aws_sagemaker_pipeline() -``` - -Save this code to run.py and execute it. The pipeline will use AWS S3 for artifact storage, Amazon SageMaker Pipelines for orchestration, and Amazon ECR for container registry. - -```bash -python run.py -``` - - - - -Read more in the [production guide](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/production-guide/production-guide.md). - -## Cleanup - - - Make sure you no longer need the resources before deleting them. The instructions and commands that follow are DESTRUCTIVE. - - -Delete any AWS resources you no longer use to avoid additional charges. You'll want to do the following: - -```bash -# delete the S3 bucket -aws s3 rm s3://your-bucket-name --recursive -aws s3api delete-bucket --bucket your-bucket-name - -# delete the SageMaker domain -aws sagemaker delete-domain --domain-id - -# delete the ECR repository -aws ecr delete-repository --repository-name zenml-repository --force - -# detach policies from the IAM role -aws iam detach-role-policy --role-name zenml-role --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess -aws iam detach-role-policy --role-name zenml-role --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess -aws iam detach-role-policy --role-name zenml-role --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess - -# delete the IAM role -aws iam delete-role --role-name zenml-role -``` - -Make sure to run these commands in the same AWS region where you created the resources. - -By running these cleanup commands, you will delete the S3 bucket, SageMaker domain, ECR repository, and IAM role, along with their associated policies. This will help you avoid any unnecessary charges for resources you no longer need. - -Remember to be cautious when deleting resources and ensure that you no longer require them before running the deletion commands. - -## Conclusion - -In this guide, we walked through the process of setting up an AWS stack with ZenML to run your machine learning pipelines in a scalable and production-ready environment. The key steps included: - -1. Setting up credentials and the local environment by creating an IAM role with the necessary permissions. -2. Creating a ZenML service connector to authenticate with AWS services using the IAM role. -3. Configuring stack components, including an S3 artifact store, a SageMaker Pipelines orchestrator, and an ECR container registry. -4. Registering the stack components and creating a ZenML stack. - -By following these steps, you can leverage the power of AWS services, such as S3 for artifact storage, SageMaker Pipelines for orchestration, and ECR for container management, all within the ZenML framework. 
This setup allows you to build, deploy, and manage machine learning pipelines efficiently and scale your workloads based on your requirements. - -The benefits of using an AWS stack with ZenML include: - -* Scalability: Leverage the scalability of AWS services to handle large-scale machine learning workloads. -* Reproducibility: Ensure reproducibility of your pipelines with versioned artifacts and containerized environments. -* Collaboration: Enable collaboration among team members by using a centralized stack and shared resources. -* Flexibility: Customize and extend your stack components based on your specific needs and preferences. - -Now that you have a functional AWS stack set up with ZenML, you can explore more advanced features and capabilities offered by ZenML. Some next steps to consider: - -* Dive deeper into ZenML's [production guide](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/production-guide/production-guide.md) to learn best practices for deploying and managing production-ready pipelines. -* Explore ZenML's [integrations](/develop/stack-components/component-guide) with other popular tools and frameworks in the machine learning ecosystem. -* Join the [ZenML community](https://zenml.io/slack) to connect with other users, ask questions, and get support. - -By leveraging the power of AWS and ZenML, you can streamline your machine learning workflows, improve collaboration, and deploy production-ready pipelines with ease. Happy experimenting and building! - diff --git a/develop/usage/development-monitoring/popular-integrations/azure-guide.mdx b/develop/usage/development-monitoring/popular-integrations/azure-guide.mdx deleted file mode 100644 index eb87b12..0000000 --- a/develop/usage/development-monitoring/popular-integrations/azure-guide.mdx +++ /dev/null @@ -1,175 +0,0 @@ ---- -title: "Run on Azure" -description: "A simple guide to create an Azure stack to run your ZenML pipelines" -icon: book ---- - -This page aims to quickly set up a minimal production stack on Azure. With just a few simple steps, you will set up a resource group, a service principal with correct permissions and the relevant ZenML stack and components. - - -Would you like to skip ahead and deploy a full Azure ZenML cloud stack already? - -Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML Azure Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack. - - - -To follow this guide, you need: - -* An active Azure account. -* ZenML [installed](/develop/getting-started/installation). -* ZenML `azure` integration installed with `zenml integration install azure`. - -## 1. Set up proper credentials - -You can start by [creating a service principal by creating an app registration](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azureml.ipynb) on Azure: - -1. Go to the App Registrations on the Azure portal. -2. Click on `+ New registration`, -3. Give it a name and click register. - - - - - - - - - - -Once you create the service principal, you will get an Application ID and Tenant ID as they will be needed later. 
- -Next, go to your service principal and click on the `Certificates & secrets` in the `Manage` menu. Here, you have to create a client secret. Note down the secret value as it will be needed later. - - - - - -## 2. Create a service connector - -Once you have created the service principal and the client secret, you can go ahead and create [a ZenML Azure Service Connector](/develop/usage/resource-data-management/auth-management/azure-service-connector). - -```bash -zenml service-connector register azure_connector --type azure \ - --auth-method service-principal \ - --client_secret= \ - --tenant_id= \ - --client_id= -``` - -You will use this service connector later on to connect your components with proper authentication. - -## 3. Create a resource group and the AzureML instance - -Now, you have to [create a resource group on Azure](https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/manage-resource-groups-portal). To do this, go to the Azure portal and go to the `Resource Groups` page and click `+ Create`. - - - - - -Once the resource group is created, go to the overview page of your new resource group and click `+ Create`. This will open up the marketplace where you can select a variety of resource to create. Look for `Azure Machine Learning`. - - - - - -Select it, and you will start the process of creating an AzureML workspace. As you can see from the `Workspace details`, AzureML workspaces come equipped with a storage account, key vault, and application insights. It is highly recommended that you create a container registry as well. - - - - - -## 4\. Create Stack Components - -In order to run any workflows on Azure using ZenML, you need an artifact store, an orchestrator and a container registry. - -### Artifact Store (Azure Blob Storage) - -For the artifact store, we will be using the storage account attached to our AzureML workspace. But before registering the component itself, you have to create a container for blob storage. To do this, go to the corresponding storage account in your workspace and create a new container: - - - - - -Once you create the container, you can go ahead, register your artifact store using its path and connect it to your service connector: - - -```bash -zenml artifact-store register azure_artifact_store -f azure \ - --path= \ - --connector azure_connector -``` - -For more information regarding Azure Blob Storage artifact stores, feel free to [check the docs](/develop/stack-components/artifact-stores/azure). - -### Orchestrator (AzureML) - -As for the orchestrator, no additional setup is needed. Simply use the following command to register it and connect it your service connector: - -```bash -zenml orchestrator register azure_orchestrator -f azureml \ - --subscription_id= \ - --resource_group= \ - --workspace= \ - --connector azure_connector -``` - -For more information regarding AzureML orchestrator, feel free to [check the docs](/develop/stack-components/orchestrators/azureml). - -### Container Registry (Azure Container Registry) - -Similar to the orchestrator, you can register and connect your container registry using the following command: - - -```bash -zenml container-registry register azure_container_registry -f azure \ - --uri= \ - --connector azure_connector -``` - -For more information regarding Azure container registries, feel free to [check the docs](/develop/stack-components/container-registries/azure). - -## 5\. 
Create a Stack - -Now, you can use the registered components to create an Azure ZenML stack: - -```bash -zenml stack register azure_stack \ - -o azure_orchestrator \ - -a azure_artifact_store \ - -c azure_container_registry \ - --set -``` - -## 6. ...and you are done. - -Just like that, you now have a fully working Azure stack ready to go. Feel free to take it for a spin by running a pipeline on it. - -Define a ZenML pipeline: - - -```bash -from zenml import pipeline, step - -@step -def hello_world() -> str: - return "Hello from Azure!" - -@pipeline -def azure_pipeline(): - hello_world() - -if __name__ == "__main__": - azure_pipeline() -``` - -Save this code to run.py and execute it. The pipeline will use Azure Blob Storage for artifact storage, AzureML for orchestration, and an Azure container registry. - -```bash -python run.py -``` - -Now that you have a functional Azure stack set up with ZenML, you can explore more advanced features and capabilities offered by ZenML. Some next steps to consider: - -* Dive deeper into ZenML's [production guide](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/user-guide/production-guide/production-guide.md) to learn best practices for deploying and managing production-ready pipelines. -* Explore ZenML's [integrations](/develop/stack-components/component-guide) with other popular tools and frameworks in the machine learning ecosystem. -* Join the [ZenML community](https://zenml.io/slack) to connect with other users, ask questions, and get support. diff --git a/develop/usage/development-monitoring/popular-integrations/gcp-guide.mdx b/develop/usage/development-monitoring/popular-integrations/gcp-guide.mdx deleted file mode 100644 index 2e9f03c..0000000 --- a/develop/usage/development-monitoring/popular-integrations/gcp-guide.mdx +++ /dev/null @@ -1,161 +0,0 @@ ---- -title: "Run on GCP" -description: "A simple guide to quickly set up a minimal stack on GCP." -icon: file-code ---- - -This page aims to quickly set up a minimal production stack on GCP. With just a few simple steps you will set up a service account with specifically-scoped permissions that ZenML can use to authenticate with the relevant GCP resources. - - -Would you like to skip ahead and deploy a full GCP ZenML cloud stack already? - -Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML GCP Terraform module](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register this stack. - - - - While this guide focuses on Google Cloud, we are seeking contributors to create a similar guide for other cloud providers. If you are interested, please create a [pull request over on GitHub](https://github.com/zenml-io/zenml/blob/main/CONTRIBUTING.md). - - -### 1. Choose a GCP project - -In the Google Cloud console, on the project selector page, select or [create a Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects). Make sure a billing account is attached to this project to allow the use of some APIs. - -This is how you would do it from the CLI if this is preferred. - -```bash -gcloud projects create --billing-project= -``` - - - If you don't plan to keep the resources that you create in this procedure, create a new project. 
After you finish these steps, you can delete the project, thereby removing all resources associated with the project. - - -### 2. Enable GCloud APIs - -The [following APIs](https://console.cloud.google.com/flows/enableapi?apiid=cloudfunctions,cloudbuild.googleapis.com,artifactregistry.googleapis.com,run.googleapis.com,logging.googleapis.com\&redirect=https://cloud.google.com/functions/docs/create-deploy-gcloud&\%5Fga=2.103703808.1862683951.1694002459-205697788.1651483076&\%5Fgac=1.161946062.1694011263.Cj0KCQjwxuCnBhDLARIsAB-cq1ouJZlVKAVPMsXnYrgQVF2t1Q2hUjgiHVpHXi2N0NlJvG3j3y-PPh8aAoSIEALw\%5FwcB) will need to be enabled within your chosen GCP project. - -* Cloud Functions API # For the vertex orchestrator -* Cloud Run Admin API # For the vertex orchestrator -* Cloud Build API # For the container registry -* Artifact Registry API # For the container registry -* Cloud Logging API # Generally needed - -### 3. Create a dedicated service account - -The service account should have these following roles. - -* AI Platform Service Agent -* Storage Object Admin - -These roles give permissions for full CRUD on storage objects and full permissions for compute within VertexAI. - -### 4. Create a JSON Key for your service account - -This [json file](https://cloud.google.com/iam/docs/keys-create-delete) will allow the service account to assume the identity of this service account. You will need the filepath of the downloaded file in the next step. - -``` -export JSON_KEY_FILE_PATH= -``` - -### 5) Create a Service Connector within ZenML - -The service connector will allow ZenML and other ZenML components to authenticate themselves with GCP. - -```bash CLI - -zenml integration install gcp \ -&& zenml service-connector register gcp_connector \ ---type gcp \ ---auth-method service-account \ ---service_account_json=@${JSON_KEY_FILE_PATH} \ ---project_id= -``` - - -### 6. Create Stack Components - -#### Artifact Store - -Before you run anything within the ZenML CLI, head on over to GCP and create a GCS bucket, in case you don't already have one that you can use. Once this is done, you can create the ZenML stack component as follows: - - -```bash -export ARTIFACT_STORE_NAME=gcp_artifact_store - -# Register the GCS artifact-store and reference the target GCS bucket -zenml artifact-store register ${ARTIFACT_STORE_NAME} --flavor gcp \ - --path=gs:// - -# Connect the GCS artifact-store to the target bucket via a GCP Service Connector -zenml artifact-store connect ${ARTIFACT_STORE_NAME} -i -``` - - Head on over to our [docs](/develop/stack-components/artifact-stores/gcp) to learn more about artifact stores and how to configure them. - - - - -#### Orchestrator - -This guide will use Vertex AI as the orchestrator to run the pipelines. As a serverless service Vertex is a great choice for quick prototyping of your MLOps stack. The orchestrator can be switched out at any point in the future for a more use-case- and budget-appropriate solution. - - - -```bash -export ORCHESTRATOR_NAME=gcp_vertex_orchestrator - -# Register the GCS artifact-store and reference the target GCS bucket - -zenml orchestrator register ${ORCHESTRATOR_NAME} --flavor=vertex - - --project= --location=europe-west2 - -# Connect the GCS orchestrator to the target gcp project via a GCP Service Connector - -zenml orchestrator connect ${ORCHESTRATOR_NAME} -i -``` - - Head on over to our [docs](/develop/stack-components/orchestrators/vertex) to learn more about orchestrators and how to configure them. 
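If you need to customize how Vertex AI runs your pipelines from Python, the GCP integration also exposes an orchestrator settings class. The snippet below is only a minimal sketch: it assumes that your installed ZenML version provides `VertexOrchestratorSettings` under the import path shown and that it accepts a `labels` field, so double-check the SDK docs for the exact options available to you.

```py
from zenml import pipeline
from zenml.integrations.gcp.flavors.vertex_orchestrator_flavor import (
    VertexOrchestratorSettings,
)

# Assumed example: attach custom labels to the Vertex AI pipeline job.
vertex_settings = VertexOrchestratorSettings(labels={"team": "ml-platform"})

@pipeline(settings={"orchestrator": vertex_settings})
def my_pipeline() -> None:
    ...
```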
- - - -#### Container Registry - -```bash -export CONTAINER_REGISTRY_NAME=gcp_container_registry - -zenml container-registry register ${CONTAINER_REGISTRY_NAME} --flavor=gcp --uri= - -# Connect the container registry to the target GCP project via a GCP Service Connector -zenml container-registry connect ${CONTAINER_REGISTRY_NAME} -i -``` - -Head on over to our [docs](/develop/stack-components/container-registries) to learn more about container registries and how to configure them. - - - -### 7. Create Stack - - -```bash -export STACK_NAME=gcp_stack - -zenml stack register ${STACK_NAME} -o ${ORCHESTRATOR_NAME} \ - -a ${ARTIFACT_STORE_NAME} -c ${CONTAINER_REGISTRY_NAME} --set -``` - -In case you want to also add any other stack components to this stack, feel free to do so. - - -## And you're already done! - -Just like that, you now have a fully working GCP stack ready to go. Feel free to take it for a spin by running a pipeline on it. - -## Cleanup - -If you do not want to use any of the created resources in the future, simply delete the project you created. - -```bash -gcloud projects delete -``` - diff --git a/develop/usage/development-monitoring/popular-integrations/kubeflow.mdx b/develop/usage/development-monitoring/popular-integrations/kubeflow.mdx deleted file mode 100644 index d5d4903..0000000 --- a/develop/usage/development-monitoring/popular-integrations/kubeflow.mdx +++ /dev/null @@ -1,107 +0,0 @@ ---- -title: "Kubeflow" -description: "Run your ML pipelines on Kubeflow Pipelines." -icon: code ---- - -The ZenML Kubeflow Orchestrator allows you to run your ML pipelines on Kubeflow Pipelines without writing Kubeflow code. - -## Prerequisites - -To use the Kubeflow Orchestrator, you'll need: - -* ZenML `kubeflow` integration installed (`zenml integration install kubeflow`) -* Docker installed and running -* `kubectl` installed (optional, see below) -* A Kubernetes cluster with Kubeflow Pipelines installed (see deployment guide for your cloud provider) -* A remote artifact store and container registry in your ZenML stack -* A remote ZenML server deployed to the cloud -* The name of your Kubernetes context pointing to the remote cluster (optional, see below) - -## Configuring the Orchestrator - -There are two ways to configure the orchestrator: - -1. Using a [Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) to connect to the remote cluster (recommended for cloud-managed clusters). No local `kubectl` context needed. - -```bash -zenml orchestrator register --flavor kubeflow -zenml service-connector list-resources --resource-type kubernetes-cluster -e -zenml orchestrator connect --connector -zenml stack update -o -``` - -2. Configuring `kubectl` with a context pointing to the remote cluster and setting `kubernetes_context` in the orchestrator config: - -```bash -zenml orchestrator register \ - --flavor=kubeflow \ - --kubernetes_context= - -zenml stack update -o -``` - -## Running a Pipeline - -Once configured, you can run any ZenML pipeline using the Kubeflow Orchestrator: - -```bash -python your_pipeline.py -``` - -This will create a Kubernetes pod for each step in your pipeline. You can view pipeline runs in the Kubeflow UI.
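For reference, a minimal `your_pipeline.py` could look like the sketch below; the step and pipeline names are placeholders, and the active stack is assumed to be the one using the Kubeflow orchestrator registered above.

```py
from zenml import pipeline, step

@step
def say_hello() -> str:
    return "Hello from Kubeflow Pipelines!"

@pipeline
def kubeflow_example_pipeline() -> None:
    say_hello()

if __name__ == "__main__":
    # Runs on the active stack, i.e. the Kubeflow orchestrator configured above.
    kubeflow_example_pipeline()
```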
- -## Additional Configuration - -You can further configure the orchestrator using `KubeflowOrchestratorSettings`: - -```bash -from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import KubeflowOrchestratorSettings - -kubeflow_settings = KubeflowOrchestratorSettings( - client_args={}, - user_namespace="my_namespace", - pod_settings={ - "affinity": {...}, - "tolerations": [...] - } -) - -@pipeline( - settings={ - "orchestrator": kubeflow_settings - } -) -``` - -This allows specifying client arguments, user namespace, pod affinity/tolerations, and more. - -## Multi-Tenancy Deployments - -For multi-tenant Kubeflow deployments, specify the `kubeflow_hostname` ending in `/pipeline` when registering the orchestrator: - -```bash -zenml orchestrator register \ - --flavor=kubeflow \ - --kubeflow_hostname= # e.g. https://mykubeflow.example.com/pipeline -``` - -And provide the namespace, username and password in the orchestrator settings: - - -```bash -kubeflow_settings = KubeflowOrchestratorSettings( - client_username="admin", - client_password="abc123", - user_namespace="namespace_name" -) - -@pipeline( - settings={ - "orchestrator": kubeflow_settings - } -) -``` - -For more advanced options and details, refer to the [full Kubeflow Orchestrator documentation](/develop/stack-components/orchestrators/kubeflow). - diff --git a/develop/usage/development-monitoring/popular-integrations/kubernetes.mdx b/develop/usage/development-monitoring/popular-integrations/kubernetes.mdx deleted file mode 100644 index f7f682b..0000000 --- a/develop/usage/development-monitoring/popular-integrations/kubernetes.mdx +++ /dev/null @@ -1,65 +0,0 @@ ---- -title: "Kubernetes" -description: "Learn how to deploy ZenML pipelines on a Kubernetes cluster." -icon: terminal ---- - -The ZenML Kubernetes Orchestrator allows you to run your ML pipelines on a Kubernetes cluster without writing Kubernetes code. It's a lightweight alternative to more complex orchestrators like Airflow or Kubeflow. - - -If you only want to run individual steps of your pipeline in Kubernetes, check out our [Kubernetes Step Operator](/develop/stack-components/step-operators/kubernetes). - - -## Prerequisites - -To use the Kubernetes Orchestrator, you'll need: - -* ZenML `kubernetes` integration installed (`zenml integration install kubernetes`) -* Docker installed and running -* `kubectl` installed -* A remote artifact store and container registry in your ZenML stack -* A deployed Kubernetes cluster -* A configured `kubectl` context pointing to the cluster (optional, see below) - -## Deploying the Orchestrator - -You can deploy the orchestrator from the ZenML CLI: - -```bash -zenml orchestrator deploy k8s_orchestrator --flavor=kubernetes --provider= -``` - -## Configuring the Orchestrator - -There are two ways to configure the orchestrator: - -1. Using a [Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) to connect to the remote cluster. This is the recommended approach, especially for cloud-managed clusters. No local `kubectl` context is needed. - -```bash -zenml orchestrator register --flavor kubernetes -zenml service-connector list-resources --resource-type kubernetes-cluster -e -zenml orchestrator connect --connector -zenml stack register -o ... --set -``` - -1. 
Configuring `kubectl` with a context pointing to the remote cluster and setting the `kubernetes_context` in the orchestrator config: - -```bash -zenml orchestrator register \ - --flavor=kubernetes \ - --kubernetes_context= - -zenml stack register -o ... --set -``` - -## Running a Pipeline - -Once configured, you can run any ZenML pipeline using the Kubernetes Orchestrator: - -```bash -python your_pipeline.py -``` - -This will create a Kubernetes pod for each step in your pipeline. You can interact with the pods using `kubectl` commands. - -For more advanced configuration options and additional details, refer to the [full Kubernetes Orchestrator documentation](/develop/stack-components/orchestrators/kubernetes). diff --git a/develop/usage/development-monitoring/popular-integrations/mlflow.mdx b/develop/usage/development-monitoring/popular-integrations/mlflow.mdx deleted file mode 100644 index fa5a7fa..0000000 --- a/develop/usage/development-monitoring/popular-integrations/mlflow.mdx +++ /dev/null @@ -1,105 +0,0 @@ ---- -title: "MLflow" -description: "Learn how to use the MLflow Experiment Tracker with ZenML." -icon: code-fork ---- - -The ZenML MLflow Experiment Tracker integration and stack component allows you to log and visualize information from your pipeline steps using MLflow, without having to write extra MLflow code. - -## Prerequisites - -To use the MLflow Experiment Tracker, you'll need: - -* ZenML `mlflow` integration installed (`zenml integration install mlflow -y`) -* An MLflow deployment, either local (scenario 1) or remote with proxied artifact storage (scenario 5) - -## Configuring the Experiment Tracker - -There are two main MLflow deployment scenarios: - -1. Local (scenario 1): Use a local artifact store, only suitable for running ZenML locally. No extra configuration needed. - -```bash -zenml experiment-tracker register mlflow_experiment_tracker --flavor=mlflow - -zenml stack register custom_stack -e mlflow_experiment_tracker ... --set -``` - -1. Remote with proxied artifact storage (scenario 5): Can be used with any stack components. Requires authentication configuration. - -For remote, you'll need to configure authentication using one of: - -* Basic authentication (not recommended for production) -* ZenML secrets (recommended) - -To use ZenML secrets: - -```bash -zenml secret create mlflow_secret \ - --username= \ - --password= - -zenml experiment-tracker register mlflow \ - --flavor=mlflow \ - --tracking_username={{mlflow_secret.username}} \ - --tracking_password={{mlflow_secret.password}} \ - ... -``` - -## Using the Experiment Tracker - -To log information with MLflow in a pipeline step: - -1. Enable the experiment tracker using the `@step` decorator -2. Use MLflow's logging or auto-logging capabilities as usual - -```py -import mlflow - -@step(experiment_tracker="") -def train_step(...): - mlflow.tensorflow.autolog() - - mlflow.log_param(...) - mlflow.log_metric(...) - mlflow.log_artifact(...) - - ... -``` - -## Viewing Results - -You can find the URL to the MLflow experiment for a ZenML run: - -```py - -last_run = client.get_pipeline("").last_run -trainer_step = last_run.get_step("") -tracking_url = trainer_step.run_metadata["experiment_tracker_url"].value -``` - -This will link to your deployed MLflow instance UI, or the local MLflow experiment file. 
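For a self-contained version of the lookup above, including the client import that the shortened snippet omits (the pipeline and step names are placeholders you need to replace):

```py
from zenml.client import Client

client = Client()

# Fetch the last run of your pipeline and the step that used the experiment tracker.
last_run = client.get_pipeline("<PIPELINE_NAME>").last_run
trainer_step = last_run.get_step("<STEP_NAME>")

# The experiment tracker stores the MLflow experiment URL as run metadata.
tracking_url = trainer_step.run_metadata["experiment_tracker_url"].value
print(tracking_url)
```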
- -## Additional Configuration - -You can further configure the experiment tracker using `MLFlowExperimentTrackerSettings`: - -```py - -from zenml.integrations.mlflow.flavors.mlflow_experiment_tracker_flavor import MLFlowExperimentTrackerSettings - -mlflow_settings = MLFlowExperimentTrackerSettings( - nested=True, - tags={"key": "value"} -) - -@step( - experiment_tracker="", - settings={ - "experiment_tracker": mlflow_settings - } -) -``` - -For more details and advanced options, see the [full MLflow Experiment Tracker documentation](/develop/stack-components/experiment-trackers/mlflow). - diff --git a/develop/usage/development-monitoring/popular-integrations/popular-integrations.mdx b/develop/usage/development-monitoring/popular-integrations/popular-integrations.mdx deleted file mode 100644 index 14bca11..0000000 --- a/develop/usage/development-monitoring/popular-integrations/popular-integrations.mdx +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: "Overview" -description: "Use your favorite tools with ZenML." -icon: link-slash ---- - -ZenML is designed to work seamlessly with your favorite tools. This guide will show you how to integrate ZenML with some of the most popular tools in the data science and machine learning ecosystem. diff --git a/develop/usage/development-monitoring/popular-integrations/skypilot.mdx b/develop/usage/development-monitoring/popular-integrations/skypilot.mdx deleted file mode 100644 index 3c58dd0..0000000 --- a/develop/usage/development-monitoring/popular-integrations/skypilot.mdx +++ /dev/null @@ -1,92 +0,0 @@ ---- -title: "Skypilot" -description: "Use Skypilot with ZenML." -icon: code-commit ---- - -The ZenML SkyPilot VM Orchestrator allows you to provision and manage VMs on any supported cloud provider (AWS, GCP, Azure, Lambda Labs) for running your ML pipelines. It simplifies the process and offers cost savings and high GPU availability. - -## Prerequisites - -To use the SkyPilot VM Orchestrator, you'll need: - -* ZenML SkyPilot integration for your cloud provider installed (`zenml integration install skypilot_`) -* Docker installed and running -* A remote artifact store and container registry in your ZenML stack -* A remote ZenML deployment -* Appropriate permissions to provision VMs on your cloud provider -* A service connector configured to authenticate with your cloud provider (not needed for Lambda Labs) - -## Configuring the Orchestrator - -Configuration steps vary by cloud provider: - -AWS, GCP, Azure: - -1. Install the SkyPilot integration and connectors extra for your provider -2. Register a service connector with credentials that have SkyPilot's required permissions -3. Register the orchestrator and connect it to the service connector -4. Register and activate a stack with the new orchestrator - -```bash -zenml service-connector register -skypilot-vm -t --auto-configure -zenml orchestrator register --flavor vm_ -zenml orchestrator connect --connector -skypilot-vm -zenml stack register -o ... --set -``` - -Lambda Labs: - -1. Install the SkyPilot Lambda integration -2. Register a secret with your Lambda Labs API key -3. Register the orchestrator with the API key secret -4. Register and activate a stack with the new orchestrator - -```bash -zenml secret create lambda_api_key --scope user --api_key= -zenml orchestrator register --flavor vm_lambda --api_key={{lambda_api_key.api_key}} -zenml stack register -o ... --set -``` - -## Running a Pipeline - -Once configured, you can run any ZenML pipeline using the SkyPilot VM Orchestrator. 
Each step will run in a Docker container on a provisioned VM. - -## Additional Configuration - -You can further configure the orchestrator using cloud-specific `Settings` objects: - -```py -from zenml.integrations.skypilot_.flavors.skypilot_orchestrator__vm_flavor import SkypilotOrchestratorSettings - -skypilot_settings = SkypilotOrchestratorSettings( - cpus="2", - memory="16", - accelerators="V100:2", - use_spot=True, - region=, - ... -) - -@pipeline( - settings={ - "orchestrator": skypilot_settings - } -) -``` - -This allows specifying VM size, spot usage, region, and more. - -You can also configure resources per step: - -```py -high_resource_settings = SkypilotOrchestratorSettings(...) - -@step(settings={"orchestrator": high_resource_settings}) - -def resource_intensive_step(): - - ... -``` - -For more details and advanced options, see the [full SkyPilot VM Orchestrator documentation](/develop/stack-components/orchestrators/skypilot-vm). diff --git a/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/limitations-of-defining-steps-in-notebook-cells.mdx b/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/limitations-of-defining-steps-in-notebook-cells.mdx deleted file mode 100644 index 296dfbe..0000000 --- a/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/limitations-of-defining-steps-in-notebook-cells.mdx +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: "Limitations of defining steps in notebook cells" -description: "If you want to run ZenML steps defined in notebook cells remotely (either with a remote [orchestrator](/develop/stack-components/orchestrators) or [step operator](/develop/stack-components/step-operators)), the cells defining your steps must meet the following conditions" -icon: play ---- - - - -* The cell can only contain python code, no Jupyter magic commands or shell commands starting with a `%` or `!`. -* The cell **must not** call code from other notebook cells. Functions or classes imported from python files are allowed. -* The cell **must not** rely on imports of previous cells. This means your cell must perform all the imports it needs itself, including ZenML imports like `from zenml import step`. diff --git a/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-a-single-step-from-a-notebook.mdx b/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-a-single-step-from-a-notebook.mdx deleted file mode 100644 index 87c01d6..0000000 --- a/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-a-single-step-from-a-notebook.mdx +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: "Run a single step from a notebook" -icon: circle-exclamation -description: If you want to run just a single step remotely from a notebook, you can simply call the step as you would with a normal Python function. ---- - -ZenML will internally create a pipeline with just your step and run it on the active stack. - - - When defining a step that should be run remotely in a notebook, make sure you're aware of all the [limitations](/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/limitations-of-defining-steps-in-notebook-cells) that apply. - - -```py -from zenml import step -import pandas as pd -from sklearn.base import ClassifierMixin -from sklearn.svm import SVC - -# Configure the step to use a step operator. 
If you're not using -# a step operator, you can remove this and the step will run on -# your orchestrator instead. -@step(step_operator="") -def svc_trainer( - X_train: pd.DataFrame, - y_train: pd.Series, - gamma: float = 0.001, -) -> Tuple[ - Annotated[ClassifierMixin, "trained_model"], - Annotated[float, "training_acc"], -]: - """Train a sklearn SVC classifier.""" - - model = SVC(gamma=gamma) - model.fit(X_train.to_numpy(), y_train.to_numpy()) - - train_acc = model.score(X_train.to_numpy(), y_train.to_numpy()) - print(f"Train accuracy: {train_acc}") - - return model, train_acc - - -X_train = pd.DataFrame(...) -y_train = pd.Series(...) - -# Call the step directly. This will internally create a -# pipeline with just this step, which will be executed on -# the active stack. -model, train_acc = svc_trainer(X_train=X_train, y_train=y_train) -``` diff --git a/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-remote-steps-and-pipelines-from-notebooks.mdx b/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-remote-steps-and-pipelines-from-notebooks.mdx deleted file mode 100644 index 0179faa..0000000 --- a/develop/usage/development-monitoring/run-remote-steps-and-pipelines-from-notebooks/run-remote-steps-and-pipelines-from-notebooks.mdx +++ /dev/null @@ -1,14 +0,0 @@ ---- -title: "Overview" -description: "Use Jupyter Notebooks to run remote steps or pipelines" -icon: book ---- - -ZenML steps and pipelines can be defined in a Jupyter notebook and executed remotely. To do so, ZenML will extract the code from your notebook cells and run them as Python modules inside the Docker containers that execute your pipeline steps remotely. For this to work, the notebook cells in which you define your steps need to meet certain conditions. - -Learn more about it in the following sections: - - - - - diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/associate-a-pipeline-with-a-model.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/associate-a-pipeline-with-a-model.mdx deleted file mode 100644 index d8893bc..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/associate-a-pipeline-with-a-model.mdx +++ /dev/null @@ -1,51 +0,0 @@ ---- -title: "Associate a pipeline with a Model" -description: The most common use-case for a Model is to associate it with a pipeline. -icon: cart-plus ---- -```py -from zenml import pipeline -from zenml import Model - -@pipeline( - model=Model( - name="ClassificationModel", # Give your models unique names - tags=["MVP", "Tabular"] # Use tags for future filtering - ) -) -def my_pipeline(): - ... -``` - -This will associate this pipeline with the model specified. In case the model already exists, this will create a new version of that model. - -In case you want to attach the pipeline to an existing model version, specify this as well. - -```py -from zenml import pipeline -from zenml import Model -from zenml.enums import ModelStages - -@pipeline( - model=Model( - name="ClassificationModel", # Give your models unique names - tags=["MVP", "Tabular"], # Use tags for future filtering - version=ModelStages.LATEST # Alternatively use a stage: [STAGING, PRODUCTION]] - ) -) -def my_pipeline(): - ... -``` - -Feel free to also move the Model configuration into your configuration files: - -```yaml -... - -model: - name: text_classifier - description: A breast cancer classifier - tags: ["classifier","sgd"] - -... 
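-
-# Optionally, the `version` key from the Python example above can be set
-# under `model:` here as well, e.g. `version: production` (illustrative
-# value), to attach runs to an existing model version or stage.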
-``` diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/connecting-artifacts-via-a-model.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/connecting-artifacts-via-a-model.mdx deleted file mode 100644 index e1425b7..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/connecting-artifacts-via-a-model.mdx +++ /dev/null @@ -1,133 +0,0 @@ ---- -title: "Connecting artifacts via a Model" -description: "Structuring an MLOps project" -icon: link ---- - -Now that we've learned about managing [artifacts](/develop/user-guide/guides/starter-guide/manage-artifacts) and [models](/develop/user-guide/guides/starter-guide/track-ml-models), we can shift our attention again to the thing that brings them together: [Pipelines](/develop/user-guide/guides/starter-guide/create-an-ml-pipeline). This trifecta together will then inform how we structure our project. - - - In order to see the recommended repository structure of a ZenML MLOps project, read the [best practices](/develop/usage/project-setup/setting-up-a-project-repository/best-practices) section. - - -An MLOps project can often be broken down into many different pipelines. For example: - -* A `feature engineering` pipeline that prepares raw data into a format ready to get trained. -* A `training pipeline` that takes input data from a feature engineering pipeline and trains a models on it. -* An `inference pipeline` that runs batch predictions on the trained model and often takes pre-processing from the training pipeline. -* A deployment pipeline that deploys a trained model into a production endpoint. - -The lines between these pipelines can often get blurry: Some use cases call for these pipelines to be merged into one big pipeline. Others go further and break the pipeline down into even smaller chunks. Ultimately, the decision of how to structure your pipelines depends on the use case and requirements of the project. - -No matter how you design these pipelines, one thing stays consistent: you will often need to transfer or share information (in particular artifacts, models, and metadata) between pipelines. Here are some common patterns that you can use to help facilitate such an exchange: - -## Pattern 1: Artifact exchange between pipelines through `Client` - -Let's say we have a feature engineering pipeline and a training pipeline. The feature engineering pipeline is like a factory, pumping out many different datasets. Only a few of these datasets should be selected to be sent to the training pipeline to train an actual model. 
- - - - - -In this scenario, the [ZenML Client](/develop/sdk/python-client#client-methods) can be used to facilitate such an exchange: - -```py -from zenml import pipeline -from zenml.client import Client - -@pipeline -def feature_engineering_pipeline(): - dataset = load_data() - # This returns artifacts called "iris_training_dataset" and "iris_testing_dataset" - train_data, test_data = prepare_data() - -@pipeline -def training_pipeline(): - client = Client() - # Fetch by name alone - uses the latest version of this artifact - train_data = client.get_artifact_version(name="iris_training_dataset") - # For test, we want a particular version - test_data = client.get_artifact_version(name="iris_testing_dataset", version="raw_2023") - - # We can now send these directly into ZenML steps - sklearn_classifier = model_trainer(train_data) - model_evaluator(model, sklearn_classifier) -``` - - - Please note, that in the above example, the `train_data` and `test_data` artifacts are not [materialized](/develop/usage/resource-data-management/handle-data-artifacts/artifact-versioning) in memory in the `@pipeline` function, but rather the `train_data` and `test_data` objects are simply references to where this data is stored in the artifact store. Therefore, one cannot use any logic regarding the nature of this data itself during compilation time (i.e. in the `@pipeline` function). - - -## Pattern 2: Artifact exchange between pipelines through a `Model` - -While passing around artifacts with IDs or names is very useful, it is often desirable to have the ZenML Model be the point of reference instead. - -For example, let's say we have a training pipeline called `train_and_promote` and an inference pipeline called `do_predictions`. The training pipeline produces many different model artifacts, all of which are collected within a [ZenML Model](/develop/user-guide/guides/starter-guide/track-ml-models). Each time the `train_and_promote` pipeline runs, it creates a new `iris_classifier`. However, it only promotes the model to `production` if a certain accuracy threshold is met. The promotion can be also be done manually with human intervention, or it can be automated through setting a particular threshold. - -On the other side, the `do_predictions` pipeline simply picks up the latest promoted model and runs batch inference on it. It need not know of the IDs or names of any of the artifacts produced by the training pipeline's many runs. This way these two pipelines can independently be run, but can rely on each other's output. - - - - -In code, this is very simple. Once the [pipelines are configured to use a particular model](/develop/user-guide/guides/starter-guide/track-ml-models#configuring-a-model-in-a-pipeline), we can use `get_step_context` to fetch the configured model within a step directly. 
Assuming there is a `predict` step in the `do_predictions` pipeline, we can fetch the `production` model like so: - -```py -from zenml import step, get_step_context - -# IMPORTANT: Cache needs to be disabled to avoid unexpected behavior -@step(enable_cache=False) -def predict( - data: pd.DataFrame, -) -> Annotated[pd.Series, "predictions"]: - # model name and version are derived from pipeline context - model = get_step_context().model - - # Fetch the model directly from the model control plane - model = model.get_model_artifact("trained_model") - - # Make predictions - predictions = pd.Series(model.predict(data)) - return predictions -``` -However, this approach has the downside that if the step is cached, then it could lead to unexpected results. You could simply disable the cache in the above step or the corresponding pipeline. However, one other way of achieving this would be to resolve the artifact at the pipeline level: - -```py - -from typing_extensions import Annotated -from zenml import get_pipeline_context, pipeline, Model -from zenml.enums import ModelStages -import pandas as pd -from sklearn.base import ClassifierMixin - - -@step -def predict( - model: ClassifierMixin, - data: pd.DataFrame, -) -> Annotated[pd.Series, "predictions"]: - predictions = pd.Series(model.predict(data)) - return predictions - -@pipeline( - model=Model( - name="iris_classifier", - # Using the production stage - version=ModelStages.PRODUCTION, - ), -) -def do_predictions(): - # model name and version are derived from pipeline context - model = get_pipeline_context().model - inference_data = load_data() - predict( - # Here, we load in the `trained_model` from a trainer step - model=model.get_model_artifact("trained_model"), - data=inference_data, - ) - - -if __name__ == "__main__": - do_predictions() -``` -Ultimately, both approaches are fine. You should decide which one to use based on your own preferences. diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/delete-a-model.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/delete-a-model.mdx deleted file mode 100644 index 42481d1..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/delete-a-model.mdx +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: "Deleting a Model" -description: "Learn how to delete models." -icon: trash ---- - -Deleting a model or a specific model version means removing all links between the Model entity and artifacts + pipeline runs, and will also delete all metadata associated with that Model. - -## Deleting all versions of a model - - - -```bash - -zenml model delete -``` - - -```py - -from zenml.client import Client -Client().delete_model() -``` - - - - -## Delete a specific version of a model - - -```bash -zenml model version delete -``` - - -```py -from zenml.client import Client - -Client().delete_model_version() -``` - - diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/linking-model-binaries-data-to-models.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/linking-model-binaries-data-to-models.mdx deleted file mode 100644 index 15c7756..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/linking-model-binaries-data-to-models.mdx +++ /dev/null @@ -1,127 +0,0 @@ ---- -title: "Linking model binaries/data to a Model" -description: Artifacts generated during pipeline runs can be linked to models in ZenML. 
-icon: link ---- - -This connecting of artifacts provides lineage tracking and transparency into what data and models are used during training, evaluation, and inference. - -There are a few ways to link artifacts: - -## Configuring the Model at a pipeline level - -The easiest way is to configure the `model` parameter on the `@pipeline` decorator or `@step` decorator: - -```py -from zenml import Model, pipeline - -model = Model( - name="my_model", - version="1.0.0" -) - -@pipeline(model=model) -def my_pipeline(): - ... -``` - -This will automatically link all artifacts from this pipeline run to the specified model configuration. - -### Controlling artifact types and linkage - -A ZenML model supports linking three types of artifacts: - -* `Data artifacts`: These are the default artifacts. If nothing is specified, all artifacts are grouped under this category. -* `Model artifacts`: If there is a physical model artifact like a `.pkl` file or a model neural network weights file, it should be grouped in this category. -* `Deployment artifacts`: These artifacts are to do with artifacts related to the endpoints and deployments of the models. - -You can also explicitly specify the linkage on a per-artifact basis by passing a special configuration to the Annotated output: - -```py -from zenml import step, ArtifactConfig -from typing import Tuple -from typing_extensions import Annotated -import pandas as pd - -@step -def svc_trainer( - X_train: pd.DataFrame, - y_train: pd.Series, - gamma: float = 0.001, -) -> Tuple[ - # This third argument marks this as a Model Artifact - Annotated[ClassifierMixin, ArtifactConfig("trained_model", is_model_artifact=True)], - # This third argument marks this as a Data Artifact - Annotated[str, ArtifactConfig("deployment_uri", is_deployment_artifact=True)], -]: - ... -``` - -The `ArtifactConfig` object allows configuring model linkage directly on the artifact, and you specify whether it's for a model or deployment by using the `is_model_artifact` and `is_deployment_artifact` flags (as shown above) else it will be assumed to be a data artifact. - -## Saving intermediate artifacts - -It is often handy to save some of your work half-way: steps like epoch-based training can be running slow, and you don't want to lose any checkpoints along the way if an error occurs. You can use the `save_artifact` utility function to save your data assets as ZenML artifacts. Moreover, if your step has the Model context configured in the `@pipeline` or `@step` decorator it will be automatically linked to it, so you can get easy access to it using the Model Control Plane features. - -```py -from zenml import step, Model -from zenml.artifacts.utils import save_artifact -import pandas as pd -from typing_extensions import Annotated -from zenml.artifacts.artifact_config import ArtifactConfig - -@step(model=Model(name="MyModel", version="1.2.42")) -def trainer( - trn_dataset: pd.DataFrame, -) -> Annotated[ - ClassifierMixin, ArtifactConfig("trained_model", is_model_artifact=True) -]: # this configuration will be applied to `model` output - """Step running slow training.""" - ... - - for epoch in epochs: - checkpoint = model.train(epoch) - # this will save each checkpoint in `training_checkpoint` artifact - # with distinct version e.g. `1.2.42_0`, `1.2.42_1`, etc. - # Checkpoint artifacts will be linked to `MyModel` version `1.2.42` - # implicitly. - save_artifact( - data=checkpoint, - name="training_checkpoint", - version=f"1.2.42_{epoch}", - ) - - ... 
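-
-    # A checkpoint saved this way can later be fetched back through the
-    # Model Control Plane, e.g. (illustrative, from outside this step):
-    #   Model(name="MyModel", version="1.2.42").get_artifact("training_checkpoint")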
- - return model -``` - -## Link artifacts explicitly - -If you would like to link an artifact to a model not from the step context or even outside a step, you can use the `link_artifact_to_model` function. All you need is ready to link artifact and the configuration of a model. - - -```py -from zenml import step, Model, link_artifact_to_model, save_artifact -from zenml.client import Client - - -@step -def f_() -> None: - # produce new artifact - new_artifact = save_artifact(data="Hello, World!", name="manual_artifact") - # and link it inside a step - link_artifact_to_model( - artifact_version_id=new_artifact.id, - model=Model(name="MyModel", version="0.0.42"), - ) - - -# use existing artifact -existing_artifact = Client().get_artifact_version(name_id_or_prefix="existing_artifact") -# and link it even outside a step -link_artifact_to_model( - artifact_version_id=existing_artifact.id, - model=Model(name="MyModel", version="0.2.42"), -) -``` diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/load-a-model-in-code.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/load-a-model-in-code.mdx deleted file mode 100644 index 7295e12..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/load-a-model-in-code.mdx +++ /dev/null @@ -1,54 +0,0 @@ ---- -title: "Load a Model in code" -description: There are a few different ways to load a ZenML Model in code -icon: spinner ---- - -## Load the active model in a pipeline - -You can also use the [active model](/develop/user-guide/guides/starter-guide/track-ml-models) to get the model metadata, or the associated artifacts directly as described in the [starter guide](/develop/user-guide/guides/starter-guide/track-ml-models): - - -```py -from zenml import step, pipeline, get_step_context, pipeline, Model - -@pipeline(model=Model(name="my_model")) -def my_pipeline(): - ... - -@step -def my_step(): - # Get model from active step context - mv = get_step_context().model - - # Get metadata - print(mv.run_metadata["metadata_key"].value) - - # Directly fetch an artifact that is attached to the model - output = mv.get_artifact("my_dataset", "my_version") - output.run_metadata["accuracy"].value -``` - -## Load any model via the Client - -Alternatively, you can use the `Client`: - -```py -from zenml import step -from zenml.client import Client -from zenml.enums import ModelStages - -@step -def model_evaluator_step() - ... - # Get staging model version - try: - staging_zenml_model = Client().get_model_version( - model_name_or_id="", - model_version_name_or_number_or_id=ModelStages.STAGING, - ) - except KeyError: - staging_zenml_model = None - ... -``` - diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/load-artifacts-from-model.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/load-artifacts-from-model.mdx deleted file mode 100644 index 0d47798..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/load-artifacts-from-model.mdx +++ /dev/null @@ -1,69 +0,0 @@ ---- -title: "Load artifacts from Model" -description: "One of the more common use-cases for a Model is to pass artifacts between pipelines (a pattern we have seen [before](/develop/usage/development-monitoring/use-the-model-control-plane/connecting-artifacts-via-a-model))." -icon: spinner ---- - -However, when and how to load these artifacts is important to know as well. 
- -As an example, let's have a look at a two-pipeline project, where the first pipeline is running training logic and the second runs batch inference leveraging trained model artifact(s): - -```py -from typing_extensions import Annotated -from zenml import get_pipeline_context, pipeline, Model -from zenml.enums import ModelStages -import pandas as pd -from sklearn.base import ClassifierMixin - - -@step -def predict( - model: ClassifierMixin, - data: pd.DataFrame, -) -> Annotated[pd.Series, "predictions"]: - predictions = pd.Series(model.predict(data)) - return predictions - -@pipeline( - model=Model( - name="iris_classifier", - # Using the production stage - version=ModelStages.PRODUCTION, - ), -) -def do_predictions(): - # model name and version are derived from pipeline context - model = get_pipeline_context().model - inference_data = load_data() - predict( - # Here, we load in the `trained_model` from a trainer step - model=model.get_model_artifact("trained_model"), - data=inference_data, - ) - - -if __name__ == "__main__": - do_predictions() -``` - -In the example above we used `get_pipeline_context().model` property to acquire the model context in which the pipeline is running. During pipeline compilation this context will not yet have been evaluated, because `Production` model version is not a stable version name and another model version can become `Production` before it comes to the actual step execution. The same applies to calls like `model.get_model_artifact("trained_model")`; it will get stored in the step configuration for delayed materialization which will only happen during the step run itself. - -It is also possible to achieve the same using bare `Client` methods reworking the pipeline code as follows: - -```py -from zenml.client import Client - -@pipeline -def do_predictions(): - # model name and version are directly passed into client method - model = Client().get_model_version("iris_classifier", ModelStages.PRODUCTION) - inference_data = load_data() - predict( - # Here, we load in the `trained_model` from a trainer step - model=model.get_model_artifact("trained_model"), - data=inference_data, - ) -``` - -In this case the evaluation of the actual artifact will happen only when the step is actually running. - diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/model-versions.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/model-versions.mdx deleted file mode 100644 index f8d381d..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/model-versions.mdx +++ /dev/null @@ -1,131 +0,0 @@ ---- -title: "Controlling Model versions" -description: Each model can have many versions. Model versions are a way for you to track different iterations of your training process, complete with some extra dashboard and API functionality to support the full ML lifecycle. -icon: code-compare ---- - -E.g. Based on your business rules during training, you can associate model version with stages and promote them to production. You have an interface that allows you to link these versions with non-technical artifacts and data, e.g. business data, datasets, or even stages in your process and workflow. - -Model versions are created implicitly as you are running your machine learning training, so you don't have to immediately think about this. If you want more control over versions, our API has you covered, with an option to explicitly name your versions. 
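-
-For example, once a few training runs have gone through, you can inspect the versions that were created for a given model from the CLI (a minimal sketch, assuming a model named `my_model` has already been registered):
-
-```bash
-# List all versions that ZenML has tracked for this model so far
-zenml model version list my_model
-```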
- -## Explicitly name your model version - -If you want to explicitly name your model version, you can do so by passing in the `version` argument to the `Model` object. If you don't do this, ZenML will automatically generate a version number for you. - -```py -from zenml import Model, step, pipeline - -model= Model( - name="my_model", - version="1.0.5" -) - -# The step configuration will take precedence over the pipeline -@step(model=model) -def svc_trainer(...) -> ...: - ... - -# This configures it for all steps within the pipeline -@pipeline(model=model) -def training_pipeline( ... ): - # training happens here -``` - -Here we are specifically setting the model configuration for a particular step or for the pipeline as a whole. - - - Please note in the above example if the model version exists, it is automatically associated with the pipeline and becomes active in the pipeline context. Therefore, a user should be careful and intentional as to whether you want to create a new pipeline, or fetch an existing one. See below for an example of fetching a model from an existing version/stage. - -## Use name templates for your model versions - -If you want to continuously run the same project, but keep track of your model versions using semantical naming, you can rely on templated naming in the `version` argument to the `Model` object. Instead of static model version name from the previous section, templated names will be unique with every new run, but also will be semantically searchable and readable by your team. - -```py - -from zenml import Model, step, pipeline - -model= Model( - name="my_model", - version="experiment_with_phi_3_{date}_{time}" -) - -# The step configuration will take precedence over the pipeline -@step(model=model) -def llm_trainer(...) -> ...: - ... - -# This configures it for all steps within the pipeline -@pipeline(model=model) -def training_pipeline( ... ): - # training happens here -``` - -Here we are specifically setting the model configuration for a particular step or for the pipeline as a whole. Once you run this pipeline it will produce a model version with a name evaluated at a runtime, like `experiment_with_phi_3_2024_08_30_12_42_53`. Subsequent runs will also have unique but readable names. - -We currently support following placeholders to be used in model version name templates: - -* `{date}`: current date -* `{time}`: current time in UTC format - -## Fetching model versions by stage - -A common pattern is to assign a special `stage` to a model version, i.e. `production`, `staging`, `development` etc. This marks this version especially, and can be used to fetch it using a particular semantic meaning, disconnected from the concrete model version. A model version can be assigned a particular stage in the dashboard or by executing the following command in the CLI: - -```bash -zenml model version update MODEL_NAME --stage=STAGE -``` - -These stages can then be passed in as a `version` to fetch the right model version at a later point: - -```py -from zenml import Model, step, pipeline - -model= Model( - name="my_model", - version="production" -) - -# The step configuration will take precedence over the pipeline -@step(model=model) -def svc_trainer(...) -> ...: - ... - -# This configures it for all steps within the pipeline -@pipeline(model=model) -def training_pipeline( ... ): - # training happens here -``` - -## Autonumbering of versions - -ZenML automatically numbers your model versions for you. 
If you don't specify a version number, or if you pass `None` into the `version` argument of the `Model` object, ZenML will automatically generate a version number (or a new version, if you already have a version) for you. For example if we had a model version `really_good_version` for model `my_model` and we wanted to create a new version of this model, we could do so as follows: - - -```py -from zenml import Model, step - -model = Model( - name="my_model", - version="even_better_version" -) - -@step(model=model) -def svc_trainer(...) -> ...: - ... -``` - -A new model version will be created and ZenML will track that this is the next in the iteration sequence of the models using the `number` property. If `really_good_version` was the 5th version of `my_model`, then `even_better_version` will be the 6th version of `my_model`. - -```py -from zenml import Model - -earlier_version = Model( - name="my_model", - version="really_good_version" -).number # == 5 - -updated_version = Model( - name="my_model", - version="even_better_version" -).number # == 6 -``` diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/promote-a-model.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/promote-a-model.mdx deleted file mode 100644 index 52aeebc..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/promote-a-model.mdx +++ /dev/null @@ -1,91 +0,0 @@ ---- -title: "Promote a Model" -description: Model stages are a way to model the progress that different versions takes through various stages in its lifecycle. -icon: rectangle-ad ---- - -## Stages and Promotion - A ZenML Model version can be promoted to a different stage through the Dashboard, the ZenML CLI or code. - -This is a way to signify the progression of your model version through the ML lifecycle and are an extra layer of metadata to identify the state of a particular model version. Possible options for stages are: - -* `staging`: This version is staged for production. -* `production`: This version is running in a production setting. -* `latest`: The latest version of the model. This is a virtual stage to retrieve the latest version only - versions cannot be promoted to `latest`. -* `archived`: This is archived and no longer relevant. This stage occurs when a model moves out of any other stage. - -Your own particular business or use case logic will determine which model version you choose to promote, and you can do this in the following ways: - -### Promotion via CLI - -This is probably the least common way that you'll use, but it's still possible and perhaps might be useful for some use cases or within a CI system, for example. You simply use the following CLI subcommand: - - -```py -zenml model version update iris_logistic_regression --stage=... -``` - -### Promotion via Cloud Dashboard - -This feature is not yet available, but soon you will be able to promote your model versions directly from the ZenML Pro dashboard. - -### Promotion via Python SDK - -This is the most common way that you'll use to promote your models. 
You can see how you would do this here: - -```py -from zenml import Model - -MODEL_NAME = "iris_logistic_regression" -from zenml.enums import ModelStages - -model = Model(name=MODEL_NAME, version="1.2.3") -model.set_stage(stage=ModelStages.PRODUCTION) - -# get latest model and set it as Staging -# (if there is current Staging version it will get Archived) -latest_model = Model(name=MODEL_NAME, version=ModelStages.LATEST) -latest_model.set_stage(stage=ModelStages.STAGING) -``` - -Within a pipeline context, you would get the model from the step context but the mechanism for setting the stage is the same. - -```py -from zenml import get_step_context, step, pipeline -from zenml.enums import ModelStages - -@step -def promote_to_staging(): - model = get_step_context().model - model.set_stage(ModelStages.STAGING, force=True) - -@pipeline( - ... -) -def train_and_promote_model(): - ... - promote_to_staging(after=["train_and_evaluate"]) -``` - -## Fetching model versions by stage - -The stage can be used to load the right model version, by passing them in as a `version`: - -```py -from zenml import Model, step, pipeline - -model= Model( - name="my_model", - version="production" -) - -# The step configuration will take precedence over the pipeline -@step(model=model) -def svc_trainer(...) -> ...: - ... - -# This configures it for all steps within the pipeline -@pipeline(model=model) -def training_pipeline( ... ): - # training happens here -``` diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/register-a-model.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/register-a-model.mdx deleted file mode 100644 index 1ffd83f..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/register-a-model.mdx +++ /dev/null @@ -1,71 +0,0 @@ ---- -title: "Registering a Model" -description: Registering models can be done in a number of ways depending on your specific needs. -icon: id-card ---- - -You can explicitly register models using the CLI or the Python SDK, or you can just allow ZenML to implicitly register your models as part of a pipeline run. - - - If you are using [ZenML Pro](https://cloud.zenml.io/?utm%5Fsource=docs&utm%5Fmedium=referral%5Flink&utm%5Fcampaign=cloud%5Fpromotion&utm%5Fcontent=signup%5Flink/), you already have access to a dashboard interface that allows you to register models. - - -## Explicit CLI registration - -Registering models using the CLI is as straightforward as the following command: - - -```bash -zenml model register iris_logistic_regression --license=... --description=... -``` - -You can view some of the options of what can be passed into this command by running `zenml model register --help` but since you are using the CLI outside a pipeline run the arguments you can pass in are limited to non-runtime items. You can also associate tags with models at this point, for example, using the `--tag` option. - -## Explicit dashboard registration - -[ZenML Pro](https://zenml.io/pro) can register their models directly from the cloud dashboard interface. 
- - - - - -## Explicit Python SDK registration - -You can register a model using the Python SDK as follows: - -```py -from zenml import Model -from zenml.client import Client - -Client().create_model( - name="iris_logistic_regression", - license="Copyright (c) ZenML GmbH 2023", - description="Logistic regression model trained on the Iris dataset.", - tags=["regression", "sklearn", "iris"], -) -``` - -## Implicit registration by ZenML - -The most common use case for registering models is to do so implicitly as part of a pipeline run. This is done by specifying a `Model` object as part of the `model` argument of the `@pipeline` decorator. - -As an example, here we have a training pipeline which orchestrates the training of a model object, storing datasets and the model object itself as links within a newly created Model version. This integration is achieved by configuring the pipeline within a Model Context using `Model`. The name is specified, while other fields remain optional for this task. - -```py -from zenml import pipeline -from zenml import Model - -@pipeline( - enable_cache=False, - model=Model( - name="demo", - license="Apache", - description="Show case Model Control Plane.", - ), -) -def train_and_promote_model(): - ... -``` - -Running the training pipeline creates a new model version, all while maintaining a connection to the artifacts. - diff --git a/develop/usage/development-monitoring/use-the-model-control-plane/use-the-model-control-plane.mdx b/develop/usage/development-monitoring/use-the-model-control-plane/use-the-model-control-plane.mdx deleted file mode 100644 index 189e29f..0000000 --- a/develop/usage/development-monitoring/use-the-model-control-plane/use-the-model-control-plane.mdx +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: "Overview" -description: A ZenML Model is a concept that more broadly encapsulates your ML products business logic. -icon: plane ---- - - - - - - -A `Model` is simply an entity that groups pipelines, artifacts, metadata, and other crucial business data into a unified entity. You may even think of a ZenML Model as a "project" or a "workspace" - - - Please note that one of the most common artifacts that is associated with a Model in ZenML is the so-called technical model, which is the actually model file/files that holds the weight and parameters of a machine learning training result. However, this is not the only artifact that is relevant; artifacts such as the training data and the predictions this model produces in production are also linked inside a ZenML Model. - - -Models are first-class citizens in ZenML and as such viewing and using them is unified and centralized in the ZenML API, client as well as on the [ZenML Pro](https://zenml.io/pro) dashboard. - -A Model captures lineage information and more. Within a Model, different Model versions can be staged. For example, you can rely on your predictions at a specific stage, like `Production`, and decide whether the Model version should be promoted based on your business rules during training. Plus, accessing data from other Models and their versions is just as simple. - -The Model Control Plane is how you manage your models through this unified interface. It allows you to combine the logic of your pipelines, artifacts and crucial business data along with the actual 'technical model'. - -To see an end-to-end example, please refer to the [starter guide](/develop/user-guide/guides/starter-guide/track-ml-models). 
diff --git a/develop/usage/pipelines/build-pipelines/access-secrets-in-a-step.mdx b/develop/usage/pipelines/build-pipelines/access-secrets-in-a-step.mdx deleted file mode 100644 index a0aacc5..0000000 --- a/develop/usage/pipelines/build-pipelines/access-secrets-in-a-step.mdx +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: "Access secrets in a step" -description: "ZenML secrets are groupings of **key-value pairs** which are securely stored in the ZenML secrets store." -icon: stairs ---- - -## Fetching secret values in a step - - Additionally, a secret always has a **name** that allows you to fetch or reference them in your pipelines and stacks. In order to learn more about how to configure and create secrets, please refer to the [platform guide on secrets](/develop/usage/project-setup/use-secrets/secret-management). - -You can access secrets directly from within your steps through the ZenML `Client` API. This allows you to use your secrets for querying APIs from within your step without hard-coding your access keys: - -```py -from zenml import step -from zenml.client import Client - -from somewhere import authenticate_to_some_api - - -@step -def secret_loader() -> None: - """Load the example secret from the server.""" - # Fetch the secret from ZenML. - secret = Client().get_secret("") - - # `secret.secret_values` will contain a dictionary with all key-value - # pairs within your secret. - authenticate_to_some_api( - username=secret.secret_values["username"], - password=secret.secret_values["password"], - ) - ... -``` ---- - -### See Also: - - - - - - diff --git a/develop/usage/pipelines/build-pipelines/build-pipelines.mdx b/develop/usage/pipelines/build-pipelines/build-pipelines.mdx deleted file mode 100644 index c4afc31..0000000 --- a/develop/usage/pipelines/build-pipelines/build-pipelines.mdx +++ /dev/null @@ -1,73 +0,0 @@ ---- -title: "Overview" -description: "Building pipelines is as simple as adding the `@step` and `@pipeline` decorators to your code." -mode: wide -icon: link ---- - -```py -@step # Just add this decorator -def load_data() -> dict: - training_data = [[1, 2], [3, 4], [5, 6]] - labels = [0, 1, 0] - return {'features': training_data, 'labels': labels} - - -@step -def train_model(data: dict) -> None: - total_features = sum(map(sum, data['features'])) - total_labels = sum(data['labels']) - - # Train some model here - - print(f"Trained model using {len(data['features'])} data points. " - f"Feature sum is {total_features}, label sum is {total_labels}") - - -@pipeline # This function combines steps together -def simple_ml_pipeline(): - dataset = load_data() - train_model(dataset) -``` - -You can now run this pipeline by simply calling the function: - -```Bash -simple_ml_pipeline() -``` - -When this pipeline is executed, the run of the pipeline gets logged to the ZenML dashboard where you can now go to look at its DAG and all the associated metadata. To access the dashboard you need to have a ZenML server either running locally or remotely. See our documentation on this [here](/develop/getting-started/deploying-zenml). - - - - -Check below for more advanced ways to build and interact with your pipeline. 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - diff --git a/develop/usage/pipelines/build-pipelines/compose-pipelines.mdx b/develop/usage/pipelines/build-pipelines/compose-pipelines.mdx deleted file mode 100644 index 18a215e..0000000 --- a/develop/usage/pipelines/build-pipelines/compose-pipelines.mdx +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Compose pipelines" -description: "Reuse steps between pipelines." -icon: ghost ---- - -Sometimes it can be useful to extract some common functionality into separate functions in order to avoid code duplication. To facilitate this, ZenML allows you to compose your pipelines: - -```py -from zenml import pipeline - -@pipeline -def data_loading_pipeline(mode: str): - if mode == "train": - data = training_data_loader_step() - else: - data = test_data_loader_step() - - processed_data = preprocessing_step(data) - return processed_data - - -@pipeline -def training_pipeline(): - training_data = data_loading_pipeline(mode="train") - model = training_step(data=training_data) - test_data = data_loading_pipeline(mode="test") - evaluation_step(model=model, data=test_data) -``` - - - Here we are calling one pipeline from within another pipeline, so functionally the `data_loading_pipeline` is functioning as a step within the `training_pipeline`, i.e. the steps of the former are added to the latter. Only the parent pipeline will be visible in the dashboard. In order to actually trigger a pipeline from another, see [here](/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another) - - - - - - diff --git a/develop/usage/pipelines/build-pipelines/configuring-a-pipeline-at-runtime.mdx b/develop/usage/pipelines/build-pipelines/configuring-a-pipeline-at-runtime.mdx deleted file mode 100644 index 5b9587f..0000000 --- a/develop/usage/pipelines/build-pipelines/configuring-a-pipeline-at-runtime.mdx +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: "Configuring a pipeline at runtime" -description: "Configuring a pipeline at runtime." -icon: recycle ---- - -It is often the case that there is a need to run a pipeline with a different configuration. In this case, you should in most cases use the [pipeline.with\_options](/develop/usage/project-setup/use-configuration-files) method. You can do this: - -1. Either by explicitly configuring options like `with_options(steps="trainer": {"parameters": {"param1": 1}})` -2. Or by passing a YAML file using `with_options(config_file="path_to_yaml_file")`. - -You can learn more about these options [here](/develop/usage/project-setup/use-configuration-files). - -However, there is one exception: if you would like to trigger a pipeline from the client or another pipeline, you would need to pass the `PipelineRunConfiguration` object. Learn more about this [here](/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another). - - - - diff --git a/develop/usage/pipelines/build-pipelines/control-caching-behavior.mdx b/develop/usage/pipelines/build-pipelines/control-caching-behavior.mdx deleted file mode 100644 index f2772bf..0000000 --- a/develop/usage/pipelines/build-pipelines/control-caching-behavior.mdx +++ /dev/null @@ -1,40 +0,0 @@ ---- -title: "Control caching behavior" -description: "By default steps in ZenML pipelines are cached whenever code and parameters stay unchanged." -icon: database ---- - -```py -@step(enable_cache=True) # set cache behavior at step level -def load_data(parameter: int) -> dict: - ... 
- -@step(enable_cache=False) # settings at step level override pipeline level -def train_model(data: dict) -> None: - ... - -@pipeline(enable_cache=True) # set cache behavior at step level -def simple_ml_pipeline(parameter: int): - ... -``` - - - Caching only happens when code and parameters stay the same. - - -Like many other step and pipeline settings, you can also change this afterward: - - -```py -# Same as passing it in the step decorator -my_step.configure(enable_cache=...) - -# Same as passing it in the pipeline decorator -my_pipeline.configure(enable_cache=...) -``` ---- - - - - - diff --git a/develop/usage/pipelines/build-pipelines/control-execution-order-of-steps.mdx b/develop/usage/pipelines/build-pipelines/control-execution-order-of-steps.mdx deleted file mode 100644 index 46c3a5a..0000000 --- a/develop/usage/pipelines/build-pipelines/control-execution-order-of-steps.mdx +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Control execution order of steps" -description: By default, ZenML uses the data flowing between steps of your pipeline to determine the order in which steps get executed. -icon: list-ol ---- - - - -The following example shows a pipeline in which `step_3` depends on the outputs of `step_1` and `step_2`. This means that ZenML can execute both `step_1` and `step_2` in parallel but needs to wait until both are finished before `step_3` can be started. - -```py -from zenml import pipeline - -@pipeline -def example_pipeline(): - step_1_output = step_1() - step_2_output = step_2() - step_3(step_1_output, step_2_output) -``` - -If you have additional constraints on the order in which steps get executed, you can specify non-data dependencies by passing the invocation IDs of steps that should run before your step like this: `my_step(after="other_step")`. If you want to define multiple upstream steps, you can also pass a list for the `after` argument when calling your step: `my_step(after=["other_step", "other_step_2"])`. - - - Check out the [documentation here](/develop/usage/pipelines/build-pipelines/using-a-custom-step-invocation-id) to learn about the invocation ID and how to use a custom one for your steps. - - -```py -from zenml import pipeline - -@pipeline -def example_pipeline(): - step_1_output = step_1(after="step_2") - step_2_output = step_2() - step_3(step_1_output, step_2_output) -``` - -This pipeline is similar to the one explained above, but this time ZenML will make sure to only start `step_1` after `step_2` has finished. - diff --git a/develop/usage/pipelines/build-pipelines/delete-a-pipeline.mdx b/develop/usage/pipelines/build-pipelines/delete-a-pipeline.mdx deleted file mode 100644 index c3517c5..0000000 --- a/develop/usage/pipelines/build-pipelines/delete-a-pipeline.mdx +++ /dev/null @@ -1,46 +0,0 @@ ---- -title: "Deleting a pipeline" -description: "Learn how to delete pipelines." -icon: trash ---- - -In order to delete a pipeline, you can either use the CLI or the Python SDK: - - - - -```Bash -zenml pipeline delete -``` - - - -```Bash -from zenml.client import Client -Client().delete_pipeline() -``` - - - - - Deleting a pipeline does not automatically delete any of its associated runs or artifacts. 
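-
-If you also want to remove those runs, a minimal sketch might combine the client calls from this page (including the run deletion shown in the next section), assuming a pipeline named `my_pipeline`:
-
-```py
-from zenml.client import Client
-
-client = Client()
-
-# Fetch the pipeline first so its runs can still be enumerated
-pipeline = client.get_pipeline("my_pipeline")
-
-# Delete the runs (`pipeline.runs` returns the most recent runs first)
-for run in pipeline.runs:
-    client.delete_pipeline_run(run.id)
-
-# ... and then delete the pipeline itself
-client.delete_pipeline("my_pipeline")
-```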
- - -## Delete a pipeline run - -To delete a pipeline run, you can use the following CLI command or the client: - - -```Bash -zenml pipeline runs delete -``` - - -```Bash -from zenml.client import Client - -Client().delete_pipeline_run() -``` - - - diff --git a/develop/usage/pipelines/build-pipelines/fetching-pipelines.mdx b/develop/usage/pipelines/build-pipelines/fetching-pipelines.mdx deleted file mode 100644 index e6eeefb..0000000 --- a/develop/usage/pipelines/build-pipelines/fetching-pipelines.mdx +++ /dev/null @@ -1,370 +0,0 @@ ---- -title: "Fetching pipelines" -description: "Inspecting a finished pipeline run and its outputs." -icon: cart-plus ---- - -Once a pipeline run has been completed, we can access the corresponding information in code, which enables the following: - -* Loading artifacts like models or datasets saved by previous runs -* Accessing metadata or configurations of previous runs -* Programmatically inspecting the lineage of pipeline runs and their artifacts - -The hierarchy of pipelines, runs, steps, and artifacts is as follows: -```mermaid -graph LR - A[pipelines] -->|1:N| B[runs] - B -->|1:N| C[steps] - C -->|1:N| D[artifacts] - -``` -As you can see from the diagram, there are many layers of 1-to-N relationships. - -Let us investigate how to traverse this hierarchy level by level: - -## Pipelines - -### Get a pipeline via the client - -After you have run a pipeline at least once, you can also fetch the pipeline via the [Client.get\_pipeline()](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-client/#zenml.client.Client.get%5Fpipeline) method. - - -```py -from zenml.client import Client - -pipeline_model = Client().get_pipeline("first_pipeline") -``` - -Check out the [ZenML Client Documentation](/develop/sdk/python-client) for more information on the `Client` class and its purpose. - -### Discover and list all pipelines - -If you're not sure which pipeline you need to fetch, you can find a list of all registered pipelines in the ZenML dashboard, or list them programmatically either via the Client or the CLI. - - -You can use the [Client.list\_pipelines()](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-client/#zenml.client.Client.list%5Fpipelines) method to get a list of all pipelines registered in ZenML: -```py -from zenml.client import Client - -pipelines = Client().list_pipelines() -``` - - - -Alternatively, you can also list pipelines with the following CLI command: - -```bash -zenml pipeline list -``` - - - - -## Runs - -Each pipeline can be executed many times, resulting in several **Runs**. - -### Get all runs of a pipeline - -You can get a list of all runs of a pipeline using the `runs` property of the pipeline: - -```Bash -runs = pipeline_model.runs -``` - -The result will be a list of the most recent runs of this pipeline, ordered from newest to oldest. - - - Alternatively, you can also use the `pipeline_model.get_runs()` method which allows you to specify detailed parameters for filtering or pagination. See the [ZenML SDK Docs](/develop/sdk/python-client#list-of-resources) for more information. - - -### Get the last run of a pipeline - -To access the most recent run of a pipeline, you can either use the `last_run` property or access it through the `runs` list: - -```bash -last_run = pipeline_model.last_run # OR: pipeline_model.runs[0] -``` - - - If your most recent runs have failed, and you want to find the last run that has succeeded, you can use the `last_successful_run` property instead. 
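-
-As a small sketch, both properties are read the same way (reusing the `pipeline_model` object fetched earlier):
-
-```py
-# Most recent run, regardless of whether it failed
-last_run = pipeline_model.last_run
-
-# Most recent run that finished successfully
-last_successful_run = pipeline_model.last_successful_run
-```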
- - -### Get the latest run from a pipeline - -Calling a pipeline executes it and then returns the response of the freshly executed run. - -```bash -run = training_pipeline() -``` - - -The run that you get back is the model stored in the ZenML database at the point of the method call. This means the pipeline run is still initializing and no steps have been run. To get the latest state can get a refreshed version from the client: - -```py -from zenml.client import Client - -Client().get_pipeline_run(run.id) # to get a refreshed version -``` - - -### Get a run via the client - -If you already know the exact run that you want to fetch (e.g., from looking at the dashboard), you can use the [Client.get\_pipeline\_run()](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-client/#zenml.client.Client.get%5Fpipeline%5Frun) method to fetch the run directly without having to query the pipeline first: - -```py -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("first_pipeline-2023_06_20-16_20_13_274466") -``` - - - Similar to pipelines, you can query runs by either ID, name, or name prefix, and you can also discover runs through the Client or CLI via the [Client.list\_pipeline\_runs()](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-client/#zenml.client.Client.list%5Fpipeline%5Fruns) or `zenml pipeline runs list` commands. - - -### Run information - -Each run has a collection of useful information which can help you reproduce your runs. In the following, you can find a list of some of the most useful pipeline run information, but there is much more available. See the [PipelineRunResponse](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-models/#zenml.models.v2.core.pipeline%5Frun.PipelineRunResponse) definition for a comprehensive list. - -#### Status - -The status of a pipeline run. There are five possible states: initialized, failed, completed, running, and cached. - -```py -status = run.status -``` - -#### Configuration - -The `pipeline_configuration` is an object that contains all configurations of the pipeline and pipeline run, including the [pipeline-level settings](/develop/user-guide/guides/production-guide/configure-pipeline), which we will learn more about later: - -```Bash -pipeline_config = run.config -pipeline_settings = run.config.settings -``` - -#### Component-Specific metadata - -Depending on the stack components you use, you might have additional component-specific metadata associated with your run, such as the URL to the UI of a remote orchestrator. You can access this component-specific metadata via the `run_metadata` attribute: - -```Bash - -run_metadata = run.run_metadata -# The following only works for runs on certain remote orchestrators -orchestrator_url = run_metadata["orchestrator_url"].value - -## Steps - -Within a given pipeline run you can now further zoom in on individual steps using the `steps` attribute: - -```python -# get all steps of a pipeline for a given run -steps = run.steps - -# get a specific step by its invocation ID -step = run.steps["first_step"] -``` - - - If you're only calling each step once inside your pipeline, the **invocation ID** will be the same as the name of your step. For more complex pipelines, check out [this page](/develop/usage/pipelines/build-pipelines/using-a-custom-step-invocation-id) to learn more about the invocation ID. 
- - -### Inspect pipeline runs with our VS Code extension - - - - -If you are using [our VS Code extension](https://marketplace.visualstudio.com/items?itemName=ZenML.zenml-vscode), you can easily view your pipeline runs by opening the sidebar (click on the ZenML icon). You can then click on any particular pipeline run to see its status and some other metadata. If you want to delete a run, you can also do so from the same sidebar view. - -### Step information - -Similar to the run, you can use the `step` object to access a variety of useful information: - -* The parameters used to run the step via `step.config.parameters`, -* The step-level settings via `step.config.settings`, -* Component-specific step metadata, such as the URL of an experiment tracker or model deployer, via `step.run_metadata` - -See the [StepRunResponse](https://github.com/zenml-io/zenml/blob/main/src/zenml/models/v2/core/step%5Frun.py) definition for a comprehensive list of available information. - -## Artifacts - -Each step of a pipeline run can have multiple output and input artifacts that we can inspect via the `outputs` and `inputs` properties. - -To inspect the output artifacts of a step, you can use the `outputs` attribute, which is a dictionary that can be indexed using the name of an output. Alternatively, if your step only has a single output, you can use the `output` property as a shortcut directly: - -```Bash -# The outputs of a step are accessible by name -output = step.outputs["output_name"] - -# If there is only one output, you can use the `.output` property instead -output = step.output - -# use the `.load()` method to load the artifact into memory -my_pytorch_model = output.load() -``` - -Similarly, you can use the `inputs` and `input` properties to get the input artifacts of a step instead. - - - Check out [this page](/develop/user-guide/guides/starter-guide/manage-artifacts#giving-names-to-your-artifacts) to see what the output names of your steps are and how to customize them. - - -Note that the output of a step corresponds to a specific artifact version. - -### Fetching artifacts directly - -If you'd like to fetch an artifact or an artifact version directly, it is easy to do so with the `Client`: - -```py -from zenml.client import Client - -# Get artifact -artifact = Client().get_artifact('iris_dataset') -artifact.versions # Contains all the versions of the artifact -output = artifact.versions['2022'] # Get version name "2022" - -# Get artifact version directly: - -# Using version name: -output = Client().get_artifact_version('iris_dataset', '2022') - -# Using UUID -output = Client().get_artifact_version('f429f94c-fb15-43b5-961d-dbea287507c5') -loaded_artifact = output.load() -``` - -### Artifact information - -Regardless of how one fetches it, each artifact contains a lot of general information about the artifact as well as datatype-specific metadata and visualizations. - -#### Metadata - -All output artifacts saved through ZenML will automatically have certain datatype-specific metadata saved with them. NumPy Arrays, for instance, always have their storage size, `shape`, `dtype`, and some statistical properties saved with them. You can access such metadata via the `run_metadata` attribute of an output, e.g.: - -```Bash -output_metadata = output.run_metadata -storage_size_in_bytes = output_metadata["storage_size"].value -``` - -We will talk more about metadata [in the next section](/develop/user-guide/guides/starter-guide/manage-artifacts#logging-metadata-for-an-artifact). 
- -#### Visualizations - -ZenML automatically saves visualizations for many common data types. Using the `visualize()` method you can programmatically show these visualizations in Jupyter notebooks: - -```Bash -output.visualize() -``` - - - - - - If you're not in a Jupyter notebook, you can simply view the visualizations in the ZenML dashboard by running `zenml up` and clicking on the respective artifact in the pipeline run DAG instead. Check out the [artifact visualization page](/develop/usage/resource-data-management/visualize-artifacts) to learn more about how to build and view artifact visualizations in ZenML! - - -## Fetching information during run execution - -While most of this document has focused on fetching objects after a pipeline run has been completed, the same logic can also be used within the context of a running pipeline. - -This is often desirable in cases where a pipeline is running continuously over time and decisions have to be made according to older runs. - -For example, this is how we can fetch the last pipeline run of the same pipeline from within a ZenML step: - -```py -from zenml import get_step_context -from zenml.client import Client - -@step -def my_step(): - # Get the name of the current pipeline run - current_run_name = get_step_context().pipeline_run.name - - # Fetch the current pipeline run - current_run = Client().get_pipeline_run(current_run_name) - - # Fetch the previous run of the same pipeline - previous_run = current_run.pipeline.runs[1] # index 0 is the current run -``` - - - As shown in the example, we can get additional information about the current run using the `StepContext`, which is explained in more detail in the [advanced docs](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps). 
- - -## Code example - -This section combines all the code from this section into one simple script that you can use to see the concepts discussed above: - -Putting it all together, this is how we can load the model trained by the `svc_trainer` step of our example pipeline from the previous sections: -```py -from typing_extensions import Tuple, Annotated -import pandas as pd -from sklearn.datasets import load_iris -from sklearn.model_selection import train_test_split -from sklearn.base import ClassifierMixin -from sklearn.svm import SVC - -from zenml import pipeline, step -from zenml.client import Client - - -@step -def training_data_loader() -> Tuple[ - Annotated[pd.DataFrame, "X_train"], - Annotated[pd.DataFrame, "X_test"], - Annotated[pd.Series, "y_train"], - Annotated[pd.Series, "y_test"], -]: - """Load the iris dataset as tuple of Pandas DataFrame / Series.""" - iris = load_iris(as_frame=True) - X_train, X_test, y_train, y_test = train_test_split( - iris.data, iris.target, test_size=0.2, shuffle=True, random_state=42 - ) - return X_train, X_test, y_train, y_test - - -@step -def svc_trainer( - X_train: pd.DataFrame, - y_train: pd.Series, - gamma: float = 0.001, -) -> Tuple[ - Annotated[ClassifierMixin, "trained_model"], - Annotated[float, "training_acc"], -]: - """Train a sklearn SVC classifier and log to MLflow.""" - model = SVC(gamma=gamma) - model.fit(X_train.to_numpy(), y_train.to_numpy()) - train_acc = model.score(X_train.to_numpy(), y_train.to_numpy()) - print(f"Train accuracy: {train_acc}") - return model, train_acc - - -@pipeline -def training_pipeline(gamma: float = 0.002): - X_train, X_test, y_train, y_test = training_data_loader() - svc_trainer(gamma=gamma, X_train=X_train, y_train=y_train) - - -if __name__ == "__main__": - # You can run the pipeline and get the run object directly - last_run = training_pipeline() - print(last_run.id) - - # You can also use the class directly with the `model` object - last_run = training_pipeline.model.last_run - print(last_run.id) - - # OR you can fetch it after execution is finished: - pipeline = Client().get_pipeline("training_pipeline") - last_run = pipeline.last_run - print(last_run.id) - - # You can now fetch the model - trainer_step = last_run.steps["svc_trainer"] - model = trainer_step.outputs["trained_model"].load() -``` - - diff --git a/develop/usage/pipelines/build-pipelines/get-past-pipeline-step-runs.mdx b/develop/usage/pipelines/build-pipelines/get-past-pipeline-step-runs.mdx deleted file mode 100644 index 54f2f24..0000000 --- a/develop/usage/pipelines/build-pipelines/get-past-pipeline-step-runs.mdx +++ /dev/null @@ -1,20 +0,0 @@ ---- -title: "Get past pipeline/step runs" -icon: recycle ---- - -In order to get past pipeline/step runs, you can use the `get_pipeline` method in combination with the `last_run` property or just index into the runs: - -```py -from zenml.client import Client - -client = Client() - -# Retrieve a pipeline by its name -p = client.get_pipeline("mlflow_train_deploy_pipeline") - -# Get the latest run of this pipeline -latest_run = p.last_run -# Alternatively you can also access runs by index or name -first_run = p[0] -``` diff --git a/develop/usage/pipelines/build-pipelines/hyper-parameter-tuning.mdx b/develop/usage/pipelines/build-pipelines/hyper-parameter-tuning.mdx deleted file mode 100644 index b2d6310..0000000 --- a/develop/usage/pipelines/build-pipelines/hyper-parameter-tuning.mdx +++ /dev/null @@ -1,91 +0,0 @@ ---- -title: "Hyperparameter tuning" -description: "Running a hyperparameter tuning trial 
with ZenML." -icon: sliders ---- - - - Hyperparameter tuning is not yet a first-class citizen in ZenML, but it is [(high up) on our roadmap of features](https://zenml.featureos.app/p/enable-hyper-parameter-tuning) and will likely receive first-class ZenML support soon. In the meanwhile, the following example shows how hyperparameter tuning can currently be implemented within a ZenML run. - - -A basic iteration through a number of hyperparameters can be achieved with ZenML by using a simple pipeline like this: - -```py -@pipeline -def my_pipeline(step_count: int) -> None: - data = load_data_step() - after = [] - for i in range(step_count): - train_step(data, learning_rate=i * 0.0001, name=f"train_step_{i}") - after.append(f"train_step_{i}") - model = select_model_step(..., after=after) -``` - -This is an implementation of a basic grid search (across a single dimension) that would allow for a different learning rate to be used across the same `train_step`. Once that step has been run for all the different learning rates, the `select_model_step` finds which hyperparameters gave the best results or performance. - - -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates)_._ - -In [pipelines/training.py](https://github.com/zenml-io/zenml/blob/release/0.66.0/examples/e2e/pipelines/training.py), you will find a training pipeline with a `Hyperparameter tuning stage` section. It contains a `for` loop that runs the `hp_tuning_single_search` over the configured model search spaces, followed by the `hp_tuning_select_best_model` being executed after all search steps are completed. As a result, we are getting `best_model_config` to be used to train the best possible model later on. - -```py -... -########## Hyperparameter tuning stage ########## -after = [] -search_steps_prefix = "hp_tuning_search_" -for i, model_search_configuration in enumerate( - MetaConfig.model_search_space -): - step_name = f"{search_steps_prefix}{i}" - hp_tuning_single_search( - model_metadata=ExternalArtifact( - value=model_search_configuration, - ), - id=step_name, - dataset_trn=dataset_trn, - dataset_tst=dataset_tst, - target=target, - ) - after.append(step_name) -best_model_config = hp_tuning_select_best_model( - search_steps_prefix=search_steps_prefix, after=after -) -... -``` - - -The main challenge of this implementation is that it is currently not possible to pass a variable number of artifacts into a step programmatically, so the `select_model_step` needs to query all artifacts produced by the previous steps via the ZenML Client instead: - -```py -from zenml import step, get_step_context -from zenml.client import Client - -@step -def select_model_step(): - run_name = get_step_context().pipeline_run.name - run = Client().get_pipeline_run(run_name) - - # Fetch all models trained by a 'train_step' before - trained_models_by_lr = {} - for step_name, step in run.steps.items(): - if step_name.startswith("train_step"): - for output_name, output in step.outputs.items(): - if output_name == "": - model = output.load() - lr = step.config.parameters["learning_rate"] - trained_models_by_lr[lr] = model - - # Evaluate the models to find the best one - for lr, model in trained_models_by_lr.items(): - ... 
-``` - -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates)_._ - -In the `steps/hp_tuning` folder, you will find two step files, which can be used as a starting point for building your own hyperparameter search tailored specifically to your use case: - -* [hp\_tuning\_single\_search(...)](https://github.com/zenml-io/zenml/blob/release/0.66.0/examples/e2e/steps/hp%5Ftuning/hp%5Ftuning%5Fsingle%5Fsearch.py) is performing a randomized search for the best model hyperparameters in a configured space. -* [hp\_tuning\_select\_best\_model(...)](https://github.com/zenml-io/zenml/blob/release/0.66.0/examples/e2e/steps/hp%5Ftuning/hp%5Ftuning%5Fselect%5Fbest%5Fmodel.py) is searching for the best hyperparameters, looping other results of previous random searches to find the best model according to a defined metric. - diff --git a/develop/usage/pipelines/build-pipelines/name-your-pipeline-and-runs.mdx b/develop/usage/pipelines/build-pipelines/name-your-pipeline-and-runs.mdx deleted file mode 100644 index 331744b..0000000 --- a/develop/usage/pipelines/build-pipelines/name-your-pipeline-and-runs.mdx +++ /dev/null @@ -1,37 +0,0 @@ ---- -title: "Name your pipeline runs" -icon: signature ---- - -In the output logs of a pipeline run you will see the name of the run: - - -```shell -Pipeline run training_pipeline-2023_05_24-12_41_04_576473 has finished in 3.742s. -``` - -This name is automatically generated based on the current date and time. To change the name for a run, pass `run_name` as a parameter to the `with_options()` method: - - -```py -training_pipeline = training_pipeline.with_options( - run_name="custom_pipeline_run_name" -) -training_pipeline() -``` - -Pipeline run names must be unique, so if you plan to run your pipelines multiple times or run them on a schedule, make sure to either compute the run name dynamically or include one of the following placeholders that ZenML will replace: - -* `{{date}}` will resolve to the current date, e.g. `2023_02_19` -* `{{time}}` will resolve to the current time, e.g. `11_07_09_326492` - - -```py -training_pipeline = training_pipeline.with_options( - run_name=f"custom_pipeline_run_name_{{date}}_{{time}}" -) -training_pipeline() -``` - -Be sure to include the `f` string prefix to allow for the placeholders to be replaced, as shown in the example above. Without the `f` prefix, the placeholders will not be replaced. - diff --git a/develop/usage/pipelines/build-pipelines/retry-steps.mdx b/develop/usage/pipelines/build-pipelines/retry-steps.mdx deleted file mode 100644 index 7ac27d6..0000000 --- a/develop/usage/pipelines/build-pipelines/retry-steps.mdx +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: "Automatically retry steps" -description: "Automatically configure your steps to retry if they fail." -icon: rotate ---- - -ZenML provides a built-in retry mechanism that allows you to configure automatic retries for your steps in case of failures. This can be useful when dealing with intermittent issues or transient errors. A common pattern when trying to run a step on GPU-backed hardware is that the provider will not have enough resources available, so you can set ZenML to handle the retries until the resources free up. You can configure three parameters for step retries: - -* **max\_retries:** The maximum number of times the step should be retried in case of failure. 
-* **delay:** The initial delay in **seconds** before the first retry attempt. -* **backoff:** The factor by which the delay should be multiplied after each retry attempt. - -## Using the @step decorator: - -You can specify the retry configuration directly in the definition of your step as follows: - - -```py -from zenml.config.retry_config import StepRetryConfig - -@step( - retry=StepRetryConfig( - max_retries=3, - delay=10, - backoff=2 - ) -) -def my_step() -> None: - raise Exception("This is a test exception") -steps: - my_step: - retry: - max_retries: 3 - delay: 10 - backoff: 2 -``` - - - Note that infinite retries are not supported at the moment. If you set `max_retries` to a very large value or do not specify it at all, ZenML will still enforce an internal maximum number of retries to prevent infinite loops. We recommend setting a reasonable `max_retries` value based on your use case and the expected frequency of transient failures. - - ---- - -### See Also: - - - - - diff --git a/develop/usage/pipelines/build-pipelines/run-an-individual-step.mdx b/develop/usage/pipelines/build-pipelines/run-an-individual-step.mdx deleted file mode 100644 index d8b6a1b..0000000 --- a/develop/usage/pipelines/build-pipelines/run-an-individual-step.mdx +++ /dev/null @@ -1,62 +0,0 @@ ---- -title: "Run an individual step" -description: If you want to run just an individual step on your stack, you can simply call the step as you would with a normal Python function. ZenML will internally create a pipeline with just your step and run it on the active stack. -icon: shoe-prints ---- - - - The pipeline run that will happen to execute your step will be `unlisted`, which means it will not be associated with any pipeline. You can still see it in the "Runs" tab of the dashboard. - - -```py -from zenml import step -import pandas as pd -from sklearn.base import ClassifierMixin -from sklearn.svm import SVC - -# Configure the step to use a step operator. If you're not using -# a step operator, you can remove this and the step will run on -# your orchestrator instead. -@step(step_operator="") -def svc_trainer( - X_train: pd.DataFrame, - y_train: pd.Series, - gamma: float = 0.001, -) -> Tuple[ - Annotated[ClassifierMixin, "trained_model"], - Annotated[float, "training_acc"], -]: - """Train a sklearn SVC classifier.""" - - model = SVC(gamma=gamma) - model.fit(X_train.to_numpy(), y_train.to_numpy()) - - train_acc = model.score(X_train.to_numpy(), y_train.to_numpy()) - print(f"Train accuracy: {train_acc}") - - return model, train_acc - - -X_train = pd.DataFrame(...) -y_train = pd.Series(...) - -# Call the step directly. This will internally create a -# pipeline with just this step, which will be executed on -# the active stack. -model, train_acc = svc_trainer(X_train=X_train, y_train=y_train) -``` - -## Run the underlying step function directly - -If you instead want to run your step function without ZenML getting involved, you can use the `entrypoint(...)` method of a step: - -```py -X_train = pd.DataFrame(...) -y_train = pd.Series(...) - -model, train_acc = svc_trainer.entrypoint(X_train=X_train, y_train=y_train) -``` - - - If you want to make this the default behavior when calling a step, you can set the `ZENML_RUN_SINGLE_STEPS_WITHOUT_STACK` environment variable to `True`. Once you do that, calling `svc_trainer(...)` will simply call the underlying function and not use your ZenML stack. 
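As a minimal sketch (assuming the environment variable is read when the step is called), this could look like the following:

```py
import os

# Assumption: ZenML checks this variable at call time, so setting it here makes
# `svc_trainer(...)` execute as a plain Python function without the active stack.
os.environ["ZENML_RUN_SINGLE_STEPS_WITHOUT_STACK"] = "True"

model, train_acc = svc_trainer(X_train=X_train, y_train=y_train)
```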
- diff --git a/develop/usage/pipelines/build-pipelines/run-pipelines-asynchronously.mdx b/develop/usage/pipelines/build-pipelines/run-pipelines-asynchronously.mdx deleted file mode 100644 index 34d0034..0000000 --- a/develop/usage/pipelines/build-pipelines/run-pipelines-asynchronously.mdx +++ /dev/null @@ -1,32 +0,0 @@ ---- -title: "Run pipelines asynchronously" -description: "The best way to trigger a pipeline run so that it runs in the background" -icon: play ---- - -By default your pipelines will run synchronously. This means your terminal will follow along the logs as the pipeline is being built/runs. - -This behavior can be changed in multiple ways. Either the orchestrator can be configured to always run asynchronously by setting `synchronous=False`. The other option is to temporarily set this at the pipeline configuration level during runtime. - - -```py -from zenml import pipeline - -@pipeline(settings = {"orchestrator": {"synchronous": False}}) -def my_pipeline(): - ... -``` - -or in a yaml config file: - -```yaml -settings: - orchestrator.: - synchronous: false -``` - ---- - - - - diff --git a/develop/usage/pipelines/build-pipelines/schedule-a-pipeline.mdx b/develop/usage/pipelines/build-pipelines/schedule-a-pipeline.mdx deleted file mode 100644 index 1626540..0000000 --- a/develop/usage/pipelines/build-pipelines/schedule-a-pipeline.mdx +++ /dev/null @@ -1,67 +0,0 @@ ---- -title: "Schedule a pipeline" -description: "Learn how to set, pause and stop a schedule for pipelines." -icon: calendar ---- - - - Schedules don't work for all orchestrators. Here is a list of all supported orchestrators. - - -| Orchestrator | Scheduling Support | -| ------------------------------------------------------------------------- | ------------------ | -| [AirflowOrchestrator](/develop/stack-components/orchestrators/airflow) | | -| [AzureMLOrchestrator](/develop/stack-components/orchestrators/azureml) | | -| [DatabricksOrchestrator](/develop/stack-components/orchestrators/databricks) | | -| [HyperAIOrchestrator](/develop/stack-components/orchestrators/hyperai) | | -| [KubeflowOrchestrator](/develop/stack-components/orchestrators/kubeflow) | | -| [KubernetesOrchestrator](/develop/stack-components/orchestrators/kubernetes) | | -| [LocalOrchestrator](/develop/stack-components/orchestrators/local) | | -| [LocalDockerOrchestrator](/develop/stack-components/orchestrators/local-docker) | | -| [SagemakerOrchestrator](/develop/stack-components/orchestrators/sagemaker) | | -| [SkypilotAWSOrchestrator](/develop/stack-components/orchestrators/skypilot-vm) | | -| [SkypilotAzureOrchestrator](/develop/stack-components/orchestrators/skypilot-vm) | | -| [SkypilotGCPOrchestrator](/develop/stack-components/orchestrators/skypilot-vm) | | -| [SkypilotLambdaOrchestrator](/develop/stack-components/orchestrators/skypilot-vm) | | -| [TektonOrchestrator](/develop/stack-components/orchestrators/tekton) | | -| [VertexOrchestrator](/develop/stack-components/orchestrators/vertex) | | - -### Set a schedule - -```py -from zenml.config.schedule import Schedule -from zenml import pipeline -from datetime import datetime - -@pipeline() -def my_pipeline(...): - ... 
- -# Use cron expressions -schedule = Schedule(cron_expression="5 14 * * 3") -# or alternatively use human-readable notations -schedule = Schedule(start_time=datetime.now(), interval_second=1800) - -my_pipeline = my_pipeline.with_options(schedule=schedule) -my_pipeline() -``` - - - Check out our [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.schedule.Schedule) to learn more about the different scheduling options. - - -### Pause/Stop a schedule - -The way pipelines are scheduled depends on the orchestrator you are using. For example, if you are using Kubeflow, you can use the Kubeflow UI to stop or pause a scheduled run. However, the exact steps for stopping or pausing a scheduled run may vary depending on the orchestrator you are using. We recommend consulting the documentation for your orchestrator to learn the current method for stopping or pausing a scheduled run. - - - Note that ZenML only gets involved to schedule a run, but maintaining the lifecycle of the schedule (as explained above) is the responsibility of the user. If you run a pipeline containing a schedule two times, two scheduled pipelines (with different/unique names) will be created. - - ---- - -### See Also: - - - - diff --git a/develop/usage/pipelines/build-pipelines/step-output-typing-and-annotation.mdx b/develop/usage/pipelines/build-pipelines/step-output-typing-and-annotation.mdx deleted file mode 100644 index 7cf59e0..0000000 --- a/develop/usage/pipelines/build-pipelines/step-output-typing-and-annotation.mdx +++ /dev/null @@ -1,118 +0,0 @@ ---- -title: "Step output typing and annotation" -description: "Step outputs are stored in your artifact store. Annotate and name them to make more explicit." -icon: tag ---- - -## Type annotations - -Your functions will work as ZenML steps even if you don't provide any type annotations for their inputs and outputs. However, adding type annotations to your step functions gives you lots of additional benefits: - -* **Type validation of your step inputs**: ZenML makes sure that your step functions receive an object of the correct type from the upstream steps in your pipeline. -* **Better serialization**: Without type annotations, ZenML uses [Cloudpickle](https://github.com/cloudpipe/cloudpickle) to serialize your step outputs. When provided with type annotations, ZenML can choose a [materializer](/develop/getting-started/core-concepts#materializers) that is best suited for the output. In case none of the builtin materializers work, you can even [write a custom materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types). - - - ZenML provides a built-in [CloudpickleMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.cloudpickle%5Fmaterializer.CloudpickleMaterializer) that can handle any object by saving it with [cloudpickle](https://github.com/cloudpipe/cloudpickle). However, this is not production-ready because the resulting artifacts cannot be loaded when running with a different Python version. In such cases, you should consider building a [custom Materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types#custom-materializers) to save your objects in a more robust and efficient format. - - -Moreover, using the `CloudpickleMaterializer` could allow users to upload of any kind of object. This could be exploited to upload a malicious file, which could execute arbitrary code on the vulnerable system. 
- - -```py -from typing import Tuple -from zenml import step - -@step -def square_root(number: int) -> float: - return number ** 0.5 - -# To define a step with multiple outputs, use a `Tuple` type annotation -@step -def divide(a: int, b: int) -> Tuple[int, int]: - return a // b, a % b -``` - -If you want to make sure you get all the benefits of type annotating your steps, you can set the environment variable `ZENML_ENFORCE_TYPE_ANNOTATIONS` to `True`. ZenML will then raise an exception in case one of the steps you're trying to run is missing a type annotation. - -### Tuple vs multiple outputs - -It is impossible for ZenML to detect whether you want your step to have a single output artifact of type `Tuple` or multiple output artifacts just by looking at the type annotation. - -We use the following convention to differentiate between the two: When the `return` statement is followed by a tuple literal (e.g. `return 1, 2` or `return (value_1, value_2)`) we treat it as a step with multiple outputs. All other cases are treated as a step with a single output of type `Tuple`. - - -```py -from zenml import step -from typing_extensions import Annotated -from typing import Tuple - -# Single output artifact -@step -def my_step() -> Tuple[int, int]: - output_value = (0, 1) - return output_value - -# Single output artifact with variable length -@step -def my_step(condition) -> Tuple[int, ...]: - if condition: - output_value = (0, 1) - else: - output_value = (0, 1, 2) - - return output_value - -# Single output artifact using the `Annotated` annotation -@step -def my_step() -> Annotated[Tuple[int, ...], "my_output"]: - return 0, 1 - - -# Multiple output artifacts -@step -def my_step() -> Tuple[int, int]: - return 0, 1 - - -# Not allowed: Variable length tuple annotation when using -# multiple output artifacts -@step -def my_step() -> Tuple[int, ...]: - return 0, 1 -``` - -## Step output names - -By default, ZenML uses the output name `output` for single output steps and `output_0, output_1, ...` for steps with multiple outputs. These output names are used to display your outputs in the dashboard and [fetch them after your pipeline is finished](/develop/usage/pipelines/build-pipelines/fetching-pipelines). - -If you want to use custom output names for your steps, use the `Annotated` type annotation: - -```py -from typing_extensions import Annotated # or `from typing import Annotated on Python 3.9+ -from typing import Tuple -from zenml import step - -@step -def square_root(number: int) -> Annotated[float, "custom_output_name"]: - return number ** 0.5 - -@step -def divide(a: int, b: int) -> Tuple[ - Annotated[int, "quotient"], - Annotated[int, "remainder"] -]: - return a // b, a % b -``` - - - If you do not give your outputs custom names, the created artifacts will be named `{pipeline_name}::{step_name}::output` or `{pipeline_name}::{step_name}::output_{i}` in the dashboard. See the [documentation on artifact versioning and configuration](/develop/user-guide/guides/starter-guide/manage-artifacts) for more information. - - ---- - -### See Also: - - - - - diff --git a/develop/usage/pipelines/build-pipelines/use-failure-success-hooks.mdx b/develop/usage/pipelines/build-pipelines/use-failure-success-hooks.mdx deleted file mode 100644 index 95f3082..0000000 --- a/develop/usage/pipelines/build-pipelines/use-failure-success-hooks.mdx +++ /dev/null @@ -1,265 +0,0 @@ ---- -title: "Use failure/success hooks" -description: "Running failure and success hooks after step execution." 
-icon: basketball ---- - -Hooks are a way to perform an action after a step has completed execution. They can be useful in a variety of scenarios, such as sending notifications, logging, or cleaning up resources after a step has been completed. - -A hook executes right after step execution, within the same environment as the step, therefore it has access to all the dependencies that a step has. Currently, there are two sorts of hooks that can be defined: `on_failure` and `on_success` . - -* `on_failure`: This hook triggers in the event of a step failing. -* `on_success`: This hook triggers in the event of a step succeeding. - -Here is a short demo for hooks in ZenML: - - - - - -## Defining hooks - -A hook can be defined as a callback function, and must be accessible within the repository where the pipeline and steps are located. - -In case of failure hooks, you can optionally add a `BaseException` argument to the hook, allowing you to access the concrete Exception that caused your step to fail: - -```py -from zenml import step - -def on_failure(exception: BaseException): - print(f"Step failed: {str(exception)}") - - -def on_success(): - print("Step succeeded!") - - -@step(on_failure=on_failure) -def my_failing_step() -> int: - """Returns an integer.""" - raise ValueError("Error") - - -@step(on_success=on_success) -def my_successful_step() -> int: - """Returns an integer.""" - return 1 -``` - -In this example, we define two hooks: `on_failure` and `on_success`, which print a message when the step fails or succeeds, respectively. We then use these hooks with two steps, `my_failing_step` and `my_successful_step`. When `my_failing_step` is executed, it raises a `ValueError`, which triggers the `on_failure` hook. Similarly, when `my_successful_step` is executed, it returns an integer successfully, which triggers the on\_success hook. - -A step can also be specified as a local user-defined function path (of the form `mymodule.myfile.my_function`). This is particularly useful when defining the hooks via a [YAML Config](/develop/usage/project-setup/use-configuration-files). - -## Defining hooks on a pipeline level - -In some cases, there is a need to define a hook on all steps of a given pipeline. Rather than having to define it on all steps individually, you can also specify any hook on the pipeline level. - -```py - -@pipeline(on_failure=on_failure, on_success=on_success) -def my_pipeline(...): - ... -``` - - -Note, that **step-level** defined hooks take **precedence** over **pipeline-level** defined hooks. - - - -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates)_._ - -In [steps/alerts/notify\_on.py](https://github.com/zenml-io/zenml/blob/release/0.66.0/examples/e2e/steps/alerts/notify%5Fon.py), you will find a step to notify the user about success and a function used to notify the user about step failure using the [Alerter](/develop/stack-components/alerters) from the active stack. - -We use `@step` for success notification to only notify the user about a fully successful pipeline run and not about every successful step. - -In [pipelines/training.py](https://github.com/zenml-io/zenml/blob/release/0.66.0/examples/e2e/pipelines/training.py), you can find the usage of a notification step and a function. We will attach a `notify_on_failure` function directly to the pipeline definition like this: - -```py -from zenml import pipeline -@pipeline( - ... 
- on_failure=notify_on_failure, - ... -) -``` -At the very end of the training pipeline, we will execute the `notify_on_success` step, but only after all other steps have finished - we control it with `after` statement as follows: - -```Bash -... -last_step_name = "promote_metric_compare_promoter" - -notify_on_success(after=[last_step_name]) -... -``` - - - -## Accessing step information inside a hook - -Similar as for regular ZenML steps, you can use the [StepContext](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps) to access information about the current pipeline run or step inside your hook function: - - -```py -from zenml import step, get_step_context - -def on_failure(exception: BaseException): - context = get_step_context() - print(context.step_run.name) # Output will be `my_step` - print(context.step_run.config.parameters) # Print parameters of the step - print(type(exception)) # Of type value error - print("Step failed!") - - -@step(on_failure=on_failure) -def my_step(some_parameter: int = 1) - raise ValueError("My exception") -``` - -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates)_._ - -In [steps/alerts/notify\_on.py](https://github.com/zenml-io/zenml/blob/release/0.66.0/examples/e2e/steps/alerts/notify%5Fon.py), you will find a step to notify the user about success and a function used to notify the user about step failure using the [Alerter](/develop/stack-components/alerters) from the active stack. - -We use `@step` for success notification to only notify the user about a fully successful pipeline run and not about every successful step. - -Inside the helper function `build_message()`, you will find an example on how developers can work with [StepContext](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps) to form a proper notification: - -```py -from zenml import get_step_context - -def build_message(status: str) -> str: - """Builds a message to post. - - Args: - status: Status to be set in text. - - Returns: - str: Prepared message. - """ - step_context = get_step_context() - run_url = get_run_url(step_context.pipeline_run) - - return ( - f"Pipeline `{step_context.pipeline.name}` [{str(step_context.pipeline.id)}] {status}!\n" - f"Run `{step_context.pipeline_run.name}` [{str(step_context.pipeline_run.id)}]\n" - f"URL: {run_url}" - ) - -@step(enable_cache=False) -def notify_on_success() -> None: - """Notifies user on pipeline success.""" - step_context = get_step_context() - if alerter and step_context.pipeline_run.config.extra["notify_on_success"]: - alerter.post(message=build_message(status="succeeded")) -``` - - - -## Linking to the `Alerter` Stack component - -A common use case is to use the [Alerter](/develop/stack-components/alerters) component inside the failure or success hooks to notify relevant people. It is quite easy to do this: - -```py - -from zenml import get_step_context -from zenml.client import Client - -def on_failure(): - step_name = get_step_context().step_run.name - Client().active_stack.alerter.post(f"{step_name} just failed!") -``` - -ZenML provides standard failure and success hooks that use the alerter you have configured in your stack. 
Here's an example of how to use them in your pipelines: - -```py -from zenml.hooks import alerter_success_hook, alerter_failure_hook - - -@step(on_failure=alerter_failure_hook, on_success=alerter_success_hook) -def my_step(...): - ... -``` - -```py -from zenml.client import Client -from zenml import get_step_context - -alerter = Client().active_stack.alerter - -def notify_on_failure() -> None: - """Notifies user on step failure. Used in Hook.""" - step_context = get_step_context() - if alerter and step_context.pipeline_run.config.extra["notify_on_failure"]: - alerter.post(message=build_message(status="failed")) -``` - -If the Al component is not present in Stack we suppress notification, but you can also dump it to the log as Error using: - -```py -from zenml.client import Client -from zenml.logger import get_logger -from zenml import get_step_context - -logger = get_logger(__name__) -alerter = Client().active_stack.alerter - -def notify_on_failure() -> None: - """Notifies user on step failure. Used in Hook.""" - step_context = get_step_context() - if step_context.pipeline_run.config.extra["notify_on_failure"]: - if alerter: - alerter.post(message=build_message(status="failed")) - else: - logger.error(message=build_message(status="failed")) -``` - - - -_To set up the local environment used below, follow the recommendations from the_ [_Project templates_](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates)_._ - -In [steps/alerts/notify\_on.py](https://github.com/zenml-io/zenml/blob/release/0.66.0/examples/e2e/steps/alerts/notify%5Fon.py), you will find a step to notify the user about success and a function used to notify the user about step failure using the [Alerter](/develop/stack-components/alerters) from the active stack. - -We use `@step` for success notification to only notify the user about a fully successful pipeline run and not about every successful step. - -Inside this code file, you can find how developers can work with Al component to send notification messages across configured channels: - - -## Using the OpenAI ChatGPT failure hook - -The OpenAI ChatGPT failure hook is a hook that uses the OpenAI integration to generate a possible fix for whatever exception caused the step to fail. It is quite easy to use. (You will need [a valid OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key) that has correctly set up billing for this.) - - -Note that using this integration will incur charges on your OpenAI account. - -First, ensure that you have the OpenAI integration installed and have stored your API key within a ZenML secret: - -```shell -zenml integration install openai -zenml secret create openai --api_key= -``` - -Then, you can use the hook in your pipeline: - -```py -from zenml.integration.openai.hooks import openai_chatgpt_alerter_failure_hook -from zenml import step - -@step(on_failure=openai_chatgpt_alerter_failure_hook) -def my_step(...): - ... -``` - -If you had set up a Slack alerter as your alerter, for example, then you would see a message like this: - - - - - -You can use the suggestions as input that can help you fix whatever is going wrong in your code. If you have GPT-4 enabled for your account, you can use the `openai_gpt4_alerter_failure_hook` hook instead (imported from the same module). 
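For completeness, here is a short sketch of switching to the GPT-4 variant mentioned above, assuming it is exposed from the same module as the ChatGPT hook:

```py
from zenml.integration.openai.hooks import openai_gpt4_alerter_failure_hook
from zenml import step

@step(on_failure=openai_gpt4_alerter_failure_hook)
def my_step() -> None:
    ...
```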
- - diff --git a/develop/usage/pipelines/build-pipelines/use-pipeline-step-parameters.mdx b/develop/usage/pipelines/build-pipelines/use-pipeline-step-parameters.mdx deleted file mode 100644 index 29491f2..0000000 --- a/develop/usage/pipelines/build-pipelines/use-pipeline-step-parameters.mdx +++ /dev/null @@ -1,117 +0,0 @@ ---- -title: "Use pipeline/step parameters" -description: "Steps and pipelines can be parameterized just like any other python function that you are familiar with." -icon: gear ---- - -## Parameters for your steps - -When calling a step in a pipeline, the inputs provided to the step function can either be an **artifact** or a **parameter**. An artifact represents the output of another step that was executed as part of the same pipeline and serves as a means to share data between steps. Parameters, on the other hand, are values provided explicitly when invoking a step. They are not dependent on the output of other steps and allow you to parameterize the behavior of your steps. - - - In order to allow the configuration of your steps using a configuration file, only values that can be serialized to JSON using Pydantic can be passed as parameters. If you want to pass other non-JSON-serializable objects such as NumPy arrays to your steps, use [External Artifacts](/develop/user-guide/guides/starter-guide/manage-artifacts#consuming-external-artifacts-within-a-pipeline) instead. - - -```py -from zenml import step, pipeline - -@step -def my_step(input_1: int, input_2: int) -> None: - pass - - -@pipeline -def my_pipeline(): - int_artifact = some_other_step() - # We supply the value of `input_1` as an artifact and - # `input_2` as a parameter - my_step(input_1=int_artifact, input_2=42) - # We could also call the step with two artifacts or two - # parameters instead: - # my_step(input_1=int_artifact, input_2=int_artifact) - # my_step(input_1=1, input_2=2) -``` - -Parameters of steps and pipelines can also be passed in using YAML configuration files. The following configuration file and Python code can work together and give you the flexibility to update configuration only in YAML file, once needed: - - -```py -# config.yaml - -# these are parameters of the pipeline -parameters: - environment: production - -steps: - my_step: - # these are parameters of the step `my_step` - parameters: - input_2: 42 -``` - -```py -from zenml import step, pipeline -@step -def my_step(input_1: int, input_2: int) -> None: - ... - -# input `environment` will come from the configuration file, -# and it is evaluated to `production` -@pipeline -def my_pipeline(environment: str): - ... - -if __name__=="__main__": - my_pipeline.with_options(config_paths="config.yaml")() -``` - - - There might be conflicting settings for step or pipeline inputs, while working with YAML configuration files. Such situations happen when you define a step or a pipeline parameter in the configuration file and override it from the code later on. Don't worry - once it happens you will be informed with details and instructions how to fix. 
Example of such a conflict: -```py -# config.yaml -parameters: - some_param: 24 - -steps: - my_step: - parameters: - input_2: 42 -``` -```py -# run.py -from zenml import step, pipeline - -@step -def my_step(input_1: int, input_2: int) -> None: - pass - -@pipeline -def my_pipeline(some_param: int): - # here an error will be raised since `input_2` is - # `42` in config, but `43` was provided in the code - my_step(input_1=42, input_2=43) - -if __name__=="__main__": - # here an error will be raised since `some_param` is - # `24` in config, but `23` was provided in the code - my_pipeline(23) -``` - - - -**Parameters and caching** - -When an input is passed as a parameter, the step will only be cached if all parameter values are exactly the same as for previous executions of the step. - -**Artifacts and caching** - -When an artifact is used as a step function input, the step will only be cached if all the artifacts are exactly the same as for previous executions of the step. This means that if any of the upstream steps that produce the input artifacts for a step were not cached, the step itself will always be executed. - ---- - -### See Also: - - - - - diff --git a/develop/usage/pipelines/build-pipelines/using-a-custom-step-invocation-id.mdx b/develop/usage/pipelines/build-pipelines/using-a-custom-step-invocation-id.mdx deleted file mode 100644 index 2b81dad..0000000 --- a/develop/usage/pipelines/build-pipelines/using-a-custom-step-invocation-id.mdx +++ /dev/null @@ -1,28 +0,0 @@ ---- -title: "Using a custom step invocation ID" -description: "When calling a ZenML step as part of your pipeline, it gets assigned a unique **invocation ID** that you can use to reference this step invocation when [defining the execution order](/develop/usage/pipelines/build-pipelines/control-execution-order-of-steps) of your pipeline steps or use it to [fetch information](/develop/usage/pipelines/build-pipelines/fetching-pipelines) about the invocation after the pipeline has finished running." -icon: fingerprint ---- - - -```py -from zenml import pipeline, step - -@step -def my_step() -> None: - ... - -@pipeline -def example_pipeline(): - # When calling a step for the first time inside a pipeline, - # the invocation ID will be equal to the step name -> `my_step`. - my_step() - # When calling the same step again, the suffix `_2`, `_3`, ... will - # be appended to the step name to generate a unique invocation ID. - # For this call, the invocation ID would be `my_step_2`. - my_step() - # If you want to use a custom invocation ID when calling a step, you can - # do so by passing it like this. If you pass a custom ID, it needs to be - # unique for all the step invocations that happen as part of this pipeline. - my_step(id="my_custom_invocation_id") -``` diff --git a/develop/usage/pipelines/create-and-run-templates/create-a-run-template.mdx b/develop/usage/pipelines/create-and-run-templates/create-a-run-template.mdx deleted file mode 100644 index 0166f7d..0000000 --- a/develop/usage/pipelines/create-and-run-templates/create-a-run-template.mdx +++ /dev/null @@ -1,24 +0,0 @@ ---- -title: "Create a run template" -description: "Create run templates for running pipelines from the server" -icon: file-circle-plus ---- - - -This is a [ZenML Pro](https://zenml.io/pro) only feature. Please [sign up -here](https://cloud.zenml.io) to get access. - -The creation of a run template from a pipeline run **only** works for runs that were executed on a remote stack (i.e. 
at least a remote orchestrator, artifact store, and container registry). - - -## Create a template in code - -```py -from zenml.client import Client - -run = Client().get_pipeline_run() -Client().create_run_template( - name=, - deployment_id=run.deployment_id -) -``` diff --git a/develop/usage/pipelines/create-and-run-templates/create-and-run-templates.mdx b/develop/usage/pipelines/create-and-run-templates/create-and-run-templates.mdx deleted file mode 100644 index 61c4d76..0000000 --- a/develop/usage/pipelines/create-and-run-templates/create-and-run-templates.mdx +++ /dev/null @@ -1,15 +0,0 @@ ---- -title: "Overview" -description: "Creating and running templates" -icon: circle-play ---- - -This is a [ZenML Pro](https://zenml.io/pro) only feature. Please [sign up -here](https://cloud.zenml.io) to get access. - -Run templates allow you to use the dashboard or our Client/REST API to run a pipeline with updated configuration which allows you to iterate quickly with minimal friction. - - - - - diff --git a/develop/usage/pipelines/create-and-run-templates/run-a-template.mdx b/develop/usage/pipelines/create-and-run-templates/run-a-template.mdx deleted file mode 100644 index 6aaba8a..0000000 --- a/develop/usage/pipelines/create-and-run-templates/run-a-template.mdx +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: "Run a template" -description: "Run a template" -icon: play ---- - - - This is a [ZenML Pro](https://zenml.io/pro) only feature. Please [sign up - here](https://cloud.zenml.io) to get access. - - -## Run a template in code - -```py -from zenml.client import Client - -template = Client().get_run_template() -config = template.config_template -# Optionally modify the config here - -Client().trigger_pipeline( - template_id=template.id, - run_configuration=config, -) -``` diff --git a/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another.mdx b/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another.mdx deleted file mode 100644 index 86bc5d1..0000000 --- a/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another.mdx +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: "Trigger a pipeline from another pipeline" -description: "Trigger a pipeline from another pipeline." -icon: recycle ---- - - - This is a [ZenML Pro](https://zenml.io/pro) only feature. Please [sign up here](https://cloud.zenml.io) get access. OSS users can only trigger a pipeline by calling the pipeline function inside their runner script. - - -Triggering a pipeline from another **only** works if you've created at least one run template for that pipeline. - -```py -import pandas as pd -from zenml import pipeline, step -from zenml.client import Client -from zenml.config.pipeline_run_configuration import PipelineRunConfiguration - -@step -def trainer(data_artifact_id: str): - df = load_artifact(data_artifact_id) - -@pipeline -def training_pipeline(): - trainer() - -@step -def load_data() -> pd.Dataframe: - ... 
- -@step -def trigger_pipeline(df: UnmaterializedArtifact): - # By using UnmaterializedArtifact we can get the ID of the artifact - run_config = PipelineRunConfiguration(steps={"trainer": {"parameters": {"data_artifact_id": df.id}}}) - Client().trigger_pipeline("training_pipeline", run_configuration=run_config) - -@pipeline -def loads_data_and_triggers_training(): - df = load_data() - trigger_pipeline(df) # Will trigger the other pipeline -``` - -Read more about the [PipelineRunConfiguration](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.pipeline%5Frun%5Fconfiguration.PipelineRunConfiguration) and [trigger\_pipeline](https://sdkdocs.zenml.io/0.60.0/core%5Fcode%5Fdocs/core-client/#zenml.client.Client) function object in the [SDK Docs](https://sdkdocs.zenml.io/). - -Read more about Unmaterialized Artifacts [here](/develop/usage/resource-data-management/handle-data-artifacts/unmaterialized-artifacts). - - - - diff --git a/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-client.mdx b/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-client.mdx deleted file mode 100644 index d62059f..0000000 --- a/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-client.mdx +++ /dev/null @@ -1,27 +0,0 @@ ---- -title: "Trigger a pipeline from Python Client" -description: "Trigger a pipeline using the ZenML Client." -icon: python ---- - - - This is a [ZenML Pro](https://zenml.io/pro) only feature. Please [sign up here](https://cloud.zenml.io) get access. OSS users can only trigger a pipeline by calling the pipeline function inside their runner script. - - -Triggering a pipeline from the Python client **only** works if you've created at least one run template for that pipeline. - -```py -from zenml.client import Client -from zenml.config.pipeline_run_configuration import PipelineRunConfiguration - -if __name__ == "__main__": - run_config = PipelineRunConfiguration(steps={"trainer": {"parameters": {"data_artifact_id": data_artifact_id}}}) - Client().trigger_pipeline("training_pipeline", run_configuration=run_config) -``` - -Read more about the [PipelineRunConfiguration](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.pipeline%5Frun%5Fconfiguration.PipelineRunConfiguration) and [trigger\_pipeline](https://sdkdocs.zenml.io/0.60.0/core%5Fcode%5Fdocs/core-client/#zenml.client.Client) in the [SDK Docs](https://sdkdocs.zenml.io/). - - - - - diff --git a/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-rest-api.mdx b/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-rest-api.mdx deleted file mode 100644 index 0b80f90..0000000 --- a/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-rest-api.mdx +++ /dev/null @@ -1,75 +0,0 @@ ---- -title: "Trigger a pipeline from REST API" -description: "Trigger a pipeline from the rest API." -icon: puzzle ---- - - - This is a [ZenML Pro](https://zenml.io/pro) only feature. Please [sign up here](https://cloud.zenml.io) get access. OSS users can only trigger a pipeline by calling the pipeline function inside their runner script. - - -Triggering a pipeline from the REST API **only** works if you've created at least one run template for that pipeline. - -As a pre-requisite, you need a pipeline name. After you have it, there are three calls that need to be made in order to trigger a pipeline from the REST API: - -1. `GET /pipelines?name=` \-> This returns a response, where a \ can be copied -2. 
`GET /run_templates?pipeline_id=` \-> This returns a list of responses where a \ can be chosen -3. `POST /run_templates//runs` \-> This runs the pipeline. You can pass the [PipelineRunConfiguration](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.pipeline%5Frun%5Fconfiguration.PipelineRunConfiguration) in the body - -## A worked example - - - Learn how to get a bearer token for the curl commands [here](/develop/api/api-reference#using-a-bearer-token-to-access-the-api-programmatically). - - -Here is an example. Let's say would we like to re-run a pipeline called `training`. We first query the `/pipelines` endpoint: - - -```Bash - -curl -X 'GET' \ - '/api/v1/pipelines?hydrate=false&name=training' \ - -H 'accept: application/json' \ - -H 'Authorization: Bearer ' -``` - - - - - -We can take the ID from any object in the list of responses. In this case, the \ is `c953985e-650a-4cbf-a03a-e49463f58473` in the response. - -After this, we take the pipeline ID and call the `/run_templates?pipeline_id=` API: - -```shell -curl -X 'GET' \ - '/api/v1/run_templates?hydrate=false&logical_operator=and&page=1&size=20&pipeline_id=b826b714-a9b3-461c-9a6e-1bde3df3241d' \ - -H 'accept: application/json' \ - -H 'Authorization: Bearer ' -``` - -We can now take the \ from this response. Here it is `b826b714-a9b3-461c-9a6e-1bde3df3241d`. - - - - - -Finally, we can use the template ID to trigger the pipeline with a different configuration: - -```Bash -curl -X 'POST' \ - '/api/v1/run_templates/b826b714-a9b3-461c-9a6e-1bde3df3241d/runs' \ - -H 'accept: application/json' \ - -H 'Content-Type: application/json' \ - -H 'Authorization: Bearer ' \ - -d '{ - "steps": {"model_trainer": {"parameters": {"model_type": "rf"}}} -}' -``` - -A positive response means your pipeline has been re-triggered with a different config! - - - diff --git a/develop/usage/pipelines/trigger-pipelines/trigger-pipelines.mdx b/develop/usage/pipelines/trigger-pipelines/trigger-pipelines.mdx deleted file mode 100644 index d744a3a..0000000 --- a/develop/usage/pipelines/trigger-pipelines/trigger-pipelines.mdx +++ /dev/null @@ -1,48 +0,0 @@ ---- -title: "Overview" -description: "There are numerous ways to trigger a pipeline, apart from calling the runner script." -mode: wide -icon: bullhorn ---- - -A pipeline can be run via Python like this: - -```py -@step # Just add this decorator -def load_data() -> dict: - training_data = [[1, 2], [3, 4], [5, 6]] - labels = [0, 1, 0] - return {'features': training_data, 'labels': labels} - - -@step -def train_model(data: dict) -> None: - total_features = sum(map(sum, data['features'])) - total_labels = sum(data['labels']) - - # Train some model here - - print(f"Trained model using {len(data['features'])} data points. " - f"Feature sum is {total_features}, label sum is {total_labels}") - - -@pipeline # This function combines steps together -def simple_ml_pipeline(): - dataset = load_data() - train_model(dataset) -``` - -You can now run this pipeline by simply calling the function: - - -```py -simple_ml_pipeline() -``` - -However, there are other ways to trigger a pipeline, specifically a pipeline with a remote stack (remote orchestrator, artifact store, and container registry). 
- - - - - - diff --git a/develop/usage/project-setup/configure-python-environments/configure-python-environments.mdx b/develop/usage/project-setup/configure-python-environments/configure-python-environments.mdx deleted file mode 100644 index 5c42698..0000000 --- a/develop/usage/project-setup/configure-python-environments/configure-python-environments.mdx +++ /dev/null @@ -1,50 +0,0 @@ ---- -title: "Overview" -description: "Navigating multiple development environments." -icon: python ---- - -ZenML deployments often involve multiple environments. This guide helps you manage dependencies and configurations across these environments. - -Here is a visual overview of the different environments: - - - - - -## Client Environment (or the Runner environment) - -The client environment (sometimes known as the runner environment) is where the ZenML pipelines are _compiled_, i.e., where you call the pipeline function (typically in a `run.py` script). There are different types of client environments: - -* A local development environment -* A CI runner in production. -* A [ZenML Pro](https://zenml.io/pro) runner. -* A `runner` image orchestrated by the ZenML server to start pipelines. - -In all the environments, you should use your preferred package manager (e.g., `pip` or `poetry`) to manage dependencies. Ensure you install the ZenML package and any required [integrations](/develop/stack-components/component-guide). - -The client environment typically follows these key steps when starting a pipeline: - -1. Compiling an intermediate pipeline representation via the `@pipeline` function. -2. Creating or triggering [pipeline and step build environments](/develop/stack-components/image-builders) if running remotely. -3. Triggering a run in the [orchestrator](/develop/stack-components/orchestrators). - -Please note that the `@pipeline` function in your code is **only ever called** in this environment. Therefore, any computational logic that is executed in the pipeline function needs to be relevant to this so-called _compile time_, rather than at _execution_ time, which happens later. - -## ZenML Server Environment - -The ZenML server environment is a FastAPI application managing pipelines and metadata. It includes the ZenML Dashboard and is accessed when you [deploy ZenML](/develop/getting-started/deploying-zenml). To manage dependencies, install them during [ZenML deployment](/develop/getting-started/deploying-zenml), but only if you have custom integrations, as most are built-in. - -See also [here](/develop/usage/project-setup/configure-python-environments/configure-the-server-environment) for more on [configuring the server environment](/develop/usage/project-setup/configure-python-environments/configure-the-server-environment). - -## Execution Environments - -When running locally, there is no real concept of an `execution` environment as the client, server, and execution environment are all the same. However, when running a pipeline remotely, ZenML needs to transfer your code and environment over to the remote [orchestrator](/develop/stack-components/orchestrators). In order to achieve this, ZenML builds Docker images known as `execution environments`. - -ZenML handles the Docker image configuration, creation, and pushing, starting with a [base image](https://hub.docker.com/r/zenmldocker/zenml) containing ZenML and Python, then adding pipeline dependencies. 
To manage the Docker image configuration, follow the steps in the [containerize your pipeline](/develop/usage/project-setup/customize-docker-builds) guide, including specifying additional pip dependencies, using a custom parent image, and customizing the build process. - -## Image Builder Environment - -By default, execution environments are created locally in the [client environment](/develop/usage/project-setup/configure-python-environments#client-environment-or-the-runner-environment) using the local Docker client. However, this requires Docker installation and permissions. ZenML offers [image builders](/develop/stack-components/image-builders), a special [stack component](/develop/stack-components/component-guide), allowing users to build and push Docker images in a different specialized _image builder environment_. - -Note that even if you don't configure an image builder in your stack, ZenML still uses the [local image builder](/develop/stack-components/image-builders/local) to retain consistency across all builds. In this case, the image builder environment is the same as the client environment. diff --git a/develop/usage/project-setup/configure-python-environments/configure-the-server-environment.mdx b/develop/usage/project-setup/configure-python-environments/configure-the-server-environment.mdx deleted file mode 100644 index 1dd9a2f..0000000 --- a/develop/usage/project-setup/configure-python-environments/configure-the-server-environment.mdx +++ /dev/null @@ -1,134 +0,0 @@ ---- -title: "Configure the server environment" -description: "How to control ZenML behavior with environmental variables." -icon: earth-asia ---- - -The ZenML server environment is configured using environment variables. You will -need to set these before deploying your server instance, and in general there are a few pre-defined environmental variables that can be used to control the behavior of ZenML. See the list below with default values and options: - -## Logging verbosity - -```Bash -export ZENML_LOGGING_VERBOSITY=INFO -``` - -Choose from `INFO`, `WARN`, `ERROR`, `CRITICAL`, `DEBUG`. - -## Disable step logs - -Usually, ZenML [stores step logs in the artifact store](/develop/usage/development-monitoring/control-logging/enable-or-disable-logs-storing), but this can sometimes cause performance bottlenecks, especially if the code utilizes progress bars. - -If you want to configure whether logged output from steps is stored or not, set the `ZENML_DISABLE_STEP_LOGS_STORAGE` environment variable to `true`. Note that this will mean that logs from your steps will no longer be stored and thus won't be visible on the dashboard anymore. - - -```Bash -export ZENML_DISABLE_STEP_LOGS_STORAGE=false -``` - -## ZenML repository path - -To configure where ZenML will install and look for its repository, set the environment variable `ZENML_REPOSITORY_PATH`. - - -```shell -export ZENML_REPOSITORY_PATH=/path/to/somewhere -``` - -## Analytics - -Please see [our full page](/develop/usage/project-setup/global-settings#usage-analytics) on what analytics are tracked and how you can opt out, but the quick summary is that you can set this to `false` if you want to opt out of analytics. 
- - -```shell -export ZENML_ANALYTICS_OPT_IN=false -``` - -## Debug mode - -Setting this to `true` switches to developer mode: - - -```shell -export ZENML_DEBUG=true -``` - -## Active stack - -Setting the `ZENML_ACTIVE_STACK_ID` to a specific UUID will make the corresponding stack the active stack: - -```shell -export ZENML_ACTIVE_STACK_ID= -``` - -## Prevent pipeline execution - -When `true`, this prevents a pipeline from executing: - - -```shell -export ZENML_PREVENT_PIPELINE_EXECUTION=false -``` - -## Disable rich traceback - -Set to `false` to disable the [rich traceback](https://rich.readthedocs.io/en/stable/traceback.html): - -```shell -export ZENML_ENABLE_RICH_TRACEBACK=true -``` - -## Disable colorful logging - -If you wish to disable colorful logging, set the following environment variable: - - -```shell -ZENML_LOGGING_COLORS_DISABLED=true -``` - -Note that setting this on the [client environment](/develop/usage/project-setup/configure-python-environments#client-environment-or-the-runner-environment) (e.g. your local machine which runs the pipeline) will automatically disable colorful logging on remote orchestrators. If you wish to disable it locally, but turn it on for remote orchestrators, you can set the `ZENML_LOGGING_COLORS_DISABLED` environment variable in your orchestrator's environment as follows: - -```py -docker_settings = DockerSettings(environment={"ZENML_LOGGING_COLORS_DISABLED": "false"}) - -# Either add it to the decorator -@pipeline(settings={"docker": docker_settings}) -def my_pipeline() -> None: -    my_step() - -# Or configure the pipeline options -my_pipeline = my_pipeline.with_options( -    settings={"docker": docker_settings} -) -``` - -## ZenML global config path - -To set the path to the global config file, used by ZenML to manage and store the state for a number of settings, set the environment variable as follows: - - -```shell -export ZENML_CONFIG_PATH=/path/to/somewhere -``` - -## Server configuration - -For more information on server configuration, see the [ZenML Server documentation](/develop/getting-started/deploying-zenml/deploy-with-docker#zenml-server-configuration-options), especially the section entitled "ZenML server configuration options". - -## Client configuration - -Setting the `ZENML_STORE_URL` and `ZENML_STORE_API_KEY` environment variables automatically connects your ZenML Client to the specified server. This method is particularly useful when you are using the ZenML client in an automated CI/CD workload environment like GitHub Actions or GitLab CI or in a containerized environment like Docker or Kubernetes: - -```shell -export ZENML_STORE_URL=https://... - -export ZENML_STORE_API_KEY= -``` diff --git a/develop/usage/project-setup/configure-python-environments/handling-dependencies.mdx b/develop/usage/project-setup/configure-python-environments/handling-dependencies.mdx deleted file mode 100644 index 60ad51f..0000000 --- a/develop/usage/project-setup/configure-python-environments/handling-dependencies.mdx +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: "Handling dependencies" -description: "How to handle issues with conflicting dependencies" -icon: shield ---- - -This page documents some of the common issues that arise when using ZenML with other libraries. - -When using ZenML with other libraries, you may encounter issues with conflicting dependencies. ZenML aims to be stack- and integration-agnostic, allowing you to run your pipelines using the tools that make sense for your problems.
With this flexibility comes the possibility of dependency conflicts. - -ZenML allows you to install dependencies required by integrations through the `zenml integration install ...` command. This is a convenient way to install dependencies for a specific integration, but it can also lead to dependency conflicts if you are using other libraries in your environment. An easy way to see if the ZenML requirements are still met (after installing any extra dependencies required by your work) is to run `zenml integration list` and check that your desired integrations still bear the green tick symbol denoting that all requirements are met. - -## Suggestions for Resolving Dependency Conflicts - -### Use a tool like `pip-compile` for reproducibility - -Consider using a tool like `pip-compile` (available through [the pip-tools package](https://pip-tools.readthedocs.io/)) to compile your dependencies into a static `requirements.txt` file that can be used across environments. (If you are using [uv](https://github.com/astral-sh/uv), you might want to use `uv pip compile` as an alternative.) - -For a practical example and explanation of using `pip-compile` to address exactly this need, see [our 'gitflow' repository and workflow](https://github.com/zenml-io/zenml-gitflow#-software-requirements-management). - -### Use `pip check` to discover dependency conflicts - -Running [pip check](https://pip.pypa.io/en/stable/cli/pip%5Fcheck/) will verify that your environment's dependencies are compatible with one another. If not, you will see a list of the conflicts. These conflicts may or may not prevent you from moving forward with your specific use case, but it is certainly worth knowing whether they exist. - -### Well-known dependency resolution issues - -Some of ZenML's integrations come with strict dependency and package version requirements. We try to keep these dependency requirement ranges as wide as possible for the integrations developed by ZenML, but it is not always possible to make this work completely smoothly. Here is one of the known issues: - -* `click`: ZenML currently requires `click~=8.0.3` for its CLI. This is on account of another dependency of ZenML. Using versions of `click` in your own project that are greater than 8.0.3 may cause unanticipated behaviors. - -### Manually bypassing ZenML's integration installation - -It is possible to skip ZenML's integration installation process and install dependencies manually. This is not recommended, but it is possible and can be done at your own risk. - - - Note that the `zenml integration install ...` command runs a `pip install ...` under the hood as part of its implementation, taking the dependencies listed in the integration object and installing them. For example, `zenml integration install gcp` will run `pip install "kfp==1.8.16" "gcsfs" "google-cloud-secret-manager" ...` and so on, since they are [specified in the integration definition](https://github.com/zenml-io/zenml/blob/ec2283473e5e0c5a2f1b7868875539a83e617f8c/src/zenml/integrations/gcp/%5F%5Finit%5F%5F.py#L45). - - -To do this, you will need to install the dependencies for the integration you want to use manually.
You can find the dependencies for the integrations by running the following: - -```bash -# to have the requirements exported to a file -zenml integration export-requirements --output-file integration-requirements.txt INTEGRATION_NAME - -# to have the requirements printed to the console -zenml integration export-requirements INTEGRATION_NAME -``` - -You can then amend and tweak those requirements as you see fit. Note that if you are using a remote orchestrator, you would then have to place the updated versions for the dependencies in a `DockerSettings` object (described in detail [here](/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline)) which will then make sure everything is working as you need. diff --git a/develop/usage/project-setup/customize-docker-builds/customize-docker-builds.mdx b/develop/usage/project-setup/customize-docker-builds/customize-docker-builds.mdx deleted file mode 100644 index 44af8ce..0000000 --- a/develop/usage/project-setup/customize-docker-builds/customize-docker-builds.mdx +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: "Overview" -description: "Using Docker images to run your pipeline." -icon: docker ---- - -ZenML executes pipeline steps sequentially in the active Python environment when running locally. However, with remote [orchestrators](/develop/user-guide/guides/production-guide/cloud-orchestration) or [step operators](/develop/stack-components/step-operators), ZenML builds [Docker](https://www.docker.com/) images to run your pipeline in an isolated, well-defined environment. - -This section discusses how to control this dockerization process. diff --git a/develop/usage/project-setup/customize-docker-builds/define-where-an-image-is-built.mdx b/develop/usage/project-setup/customize-docker-builds/define-where-an-image-is-built.mdx deleted file mode 100644 index c6a3993..0000000 --- a/develop/usage/project-setup/customize-docker-builds/define-where-an-image-is-built.mdx +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: "Define where an image is built" -description: "Defining the image builder." -icon: image ---- - -ZenML executes pipeline steps sequentially in the active Python environment when running locally. However, with remote [orchestrators](/develop/stack-components/orchestrators) or [step operators](/develop/stack-components/step-operators), ZenML builds [Docker](https://www.docker.com/) images to run your pipeline in an isolated, well-defined environment. - -By default, execution environments are created locally in the client environment using the local Docker client. However, this requires Docker installation and permissions. ZenML offers [image builders](/develop/stack-components/image-builders), a special [stack component](/develop/stack-components/component-guide), allowing users to build and push Docker images in a different specialized _image builder environment_. - -Note that even if you don't configure an image builder in your stack, ZenML still uses the [local image builder](/develop/stack-components/image-builders/local) to retain consistency across all builds. In this case, the image builder environment is the same as the [client environment](/develop/usage/project-setup/configure-python-environments#client-environment-or-the-runner-environment). - -You don't need to directly interact with any image builder in your code. 
As long as the image builder that you want to use is part of your active [ZenML stack](/develop/user-guide/guides/production-guide/understand-stacks), it will be used automatically by any component that needs to build container images. diff --git a/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline.mdx b/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline.mdx deleted file mode 100644 index f161107..0000000 --- a/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline.mdx +++ /dev/null @@ -1,143 +0,0 @@ ---- -title: "Docker settings on a pipeline" -description: "Using Docker images to run your pipeline." -icon: pied-piper ---- - -When a [pipeline is run with a remote orchestrator](/develop/usage/project-setup/configure-python-environments) a [Dockerfile](https://docs.docker.com/engineversions/0.66.0/reference/builder/) is dynamically generated at runtime. It is then used to build the Docker image using the [image builder](/develop/usage/project-setup/configure-python-environments#image-builder-environment) component of your stack. The Dockerfile consists of the following steps: - -* **Starts from a parent image** that has **ZenML installed**. By default, this will use the [official ZenML image](https://hub.docker.com/r/zenmldocker/zenml/) for the Python and ZenML version that you're using in the active Python environment. If you want to use a different image as the base for the following steps, check out [this guide](/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline#using-a-custom-parent-image). -* **Installs additional pip dependencies**. ZenML will automatically detect which integrations are used in your stack and install the required dependencies. If your pipeline needs any additional requirements, check out our [guide on including custom dependencies](/develop/usage/project-setup/customize-docker-builds/specify-pip-dependencies-and-apt-packages). -* **Optionally copies your source files**. Your source files need to be available inside the Docker container so ZenML can execute your step code. Check out [this section](/develop/usage/project-setup/customize-docker-builds/which-files-are-built-into-the-image) for more information on how you can customize how ZenML handles your source files in Docker images. -* **Sets user-defined environment variables.** - -The process described above is automated by ZenML and covers the most basic use cases. This section covers various ways to customize the Docker build process to fit your needs. - -For a full list of configuration options, check out [the DockerSettings object on the SDKDocs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.docker%5Fsettings.DockerSettings). 
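As a quick illustration of the last step listed above (setting user-defined environment variables), a minimal sketch might look like the following; the variable name and value are placeholders for illustration, not something ZenML requires:

```py
from zenml import pipeline
from zenml.config import DockerSettings

# Any variables passed here are set in the Docker image and are therefore
# available to every step of the pipeline at execution time.
docker_settings = DockerSettings(environment={"MY_EXAMPLE_VARIABLE": "some_value"})


@pipeline(settings={"docker": docker_settings})
def my_pipeline() -> None:
    ...
```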
- -## How to configure settings for a pipeline - -Customizing the Docker builds for your pipelines and steps is done using the [DockerSettings](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.docker%5Fsettings.DockerSettings) class which you can import like this: - - -```Bash -from zenml.config import DockerSettings -``` - -There are many ways in which you can supply these settings: - -* Configuring them on a pipeline applies the settings to all steps of that pipeline: - - -```py -from zenml.config import DockerSettings -docker_settings = DockerSettings() - -# Either add it to the decorator -@pipeline(settings={"docker": docker_settings}) -def my_pipeline() -> None: - my_step() - -# Or configure the pipelines options -my_pipeline = my_pipeline.with_options( - settings={"docker": docker_settings} -) -``` - -* Configuring them on a step gives you more fine-grained control and enables you to build separate specialized Docker images for different steps of your pipelines: - - -```py -docker_settings = DockerSettings() - -# Either add it to the decorator -@step(settings={"docker": docker_settings}) -def my_step() -> None: - pass - -# Or configure the step options -my_step = my_step.with_options( - settings={"docker": docker_settings} -) -``` - -* Using a YAML configuration file as described [here](/develop/usage/project-setup/use-configuration-files): - -```yaml -settings: - docker: - ... - -steps: - step_name: - settings: - docker: - ... -``` - -Check out [this page](/develop/usage/project-setup/use-configuration-files/configuration-hierarchy) for more information on the hierarchy and precedence of the various ways in which you can supply the settings. - -### Specifying Docker build options - -If you want to specify build options that get passed to the build method of the [image builder](/develop/usage/project-setup/configure-python-environments#image-builder-environment). For the default local image builder, these options get passed to the [docker build command](https://docker-py.readthedocs.io/en/stable/images.html#docker.models.images.ImageCollection.build). - -```py -docker_settings = DockerSettings(build_config={"build_options": {...}}) - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - - -If you're running your pipelines on MacOS with ARM architecture, the local Docker caching does not work unless you specify the target platform of the image: - -```py -docker_settings = DockerSettings(build_config={"build_options": {"platform": "linux/amd64"}}) - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - - -### Using a custom parent image - -By default, ZenML performs all the steps described above on top of the [official ZenML image](https://hub.docker.com/r/zenmldocker/zenml/) for the Python and ZenML version in the active Python environment. To have more control over the entire environment used to execute your pipelines, you can either specify a custom pre-built parent image or a Dockerfile that ZenML uses to build a parent image for you. - - - If you're going to use a custom parent image (either pre-built or by specifying a Dockerfile), you need to make sure that it has Python, pip, and ZenML installed for it to work. If you need a starting point, you can take a look at the Dockerfile that ZenML uses [here](https://github.com/zenml-io/zenml/blob/main/docker/base.Dockerfile). 
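If you would rather have ZenML build the parent image for you from a Dockerfile, a minimal sketch could look like this (the Dockerfile path is just an example):

```py
from zenml import pipeline
from zenml.config import DockerSettings

# ZenML first builds a parent image from this Dockerfile and then applies
# the usual steps (installing requirements, copying source files) on top of it.
docker_settings = DockerSettings(dockerfile="/path/to/Dockerfile")


@pipeline(settings={"docker": docker_settings})
def my_pipeline() -> None:
    ...
```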
- - -#### Using a pre-built parent image - -To use a static parent image (e.g., with internal dependencies installed) that doesn't need to be rebuilt on every pipeline run, specify it in the Docker settings for your pipeline: - - -```py -docker_settings = DockerSettings(parent_image="my_registry.io/image_name:tag") - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -To use this image directly to run your steps without including any code or installing any requirements on top of it, skip the Docker builds by specifying it in the Docker settings: - - -```py -docker_settings = DockerSettings( - parent_image="my_registry.io/image_name:tag", - skip_build=True -) - - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - - -This is an advanced feature and may cause unintended behavior when running your pipelines. If you use this, ensure your code files are correctly included in the image you specified. - - diff --git a/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-step.mdx b/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-step.mdx deleted file mode 100644 index a8a22f1..0000000 --- a/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-step.mdx +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Docker settings on a step" -description: "You have the option to customize the Docker settings at a step level." -icon: stairs ---- - -By default every step of a pipeline uses the same Docker image that is defined at the [pipeline level](/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline). Sometimes your steps will have special requirements that make it necessary to define a different Docker image for one or many steps. This can easily be accomplished by adding the [DockerSettings](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.docker%5Fsettings.DockerSettings) to the step decorator directly. - -```py -from zenml import step -from zenml.config import DockerSettings - -@step( - settings={ - "docker": DockerSettings( - parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime" - ) - } -) -def training(...): - ... -``` - -Alternatively, this can also be done within the configuration file. - -```yaml -steps: - training: - settings: - docker: - parent_image: pytorch/pytorch:2.2.0-cuda11.8-cudnn8-runtime - required_integrations: - - gcp - - github - requirements: - - zenml # Make sure to include ZenML for other parent images - - numpy -``` diff --git a/develop/usage/project-setup/customize-docker-builds/specify-pip-dependencies-and-apt-packages.mdx b/develop/usage/project-setup/customize-docker-builds/specify-pip-dependencies-and-apt-packages.mdx deleted file mode 100644 index 4ecfa54..0000000 --- a/develop/usage/project-setup/customize-docker-builds/specify-pip-dependencies-and-apt-packages.mdx +++ /dev/null @@ -1,139 +0,0 @@ ---- -title: "Specify pip dependencies and apt packages" -description: "The configuration for specifying pip and apt dependencies only works in the remote pipeline case, and is disregarded for local pipelines (i.e. pipelines that run locally without having to build a Docker image)." -icon: puzzle-piece ---- - - - -When a [pipeline is run with a remote orchestrator](/develop/usage/project-setup/configure-python-environments) a [Dockerfile](https://docs.docker.com/engineversions/0.66.0/reference/builder/) is dynamically generated at runtime. 
It is then used to build the Docker image using the [image builder](/develop/usage/project-setup/configure-python-environments#-configure-python-environments) component of your stack. - -For all of examples on this page, note that `DockerSettings` can be imported using `from zenml.config import DockerSettings`. - -By default, ZenML automatically installs all packages required by your active ZenML stack. However, you can specify additional packages to be installed in various ways: - -* Install all the packages in your local Python environment (This will use the `pip` or `poetry` package manager to get a list of your local packages): - -```py -# or use "poetry_export" -docker_settings = DockerSettings(replicate_local_python_environment="pip_freeze") - - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -If required, a custom command can be provided. This command must output a list of requirements following the format of the [requirements file](https://pip.pypa.io/en/stableversions/0.66.0/reference/requirements-file-format/): - - -```py -from zenml.config import DockerSettings - -docker_settings = DockerSettings(replicate_local_python_environment=[ - "poetry", - "export", - "--extras=train", - "--format=requirements.txt" -]) - - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -* Specify a list of requirements in code: - -```py -docker_settings = DockerSettings(requirements=["torch==1.12.0", "torchvision"]) - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` -* Specify a requirements file: - -```py -docker_settings = DockerSettings(requirements="/path/to/requirements.txt") -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` -* Specify a list of [ZenML integrations](/develop/stack-components/component-guide) that you're using in your pipeline: - - -```py -from zenml.integrations.constants import PYTORCH, EVIDENTLY - -docker_settings = DockerSettings(required_integrations=[PYTORCH, EVIDENTLY]) - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -* Specify a list of apt packages in code: - -```py -docker_settings = DockerSettings(apt_packages=["git"]) -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` -* Prevent ZenML from automatically installing the requirements of your stack: -```py -docker_settings = DockerSettings(install_stack_requirements=False) - @pipeline(settings={"docker": docker_settings}) - def my_pipeline(...): - ... -``` -* In some cases the steps of your pipeline will have conflicting requirements or some steps of your pipeline will require large dependencies that don't need to be installed to run the remaining steps of your pipeline. For this case, ZenML allows you to specify custom Docker settings for steps in your pipeline. - -```py -docker_settings = DockerSettings(requirements=["tensorflow"]) - -@step(settings={"docker": docker_settings}) - -def my_training_step(...): - - ... -``` - - - You can combine these methods but do make sure that your list of requirements does not overlap with the ones specified explicitly in the Docker settings. - - -Depending on the options specified in your Docker settings, ZenML installs the requirements in the following order (each step optional): - -* The packages installed in your local python environment -* The packages required by the stack unless this is disabled by setting `install_stack_requirements=False`. 
-* The packages specified via the `required_integrations` -* The packages specified via the `requirements` attribute -* You can specify additional arguments for the installer used to install your Python packages as follows: - -```py -# This will result in a `pip install --timeout=1000 ...` call when installing packages in the -# Docker image - -docker_settings = DockerSettings(python_package_installer_args={"timeout": 1000}) -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -* **Experimental**: If you want to use [uv](https://github.com/astral-sh/uv) for faster resolving and installation of your Python packages, you can use by it as follows: - -```py - -docker_settings = DockerSettings(python_package_installer="uv") - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - - - `uv` is a relatively new project and not as stable as `pip` yet, which might lead to errors during package installation. If this happens, try switching the installer back to `pip` and see if that solves the issue. - - diff --git a/develop/usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.mdx b/develop/usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.mdx deleted file mode 100644 index f9533cd..0000000 --- a/develop/usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.mdx +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Use code repositories to automate Docker build reuse" -description: While reusing Docker builds is useful, it can be limited. -icon: person-running ---- - - This is because specifying a custom build when running a pipeline will **not run the code on your client machine** but will use the code **included in the Docker images of the build**. As a consequence, even if you make local code changes, reusing a build will _always_ execute the code bundled in the Docker image, rather than the local code. Therefore, if you would like to reuse a Docker build AND make sure your local code changes are also downloaded into the image, you need to disconnect your code from the build. - -You can do so by connecting a git repository. Registering a code repository lets you avoid building images each time you run a pipeline **and** quickly iterate on your code. When running a pipeline that is part of a local code repository checkout, ZenML can instead build the Docker images without including any of your source files, and download the files inside the container before running your code. This greatly speeds up the building process and also allows you to reuse images that one of your colleagues might have built for the same stack. - -ZenML will **automatically figure out which builds match your pipeline and reuse the appropriate build id**. Therefore, you **do not** need to explicitly pass in the build id when you have a clean repository state and a connected git repository. This approach is **highly recommended**. See an end to end example [here](/develop/user-guide/guides/production-guide/connect-code-repository). - - - In order to benefit from the advantages of having a code repository in a project, you need to make sure that **the relevant integrations are installed for your ZenML installation.**. For instance, let's assume you are working on a project with ZenML and one of your team members has already registered a corresponding code repository of type `github` for it. 
If you do `zenml code-repository list`, you would also be able to see this repository. However, in order to fully use this repository, you still need to install the corresponding integration for it, in this example the `github` integration. - -```Bash -zenml integration install github -``` - - -## Detecting local code repository checkouts - -Once you have registered one or more code repositories, ZenML will check whether the files you use when running a pipeline are tracked inside one of those code repositories. This happens as follows: - -* First, the [source root](/develop/usage/project-setup/customize-docker-builds/which-files-are-built-into-the-image) is computed -* Next, ZenML checks whether this source root directory is included in a local checkout of one of the registered code repositories - -## Tracking code version for pipeline runs - -If a [local code repository checkout](/develop/usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times#detecting-local-code-repository-checkouts) is detected when running a pipeline, ZenML will store a reference to the current commit for the pipeline run, so you'll be able to know exactly which code was used. Note that this reference is only tracked if your local checkout is clean (i.e. it does not contain any untracked or uncommitted files). This is to ensure that your pipeline is actually running with the exact code stored at the specific code repository commit. - -## Tips and best practices - -It is also important to take some additional points into consideration: - -* The file download is only possible if the local checkout is clean (i.e. it does not contain any untracked or uncommitted files) and the latest commit has been pushed to the remote repository. This is necessary as otherwise, the file download inside the Docker container will fail. -* If you want to disable or enforce the downloading of files, check out [this docs page](/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline) for the available options. - diff --git a/develop/usage/project-setup/customize-docker-builds/use-your-own-docker-files.mdx b/develop/usage/project-setup/customize-docker-builds/use-your-own-docker-files.mdx deleted file mode 100644 index c945ef1..0000000 --- a/develop/usage/project-setup/customize-docker-builds/use-your-own-docker-files.mdx +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Use your own Dockerfiles" -icon: files ---- - -In some cases, you might not want full control over the resulting Docker image but want to build a parent image dynamically each time a pipeline is executed. To make this process easier, ZenML allows you to specify a custom Dockerfile as well as `build context` directory and build options. ZenML then builds an intermediate image based on the Dockerfile you specified and uses the intermediate image as the parent image. - -Here is how the build process looks like: - -* **No** `**Dockerfile**` **specified**: If any of the options regarding requirements, environment variables or copying files require us to build an image, ZenML will build this image. Otherwise the `parent_image` will be used to run the pipeline. -* `**Dockerfile**` **specified**: ZenML will first build an image based on the specified `Dockerfile`. If any of the options regarding requirements, environment variables or copying files require an additional image built on top of that, ZenML will build a second image. If not, the image build from the specified `Dockerfile` will be used to run the pipeline. 
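As a rough sketch of the second case, a configuration like the following (the path and the package name are placeholders) would cause ZenML to build an intermediate image from your Dockerfile and then a second image on top of it that installs the additional requirements:

```py
from zenml import pipeline
from zenml.config import DockerSettings

# The custom Dockerfile produces the intermediate (parent) image; the extra
# requirements then trigger a second image built on top of it.
docker_settings = DockerSettings(
    dockerfile="/path/to/dockerfile",
    requirements=["scikit-learn"],
)


@pipeline(settings={"docker": docker_settings})
def my_pipeline() -> None:
    ...
```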
- -Depending on the configuration of the [DockerSettings](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.docker%5Fsettings.DockerSettings) object, requirements will be installed in the following order (each step optional): - -* The packages installed in your local Python environment. -* The packages specified via the `requirements` attribute. -* The packages specified via the `required_integrations` and potentially stack requirements. - - - Depending on the configuration of your Docker settings, this intermediate image might also be used directly to execute your pipeline steps. - - - -```py -docker_settings = DockerSettings( -    dockerfile="/path/to/dockerfile", -    build_context_root="/path/to/build/context", -    parent_image_build_config={ -        "build_options": ..., -        "dockerignore": ..., -    } -) - - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): -    ... -``` diff --git a/develop/usage/project-setup/customize-docker-builds/which-files-are-built-into-the-image.mdx b/develop/usage/project-setup/customize-docker-builds/which-files-are-built-into-the-image.mdx deleted file mode 100644 index 093e314..0000000 --- a/develop/usage/project-setup/customize-docker-builds/which-files-are-built-into-the-image.mdx +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Which files are built into the image" -icon: file-image ---- - -ZenML determines the root directory of your source files in the following order: - -* If you've initialized ZenML (`zenml init`) in your current working directory or one of its parent directories, the repository root directory will be used. -* Otherwise, the parent directory of the Python file you're executing will be the source root. For example, when running `python /path/to/file.py`, the source root would be `/path/to`. - -You can specify how the files inside this root directory are handled using the following three attributes on the [DockerSettings](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.docker%5Fsettings.DockerSettings): - -* `allow_download_from_code_repository`: If this is set to `True` and your files are inside a registered [code repository](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository) and the repository has no local changes, the files will be downloaded from the code repository and not included in the image. -* `allow_download_from_artifact_store`: If the previous option is disabled or no code repository without local changes exists for the root directory, ZenML will archive and upload your code to the artifact store if this is set to `True`. -* `allow_including_files_in_images`: If both previous options were disabled or not possible, ZenML will include your files in the Docker image if this option is enabled. This means a new Docker image has to be built each time you modify one of your code files. - - - Setting all of the above attributes to `False` is not recommended and will most likely cause unintended and unanticipated behavior when running your pipelines. If you do this, you're responsible for making sure that all your files are at the correct paths in the Docker images that will be used to run your pipeline steps. - - -## Control which files get downloaded - -When downloading files either from a code repository or the artifact store, ZenML downloads all contents of the root directory into the Docker container. To exclude files, track your code in a Git repository and use a [gitignore](https://git-scm.com/docs/gitignore/en) to specify which files should be excluded.
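For reference, here is a minimal sketch (assuming this trade-off fits your workflow) that prefers downloading code at runtime over baking it into the image:

```py
from zenml import pipeline
from zenml.config import DockerSettings

# Allow pulling code from a clean code repository checkout or the artifact
# store at runtime, and never include the source files in the image itself.
docker_settings = DockerSettings(
    allow_download_from_code_repository=True,
    allow_download_from_artifact_store=True,
    allow_including_files_in_images=False,
)


@pipeline(settings={"docker": docker_settings})
def my_pipeline() -> None:
    ...
```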
- -## Control which files get included - -When including files in the image, ZenML copies all contents of the root directory into the Docker image. To exclude files and keep the image smaller, use a [.dockerignore file](https://docs.docker.com/engineversions/0.66.0/reference/builder/#dockerignore-file) in either of the following ways: - -* Have a file called `.dockerignore` in your source root directory. -* Explicitly specify a `.dockerignore` file to use: - -```py -docker_settings = DockerSettings(build_config={"dockerignore": "/path/to/.dockerignore"}) - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` diff --git a/develop/usage/project-setup/global-settings.mdx b/develop/usage/project-setup/global-settings.mdx deleted file mode 100644 index 228ed01..0000000 --- a/develop/usage/project-setup/global-settings.mdx +++ /dev/null @@ -1,159 +0,0 @@ ---- -title: "Configure global settings" -description: "Understanding the global settings of your ZenML installation." -icon: film ---- - -The information about the global settings of ZenML on a machine is kept in a folder commonly referred to as the **ZenML Global Config Directory** or the **ZenML Config Path**. The location of this folder depends on the operating system type and the current system user, but is usually located in the following locations: - -* Linux: `~/.config/zenml` -* Mac: `~/Library/Application Support/zenml` -* Windows: `C:\Users\%USERNAME%\AppData\Local\zenml` - -The default location may be overridden by setting the `ZENML_CONFIG_PATH` environment variable to a custom value. The current location of the global config directory used on a system can be retrieved by running the following commands: - - -```js -# The output will tell you something like this: - -# Using configuration from: '/home/stefan/.config/zenml' - -zenml status - -python -c 'from zenml.utils.io_utils import get_global_config_directory; print(get_global_config_directory())' -``` - - -Manually altering or deleting the files and folders stored under the ZenML global config directory is not recommended, as this can break the internal consistency of the ZenML configuration. As an alternative, ZenML provides CLI commands that can be used to manage the information stored there: - -* `zenml analytics` \- manage the analytics settings -* `zenml clean` \- to be used only in case of emergency, to bring the ZenML configuration back to its default factory state -* `zenml downgrade` \- downgrade the ZenML version in the global configuration to match the version of the ZenML package installed in the current environment. Read more about this in the [ZenML Version Mismatch](/develop/usage/project-setup/global-settings#version-mismatch-downgrading) section. - - -The first time that ZenML is run on a machine, it creates the global config directory and initializes the default configuration in it, along with a default Stack: - - -```Bash -Initializing the ZenML global configuration version to 0.13.2 - -Creating default user 'default' ... - -Creating default stack for user 'default'... - -The active stack is not set. Setting the active stack to the default stack. - -Using the default store for the global config. - -Unable to find ZenML repository in your current working directory (/tmp/folder) or any parent directories. If you want to use an existing repository which is in a different location, set the environment variable 'ZENML_REPOSITORY_PATH'. If you want to create a new repository, run zenml init. - -Running without an active repository root. 
- -Using the default local database. - -┏━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┓ - -┃ ACTIVE │ STACK NAME │ SHARED │ OWNER │ ARTIFACT_STORE │ ORCHESTRATOR ┃ - -┠────────┼────────────┼────────┼─────────┼────────────────┼──────────────┨ - -┃ 👉 │ default │ ❌ │ default │ default │ default ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┛ -``` - -The following is an example of the layout of the global config directory immediately after initialization: - - -```bash -/home/stefan/.config/zenml <- Global Config Directory - -├── config.yaml <- Global Configuration Settings - -└── local_stores <- Every Stack component that stores information - - | locally will have its own subdirectory here. - - ├── a1a0d3d0-d552-4a80-be09-67e5e29be8ee <- e.g. Local Store path for the - - | `default` local Artifact Store - - └── default_zen_store - - | - - └── zenml.db <- SQLite database where ZenML data (stacks, - - components, etc) are stored by default. -``` - -As shown above, the global config directory stores the following information: - -1. The `config.yaml` file stores the global configuration settings: the unique ZenML client ID, the active database configuration, the analytics-related options, and the active Stack. This is an example of the `config.yaml` file contents immediately after initialization: - -```Bash -active_stack_id: ... -analytics_opt_in: true -store: - database: ... - url: ... - username: ... - ... -user_id: d980f13e-05d1-4765-92d2-1dc7eb7addb7 -version: 0.13.2 -``` -2. The `local_stores` directory is where some "local" flavors of stack components, such as the local artifact store, or a local MLFlow experiment tracker, persist data locally. Every local stack component will have its own subdirectory here named after the stack component's unique UUID. One notable example is the local artifact store flavor that, when part of the active stack, stores all the artifacts generated by pipeline runs in the designated local directory. -3. The `zenml.db` in the `default_zen_store` directory is the default SQLite database where ZenML stores all information about the stacks, stack components, custom stack component flavors, etc. - -In addition to the above, you may also find the following files and folders under the global config directory, depending on what you do with ZenML: - -* `kubeflow` \- this is where the Kubeflow orchestrators that are part of a stack store some of their configuration and logs. - -## Usage analytics - -In order to help us better understand how the community uses ZenML, the pip package reports **anonymized** usage statistics. You can always opt out by using the CLI command: - -```Bash -zenml analytics opt-out -``` - -#### Why does ZenML collect analytics? - -In addition to the community at large, **ZenML** is created and maintained by a startup based in Munich, Germany called [ZenML GmbH](https://zenml.io). We're a team of techies that love MLOps and want to build tools that fellow developers would love to use in their daily work. [This is us](https://zenml.io/company#CompanyTeam) if you want to put faces to the names! - -However, in order to improve **ZenML** and understand how it is being used, we need to use analytics to have an overview of how it is used 'in the wild'. This not only helps us find bugs but also helps us prioritize features and commands that might be useful in future releases. 
If we did not have this information, all we really get is pip download statistics and chatting with people directly, which while being valuable, is not enough to seriously better the tool as a whole. - -#### How does ZenML collect these statistics? - -We use [Segment](https://segment.com) as the data aggregation library for all our analytics. However, before any events get sent to [Segment](https://segment.com), they first go through a central ZenML analytics server. This added layer allows us to put various countermeasures to incidents such as getting spammed with events and enables us to have a more optimized tracking process. - -The client code is entirely visible and can be seen in the [analytics](https://github.com/zenml-io/zenml/tree/main/src/zenml/analytics) module of our main repository. - -#### If I share my email, will you spam me? - -No, we won't. Our sole purpose of contacting you will be to ask for feedback (e.g. in the shape of a user interview). These interviews help the core team understand usage better and prioritize feature requests. If you have any concerns about data privacy and the usage of personal information, please [contact us](mailto:support@zenml.io), and we will try to alleviate any concerns as soon as possible. - -## Version mismatch (downgrading) - -If you've recently downgraded your ZenML version to an earlier release or installed a newer version on a different environment on the same machine, you might encounter an error message when running ZenML that says: - -```Bash -`The ZenML global configuration version (%s) is higher than the version of ZenML - -currently being used (%s).` -``` - -We generally recommend using the latest ZenML version. However, there might be cases where you need to match the global configuration version with the version of ZenML installed in the current environment. To do this, run the following command: - - -```Bash -zenml downgrade -``` - - -Note that downgrading the ZenML version may cause unexpected behavior, such as model schema validation failures or even data loss. In such cases, you may need to purge the local database and re-initialize the global configuration to bring it back to its default factory state. To do this, run the following command: - -```Bash -zenml clean -``` - diff --git a/develop/usage/project-setup/setting-up-a-project-repository/best-practices.mdx b/develop/usage/project-setup/setting-up-a-project-repository/best-practices.mdx deleted file mode 100644 index 1c4aaee..0000000 --- a/develop/usage/project-setup/setting-up-a-project-repository/best-practices.mdx +++ /dev/null @@ -1,99 +0,0 @@ ---- -title: "Best practices" -description: "Recommended repository structure and best practices." -icon: grunt ---- - -While it doesn't matter how you structure your ZenML project, here is a recommended project structure the core team often uses: - -```Bash -. -├── .dockerignore -├── Dockerfile -├── steps -│ ├── loader_step -│ │ ├── .dockerignore (optional) -│ │ ├── Dockerfile (optional) -│ │ ├── loader_step.py -│ │ └── requirements.txt (optional) -│ └── training_step -│ └── ... -├── pipelines -│ ├── training_pipeline -│ │ ├── .dockerignore (optional) -│ │ ├── config.yaml (optional) -│ │ ├── Dockerfile (optional) -│ │ ├── training_pipeline.py -│ │ └── requirements.txt (optional) -│ └── deployment_pipeline -│ └── ... 
-├── notebooks -│ └── *.ipynb -├── requirements.txt -├── .zen -└── run.py -``` - -All ZenML [Project templates](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates#generating-project-from-a-project-template) are modeled around this basic structure. - -#### Steps - -Keep your steps in separate Python files. This allows you to optionally keep their utils, dependencies, and Dockerfiles separate. - -#### Logging - -ZenML records the root Python logging handler's output into the artifact store as a side-effect of running a step. Therefore, when writing steps, use the `logging` module to record logs, to ensure that these logs then show up in the ZenML dashboard. - -```py -# Use ZenML handler -from zenml.logger import get_logger - -logger = get_logger(__name__) -... - -@step -def training_data_loader(): -    # This will show up in the dashboard -    logger.info("My logs") -``` - -#### Pipelines - -Just like steps, keep your pipelines in separate Python files. This allows you to optionally keep their utils, dependencies, and Dockerfiles separate. - -It is recommended that you separate the pipeline execution from the pipeline definition so that importing the pipeline does not immediately run it. - - Do not give pipelines or pipeline instances the name "pipeline". Doing this will overwrite the imported `pipeline` decorator and lead to failures at later stages if more pipelines are decorated there. - - - Pipeline names are their unique identifiers, so using the same name for different pipelines will create a mixed history where two runs of a pipeline are two very different entities. - - - - -#### .dockerignore - -Containerized orchestrators and step operators load your complete project files into a Docker image for execution. To speed up the process and reduce Docker image sizes, exclude all unnecessary files (like data, virtual environments, git repos, etc.) within the `.dockerignore`. - -#### Dockerfile (optional) - -By default, ZenML uses the official [zenml Docker image](https://hub.docker.com/r/zenmldocker/zenml) as a base for all pipeline and step builds. You can use your own `Dockerfile` to override this behavior. Learn more [here](/develop/usage/project-setup/customize-docker-builds). - -#### Notebooks - -Collect all your notebooks in one place. - -#### .zen - -By running `zenml init` at the root of your project, you define the project scope for ZenML. In ZenML terms, this will be called your "source's root". This will be used to resolve import paths and store configurations. - -Although this is optional, it is recommended that you do this for all of your projects. - - - All of your import paths should be relative to the source's root. - - -#### run.py - -Putting your pipeline runners in the root of the repository ensures that all imports that are defined relative to the project root resolve for the pipeline runner. In case there is no `.zen` defined, this also defines the implicit source's root. - diff --git a/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository.mdx b/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository.mdx deleted file mode 100644 index 84440d0..0000000 --- a/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository.mdx +++ /dev/null @@ -1,165 +0,0 @@ ---- -title: "Connect your git repository" -description: "Tracking your code and avoiding unnecessary Docker builds by connecting your git repo."
-icon: github ---- - -A code repository in ZenML refers to a remote storage location for your code. Some commonly known code repository platforms include [GitHub](https://github.com/) and [GitLab](https://gitlab.com/). - -Code repositories enable ZenML to keep track of the code version that you use for your pipeline runs. Additionally, running a pipeline that is tracked in a registered code repository can [speed up the Docker image building for containerized stack components](/develop/usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times) by eliminating the need to rebuild Docker images each time you change one of your source code files. - -Learn more about how code repositories benefit development [here](/develop/usage/project-setup/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times). - -## Registering a code repository - -If you are planning to use one of the [available implementations of code repositories](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository#available-implementations), first, you need to install the corresponding ZenML integration: - -```Bash -zenml integration install -``` - -Afterward, code repositories can be registered using the CLI: - -```Bash -zenml code-repository register --type= [--CODE_REPOSITORY_OPTIONS] -``` - -For concrete options, check out the section on the [GitHubCodeRepository](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository#github), the [GitLabCodeRepository](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository#gitlab) or how to develop and register a [custom code repository implementation](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository#developing-a-custom-code-repository). - -## Available implementations - -ZenML comes with builtin implementations of the code repository abstraction for the `GitHub` and `GitLab` platforms, but it's also possible to use a [custom code repository implementation](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository#developing-a-custom-code-repository). - -### GitHub - -ZenML provides built-in support for using GitHub as a code repository for your ZenML pipelines. You can register a GitHub code repository by providing the URL of the GitHub instance, the owner of the repository, the name of the repository, and a GitHub Personal Access Token (PAT) with access to the repository. - -Before registering the code repository, first, you have to install the corresponding integration: - -```Bash -zenml integration install github -``` - -Afterward, you can register a GitHub code repository by running the following CLI command: - - -```Bash -zenml code-repository register --type=github \ - ---url= --owner= --repository= \ - ---token= -``` - -where \ is the name of the code repository you are registering, \ is the owner of the repository, \ is the name of the repository, \ is your GitHub Personal Access Token and \ is the URL of the GitHub instance which defaults to `https://github.com.` You will need to set a URL if you are using GitHub Enterprise. - -After registering the GitHub code repository, ZenML will automatically detect if your source files are being tracked by GitHub and store the commit hash for each pipeline run. - - 1. Go to your GitHub account settings and click on [Developer settings](https://github.com/settings/tokens?type=beta). -2. 
Select "Personal access tokens" and click on "Generate new token". -3. Give your token a name and a description. - - - -4. We recommend selecting the specific repository and then giving `contents` read-only access. - - - - - - -5. Click on "Generate token" and copy the token to a safe place. - - - - - - -### GitLab - -ZenML also provides built-in support for using GitLab as a code repository for your ZenML pipelines. You can register a GitLab code repository by providing the URL of the GitLab project, the group of the project, the name of the project, and a GitLab Personal Access Token (PAT) with access to the project. - -Before registering the code repository, first, you have to install the corresponding integration: - -```Bash -zenml integration install gitlab -``` - -Afterward, you can register a GitLab code repository by running the following CLI command: - -```Bash -zenml code-repository register --type=gitlab \ ---url= --group= --project= \ ---token= -``` - -where `` is the name of the code repository you are registering, `` is the group of the project, `` is the name of the project, \ is your GitLab Personal Access Token, and \ is the URL of the GitLab instance which defaults to `https://gitlab.com.` You will need to set a URL if you have a self-hosted GitLab instance. - -After registering the GitLab code repository, ZenML will automatically detect if your source files are being tracked by GitLab and store the commit hash for each pipeline run. - -1. Go to your GitLab account settings and click on [Access Tokens](https://gitlab.com/-/profile/personal%5Faccess%5Ftokens). -2. Name the token and select the scopes that you need (e.g. `read_repository`, `read_user`, `read_api`) - - - -3. Click on "Create personal access token" and copy the token to a safe place. - - - - - - - - -## Developing a custom code repository - -If you're using some other platform to store your code, and you still want to use a code repository in ZenML, you can implement and register a custom code repository. - -First, you'll need to subclass and implement the abstract methods of the `zenml.code_repositories.BaseCodeRepository` class: - -```py -class BaseCodeRepository(ABC): - """Base class for code repositories.""" - - @abstractmethod - def login(self) -> None: - """Logs into the code repository.""" - - @abstractmethod - def download_files( - self, commit: str, directory: str, repo_sub_directory: Optional[str] - ) -> None: - """Downloads files from the code repository to a local directory. - - Args: - commit: The commit hash to download files from. - directory: The directory to download files to. - repo_sub_directory: The subdirectory in the repository to - download files from. - """ - - @abstractmethod - def get_local_context( - self, path: str - ) -> Optional["LocalRepositoryContext"]: - """Gets a local repository context from a path. - - Args: - path: The path to the local repository. - - Returns: - The local repository context object. - """ -``` - -After you're finished implementing this, you can register it as follows: - -```Bash -# The `CODE_REPOSITORY_OPTIONS` are key-value pairs that your implementation will receive -# as configuration in its __init__ method. This will usually include stuff like the username -# and other credentials necessary to authenticate with the code repository platform. 
-zenml code-repository register --type=custom --source=my_module.MyRepositoryClass \ - [--CODE_REPOSITORY_OPTIONS] -``` diff --git a/develop/usage/project-setup/setting-up-a-project-repository/setting-up-a-project-repository.mdx b/develop/usage/project-setup/setting-up-a-project-repository/setting-up-a-project-repository.mdx deleted file mode 100644 index 6e4796b..0000000 --- a/develop/usage/project-setup/setting-up-a-project-repository/setting-up-a-project-repository.mdx +++ /dev/null @@ -1,12 +0,0 @@ ---- -title: "Overview" -description: "Setting your team up for success with a project repository." -icon: cat ---- - -ZenML code typically lives in a `git` repository. Setting this repository up correctly can make a huge impact on collaboration and getting the maximum out of your ZenML deployment. This section walks users through some of the options available to create a project repository with ZenML. - - - - - diff --git a/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates.mdx b/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates.mdx deleted file mode 100644 index 6773c48..0000000 --- a/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates.mdx +++ /dev/null @@ -1,89 +0,0 @@ ---- -title: "Project templates" -description: "Rocketstart your ZenML journey!" -icon: sketch ---- - -What would you need to get a quick understanding of the ZenML framework and start building your ML pipelines? The answer is one of ZenML project templates to cover major use cases of ZenML: a collection of steps and pipelines and, to top it all off, a simple but useful CLI. This is exactly what the ZenML templates are all about! - -## List of available project templates - -| Project Template \[Short name\] | Tags | Description | -| ---------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| [Starter template](https://github.com/zenml-io/template-starter) \[starter\] | basic scikit-learn | All the basic ML ingredients you need to get you started with ZenML: parameterized steps, a model training pipeline, a flexible configuration and a simple CLI. All created around a representative and versatile model training use-case implemented with the scikit-learn library. | -| [E2E Training with Batch Predictions](https://github.com/zenml-io/template-e2e-batch) \[e2e\_batch\] | etl hp-tuning model-promotion drift-detection batch-prediction scikit-learn | This project template is a good starting point for anyone starting with ZenML. It consists of two pipelines with the following high-level steps: load, split, and preprocess data; run HP tuning; train and evaluate model performance; promote model to production; detect data drift; run batch inference. 
| -| [NLP Training Pipeline](https://github.com/zenml-io/template-nlp) \[nlp\] | nlp hp-tuning model-promotion training pytorch gradio huggingface | This project template is a simple NLP training pipeline that walks through tokenization, training, HP tuning, evaluation and deployment for a BERT or GPT-2 based model and testing locally it with gradio | - - - Do you have a personal project powered by ZenML that you would like to see here? At ZenML, we are looking for design partnerships and collaboration to help us better understand the real-world scenarios in which MLOps is being used and to build the best possible experience for our users. If you are interested in sharing all or parts of your project with us in the form of a ZenML project template, please [join our Slack](https://zenml.io/slack/) and leave us a message! - - -## Generating project from a project template - -First, to use the templates, you need to have ZenML and its `templates` extras installed: - -```Bash -pip install zenml[templates] -``` - -Now, you can generate a project from one of the existing templates by using the `--template` flag with the `zenml init` command: - -```Bash -zenml init --template - -# example: zenml init --template e2e_batch -``` - -Running the command above will result in input prompts being shown to you. If you would like to rely on default values for the ZenML project template - you can add `--template-with-defaults` to the same command, like this: - -```Bash -zenml init --template --template-with-defaults - -# example: zenml init --template e2e_batch --template-with-defaults -``` - -## Creating your own ZenML template - -Creating your own ZenML template is a great way to standardize and share your ML workflows across different projects or teams. ZenML uses [Copier](https://copier.readthedocs.io/en/stable/) to manage its project templates. Copier is a library that allows you to generate projects from templates. It's simple, versatile, and powerful. - -Here's a step-by-step guide on how to create your own ZenML template: - -1. **Create a new repository for your template.** This will be the place where you store all the code and configuration files for your template. -2. **Define your ML workflows as ZenML steps and pipelines.** You can start by copying the code from one of the existing ZenML templates (like the [starter template](https://github.com/zenml-io/template-starter)) and modifying it to fit your needs. -3. **Create a** `**copier.yml**` **file.** This file is used by Copier to define the template's parameters and their default values. You can learn more about this config file [in the copier docs](https://copier.readthedocs.io/en/stable/creating/). -4. **Test your template.** You can use the `copier` command-line tool to generate a new project from your template and check if everything works as expected: - -```Bash -copier copy https://github.com/your-username/your-template.git your-project -``` - -Replace `https://github.com/your-username/your-template.git` with the URL of your template repository, and `your-project` with the name of the new project you want to create. - -1. **Use your template with ZenML.** Once your template is ready, you can use it with the `zenml init` command: - -```Bash -zenml init --template https://github.com/your-username/your-template.git -``` - -Replace `https://github.com/your-username/your-template.git` with the URL of your template repository. 
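
For reference, the `copier.yml` file mentioned in step 3 defines the questions Copier asks when the template is used. The sketch below is purely illustrative: the parameter names are hypothetical and are not taken from any official ZenML template.

```yaml
# copier.yml: every top-level key becomes an interactive prompt
# (hypothetical parameters, for illustration only)
project_name:
  type: str
  help: Name of the generated ZenML project
  default: my_zenml_project

zenml_server_url:
  type: str
  help: Optional URL of the ZenML server the project should connect to
  default: ""

include_example_steps:
  type: bool
  help: Whether to generate parameterized example steps and a training pipeline
  default: true
```

Answers to these prompts are then available as variables inside the template files that Copier renders.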
- -If you want to use a specific version of your template, you can use the `--template-tag` option to specify the git tag of the version you want to use: - -```Bash -zenml init --template https://github.com/your-username/your-template.git --template-tag v1.0.0 -``` - -Replace `v1.0.0` with the git tag of the version you want to use. - -That's it! Now you have your own ZenML project template that you can use to quickly set up new ML projects. Remember to keep your template up-to-date with the latest best practices and changes in your ML workflows. - -Our [Production Guide](/develop/user-guide/guides/production-guide) documentation is built around the `E2E Batch` project template codes. Most examples will be based on it, so we highly recommend you to install the `e2e_batch` template with `--template-with-defaults` flag before diving deeper into this documentation section, so you can follow this guide along using your own local environment. - -```Bash -mkdir e2e_batch - -cd e2e_batch - -zenml init --template e2e_batch --template-with-defaults -``` - diff --git a/develop/usage/project-setup/use-configuration-files/autogenerate-a-template-yaml-file.mdx b/develop/usage/project-setup/use-configuration-files/autogenerate-a-template-yaml-file.mdx deleted file mode 100644 index ae87d7f..0000000 --- a/develop/usage/project-setup/use-configuration-files/autogenerate-a-template-yaml-file.mdx +++ /dev/null @@ -1,218 +0,0 @@ ---- -title: "Autogenerate a template yaml file" -description: "To help you figure out what you can put in your configuration file, simply autogenerate a template." -icon: file-code ---- - -If you want to generate a template yaml file of your specific pipeline, you can do so by using the `.write_run_configuration_template()` method. This will generate a yaml file with all options commented out. This way you can pick and choose the settings that are relevant to you. - -```py -from zenml import pipeline -... 
- -@pipeline(enable_cache=True) # set cache behavior at step level -def simple_ml_pipeline(parameter: int): - dataset = load_data(parameter=parameter) - train_model(dataset) - -simple_ml_pipeline.write_run_configuration_template(path="") -``` - -```yaml -build: Union[PipelineBuildBase, UUID, NoneType] -enable_artifact_metadata: Optional[bool] -enable_artifact_visualization: Optional[bool] -enable_cache: Optional[bool] -enable_step_logs: Optional[bool] -extra: Mapping[str, Any] -model: - audience: Optional[str] - description: Optional[str] - ethics: Optional[str] - license: Optional[str] - limitations: Optional[str] - name: str - save_models_to_registry: bool - suppress_class_validation_warnings: bool - tags: Optional[List[str]] - trade_offs: Optional[str] - use_cases: Optional[str] - version: Union[ModelStages, int, str, NoneType] - was_created_in_this_run: bool -parameters: Optional[Mapping[str, Any]] -run_name: Optional[str] -schedule: - catchup: bool - cron_expression: Optional[str] - end_time: Optional[datetime] - interval_second: Optional[timedelta] - name: Optional[str] - run_once_start_time: Optional[datetime] - start_time: Optional[datetime] -settings: - docker: - apt_packages: List[str] - build_context_root: Optional[str] - build_options: Mapping[str, Any] - copy_files: bool - copy_global_config: bool - dockerfile: Optional[str] - dockerignore: Optional[str] - environment: Mapping[str, Any] - install_stack_requirements: bool - parent_image: Optional[str] - python_package_installer: PythonPackageInstaller - replicate_local_python_environment: Union[List[str], PythonEnvironmentExportMethod, - NoneType] - required_integrations: List[str] - requirements: Union[NoneType, str, List[str]] - skip_build: bool - prevent_build_reuse: bool - allow_including_files_in_images: bool - allow_download_from_code_repository: bool - allow_download_from_artifact_store: bool - target_repository: str - user: Optional[str] - resources: - cpu_count: Optional[PositiveFloat] - gpu_count: Optional[NonNegativeInt] - memory: Optional[ConstrainedStrValue] -steps: - load_data: - enable_artifact_metadata: Optional[bool] - enable_artifact_visualization: Optional[bool] - enable_cache: Optional[bool] - enable_step_logs: Optional[bool] - experiment_tracker: Optional[str] - extra: Mapping[str, Any] - failure_hook_source: - attribute: Optional[str] - module: str - type: SourceType - model: - audience: Optional[str] - description: Optional[str] - ethics: Optional[str] - license: Optional[str] - limitations: Optional[str] - name: str - save_models_to_registry: bool - suppress_class_validation_warnings: bool - tags: Optional[List[str]] - trade_offs: Optional[str] - use_cases: Optional[str] - version: Union[ModelStages, int, str, NoneType] - was_created_in_this_run: bool - name: Optional[str] - outputs: - output: - default_materializer_source: - attribute: Optional[str] - module: str - type: SourceType - materializer_source: Optional[Tuple[Source, ...]] - parameters: {} - settings: - docker: - apt_packages: List[str] - build_context_root: Optional[str] - build_options: Mapping[str, Any] - copy_files: bool - copy_global_config: bool - dockerfile: Optional[str] - dockerignore: Optional[str] - environment: Mapping[str, Any] - install_stack_requirements: bool - parent_image: Optional[str] - python_package_installer: PythonPackageInstaller - replicate_local_python_environment: Union[List[str], PythonEnvironmentExportMethod, - NoneType] - required_integrations: List[str] - requirements: Union[NoneType, str, List[str]] - 
skip_build: bool - prevent_build_reuse: bool - allow_including_files_in_images: bool - allow_download_from_code_repository: bool - allow_download_from_artifact_store: bool - target_repository: str - user: Optional[str] - resources: - cpu_count: Optional[PositiveFloat] - gpu_count: Optional[NonNegativeInt] - memory: Optional[ConstrainedStrValue] - step_operator: Optional[str] - success_hook_source: - attribute: Optional[str] - module: str - type: SourceType - train_model: - enable_artifact_metadata: Optional[bool] - enable_artifact_visualization: Optional[bool] - enable_cache: Optional[bool] - enable_step_logs: Optional[bool] - experiment_tracker: Optional[str] - extra: Mapping[str, Any] - failure_hook_source: - attribute: Optional[str] - module: str - type: SourceType - model: - audience: Optional[str] - description: Optional[str] - ethics: Optional[str] - license: Optional[str] - limitations: Optional[str] - name: str - save_models_to_registry: bool - suppress_class_validation_warnings: bool - tags: Optional[List[str]] - trade_offs: Optional[str] - use_cases: Optional[str] - version: Union[ModelStages, int, str, NoneType] - was_created_in_this_run: bool - name: Optional[str] - outputs: {} - parameters: {} - settings: - docker: - apt_packages: List[str] - build_context_root: Optional[str] - build_options: Mapping[str, Any] - copy_files: bool - copy_global_config: bool - dockerfile: Optional[str] - dockerignore: Optional[str] - environment: Mapping[str, Any] - install_stack_requirements: bool - parent_image: Optional[str] - python_package_installer: PythonPackageInstaller - replicate_local_python_environment: Union[List[str], PythonEnvironmentExportMethod, - NoneType] - required_integrations: List[str] - requirements: Union[NoneType, str, List[str]] - skip_build: bool - prevent_build_reuse: bool - allow_including_files_in_images: bool - allow_download_from_code_repository: bool - allow_download_from_artifact_store: bool - target_repository: str - user: Optional[str] - resources: - cpu_count: Optional[PositiveFloat] - gpu_count: Optional[NonNegativeInt] - memory: Optional[ConstrainedStrValue] - step_operator: Optional[str] - success_hook_source: - attribute: Optional[str] - module: str - type: SourceType -``` - - - - - When you want to configure your pipeline with a certain stack in mind, you can do so as well: \`...write\_run\_configuration\_template(stack=\) - - diff --git a/develop/usage/project-setup/use-configuration-files/configuration-hierarchy.mdx b/develop/usage/project-setup/use-configuration-files/configuration-hierarchy.mdx deleted file mode 100644 index a2bbfe8..0000000 --- a/develop/usage/project-setup/use-configuration-files/configuration-hierarchy.mdx +++ /dev/null @@ -1,39 +0,0 @@ ---- -title: "Configuration hierarchy" -description: "When things can be configured on the pipeline and step level, the step configuration overrides the pipeline." -icon: layer-group ---- - -There are a few general rules when it comes to settings and configurations that are applied in multiple places. Generally the following is true: - -* Configurations in code override configurations made inside of the yaml file -* Configurations at the step level override those made at the pipeline level -* In case of attributes the dictionaries are merged - -```py -from zenml import pipeline, step -from zenml.config import ResourceSettings - - -@step -def load_data(parameter: int) -> dict: - ... 
- -@step(settings={"resources": ResourceSettings(gpu_count=1, memory="2GB")}) -def train_model(data: dict) -> None: - ... - - -@pipeline(settings={"resources": ResourceSettings(cpu_count=2, memory="1GB")}) -def simple_ml_pipeline(parameter: int): - ... - -# ZenML merges the two configurations and uses the step configuration to override -# values defined on the pipeline level - -train_model.configuration.settings["resources"] -# -> cpu_count: 2, gpu_count=1, memory="2GB" - -simple_ml_pipeline.configuration.settings["resources"] -# -> cpu_count: 2, memory="1GB" -``` diff --git a/develop/usage/project-setup/use-configuration-files/how-to-use-config.mdx b/develop/usage/project-setup/use-configuration-files/how-to-use-config.mdx deleted file mode 100644 index 8440d89..0000000 --- a/develop/usage/project-setup/use-configuration-files/how-to-use-config.mdx +++ /dev/null @@ -1,40 +0,0 @@ ---- -title: "How to configure a pipeline with a YAML" -description: "Specify a configuration file" -icon: glass-water ---- - - - All configuration that can be specified in a YAML file can also be specified in code itself. However, it is best practice to use a YAML file to separate config from code. - - -You can use the `with_options(config_path=<PATH_TO_CONFIG>)` pattern to apply your configuration to a pipeline. Here is a minimal example of using a file-based configuration YAML. - -```yaml -enable_cache: False - -# Configure the pipeline parameters -parameters: - dataset_name: "best_dataset" - -steps: - load_data: # Use the step name here - enable_cache: False # same as @step(enable_cache=False) -``` -```py -from zenml import step, pipeline - -@step -def load_data(dataset_name: str) -> dict: - ... - -@pipeline # This function combines steps together -def simple_ml_pipeline(dataset_name: str): - load_data(dataset_name) - -if __name__=="__main__": - simple_ml_pipeline.with_options(config_path=<PATH_TO_CONFIG>)() -``` - -The above would run the `simple_ml_pipeline` with cache disabled for `load_data` and the parameter `dataset_name` set to `best_dataset`. - diff --git a/develop/usage/project-setup/use-configuration-files/retrieve-used-configuration-of-a-run.mdx b/develop/usage/project-setup/use-configuration-files/retrieve-used-configuration-of-a-run.mdx deleted file mode 100644 index 873f3f9..0000000 --- a/develop/usage/project-setup/use-configuration-files/retrieve-used-configuration-of-a-run.mdx +++ /dev/null @@ -1,15 +0,0 @@ ---- -title: "Find out which configuration was used for a run" -description: Sometimes you might want to extract the used configuration from a pipeline that has already run. -icon: user-minus ---- - -You can do this simply by loading the pipeline run and accessing its `config` attribute. - -```py -from zenml.client import Client - -pipeline_run = Client().get_pipeline_run("<PIPELINE_RUN_NAME>") - -configuration = pipeline_run.config -``` diff --git a/develop/usage/project-setup/use-configuration-files/runtime-configuration.mdx b/develop/usage/project-setup/use-configuration-files/runtime-configuration.mdx deleted file mode 100644 index ba2397c..0000000 --- a/develop/usage/project-setup/use-configuration-files/runtime-configuration.mdx +++ /dev/null @@ -1,75 +0,0 @@ ---- -title: "Runtime settings for Docker, resources, and stack components" -description: "Using settings to configure runtime configuration." -icon: play ---- - - - - - -Part of the configuration of a pipeline is its `Settings`. These allow you to configure runtime configurations for stack components and pipelines. 
Concretely, they allow you to configure: - -* The [resources](/develop/usage/resource-data-management/training-with-gpus#specify-resource-requirements-for-steps) required for a step -* Configuring the [containerization](/develop/usage/project-setup/customize-docker-builds) process of a pipeline (e.g. What requirements get installed in the Docker image) -* Stack component-specific configuration, e.g., if you have an experiment tracker passing in the name of the experiment at runtime - -You will learn about all of the above in more detail later, but for now, let's try to understand that all of this configuration flows through one central concept called `BaseSettings`. (From here on, we use `settings` and `BaseSettings` as analogous in this guide). - -## Types of settings - -Settings are categorized into two types: - -* **General settings** that can be used on all ZenML pipelines. Examples of these are: - * [DockerSettings](/develop/usage/project-setup/customize-docker-builds) to specify Docker settings. - * [ResourceSettings](/develop/usage/resource-data-management/training-with-gpus) to specify resource settings. -* **Stack-component-specific settings**: These can be used to supply runtime configurations to certain stack components (the key should be `` or `.`). Settings for components not in the active stack will be ignored. Examples of these are: - * [SkypilotAWSOrchestratorSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-skypilot%5Faws/#zenml.integrations.skypilot%5Faws.flavors.skypilot%5Forchestrator%5Faws%5Fvm%5Fflavor.SkypilotAWSOrchestratorSettings) to specify Skypilot settings (works for `SkypilotGCPOrchestratorSettings` and `SkypilotAzureOrchestratorSettings` as well). - * [KubeflowOrchestratorSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-kubeflow/#zenml.integrations.kubeflow.flavors.kubeflow%5Forchestrator%5Fflavor.KubeflowOrchestratorSettings) to specify Kubeflow settings. - * [MLflowExperimentTrackerSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-mlflow/#zenml.integrations.mlflow.flavors.mlflow%5Fexperiment%5Ftracker%5Fflavor.MLFlowExperimentTrackerSettings) to specify MLflow settings. - * [WandbExperimentTrackerSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-wandb/#zenml.integrations.wandb.flavors.wandb%5Fexperiment%5Ftracker%5Fflavor.WandbExperimentTrackerSettings) to specify W&B settings. - * [WhylogsDataValidatorSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-whylogs/#zenml.integrations.whylogs.flavors.whylogs%5Fdata%5Fvalidator%5Fflavor.WhylogsDataValidatorSettings) to specify Whylogs settings. - * [SagemakerStepOperatorSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.flavors.sagemaker%5Fstep%5Foperator%5Fflavor.SagemakerStepOperatorSettings) to specify AWS Sagemaker step operator settings. - * [VertexStepOperatorSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-gcp/#zenml.integrations.gcp.flavors.vertex%5Fstep%5Foperator%5Fflavor.VertexStepOperatorSettings) to specify GCP Vertex step operator settings. - * [AzureMLStepOperatorSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-azure/#zenml.integrations.azure.flavors.azureml%5Fstep%5Foperator%5Fflavor.AzureMLStepOperatorSettings) to specify AzureML step operator settings. 
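
To make the distinction above concrete, here is a minimal sketch of how general settings are typically attached in code (the resource values and the requirement below are placeholders, not recommendations):

```py
from zenml import pipeline, step
from zenml.config import DockerSettings, ResourceSettings


# General settings are passed via the `settings` dict of a step or pipeline.
@step(settings={"resources": ResourceSettings(cpu_count=2, memory="2GB")})
def train_model() -> None:
    ...


# Pipeline-level settings apply to all steps unless a step overrides them.
@pipeline(settings={"docker": DockerSettings(requirements=["scikit-learn"])})
def training_pipeline():
    train_model()
```

Stack-component-specific settings follow the same pattern, just keyed by `<COMPONENT_CATEGORY>` or `<COMPONENT_CATEGORY>.<COMPONENT_FLAVOR>` as described below.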
- -### Difference between stack component settings at registration-time vs real-time - -For stack-component-specific settings, you might be wondering what the difference is between these and the configuration passed in while doing `zenml stack-component register --config1=configvalue --config2=configvalue`, etc. The answer is that the configuration passed in at registration time is static and fixed throughout all pipeline runs, while the settings can change. - -A good example of this is the [MLflow Experiment Tracker](/develop/stack-components/experiment-trackers/mlflow), where configuration which remains static such as the `tracking_url` is sent through at registration time, while runtime configuration such as the `experiment_name` (which might change every pipeline run) is sent through as runtime settings. - -Even though settings can be overridden at runtime, you can also specify _default_ values for settings while configuring a stack component. For example, you could set a default value for the `nested` setting of your MLflow experiment tracker: `zenml experiment-tracker register --flavor=mlflow --nested=True` - -This means that all pipelines that run using this experiment tracker use nested MLflow runs unless overridden by specifying settings for the pipeline at runtime. - -### Using the right key for Stack-component-specific settings - -When specifying stack-component-specific settings, a key needs to be passed. This key should always correspond to the pattern: `` or `.`. If you specify just the category (e.g. `step_operator` or `orchestrator`), ZenML will try to apply those settings to whatever flavor of component is in your stack when running a pipeline. If your settings don't apply to this flavor, they will be ignored. - -For example, the [SagemakerStepOperator](/develop/stack-components/step-operators/sagemaker) supports passing in [estimator\_args](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-aws/#zenml.integrations.aws.flavors.sagemaker%5Fstep%5Foperator%5Fflavor.SagemakerStepOperatorSettings). The way to specify this would be to use the key `step_operator` - - -```py -@step(step_operator="nameofstepoperator", settings= {"step_operator": {"estimator_args": {"instance_type": "m7g.medium"}}}) -def my_step(): - ... - -# Using the class -@step(step_operator="nameofstepoperator", settings= {"step_operator": SagemakerStepOperatorSettings(instance_type="m7g.medium")}) -def my_step(): - ... -``` - -or in YAML: - - -```yaml -steps: - my_step: - step_operator: "nameofstepoperator" - settings: - step_operator: - estimator_args: - instance_type: m7g.medium -``` diff --git a/develop/usage/project-setup/use-configuration-files/use-configuration-files.mdx b/develop/usage/project-setup/use-configuration-files/use-configuration-files.mdx deleted file mode 100644 index fad1769..0000000 --- a/develop/usage/project-setup/use-configuration-files/use-configuration-files.mdx +++ /dev/null @@ -1,14 +0,0 @@ ---- -title: "Overview" -description: "ZenML makes it easy to configure and run a pipeline with configuration files." -icon: files ---- - -ZenML pipelines can be configured at runtime with a simple YAML file that can help you set [parameters](/develop/usage/pipelines/build-pipelines/use-pipeline-step-parameters), control [caching behavior](/develop/usage/pipelines/build-pipelines/control-caching-behavior) or even configure different stack components. 
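
As a quick, abridged illustration (the individual keys are explained in the sections below), such a configuration file might look like:

```yaml
enable_cache: False

parameters:
  dataset_name: "best_dataset"

settings:
  docker:
    requirements: ["torch"]
```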
- -Learn more about the different options in the following sections: - - - - - diff --git a/develop/usage/project-setup/use-configuration-files/what-can-be-configured.mdx b/develop/usage/project-setup/use-configuration-files/what-can-be-configured.mdx deleted file mode 100644 index a95750b..0000000 --- a/develop/usage/project-setup/use-configuration-files/what-can-be-configured.mdx +++ /dev/null @@ -1,240 +0,0 @@ ---- -title: "What can be configured" -description: Here is an example of a sample YAML file, with the most important configuration highlighted. -icon: clipboard-question ---- - -For brevity, we have removed all possible keys. To view a sample file with all possible keys, refer to [this page](/develop/usage/project-setup/use-configuration-files/autogenerate-a-template-yaml-file). - -```yaml -# Build ID (i.e. which Docker image to use) -build: dcd6fafb-c200-4e85-8328-428bef98d804 - -# Enable flags (boolean flags that control behavior) -enable_artifact_metadata: True -enable_artifact_visualization: False -enable_cache: False -enable_step_logs: True - -# Extra dictionary to pass in arbitrary values -extra: - any_param: 1 - another_random_key: "some_string" - -# Specify the "ZenML Model" -model: - name: "classification_model" - version: production - - audience: "Data scientists" - description: "This classifies hotdogs and not hotdogs" - ethics: "No ethical implications" - license: "Apache 2.0" - limitations: "Only works for hotdogs" - tags: ["sklearn", "hotdog", "classification"] - -# Parameters of the pipeline -parameters: - dataset_name: "another_dataset" - -# Name of the run -run_name: "my_great_run" - -# Schedule, if supported on the orchestrator -schedule: - catchup: true - cron_expression: "* * * * *" - -# Real-time settings for Docker and resources -settings: - # Controls Docker building - docker: - apt_packages: ["curl"] - copy_files: True - dockerfile: "Dockerfile" - dockerignore: ".dockerignore" - environment: - ZENML_LOGGING_VERBOSITY: DEBUG - parent_image: "zenml-io/zenml-cuda" - requirements: ["torch"] - skip_build: False - - # Control resources for the entire pipeline - resources: - cpu_count: 2 - gpu_count: 1 - memory: "4Gb" - -# Per step configuration -steps: - # Top-level key should be the name of the step invocation ID - train_model: - # Parameters of the step - parameters: - data_source: "best_dataset" - - # Step-only configuration - experiment_tracker: "mlflow_production" - step_operator: "vertex_gpu" - outputs: {} - failure_hook_source: {} - success_hook_source: {} - - # Same as pipeline level configuration, if specified overrides for this step - enable_artifact_metadata: True - enable_artifact_visualization: True - enable_cache: False - enable_step_logs: True - - # Same as pipeline level configuration, if specified overrides for this step - extra: {} - - # Same as pipeline level configuration, if specified overrides for this step - model: {} - - # Same as pipeline level configuration, if specified overrides for this step - settings: - docker: {} - resources: {} - - # Stack component specific settings - step_operator.sagemaker: - estimator_args: - instance_type: m7g.medium -``` - -## Deep-dive - -### `enable_XXX` parameters - -These are boolean flags for various configurations: - -* `enable_artifact_metadata`: Whether to [associate metadata with artifacts or not](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types#optional-which-metadata-to-extract-for-the-artifact). 
-* `enable_artifact_visualization`: Whether to [attach visualizations of artifacts](/develop/usage/resource-data-management/visualize-artifacts). -* `enable_cache`: Utilize [caching](/develop/usage/pipelines/build-pipelines/control-caching-behavior) or not. -* `enable_step_logs`: Enable tracking [step logs](/develop/usage/development-monitoring/control-logging/enable-or-disable-logs-storing). - -```yaml -enable_artifact_metadata: True -enable_artifact_visualization: True -enable_cache: True -enable_step_logs: True -``` - -### `build` ID - -The UUID of the [build](/develop/usage/project-setup/customize-docker-builds) to use for this pipeline. If specified, Docker image building is skipped for remote orchestrators, and the Docker image specified in this build is used. - -```yaml -build: -``` - -### Configuring the `model` - -Specifies the ZenML [Model](/develop/user-guide/guides/starter-guide/track-ml-models) to use for this pipeline. - -```yaml -model: - name: "ModelName" - version: "production" - description: An example model - tags: ["classifier"] -``` - -### Pipeline and step `parameters` - -A dictionary of JSON-serializable [parameters](/develop/usage/pipelines/build-pipelines/use-pipeline-step-parameters) specified at the pipeline or step level. For example: - -```yaml -parameters: - gamma: 0.01 - -steps: - trainer: - parameters: - gamma: 0.001 -``` - -Corresponds to: - - -```yaml -from zenml import step, pipeline - -@step -def trainer(gamma: float): - # Use gamma as normal - print(gamma) - -@pipeline -def my_pipeline(gamma: float): - # use gamma or pass it into the step - print(0.01) - trainer(gamma=gamma) -``` - -Important note, in the above case, the value of the step would be the one defined in the `steps` key (i.e. 0.001). So the YAML config always takes precedence over pipeline parameters that are passed down to steps in code. Read [this section for more details](/develop/usage/project-setup/use-configuration-files/configuration-hierarchy). - -Normally, parameters defined at the pipeline level are used in multiple steps, and then no step-level configuration is defined. - - - Note that `parameters` are different from `artifacts`. Parameters are JSON-serializable values that are passed in the runtime configuration of a pipeline. Artifacts are inputs and outputs of a step, and need not always be JSON-serializable ([materializers](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) handle their persistence in the [artifact store](/develop/stack-components/artifact-stores)). - - -### Setting the `run_name` - -To change the name for a run, pass `run_name` as a parameter. This can be a dynamic value as well. - -```yaml -run_name: -``` - - - You will not be able to run with the same run\_name twice. Do not set this statically when running on a schedule. Try to include some auto-incrementation or timestamp to the name. - - -### Stack Component Runtime settings - -Settings are special runtime configurations of a pipeline or a step that require a [dedicated section](/develop/usage/project-setup/use-configuration-files/runtime-configuration). In short, they define a bunch of execution configuration such as Docker building and resource settings. - -### Docker Settings - -Docker Settings can be passed in directly as objects, or a dictionary representation of the object. 
For example, the Docker configuration can be set in configuration files as follows: - -```yaml -settings: - docker: - requirements: - - pandas - -``` - - - Find a complete list of all Docker Settings [here](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.docker%5Fsettings.DockerSettings). To learn more about pipeline containerization consult our documentation on this [here](/develop/usage/project-setup/customize-docker-builds). - - -### Resource Settings - -Some stacks allow setting the resource settings using these settings. - -```yaml -resources: - cpu_count: 2 - gpu_count: 1 - memory: "4Gb" -``` - -Note that this may not work for all types of stack components. To learn which components support this, please refer to the specific orchestrator docs. - -### `failure_hook_source` and `success_hook_source` - -The `source` of the [failure and success hooks](/develop/usage/pipelines/build-pipelines/use-failure-success-hooks) can be specified. - -### Step-specific configuration - -A lot of pipeline-level configuration can also be applied at a step level (as we have already seen with the `enable_cache` flag). However, there is some configuration that is step-specific, meaning it cannot be applied at a pipeline level, but only at a step level. - -* `experiment_tracker`: Name of the [experiment\_tracker](/develop/stack-components/experiment-trackers) to enable for this step. This experiment\_tracker should be defined in the active stack with the same name. -* `step_operator`: Name of the [step\_operator](/develop/stack-components/step-operators) to enable for this step. This step\_operator should be defined in the active stack with the same name. -* `outputs`: This is configuration of the output artifacts of this step. This is further keyed by output name (by default, step outputs [are named output](/develop/usage/resource-data-management/handle-data-artifacts/return-multiple-outputs-from-a-step)). The most interesting configuration here is the `materializer_source`, which is the UDF path of the materializer in code to use for this output (e.g. `materializers.some_data.materializer.materializer_class`). Read more about this source path [here](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types). - diff --git a/develop/usage/project-setup/use-secrets/interact-with-secrets.mdx b/develop/usage/project-setup/use-secrets/interact-with-secrets.mdx deleted file mode 100644 index 9ffee87..0000000 --- a/develop/usage/project-setup/use-secrets/interact-with-secrets.mdx +++ /dev/null @@ -1,148 +0,0 @@ ---- -title: "Interact with secrets" -description: "Registering and using secrets." -icon: user-lock ---- - -## What is a ZenML secret? - -ZenML secrets are groupings of **key-value pairs** which are securely stored in the ZenML secrets store. Additionally, a secret always has a **name** that allows you to fetch or reference them in your pipelines and stacks. - -## How to create a secret - - -To create a secret with a name `` and a key-value pair, you can run the following CLI command: -```bash -zenml secret create \ - --= \ - --= - -# Another option is to use the '--values' option and provide key-value pairs in either JSON or YAML format. 
-zenml secret create \ - --values='{"key1":"value2","key2":"value2"}' -``` -Alternatively, you can create the secret in an interactive session (in which ZenML will query you for the secret keys and values) by passing the `--interactive/-i` parameter: - -```bash -zenml secret create -i -``` - -For secret values that are too big to pass as a command line argument, or have special characters, you can also use the special `@` syntax to indicate to ZenML that the value needs to be read from a file: - -```bash -zenml secret create \ - --key=@path/to/file.txt \ - ... - -# Alternatively, you can utilize the '--values' option by specifying a file path containing key-value pairs in either JSON or YAML format. -zenml secret create \ - --values=@path/to/file.txt -``` - -The CLI also includes commands that can be used to list, update and delete secrets. A full guide on using the CLI to create, access, update and delete secrets is available [here](https://sdkdocs.zenml.io/latest/cli/#zenml.cli--secrets-management). - -**Interactively register missing secrets for your stack** - -If you're using components with [secret references](/develop/usage/project-setup/use-secrets/interact-with-secrets#reference-secrets-in-stack-component-attributes-and-settings) in your stack, you need to make sure that all the referenced secrets exist. To make this process easier, you can use the following CLI command to interactively register all secrets for a stack: - -```bash -zenml stack register-secrets [] -``` - - - -The ZenML client API offers a programmatic interface to create, e.g.: - -```JSON -from zenml.client import Client - -client = Client() -client.create_secret( - name="my_secret", - values={ - "username": "admin", - "password": "abc123" - } -) -``` - -Other Client methods used for secrets management include `get_secret` to fetch a secret by name or id, `update_secret` to update an existing secret, `list_secrets` to query the secrets store using a variety of filtering and sorting criteria, and `delete_secret` to delete a secret. The full Client API reference is available [here](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-client/). - - - - -## Set scope for secrets - -ZenML secrets can be scoped to a user. This allows you to create secrets that are only accessible to one user. - -By default, all created secrets are scoped to the active user. To create a secret and scope it to your active user instead, you can pass the `--scope` argument to the CLI command: - -```bash -zenml secret create \ - --scope user \ - --= \ - --= -``` - -Scopes also act as individual namespaces. When you are referencing a secret by name in your pipelines and stacks, ZenML will look for a secret with that name scoped to the active user. - -## Accessing registered secrets - -### Reference secrets in stack component attributes and settings - -Some of the components in your stack require you to configure them with sensitive information like passwords or tokens, so they can connect to the underlying infrastructure. Secret references allow you to configure these components in a secure way by not specifying the value directly but instead referencing a secret by providing the secret name and key. 
To reference a secret for the value of any string attribute of your stack components, simply specify the attribute using the following syntax: `{{<SECRET_NAME>.<SECRET_KEY>}}` - -For example: - -```bash CLI -# Register a secret called `mlflow_secret` with key-value pairs for the -# username and password to authenticate with the MLflow tracking server - -# Using central secrets management -zenml secret create mlflow_secret \ - --username=admin \ - --password=abc123 - - -# Then reference the username and password in our experiment tracker component -zenml experiment-tracker register mlflow \ - --flavor=mlflow \ - --tracking_username={{mlflow_secret.username}} \ - --tracking_password={{mlflow_secret.password}} \ - ... -``` - -When using secret references in your stack, ZenML will validate that all secrets and keys referenced in your stack components exist before running a pipeline. This helps us fail early so your pipeline doesn't fail after running for some time due to some missing secret. - -This validation by default needs to fetch and read every secret to make sure that both the secret and the specified key-value pair exist. This can take quite some time and might fail if you don't have permission to read secrets. - -You can use the environment variable `ZENML_SECRET_VALIDATION_LEVEL` to disable or control the degree to which ZenML validates your secrets: - -* Setting it to `NONE` disables any validation. -* Setting it to `SECRET_EXISTS` only validates the existence of secrets. This might be useful if the machine you're running on only has permission to list secrets but not actually read their values. -* Setting it to `SECRET_AND_KEY_EXISTS` (the default) validates both the secret existence as well as the existence of the exact key-value pair. - -### Fetch secret values in a step - -If you are using [centralized secrets management](/develop/usage/project-setup/use-secrets/interact-with-secrets), you can access secrets directly from within your steps through the ZenML `Client` API. This allows you to use your secrets for querying APIs from within your step without hard-coding your access keys: - -```py - -from zenml import step -from zenml.client import Client - - -@step -def secret_loader() -> None: - """Load the example secret from the server.""" - # Fetch the secret from ZenML. - secret = Client().get_secret(<SECRET_NAME>) - - # `secret.secret_values` will contain a dictionary with all key-value - # pairs within your secret. - authenticate_to_some_api( - username=secret.secret_values["username"], - password=secret.secret_values["password"], - ) - ... -``` diff --git a/develop/usage/project-setup/use-secrets/reference-secrets-in-stack-configuration.mdx b/develop/usage/project-setup/use-secrets/reference-secrets-in-stack-configuration.mdx deleted file mode 100644 index 18d00f2..0000000 --- a/develop/usage/project-setup/use-secrets/reference-secrets-in-stack-configuration.mdx +++ /dev/null @@ -1,66 +0,0 @@ ---- -title: "Reference secrets in stack configuration" -description: "Reference secrets in stack component attributes and settings" -icon: key ---- - -Some of the components in your stack require you to configure them with sensitive information like passwords or tokens, so they can connect to the underlying infrastructure. Secret references allow you to configure these components in a secure way by not specifying the value directly but instead referencing a secret by providing the secret name and key. 
To reference a secret for the value of any string attribute of your stack components, simply specify the attribute using the following syntax: `{{<SECRET_NAME>.<SECRET_KEY>}}` - -For example: - -```Bash CLI -# Register a secret called `mlflow_secret` with key-value pairs for the -# username and password to authenticate with the MLflow tracking server - -# Using central secrets management -zenml secret create mlflow_secret \ - --username=admin \ - --password=abc123 - - -# Then reference the username and password in our experiment tracker component -zenml experiment-tracker register mlflow \ - --flavor=mlflow \ - --tracking_username={{mlflow_secret.username}} \ - --tracking_password={{mlflow_secret.password}} \ - ... -``` - -When using secret references in your stack, ZenML will validate that all secrets and keys referenced in your stack components exist before running a pipeline. This helps us fail early so your pipeline doesn't fail after running for some time due to some missing secret. - -This validation by default needs to fetch and read every secret to make sure that both the secret and the specified key-value pair exist. This can take quite some time and might fail if you don't have permission to read secrets. - -You can use the environment variable `ZENML_SECRET_VALIDATION_LEVEL` to disable or control the degree to which ZenML validates your secrets: - -* Setting it to `NONE` disables any validation. -* Setting it to `SECRET_EXISTS` only validates the existence of secrets. This might be useful if the machine you're running on only has permission to list secrets but not actually read their values. -* Setting it to `SECRET_AND_KEY_EXISTS` (the default) validates both the secret existence as well as the existence of the exact key-value pair. - -### Fetch secret values in a step - -If you are using [centralized secrets management](/develop/usage/project-setup/use-secrets/interact-with-secrets), you can access secrets directly from within your steps through the ZenML `Client` API. This allows you to use your secrets for querying APIs from within your step without hard-coding your access keys: - -```py -from zenml import step -from zenml.client import Client - - -@step -def secret_loader() -> None: - """Load the example secret from the server.""" - # Fetch the secret from ZenML. - secret = Client().get_secret(<SECRET_NAME>) - - # `secret.secret_values` will contain a dictionary with all key-value - # pairs within your secret. - authenticate_to_some_api( - username=secret.secret_values["username"], - password=secret.secret_values["password"], - ) - ... -``` - -## See Also - -* [Interact with secrets](/develop/usage/project-setup/use-secrets/interact-with-secrets): Learn how to create, list, and delete secrets using the ZenML CLI and Python SDK. - diff --git a/develop/usage/project-setup/use-secrets/secret-management.mdx b/develop/usage/project-setup/use-secrets/secret-management.mdx deleted file mode 100644 index a61a78f..0000000 --- a/develop/usage/project-setup/use-secrets/secret-management.mdx +++ /dev/null @@ -1,72 +0,0 @@ ---- -title: "Secret management" -description: "Configuring the secrets store." -icon: bars-progress ---- - -## Centralized secrets store - -ZenML provides a centralized secrets management system that allows you to register and manage secrets in a secure way. The metadata of the ZenML secrets (e.g. name, ID, owner, scope etc.) is always stored in the ZenML server database, while the actual secret values are stored and managed separately, through the ZenML Secrets Store. 
This allows for a flexible deployment strategy that meets the security and compliance requirements of your organization. - -In a local ZenML deployment, secret values are also stored in the local SQLite database. When connected to a remote ZenML server, the secret values are stored in the secrets management back-end that the server's Secrets Store is configured to use, while all access to the secrets is done through the ZenML server API. - - - - -Currently, the ZenML server can be configured to use one of the following supported secrets store back-ends: - -* the same SQL database that the ZenML server is using to store secrets metadata as well as other managed objects such as pipelines, stacks, etc. This is the default option. -* the AWS Secrets Manager -* the GCP Secret Manager -* the Azure Key Vault -* the HashiCorp Vault -* a custom secrets store back-end implementation is also supported - -## Configuration and deployment - -Configuring the specific secrets store back-end that the ZenML server uses is done at deployment time. This involves deciding on one of the supported back-ends and authentication mechanisms and configuring the ZenML server with the necessary credentials to authenticate with the back-end. - -The ZenML secrets store reuses the [ZenML Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide) authentication mechanisms to authenticate with the secrets store back-end. This means that the same authentication methods and configuration parameters that are supported by the available Service Connectors are also reflected in the ZenML secrets store configuration. It is recommended to practice the principle of least privilege when configuring the ZenML secrets store and to use credentials with the documented minimum required permissions to access the secrets store back-end. - -The ZenML secrets store configured for the ZenML Server can be updated at any time by updating the ZenML Server configuration and redeploying the server. This allows you to easily switch between different secrets store back-ends and authentication mechanisms. However, it is recommended to follow [the documented secret store migration strategy](/develop/usage/project-setup/use-secrets/secret-management#secrets-migration-strategy) to minimize downtime and to ensure that existing secrets are also properly migrated, in case the location where secrets are stored in the back-end changes. - -For more information on how to deploy a ZenML server and configure the secrets store back-end, refer to your deployment strategy inside the deployment guide. - -## Backup secrets store - -The ZenML Server deployment may be configured to optionally connect to _a second Secrets Store_ to provide additional features such as high-availability, backup and disaster recovery as well as an intermediate step in the process of migrating [secrets from one secrets store location to another](/develop/usage/project-setup/use-secrets/secret-management#secrets-migration-strategy). For example, the primary Secrets Store may be configured to use the internal database, while the backup Secrets Store may be configured to use the AWS Secrets Manager. Or two different AWS Secrets Manager accounts or regions may be used. - - - Always make sure that the backup Secrets Store is configured to use a different location than the primary Secrets Store. The location can be different in terms of the Secrets Store back-end type (e.g. internal database vs. 
AWS Secrets Manager) or the actual location of the Secrets Store back-end (e.g. different AWS Secrets Manager account or region, GCP Secret Manager project or Azure Key Vault's vault). - - -Using the same location for both the primary and backup Secrets Store will not provide any additional benefits and may even result in unexpected behavior. - -When a backup secrets store is in use, the ZenML Server will always attempt to read and write secret values from/to the primary Secrets Store first while ensuring to keep the backup Secrets Store in sync. If the primary Secrets Store is unreachable, if the secret values are not found there or any otherwise unexpected error occurs, the ZenML Server falls back to reading and writing from/to the backup Secrets Store. Only if the backup Secrets Store is also unavailable, the ZenML Server will return an error. - -In addition to the hidden backup operations, users can also explicitly trigger a backup operation by using the `zenml secret backup` CLI command. This command will attempt to read all secrets from the primary Secrets Store and write them to the backup Secrets Store. Similarly, the `zenml secret restore` CLI command can be used to restore secrets from the backup Secrets Store to the primary Secrets Store. These CLI commands are useful for migrating secrets from one Secrets Store to another. - -## Secrets migration strategy - -Sometimes you may need to change the external provider or location where secrets values are stored by the Secrets Store. The immediate implication of this is that the ZenML server will no longer be able to access existing secrets with the new configuration until they are also manually copied to the new location. Some examples of such changes include: - -* switching Secrets Store back-end types (e.g. from internal SQL database to AWS Secrets Manager or Azure Key Vault) -* switching back-end locations (e.g. changing the AWS Secrets Manager account or region, GCP Secret Manager project or Azure Key Vault's vault). - -In such cases, it is not sufficient to simply reconfigure and redeploy the ZenML server with the new Secrets Store configuration. This is because the ZenML server will not automatically migrate existing secrets to the new location. Instead, you should follow a specific migration strategy to ensure that existing secrets are also properly migrated to the new location with minimal, even zero downtime. - -The secrets migration process makes use of the fact that [a secondary Secrets Store](/develop/usage/project-setup/use-secrets/secret-management#backup-secrets-store) can be configured for the ZenML server for backup purposes. This secondary Secrets Store is used as an intermediate step in the migration process. The migration process is as follows (we'll refer to the Secrets Store that is currently in use as _Secrets Store A_ and the Secrets Store that will be used after the migration as _Secrets Store B_): - -1. Re-configure the ZenML server to use _Secrets Store B_ as the secondary Secrets Store. -2. Re-deploy the ZenML server. -3. Use the `zenml secret backup` CLI command to back up all secrets from _Secrets Store A_ to _Secrets Store B_. You don't have to worry about secrets that are created or updated by users during or after this process, as they will be automatically backed up to _Secrets Store B_. If you also wish to delete secrets from _Secrets Store A_ after they are successfully backed up to _Secrets Store B_, you should run `zenml secret backup --delete-secrets` instead. -4. 
Re-configure the ZenML server to use _Secrets Store B_ as the primary Secrets Store and remove _Secrets Store A_ as the secondary Secrets Store. -5. Re-deploy the ZenML server. - -This migration strategy is not necessary if the actual location of the secrets values in the Secrets Store back-end does not change. For example: - -* updating the credentials used to authenticate with the Secrets Store back-end before or after they expire -* switching to a different authentication method to authenticate with the same Secrets Store back-end (e.g. switching from an IAM account secret key to an IAM role in the AWS Secrets Manager) - -If you are a [ZenML Pro](https://zenml.io/pro) user, you can configure your cloud backend based on your [deployment scenario](/develop/getting-started/deploying-zenml/zenml-pro/system-architectures). - diff --git a/develop/usage/resource-data-management/auth-management/auth-management.mdx b/develop/usage/resource-data-management/auth-management/auth-management.mdx deleted file mode 100644 index c96747a..0000000 --- a/develop/usage/resource-data-management/auth-management/auth-management.mdx +++ /dev/null @@ -1,722 +0,0 @@ ---- -title: "Overview" -description: "Connect your ZenML deployment to a cloud provider and other infrastructure services and resources." -icon: bolt ---- - -A production-grade MLOps platform involves interactions between a diverse combination of third-party libraries and external services sourced from various different vendors. One of the most daunting hurdles in building and operating an MLOps platform composed of multiple components is configuring and maintaining uninterrupted and secured access to the infrastructure resources and services that it consumes. - -In layman's terms, your pipeline code needs to "connect" to a handful of different services to run successfully and do what it's designed to do. For example, it might need to connect to a private AWS S3 bucket to read and store artifacts, a Kubernetes cluster to execute steps with Kubeflow or Tekton, and a private GCR container registry to build and store container images. ZenML makes this possible by allowing you to configure authentication information and credentials embedded directly into your Stack Components, but this doesn't scale well when you have more than a few Stack Components and has many other disadvantages related to usability and security. - -Gaining access to infrastructure resources and services requires knowledge about the different authentication and authorization mechanisms and involves configuring and maintaining valid credentials. It gets even more complicated when these different services need to access each other. For instance, the Kubernetes container running your pipeline step needs access to the S3 bucket to store artifacts or needs to access a cloud service like AWS SageMaker, VertexAI, or AzureML to run a CPU/GPU intensive task like training a model. - -The challenge comes from _setting up and implementing proper authentication and authorization_ with the best security practices in mind, while at the same time _keeping this complexity away from the day-to-day routines_ of coding and running pipelines. - -The hard-to-swallow truth is there is no single standard that unifies all authentication and authorization-related matters or a single, well-defined set of security best practices that you can follow. 
However, with ZenML you get the next best thing, an abstraction that keeps the complexity of authentication and authorization away from your code and makes it easier to tackle them: _the ZenML Service Connectors_. - - - - - -Service Connectors abstract away complexity and implement security best practices - -## A representative use-case - -The range of features covered by Service Connectors is extensive and going through the entire [Service Connector Guide](/develop/usage/resource-data-management/auth-management/service-connectors-guide) can be overwhelming. If all you want is to get a quick overview of how Service Connectors work and what they can do for you, this section is for you. - -This is a representative example of how you would use a Service Connector to connect ZenML to a cloud service. This example uses [the AWS Service Connector](/develop/usage/resource-data-management/auth-management/aws-service-connector) to connect ZenML to an AWS S3 bucket and then link [an S3 Artifact Store Stack Component](/develop/stack-components/artifact-stores/s3) to it. - -Some details about the current alternatives to using Service Connectors and their drawbacks are provided below. Feel free to skip them if you are already familiar with them or just want to get to the good part. - - - -There are quicker alternatives to using a Service Connector to link an S3 Artifact Store to a private AWS S3 bucket. Let's lay them out first and then explain why using a Service Connector is the better option: - -1. the authentication information can be embedded directly into the Stack Component, although this is not recommended for security reasons: - -```bash -zenml artifact-store register s3 --flavor s3 --path=s3://BUCKET_NAME --key=AWS_ACCESS_KEY --secret=AWS_SECRET_KEY -``` -2. [a ZenML secret](/develop/usage/project-setup/use-secrets/secret-management) can hold the AWS credentials and then be referenced in the S3 Artifact Store configuration attributes: - -```bash -zenml secret create aws --aws_access_key_id=AWS_ACCESS_KEY --aws_secret_access_key=AWS_SECRET_KEY -zenml artifact-store register s3 --flavor s3 --path=s3://BUCKET_NAME --key='{{aws.aws_access_key_id}}' --secret='{{aws.aws_secret_access_key}}' -``` -3. an even better version is to reference the secret itself in the S3 Artifact Store configuration: - -```bash -zenml secret create aws --aws_access_key_id=AWS_ACCESS_KEY --aws_secret_access_key=AWS_SECRET_KEY -zenml artifact-store register s3 --flavor s3 --path=s3://BUCKET_NAME --authentication_secret=aws -``` - -All these options work, but they have many drawbacks: - -* first of all, not all Stack Components support referencing secrets in their configuration attributes, so this is not a universal solution. -* some Stack Components, like those linked to Kubernetes clusters, rely on credentials being set up on the machine where the pipeline is running, which makes pipelines less portable and more difficult to set up. In other cases, you also need to install and set up cloud-specific SDKs and CLIs to be able to use the Stack Component. -* people configuring and using Stack Components linked to cloud resources need to be given access to cloud credentials, or even provision the credentials themselves, which requires access to the cloud provider platform and knowledge about how to do it. -* in many cases, you can only configure long-lived credentials directly in Stack Components. 
This is a security risk because they can inadvertently grant access to key resources and services to a malicious party if they are compromised. Implementing a process that rotates credentials regularly is a complex task that requires a lot of effort and maintenance. -* Stack Components don't implement any kind of verification regarding the validity and permissions of configured credentials. If the credentials are invalid or if they lack the proper permissions to access the remote resource or service, you will only find this out later, when a pipeline run fails at runtime. -* ultimately, given that different Stack Component flavors rely on the same type of resource or cloud provider, it is not good design to duplicate the logic that handles authentication and authorization in each Stack Component implementation. - -These drawbacks are addressed by Service Connectors. - - - -Without Service Connectors, credentials are stored directly in the Stack Component configuration or ZenML Secret and are directly used in the runtime environment. The Stack Component implementation is directly responsible for validating credentials, authenticating and connecting to the infrastructure service. This is illustrated in the following diagram: - - - - - -When Service Connectors are involved in the authentication and authorization process, they can act as brokers. The credentials validation and authentication process takes place on the ZenML server. In most cases, the main credentials never have to leave the ZenML server as the Service Connector automatically converts them into short-lived credentials with a reduced set of privileges and issues these credentials to clients. Furthermore, multiple Stack Components of different flavors can use the same Service Connector to access different types of resources with the same credentials: - - - - - -In working with Service Connectors, the first step is usually _finding out what types of resources you can connect ZenML to_. Maybe you have already planned out the infrastructure options for your MLOps platform and are looking to find out whether ZenML can accommodate them. Or perhaps you want to use a particular Stack Component flavor in your Stack and are wondering whether you can use a Service Connector to connect it to external resources.
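To make the brokering described above concrete, here is a minimal Python sketch of what a client sees. It assumes a registered AWS Service Connector named `aws-s3` with access to an `s3://zenfiles` bucket (the same names used later in this example); the exact object returned by `connect()` depends on the resource type:

```py
from zenml.client import Client

# The long-lived credentials stay on the ZenML server; the connector client
# obtained here only carries the short-lived credentials issued for it.
connector_client = Client().get_service_connector_client(
    name_id_or_prefix="aws-s3",
    resource_type="s3-bucket",
    resource_id="s3://zenfiles",
)

# connect() returns a pre-authenticated client object for the resource
# (for an S3 bucket, a boto3 S3 client), ready to use without any AWS setup.
s3_client = connector_client.connect()
```

Stack Components linked to a Service Connector obtain their clients in essentially the same way, which is why they never need to hold the long-lived credentials themselves.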
- -Listing the available Service Connector Types will give you a good idea of what you can do with Service Connectors: - -```bash -zenml service-connector list-types -``` - -```Bash Example Command Output -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ -┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────────┼───────┼────────┨ -┃ Kubernetes Service Connector │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ password │ ✅ │ ✅ ┃ -┃ │ │ │ token │ │ ┃ -┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────────┼───────┼────────┨ -┃ Docker Service Connector │ 🐳 docker │ 🐳 docker-registry │ password │ ✅ │ ✅ ┃ -┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────────┼───────┼────────┨ -┃ AWS Service Connector │ 🔶 aws │ 🔶 aws-generic │ implicit │ ✅ │ ✅ ┃ -┃ │ │ 📦 s3-bucket │ secret-key │ │ ┃ -┃ │ │ 🌀 kubernetes-cluster │ sts-token │ │ ┃ -┃ │ │ 🐳 docker-registry │ iam-role │ │ ┃ -┃ │ │ │ session-token │ │ ┃ -┃ │ │ │ federation-token │ │ ┃ -┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────────┼───────┼────────┨ -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ✅ ┃ -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ -┃ │ │ 🐳 docker-registry │ oauth2-token │ │ ┃ -┃ │ │ │ impersonation │ │ ┃ -┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────────┼───────┼────────┨ -┃ HyperAI Service Connector │ 🤖 hyperai │ 🤖 hyperai-instance │ rsa-key │ ✅ │ ✅ ┃ -┃ │ │ │ dsa-key │ │ ┃ -┃ │ │ │ ecdsa-key │ │ ┃ -┃ │ │ │ ed25519-key │ │ ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -Service Connector Types are also displayed in the dashboard during the configuration of a new Service Connector: - -The cloud provider of choice for our example is AWS and we're looking to hook up an S3 bucket to an S3 Artifact Store Stack Component. We'll use the AWS Service Connector Type. - - -A lot more is hidden behind a Service Connector Type than a name and a simple list of resource types. Before using a Service Connector Type to configure a Service Connector, you probably need to understand what it is, what it can offer and what are the supported authentication methods and their requirements. All this can be accessed on-site directly through the CLI or in the dashboard. Some examples are included here. - -Showing information about the AWS Service Connector Type: - -```bash -zenml service-connector describe-type aws -``` - -```bash Example Command Output - -╔══════════════════════════════════════════════════════════════════════════════╗ -║ 🔶 AWS Service Connector (connector type: aws) ║ -╚══════════════════════════════════════════════════════════════════════════════╝ - -Authentication methods: - - • 🔒 implicit - • 🔒 secret-key - • 🔒 sts-token - • 🔒 iam-role - • 🔒 session-token - • 🔒 federation-token - -Resource types: - - • 🔶 aws-generic - • 📦 s3-bucket - • 🌀 kubernetes-cluster - • 🐳 docker-registry - -Supports auto-configuration: True - -Available locally: True - -Available remotely: True - -The ZenML AWS Service Connector facilitates the authentication and access to -managed AWS services and resources. These encompass a range of resources, -including S3 buckets, ECR repositories, and EKS clusters. 
The connector provides -support for various authentication methods, including explicit long-lived AWS -secret keys, IAM roles, short-lived STS tokens and implicit authentication. - -To ensure heightened security measures, this connector also enables the -generation of temporary STS security tokens that are scoped down to the minimum -permissions necessary for accessing the intended resource. Furthermore, it -includes automatic configuration and detection of credentials locally configured -through the AWS CLI. - -This connector serves as a general means of accessing any AWS service by issuing -pre-authenticated boto3 sessions to clients. Additionally, the connector can -handle specialized authentication for S3, Docker and Kubernetes Python clients. -It also allows for the configuration of local Docker and Kubernetes CLIs. - -The AWS Service Connector is part of the AWS ZenML integration. You can either -install the entire integration or use a pypi extra to install it independently -of the integration: - - • pip install "zenml[connectors-aws]" installs only prerequisites for the AWS - Service Connector Type - • zenml integration install aws installs the entire AWS ZenML integration - -It is not required to install and set up the AWS CLI on your local machine to -use the AWS Service Connector to link Stack Components to AWS resources and -services. However, it is recommended to do so if you are looking for a quick -setup that includes using the auto-configuration Service Connector features. - -──────────────────────────────────────────────────────────────────────────────── -``` - -Dashboard equivalent: - - - - - -Fetching details about the S3 bucket resource type: - - -```bash -zenml service-connector describe-type aws --resource-type s3-bucket -``` - -```bash -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 📦 AWS S3 bucket (resource type: s3-bucket) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Authentication methods: implicit, secret-key, sts-token, iam-role, - -session-token, federation-token - - -Supports resource instances: True - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 secret-key - - • 🔒 sts-token - - • 🔒 iam-role - - • 🔒 session-token - - • 🔒 federation-token - - -Allows users to connect to S3 buckets. When used by Stack Components, they are - -provided a pre-configured boto3 S3 client instance. - - -The configured credentials must have at least the following AWS IAM permissions - -associated with the ARNs of S3 buckets that the connector will be allowed to - -access (e.g. arn:aws:s3:::* and arn:aws:s3:::*/* represent all the available S3 - -buckets). 
- - - • s3:ListBucket - - • s3:GetObject - - • s3:PutObject - - • s3:DeleteObject - - • s3:ListAllMyBuckets - - -If set, the resource name must identify an S3 bucket using one of the following - -formats: - - - • S3 bucket URI (canonical resource name): s3://{bucket-name} - - • S3 bucket ARN: arn:aws:s3:::{bucket-name} - - • S3 bucket name: {bucket-name} - - -──────────────────────────────────────────────────────────────────────────────── -``` - -Dashboard equivalent: - - -Displaying information about the AWS Session Token authentication method: - -```bash -zenml service-connector describe-type aws --auth-method session-token -``` - -```bash Example Command Output -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🔒 AWS Session Token (auth method: session-token) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Supports issuing temporary credentials: True - - -Generates temporary session STS tokens for IAM users. The connector needs to be - -configured with an AWS secret key associated with an IAM user or AWS account - -root user (not recommended). The connector will generate temporary STS tokens - -upon request by calling the GetSessionToken STS API. - - -These STS tokens have an expiration period longer that those issued through the - -AWS IAM Role authentication method and are more suitable for long-running - -processes that cannot automatically re-generate credentials upon expiration. - - -An AWS region is required and the connector may only be used to access AWS - -resources in the specified region. - - -The default expiration period for generated STS tokens is 12 hours with a - -minimum of 15 minutes and a maximum of 36 hours. Temporary credentials obtained - -by using the AWS account root user credentials (not recommended) have a maximum - -duration of 1 hour. - - -As a precaution, when long-lived credentials (i.e. AWS Secret Keys) are detected - -on your environment by the Service Connector during auto-configuration, this - -authentication method is automatically chosen instead of the AWS Secret Key - -authentication method alternative. - - -Generated STS tokens inherit the full set of permissions of the IAM user or AWS - -account root user that is calling the GetSessionToken API. Depending on your - -security needs, this may not be suitable for production use, as it can lead to - -accidental privilege escalation. Instead, it is recommended to use the AWS - -Federation Token or AWS IAM Role authentication methods to restrict the - -permissions of the generated STS tokens. - - -For more information on session tokens and the GetSessionToken AWS API, see: the - -official AWS documentation on the subject. - - -Attributes: - - - • aws_access_key_id {string, secret, required}: AWS Access Key ID - - • aws_secret_access_key {string, secret, required}: AWS Secret Access Key - - • region {string, required}: AWS Region - - • endpoint_url {string, optional}: AWS Endpoint URL - - -──────────────────────────────────────────────────────────────────────────────── -``` - -Dashboard equivalent: - - - -Not all Stack Components support being linked to a Service Connector. This is indicated in the flavor description of each Stack Component. Our example uses the S3 Artifact Store, which does support it: - -```bash -$ zenml artifact-store flavor describe s3 -Configuration class: S3ArtifactStoreConfig - -[...] - -This flavor supports connecting to external resources with a Service Connector. It requires a 's3-bucket' resource. 
You can get a list of all available connectors and the compatible resources that they can -access by running: - -'zenml service-connector list-resources --resource-type s3-bucket' -If no compatible Service Connectors are yet registered, you can register a new one by running: - -'zenml service-connector register -i' -``` - -The second step is _registering a Service Connector_ that effectively enables ZenML to authenticate to and access one or more remote resources. This step is best handled by someone with some infrastructure knowledge, but there are sane defaults and auto-detection mechanisms built into most Service Connectors that can make this a walk in the park even for the uninitiated. For our simple example, we're registering an AWS Service Connector with AWS credentials _automatically lifted up from your local host_, giving ZenML access to the same resources that you can access from your local machine through the AWS CLI. - -This step assumes the AWS CLI is already installed and set up with credentials on your machine (e.g. by running `aws configure`). - - -```bash -zenml service-connector register aws-s3 --type aws --auto-configure --resource-type s3-bucket -``` - -```bash Example Command Output -⠼ Registering service connector 'aws-s3'... - -Successfully registered service connector `aws-s3` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼───────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenbytes-bucket ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┃ │ s3://zenml-public-swagger-spec ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The CLI validates and shows all S3 buckets that can be accessed with the auto-discovered credentials. - - -The ZenML CLI provides an interactive way of registering Service Connectors. Just use the `-i` command line argument and follow the interactive guide: - -```bash -zenml service-connector register -i -``` - - - - -A quick glance into the Service Connector configuration that was automatically detected gives a better idea of what happened: - -```bash -zenml service-connector describe aws-s3 -``` - -```bash Example Command Output - -Service connector 'aws-s3' of type 'aws' with id '96a92154-4ec7-4722-bc18-21eeeadb8a4f' is owned by user 'default' and is 'private'. 
- - 'aws-s3' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ ID │ 96a92154-4ec7-4722-bc18-21eeeadb8a4f ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ NAME │ aws-s3 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ AUTH METHOD │ session-token ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 s3-bucket ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SECRET ID │ a8c6d0ff-456a-4b25-8557-f0d7e3c12c5f ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SESSION DURATION │ 43200s ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-15 18:45:17.822337 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-15 18:45:17.822341 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -The AWS Service Connector discovered and lifted the AWS Secret Key that was configured on the local machine and securely stored it in the [Secrets Store](/develop/usage/project-setup/use-secrets/secret-management). - -Moreover, the following security best practice is automatically enforced by the AWS connector: the AWS Secret Key will be kept hidden on the ZenML Server and the clients will never use it directly to gain access to any AWS resources. Instead, the AWS Service Connector will generate short-lived security tokens and distribute those to clients. It will also take care of issuing new tokens when those expire. This is identifiable from the `session-token` authentication method and the session duration configuration attributes. - -One way to confirm this is to ask ZenML to show us the exact configuration that a Service Connector client would see, but this requires us to pick an S3 bucket for which temporary credentials can be generated: - -```bash -zenml service-connector describe aws-s3 --resource-id s3://zenfiles -``` - -```Example Command Output -Service connector 'aws-s3 (s3-bucket | s3://zenfiles client)' of type 'aws' with id '96a92154-4ec7-4722-bc18-21eeeadb8a4f' is owned by user 'default' and is 'private'. 
- - 'aws-s3 (s3-bucket | s3://zenfiles client)' aws Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ ID │ 96a92154-4ec7-4722-bc18-21eeeadb8a4f ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ NAME │ aws-s3 (s3-bucket | s3://zenfiles client) ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ AUTH METHOD │ sts-token ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 s3-bucket ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ RESOURCE NAME │ s3://zenfiles ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ EXPIRES IN │ 11h59m56s ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-15 18:56:33.880081 ┃ - -┠──────────────────┼───────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-15 18:56:33.880082 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_session_token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -As can be seen, this configuration is of a temporary STS AWS token that will expire in 12 hours. The AWS Secret Key is not visible on the client side. - - - - -The next step in this journey is _configuring and connecting one (or more) Stack Components to a remote resource_ via the Service Connector registered in the previous step. This is as easy as saying "_I want this S3 Artifact Store to use the_ `_s3://my-bucket_` _S3 bucket_" and doesn't require any knowledge whatsoever about the authentication mechanisms or even the provenance of those resources. The following example creates an S3 Artifact store and connects it to an S3 bucket with the earlier connector: - -```bash -zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://zenfiles -zenml artifact-store connect s3-zenfiles --connector aws-s3 -``` - -```bash Example Command Output - -$ zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://zenfiles -Successfully registered artifact_store `s3-zenfiles`. 
-$ zenml artifact-store connect s3-zenfiles --connector aws-s3 - -Successfully connected artifact store `s3-zenfiles` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 96a92154-4ec7-4722-bc18-21eeeadb8a4f │ aws-s3 │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - - -The ZenML CLI provides an even easier and more interactive way of connecting a stack component to an external resource. Just pass the `-i` command line argument and follow the interactive guide: - -```bash -zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://zenfiles -zenml artifact-store connect s3-zenfiles -i -``` - - -The S3 Artifact Store Stack Component we just connected to the infrastructure is now ready to be used in a stack to run a pipeline: - -```bash -zenml stack register s3-zenfiles -o default -a s3-zenfiles --set -``` - -A simple pipeline could look like this: - -```py -from zenml import step, pipeline - -@step -def simple_step_one() -> str: - """Simple step one.""" - return "Hello World!" - - -@step -def simple_step_two(msg: str) -> None: - """Simple step two.""" - print(msg) - - -@pipeline -def simple_pipeline() -> None: - """Define single step pipeline.""" - message = simple_step_one() - simple_step_two(msg=message) - - -if __name__ == "__main__": - simple_pipeline() -``` - -Save this as `run.py` and run it with the following command: - -```bash -python run.py -``` - -```bash -Running pipeline simple_pipeline on stack s3-zenfiles (caching enabled) -Step simple_step_one has started. -Step simple_step_one has finished in 1.065s. -Step simple_step_two has started. -Hello World! -Step simple_step_two has finished in 5.681s. -Pipeline run simple_pipeline-2023_06_15-19_29_42_159831 has finished in 12.522s. -Dashboard URL: http://127.0.0.1:8237/default/pipelines/8267b0bc-9cbd-42ac-9b56-4d18275bdbb4/runs -``` - -This example is just a simple demonstration of how to use Service Connectors to connect ZenML Stack Components to your infrastructure. The range of features and possibilities is much larger. ZenML ships with built-in Service Connectors able to connect and authenticate to AWS, GCP, and Azure and offers many different authentication methods and security best practices. Follow the resources below for more information. - - The complete guide to unlocking the power of Service Connectors in your project. Best practices for various authentication methods implemented by Service Connectors. Connect ZenML to a generic Docker container registry using the Docker Service Connector. Use the Kubernetes Service Connector to connect ZenML to a generic Kubernetes cluster. Connect ZenML to AWS cloud resources using the AWS Service Connector. Use the GCP Service Connector to connect ZenML to GCP cloud resources. Connect ZenML to Azure cloud resources using the Azure Service Connector. 
diff --git a/develop/usage/resource-data-management/auth-management/aws-service-connector.mdx b/develop/usage/resource-data-management/auth-management/aws-service-connector.mdx deleted file mode 100644 index b7b265e..0000000 --- a/develop/usage/resource-data-management/auth-management/aws-service-connector.mdx +++ /dev/null @@ -1,2678 +0,0 @@ ---- -title: "AWS Service Connector" -description: "Configuring AWS Service Connectors to connect ZenML to AWS resources like S3 buckets, EKS Kubernetes clusters and ECR container registries." -icon: aws ---- - -The ZenML AWS Service Connector facilitates the authentication and access to managed AWS services and resources. These encompass a range of resources, including S3 buckets, ECR container repositories, and EKS clusters. The connector provides support for various authentication methods, including explicit long-lived AWS secret keys, IAM roles, short-lived STS tokens, and implicit authentication. - -To ensure heightened security measures, this connector also enables [the generation of temporary STS security tokens that are scoped down to the minimum permissions necessary](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) for accessing the intended resource. Furthermore, it includes [automatic configuration and detection of credentials locally configured through the AWS CLI](/develop/usage/resource-data-management/auth-management/service-connectors-guide#auto-configuration). - -This connector serves as a general means of accessing any AWS service by issuing pre-authenticated boto3 sessions. Additionally, the connector can handle specialized authentication for S3, Docker, and Kubernetes Python clients. It also allows for the configuration of local Docker and Kubernetes CLIs. - - -``` -$ zenml service-connector list-types --type aws -``` - - - -``` -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠───────────────────────┼────────┼───────────────────────┼──────────────────┼───────┼────────┨ - -┃ AWS Service Connector │ 🔶 aws │ 🔶 aws-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 s3-bucket │ secret-key │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ sts-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ iam-role │ │ ┃ - -┃ │ │ │ session-token │ │ ┃ - -┃ │ │ │ federation-token │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - - - This service connector will not be able to work if [Multi-Factor Authentication (MFA)](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Fmfa%5Fenable%5Fcliapi.html) is enabled on the role used by the AWS CLI. When MFA is enabled, the AWS CLI generates temporary credentials that are valid for a limited time. These temporary credentials cannot be used by the ZenML AWS Service Connector, as it requires long-lived credentials to authenticate and access AWS resources. - - -To use the AWS Service Connector with ZenML, you will need to use a different AWS CLI profile that does not have MFA enabled. You can do this by setting the `AWS_PROFILE` environment variable to the name of the profile you want to use before running the ZenML CLI commands. - -## Prerequisites - -The AWS Service Connector is part of the AWS ZenML integration. 
You can either install the entire integration or use a PyPI extra to install it independently of the integration: - -* `pip install "zenml[connectors-aws]"` installs only prerequisites for the AWS Service Connector Type -* `zenml integration install aws` installs the entire AWS ZenML integration - -It is not required to [install and set up the AWS CLI on your local machine](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) to use the AWS Service Connector to link Stack Components to AWS resources and services. However, it is recommended to do so if you are looking for a quick setup that includes using the auto-configuration Service Connector features. - - - The auto-configuration examples in this page rely on the AWS CLI being installed and already configured with valid credentials of one type or another. If you want to avoid installing the AWS CLI, we recommend using the interactive mode of the ZenML CLI to register Service Connectors: -``` -zenml service-connector register -i --type aws -``` - - -## Resource Types - -### Generic AWS resource - -This resource type allows consumers to use the AWS Service Connector to connect to any AWS service or resource. When used by connector clients, they are provided a generic Python boto3 session instance pre-configured with AWS credentials. This session can then be used to create boto3 clients for any particular AWS service. - -This generic AWS resource type is meant to be used with Stack Components that are not represented by other, more specific resource types, like S3 buckets, Kubernetes clusters, or Docker registries. It should be accompanied by a matching set of AWS permissions that allow access to the set of remote resources required by the client(s). - -The resource name represents the AWS region that the connector is authorized to access. - -### S3 bucket - -Allows users to connect to S3 buckets. When used by connector consumers, they are provided a pre-configured boto3 S3 client instance. - -The configured credentials must have at least the following [AWS IAM permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access%5Fpolicies.html) associated with [the ARNs of S3 buckets ](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-arn-format.html)that the connector will be allowed to access (e.g. `arn:aws:s3:::*` and `arn:aws:s3:::*/*` represent all the available S3 buckets). - -* `s3:ListBucket` -* `s3:GetObject` -* `s3:PutObject` -* `s3:DeleteObject` -* `s3:ListAllMyBuckets` - - - If you are using the [AWS IAM role](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role), [Session Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token), or [Federation Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-federation-token) authentication methods, you don't have to worry too much about restricting the permissions of the AWS credentials that you use to access the AWS cloud resources. These authentication methods already support [automatically generating temporary tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) with permissions down-scoped to the minimum required to access the target resource. 
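As a sketch of what consumers of this resource type actually receive (the connector and bucket names below are illustrative placeholders for your own setup), the pre-configured boto3 S3 client can be fetched through the ZenML Python client and used with the standard boto3 S3 API:

```py
from zenml.client import Client

# Assumes an AWS Service Connector named "aws-s3" is already registered and
# authorized to access the bucket below; both names are illustrative.
connector_client = Client().get_service_connector_client(
    name_id_or_prefix="aws-s3",
    resource_type="s3-bucket",
    resource_id="s3://zenfiles",
)

# As described above, the s3-bucket resource type yields a pre-configured
# boto3 S3 client, so the usual boto3 S3 operations apply.
s3_client = connector_client.connect()
response = s3_client.list_objects_v2(Bucket="zenfiles", MaxKeys=10)
for obj in response.get("Contents", []):
    print(obj["Key"])
```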
- - -If set, the resource name must identify an S3 bucket using one of the following formats: - -* S3 bucket URI (canonical resource name): `s3://{bucket-name}` -* S3 bucket ARN: `arn:aws:s3:::{bucket-name}` -* S3 bucket name: `{bucket-name}` - -### EKS Kubernetes cluster - -Allows users to access an EKS cluster as a standard Kubernetes cluster resource. When used by Stack Components, they are provided a pre-authenticated Python Kubernetes client instance. - -The configured credentials must have at least the following [AWS IAM permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access%5Fpolicies.html) associated with the [ARNs of EKS clusters](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference-arns.html) that the connector will be allowed to access (e.g. `arn:aws:eks:{region_id}:{project_id}:cluster/*` represents all the EKS clusters available in the target AWS region). - -* `eks:ListClusters` -* `eks:DescribeCluster` - - - If you are using the [AWS IAM role](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role), [Session Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token) or [Federation Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-federation-token) authentication methods, you don't have to worry too much about restricting the permissions of the AWS credentials that you use to access the AWS cloud resources. These authentication methods already support [automatically generating temporary tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) with permissions down-scoped to the minimum required to access the target resource. - - -In addition to the above permissions, if the credentials are not associated with the same IAM user or role that created the EKS cluster, the IAM principal must be manually added to the EKS cluster's `aws-auth` ConfigMap, otherwise the Kubernetes client will not be allowed to access the cluster's resources. This makes it more challenging to use [the AWS Implicit](/develop/usage/resource-data-management/auth-management/aws-service-connector#implicit-authentication) and [AWS Federation Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-federation-token) authentication methods for this resource. For more information, [see this documentation](https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html). - -If set, the resource name must identify an EKS cluster using one of the following formats: - -* EKS cluster name (canonical resource name): `{cluster-name}` -* EKS cluster ARN: `arn:aws:eks:{region}:{account-id}:cluster/{cluster-name}` - -EKS cluster names are region scoped. The connector can only be used to access EKS clusters in the AWS region that it is configured to use. - -### ECR container registry - -Allows Stack Components to access one or more ECR repositories as a standard Docker registry resource. When used by Stack Components, they are provided a pre-authenticated python-docker client instance. - -The configured credentials must have at least the following [AWS IAM permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access%5Fpolicies.html) associated with the [ARNs of one or more ECR repositories](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference-arns.html) that the connector will be allowed to access (e.g. 
`arn:aws:ecr:{region}:{account}:repository/*` represents all the ECR repositories available in the target AWS region). - -* `ecr:DescribeRegistry` -* `ecr:DescribeRepositories` -* `ecr:ListRepositories` -* `ecr:BatchGetImage` -* `ecr:DescribeImages` -* `ecr:BatchCheckLayerAvailability` -* `ecr:GetDownloadUrlForLayer` -* `ecr:InitiateLayerUpload` -* `ecr:UploadLayerPart` -* `ecr:CompleteLayerUpload` -* `ecr:PutImage` -* `ecr:GetAuthorizationToken` - - - If you are using the [AWS IAM role](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role), [Session Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token), or [Federation Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-federation-token) authentication methods, you don't have to worry too much about restricting the permissions of the AWS credentials that you use to access the AWS cloud resources. These authentication methods already support [automatically generating temporary tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) with permissions down-scoped to the minimum required to access the target resource. - - -This resource type is not scoped to a single ECR repository. Instead, a connector configured with this resource type will grant access to all the ECR repositories that the credentials are allowed to access under the configured AWS region (i.e. all repositories under the Docker registry URL `https://{account-id}.dkr.ecr.{region}.amazonaws.com`). - -The resource name associated with this resource type uniquely identifies an ECR registry using one of the following formats (the repository name is ignored, only the registry URL/ARN is used): - -* ECR repository URI (canonical resource name): - -`[https://]{account}.dkr.ecr.{region}.amazonaws.com[/{repository-name}]` - -* ECR repository ARN : - -`arn:aws:ecr:{region}:{account-id}:repository[/{repository-name}]` - -ECR repository names are region scoped. The connector can only be used to access ECR repositories in the AWS region that it is configured to use. - -## Authentication Methods - -### Implicit authentication - -[Implicit authentication](/develop/usage/resource-data-management/auth-management/best-security-practices#implicit-authentication) to AWS services using environment variables, local configuration files or IAM roles. - - - This method may constitute a security risk, because it can give users access to the same cloud resources and services that the ZenML Server itself is configured to access. For this reason, all implicit authentication methods are disabled by default and need to be explicitly enabled by setting the `ZENML_ENABLE_IMPLICIT_AUTH_METHODS` environment variable or the helm chart `enableImplicitAuthMethods` configuration option to `true` in the ZenML deployment. - - -This authentication method doesn't require any credentials to be explicitly configured. It automatically discovers and uses credentials from one of the following sources: - -* environment variables (AWS\_ACCESS\_KEY\_ID, AWS\_SECRET\_ACCESS\_KEY, AWS\_SESSION\_TOKEN, AWS\_DEFAULT\_REGION) -* local configuration files [set up through the AWS CLI ](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html)(\~/aws/credentials, \~/.aws/config) -* IAM roles for Amazon EC2, ECS, EKS, Lambda, etc. 
Only works when running the ZenML server on an AWS resource with an IAM role attached to it. - -This is the quickest and easiest way to authenticate to AWS services. However, the results depend on how ZenML is deployed and the environment where it is used, and are thus not fully reproducible: - -* when used with the default local ZenML deployment or a local ZenML server, the credentials are the same as those used by the AWS CLI or extracted from local environment variables -* when connected to a ZenML server, this method only works if the ZenML server is deployed in AWS and will use the IAM role attached to the AWS resource where the ZenML server is running (e.g. an EKS cluster). The IAM role permissions may need to be adjusted to allow listing and accessing/describing the AWS resources that the connector is configured to access. - -An IAM role may optionally be specified to be assumed by the connector on top of the implicit credentials. This is only possible when the implicit credentials have permissions to assume the target IAM role. Configuring an IAM role has all the advantages of the [AWS IAM Role](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role) authentication method plus the added benefit of not requiring any explicit credentials to be configured and stored: - -* the connector will [generate temporary STS tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) upon request by [calling the AssumeRole STS API](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Ftemp%5Frequest.html#api%5Fassumerole). -* allows implementing [a two layer authentication scheme](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles) that keeps the set of permissions associated with implicit credentials down to the bare minimum and grants permissions to the privilege-bearing IAM role instead. -* one or more optional [IAM session policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access%5Fpolicies.html#policies%5Fsession) may also be configured to further restrict the permissions of the generated STS tokens. If not specified, IAM session policies are automatically configured for the generated STS tokens [to restrict them to the minimum set of permissions required to access the target resource](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials). Refer to the documentation for each supported Resource Type for the complete list of AWS permissions automatically granted to the generated STS tokens. -* the default expiration period for generated STS tokens is 1 hour with a minimum of 15 minutes up to the maximum session duration setting configured for the IAM role (default is 1 hour). If you need longer-lived tokens, you can configure the IAM role to use a higher maximum expiration value (up to 12 hours) or use the AWS Federation Token or AWS Session Token authentication methods. - -Note that the discovered credentials inherit the full set of permissions of the local AWS client configuration, environment variables, or remote AWS IAM role. Depending on the extent of those permissions, this authentication method might not be recommended for production use, as it can lead to accidental privilege escalation.
It is recommended to also configure an IAM role when using the implicit authentication method, or to use the [AWS IAM Role](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role), [AWS Session Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token), or [AWS Federation Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-federation-token) authentication methods instead to limit the validity and/or permissions of the credentials being issued to connector clients. - - - If you need to access an EKS Kubernetes cluster with this authentication method, please be advised that the EKS cluster's `aws-auth` ConfigMap may need to be manually configured to allow authentication with the implicit IAM user or role picked up by the Service Connector. For more information, [see this documentation](https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html). - - -An AWS region is required and the connector may only be used to access AWS resources in the specified region. - -The following assumes the local AWS CLI has a `connectors` AWS CLI profile already configured with credentials: - - -``` -AWS_PROFILE=connectors zenml service-connector register aws-implicit --type aws --auth-method implicit --region=us-east-1 -``` - -Example Command Output - - - -``` -⠸ Registering service connector 'aws-implicit'... - -Successfully registered service connector `aws-implicit` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -No credentials are stored with the Service Connector: - - - -``` -zenml service-connector describe aws-implicit -``` - -Example Command Output - - - -``` -Service connector 'aws-implicit' of type 'aws' with id 'e3853748-34a0-4d78-8006-00422ad32884' is owned by user 'default' and is 'private'. 
- - 'aws-implicit' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 9a810521-ef41-4e45-bb48-8569c5943dc6 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-implicit ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ implicit ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 18:08:37.969928 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 18:08:37.969930 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┗━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -Verifying access to resources (note the `AWS_PROFILE` environment points to the same AWS CLI profile used during registration, but may yield different results with a different profile, which is why this method is not suitable for reproducible results): - - - -``` -AWS_PROFILE=connectors zenml service-connector verify aws-implicit --resource-type s3-bucket -``` - -Example Command Output - - - -``` -⠸ Verifying service connector 'aws-implicit'... - -Service connector 'aws-implicit' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼───────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -``` -zenml service-connector verify aws-implicit --resource-type s3-bucket -``` - -Example Command Output - - - -``` -⠸ Verifying service connector 'aws-implicit'... 
- -Service connector 'aws-implicit' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼────────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://sagemaker-studio-907999144431-m11qlsdyqr8 ┃ - -┃ │ s3://sagemaker-studio-d8a14tvjsmb ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -Depending on the environment, clients are issued either temporary STS tokens or long-lived credentials, which is a reason why this method isn't well suited for production: - - - -``` -AWS_PROFILE=zenml zenml service-connector describe aws-implicit --resource-type s3-bucket --resource-id zenfiles --client -``` - -Example Command Output - - - -``` -INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials - -Service connector 'aws-implicit (s3-bucket | s3://zenfiles client)' of type 'aws' with id 'e3853748-34a0-4d78-8006-00422ad32884' is owned by user 'default' and is 'private'. - - 'aws-implicit (s3-bucket | s3://zenfiles client)' aws Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ ID │ 9a810521-ef41-4e45-bb48-8569c5943dc6 ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ NAME │ aws-implicit (s3-bucket | s3://zenfiles client) ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ sts-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 s3-bucket ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ s3://zenfiles ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 59m57s ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 18:13:34.146659 ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 18:13:34.146664 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_session_token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - - - -``` -zenml service-connector describe aws-implicit --resource-type s3-bucket --resource-id s3://sagemaker-studio-d8a14tvjsmb --client -``` - -Example Command Output - - - -``` -INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials - -Service connector 'aws-implicit (s3-bucket | 
s3://sagemaker-studio-d8a14tvjsmb client)' of type 'aws' with id 'e3853748-34a0-4d78-8006-00422ad32884' is owned by user 'default' and is 'private'. - - 'aws-implicit (s3-bucket | s3://sagemaker-studio-d8a14tvjsmb client)' aws Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 9a810521-ef41-4e45-bb48-8569c5943dc6 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-implicit (s3-bucket | s3://sagemaker-studio-d8a14tvjsmb client) ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ secret-key ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 s3-bucket ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ s3://sagemaker-studio-d8a14tvjsmb ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 18:12:42.066053 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 18:12:42.066055 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - - -### AWS Secret Key - -[Long-lived AWS credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#long-lived-credentials-api-keys-account-keys) consisting of an AWS access key ID and secret access key associated with an AWS IAM user or AWS account root user (not recommended). - -This method is preferred during development and testing due to its simplicity and ease of use. It is not recommended as a direct authentication method for production use cases because the clients have direct access to long-lived credentials and are granted the full set of permissions of the IAM user or AWS account root user associated with the credentials. 
For production, it is recommended to use [the AWS IAM Role](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role), [AWS Session Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token), or [AWS Federation Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-federation-token) authentication method instead. - -An AWS region is required and the connector may only be used to access AWS resources in the specified region. - -If you already have the local AWS CLI set up with these credentials, they will be automatically picked up when auto-configuration is used (see the example below). - - -The following assumes the local AWS CLI has a `connectors` AWS CLI profile configured with an AWS Secret Key. We need to force the ZenML CLI to use the Secret Key authentication by passing the `--auth-method secret-key` option, otherwise it would automatically use [the AWS Session Token authentication method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token) as an extra precaution: - - - -``` -AWS_PROFILE=connectors zenml service-connector register aws-secret-key --type aws --auth-method secret-key --auto-configure -``` - -Example Command Output - - - -``` -⠸ Registering service connector 'aws-secret-key'... - -Successfully registered service connector `aws-secret-key` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The AWS Secret Key was lifted up from the local host: - - - -``` -zenml service-connector describe aws-secret-key -``` - -Example Command Output - - - -``` -Service connector 'aws-secret-key' of type 'aws' with id 'a1b07c5a-13af-4571-8e63-57a809c85790' is owned by user 'default' and is 'private'. 
- - 'aws-secret-key' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 37c97fa0-fa47-4d55-9970-e2aa6e1b50cf ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-secret-key ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ secret-key ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ b889efe1-0e23-4e2d-afc3-bdd785ee2d80 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:23:39.982950 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:23:39.982952 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - - - -### AWS STS Token - -Uses [temporary STS tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#short-lived-credentials) explicitly configured by the user or auto-configured from a local environment. - -This method has the major limitation that the user must regularly generate new tokens and update the connector configuration as STS tokens expire. On the other hand, this method is ideal in cases where the connector only needs to be used for a short period of time, such as sharing access temporarily with someone else in your team. - -Using other authentication methods like [IAM role](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role), [Session Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token), or [Federation Token](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-federation-token) will automatically generate and refresh STS tokens for clients upon request. - -An AWS region is required and the connector may only be used to access AWS resources in the specified region. 
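If you need to produce such a token yourself, for example to hand out time-boxed access to a teammate, a minimal boto3 sketch along the following lines yields the access key ID, secret access key, and session token that the connector is then configured with (the CLI profile name and the duration below are illustrative assumptions):

```py
import boto3

# Mint a temporary session token from long-lived credentials stored in a
# local AWS CLI profile (profile name and duration are illustrative).
session = boto3.Session(profile_name="connectors")
creds = session.client("sts").get_session_token(DurationSeconds=43200)["Credentials"]

# An sts-token connector is configured with these three values plus a region;
# once the token expires, the connector must be updated with a fresh one.
print(creds["AccessKeyId"], creds["SecretAccessKey"], creds["SessionToken"])
print("expires at:", creds["Expiration"])
```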
- -Fetching STS tokens from the local AWS CLI is possible if the AWS CLI is already configured with valid credentials. In our example, the `connectors` AWS CLI profile is configured with an IAM user Secret Key. We need to force the ZenML CLI to use the STS token authentication by passing the `--auth-method sts-token` option, otherwise it would automatically use [the session token authentication method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-session-token): - - - -``` -AWS_PROFILE=connectors zenml service-connector register aws-sts-token --type aws --auto-configure --auth-method sts-token -``` - -Example Command Output - - - -``` -⠸ Registering service connector 'aws-sts-token'... - -Successfully registered service connector `aws-sts-token` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The Service Connector configuration shows that the connector is configured with an STS token: - - - -``` -zenml service-connector describe aws-sts-token -``` - -Example Command Output - - - -``` -Service connector 'aws-sts-token' of type 'aws' with id '63e14350-6719-4255-b3f5-0539c8f7c303' is owned by user 'default' and is 'private'. 
- - 'aws-sts-token' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ a05ef4ef-92cb-46b2-8a3a-a48535adccaf ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-sts-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ sts-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ bffd79c7-6d76-483b-9001-e9dda4e865ae ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 11h58m24s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:25:40.278681 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:25:40.278684 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_session_token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -Note the temporary nature of the Service Connector. 
It will become unusable in 12 hours: - - - -``` -zenml service-connector list --name aws-sts-token -``` - -Example Command Output - - - -``` -┏━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼───────────────┼──────────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ aws-sts-token │ a05ef4ef-92cb-46b2-8a3a-a48535adccaf │ 🔶 aws │ 🔶 aws-generic │ │ ➖ │ default │ 11h57m51s │ ┃ - -┃ │ │ │ │ 📦 s3-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - - -### AWS IAM Role - -Generates [temporary STS credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles) by assuming an AWS IAM role. - -This authentication method still requires credentials to be explicitly configured. If your ZenML server is running in AWS and you're looking for an alternative that uses implicit credentials while at the same time benefits from all the security advantages of assuming an IAM role, you should [use the implicit authentication method with a configured IAM role](/develop/usage/resource-data-management/auth-management/aws-service-connector#implicit-authentication) instead. - -The connector needs to be configured with the IAM role to be assumed accompanied by an AWS secret key associated with an IAM user or an STS token associated with another IAM role. The IAM user or IAM role must have permission to assume the target IAM role. The connector will [generate temporary STS tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) upon request by [calling the AssumeRole STS API](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Ftemp%5Frequest.html#api%5Fassumerole). - -[The best practice implemented with this authentication scheme](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles) is to keep the set of permissions associated with the primary IAM user or IAM role down to the bare minimum and grant permissions to the privilege-bearing IAM role instead. - -An AWS region is required and the connector may only be used to access AWS resources in the specified region. - -One or more optional [IAM session policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access%5Fpolicies.html#policies%5Fsession) may also be configured to further restrict the permissions of the generated STS tokens. If not specified, IAM session policies are automatically configured for the generated STS tokens [to restrict them to the minimum set of permissions required to access the target resource](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials). Refer to the documentation for each supported Resource Type for the complete list of AWS permissions automatically granted to the generated STS tokens. 
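-
-Conceptually, the STS call that the connector performs on behalf of clients is equivalent to the following AWS CLI invocation; the role ARN and session name below are placeholders only:
-
-```
-# illustrative only: assume the target IAM role for one hour
-aws sts assume-role \
-    --role-arn arn:aws:iam::<account-id>:role/<target-role-name> \
-    --role-session-name zenml-connector-session \
-    --duration-seconds 3600
-```
-
-An optional `--policy` argument containing an inline IAM session policy can be added to further scope down the permissions of the returned credentials, which is what the connector does automatically when no session policy is explicitly configured.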
- -The default expiration period for generated STS tokens is 1 hour with a minimum of 15 minutes up to the maximum session duration setting configured for the IAM role (default is 1 hour). If you need longer-lived tokens, you can configure the IAM role to use a higher maximum expiration value (up to 12 hours) or use the AWS Federation Token or AWS Session Token authentication methods. - -For more information on IAM roles and the AssumeRole AWS API, see [the official AWS documentation on the subject](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Ftemp%5Frequest.html#api%5Fassumerole). - -For more information about the difference between this method and the AWS Federation Token authentication method, [consult this AWS documentation page](https://aws.amazon.com/blogs/security/understanding-the-api-options-for-securely-delegating-access-to-your-aws-account/). - - -The following assumes the local AWS CLI has a `zenml` AWS CLI profile already configured with an AWS Secret Key and an IAM role to be assumed: - - - -``` -AWS_PROFILE=zenml zenml service-connector register aws-iam-role --type aws --auto-configure -``` - -Example Command Output - - - -``` -⠸ Registering service connector 'aws-iam-role'... - -Successfully registered service connector `aws-iam-role` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The Service Connector configuration shows an IAM role and long-lived credentials: - - - -``` -zenml service-connector describe aws-iam-role -``` - -Example Command Output - - - -``` -Service connector 'aws-iam-role' of type 'aws' with id '8e499202-57fd-478e-9d2f-323d76d8d211' is owned by user 'default' and is 'private'. 
- - 'aws-iam-role' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 2b99de14-6241-4194-9608-b9d478e1bcfc ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-iam-role ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ iam-role ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 87795fdd-b70e-4895-b0dd-8bca5fd4d10e ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ 3600s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:28:31.679843 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:28:31.679848 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ role_arn │ arn:aws:iam::715803424590:role/OrganizationAccountRestrictedAccessRole ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -However, clients receive temporary STS tokens instead of the AWS Secret Key configured in the connector (note the authentication method, expiration time, and credentials): - - - -``` -zenml service-connector describe aws-iam-role --resource-type s3-bucket --resource-id zenfiles --client -``` - -Example Command Output - - - -``` -Service connector 'aws-iam-role (s3-bucket | s3://zenfiles client)' of type 'aws' with id '8e499202-57fd-478e-9d2f-323d76d8d211' is owned by user 'default' and is 'private'. 
- - 'aws-iam-role (s3-bucket | s3://zenfiles client)' aws Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ ID │ 2b99de14-6241-4194-9608-b9d478e1bcfc ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ NAME │ aws-iam-role (s3-bucket | s3://zenfiles client) ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ sts-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 s3-bucket ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ s3://zenfiles ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 59m56s ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:30:51.462445 ┃ - -┠──────────────────┼─────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:30:51.462449 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_session_token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - - -### AWS Session Token - -Generates [temporary session STS tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) for IAM users. - -The connector needs to be configured with an AWS secret key associated with an IAM user or AWS account root user (not recommended). The connector will [generate temporary STS tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) upon request by calling [the GetSessionToken STS API](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Ftemp%5Frequest.html#api%5Fgetsessiontoken). - -The STS tokens have an expiration period longer than those issued through the [AWS IAM Role authentication method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role) and are more suitable for long-running processes that cannot automatically re-generate credentials upon expiration. - -An AWS region is required and the connector may only be used to access AWS resources in the specified region. - -The default expiration period for generated STS tokens is 12 hours with a minimum of 15 minutes and a maximum of 36 hours. Temporary credentials obtained by using the AWS account root user credentials (not recommended) have a maximum duration of 1 hour. - -As a precaution, when long-lived credentials (i.e. 
AWS Secret Keys) are detected on your environment by the Service Connector during auto-configuration, this authentication method is automatically chosen instead of the AWS [Secret Key authentication method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-secret-key) alternative. - -Generated STS tokens inherit the full set of permissions of the IAM user or AWS account root user that is calling the GetSessionToken API. Depending on your security needs, this may not be suitable for production use, as it can lead to accidental privilege escalation. Instead, it is recommended to use the AWS Federation Token or [AWS IAM Role authentication](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role) methods to restrict the permissions of the generated STS tokens. - -For more information on session tokens and the GetSessionToken AWS API, see [the official AWS documentation on the subject](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Ftemp%5Frequest.html#api%5Fgetsessiontoken). - -The following assumes the local AWS CLI has a `connectors` AWS CLI profile already configured with an AWS Secret Key: - - - -``` -AWS_PROFILE=connectors zenml service-connector register aws-session-token --type aws --auth-method session-token --auto-configure -``` - -Example Command Output - - - -``` -⠸ Registering service connector 'aws-session-token'... - -Successfully registered service connector `aws-session-token` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The Service Connector configuration shows long-lived credentials were lifted from the local environment and the AWS Session Token authentication method was configured: - - - -``` -zenml service-connector describe aws-session-token -``` - -Example Command Output - - - -``` -Service connector 'aws-session-token' of type 'aws' with id '3ae3e595-5cbc-446e-be64-e54e854e0e3f' is owned by user 'default' and is 'private'. 
- - 'aws-session-token' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ c0f8e857-47f9-418b-a60f-c3b03023da54 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-session-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ session-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 16f35107-87ef-4a86-bbae-caa4a918fc15 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ 43200s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:31:54.971869 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:31:54.971871 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -However, clients receive temporary STS tokens instead of the AWS Secret Key configured in the connector (note the authentication method, expiration time, and credentials): - - - -``` -zenml service-connector describe aws-session-token --resource-type s3-bucket --resource-id zenfiles --client -``` - -Example Command Output - - - -``` -Service connector 'aws-session-token (s3-bucket | s3://zenfiles client)' of type 'aws' with id '3ae3e595-5cbc-446e-be64-e54e854e0e3f' is owned by user 'default' and is 'private'. 
- - 'aws-session-token (s3-bucket | s3://zenfiles client)' aws Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ ID │ c0f8e857-47f9-418b-a60f-c3b03023da54 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ NAME │ aws-session-token (s3-bucket | s3://zenfiles client) ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ sts-token ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 s3-bucket ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ s3://zenfiles ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 11h59m56s ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:35:24.090861 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:35:24.090863 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_session_token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - - - -### AWS Federation Token - -Generates [temporary STS tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) for federated users by [impersonating another user](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles). - -The connector needs to be configured with an AWS secret key associated with an IAM user or AWS account root user (not recommended). The IAM user must have permission to call [the GetFederationToken STS API](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Ftemp%5Frequest.html#api%5Fgetfederationtoken) (i.e. allow the `sts:GetFederationToken` action on the `*` IAM resource). The connector will generate temporary STS tokens upon request by calling the GetFederationToken STS API. - -These STS tokens have an expiration period longer than those issued through [the AWS IAM Role authentication method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role) and are more suitable for long-running processes that cannot automatically re-generate credentials upon expiration. - -An AWS region is required and the connector may only be used to access AWS resources in the specified region. 
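-
-For reference, the GetFederationToken call made on the client's behalf is roughly equivalent to the following AWS CLI invocation; the federated user name and the inline session policy are illustrative placeholders (session policies are covered in more detail right below):
-
-```
-# illustrative only: issue a 12-hour federated token scoped to a single S3 bucket
-aws sts get-federation-token \
-    --name zenml-federated-user \
-    --duration-seconds 43200 \
-    --policy '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": ["arn:aws:s3:::zenfiles", "arn:aws:s3:::zenfiles/*"]}]}'
-```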
- -One or more optional [IAM session policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access%5Fpolicies.html#policies%5Fsession) may also be configured to further restrict the permissions of the generated STS tokens. If not specified, IAM session policies are automatically configured for the generated STS tokens [to restrict them to the minimum set of permissions required to access the target resource](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials). Refer to the documentation for each supported Resource Type for the complete list of AWS permissions automatically granted to the generated STS tokens. - - - If this authentication method is used with [the generic AWS resource type](/develop/usage/resource-data-management/auth-management/aws-service-connector#generic-aws-resource), a session policy MUST be explicitly specified, otherwise, the generated STS tokens will not have any permissions. - - -The default expiration period for generated STS tokens is 12 hours with a minimum of 15 minutes and a maximum of 36 hours. Temporary credentials obtained by using the AWS account root user credentials (not recommended) have a maximum duration of 1 hour. - - - If you need to access an EKS Kubernetes cluster with this authentication method, please be advised that the EKS cluster's `aws-auth` ConfigMap may need to be manually configured to allow authentication with the federated user. For more information, [see this documentation](https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html). - - -For more information on user federation tokens, session policies, and the GetFederationToken AWS API, see [the official AWS documentation on the subject](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Ftemp%5Frequest.html#api%5Fgetfederationtoken). - -For more information about the difference between this method and [the AWS IAM Role authentication method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role), [consult this AWS documentation page](https://aws.amazon.com/blogs/security/understanding-the-api-options-for-securely-delegating-access-to-your-aws-account/). - - The following assumes the local AWS CLI has a `connectors` AWS CLI profile already configured with an AWS Secret Key: - - - -``` -AWS_PROFILE=connectors zenml service-connector register aws-federation-token --type aws --auth-method federation-token --auto-configure -``` - -Example Command Output - - - -``` -⠸ Registering service connector 'aws-federation-token'... 
- -Successfully registered service connector `aws-federation-token` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The Service Connector configuration shows long-lived credentials have been picked up from the local AWS CLI configuration: - - - -``` -zenml service-connector describe aws-federation-token -``` - -Example Command Output - - - -``` -Service connector 'aws-federation-token' of type 'aws' with id '868b17d4-b950-4d89-a6c4-12e520e66610' is owned by user 'default' and is 'private'. - - 'aws-federation-token' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ e28c403e-8503-4cce-9226-8a7cd7934763 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-federation-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ federation-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 958b840d-2a27-4f6b-808b-c94830babd99 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ 43200s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:36:28.619751 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:36:28.619753 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - 
-┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -However, clients receive temporary STS tokens instead of the AWS Secret Key configured in the connector (note the authentication method, expiration time, and credentials): - - - -``` -zenml service-connector describe aws-federation-token --resource-type s3-bucket --resource-id zenfiles --client -``` - -Example Command Output - - - -``` -Service connector 'aws-federation-token (s3-bucket | s3://zenfiles client)' of type 'aws' with id '868b17d4-b950-4d89-a6c4-12e520e66610' is owned by user 'default' and is 'private'. - - 'aws-federation-token (s3-bucket | s3://zenfiles client)' aws Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ ID │ e28c403e-8503-4cce-9226-8a7cd7934763 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-federation-token (s3-bucket | s3://zenfiles client) ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ sts-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 s3-bucket ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ s3://zenfiles ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 11h59m56s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:38:29.406986 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:38:29.406991 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_session_token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - - - - -## Auto-configuration - -The AWS Service Connector allows [auto-discovering and fetching credentials](/develop/usage/resource-data-management/auth-management/service-connectors-guide#auto-configuration) and configuration set up [by the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) during registration. The default AWS CLI profile is used unless the AWS\_PROFILE environment points to a different profile. - - -The following is an example of lifting AWS credentials granting access to the same set of AWS resources and services that the local AWS CLI is allowed to access. 
In this case, [the IAM role authentication method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role) was automatically detected: - - - -``` -AWS_PROFILE=zenml zenml service-connector register aws-auto --type aws --auto-configure -``` - -Example Command Output - - - -``` -⠹ Registering service connector 'aws-auto'... - -Successfully registered service connector `aws-auto` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenbytes-bucket ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The Service Connector configuration shows how credentials have automatically been fetched from the local AWS CLI configuration: - - - -``` -zenml service-connector describe aws-auto -``` - -Example Command Output - - - -``` -Service connector 'aws-auto' of type 'aws' with id '9f3139fd-4726-421a-bc07-312d83f0c89e' is owned by user 'default' and is 'private'. - - 'aws-auto' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 9cdc926e-55d7-49f0-838e-db5ac34bb7dc ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-auto ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ iam-role ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ a137151e-1778-4f50-b64b-7cf6c1f715f5 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ 3600s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 19:39:11.958426 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 19:39:11.958428 ┃ - 
-┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ role_arn │ arn:aws:iam::715803424590:role/OrganizationAccountRestrictedAccessRole ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼────────────────────────────────────────────────────────────────────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -## Local client provisioning - -The local AWS CLI, Kubernetes `kubectl` CLI and the Docker CLI can be [configured with credentials extracted from or generated by a compatible AWS Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide#configure-local-clients). Please note that unlike the configuration made possible through the AWS CLI, the Kubernetes and Docker credentials issued by the AWS Service Connector have a short lifetime and will need to be regularly refreshed. This is a byproduct of implementing a high-security profile. - - - Configuring the local AWS CLI with credentials issued by the AWS Service Connector results in a local AWS CLI configuration profile being created with the name inferred from the first digits of the Service Connector UUID in the form -\. For example, a Service Connector with UUID `9f3139fd-4726-421a-bc07-312d83f0c89e` will result in a local AWS CLI configuration profile named `zenml-9f3139fd`. 
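-
-To check which AWS identity such a generated profile actually resolves to, something like the following can be used (the profile name is the one from the note above and serves only as an example):
-
-```
-aws sts get-caller-identity --profile zenml-9f3139fd
-```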
- - - -The following shows an example of configuring the local Kubernetes CLI to access an EKS cluster reachable through an AWS Service Connector: - - - -``` -zenml service-connector list --name aws-session-token -``` - -Example Command Output - - - -``` -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼───────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ aws-session-token │ c0f8e857-47f9-418b-a60f-c3b03023da54 │ 🔶 aws │ 🔶 aws-generic │ │ ➖ │ default │ │ ┃ - -┃ │ │ │ │ 📦 s3-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - -This checks the Kubernetes clusters that the AWS Service Connector has access to: - - - -``` -zenml service-connector verify aws-session-token --resource-type kubernetes-cluster -``` - -Example Command Output - - - -``` -Service connector 'aws-session-token' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -``` - -Running the login CLI command will configure the local `kubectl` CLI to access the Kubernetes cluster: - - - -``` -zenml service-connector login aws-session-token --resource-type kubernetes-cluster --resource-id zenhacks-cluster -``` - -Example Command Output - - - -``` -⠇ Attempting to configure local client using service connector 'aws-session-token'... - -Cluster "arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster" set. - -Context "arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster" modified. - -Updated local kubeconfig with the cluster details. The current kubectl context was set to 'arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster'. - -The 'aws-session-token' Kubernetes Service Connector connector was used to successfully configure the local Kubernetes cluster client/SDK. 
-``` - -The following can be used to check that the local `kubectl` CLI is correctly configured: - - - -``` -kubectl cluster-info -``` - -Example Command Output - - - -``` -Kubernetes control plane is running at https://A5F8F4142FB12DDCDE9F21F6E9B07A18.gr7.us-east-1.eks.amazonaws.com - -CoreDNS is running at https://A5F8F4142FB12DDCDE9F21F6E9B07A18.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy -``` - -A similar process is possible with ECR container registries: - - - -``` -zenml service-connector verify aws-session-token --resource-type docker-registry -``` - -Example Command Output - - - -``` -Service connector 'aws-session-token' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -``` -zenml service-connector login aws-session-token --resource-type docker-registry -``` - -Example Command Output - - - -``` -⠏ Attempting to configure local client using service connector 'aws-session-token'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'aws-session-token' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. -``` - -The following can be used to check that the local Docker client is correctly configured: - - - -``` -docker pull 715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml-server -``` - -Example Command Output - - - -``` -Using default tag: latest - -latest: Pulling from zenml-server - -e9995326b091: Pull complete - -f3d7f077cdde: Pull complete - -0db71afa16f3: Pull complete - -6f0b5905c60c: Pull complete - -9d2154d50fd1: Pull complete - -d072bba1f611: Pull complete - -20e776588361: Pull complete - -3ce69736a885: Pull complete - -c9c0554c8e6a: Pull complete - -bacdcd847a66: Pull complete - -482033770844: Pull complete - -Digest: sha256:bf2cc3895e70dfa1ee1cd90bbfa599fa4cd8df837e27184bac1ce1cc239ecd3f - -Status: Downloaded newer image for 715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml-server:latest - -715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml-server:latest -``` - -It is also possible to update the local AWS CLI configuration with credentials extracted from the AWS Service Connector: - - - -``` -zenml service-connector login aws-session-token --resource-type aws-generic -``` - -Example Command Output - - - -``` -Configured local AWS SDK profile 'zenml-c0f8e857'. - -The 'aws-session-token' AWS Service Connector connector was used to successfully configure the local Generic AWS resource client/SDK. -``` - -A new profile is created in the local AWS CLI configuration holding the credentials. It can be used to access AWS resources and services, e.g.: - - - -``` -aws --profile zenml-c0f8e857 s3 ls -``` - - -## Stack Components use - -The [S3 Artifact Store Stack Component](/develop/stack-components/artifact-stores/s3) can be connected to a remote AWS S3 bucket through an AWS Service Connector. 
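-
-In its simplest form, this is a matter of registering the Stack Component and then connecting it to the Service Connector. The two commands below are lifted from the end-to-end example at the end of this page and reuse the `s3-zenfiles` and `aws-demo-multi` names defined there:
-
-```
-zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://zenfiles
-zenml artifact-store connect s3-zenfiles --connector aws-demo-multi
-```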
- -The AWS Service Connector can also be used with any Orchestrator or Model Deployer stack component flavor that relies on Kubernetes clusters to manage workloads. This allows EKS Kubernetes container workloads to be managed without the need to configure and maintain explicit AWS or Kubernetes `kubectl` configuration contexts and credentials in the target environment and in the Stack Component. - -Similarly, Container Registry Stack Components can be connected to an ECR Container Registry through an AWS Service Connector. This allows container images to be built and published to ECR container registries without the need to configure explicit AWS credentials in the target environment or the Stack Component. - -## End-to-end examples - - -This is an example of an end-to-end workflow involving Service Connectors that use a single multi-type AWS Service Connector to give access to multiple resources for multiple Stack Components. A complete ZenML Stack is registered and composed of the following Stack Components, all connected through the same Service Connector: - -* a [Kubernetes Orchestrator](/develop/stack-components/orchestrators/kubernetes) connected to an EKS Kubernetes cluster -* an [S3 Artifact Store](/develop/stack-components/artifact-stores/s3) connected to an S3 bucket -* an [ECR Container Registry](/develop/stack-components/container-registries/aws) stack component connected to an ECR container registry -* a local [Image Builder](/develop/stack-components/image-builders/local) - -As a last step, a simple pipeline is run on the resulting Stack. - -1. [Configure the local AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html) with valid IAM user account credentials with a wide range of permissions (i.e. by running `aws configure`) and install ZenML integration prerequisites: - -``` -zenml integration install -y aws s3 -``` - -``` -aws configure --profile connectors -``` - -Example Command Output - - - -``` -```text - -AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE - -AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY - -Default region name [None]: us-east-1 - -Default output format [None]: json - -``` -``` - -1. Make sure the AWS Service Connector Type is available - -``` -zenml service-connector list-types --type aws -``` - -Example Command Output - - - -``` -```text - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠───────────────────────┼────────┼───────────────────────┼──────────────────┼───────┼────────┨ - -┃ AWS Service Connector │ 🔶 aws │ 🔶 aws-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 s3-bucket │ secret-key │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ sts-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ iam-role │ │ ┃ - -┃ │ │ │ session-token │ │ ┃ - -┃ │ │ │ federation-token │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ - -``` -``` - -1. Register a multi-type AWS Service Connector using auto-configuration - -``` -AWS_PROFILE=connectors zenml service-connector register aws-demo-multi --type aws --auto-configure -``` - -Example Command Output - - - -``` -```text - -⠼ Registering service connector 'aws-demo-multi'... 
- -Successfully registered service connector `aws-demo-multi` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -**NOTE**: from this point forward, we don't need the local AWS CLI credentials or the local AWS CLI at all. The steps that follow can be run on any machine regardless of whether it has been configured and authorized to access the AWS platform or not. -``` - -4\. find out which S3 buckets, ECR registries, and EKS Kubernetes clusters we can gain access to. We'll use this information to configure the Stack Components in our minimal AWS stack: an S3 Artifact Store, a Kubernetes Orchestrator, and an ECR Container Registry. - - - -``` -```sh - -zenml service-connector list-resources --resource-type s3-bucket - -``` - -``` - -Example Command Output - - - -``` -```text - -The following 's3-bucket' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────┼────────────────┼───────────────┼───────────────────────────────────────┨ - -┃ bf073e06-28ce-4a4a-8100-32e7cb99dced │ aws-demo-multi │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ │ │ │ s3://zenml-demos ┃ - -┃ │ │ │ │ s3://zenml-generative-chat ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -```sh - -zenml service-connector list-resources --resource-type kubernetes-cluster - -``` - -``` - -Example Command Output - - - -``` -```text - -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────┨ - -┃ bf073e06-28ce-4a4a-8100-32e7cb99dced │ aws-demo-multi │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -```sh - -zenml service-connector list-resources --resource-type docker-registry - -``` - -``` - -Example Command Output - - - -``` -```text - -The following 'docker-registry' resources can be accessed by service connectors that you have configured: - 
-┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────────┼────────────────┼────────────────────┼─────────────────────────────────────────────────┨ - -┃ bf073e06-28ce-4a4a-8100-32e7cb99dced │ aws-demo-multi │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. register and connect an S3 Artifact Store Stack Component to an S3 bucket: - -``` -zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://zenfiles -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully registered artifact_store `s3-zenfiles`. - -``` -``` - - - -``` -```sh - -zenml artifact-store connect s3-zenfiles --connector aws-demo-multi - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully connected artifact store `s3-zenfiles` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ bf073e06-28ce-4a4a-8100-32e7cb99dced │ aws-demo-multi │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. register and connect a Kubernetes Orchestrator Stack Component to an EKS cluster: - -``` -zenml orchestrator register eks-zenml-zenhacks --flavor kubernetes --synchronous=true --kubernetes_namespace=zenml-workloads -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully registered orchestrator `eks-zenml-zenhacks`. - -``` -``` - - - -``` -```sh - -zenml orchestrator connect eks-zenml-zenhacks --connector aws-demo-multi - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully connected orchestrator `eks-zenml-zenhacks` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼──────────────────┨ - -┃ bf073e06-28ce-4a4a-8100-32e7cb99dced │ aws-demo-multi │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. Register and connect an EC GCP Container Registry Stack Component to an ECR container registry: - -``` -zenml container-registry register ecr-us-east-1 --flavor aws --uri=715803424590.dkr.ecr.us-east-1.amazonaws.com -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully registered container_registry `ecr-us-east-1`. 
- -``` -``` - - - -``` -```sh - -zenml container-registry connect ecr-us-east-1 --connector aws-demo-multi - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully connected container registry `ecr-us-east-1` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼──────────────────────────────────────────────┨ - -┃ bf073e06-28ce-4a4a-8100-32e7cb99dced │ aws-demo-multi │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. Combine all Stack Components together into a Stack and set it as active (also throw in a local Image Builder for completion): - -``` -zenml image-builder register local --flavor local -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully registered image_builder `local`. - -``` -``` - - - -``` -```sh - -zenml stack register aws-demo -a s3-zenfiles -o eks-zenml-zenhacks -c ecr-us-east-1 -i local --set - -``` - -``` - -Example Command Output - - - -``` -```text - -Connected to the ZenML server: 'https://stefan.develaws.zenml.io' - -Stack 'aws-demo' successfully registered! - -Active repository stack set to:'aws-demo' - -``` -``` - -1. Finally, run a simple pipeline to prove that everything works as expected. We'll use the simplest pipelines possible for this example: - -``` -from zenml import pipeline, step -@step -def step_1() -> str: - """Returns the `world` string.""" - return "world" -@step(enable_cache=False) -def step_2(input_one: str, input_two: str) -> None: - """Combines the two strings at its input and prints them.""" - combined_str = f"{input_one} {input_two}" - print(combined_str) -@pipeline -def my_pipeline(): - output_step_one = step_1() - step_2(input_one="hello", input_two=output_step_one) -if __name__ == "__main__": - my_pipeline() -``` -Saving that to a `run.py` file and running it gives us: - -Example Command Output - - - -``` -```text - -$ python run.py - -Building Docker image(s) for pipeline simple_pipeline. - -Building Docker image 715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml:simple_pipeline-orchestrator. - -- Including user-defined requirements: boto3==1.26.76 - -- Including integration requirements: boto3, kubernetes==18.20.0, s3fs>2022.3.0,<=2023.4.0, sagemaker==2.117.0 - -No .dockerignore found, including all files inside build context. - -Step 1/10 : FROM zenmldocker/zenml:0.39.1-py3.8 - -Step 2/10 : WORKDIR /app - -Step 3/10 : .zenml_user_requirements . - -Step 4/10 : RUN pip install --default-timeout=60 --no-cache-dir -r .zenml_user_requirements - -Step 5/10 : .zenml_integration_requirements . - -Step 6/10 : RUN pip install --default-timeout=60 --no-cache-dir -r .zenml_integration_requirements - -Step 7/10 : ENV ZENML_ENABLE_REPO_INIT_WARNINGS=False - -Step 8/10 : ENV ZENML_CONFIG_PATH=/app/.zenconfig - -Step 9/10 : . . - -Step 10/10 : RUN chmod -R a+rw . - -Amazon ECR requires you to create a repository before you can push an image to it. 
ZenML is trying to push the image 715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml:simple_pipeline-orchestrator but could only detect the following repositories: []. We will try to push anyway, but in case it fails you need to create a repository named zenml. - -Pushing Docker image 715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml:simple_pipeline-orchestrator. - -Finished pushing Docker image. - -Finished building Docker image(s). - -Running pipeline simple_pipeline on stack aws-demo (caching disabled) - -Waiting for Kubernetes orchestrator pod... - -Kubernetes orchestrator pod started. - -Waiting for pod of step step_1 to start... - -Step step_1 has started. - -Step step_1 has finished in 0.390s. - -Pod of step step_1 completed. - -Waiting for pod of step step_2 to start... - -Step step_2 has started. - -Hello World! - -Step step_2 has finished in 2.364s. - -Pod of step step_2 completed. - -Orchestration pod completed. - -Dashboard URL: https://stefan.develaws.zenml.io/default/pipelines/be5adfe9-45af-4709-a8eb-9522c01640ce/runs - -``` - diff --git a/develop/usage/resource-data-management/auth-management/azure-service-connector.mdx b/develop/usage/resource-data-management/auth-management/azure-service-connector.mdx deleted file mode 100644 index 37dc58f..0000000 --- a/develop/usage/resource-data-management/auth-management/azure-service-connector.mdx +++ /dev/null @@ -1,1201 +0,0 @@ ---- -title: "Azure Service Connector" -description: "Configuring Azure Service Connectors to connect ZenML to Azure resources such as Blob storage buckets, AKS Kubernetes clusters, and ACR container registries." -icon: book ---- - -The ZenML Azure Service Connector facilitates the authentication and access to managed Azure services and resources. These encompass a range of resources, including blob storage containers, ACR repositories, and AKS clusters. - -This connector also supports [automatic configuration and detection of credentials](/develop/usage/resource-data-management/auth-management/service-connectors-guide#auto-configuration) locally configured through the Azure CLI. - -This connector serves as a general means of accessing any Azure service by issuing credentials to clients. Additionally, the connector can handle specialized authentication for Azure blob storage, Docker and Kubernetes Python clients. It also allows for the configuration of local Docker and Kubernetes CLIs. - -```bash -$ zenml service-connector list-types --type azure -``` - -```bash -┏━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ -┠─────────────────────────┼──────────┼───────────────────────┼───────────────────┼───────┼────────┨ -┃ Azure Service Connector │ 🇦 azure │ 🇦 azure-generic │ implicit │ ✅ │ ✅ ┃ -┃ │ │ 📦 blob-container │ service-principal │ │ ┃ -┃ │ │ 🌀 kubernetes-cluster │ access-token │ │ ┃ -┃ │ │ 🐳 docker-registry │ │ │ ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -## Prerequisites - -The Azure Service Connector is part of the Azure ZenML integration. 
You can either install the entire integration or use a PyPI extra to install it independently of the integration:

* `pip install "zenml[connectors-azure]"` installs only the prerequisites for the Azure Service Connector Type
* `zenml integration install azure` installs the entire Azure ZenML integration

It is not required to [install and set up the Azure CLI](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli) on your local machine to use the Azure Service Connector to link Stack Components to Azure resources and services. However, it is recommended to do so if you are looking for a quick setup that includes using the auto-configuration Service Connector features.

  The auto-configuration option is limited to using temporary access tokens that don't work with Azure blob storage resources. To unlock the full power of the Azure Service Connector, it is therefore recommended that you [configure and use an Azure service principal and its credentials](https://learn.microsoft.com/en-us/azure/developer/python/sdk/authentication-on-premises-apps?tabs=azure-portal).

## Resource Types

### Generic Azure resource

This resource type allows Stack Components to use the Azure Service Connector to connect to any Azure service or resource. When used by Stack Components, they are provided generic azure-identity credentials that can be used to create Azure Python clients for any particular Azure service.

This generic Azure resource type is meant to be used with Stack Components that are not represented by other, more specific resource types, like Azure blob storage containers, Kubernetes clusters, or Docker registries. It should be accompanied by a matching set of Azure permissions that allow access to the set of remote resources required by the Stack Components.

The resource name represents the name of the Azure subscription that the connector is authorized to access.

### Azure blob storage container

Allows users to connect to Azure Blob containers. When used by Stack Components, they are provided a pre-configured Azure Blob Storage client.

The configured credentials must have at least the following Azure IAM permissions associated with the blob storage account or containers that the connector will be allowed to access:

* allow read and write access to blobs (e.g. the `Storage Blob Data Contributor` role)
* allow listing the storage accounts (e.g. the `Reader and Data Access` role). This is only required if a storage account is not configured in the connector.
* allow listing the containers in a storage account (e.g. the `Reader and Data Access` role)

If set, the resource name must identify an Azure blob storage container using one of the following formats:

* Azure blob container URI (canonical resource name): `{az|abfs}://{container-name}`
* Azure blob container name: `{container-name}`

If a storage account is configured in the connector, only blob storage containers in that storage account will be accessible. Otherwise, if a resource group is configured in the connector, only blob storage containers in storage accounts in that resource group will be accessible. Finally, if neither a storage account nor a resource group is configured in the connector, all blob storage containers in all accessible storage accounts will be accessible.

  The only Azure authentication method that works with Azure blob storage resources is the service principal authentication method.
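For illustration, a connector can be scoped to a single blob container at registration time. The following is only a minimal sketch, assuming the `--resource-type`/`--resource-id` scoping options of `zenml service-connector register`; the container name and the service principal values are placeholders that you would substitute with your own:

```bash
# Sketch: register an Azure Service Connector scoped to one blob container,
# using service principal credentials (the only method that works for blob storage).
# All IDs, the secret and the container name below are placeholders.
zenml service-connector register azure-blob-demo --type azure \
    --auth-method service-principal \
    --tenant_id=<AZURE_TENANT_ID> \
    --client_id=<AZURE_CLIENT_ID> \
    --client_secret=<AZURE_CLIENT_SECRET> \
    --resource-type blob-container \
    --resource-id az://<CONTAINER_NAME>
```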
### AKS Kubernetes cluster

Allows Stack Components to access an AKS cluster as a standard Kubernetes cluster resource. When used by Stack Components, they are provided a pre-authenticated python-kubernetes client instance.

The configured credentials must have at least the following Azure IAM permissions associated with the AKS clusters that the connector will be allowed to access:

* allow listing the AKS clusters and fetching their credentials (e.g. the `Azure Kubernetes Service Cluster Admin Role` role)

If set, the resource name must identify an AKS cluster using one of the following formats:

* resource group scoped AKS cluster name (canonical): `[{resource-group}/]{cluster-name}`
* AKS cluster name: `{cluster-name}`

Given that the AKS cluster name is unique only within a resource group, the resource group name may be included in the resource name to avoid ambiguity. If a resource group is configured in the connector, the resource group name in the resource name must match the configured resource group. If no resource group is configured in the connector and a resource group name is not included in the resource name, the connector will attempt to find the AKS cluster in any resource group.

If a resource group is configured in the connector, only AKS clusters in that resource group will be accessible.

### ACR container registry

Allows Stack Components to access one or more ACR registries as a standard Docker registry resource. When used by Stack Components, they are provided a pre-authenticated python-docker client instance.

The configured credentials must have at least the following Azure IAM permissions associated with the ACR registries that the connector will be allowed to access:

* allow access to pull and push images (e.g. the `AcrPull` and `AcrPush` roles)
* allow access to list registries (e.g. the `Contributor` role)

If set, the resource name must identify an ACR registry using one of the following formats:

* ACR registry URI (canonical resource name): `[https://]{registry-name}.azurecr.io`
* ACR registry name: `{registry-name}`

If a resource group is configured in the connector, only ACR registries in that resource group will be accessible.

If an authentication method other than the Azure service principal is used, the admin account must be enabled for the registry; otherwise, clients will not be able to authenticate to it. See the official Azure [documentation on the admin account](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-authentication#admin-account) for more information.

## Authentication Methods

### Implicit authentication

[Implicit authentication](/develop/usage/resource-data-management/auth-management/best-security-practices#implicit-authentication) to Azure services using environment variables, local configuration files, workload or managed identities.

  This method may constitute a security risk, because it can give users access to the same cloud resources and services that the ZenML Server itself is configured to access. For this reason, all implicit authentication methods are disabled by default and need to be explicitly enabled by setting the `ZENML_ENABLE_IMPLICIT_AUTH_METHODS` environment variable or the helm chart `enableImplicitAuthMethods` configuration option to `true` in the ZenML deployment.

This authentication method doesn't require any credentials to be explicitly configured.
It automatically discovers and uses credentials from one of the following sources: - -* [environment variables](https://learn.microsoft.com/en-us/python/api/overview/azure/identity-readme?view=azure-python#environment-variables) -* workload identity - if the application is deployed to an Azure Kubernetes Service with Managed Identity enabled. This option can only be used when running the ZenML server on an AKS cluster. -* managed identity - if the application is deployed to an Azure host with Managed Identity enabled. This option can only be used when running the ZenML client or server on an Azure host. -* Azure CLI - if a user has signed in via [the Azure CLI az login command](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli). - -This is the quickest and easiest way to authenticate to Azure services. However, the results depend on how ZenML is deployed and the environment where it is used and is thus not fully reproducible: - -* when used with the default local ZenML deployment or a local ZenML server, the credentials are the same as those used by the Azure CLI or extracted from local environment variables. -* when connected to a ZenML server, this method only works if the ZenML server is deployed in Azure and will use the workload identity attached to the Azure resource where the ZenML server is running (e.g. an AKS cluster). The permissions of the managed identity may need to be adjusted to allows listing and accessing/describing the Azure resources that the connector is configured to access. - -Note that the discovered credentials inherit the full set of permissions of the local Azure CLI configuration, environment variables or remote Azure managed identity. Depending on the extent of those permissions, this authentication method might not be recommended for production use, as it can lead to accidental privilege escalation. Instead, it is recommended to use the Azure service principal authentication method to limit the validity and/or permissions of the credentials being issued to connector clients. - -The following assumes the local Azure CLI has already been configured with user account credentials by running the `az login` command: - -``` -zenml service-connector register azure-implicit --type azure --auth-method implicit --auto-configure -``` - -Example Command Output - - - -``` -⠙ Registering service connector 'azure-implicit'... - -Successfully registered service connector `azure-implicit` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🇦 azure-generic │ ZenML Subscription ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -No credentials are stored with the Service Connector: - - - -``` -zenml service-connector describe azure-implicit -``` - -Example Command Output - - - -``` -Service connector 'azure-implicit' of type 'azure' with id 'ad645002-0cd4-4d4f-ae20-499ce888a00a' is owned by user 'default' and is 'private'. 
- - 'azure-implicit' azure Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ ad645002-0cd4-4d4f-ae20-499ce888a00a ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ azure-implicit ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🇦 azure ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ implicit ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🇦 azure-generic, 📦 blob-container, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-05 09:47:42.415949 ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-05 09:47:42.415954 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -### Azure Service Principal - -Azure service principal credentials consists of an Azure client ID and client secret. These credentials are used to authenticate clients to Azure services. - -For this authentication method, the Azure Service Connector requires [an Azure service principal to be created](https://learn.microsoft.com/en-us/azure/developer/python/sdk/authentication-on-premises-apps?tabs=azure-portal) and a client secret to be generated. - - - The following assumes an Azure service principal was configured with a client secret and has permissions to access an Azure blob storage container, an AKS Kubernetes cluster and an ACR container registry. The service principal client ID, tenant ID and client secret are then used to configure the Azure Service Connector. - - - -```bash -zenml service-connector register azure-service-principal --type azure --auth-method service-principal --tenant_id=a79f3633-8f45-4a74-a42e-68871c17b7fb --client_id=8926254a-8c3f-430a-a2fd-bdab234d491e --client_secret=AzureSuperSecret -``` - -Example Command Output - -```bash -⠙ Registering service connector 'azure-service-principal'... 
- -Successfully registered service connector `azure-service-principal` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🇦 azure-generic │ ZenML Subscription ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The Service Connector configuration shows that the connector is configured with service principal credentials: - - - -```bash -zenml service-connector describe azure-service-principal -``` - -Example Command Output - - - -```bash -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 273d2812-2643-4446-82e6-6098b8ccdaa4 ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ azure-service-principal ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🇦 azure ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ service-principal ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🇦 azure-generic, 📦 blob-container, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 50d9f230-c4ea-400e-b2d7-6b52ba2a6f90 ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-20 19:16:26.802374 ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-20 19:16:26.802378 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────┼──────────────────────────────────────┨ - -┃ tenant_id │ a79ff333-8f45-4a74-a42e-68871c17b7fb ┃ - -┠───────────────┼──────────────────────────────────────┨ - -┃ client_id │ 8926254a-8c3f-430a-a2fd-bdab234d491e ┃ - 
-┠───────────────┼──────────────────────────────────────┨ - -┃ client_secret │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - - -### Azure Access Token - -Uses [temporary Azure access tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#short-lived-credentials) explicitly configured by the user or auto-configured from a local environment. - -This method has the major limitation that the user must regularly generate new tokens and update the connector configuration as API tokens expire. On the other hand, this method is ideal in cases where the connector only needs to be used for a short period of time, such as sharing access temporarily with someone else in your team. - -This is the authentication method used during auto-configuration, if you have [the local Azure CLI set up with credentials](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli). The connector will generate an access token from the Azure CLI credentials and store it in the connector configuration. - - - Given that Azure access tokens are scoped to a particular Azure resource and the access token generated during auto-configuration is scoped to the Azure Management API, this method does not work with Azure blob storage resources. You should use [the Azure service principal authentication method](/develop/usage/resource-data-management/auth-management/azure-service-connector#azure-service-principal) for blob storage resources instead. - - -Fetching Azure session tokens from the local Azure CLI is possible if the Azure CLI is already configured with valid credentials (i.e. by running `az login`): - - - -```bash -zenml service-connector register azure-session-token --type azure --auto-configure -``` - -Example Command Output - - - -```bash -⠙ Registering service connector 'azure-session-token'... 
- -connector authorization failure: the 'access-token' authentication method is not supported for blob storage resources - -Successfully registered service connector `azure-session-token` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🇦 azure-generic │ ZenML Subscription ┃ - -┠───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 📦 blob-container │ 💥 error: connector authorization failure: the 'access-token' authentication method is not supported for blob storage resources ┃ - -┠───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -```bahs -zenml service-connector describe azure-session-token -``` - -Example Command Output - - - -```bash -Service connector 'azure-session-token' of type 'azure' with id '94d64103-9902-4aa5-8ce4-877061af89af' is owned by user 'default' and is 'private'. 
- - 'azure-session-token' azure Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 94d64103-9902-4aa5-8ce4-877061af89af ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ azure-session-token ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🇦 azure ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ access-token ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🇦 azure-generic, 📦 blob-container, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ b34f2e95-ae16-43b6-8ab6-f0ee33dbcbd8 ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 42m25s ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-05 10:03:32.646351 ┃ - -┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-05 10:03:32.646352 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━┯━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────┼──────────┨ - -┃ token │ [HIDDEN] ┃ - -┗━━━━━━━━━━┷━━━━━━━━━━┛ -``` - -Note the temporary nature of the Service Connector. It will expire and become unusable in approximately 1 hour: - - - -```bash -zenml service-connector list --name azure-session-token -``` - -Example Command Output - - - -```bash -Could not import GCP service connector: No module named 'google.api_core'. 
- -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼─────────────────────┼──────────────────────────────────────┼──────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ azure-session-token │ 94d64103-9902-4aa5-8ce4-877061af89af │ 🇦 azure │ 🇦 azure-generic │ │ ➖ │ default │ 40m58s │ ┃ - -┃ │ │ │ │ 📦 blob-container │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - - - -## Auto-configuration - -The Azure Service Connector allows [auto-discovering and fetching credentials](/develop/usage/resource-data-management/auth-management/service-connectors-guide#auto-configuration) and [configuration set up by the Azure CLI](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli) on your local host. - - - The Azure service connector auto-configuration comes with two limitations: - -1. it can only pick up temporary Azure access tokens and therefore cannot be used for long-term authentication scenarios -2. it doesn't support authenticating to the Azure blob storage service. [The Azure service principal authentication method](/develop/usage/resource-data-management/auth-management/azure-service-connector#azure-service-principal) can be used instead. - - -For an auto-configuration example, please refer to the [section about Azure access tokens](/develop/usage/resource-data-management/auth-management/azure-service-connector#azure-access-token). - -## Local client provisioning - -The local Azure CLI, Kubernetes `kubectl` CLI and the Docker CLI can be [configured with credentials extracted from or generated by a compatible Azure Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide#configure-local-clients). - - - Note that the Azure local CLI can only be configured with credentials issued by the Azure Service Connector if the connector is configured with the [service principal authentication method](/develop/usage/resource-data-management/auth-management/azure-service-connector#azure-service-principal). 
- - - - The following shows an example of configuring the local Kubernetes CLI to access an AKS cluster reachable through an Azure Service Connector: - - - -```bash -zenml service-connector list --name azure-service-principal -``` - -Example Command Output - - - -```bash -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼─────────────────────────┼──────────────────────────────────────┼──────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ azure-service-principal │ 3df920bc-120c-488a-b7fc-0e79bc8b021a │ 🇦 azure │ 🇦 azure-generic │ │ ➖ │ default │ │ ┃ - -┃ │ │ │ │ 📦 blob-container │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - -The verify CLI command can be used to list all Kubernetes clusters accessible through the Azure Service Connector: - - - -```bash -zenml service-connector verify azure-service-principal --resource-type kubernetes-cluster -``` - -Example Command Output - - - -```bash -⠙ Verifying service connector 'azure-service-principal'... - -Service connector 'azure-service-principal' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The login CLI command can be used to configure the local Kubernetes CLI to access a Kubernetes cluster reachable through an Azure Service Connector: - - - -```bash -zenml service-connector login azure-service-principal --resource-type kubernetes-cluster --resource-id demo-zenml-demos/demo-zenml-terraform-cluster -``` - -Example Command Output - - - -```bash -⠙ Attempting to configure local client using service connector 'azure-service-principal'... - -Updated local kubeconfig with the cluster details. The current kubectl context was set to 'demo-zenml-terraform-cluster'. - -The 'azure-service-principal' Kubernetes Service Connector connector was used to successfully configure the local Kubernetes cluster client/SDK. -``` - -The local Kubernetes CLI can now be used to interact with the Kubernetes cluster: - -```bash -kubectl cluster-info -``` - -Example Command Output - -```bash -Kubernetes control plane is running at https://demo-43c5776f7.hcp.westeurope.azmk8s.io:443 - -CoreDNS is running at https://demo-43c5776f7.hcp.westeurope.azmk8s.io:443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy - -Metrics-server is running at https://demo-43c5776f7.hcp.westeurope.azmk8s.io:443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy -``` - -A similar process is possible with ACR container registries: - - - -```bash -zenml service-connector verify azure-service-principal --resource-type docker-registry -``` - -Example Command Output - - - -```bash -⠦ Verifying service connector 'azure-service-principal'... 
- -Service connector 'azure-service-principal' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼───────────────────────────────────────┨ - -┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -```bash -zenml service-connector login azure-service-principal --resource-type docker-registry --resource-id demozenmlcontainerregistry.azurecr.io -``` - -Example Command Output - - - -```bash -⠹ Attempting to configure local client using service connector 'azure-service-principal'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'azure-service-principal' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. -``` - -The local Docker CLI can now be used to interact with the container registry: - - - -``` -docker push demozenmlcontainerregistry.azurecr.io/zenml:example_pipeline -``` - -Example Command Output - - - -```bash -The push refers to repository [demozenmlcontainerregistry.azurecr.io/zenml] - -d4aef4f5ed86: Pushed - -2d69a4ce1784: Pushed - -204066eca765: Pushed - -2da74ab7b0c1: Pushed - -75c35abda1d1: Layer already exists - -415ff8f0f676: Layer already exists - -c14cb5b1ec91: Layer already exists - -a1d005f5264e: Layer already exists - -3a3fd880aca3: Layer already exists - -149a9c50e18e: Layer already exists - -1f6d3424b922: Layer already exists - -8402c959ae6f: Layer already exists - -419599cb5288: Layer already exists - -8553b91047da: Layer already exists - -connectors: digest: sha256:a4cfb18a5cef5b2201759a42dd9fe8eb2f833b788e9d8a6ebde194765b42fe46 size: 3256 -``` - -It is also possible to update the local Azure CLI configuration with credentials extracted from the Azure Service Connector: - - - -```bash -zenml service-connector login azure-service-principal --resource-type azure-generic -``` - -Example Command Output - - - -```bash -Updated the local Azure CLI configuration with the connector's service principal credentials. - -The 'azure-service-principal' Azure Service Connector connector was used to successfully configure the local Generic Azure resource client/SDK. -``` - - - -## Stack Components use - -The [Azure Artifact Store Stack Component](/develop/stack-components/artifact-stores/azure) can be connected to a remote Azure blob storage container through an Azure Service Connector. - -The Azure Service Connector can also be used with any Orchestrator or Model Deployer stack component flavor that relies on a Kubernetes clusters to manage workloads. This allows AKS Kubernetes container workloads to be managed without the need to configure and maintain explicit Azure or Kubernetes `kubectl` configuration contexts and credentials in the target environment or in the Stack Component itself. - -Similarly, Container Registry Stack Components can be connected to a ACR Container Registry through an Azure Service Connector. This allows container images to be built and published to private ACR container registries without the need to configure explicit Azure credentials in the target environment or the Stack Component. 
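In practice, each of these Stack Components follows the same register-then-connect pattern. The sketch below uses placeholder component, container and connector names; the end-to-end example in the next section shows the real commands together with their output:

```bash
# Sketch of the generic pattern (all names are placeholders):
# 1. register the component, 2. connect it to the Azure Service Connector.
zenml artifact-store register <ARTIFACT_STORE_NAME> --flavor azure --path=az://<CONTAINER_NAME>
zenml artifact-store connect <ARTIFACT_STORE_NAME> --connector <AZURE_CONNECTOR_NAME>
```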
- -## End-to-end examples - -This is an example of an end-to-end workflow involving Service Connectors that uses a single multi-type Azure Service Connector to give access to multiple resources for multiple Stack Components. A complete ZenML Stack is registered composed of the following Stack Components, all connected through the same Service Connector: - -* a [Kubernetes Orchestrator](/develop/stack-components/orchestrators/kubernetes) connected to an AKS Kubernetes cluster -* a [Azure Blob Storage Artifact Store](/develop/stack-components/artifact-stores/azure) connected to an Azure blob storage container -* an [Azure Container Registry](/develop/stack-components/container-registries/azure) connected to an ACR container registry -* a local [Image Builder](/develop/stack-components/image-builders/local) - -As a last step, a simple pipeline is run on the resulting Stack. - -This example needs to use a remote ZenML Server that is reachable from Azure. - -1. Configure an Azure service principal with a client secret and give it permissions to access an Azure blob storage container, an AKS Kubernetes cluster and an ACR container registry. Also make sure you have the Azure ZenML integration installed: - -```bash -zenml integration install -y azure -``` -2. Make sure the Azure Service Connector Type is available - -```bash -zenml service-connector list-types --type azure -``` - -Example Command Output - -```bash - -┏━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠─────────────────────────┼──────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Azure Service Connector │ 🇦 azure │ 🇦 azure-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 blob-container │ service-principal │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ access-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ - -``` -```bash - -1. Register a multi-type Azure Service Connector using the Azure service principal credentials set up at the first step. Note the resources that it has access to: - -```bash -zenml service-connector register azure-service-principal --type azure --auth-method service-principal --tenant_id=a79ff3633-8f45-4a74-a42e-68871c17b7fb --client_id=8926254a-8c3f-430a-a2fd-bdab234fd491e --client_secret=AzureSuperSecret -``` - -Example Command Output - -```bash - -⠸ Registering service connector 'azure-service-principal'... - -Successfully registered service connector `azure-service-principal` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🇦 azure-generic │ ZenML Subscription ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ - -┠───────────────────────┼───────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -4. 
register and connect an Azure Blob Storage Artifact Store Stack Component to an Azure blob container: - -``` -zenml artifact-store register azure-demo --flavor azure --path=az://demo-zenmlartifactstore -``` - -Example Command Output - -``` - -``` -Successfully registered artifact_store `azure-demo`. - -``` -``` - - - -``` -```sh - -zenml artifact-store connect azure-demo --connector azure-service-principal - -``` - -``` - -Example Command Output - - - -``` -``` - -Successfully connected artifact store `azure-demo` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────────┼────────────────┼───────────────────┼──────────────────────────────┨ - -┃ f2316191-d20b-4348-a68b-f5e347862196 │ azure-service-principal │ 🇦 azure │ 📦 blob-container │ az://demo-zenmlartifactstore ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. register and connect a Kubernetes Orchestrator Stack Component to an AKS cluster: - -``` -zenml orchestrator register aks-demo-cluster --flavor kubernetes --synchronous=true --kubernetes_namespace=zenml-workloads -``` - -Example Command Output - - - -``` -``` - -Successfully registered orchestrator `aks-demo-cluster`. - -``` -``` - - - -``` -```sh - -zenml orchestrator connect aks-demo-cluster --connector azure-service-principal - -``` - -``` - -Example Command Output - - - -``` -``` - -Successfully connected orchestrator `aks-demo-cluster` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────┨ - -┃ f2316191-d20b-4348-a68b-f5e347862196 │ azure-service-principal │ 🇦 azure │ 🌀 kubernetes-cluster │ demo-zenml-demos/demo-zenml-terraform-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. Register and connect an Azure Container Registry Stack Component to an ACR container registry: - -``` -zenml container-registry register acr-demo-registry --flavor azure --uri=demozenmlcontainerregistry.azurecr.io -``` - -Example Command Output - - - -``` -``` - -Successfully registered container_registry `acr-demo-registry`. 
- -``` -``` - - - -``` -```sh - -zenml container-registry connect acr-demo-registry --connector azure-service-principal - -``` - -``` - -Example Command Output - - - -``` -``` - -Successfully connected container registry `acr-demo-registry` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────────┼────────────────┼────────────────────┼───────────────────────────────────────┨ - -┃ f2316191-d20b-4348-a68b-f5e347862196 │ azure-service-principal │ 🇦 azure │ 🐳 docker-registry │ demozenmlcontainerregistry.azurecr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. Combine all Stack Components together into a Stack and set it as active (also throw in a local Image Builder for completion): - -``` -zenml image-builder register local --flavor local -``` - -Example Command Output - - - -``` -``` - -Running with active stack: 'default' (global) - -Successfully registered image_builder `local`. - -``` -``` - - - -``` -```sh - -zenml stack register gcp-demo -a azure-demo -o aks-demo-cluster -c acr-demo-registry -i local --set - -``` - -``` - -Example Command Output - - - -``` -``` - -Stack 'gcp-demo' successfully registered! - -Active repository stack set to:'gcp-demo' - -``` -``` - -1. Finally, run a simple pipeline to prove that everything works as expected. We'll use the simplest pipelines possible for this example: - -``` -from zenml import pipeline, step -@step -def step_1() -> str: - """Returns the `world` string.""" - return "world" -@step(enable_cache=False) -def step_2(input_one: str, input_two: str) -> None: - """Combines the two strings at its input and prints them.""" - combined_str = f"{input_one} {input_two}" - print(combined_str) -@pipeline -def my_pipeline(): - output_step_one = step_1() - step_2(input_one="hello", input_two=output_step_one) -if __name__ == "__main__": - my_pipeline() -``` -Saving that to a `run.py` file and running it gives us: - -Example Command Output - - - -```bash -``` - -$ python run.py - -Building Docker image(s) for pipeline simple_pipeline. - -Building Docker image demozenmlcontainerregistry.azurecr.io/zenml:simple_pipeline-orchestrator. - -- Including integration requirements: adlfs==2021.10.0, azure-identity==1.10.0, azure-keyvault-keys, azure-keyvault-secrets, azure-mgmt-containerservice>=20.0.0, azureml-core==1.48.0, kubernetes, kubernetes==18.20.0 - -No .dockerignore found, including all files inside build context. - -Step 1/10 : FROM zenmldocker/zenml:0.40.0-py3.8 - -Step 2/10 : WORKDIR /app - -Step 3/10 : COPY .zenml_user_requirements . - -Step 4/10 : RUN pip install --default-timeout=60 --no-cache-dir -r .zenml_user_requirements - -Step 5/10 : COPY .zenml_integration_requirements . - -Step 6/10 : RUN pip install --default-timeout=60 --no-cache-dir -r .zenml_integration_requirements - -Step 7/10 : ENV ZENML_ENABLE_REPO_INIT_WARNINGS=False - -Step 8/10 : ENV ZENML_CONFIG_PATH=/app/.zenconfig - -Step 9/10 : COPY . . - -Step 10/10 : RUN chmod -R a+rw . - -Pushing Docker image demozenmlcontainerregistry.azurecr.io/zenml:simple_pipeline-orchestrator. - -Finished pushing Docker image. - -Finished building Docker image(s). 
- -Running pipeline simple_pipeline on stack gcp-demo (caching disabled) - -Waiting for Kubernetes orchestrator pod... - -Kubernetes orchestrator pod started. - -Waiting for pod of step simple_step_one to start... - -Step simple_step_one has started. - -INFO:azure.identity._internal.get_token_mixin:ClientSecretCredential.get_token succeeded - -INFO:azure.identity._internal.get_token_mixin:ClientSecretCredential.get_token succeeded - -INFO:azure.identity._internal.get_token_mixin:ClientSecretCredential.get_token succeeded - -INFO:azure.identity.aio._internal.get_token_mixin:ClientSecretCredential.get_token succeeded - -Step simple_step_one has finished in 0.396s. - -Pod of step simple_step_one completed. - -Waiting for pod of step simple_step_two to start... - -Step simple_step_two has started. - -INFO:azure.identity._internal.get_token_mixin:ClientSecretCredential.get_token succeeded - -INFO:azure.identity._internal.get_token_mixin:ClientSecretCredential.get_token succeeded - -INFO:azure.identity.aio._internal.get_token_mixin:ClientSecretCredential.get_token succeeded - -Hello World! - -Step simple_step_two has finished in 3.203s. - -Pod of step simple_step_two completed. - -Orchestration pod completed. - -Dashboard URL: https://zenml.stefan.20.23.46.143.nip.io/default/pipelines/98c41e2a-1ab0-4ec9-8375-6ea1ab473686/runs - -``` -``` - - - diff --git a/develop/usage/resource-data-management/auth-management/best-security-practices.mdx b/develop/usage/resource-data-management/auth-management/best-security-practices.mdx deleted file mode 100644 index cc722df..0000000 --- a/develop/usage/resource-data-management/auth-management/best-security-practices.mdx +++ /dev/null @@ -1,851 +0,0 @@ ---- -title: "Security best practices" -description: "Best practices concerning the various authentication methods implemented by Service Connectors." -icon: shield-halved ---- - -Service Connector Types, especially those targeted at cloud providers, offer a plethora of authentication methods matching those supported by remote cloud platforms. While there is no single authentication standard that unifies this process, there are some patterns that are easily identifiable and can be used as guidelines when deciding which authentication method to use to configure a Service Connector. - -This section explores some of those patterns and gives some advice regarding which authentication methods are best suited for your needs. - - - This section may require some general knowledge about authentication and authorization to be properly understood. We tried to keep it simple and limit ourselves to talking about high-level concepts, but some areas may get a bit too technical. - - -## Username and password - - -The key takeaway is this: you should avoid using your primary account password as authentication credentials as much as possible. If there are alternative authentication methods that you can use or other types of credentials (e.g. session tokens, API keys, API tokens), you should always try to use those instead. - -Ultimately, if you have no choice, be cognizant of the third parties you share your passwords with. If possible, they should never leave the premises of your local host or development environment. - - -This is the typical authentication method that uses a username or account name plus the associated password. 
While this is the de facto method used to log in with web consoles and local CLIs, this is the least secure of all authentication methods and _never_ something you want to share with other members of your team or organization or use to authenticate automated workloads. - -In fact, cloud platforms don't even allow using user account passwords directly as a credential when authenticating to the cloud platform APIs. There is always a process in place that allows exchanging the account/password credential for [another form of long-lived credential](/develop/usage/resource-data-management/auth-management/best-security-practices#long-lived-credentials-api-keys-account-keys). - -Even when passwords are mentioned as credentials, some services (e.g. DockerHub) also allow using an API access key in place of the user account password. - -## Implicit authentication - - - The key takeaway here is that implicit authentication gives you immediate access to some cloud resources and requires no configuration, but it may take some extra effort to expand the range of resources that you're initially allowed to access with it. This is not an authentication method you want to use if you're interested in portability and enabling others to reproduce your results. - - - - This method may constitute a security risk, because it can give users access to the same cloud resources and services that the ZenML Server itself is configured to access. For this reason, all implicit authentication methods are disabled by default and need to be explicitly enabled by setting the `ZENML_ENABLE_IMPLICIT_AUTH_METHODS` environment variable or the helm chart `enableImplicitAuthMethods` configuration option to `true` in the ZenML deployment. - - -Implicit authentication is just a fancy way of saying that the Service Connector will use locally stored credentials, configuration files, environment variables, and basically any form of authentication available in the environment where it is running, either locally or in the cloud. - -Most cloud providers and their associated Service Connector Types include some form of implicit authentication that is able to automatically discover and use the following forms of authentication in the environment where they are running: - -* configuration and credentials set up and stored locally through the cloud platform CLI -* configuration and credentials passed as environment variables -* some form of implicit authentication attached to the workload environment itself. This is only available in virtual environments that are already running inside the same cloud where other resources are available for use. This is called differently depending on the cloud provider in question, but they are essentially the same thing: - * in AWS, if you're running on Amazon EC2, ECS, EKS, Lambda, or some other form of AWS cloud workload, credentials can be loaded directly from _the instance metadata service._ This [uses the IAM role attached to your workload](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) to authenticate to other AWS services without the need to configure explicit credentials. - * in GCP, a similar _metadata service_ allows accessing other GCP cloud resources via [the service account attached to the GCP workload](https://cloud.google.com/docs/authentication/application-default-credentials#attached-sa) (e.g. GCP VMs or GKE clusters). 
- * in Azure, the [Azure Managed Identity](https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/overview) services can be used to gain access to other Azure services without requiring explicit credentials - -There are a few caveats that you should be aware of when choosing an implicit authentication method. It may seem like the easiest way out, but it carries with it some implications that may impact portability and usability later down the road: - -* when used with a local ZenML deployment, like the default deployment, or [a local ZenML server started with zenml up](/develop/user-guide/guides/production-guide), the implicit authentication method will use the configuration files and credentials or environment variables set up _on your local machine_. These will not be available to anyone else outside your local environment and will also not be accessible to workloads running in other environments on your local host. This includes for example local K3D Kubernetes clusters and local Docker containers. -* when used with a remote ZenML server, the implicit authentication method only works if your ZenML server is deployed in the same cloud as the one supported by the Service Connector Type that you are using. For instance, if you're using the AWS Service Connector Type, then the ZenML server must also be deployed in AWS (e.g. in an EKS Kubernetes cluster). You may also need to manually adjust the cloud configuration of the remote cloud workload where the ZenML server is running to allow access to resources (e.g. add permissions to the AWS IAM role attached to the EC2 or EKS node, add roles to the GCP service account attached to the GKE cluster nodes). - - -```bash -zenml service-connector register gcp-implicit --type gcp --auth-method implicit --project_id=zenml-core -``` - -```bash Example Command Output - -Successfully registered service connector `gcp-implicit` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://annotation-gcp-store ┃ - -┃ │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┃ │ gs://zenml-project-time-series-bucket ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -The following is an example of using the GCP Service Connector's implicit authentication method to gain immediate access to all the GCP resources that the ZenML server also has access to. Note that this is only possible because the ZenML server is also deployed in GCP, in a GKE cluster, and the cluster is attached to a GCP service account with permissions to access the project resources: - - -### Long-lived credentials (API keys, account keys) - - - This is the magic formula of authentication methods. 
When paired with another ability, such as [automatically generating short-lived API tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials), or [impersonating accounts or assuming roles](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles), this is the ideal authentication mechanism to use, particularly when using ZenML in production and when sharing results with other members of your ZenML team. - - -As a general best practice, but implemented particularly well for cloud platforms, account passwords are never directly used as a credential when authenticating to the cloud platform APIs. There is always a process in place that exchanges the account/password credential for another type of long-lived credential: - -* AWS uses the [aws configure CLI command](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) -* GCP offers [the gcloud auth application-default login CLI commands](https://cloud.google.com/docs/authentication/provide-credentials-adc#how%5Fto%5Fprovide%5Fcredentials%5Fto%5Fadc) -* Azure provides [the az login CLI command](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli) - -None of your original login information is stored on your local machine or used to access workloads. Instead, an API key, account key or some other form of intermediate credential is generated and stored on the local host and used to authenticate to remote cloud service APIs. - - - When using auto-configuration with Service Connector registration, this is usually the type of credentials automatically identified and extracted from your local machine. - -Different cloud providers use different names for these types of long-lived credentials, but they usually represent the same concept, with minor variations regarding the identity information and level of permissions attached to them: - -* AWS has [Account Access Keys](https://docs.aws.amazon.com/powershell/latest/userguide/pstools-appendix-sign-up.html) and [IAM User Access Keys](https://docs.aws.amazon.com/IAM/latest/UserGuide/id%5Fcredentials%5Faccess-keys.html) -* GCP has [User Account Credentials](https://cloud.google.com/docs/authentication#user-accounts) and [Service Account Credentials](https://cloud.google.com/docs/authentication#service-accounts) - -Generally speaking, a differentiation is being made between the following two classes of credentials: - -* _user credentials_: credentials representing a human user and usually directly tied to a user account identity. These credentials are usually associated with a broad spectrum of permissions and it is therefore not recommended to share them or make them available outside the confines of your local host. -* _service credentials:_ credentials used with automated processes and programmatic access, where humans are not directly involved. These credentials are not directly tied to a user account identity, but some other form of accounting like a service account or an IAM user devised to be used by non-human actors. It is also usually possible to restrict the range of permissions associated with this class of credentials, which makes them better candidates for sharing them with a larger audience. - -ZenML cloud provider Service Connectors can use both classes of credentials, but you should aim to use _service credentials_ as often as possible instead of _user credentials_, especially in production environments. 
Attaching automated workloads like ML pipelines to service accounts instead of user accounts acts as an extra layer of protection for your user identity and facilitates enforcing another security best practice called [_"the least-privilege principle"_](https://en.wikipedia.org/wiki/Principle%5Fof%5Fleast%5Fprivilege)_:_ granting each actor only the minimum level of permissions required to function correctly. - -Using long-lived credentials on their own still isn't ideal, because if leaked, they pose a security risk, even when they have limited permissions attached. The good news is that ZenML Service Connectors include additional mechanisms that, when used in combination with long-lived credentials, make it even safer to share long-lived credentials with other ZenML users and automated workloads: - -* automatically [generating temporary credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) from long-lived credentials and even downgrading their permission scope to enforce the least-privilege principle -* implementing [authentication schemes that impersonate accounts and assume roles](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles) - -### Generating temporary and down-scoped credentials - -Most [authentication methods that utilize long-lived credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#long-lived-credentials-api-keys-account-keys) also implement additional mechanisms that help reduce the accidental credentials exposure and risk of security incidents even further, making them ideal for production. - -_**Issuing temporary credentials**_: this authentication strategy keeps long-lived credentials safely stored on the ZenML server and away from the eyes of actual API clients and people that need to authenticate to the remote resources. Instead, clients are issued API tokens that have a limited lifetime and expire after a given amount of time. The Service Connector is able to generate these API tokens from long-lived credentials on a need-to-have basis. For example, the AWS Service Connector's "Session Token", "Federation Token" and "IAM Role" authentication methods and basically all authentication methods supported by the GCP Service Connector support this feature. - -The following example shows the difference between the long-lived AWS credentials configured for an AWS Service Connector and kept on the ZenML server and the temporary Kubernetes API token credentials that the client receives and uses to access the resource. - -First, showing the long-lived AWS credentials configured for the AWS Service Connector: - - - -``` -zenml service-connector describe eks-zenhacks-cluster -``` - -Example Command Output - - - -``` -Service connector 'eks-zenhacks-cluster' of type 'aws' with id 'be53166a-b39c-4e39-8e31-84658e50eec4' is owned by user 'default' and is 'private'. 
- - 'eks-zenhacks-cluster' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ ID │ be53166a-b39c-4e39-8e31-84658e50eec4 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ NAME │ eks-zenhacks-cluster ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ AUTH METHOD │ session-token ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🌀 kubernetes-cluster ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE NAME │ zenhacks-cluster ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SECRET ID │ fa42ab38-3c93-4765-a4c6-9ce0b548a86c ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SESSION DURATION │ 43200s ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-16 10:15:26.393769 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-16 10:15:26.393772 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -Then, showing the temporary credentials that are issued to clients. Note the expiration time on the Kubernetes API token: - - - -``` -zenml service-connector describe eks-zenhacks-cluster --client -``` - -Example Command Output - - - -``` -Service connector 'eks-zenhacks-cluster (kubernetes-cluster | zenhacks-cluster client)' of type 'kubernetes' with id 'be53166a-b39c-4e39-8e31-84658e50eec4' is owned by user 'default' and is 'private'. 
- - 'eks-zenhacks-cluster (kubernetes-cluster | zenhacks-cluster client)' kubernetes Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ ID │ be53166a-b39c-4e39-8e31-84658e50eec4 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ eks-zenhacks-cluster (kubernetes-cluster | zenhacks-cluster client) ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🌀 kubernetes ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🌀 kubernetes-cluster ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 11h59m57s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-16 10:17:46.931091 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-16 10:17:46.931094 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ server │ https://A5F8F4142FB12DDCDE9F21F6E9B07A18.gr7.us-east-1.eks.amazonaws.com ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ insecure │ False ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ cluster_name │ arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ token │ [HIDDEN] ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ certificate_authority │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - -_**Issuing downscoped credentials**_: in addition to the above, some authentication methods also support restricting the generated temporary API tokens to the minimum set of permissions required to access the target resource or set of resources. This is currently available for the AWS Service Connector's "Federation Token" and "IAM Role" authentication methods. 
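For context, the down-scoping itself relies on the cloud provider's own token-issuing APIs. The following is a minimal sketch of the general mechanism behind the AWS "Federation Token" case, using boto3 directly; the bucket name and inline session policy are illustrative assumptions and this is not meant to mirror ZenML's exact implementation:

```
import json

import boto3

# The long-lived credentials stay on the server side; only the scoped-down,
# temporary token produced here would ever be handed out to a client.
sts = boto3.client("sts")

# Inline session policy restricting the temporary token to a single S3 bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
            "Resource": ["arn:aws:s3:::zenfiles", "arn:aws:s3:::zenfiles/*"],
        }
    ],
}

response = sts.get_federation_token(
    Name="zenml-client",
    Policy=json.dumps(policy),
    DurationSeconds=3600,
)

# AccessKeyId, SecretAccessKey, SessionToken and Expiration of the temporary,
# down-scoped credentials issued to the client.
print(response["Credentials"])
```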
- -It's not easy to showcase this without using some ZenML Python Client code, but here is an example that proves that the AWS client token issued to an S3 client can only access the S3 bucket resource it was issued for, even if the originating AWS Service Connector is able to access multiple S3 buckets with the corresponding long-lived credentials: - -``` -zenml service-connector register aws-federation-multi --type aws --auth-method=federation-token --auto-configure -``` - -Example Command Output - - - -``` -Successfully registered service connector `aws-federation-multi` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┃ │ s3://zenml-public-swagger-spec ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The next part involves running some ZenML Python code to showcase that the downscoped credentials issued to a client are indeed restricted to the S3 bucket that the client asked to access: - - - -``` -from zenml.client import Client - -client = Client() - -# Get a Service Connector client for a particular S3 bucket - -connector_client = client.get_service_connector_client( - - name_id_or_prefix="aws-federation-multi", - - resource_type="s3-bucket", - - resource_id="s3://zenfiles" - -) - -# Get the S3 boto3 python client pre-configured and pre-authenticated - -# from the Service Connector client - -s3_client = connector_client.connect() - -# Verify access to the chosen S3 bucket using the temporary token that - -# was issued to the client. - -s3_client.head_bucket(Bucket="zenfiles") - -# Try to access another S3 bucket that the original AWS long-lived credentials can access. - -# An error will be thrown indicating that the bucket is not accessible. - -s3_client.head_bucket(Bucket="zenml-demos") -``` - -Example Output - - - -``` ->>> from zenml.client import Client - ->>> - ->>> client = Client() - -Unable to find ZenML repository in your current working directory (/home/stefan/aspyre/src/zenml) or any parent directories. If you want to use an existing repository which is in a different location, set the environment variable 'ZENML_REPOSITORY_PATH'. If you want to create a new repository, run zenml init. - -Running without an active repository root. - ->>> - ->>> # Get a Service Connector client for a particular S3 bucket - ->>> connector_client = client.get_service_connector_client( - -... name_id_or_prefix="aws-federation-multi", - -... resource_type="s3-bucket", - -... resource_id="s3://zenfiles" - -... ) - ->>> - ->>> # Get the S3 boto3 python client pre-configured and pre-authenticated - ->>> # from the Service Connector client - ->>> s3_client = connector_client.connect() - ->>> - ->>> # Verify access to the chosen S3 bucket using the temporary token that - ->>> # was issued to the client. 
- ->>> s3_client.head_bucket(Bucket="zenfiles") - -{'ResponseMetadata': {'RequestId': '62YRYW5XJ1VYPCJ0', 'HostId': 'YNBXcGUMSOh90AsTgPW6/Ra89mqzfN/arQq/FMcJzYCK98cFx53+9LLfAKzZaLhwaiJTm+s3mnU=', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amz-id-2': 'YNBXcGUMSOh90AsTgPW6/Ra89mqzfN/arQq/FMcJzYCK98cFx53+9LLfAKzZaLhwaiJTm+s3mnU=', 'x-amz-request-id': '62YRYW5XJ1VYPCJ0', 'date': 'Fri, 16 Jun 2023 11:04:20 GMT', 'x-amz-bucket-region': 'us-east-1', 'x-amz-access-point-alias': 'false', 'content-type': 'application/xml', 'server': 'AmazonS3'}, 'RetryAttempts': 0}} - ->>> - ->>> # Try to access another S3 bucket that the original AWS long-lived credentials can access. - ->>> # An error will be thrown indicating that the bucket is not accessible. - ->>> s3_client.head_bucket(Bucket="zenml-demos") - -╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮ - -│ :1 in │ - -│ │ - -│ /home/stefan/aspyre/src/zenml/.venv/lib/python3.8/site-packages/botocore/client.py:508 in │ - -│ _api_call │ - -│ │ - -│ 505 │ │ │ │ │ f"{py_operation_name}() only accepts keyword arguments." │ - -│ 506 │ │ │ │ ) │ - -│ 507 │ │ │ # The "self" in this scope is referring to the BaseClient. │ - -│ ❱ 508 │ │ │ return self._make_api_call(operation_name, kwargs) │ - -│ 509 │ │ │ - -│ 510 │ │ _api_call.__name__ = str(py_operation_name) │ - -│ 511 │ - -│ │ - -│ /home/stefan/aspyre/src/zenml/.venv/lib/python3.8/site-packages/botocore/client.py:915 in │ - -│ _make_api_call │ - -│ │ - -│ 912 │ │ if http.status_code >= 300: │ - -│ 913 │ │ │ error_code = parsed_response.get("Error", {}).get("Code") │ - -│ 914 │ │ │ error_class = self.exceptions.from_code(error_code) │ - -│ ❱ 915 │ │ │ raise error_class(parsed_response, operation_name) │ - -│ 916 │ │ else: │ - -│ 917 │ │ │ return parsed_response │ - -│ 918 │ - -╰──────────────────────────────────────────────────────────────────────────────────────────────────╯ - -ClientError: An error occurred (403) when calling the HeadBucket operation: Forbidden -``` - - - -### Impersonating accounts and assuming roles - - - These types of authentication methods require more work to set up because multiple permission-bearing accounts and roles need to be provisioned in advance depending on the target audience. On the other hand, they also provide the most flexibility and control. Despite their operational cost, if you are a platform engineer and have the infrastructure know-how necessary to understand and set up the authentication resources, this is for you. - - -These authentication methods deliver another way of [configuring long-lived credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#long-lived-credentials-api-keys-account-keys) in your Service Connectors without exposing them to clients. They are especially useful as an alternative to cloud provider Service Connectors authentication methods that do not support [automatically downscoping the permissions of issued temporary tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials). - -The processes of account impersonation and role assumption are very similar and can be summarized as follows: - -* you configure a Service Connector with long-lived credentials associated with a primary user account or primary service account (preferable). 
As a best practice, it is common to attach a reduced set of permissions or even no permissions to these credentials other than those that allow the account impersonation or role assumption operation. This makes it more difficult to do any damage if the primary credentials are accidentally leaked. -* in addition to the primary account and its long-lived credentials, you also need to provision one or more secondary access entities in the cloud platform bearing the effective permissions that will be needed to access the target resource(s): - * one or more IAM roles (to be assumed) - * one or more service accounts (to be impersonated) -* the Service Connector configuration also needs to contain the name of a target IAM role to be assumed or a service account to be impersonated. -* upon request, the Service Connector will exchange the long-lived credentials associated with the primary account for short-lived API tokens that only have the permissions associated with the target IAM role or service account. These temporary credentials are issued to clients and used to access the target resource, while the long-lived credentials are kept safe and never have to leave the ZenML server boundary. - - - -For this example, we have the following set up in GCP: - -* a primary `empty-connectors@zenml-core.iam.gserviceaccount.com` GCP service account with no permissions whatsoever aside from the "Service Account Token Creator" role that allows it to impersonate the secondary service account below. We also generate a service account key for this account. -* a secondary `zenml-bucket-sl@zenml-core.iam.gserviceaccount.com` GCP service account that only has permissions to access the `zenml-bucket-sl` GCS bucket - -First, let's show that the `empty-connectors` service account has no permissions to access any GCS buckets or any other resources for that matter. We'll register a regular GCP Service Connector that uses the service account key (long-lived credentials) directly: - - - -``` -zenml service-connector register gcp-empty-sa --type gcp --auth-method service-account --service_account_json=@empty-connectors@zenml-core.json --project_id=zenml-core -``` - -Example Command Output - - - -``` -Expanding argument value service_account_json to contents of file /home/stefan/aspyre/src/zenml/empty-connectors@zenml-core.json. - -Successfully registered service connector `gcp-empty-sa` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ 💥 error: connector authorization failure: failed to list GCS buckets: 403 GET ┃ - -┃ │ https://storage.googleapis.com/storage/v1/b?project=zenml-core&projection=noAcl&prettyPrint= ┃ - -┃ │ false: empty-connectors@zenml-core.iam.gserviceaccount.com does not have ┃ - -┃ │ storage.buckets.list access to the Google Cloud project. Permission 'storage.buckets.list' ┃ - -┃ │ denied on resource (or it may not exist). 
┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ 💥 error: connector authorization failure: Failed to list GKE clusters: 403 Required ┃ - -┃ │ "container.clusters.list" permission(s) for "projects/20219041791". [request_id: ┃ - -┃ │ "0xcb7086235111968a" ┃ - -┃ │ ] ┃ - -┠───────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -Next, we'll register a GCP Service Connector that actually uses account impersonation to access the `zenml-bucket-sl` GCS bucket and verify that it can actually access the bucket: - - - -``` -zenml service-connector register gcp-impersonate-sa --type gcp --auth-method impersonation --service_account_json=@empty-connectors@zenml-core.json --project_id=zenml-core --target_principal=zenml-bucket-sl@zenml-core.iam.gserviceaccount.com --resource-type gcs-bucket --resource-id gs://zenml-bucket-sl -``` - -Example Command Output - - - -``` -Expanding argument value service_account_json to contents of file /home/stefan/aspyre/src/zenml/empty-connectors@zenml-core.json. - -Successfully registered service connector `gcp-impersonate-sa` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼──────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - -### Short-lived credentials - - - This category of authentication methods uses temporary credentials explicitly configured in the Service Connector or generated by the Service Connector during auto-configuration. Of all available authentication methods, this is probably the least useful and you will likely never have to use it because it is terribly impractical: when short-lived credentials expire, Service Connectors become unusable and need to either be manually updated or replaced. - - -On the other hand, this authentication method is ideal if you're looking to grant someone else in your team temporary access to some resources without exposing your long-lived credentials. - -A previous section described how [temporary credentials can be automatically generated from other, long-lived credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) by most cloud provider Service Connectors. It only stands to reason that temporary credentials can also be generated manually by external means such as cloud provider CLIs and used directly to configure Service Connectors, or automatically generated during Service Connector auto-configuration. - -This may be used as a way to grant an external party temporary access to some resources and have the Service Connector automatically become unusable (i.e. expire) after some time. Your long-lived credentials are kept safe, while the Service Connector only stores a short-lived credential. 
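If you want to see what such a manually generated short-lived credential looks like before it is handed to a Service Connector, here is a minimal sketch using boto3 and the same `connectors` AWS CLI profile used in the auto-configuration example below; the profile name and session duration are illustrative assumptions:

```
import boto3

# Use the long-lived credentials from the local "connectors" AWS CLI profile
# to mint a temporary session token. Only the temporary values need to be
# shared or stored in the Service Connector.
session = boto3.Session(profile_name="connectors")
sts = session.client("sts")

response = sts.get_session_token(DurationSeconds=43200)  # 12 hours

credentials = response["Credentials"]
# AccessKeyId, SecretAccessKey and SessionToken all expire together at
# credentials["Expiration"]; after that the Service Connector becomes unusable.
print(credentials["Expiration"])
```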
- -The following is an example of using Service Connector auto-configuration to automatically generate a short-lived token from long-lived credentials configured for the local cloud provider CLI (AWS in this case): - - - -``` -AWS_PROFILE=connectors zenml service-connector register aws-sts-token --type aws --auto-configure --auth-method sts-token -``` - -Example Command Output - - - -``` -⠸ Registering service connector 'aws-sts-token'... - -Successfully registered service connector `aws-sts-token` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The Service Connector is now configured with a short-lived token that will expire after some time. You can verify this by inspecting the Service Connector: - - - -``` -zenml service-connector describe aws-sts-token -``` - -Example Command Output - - - -``` -Service connector 'aws-sts-token' of type 'aws' with id '63e14350-6719-4255-b3f5-0539c8f7c303' is owned by user 'default' and is 'private'. - - 'aws-sts-token' aws Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ e316bcb3-6659-467b-81e5-5ec25bfd36b0 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ aws-sts-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔶 aws ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ sts-token ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔶 aws-generic, 📦 s3-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 971318c9-8db9-4297-967d-80cda070a121 ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 11h58m17s ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-06-19 17:58:42.999323 ┃ - 
-┠──────────────────┼─────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-06-19 17:58:42.999324 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────┨ - -┃ region │ us-east-1 ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_access_key_id │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_secret_access_key │ [HIDDEN] ┃ - -┠───────────────────────┼───────────┨ - -┃ aws_session_token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┛ -``` - -Note the temporary nature of the Service Connector. It will become unusable in 12 hours: - - - -``` -zenml service-connector list --name aws-sts-token -``` - -Example Command Output - - - -``` -┏━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼───────────────┼─────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ aws-sts-token │ e316bcb3-6659-467b-81e5-5ec25bf │ 🔶 aws │ 🔶 aws-generic │ │ ➖ │ default │ 11h57m12s │ ┃ - -┃ │ │ d36b0 │ │ 📦 s3-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - diff --git a/develop/usage/resource-data-management/auth-management/docker-service-connector.mdx b/develop/usage/resource-data-management/auth-management/docker-service-connector.mdx deleted file mode 100644 index ee5fe69..0000000 --- a/develop/usage/resource-data-management/auth-management/docker-service-connector.mdx +++ /dev/null @@ -1,111 +0,0 @@ ---- -title: "Docker Service Connector" -description: "Configuring Docker Service Connectors to connect ZenML to Docker container registries." -icon: docker ---- - -The ZenML Docker Service Connector allows authenticating with a Docker or OCI container registry and managing Docker clients for the registry. This connector provides pre-authenticated python-docker Python clients to Stack Components that are linked to it. - -```bash -zenml service-connector list-types --type docker -``` - -```bash -┏━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ -┠──────────────────────────┼───────────┼────────────────────┼──────────────┼───────┼────────┨ -┃ Docker Service Connector │ 🐳 docker │ 🐳 docker-registry │ password │ ✅ │ ✅ ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -## Prerequisites - -No Python packages are required for this Service Connector. All prerequisites are included in the base ZenML Python package. Docker needs to be installed on environments where container images are built and pushed to the target container registry. - -## Resource Types - -The Docker Service Connector only supports authenticating to and granting access to a Docker/OCI container registry. This type of resource is identified by the `docker-registry` Resource Type. 
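To make the idea of "pre-authenticated python-docker Python clients" concrete, the following is a minimal sketch of how client code could obtain one through the ZenML Python Client. The connector name `dockerhub` and the `docker.io` resource ID are assumptions matching the registration example further below, and the exact object returned by `connect()` is whatever the Service Connector implementation provides:

```
from zenml.client import Client

# Assumes a Docker Service Connector named "dockerhub" has already been
# registered (see the registration example below).
connector_client = Client().get_service_connector_client(
    name_id_or_prefix="dockerhub",
    resource_type="docker-registry",
    resource_id="docker.io",
)

# connect() returns a python-docker client that is already logged in to the
# registry, so no manual `docker login` is required before pushing or pulling.
docker_client = connector_client.connect()
docker_client.images.pull("hello-world:latest")
```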
- -The resource name identifies a Docker/OCI registry using one of the following formats (the repository name is optional and ignored). - -* DockerHub: docker.io or `https://index.docker.io/v1/` -* generic OCI registry URI: `https://host:port/` - -## Authentication Methods - -Authenticating to Docker/OCI container registries is done with a username and password or access token. It is recommended to use API tokens instead of passwords, wherever this is available, for example in the case of DockerHub: - -```bash -zenml service-connector register dockerhub --type docker -in -``` - -```bash Example Command Output -Please enter a name for the service connector [dockerhub]: - -Please enter a description for the service connector []: - -Please select a service connector type (docker) [docker]: - -Only one resource type is available for this connector (docker-registry). - -Only one authentication method is available for this connector (password). Would you like to use it? [Y/n]: - -Please enter the configuration for the Docker username and password/token authentication method. - -[username] Username {string, secret, required}: - -[password] Password {string, secret, required}: - -[registry] Registry server URL. Omit to use DockerHub. {string, optional}: - -Successfully registered service connector `dockerhub` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼────────────────┨ - -┃ 🐳 docker-registry │ docker.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - - - This Service Connector does not support generating short-lived credentials from the username and password or token credentials configured in the Service Connector. In effect, this means that the configured credentials will be distributed directly to clients and used to authenticate directly to the target Docker/OCI registry service. - - -## Auto-configuration - - - This Service Connector does not support auto-discovery and extraction of authentication credentials from local Docker clients. If this feature is useful to you or your organization, please let us know by messaging us in [Slack](https://zenml.io/slack) or [creating an issue on GitHub](https://github.com/zenml-io/zenml/issues). - - -## Local client provisioning - -This Service Connector allows configuring the local Docker client with credentials: - -```bash -zenml service-connector login dockerhub -``` - -```bash Example Command Output - -Attempting to configure local client using service connector 'dockerhub'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'dockerhub' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. -``` - -## Stack Components use - -The Docker Service Connector can be used by all Container Registry stack component flavors to authenticate to a remote Docker/OCI container registry. This allows container images to be built and published to private container registries without the need to configure explicit Docker credentials in the target environment or the Stack Component. - - - ZenML does not yet support automatically configuring Docker credentials in container runtimes such as Kubernetes clusters (i.e. 
via imagePullSecrets) to allow container images to be pulled from the private container registries. This will be added in a future release. - diff --git a/develop/usage/resource-data-management/auth-management/gcp-service-connector.mdx b/develop/usage/resource-data-management/auth-management/gcp-service-connector.mdx deleted file mode 100644 index babc7c9..0000000 --- a/develop/usage/resource-data-management/auth-management/gcp-service-connector.mdx +++ /dev/null @@ -1,2855 +0,0 @@ ---- -title: "GCP Service Connector" -description: "Configuring GCP Service Connectors to connect ZenML to GCP resources such as GCS buckets, GKE Kubernetes clusters, and GCR container registries." -icon: file-code ---- - -The ZenML GCP Service Connector facilitates the authentication and access to managed GCP services and resources. These encompass a range of resources, including GCS buckets, GAR and GCR container repositories, and GKE clusters. The connector provides support for various authentication methods, including GCP user accounts, service accounts, short-lived OAuth 2.0 tokens, and implicit authentication. - -To ensure heightened security measures, this connector always issues [short-lived OAuth 2.0 tokens to clients instead of long-lived credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) unless explicitly configured to do otherwise. Furthermore, it includes [automatic configuration and detection of credentials locally configured through the GCP CLI](/develop/usage/resource-data-management/auth-management/service-connectors-guide#auto-configuration). - -This connector serves as a general means of accessing any GCP service by issuing OAuth 2.0 credential objects to clients. Additionally, the connector can handle specialized authentication for GCS, Docker, and Kubernetes Python clients. It also allows for the configuration of local Docker and Kubernetes CLIs. - - - -```bash -$ zenml service-connector list-types --type gcp -``` - - - -```bash -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠───────────────────────┼────────┼───────────────────────┼──────────────────┼───────┼────────┨ - -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ - -┃ │ │ 🐳 docker-registry │ external-account │ │ ┃ - -┃ │ │ │ oauth2-token │ │ ┃ - -┃ │ │ │ impersonation │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -## Prerequisites - -The GCP Service Connector is part of the GCP ZenML integration. You can either install the entire integration or use a PyPI extra to install it independently of the integration: - -* `pip install "zenml[connectors-gcp]"` installs only prerequisites for the GCP Service Connector Type -* `zenml integration install gcp` installs the entire GCP ZenML integration - -It is not required to [install and set up the GCP CLI on your local machine](https://cloud.google.com/sdk/gcloud) to use the GCP Service Connector to link Stack Components to GCP resources and services. However, it is recommended to do so if you are looking for a quick setup that includes using the auto-configuration Service Connector features. 
- - - The auto-configuration examples in this page rely on the GCP CLI being installed and already configured with valid credentials of one type or another. If you want to avoid installing the GCP CLI, we recommend using the interactive mode of the ZenML CLI to register Service Connectors: -``` -zenml service-connector register -i --type gcp -``` - - -## Resource Types - -### Generic GCP resource - -This resource type allows Stack Components to use the GCP Service Connector to connect to any GCP service or resource. When used by Stack Components, they are provided a Python google-auth credentials object populated with a GCP OAuth 2.0 token. This credentials object can then be used to create GCP Python clients for any particular GCP service. - -This generic GCP resource type is meant to be used with Stack Components that are not represented by one of the other, more specific resource types like GCS buckets, Kubernetes clusters, or Docker registries. For example, it can be used with [the Google Cloud Image Builder](/develop/stack-components/image-builders/gcp) stack component, or [the Vertex AI Orchestrator](/develop/stack-components/orchestrators/vertex) and [Step Operator](/develop/stack-components/step-operators/vertex). It should be accompanied by a matching set of GCP permissions that allow access to the set of remote resources required by the client and Stack Component (see the documentation of each Stack Component for more details). - -The resource name represents the GCP project that the connector is authorized to access. - -### GCS bucket - -Allows Stack Components to connect to GCS buckets. When used by Stack Components, they are provided a pre-configured GCS Python client instance. - -The configured credentials must have at least the following [GCP permissions](https://cloud.google.com/iam/docs/permissions-reference) associated with the GCS buckets that it can access: - -* `storage.buckets.list` -* `storage.buckets.get` -* `storage.objects.create` -* `storage.objects.delete` -* `storage.objects.get` -* `storage.objects.list` -* `storage.objects.update` - -For example, the GCP Storage Admin role includes all of the required permissions, but it also includes additional permissions that are not required by the connector. - -If set, the resource name must identify a GCS bucket using one of the following formats: - -* GCS bucket URI (canonical resource name): gs://{bucket-name} -* GCS bucket name: {bucket-name} - -### GKE Kubernetes cluster - -Allows Stack Components to access a GKE cluster as a standard Kubernetes cluster resource. When used by Stack Components, they are provided a pre-authenticated Python Kubernetes client instance. - -The configured credentials must have at least the following [GCP permissions](https://cloud.google.com/iam/docs/permissions-reference) associated with the GKE clusters that it can access: - -* `container.clusters.list` -* `container.clusters.get` - -In addition to the above permissions, the credentials should include permissions to connect to and use the GKE cluster (i.e. some or all permissions in the Kubernetes Engine Developer role). - -If set, the resource name must identify a GKE cluster using one of the following formats: - -* GKE cluster name: `{cluster-name}` - -GKE cluster names are project scoped. The connector can only be used to access GKE clusters in the GCP project that it is configured to use. 
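As an illustration of what the "pre-authenticated Python Kubernetes client" amounts to, the following sketch shows how client code could use it. The connector name, the cluster name and the assumption that `connect()` returns a `kubernetes.client.ApiClient` are taken from or inferred from the examples in this page rather than guaranteed API contracts:

```
from kubernetes import client as k8s_client

from zenml.client import Client

# Assumes a GCP Service Connector that can reach the "zenml-test-cluster"
# GKE cluster, as in the examples below.
connector_client = Client().get_service_connector_client(
    name_id_or_prefix="gcp-user-account",
    resource_type="kubernetes-cluster",
    resource_id="zenml-test-cluster",
)

# connect() is assumed to hand back a pre-authenticated
# kubernetes.client.ApiClient that the standard API wrappers can use.
api_client = connector_client.connect()
core_api = k8s_client.CoreV1Api(api_client)
print([node.metadata.name for node in core_api.list_node().items])
```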
- -### GAR container registry (including legacy GCR support) - - - **Important Notice: Google Container Registry** [**is being replaced by Artifact Registry**](https://cloud.google.com/artifact-registry/docs/transition/transition-from-gcr)\*\*. Please start using Artifact Registry for your containers. As per Google's documentation, "after May 15, 2024, Artifact Registry will host images for the gcr.io domain in Google Cloud projects without previous Container Registry usage. After March 18, 2025, Container Registry will be shut down.". - -Support for legacy GCR registries is still included in the GCP service connector. Users that already have GCP service connectors configured to access GCR registries may continue to use them without taking any action. However, it is recommended to transition to Google Artifact Registries as soon as possible by following [the GCP guide on this subject](https://cloud.google.com/artifact-registry/docs/transition/transition-from-gcr) and making the following updates to ZenML GCP Service Connectors that are used to access GCR resources: - -* add the IAM permissions documented here to the GCP Service Connector credentials to enable them to access the Artifact Registries. -* users may keep the gcr.io GCR URLs already configured in the GCP Service Connectors as well as those used in linked Container Registry stack components given that these domains are redirected by Google to GAR as covered in the GCR transition guide. Alternatively, users may update the GCP Service Connector configuration and/or the Container Registry stack components to use the replacement Artifact Registry URLs. - -The GCP Service Connector will list the legacy GCR registries as accessible for a GCP project even if the GCP Service Connector credentials do not grant access to GCR registries. This is required for backwards-compatibility and will be removed in a future release. - - -Allows Stack Components to access a Google Artifact Registry as a standard Docker registry resource. When used by Stack Components, they are provided a pre-authenticated Python Docker client instance. - -The configured credentials must have at least the following [GCP permissions](https://cloud.google.com/iam/docs/understanding-roles#artifact-registry-roles): - -* `artifactregistry.repositories.createOnPush` -* `artifactregistry.repositories.downloadArtifacts` -* `artifactregistry.repositories.get` -* `artifactregistry.repositories.list` -* `artifactregistry.repositories.readViaVirtualRepository` -* `artifactregistry.repositories.uploadArtifacts` -* `artifactregistry.locations.list` - -The Artifact Registry Create-on-Push Writer role includes all of the above permissions. - -This resource type also includes legacy GCR container registry support. When used with GCR registries, the configured credentials must have at least the following [GCP permissions](https://cloud.google.com/iam/docs/understanding-roles#cloud-storage-roles): - -* `storage.buckets.get` -* `storage.multipartUploads.abort` -* `storage.multipartUploads.create` -* `storage.multipartUploads.list` -* `storage.multipartUploads.listParts` -* `storage.objects.create` -* `storage.objects.delete` -* `storage.objects.list` - -The Storage Legacy Bucket Writer role includes all of the above permissions while at the same time restricting access to only the GCR buckets. 
- -If set, the resource name must identify a GAR or GCR registry using one of the following formats: - -* Google Artifact Registry repository URI: `[https://]{region}-docker.pkg.dev/{project-id}/{repository-name}[/{image-name}]` -* Google Artifact Registry name: `projects/{project-id}/locations/{location}/repositories/{repository-id}` -* (legacy) GCR repository URI: `[https://][us.|eu.|asia.]gcr.io/{project-id}[/{repository-name}]` - -The connector can only be used to access GAR and GCR registries in the GCP project that it is configured to use. - -## Authentication Methods - -### Implicit authentication - -[Implicit authentication](/develop/usage/resource-data-management/auth-management/best-security-practices#implicit-authentication) to GCP services using [Application Default Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc). - - - This method may constitute a security risk, because it can give users access to the same cloud resources and services that the ZenML Server itself is configured to access. For this reason, all implicit authentication methods are disabled by default and need to be explicitly enabled by setting the `ZENML_ENABLE_IMPLICIT_AUTH_METHODS` environment variable or the helm chart `enableImplicitAuthMethods` configuration option to `true` in the ZenML deployment. - - -This authentication method doesn't require any credentials to be explicitly configured. It automatically discovers and uses credentials from one of the following sources: - -* environment variables (GOOGLE\_APPLICATION\_CREDENTIALS) -* local ADC credential files set up by running `gcloud auth application-default login` (e.g. `~/.config/gcloud/application_default_credentials.json`). -* a GCP service account attached to the resource where the ZenML server is running. Only works when running the ZenML server on a GCP resource with a service account attached to it or when using Workload Identity (e.g. GKE cluster). - -This is the quickest and easiest way to authenticate to GCP services. However, the results depend on how ZenML is deployed and on the environment where it is used, and are thus not fully reproducible: - -* when used with the default local ZenML deployment or a local ZenML server, the credentials are those set up on your machine (i.e. by running `gcloud auth application-default login` or setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to point to a service account key JSON file). -* when connected to a ZenML server, this method only works if the ZenML server is deployed in GCP and will use the service account attached to the GCP resource where the ZenML server is running (e.g. a GKE cluster). The service account permissions may need to be adjusted to allow listing and accessing/describing the GCP resources that the connector is configured to access. - -Note that the discovered credentials inherit the full set of permissions of the local GCP CLI credentials or service account attached to the ZenML server GCP workload. Depending on the extent of those permissions, this authentication method might not be suitable for production use, as it can lead to accidental privilege escalation. Instead, it is recommended to use [the Service Account Key](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcp-service-account) or [Service Account Impersonation](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcp-service-account-impersonation) authentication methods to restrict the permissions that are granted to the connector clients.
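The discovery order described above is the standard Application Default Credentials mechanism implemented by the `google-auth` library. As a quick sanity check, you can inspect what ADC would resolve to on a given machine with a couple of lines of Python; this is shown purely for illustration and is not something ZenML requires you to run:

```
import google.auth

# Resolves credentials using the ADC lookup order: GOOGLE_APPLICATION_CREDENTIALS,
# local ADC files created by `gcloud auth application-default login`, and finally
# the service account attached to the GCP workload (e.g. a GKE node or pod).
credentials, project_id = google.auth.default()
print(project_id)
```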
- -To find out more about Application Default Credentials, [see the GCP ADC documentation](https://cloud.google.com/docs/authentication/provide-credentials-adc). - -A GCP project is required and the connector may only be used to access GCP resources in the specified project. When used remotely in a GCP workload, the configured project has to be the same as the project of the attached service account. - - The following assumes the local GCP CLI has already been configured with user account credentials by running the `gcloud auth application-default login` command: - - - -``` -zenml service-connector register gcp-implicit --type gcp --auth-method implicit --auto-configure -``` - -Example Command Output - - - -``` -Successfully registered service connector `gcp-implicit` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┃ │ gs://zenml-project-time-series-bucket ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -No credentials are stored with the Service Connector: - - - -``` -zenml service-connector describe gcp-implicit -``` - -Example Command Output - - - -``` -Service connector 'gcp-implicit' of type 'gcp' with id '0c49a7fe-5e87-41b9-adbe-3da0a0452e44' is owned by user 'default' and is 'private'. 
- - 'gcp-implicit' gcp Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 0c49a7fe-5e87-41b9-adbe-3da0a0452e44 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ gcp-implicit ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔵 gcp ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ implicit ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔵 gcp-generic, 📦 gcs-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-05-19 08:04:51.037955 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-05-19 08:04:51.037958 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━┯━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠────────────┼────────────┨ - -┃ project_id │ zenml-core ┃ - -┗━━━━━━━━━━━━┷━━━━━━━━━━━━┛ -``` - - - -### GCP User Account - -[Long-lived GCP credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#long-lived-credentials-api-keys-account-keys) consist of a GCP user account and its credentials. - -This method requires GCP user account credentials like those generated by the `gcloud auth application-default login` command. - -By default, the GCP connector [generates temporary OAuth 2.0 tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) from the user account credentials and distributes them to clients. The tokens have a limited lifetime of 1 hour. This behavior can be disabled by setting the `generate_temporary_tokens` configuration option to `False`, in which case, the connector will distribute the user account credentials JSON to clients instead (not recommended). - -This method is preferred during development and testing due to its simplicity and ease of use. It is not recommended as a direct authentication method for production use cases because the clients are granted the full set of permissions of the GCP user account. For production, it is recommended to use the GCP Service Account or GCP Service Account Impersonation authentication methods. 
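To make the token hand-off described above more tangible, here is a small sketch of how a short-lived OAuth 2.0 access token can be derived from locally configured user account credentials with the `google-auth` library. It only illustrates the kind of credential that gets distributed; it is not ZenML's internal implementation:

```
import google.auth
from google.auth.transport.requests import Request

# Picks up the user account credentials created by
# `gcloud auth application-default login`.
credentials, _ = google.auth.default()

# Refreshing produces a short-lived OAuth 2.0 access token (roughly one hour),
# which is the only piece that needs to reach the client.
credentials.refresh(Request())
print(credentials.token, credentials.expiry)
```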
- -A GCP project is required and the connector may only be used to access GCP resources in the specified project. - -If you already have the local GCP CLI set up with these credentials, they will be automatically picked up when auto-configuration is used (see the example below). - - -The following assumes the local GCP CLI has been configured with GCP user account credentials by running the `gcloud auth application-default login` command: - - - -``` -zenml service-connector register gcp-user-account --type gcp --auth-method user-account --auto-configure -``` - -Example Command Output - - - -``` -Successfully registered service connector `gcp-user-account` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┃ │ gs://zenml-project-time-series-bucket ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The GCP user account credentials were lifted up from the local host: - - - -``` -zenml service-connector describe gcp-user-account -``` - -Example Command Output - - - -``` -Service connector 'gcp-user-account' of type 'gcp' with id 'ddbce93f-df14-4861-a8a4-99a80972f3bc' is owned by user 'default' and is 'private'. 
- - 'gcp-user-account' gcp Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ ddbce93f-df14-4861-a8a4-99a80972f3bc ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ gcp-user-account ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔵 gcp ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ user-account ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔵 gcp-generic, 📦 gcs-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 17692951-614f-404f-a13a-4abb25bfa758 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-05-19 08:09:44.102934 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-05-19 08:09:44.102936 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────┼────────────┨ - -┃ project_id │ zenml-core ┃ - -┠───────────────────┼────────────┨ - -┃ user_account_json │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┛ -``` - - - -### GCP Service Account - -[Long-lived GCP credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#long-lived-credentials-api-keys-account-keys) consisting of a GCP service account and its credentials. - -This method requires [a GCP service account](https://cloud.google.com/iam/docs/service-account-overview) and [a service account key JSON](https://cloud.google.com/iam/docs/service-account-creds#key-types) created for it. - -By default, the GCP connector [generates temporary OAuth 2.0 tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#generating-temporary-and-down-scoped-credentials) from the service account credentials and distributes them to clients. The tokens have a limited lifetime of 1 hour. This behavior can be disabled by setting the `generate_temporary_tokens` configuration option to `False`, in which case, the connector will distribute the service account credentials JSON to clients instead (not recommended). - -A GCP project is required and the connector may only be used to access GCP resources in the specified project. 
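For reference, the following is a minimal Python sketch of how a service account key is normally exchanged for short-lived credentials with the `google-auth` library; the key file path is a hypothetical placeholder and the scope is an illustrative assumption. It mirrors, but is not, the connector's own token generation when `generate_temporary_tokens` is enabled:

```
# Minimal sketch: exchange a GCP service account key JSON for a temporary
# OAuth 2.0 access token. The key file path below is a placeholder.
from google.oauth2 import service_account
from google.auth.transport.requests import Request

credentials = service_account.Credentials.from_service_account_file(
    "service-account-key.json",  # hypothetical path to the downloaded key
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# The key itself is long-lived; the access token obtained here is not.
credentials.refresh(Request())

print(f"service account: {credentials.service_account_email}")
print(f"token expires:   {credentials.expiry} (UTC)")  # roughly 1 hour ahead
```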
- -If you already have the `GOOGLE_APPLICATION_CREDENTIALS` environment variable configured to point to a service account key JSON file, it will be automatically picked up when auto-configuration is used. - -The following assumes a GCP service account was created, [granted permissions to access GCS buckets](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcs-bucket) in the target project and a service account key JSON was generated and saved locally in the `connectors-devel@zenml-core.json` file: - - - -``` -zenml service-connector register gcp-service-account --type gcp --auth-method service-account --resource-type gcs-bucket --project_id=zenml-core --service_account_json=@connectors-devel@zenml-core.json -``` - -Example Command Output - - - -``` -Expanding argument value service_account_json to contents of file connectors-devel@zenml-core.json. - -Successfully registered service connector `gcp-service-account` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┃ │ gs://zenml-project-time-series-bucket ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The GCP service connector configuration and service account credentials: - - - -``` -zenml service-connector describe gcp-service-account -``` - -Example Command Output - - - -``` -Service connector 'gcp-service-account' of type 'gcp' with id '4b3d41c9-6a6f-46da-b7ba-8f374c3f49c5' is owned by user 'default' and is 'private'. 
- - 'gcp-service-account' gcp Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ ID │ 4b3d41c9-6a6f-46da-b7ba-8f374c3f49c5 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ NAME │ gcp-service-account ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ TYPE │ 🔵 gcp ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ AUTH METHOD │ service-account ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 📦 gcs-bucket ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SECRET ID │ 0d0a42bb-40a4-4f43-af9e-6342eeca3f28 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-05-19 08:15:48.056937 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-05-19 08:15:48.056940 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────────┼────────────┨ - -┃ project_id │ zenml-core ┃ - -┠──────────────────────┼────────────┨ - -┃ service_account_json │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┛ -``` - - -### GCP Service Account impersonation - -Generates [temporary STS credentials](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles) by [impersonating another GCP service account](https://cloud.google.com/iam/docs/create-short-lived-credentials-direct#sa-impersonation). - -The connector needs to be configured with the email address of the target GCP service account to be impersonated, accompanied by a GCP service account key JSON for the primary service account. The primary service account must have permission to generate tokens for the target service account (i.e. [the Service Account Token Creator role](https://cloud.google.com/iam/docs/service-account-permissions#directly-impersonate)). The connector will generate temporary OAuth 2.0 tokens upon request by using [GCP direct service account impersonation](https://cloud.google.com/iam/docs/create-short-lived-credentials-direct#sa-impersonation). The tokens have a configurable limited lifetime of up to 1 hour. - -[The best practice implemented with this authentication scheme](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles) is to keep the set of permissions associated with the primary service account down to the bare minimum and grant permissions to the privilege-bearing service account instead. - -A GCP project is required and the connector may only be used to access GCP resources in the specified project. - -If you already have the `GOOGLE_APPLICATION_CREDENTIALS` environment variable configured to point to the primary service account key JSON file, it will be automatically picked up when auto-configuration is used. 
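As a rough sketch of the impersonation flow described above (assuming the `google-auth` package is installed; the key file name and service account emails match the example that follows, and the scope is an illustrative assumption), the same short-lived token exchange can be reproduced locally:

```
# Minimal sketch: direct GCP service account impersonation with google-auth.
# The primary (source) credentials only need the Service Account Token Creator
# role on the target service account; real permissions live on the target.
from google.auth import impersonated_credentials
from google.auth.transport.requests import Request
from google.oauth2 import service_account

# Primary service account key (file name taken from the example below).
source_credentials = service_account.Credentials.from_service_account_file(
    "empty-connectors@zenml-core.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Impersonate the privilege-bearing target service account for up to 1 hour.
target_credentials = impersonated_credentials.Credentials(
    source_credentials=source_credentials,
    target_principal="zenml-bucket-sl@zenml-core.iam.gserviceaccount.com",
    target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
    lifetime=3600,  # seconds; the connector exposes a similar configurable limit
)

# Refreshing calls the IAM Credentials generateAccessToken API under the hood.
target_credentials.refresh(Request())
print(f"impersonated token expires: {target_credentials.expiry} (UTC)")
```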
- - -For this example, we have the following set up in GCP: - -* a primary `empty-connectors@zenml-core.iam.gserviceaccount.com` GCP service account with no permissions whatsoever aside from the "Service Account Token Creator" role that allows it to impersonate the secondary service account below. We also generate a service account key for this account. -* a secondary `zenml-bucket-sl@zenml-core.iam.gserviceaccount.com` GCP service account that only has permission to access the `zenml-bucket-sl` GCS bucket - -First, let's show that the `empty-connectors` service account has no permission to access any GCS buckets or any other resources for that matter. We'll register a regular GCP Service Connector that uses the service account key (long-lived credentials) directly: - - - -``` -zenml service-connector register gcp-empty-sa --type gcp --auth-method service-account --service_account_json=@empty-connectors@zenml-core.json --project_id=zenml-core -``` - -Example Command Output - - - -``` -Expanding argument value service_account_json to contents of file /home/stefan/aspyre/src/zenml/empty-connectors@zenml-core.json. - -Successfully registered service connector `gcp-empty-sa` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ 💥 error: connector authorization failure: failed to list GCS buckets: 403 GET ┃ - -┃ │ https://storage.googleapis.com/storage/v1/b?project=zenml-core&projection=noAcl&prettyPrint=false: ┃ - -┃ │ empty-connectors@zenml-core.iam.gserviceaccount.com does not have storage.buckets.list access to the Google Cloud ┃ - -┃ │ project. Permission 'storage.buckets.list' denied on resource (or it may not exist). ┃ - -┠───────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ 💥 error: connector authorization failure: Failed to list GKE clusters: 403 Required "container.clusters.list" ┃ - -┃ │ permission(s) for "projects/20219041791". [request_id: "0x84808facdac08541" ┃ - -┃ │ ] ┃ - -┠───────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -Verifying access to individual resource types will fail: - - - -``` -zenml service-connector verify gcp-empty-sa --resource-type kubernetes-cluster -``` - -Example Command Output - - - -``` -Error: Service connector 'gcp-empty-sa' verification failed: connector authorization failure: Failed to list GKE clusters: - -403 Required "container.clusters.list" permission(s) for "projects/20219041791". 
-``` - - - -``` -zenml service-connector verify gcp-empty-sa --resource-type gcs-bucket -``` - -Example Command Output - - - -``` -Error: Service connector 'gcp-empty-sa' verification failed: connector authorization failure: failed to list GCS buckets: - -403 GET https://storage.googleapis.com/storage/v1/b?project=zenml-core&projection=noAcl&prettyPrint=false: - -empty-connectors@zenml-core.iam.gserviceaccount.com does not have storage.buckets.list access to the Google Cloud project. - -Permission 'storage.buckets.list' denied on resource (or it may not exist). -``` - - - -``` -zenml service-connector verify gcp-empty-sa --resource-type gcs-bucket --resource-id zenml-bucket-sl -``` - -Example Command Output - - - -``` -Error: Service connector 'gcp-empty-sa' verification failed: connector authorization failure: failed to fetch GCS bucket - -zenml-bucket-sl: 403 GET https://storage.googleapis.com/storage/v1/b/zenml-bucket-sl?projection=noAcl&prettyPrint=false: - -empty-connectors@zenml-core.iam.gserviceaccount.com does not have storage.buckets.get access to the Google Cloud Storage bucket. - -Permission 'storage.buckets.get' denied on resource (or it may not exist). -``` - -Next, we'll register a GCP Service Connector that actually uses account impersonation to access the `zenml-bucket-sl` GCS bucket and verify that it can actually access the bucket: - - - -``` -zenml service-connector register gcp-impersonate-sa --type gcp --auth-method impersonation --service_account_json=@empty-connectors@zenml-core.json --project_id=zenml-core --target_principal=zenml-bucket-sl@zenml-core.iam.gserviceaccount.com --resource-type gcs-bucket --resource-id gs://zenml-bucket-sl -``` - -Example Command Output - - - -``` -Expanding argument value service_account_json to contents of file /home/stefan/aspyre/src/zenml/empty-connectors@zenml-core.json. - -Successfully registered service connector `gcp-impersonate-sa` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼──────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - - -### External Account (GCP Workload Identity) - -Use [GCP workload identity federation](https://cloud.google.com/iam/docs/workload-identity-federation) to authenticate to GCP services using AWS IAM credentials, Azure Active Directory credentials or generic OIDC tokens. - -This authentication method only requires a GCP workload identity external account JSON file that only contains the configuration for the external account without any sensitive credentials. It allows implementing [a two layer authentication scheme](/develop/usage/resource-data-management/auth-management/best-security-practices#impersonating-accounts-and-assuming-roles) that keeps the set of permissions associated with implicit credentials down to the bare minimum and grants permissions to the privilege-bearing GCP service account instead. - -This authentication method can be used to authenticate to GCP services using credentials from other cloud providers or identity providers. When used with workloads running on AWS or Azure, it involves automatically picking up credentials from the AWS IAM or Azure AD identity associated with the workload and using them to authenticate to GCP services. This means that the result depends on the environment where the ZenML server is deployed and is thus not fully reproducible. 
- - - When used with AWS or Azure implicit in-cloud authentication, this method may constitute a security risk, because it can give users access to the identity (e.g. AWS IAM role or Azure AD principal) implicitly associated with the environment where the ZenML server is running. For this reason, all implicit authentication methods are disabled by default and need to be explicitly enabled by setting the `ZENML_ENABLE_IMPLICIT_AUTH_METHODS` environment variable or the helm chart `enableImplicitAuthMethods` configuration option to `true` in the ZenML deployment.
 -
 -By default, the GCP connector generates temporary OAuth 2.0 tokens from the external account credentials and distributes them to clients. The tokens have a limited lifetime of 1 hour. This behavior can be disabled by setting the `generate_temporary_tokens` configuration option to `False`, in which case, the connector will distribute the external account credentials JSON to clients instead (not recommended).
 -
 -A GCP project is required and the connector may only be used to access GCP resources in the specified project. This project must be the same as the one for which the external account was configured.
 -
 -If you already have the `GOOGLE_APPLICATION_CREDENTIALS` environment variable configured to point to an external account key JSON file, it will be automatically picked up when auto-configuration is used.
 -
 -The following assumes that these prerequisites are met, as covered in [the GCP documentation on how to configure workload identity federation with AWS](https://cloud.google.com/iam/docs/workload-identity-federation-with-other-clouds):
 -
 -* the ZenML server is deployed in AWS in an EKS cluster (or any other AWS compute environment)
 -* the ZenML server EKS pods are associated with an AWS IAM role by means of an IAM OIDC provider, as covered in the [AWS documentation on how to associate an IAM role with a service account](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html). Alternatively, [the IAM role associated with the EKS/EC2 nodes](https://docs.aws.amazon.com/eks/latest/userguide/create-node-role.html) can be used instead. This AWS IAM role provides the implicit AWS IAM identity and credentials that will be used to authenticate to GCP services.
 -* a GCP workload identity pool and AWS provider are configured for the GCP project where the target resources are located, as covered in [the GCP documentation on how to configure workload identity federation with AWS](https://cloud.google.com/iam/docs/workload-identity-federation-with-other-clouds).
 -* a GCP service account is configured with permissions to access the target resources and granted the `roles/iam.workloadIdentityUser` role for the workload identity pool and AWS provider
 -* a GCP external account JSON file is generated for the GCP service account. This is used to configure the GCP connector.
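Before the registration example below, this is a minimal sketch of how such an external account JSON is consumed on the client side (assuming the `google-auth` package and an AWS environment with ambient credentials; the file name matches the registration command that follows and the scope is an illustrative assumption). The configuration file itself contains no secrets; the exchange for a GCP token happens at refresh time:

```
# Minimal sketch: load a GCP external account (workload identity federation)
# configuration file with google-auth. The JSON holds no secrets; at refresh
# time the ambient AWS credentials are exchanged for a GCP access token via STS.
import google.auth
from google.auth.transport.requests import Request

credentials, project_id = google.auth.load_credentials_from_file(
    "clientLibraryConfig-aws-zenml.json",  # external account config from the example below
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Only works where AWS credentials are implicitly available (e.g. on EKS/EC2).
credentials.refresh(Request())
print(f"project: {project_id}, token expires: {credentials.expiry} (UTC)")
```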
- - - -``` -zenml service-connector register gcp-workload-identity --type gcp \ - - --auth-method external-account --project_id=zenml-core \ - - --external_account_json=@clientLibraryConfig-aws-zenml.json -``` - -Example Command Output - - - -``` -Successfully registered service connector `gcp-workload-identity` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┃ │ gs://zenml-project-time-series-bucket ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -No sensitive credentials are stored with the Service Connector, just meta-information about the external provider and the external account: - - - -``` -zenml service-connector describe gcp-workload-identity -x -``` - -Example Command Output - - - -``` -Service connector 'gcp-workload-identity' of type 'gcp' with id '37b6000e-3f7f-483e-b2c5-7a5db44fe66b' is - -owned by user 'default'. 
- - 'gcp-workload-identity' gcp Service Connector Details - -┏━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ 37b6000e-3f7f-483e-b2c5-7a5db44fe66b ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ gcp-workload-identity ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔵 gcp ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ external-account ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔵 gcp-generic, 📦 gcs-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 1ff6557f-7f60-4e63-b73d-650e64f015b5 ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES_SKEW_TOLERANCE │ N/A ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2024-01-30 20:44:14.020514 ┃ - -┠────────────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2024-01-30 20:44:14.020516 ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────────────────────────────────────────────────────────────────────────┨ - -┃ project_id │ zenml-core ┃ - -┠───────────────────────┼───────────────────────────────────────────────────────────────────────────────┨ - -┃ external_account_json │ { ┃ - -┃ │ "type": "external_account", ┃ - -┃ │ "audience": ┃ - -┃ │ "//iam.googleapis.com/projects/30267569827/locations/global/workloadIdentityP ┃ - -┃ │ ools/mypool/providers/myprovider", ┃ - -┃ │ "subject_token_type": "urn:ietf:params:aws:token-type:aws4_request", ┃ - -┃ │ "service_account_impersonation_url": ┃ - -┃ │ "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/myrole@ ┃ - -┃ │ zenml-core.iam.gserviceaccount.com:generateAccessToken", ┃ - -┃ │ "token_url": "https://sts.googleapis.com/v1/token", ┃ - -┃ │ "credential_source": { ┃ - -┃ │ "environment_id": "aws1", ┃ - -┃ │ "region_url": ┃ - -┃ │ "http://169.254.169.254/latest/meta-data/placement/availability-zone", ┃ - -┃ │ "url": ┃ - -┃ │ "http://169.254.169.254/latest/meta-data/iam/security-credentials", ┃ - -┃ │ "regional_cred_verification_url": ┃ - -┃ │ "https://sts.{region}.amazonaws.com?Action=GetCallerIdentity&Version=2011-06- ┃ - -┃ │ 15" ┃ - -┃ │ } ┃ 
- -┃ │ } ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - - - -### GCP OAuth 2.0 token - -Uses [temporary OAuth 2.0 tokens](/develop/usage/resource-data-management/auth-management/best-security-practices#short-lived-credentials) explicitly configured by the user. - -This method has the major limitation that the user must regularly generate new tokens and update the connector configuration as OAuth 2.0 tokens expire. On the other hand, this method is ideal in cases where the connector only needs to be used for a short period of time, such as sharing access temporarily with someone else in your team. - -Using any of the other authentication methods will automatically generate and refresh OAuth 2.0 tokens for clients upon request. - -A GCP project is required and the connector may only be used to access GCP resources in the specified project. - - -Fetching OAuth 2.0 tokens from the local GCP CLI is possible if the GCP CLI is already configured with valid credentials (i.e. by running `gcloud auth application-default login`). We need to force the ZenML CLI to use the OAuth 2.0 token authentication by passing the `--auth-method oauth2-token` option, otherwise, it would automatically pick up long-term credentials: - - - -``` -zenml service-connector register gcp-oauth2-token --type gcp --auto-configure --auth-method oauth2-token -``` - -Example Command Output - - - -``` -Successfully registered service connector `gcp-oauth2-token` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┃ │ gs://zenml-project-time-series-bucket ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -``` -zenml service-connector describe gcp-oauth2-token -``` - -Example Command Output - - - -``` -Service connector 'gcp-oauth2-token' of type 'gcp' with id 'ec4d7d85-c71c-476b-aa76-95bf772c90da' is owned by user 'default' and is 'private'. 
- - 'gcp-oauth2-token' gcp Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ ec4d7d85-c71c-476b-aa76-95bf772c90da ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ gcp-oauth2-token ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔵 gcp ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ oauth2-token ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔵 gcp-generic, 📦 gcs-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 4694de65-997b-4929-8831-b49d5e067b97 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ 59m46s ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-05-19 09:04:33.557126 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-05-19 09:04:33.557127 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━┯━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠────────────┼────────────┨ - -┃ project_id │ zenml-core ┃ - -┠────────────┼────────────┨ - -┃ token │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━┷━━━━━━━━━━━━┛ -``` - -Note the temporary nature of the Service Connector. 
It will expire and become unusable in 1 hour: - - - -``` -zenml service-connector list --name gcp-oauth2-token -``` - -Example Command Output - - - -``` -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼──────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcp-oauth2-token │ ec4d7d85-c71c-476b-aa76-95bf772c90da │ 🔵 gcp │ 🔵 gcp-generic │ │ ➖ │ default │ 59m35s │ ┃ - -┃ │ │ │ │ 📦 gcs-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - - - -## Auto-configuration - -The GCP Service Connector allows [auto-discovering and fetching credentials](/develop/usage/resource-data-management/auth-management/service-connectors-guide#auto-configuration) and configuration [set up by the GCP CLI](https://cloud.google.com/sdk/gcloud) on your local host. - - -The following is an example of lifting GCP user credentials granting access to the same set of GCP resources and services that the local GCP CLI is allowed to access. The GCP CLI should already be configured with valid credentials (i.e. by running `gcloud auth application-default login`). In this case, the [GCP user account authentication method](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcp-user-account) is automatically detected: - - - -``` -zenml service-connector register gcp-auto --type gcp --auto-configure -``` - -Example Command Output - - - -``` -Successfully registered service connector `gcp-auto` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┃ │ gs://zenml-project-time-series-bucket ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -``` -zenml service-connector describe gcp-auto -``` - -Example Command Output - - - -``` -Service connector 'gcp-auto' of type 'gcp' with id 'fe16f141-7406-437e-a579-acebe618a293' is owned by user 'default' and is 'private'. 
- - 'gcp-auto' gcp Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ ID │ fe16f141-7406-437e-a579-acebe618a293 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ NAME │ gcp-auto ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ TYPE │ 🔵 gcp ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ AUTH METHOD │ user-account ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🔵 gcp-generic, 📦 gcs-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SECRET ID │ 5eca8f6e-291f-4958-ae2d-a3e847a1ad8a ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-05-19 09:15:12.882929 ┃ - -┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-05-19 09:15:12.882930 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────┼────────────┨ - -┃ project_id │ zenml-core ┃ - -┠───────────────────┼────────────┨ - -┃ user_account_json │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┛ -``` - - - -## Local client provisioning - -The local `gcloud` CLI, the Kubernetes `kubectl` CLI and the Docker CLI can be[ configured with credentials extracted from or generated by a compatible GCP Service Connector](/develop/usage/resource-data-management/auth-management/service-connectors-guide#configure-local-clients). Please note that unlike the configuration made possible through the GCP CLI, the Kubernetes and Docker credentials issued by the GCP Service Connector have a short lifetime and will need to be regularly refreshed. This is a byproduct of implementing a high-security profile. - - - Note that the `gcloud` local client can only be configured with credentials issued by the GCP Service Connector if the connector is configured with the [GCP user account authentication method](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcp-user-account) or the [GCP service account authentication method](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcp-service-account) and if the `generate_temporary_tokens` option is set to true in the Service Connector configuration. 
- - -Only the `gcloud` local [application default credentials](https://cloud.google.com/docs/authentication/application-default-credentials) configuration will be updated by the GCP Service Connector configuration. This makes it possible to use libraries and SDKs that use the application default credentials to access GCP resources. - -The following shows an example of configuring the local Kubernetes CLI to access a GKE cluster reachable through a GCP Service Connector: - -``` -zenml service-connector list --name gcp-user-account -``` - -Example Command Output - - - -``` -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼──────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcp-user-account │ ddbce93f-df14-4861-a8a4-99a80972f3bc │ 🔵 gcp │ 🔵 gcp-generic │ │ ➖ │ default │ │ ┃ - -┃ │ │ │ │ 📦 gcs-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - -The following lists all Kubernetes clusters accessible through the GCP Service Connector: - - - -``` -zenml service-connector verify gcp-user-account --resource-type kubernetes-cluster -``` - -Example Command Output - - - -``` -Service connector 'gcp-user-account' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛ -``` - -Calling the login CLI command will configure the local Kubernetes `kubectl` CLI to access the Kubernetes cluster through the GCP Service Connector: - - - -``` -zenml service-connector login gcp-user-account --resource-type kubernetes-cluster --resource-id zenml-test-cluster -``` - -Example Command Output - - - -``` -⠴ Attempting to configure local client using service connector 'gcp-user-account'... - -Context "gke_zenml-core_zenml-test-cluster" modified. - -Updated local kubeconfig with the cluster details. The current kubectl context was set to 'gke_zenml-core_zenml-test-cluster'. - -The 'gcp-user-account' Kubernetes Service Connector connector was used to successfully configure the local Kubernetes cluster client/SDK. 
-``` - -To verify that the local Kubernetes `kubectl` CLI is correctly configured, the following command can be used: - - - -``` -kubectl cluster-info -``` - -Example Command Output - - - -``` -Kubernetes control plane is running at https://35.185.95.223 - -GLBCDefaultBackend is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy - -KubeDNS is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy - -Metrics-server is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy -``` - -A similar process is possible with GCR container registries: - - - -``` -zenml service-connector verify gcp-user-account --resource-type docker-registry --resource-id europe-west1-docker.pkg.dev/zenml-core/test -``` - -Example Command Output - - - -``` -Service connector 'gcp-user-account' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────────┼─────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -``` -zenml service-connector login gcp-user-account --resource-type docker-registry --resource-id europe-west1-docker.pkg.dev/zenml-core/test -``` - -Example Command Output - - - -``` -⠦ Attempting to configure local client using service connector 'gcp-user-account'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'gcp-user-account' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. -``` - -To verify that the local Docker container registry client is correctly configured, the following command can be used: - - - -``` -docker push europe-west1-docker.pkg.dev/zenml-core/test/zenml -``` - -Example Command Output - - - -``` -The push refers to repository [europe-west1-docker.pkg.dev/zenml-core/test/zenml] - -d4aef4f5ed86: Pushed - -2d69a4ce1784: Pushed - -204066eca765: Pushed - -2da74ab7b0c1: Pushed - -75c35abda1d1: Layer already exists - -415ff8f0f676: Layer already exists - -c14cb5b1ec91: Layer already exists - -a1d005f5264e: Layer already exists - -3a3fd880aca3: Layer already exists - -149a9c50e18e: Layer already exists - -1f6d3424b922: Layer already exists - -8402c959ae6f: Layer already exists - -419599cb5288: Layer already exists - -8553b91047da: Layer already exists - -connectors: digest: sha256:a4cfb18a5cef5b2201759a42dd9fe8eb2f833b788e9d8a6ebde194765b42fe46 size: 3256 -``` - -It is also possible to update the local `gcloud` CLI configuration with credentials extracted from the GCP Service Connector: - - - -``` -zenml service-connector login gcp-user-account --resource-type gcp-generic -``` - -Example Command Output - - - -``` -Updated the local gcloud default application credentials file at '/home/user/.config/gcloud/application_default_credentials.json' - -The 'gcp-user-account' GCP Service Connector connector was used to successfully configure the local Generic GCP resource client/SDK. 
-``` - - - -## Stack Components use - -The[ GCS Artifact Store Stack Component](/develop/stack-components/artifact-stores/gcp) can be connected to a remote GCS bucket through a GCP Service Connector. - -The [Google Cloud Image Builder Stack Component](/develop/stack-components/image-builders/gcp), [VertexAI Orchestrator](/develop/stack-components/orchestrators/vertex), and [VertexAI Step Operator](/develop/stack-components/step-operators/vertex) can be connected and use the resources of a target GCP project through a GCP Service Connector. - -The GCP Service Connector can also be used with any Orchestrator or Model Deployer stack component flavor that relies on Kubernetes clusters to manage workloads. This allows GKE Kubernetes container workloads to be managed without the need to configure and maintain explicit GCP or Kubernetes `kubectl` configuration contexts and credentials in the target environment or in the Stack Component itself. - -Similarly, Container Registry Stack Components can be connected to a Google Artifact Registry or GCR Container Registry through a GCP Service Connector. This allows container images to be built and published to GAR or GCR container registries without the need to configure explicit GCP credentials in the target environment or the Stack Component. - -## End-to-end examples - - This is an example of an end-to-end workflow involving Service -Connectors that use a single multi-type GCP Service Connector to give access to multiple resources for multiple Stack Components. A complete ZenML Stack is registered and composed of the following Stack Components, all connected through the same Service Connector: - -* a [Kubernetes Orchestrator](/develop/stack-components/orchestrators/kubernetes) connected to a GKE Kubernetes cluster -* a [GCS Artifact Store](/develop/stack-components/artifact-stores/gcp) connected to a GCS bucket -* a [GCP Container Registry](/develop/stack-components/container-registries/gcp) connected to a Docker Google Artifact Registry -* a local [Image Builder](/develop/stack-components/image-builders/local) - -As a last step, a simple pipeline is run on the resulting Stack. - -1. Configure the local GCP CLI with valid user account credentials with a wide range of permissions (i.e. by running `gcloud auth application-default login`) and install ZenML integration prerequisites: - -``` -zenml integration install -y gcp -``` - -``` -gcloud auth application-default login -``` - -Example Command Output - - - -``` -```text - -Credentials saved to file: [/home/stefan/.config/gcloud/application_default_credentials.json] - -These credentials will be used by any library that requests Application Default Credentials (ADC). - -Quota project "zenml-core" was added to ADC which can be used by Google client libraries for billing - -and quota. Note that some services may still bill the project owning the resource. - -``` -``` - -1. 
Make sure the GCP Service Connector Type is available - -``` -zenml service-connector list-types --type gcp -``` - -Example Command Output - - - -``` -```text - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠───────────────────────┼────────┼───────────────────────┼─────────────────┼───────┼────────┨ - -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ - -┃ │ │ 🐳 docker-registry │ oauth2-token │ │ ┃ - -┃ │ │ │ impersonation │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ - -``` -``` - -1. Register a multi-type GCP Service Connector using auto-configuration - -``` -zenml service-connector register gcp-demo-multi --type gcp --auto-configure -``` - -Example Command Output - - - -``` -```text - -Successfully registered service connector `gcp-demo-multi` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -**NOTE**: from this point forward, we don't need the local GCP CLI credentials or the local GCP CLI at all. The steps that follow can be run on any machine regardless of whether it has been configured and authorized to access the GCP project. -``` - -4\. find out which GCS buckets, GAR registries, and GKE Kubernetes clusters we can gain access to. We'll use this information to configure the Stack Components in our minimal GCP stack: a GCS Artifact Store, a Kubernetes Orchestrator, and a GCP Container Registry. 
- - - -``` -```sh - -zenml service-connector list-resources --resource-type gcs-bucket - -``` - -``` - -Example Command Output - - - -``` -```text - -The following 'gcs-bucket' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────┼─────────────────────────────────────────────────┨ - -┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┃ │ │ │ │ gs://zenml-core.appspot.com ┃ - -┃ │ │ │ │ gs://zenml-core_cloudbuild ┃ - -┃ │ │ │ │ gs://zenml-datasets ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -```sh - -zenml service-connector list-resources --resource-type kubernetes-cluster - -``` - -``` - -Example Command Output - - - -``` -```text - -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼────────────────────┨ - -┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -```sh - -zenml service-connector list-resources --resource-type docker-registry - -``` - -``` - -Example Command Output - - - -``` -```text - -The following 'docker-registry' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼─────────────────────────────────────────────────┨ - -┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ │ │ │ us.gcr.io/zenml-core ┃ - -┃ │ │ │ │ eu.gcr.io/zenml-core ┃ - -┃ │ │ │ │ asia.gcr.io/zenml-core ┃ - -┃ │ │ │ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ │ │ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ │ │ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ │ │ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ │ │ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. register and connect a GCS Artifact Store Stack Component to a GCS bucket: - -``` -zenml artifact-store register gcs-zenml-bucket-sl --flavor gcp --path=gs://zenml-bucket-sl -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully registered artifact_store `gcs-zenml-bucket-sl`. 
- -``` -``` - - - -``` -```sh - -zenml artifact-store connect gcs-zenml-bucket-sl --connector gcp-demo-multi - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully connected artifact store `gcs-zenml-bucket-sl` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────┼──────────────────────┨ - -┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. register and connect a Kubernetes Orchestrator Stack Component to a GKE cluster: - -``` -zenml orchestrator register gke-zenml-test-cluster --flavor kubernetes --synchronous=true ---kubernetes_namespace=zenml-workloads -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully registered orchestrator `gke-zenml-test-cluster`. - -``` -``` - - - -``` -```sh - -zenml orchestrator connect gke-zenml-test-cluster --connector gcp-demo-multi - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully connected orchestrator `gke-zenml-test-cluster` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼────────────────────┨ - -┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. Register and connect a GCP Container Registry Stack Component to a GAR registry: - -``` -zenml container-registry register gcr-zenml-core --flavor gcp --uri=europe-west1-docker.pkg.dev/zenml-core/test -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully registered container_registry `gcr-zenml-core`. - -``` -``` - - - -``` -```sh - -zenml container-registry connect gcr-zenml-core --connector gcp-demo-multi - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully connected container registry `gcr-zenml-core` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼─────────────────────────────────────────────┨ - -┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 🐳 docker-registry │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. 
Combine all Stack Components together into a Stack and set it as active (also throw in a local Image Builder for completion): - -``` -zenml image-builder register local --flavor local -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully registered image_builder `local`. - -``` -``` - - - -``` -```sh - -zenml stack register gcp-demo -a gcs-zenml-bucket-sl -o gke-zenml-test-cluster -c gcr-zenml-core -i local --set - -``` - -``` - -Example Command Output - - - -``` -```text - -Stack 'gcp-demo' successfully registered! - -Active global stack set to:'gcp-demo' - -``` -``` - -1. Finally, run a simple pipeline to prove that everything works as expected. We'll use the simplest pipelines possible for this example: - -``` -from zenml import pipeline, step -@step -def step_1() -> str: - """Returns the `world` string.""" - return "world" -@step(enable_cache=False) -def step_2(input_one: str, input_two: str) -> None: - """Combines the two strings at its input and prints them.""" - combined_str = f"{input_one} {input_two}" - print(combined_str) -@pipeline -def my_pipeline(): - output_step_one = step_1() - step_2(input_one="hello", input_two=output_step_one) -if __name__ == "__main__": - my_pipeline() -``` -Saving that to a `run.py` file and running it gives us: - -Example Command Output - - - -``` -```text - -$ python run.py - -Building Docker image(s) for pipeline simple_pipeline. - -Building Docker image europe-west1-docker.pkg.dev/zenml-core/test/zenml:simple_pipeline-orchestrator. - -- Including integration requirements: gcsfs, google-cloud-aiplatform>=1.11.0, google-cloud-build>=3.11.0, google-cloud-container>=2.21.0, google-cloud-functions>=1.8.3, google-cloud-scheduler>=2.7.3, google-cloud-secret-manager, google-cloud-storage>=2.9.0, kfp==1.8.16, kubernetes==18.20.0, shapely<2.0 - -No .dockerignore found, including all files inside build context. - -Step 1/8 : FROM zenmldocker/zenml:0.39.1-py3.8 - -Step 2/8 : WORKDIR /app - -Step 3/8 : COPY .zenml_integration_requirements . - -Step 4/8 : RUN pip install --default-timeout=60 --no-cache-dir -r .zenml_integration_requirements - -Step 5/8 : ENV ZENML_ENABLE_REPO_INIT_WARNINGS=False - -Step 6/8 : ENV ZENML_CONFIG_PATH=/app/.zenconfig - -Step 7/8 : COPY . . - -Step 8/8 : RUN chmod -R a+rw . - -Pushing Docker image europe-west1-docker.pkg.dev/zenml-core/test/zenml:simple_pipeline-orchestrator. - -Finished pushing Docker image. - -Finished building Docker image(s). - -Running pipeline simple_pipeline on stack gcp-demo (caching disabled) - -Waiting for Kubernetes orchestrator pod... - -Kubernetes orchestrator pod started. - -Waiting for pod of step step_1 to start... - -Step step_1 has started. - -Step step_1 has finished in 1.357s. - -Pod of step step_1 completed. - -Waiting for pod of step simple_step_two to start... - -Step step_2 has started. - -Hello World! - -Step step_2 has finished in 3.136s. - -Pod of step step_2 completed. - -Orchestration pod completed. - -Dashboard URL: http://34.148.132.191/default/pipelines/cec118d1-d90a-44ec-8bd7-d978f726b7aa/runs - - -``` - - - - - -This is an example of an end-to-end workflow involving Service Connectors that use multiple single-instance GCP Service Connectors, each giving access to a resource for a Stack Component. 
A complete ZenML Stack is registered and composed of the following Stack Components, all connected through its individual Service Connector: - -* a [VertexAI Orchestrator](/develop/stack-components/orchestrators/vertex) connected to the GCP project -* a [GCS Artifact Store](/develop/stack-components/artifact-stores/gcp) connected to a GCS bucket -* a [GCP Container Registry](/develop/stack-components/container-registries/gcp) connected to a GCR container registry -* a [Google Cloud Image Builder](/develop/stack-components/image-builders/gcp) connected to the GCP project - -As a last step, a simple pipeline is run on the resulting Stack. - -1. Configure the local GCP CLI with valid user account credentials with a wide range of permissions (i.e. by running `gcloud auth application-default login`) and install ZenML integration prerequisites: - -``` -zenml integration install -y gcp -``` - -``` -gcloud auth application-default login -``` - -Example Command Output - - - -``` -```text - -Credentials saved to file: [/home/stefan/.config/gcloud/application_default_credentials.json] - -These credentials will be used by any library that requests Application Default Credentials (ADC). - -Quota project "zenml-core" was added to ADC which can be used by Google client libraries for billing - -and quota. Note that some services may still bill the project owning the resource. - -``` -``` - -1. Make sure the GCP Service Connector Type is available - -``` -zenml service-connector list-types --type gcp -``` - -Example Command Output - - - -``` -```text - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠───────────────────────┼────────┼───────────────────────┼─────────────────┼───────┼────────┨ - -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ - -┃ │ │ 🐳 docker-registry │ oauth2-token │ │ ┃ - -┃ │ │ │ impersonation │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ - -``` -``` - -1. 
Register an individual single-instance GCP Service Connector using auto-configuration for each of the resources that will be needed for the Stack Components: a GCS bucket, a GCR registry, and generic GCP access for the VertexAI orchestrator and another one for the GCP Cloud Builder: - -``` -zenml service-connector register gcs-zenml-bucket-sl --type gcp --resource-type gcs-bucket --resource-id gs://zenml-bucket-sl --auto-configure -``` - -Example Command Output - - - -``` -```text - -Successfully registered service connector `gcs-zenml-bucket-sl` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼──────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -```sh - -zenml service-connector register gcr-zenml-core --type gcp --resource-type docker-registry --auto-configure - -``` - -``` - -Example Command Output - - - -``` -```text - -Successfully registered service connector `gcr-zenml-core` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┃ │ us.gcr.io/zenml-core ┃ - -┃ │ eu.gcr.io/zenml-core ┃ - -┃ │ asia.gcr.io/zenml-core ┃ - -┃ │ asia-docker.pkg.dev/zenml-core/asia.gcr.io ┃ - -┃ │ europe-docker.pkg.dev/zenml-core/eu.gcr.io ┃ - -┃ │ europe-west1-docker.pkg.dev/zenml-core/test ┃ - -┃ │ us-docker.pkg.dev/zenml-core/gcr.io ┃ - -┃ │ us-docker.pkg.dev/zenml-core/us.gcr.io ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -```sh - -zenml service-connector register vertex-ai-zenml-core --type gcp --resource-type gcp-generic --auto-configure - -``` - -``` - -Example Command Output - - - -``` -```text - -Successfully registered service connector `vertex-ai-zenml-core` with access to the following resources: - -┏━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────┼────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┗━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -```sh - -zenml service-connector register gcp-cloud-builder-zenml-core --type gcp --resource-type gcp-generic --auto-configure - -``` - -``` - -Example Command Output - - - -``` -```text - -Successfully registered service connector `gcp-cloud-builder-zenml-core` with access to the following resources: - -┏━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠────────────────┼────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┗━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - -``` -``` - - - -``` -**NOTE**: from this point forward, we don't need the local GCP CLI credentials or the local GCP CLI at all. The steps that follow can be run on any machine regardless of whether it has been configured and authorized to access the GCP project. 
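Before connecting any Stack Components, it can be worth double-checking that each of the four connectors can still reach its resource now that the credentials are stored with the connectors rather than taken from the local GCP CLI. A minimal sketch, reusing the connector names registered above and assuming the `zenml service-connector verify` subcommand is available in your ZenML version:

```bash
# Re-check that every registered connector can still access its resource,
# without relying on the local gcloud credentials.
zenml service-connector verify gcs-zenml-bucket-sl
zenml service-connector verify gcr-zenml-core
zenml service-connector verify vertex-ai-zenml-core
zenml service-connector verify gcp-cloud-builder-zenml-core
```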
- -In the end, the service connector list should look like this: - -```sh - -zenml service-connector list - -``` - -``` - -Example Command Output - - - -``` -```text - -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼──────────────────────────────┼──────────────────────────────────────┼────────┼────────────────────┼──────────────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcs-zenml-bucket-sl │ 405034fe-5e6e-4d29-ba62-8ae025381d98 │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl │ ➖ │ default │ │ ┃ - -┠────────┼──────────────────────────────┼──────────────────────────────────────┼────────┼────────────────────┼──────────────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcr-zenml-core │ 9fddfaba-6d46-4806-ad96-9dcabef74639 │ 🔵 gcp │ 🐳 docker-registry │ gcr.io/zenml-core │ ➖ │ default │ │ ┃ - -┠────────┼──────────────────────────────┼──────────────────────────────────────┼────────┼────────────────────┼──────────────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ vertex-ai-zenml-core │ f97671b9-8c73-412b-bf5e-4b7c48596f5f │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core │ ➖ │ default │ │ ┃ - -┠────────┼──────────────────────────────┼──────────────────────────────────────┼────────┼────────────────────┼──────────────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcp-cloud-builder-zenml-core │ 648c1016-76e4-4498-8de7-808fd20f057b │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core │ ➖ │ default │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ - -``` -``` - -1. register and connect a GCS Artifact Store Stack Component to the GCS bucket: - -``` -zenml artifact-store register gcs-zenml-bucket-sl --flavor gcp --path=gs://zenml-bucket-sl -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully registered artifact_store `gcs-zenml-bucket-sl`. - -``` -``` - - - -``` -```sh - -zenml artifact-store connect gcs-zenml-bucket-sl --connector gcs-zenml-bucket-sl - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (global) - -Successfully connected artifact store `gcs-zenml-bucket-sl` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────────┼────────────────┼───────────────┼──────────────────────┨ - -┃ 405034fe-5e6e-4d29-ba62-8ae025381d98 │ gcs-zenml-bucket-sl │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. register and connect a Google Cloud Image Builder Stack Component to the target GCP project: - -``` -zenml image-builder register gcp-zenml-core --flavor gcp -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully registered image_builder `gcp-zenml-core`. 
- -``` -``` - - - -``` -```sh - -zenml image-builder connect gcp-zenml-core --connector gcp-cloud-builder-zenml-core - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully connected image builder `gcp-zenml-core` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼──────────────────────────────┼────────────────┼────────────────┼────────────────┨ - -┃ 648c1016-76e4-4498-8de7-808fd20f057b │ gcp-cloud-builder-zenml-core │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. register and connect a Vertex AI Orchestrator Stack Component to the target GCP project -**NOTE**: If we do not specify a workload service account, the Vertex AI Pipelines Orchestrator uses the Compute Engine default service account in the target project to run pipelines. You must grant this account the Vertex AI Service Agent role, otherwise the pipelines will fail. More information on other configurations possible for the Vertex AI Orchestrator can be found [here](/develop/stack-components/orchestrators/vertex#how-to-use-it). - -``` -zenml orchestrator register vertex-ai-zenml-core --flavor=vertex --location=europe-west1 --synchronous=true -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully registered orchestrator `vertex-ai-zenml-core`. - -``` -``` - - - -``` -```sh - -zenml orchestrator connect vertex-ai-zenml-core --connector vertex-ai-zenml-core - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully connected orchestrator `vertex-ai-zenml-core` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼──────────────────────┼────────────────┼────────────────┼────────────────┨ - -┃ f97671b9-8c73-412b-bf5e-4b7c48596f5f │ vertex-ai-zenml-core │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. Register and connect a GCP Container Registry Stack Component to a GCR container registry: - -``` -zenml container-registry register gcr-zenml-core --flavor gcp --uri=gcr.io/zenml-core -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully registered container_registry `gcr-zenml-core`. 
- -``` -``` - - - -``` -```sh - -zenml container-registry connect gcr-zenml-core --connector gcr-zenml-core - -``` - -``` - -Example Command Output - - - -``` -```text - -Running with active stack: 'default' (repository) - -Successfully connected container registry `gcr-zenml-core` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼───────────────────┨ - -┃ 9fddfaba-6d46-4806-ad96-9dcabef74639 │ gcr-zenml-core │ 🔵 gcp │ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┛ - -``` -``` - -1. Combine all Stack Components together into a Stack and set it as active: - -``` -zenml stack register gcp-demo -a gcs-zenml-bucket-sl -o vertex-ai-zenml-core -c gcr-zenml-core -i gcp-zenml-core --set -``` - -Example Command Output - - - -``` -```text - -Stack 'gcp-demo' successfully registered! - -Active repository stack set to:'gcp-demo' - -``` -``` - -1. Finally, run a simple pipeline to prove that everything works as expected. We'll use the simplest pipelines possible for this example: - -``` -from zenml import pipeline, step -@step -def step_1() -> str: - """Returns the `world` string.""" - return "world" -@step(enable_cache=False) -def step_2(input_one: str, input_two: str) -> None: - """Combines the two strings at its input and prints them.""" - combined_str = f"{input_one} {input_two}" - print(combined_str) -@pipeline -def my_pipeline(): - output_step_one = step_1() - step_2(input_one="hello", input_two=output_step_one) -if __name__ == "__main__": - my_pipeline() -``` -Saving that to a `run.py` file and running it gives us: - -Example Command Output - - - -``` -```text - -$ python run.py - -Building Docker image(s) for pipeline simple_pipeline. - -Building Docker image gcr.io/zenml-core/zenml:simple_pipeline-orchestrator. - -- Including integration requirements: gcsfs, google-cloud-aiplatform>=1.11.0, google-cloud-build>=3.11.0, google-cloud-container>=2.21.0, google-cloud-functions>=1.8.3, google-cloud-scheduler>=2.7.3, google-cloud-secret-manager, google-cloud-storage>=2.9.0, kfp==1.8.16, shapely<2.0 - -Using Cloud Build to build image gcr.io/zenml-core/zenml:simple_pipeline-orchestrator - -No .dockerignore found, including all files inside build context. - -Uploading build context to gs://zenml-bucket-sl/cloud-build-contexts/5dda6dbb60e036398bee4974cfe3eb768a138b2e.tar.gz. - -Build context located in bucket zenml-bucket-sl and object path cloud-build-contexts/5dda6dbb60e036398bee4974cfe3eb768a138b2e.tar.gz - -Using Cloud Builder image gcr.io/cloud-builders/docker to run the steps in the build. Container will be attached to network using option --network=cloudbuild. - -Running Cloud Build to build the Docker image. Cloud Build logs: https://console.cloud.google.com/cloud-build/builds/068e77a1-4e6f-427a-bf94-49c52270af7a?project=20219041791 - -The Docker image has been built successfully. More information can be found in the Cloud Build logs: https://console.cloud.google.com/cloud-build/builds/068e77a1-4e6f-427a-bf94-49c52270af7a?project=20219041791. - -Finished building Docker image(s). 
- -Running pipeline simple_pipeline on stack gcp-demo (caching disabled) - -The attribute pipeline_root has not been set in the orchestrator configuration. One has been generated automatically based on the path of the GCPArtifactStore artifact store in the stack used to execute the pipeline. The generated pipeline_root is gs://zenml-bucket-sl/vertex_pipeline_root/simple_pipeline/simple_pipeline_default_6e72f3e1. - -/home/stefan/aspyre/src/zenml/.venv/lib/python3.8/site-packages/kfp/v2/compiler/compiler.py:1290: FutureWarning: APIs imported from the v1 namespace (e.g. kfp.dsl, kfp.components, etc) will not be supported by the v2 compiler since v2.0.0 - - warnings.warn( - -Writing Vertex workflow definition to /home/stefan/.config/zenml/vertex/8a0b53ee-644a-4fbe-8e91-d4d6ddf79ae8/pipelines/simple_pipeline_default_6e72f3e1.json. - -No schedule detected. Creating one-off vertex job... - -Submitting pipeline job with job_id simple-pipeline-default-6e72f3e1 to Vertex AI Pipelines service. - -The Vertex AI Pipelines job workload will be executed using the connectors-vertex-ai-workload@zenml-core.iam.gserviceaccount.com service account. - -Creating PipelineJob - -INFO:google.cloud.aiplatform.pipeline_jobs:Creating PipelineJob - -PipelineJob created. Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 - -INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob created. Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 - -To use this PipelineJob in another session: - -INFO:google.cloud.aiplatform.pipeline_jobs:To use this PipelineJob in another session: - -pipeline_job = aiplatform.PipelineJob.get('projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1') - -INFO:google.cloud.aiplatform.pipeline_jobs:pipeline_job = aiplatform.PipelineJob.get('projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1') - -View Pipeline Job: - -https://console.cloud.google.com/vertex-ai/locations/europe-west1/pipelines/runs/simple-pipeline-default-6e72f3e1?project=20219041791 - -INFO:google.cloud.aiplatform.pipeline_jobs:View Pipeline Job: - -https://console.cloud.google.com/vertex-ai/locations/europe-west1/pipelines/runs/simple-pipeline-default-6e72f3e1?project=20219041791 - -View the Vertex AI Pipelines job at https://console.cloud.google.com/vertex-ai/locations/europe-west1/pipelines/runs/simple-pipeline-default-6e72f3e1?project=20219041791 - -Waiting for the Vertex AI Pipelines job to finish... - -PipelineJob projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 current state: - -PipelineState.PIPELINE_STATE_RUNNING - -INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 current state: - -PipelineState.PIPELINE_STATE_RUNNING - -... - -PipelineJob run completed. Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 - -INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob run completed. 
Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 - -Dashboard URL: https://34.148.132.191/default/pipelines/17cac6b5-3071-45fa-a2ef-cda4a7965039/runs - -``` - - diff --git a/develop/usage/resource-data-management/auth-management/hyperai-service-connector.mdx b/develop/usage/resource-data-management/auth-management/hyperai-service-connector.mdx deleted file mode 100644 index 740092a..0000000 --- a/develop/usage/resource-data-management/auth-management/hyperai-service-connector.mdx +++ /dev/null @@ -1,60 +0,0 @@ ---- -title: "HyperAI Service Connector" -description: "Configuring HyperAI Connectors to connect ZenML to HyperAI instances." -icon: link ---- - -The ZenML HyperAI Service Connector allows authenticating with a HyperAI instance for deployment of pipeline runs. This connector provides pre-authenticated Paramiko SSH clients to Stack Components that are linked to it. - -```bash -$ zenml service-connector list-types --type hyperai -``` - -``` -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ -┠───────────────────────────┼────────────┼────────────────────┼──────────────┼───────┼────────┨ -┃ HyperAI Service Connector │ 🤖 hyperai │ 🤖 hyperai-instance │ rsa-key │ ✅ │ ✅ ┃ -┃ │ │ │ dsa-key │ │ ┃ -┃ │ │ │ ecdsa-key │ │ ┃ -┃ │ │ │ ed25519-key │ │ ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -## Prerequisites - -The HyperAI Service Connector is part of the HyperAI integration. It is necessary to install the integration in order to use this Service Connector: - -* `zenml integration install hyperai` installs the HyperAI integration - -## Resource Types - -The HyperAI Service Connector supports HyperAI instances. - -## Authentication Methods - -ZenML creates an SSH connection to the HyperAI instance in the background when using this Service Connector. It then provides these connections to stack components requiring them, such as the HyperAI Orchestrator. Multiple authentication methods are supported: - -1. RSA key based authentication. -2. DSA (DSS) key based authentication. -3. ECDSA key based authentication. -4. ED25519 key based authentication. - - - SSH private keys configured in the connector will be distributed to all clients that use them to run pipelines with the HyperAI orchestrator. SSH keys are long-lived credentials that give unrestricted access to HyperAI instances. - - -When configuring the Service Connector, it is required to provide at least one hostname via `hostnames` and the `username` with which to login. Optionally, it is possible to provide an `ssh_passphrase` if applicable. This way, it is possible to use the HyperAI service connector in multiple ways: - -1. Create one service connector per HyperAI instance with different SSH keys. -2. Configure a reused SSH key just once for multiple HyperAI instances, then select the individual instance when creating the HyperAI orchestrator component. - -## Auto-configuration - - - This Service Connector does not support auto-discovery and extraction of authentication credentials from HyperAI instances. If this feature is useful to you or your organization, please let us know by messaging us in [Slack](https://zenml.io/slack) or [creating an issue on GitHub](https://github.com/zenml-io/zenml/issues). 
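Until then, HyperAI Service Connectors have to be registered with explicitly provided configuration. The following is only a sketch: the hostname, username, and passphrase values are placeholders, and the exact attribute used to pass the SSH private key may differ between ZenML versions, so check the connector type documentation first:

```bash
# Show the configuration attributes expected for RSA key authentication.
zenml service-connector describe-type hyperai --auth-method rsa-key

# Register a connector for a single (hypothetical) HyperAI instance.
# The private key itself is supplied via an additional attribute shown in
# the describe-type output above.
zenml service-connector register hyperai-demo \
    --type hyperai \
    --auth-method rsa-key \
    --hostnames=1.2.3.4 \
    --username=ubuntu \
    --ssh_passphrase=<passphrase-if-any>
```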
- - -## Stack Components use - -The HyperAI Service Connector can be used by the HyperAI Orchestrator to deploy pipeline runs to HyperAI instances. diff --git a/develop/usage/resource-data-management/auth-management/kubernetes-service-connector.mdx b/develop/usage/resource-data-management/auth-management/kubernetes-service-connector.mdx deleted file mode 100644 index 08f58b2..0000000 --- a/develop/usage/resource-data-management/auth-management/kubernetes-service-connector.mdx +++ /dev/null @@ -1,201 +0,0 @@ ---- -title: "Kubernetes Service Connector" -description: "Configuring Kubernetes Service Connectors to connect ZenML to Kubernetes clusters." -icon: gears ---- - -The ZenML Kubernetes service connector facilitates authenticating and connecting to a Kubernetes cluster. The connector can be used to access to any generic Kubernetes cluster by providing pre-authenticated Kubernetes python clients to Stack Components that are linked to it and also allows configuring the local Kubernetes CLI (i.e. `kubectl`). - -## Prerequisites - -The Kubernetes Service Connector is part of the Kubernetes ZenML integration. You can either install the entire integration or use a pypi extra to install it independently of the integration: - -* `pip install "zenml[connectors-kubernetes]"` installs only prerequisites for the Kubernetes Service Connector Type -* `zenml integration install kubernetes` installs the entire Kubernetes ZenML integration - -A local Kubernetes CLI (i.e. `kubectl` ) and setting up local `kubectl` configuration contexts is not required to access Kubernetes clusters in your Stack Components through the Kubernetes Service Connector. - -```bash -$ zenml service-connector list-types --type kubernetes -``` - -```bash -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ -┠──────────────────────────────┼───────────────┼───────────────────────┼──────────────┼───────┼────────┨ -┃ Kubernetes Service Connector │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ password │ ✅ │ ✅ ┃ -┃ │ │ │ token │ │ ┃ -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -## Resource Types - -The Kubernetes Service Connector only supports authenticating to and granting access to a generic Kubernetes cluster. This type of resource is identified by the `kubernetes-cluster` Resource Type. - -The resource name is a user-friendly cluster name configured during registration. - -## Authentication Methods - -Two authentication methods are supported: - -1. username and password. This is not recommended for production purposes. -2. authentication token with or without client certificates. - -For Kubernetes clusters that use neither username and password nor authentication tokens, such as local K3D clusters, the authentication token method can be used with an empty token. - - - This Service Connector does not support generating short-lived credentials from the credentials configured in the Service Connector. In effect, this means that the configured credentials will be distributed directly to clients and used to authenticate to the target Kubernetes API. It is recommended therefore to use API tokens accompanied by client certificates if possible. - - -## Auto-configuration - -The Kubernetes Service Connector allows fetching credentials from the local Kubernetes CLI (i.e. `kubectl`) during registration. 
The current Kubernetes kubectl configuration context is used for this purpose. The following is an example of lifting Kubernetes credentials granting access to a GKE cluster: - - - -```bash -zenml service-connector register kube-auto --type kubernetes --auto-configure -``` - -```bash Example Command Output -Successfully registered service connector `kube-auto` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼────────────────┨ - -┃ 🌀 kubernetes-cluster │ 35.185.95.223 ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - - - -```bash -zenml service-connector describe kube-auto -``` - -```bash Example Command Output -Service connector 'kube-auto' of type 'kubernetes' with id '4315e8eb-fcbd-4938-a4d7-a9218ab372a1' is owned by user 'default' and is 'private'. - - 'kube-auto' kubernetes Service Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ ID │ 4315e8eb-fcbd-4938-a4d7-a9218ab372a1 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ NAME │ kube-auto ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ TYPE │ 🌀 kubernetes ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ AUTH METHOD │ token ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE TYPES │ 🌀 kubernetes-cluster ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ RESOURCE NAME │ 35.175.95.223 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SECRET ID │ a833e86d-b845-4584-9656-4b041335e299 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ OWNER │ default ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ SHARED │ ➖ ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ CREATED_AT │ 2023-05-16 21:45:33.224740 ┃ - -┠──────────────────┼──────────────────────────────────────┨ - -┃ UPDATED_AT │ 2023-05-16 21:45:33.224743 ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────────┼───────────────────────┨ - -┃ server │ https://35.175.95.223 ┃ - -┠───────────────────────┼───────────────────────┨ - -┃ insecure │ False ┃ - -┠───────────────────────┼───────────────────────┨ - -┃ cluster_name │ 35.175.95.223 ┃ - -┠───────────────────────┼───────────────────────┨ - -┃ token │ [HIDDEN] ┃ - -┠───────────────────────┼───────────────────────┨ - -┃ certificate_authority │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - Credentials auto-discovered and lifted through the Kubernetes Service Connector might have a limited lifetime, especially if the target Kubernetes cluster is managed through a 3rd party authentication provider such a GCP or AWS. Using short-lived credentials with your Service Connectors could lead to loss of connectivity and other unexpected errors in your pipeline. - - -## Local client provisioning - -This Service Connector allows configuring the local Kubernetes client (i.e. 
`kubectl`) with credentials: - -```bash -zenml service-connector login kube-auto -``` - -```bash Example Command Output - -⠦ Attempting to configure local client using service connector 'kube-auto'... - -Cluster "35.185.95.223" set. - -⠇ Attempting to configure local client using service connector 'kube-auto'... - -⠏ Attempting to configure local client using service connector 'kube-auto'... - -Updated local kubeconfig with the cluster details. The current kubectl context was set to '35.185.95.223'. - -The 'kube-auto' Kubernetes Service Connector connector was used to successfully configure the local Kubernetes cluster client/SDK. -``` - -## Stack Components use - -The Kubernetes Service Connector can be used in Orchestrator and Model Deployer stack component flavors that rely on Kubernetes clusters to manage their workloads. This allows Kubernetes container workloads to be managed without the need to configure and maintain explicit Kubernetes `kubectl` configuration contexts and credentials in the target environment and in the Stack Component. diff --git a/develop/usage/resource-data-management/auth-management/service-connectors-guide.mdx b/develop/usage/resource-data-management/auth-management/service-connectors-guide.mdx deleted file mode 100644 index 3991c05..0000000 --- a/develop/usage/resource-data-management/auth-management/service-connectors-guide.mdx +++ /dev/null @@ -1,2503 +0,0 @@ ---- -title: "Service Connectors guide" -description: "The complete guide to managing Service Connectors and connecting ZenML to external resources." -icon: gear ---- - -This documentation section contains everything that you need to use Service Connectors to connect ZenML to external resources. A lot of information is covered, so it might be useful to use the following guide to navigate it: - -* if you're only getting started with Service Connectors, we suggest starting by familiarizing yourself with the [terminology](/develop/usage/resource-data-management/auth-management/service-connectors-guide#terminology). -* check out the section on [Service Connector Types](/develop/usage/resource-data-management/auth-management/service-connectors-guide#cloud-provider-service-connector-types) to understand the different Service Connector implementations that are available and when to use them. -* jumping straight to the sections on [Registering Service Connectors](/develop/usage/resource-data-management/auth-management/service-connectors-guide#register-service-connectors) can get you set up quickly if you are only looking for a quick way to evaluate Service Connectors and their features. -* if all you need to do is connect a ZenML Stack Component to an external resource or service like a Kubernetes cluster, a Docker container registry, or an object storage bucket, and you already have some Service Connectors available, the section on [connecting Stack Components to resources](/develop/usage/resource-data-management/auth-management/service-connectors-guide#connect-stack-components-to-resources) is all you need. - -In addition to this guide, there is an entire section dedicated to [best security practices concerning the various authentication methods](/develop/usage/resource-data-management/auth-management/best-security-practices) implemented by Service Connectors, such as which types of credentials to use in development or production and how to keep your security information safe. 
That section is particularly targeted at engineers with some knowledge of infrastructure, but it should be accessible to larger audiences. - -## Terminology - -As with any high-level abstraction, some terminology is needed to express the concepts and operations involved. In spite of the fact that Service Connectors cover such a large area of application as authentication and authorization for a variety of resources from a range of different vendors, we managed to keep this abstraction clean and simple. In the following expandable sections, you'll learn more about Service Connector Types, Resource Types, Resource Names, and Service Connectors. - - -This term is used to represent and identify a particular Service Connector implementation and answer questions about its capabilities such as "what types of resources does this Service Connector give me access to", "what authentication methods does it support" and "what credentials and other information do I need to configure for it". This is analogous to the role Flavors play for Stack Components in that the Service Connector Type acts as the template from which one or more Service Connectors are created. - -For example, the built-in AWS Service Connector Type shipped with ZenML supports a rich variety of authentication methods and provides access to AWS resources such as S3 buckets, EKS clusters and ECR registries. - -The `zenml service-connector list-types` and `zenml service-connector describe-type` CLI commands can be used to explore the Service Connector Types available with your ZenML deployment. Extensive documentation is included covering supported authentication methods and Resource Types. The following are just some examples: - -```bash -zenml service-connector list-types -``` - -```bash Example Command Output -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Kubernetes Service Connector │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ password │ ✅ │ ✅ ┃ - -┃ │ │ │ token │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Docker Service Connector │ 🐳 docker │ 🐳 docker-registry │ password │ ✅ │ ✅ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Azure Service Connector │ 🇦 azure │ 🇦 azure-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 blob-container │ service-principal │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ access-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ AWS Service Connector │ 🔶 aws │ 🔶 aws-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 s3-bucket │ secret-key │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ sts-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ iam-role │ │ ┃ - -┃ │ │ │ session-token │ │ ┃ - -┃ │ │ │ federation-token │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ - -┃ │ │ 🐳 docker-registry │ oauth2-token │ │ ┃ - -┃ │ │ │ impersonation │ │ ┃ - 
-┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -``` -zenml service-connector describe-type aws -``` - -```Example Command Output -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🔶 AWS Service Connector (connector type: aws) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 secret-key - - • 🔒 sts-token - - • 🔒 iam-role - - • 🔒 session-token - - • 🔒 federation-token - - -Resource types: - - - • 🔶 aws-generic - - • 📦 s3-bucket - - • 🌀 kubernetes-cluster - - • 🐳 docker-registry - - -Supports auto-configuration: True - - -Available locally: True - - -Available remotely: False - - -The ZenML AWS Service Connector facilitates the authentication and access to - -managed AWS services and resources. These encompass a range of resources, - -including S3 buckets, ECR repositories, and EKS clusters. The connector provides - -support for various authentication methods, including explicit long-lived AWS - -secret keys, IAM roles, short-lived STS tokens and implicit authentication. - - -To ensure heightened security measures, this connector also enables the - -generation of temporary STS security tokens that are scoped down to the minimum - -permissions necessary for accessing the intended resource. Furthermore, it - -includes automatic configuration and detection of credentials locally configured - -through the AWS CLI. - - -This connector serves as a general means of accessing any AWS service by issuing - -pre-authenticated boto3 sessions to clients. Additionally, the connector can - -handle specialized authentication for S3, Docker and Kubernetes Python clients. - -It also allows for the configuration of local Docker and Kubernetes CLIs. - - -The AWS Service Connector is part of the AWS ZenML integration. You can either - -install the entire integration or use a pypi extra to install it independently - -of the integration: - - - • pip install "zenml[connectors-aws]" installs only prerequisites for the AWS - - Service Connector Type - - • zenml integration install aws installs the entire AWS ZenML integration - - -It is not required to install and set up the AWS CLI on your local machine to - -use the AWS Service Connector to link Stack Components to AWS resources and - -services. However, it is recommended to do so if you are looking for a quick - -setup that includes using the auto-configuration Service Connector features. - - -──────────────────────────────────────────────────────────────────────────────── -``` -``` -zenml service-connector describe-type aws --resource-type kubernetes-cluster -``` - - - -```bash Example Command Output - -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🌀 AWS EKS Kubernetes cluster (resource type: kubernetes-cluster) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Authentication methods: implicit, secret-key, sts-token, iam-role, - -session-token, federation-token - - -Supports resource instances: True - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 secret-key - - • 🔒 sts-token - - • 🔒 iam-role - - • 🔒 session-token - - • 🔒 federation-token - - -Allows users to access an EKS cluster as a standard Kubernetes cluster resource. - -When used by Stack Components, they are provided a pre-authenticated - -python-kubernetes client instance. 
- - -The configured credentials must have at least the following AWS IAM permissions - -associated with the ARNs of EKS clusters that the connector will be allowed to - -access (e.g. arn:aws:eks:{region}:{account}:cluster/* represents all the EKS - -clusters available in the target AWS region). - - - • eks:ListClusters - - • eks:DescribeCluster - - -In addition to the above permissions, if the credentials are not associated with - -the same IAM user or role that created the EKS cluster, the IAM principal must - -be manually added to the EKS cluster's aws-auth ConfigMap, otherwise the - -Kubernetes client will not be allowed to access the cluster's resources. This - -makes it more challenging to use the AWS Implicit and AWS Federation Token - -authentication methods for this resource. For more information, see this - -documentation. - - -If set, the resource name must identify an EKS cluster using one of the - -following formats: - - - • EKS cluster name (canonical resource name): {cluster-name} - - • EKS cluster ARN: arn:aws:eks:{region}:{account}:cluster/{cluster-name} - - -EKS cluster names are region scoped. The connector can only be used to access - -EKS clusters in the AWS region that it is configured to use. - - -──────────────────────────────────────────────────────────────────────────────── -``` - -```bash -zenml service-connector describe-type aws --auth-method secret-key -``` - -```bash Example Command Output - -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🔒 AWS Secret Key (auth method: secret-key) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Supports issuing temporary credentials: False - - -Long-lived AWS credentials consisting of an AWS access key ID and secret access - -key associated with an AWS IAM user or AWS account root user (not recommended). - - -This method is preferred during development and testing due to its simplicity - -and ease of use. It is not recommended as a direct authentication method for - -production use cases because the clients have direct access to long-lived - -credentials and are granted the full set of permissions of the IAM user or AWS - -account root user associated with the credentials. For production, it is - -recommended to use the AWS IAM Role, AWS Session Token or AWS Federation Token - -authentication method instead. - - -An AWS region is required and the connector may only be used to access AWS - -resources in the specified region. - - -If you already have the local AWS CLI set up with these credentials, they will - -be automatically picked up when auto-configuration is used. - - -Attributes: - - - • aws_access_key_id {string, secret, required}: AWS Access Key ID - - • aws_secret_access_key {string, secret, required}: AWS Secret Access Key - - • region {string, required}: AWS Region - - • endpoint_url {string, optional}: AWS Endpoint URL - - -──────────────────────────────────────────────────────────────────────────────── -``` - - -Resource Types are a way of organizing resources into logical, well-known classes based on the standard and/or protocol used to access them, or simply based on their vendor. This creates a unified language that can be used to declare the types of resources that are provided by Service Connectors on one hand and the types of resources that are required by Stack Components on the other hand. 
- -For example, we use the generic `kubernetes-cluster` resource type to refer to any and all Kubernetes clusters, since they are all generally accessible using the same standard libraries, clients and API regardless of whether they are Amazon EKS, Google GKE, Azure AKS or another flavor of managed or self-hosted deployment. Similarly, there is a generic `docker-registry` resource type that covers any and all container registries that implement the Docker/OCI interface, be it DockerHub, Amazon ECR, Google GCR, Azure ACR, K3D or something similar. Stack Components that need to connect to a Kubernetes cluster (e.g. the Kubernetes Orchestrator or the Seldon Model Deployer) can use the `kubernetes-cluster` resource type identifier to describe their resource requirements and remain agnostic of their vendor. - -The term Resource Type is used in ZenML everywhere resources accessible through Service Connectors are involved. For example, to list all Service Connector Types that can be used to broker access to Kubernetes Clusters, you can pass the `--resource-type` flag to the CLI command: - -```bash -zenml service-connector list-types --resource-type kubernetes-cluster -``` - -```bash Example Command Output - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Kubernetes Service Connector │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ password │ ✅ │ ✅ ┃ - -┃ │ │ │ token │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Azure Service Connector │ 🇦 azure │ 🇦 azure-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 blob-container │ service-principal │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ access-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ AWS Service Connector │ 🔶 aws │ 🔶 aws-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 s3-bucket │ secret-key │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ sts-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ iam-role │ │ ┃ - -┃ │ │ │ session-token │ │ ┃ - -┃ │ │ │ federation-token │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ - -┃ │ │ 🐳 docker-registry │ oauth2-token │ │ ┃ - -┃ │ │ │ impersonation │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -From the above, you can see that there are not one but four Service Connector Types that can connect ZenML to Kubernetes clusters. The first one is a generic implementation that can be used with any standard Kubernetes cluster, including those that run on-premise. The other three deal exclusively with Kubernetes services managed by the AWS, GCP and Azure cloud providers. 
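Whichever implementation ends up backing a connector, the same Resource Type scoping also applies when checking what an already registered connector can reach. A small sketch, assuming a previously registered multi-type connector named `aws-iam-multi-eu` and that your ZenML version supports the `verify` subcommand with a `--resource-type` filter:

```bash
# Check which Kubernetes clusters this connector can actually broker access to.
zenml service-connector verify aws-iam-multi-eu --resource-type kubernetes-cluster
```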
- -Conversely, to list all currently registered Service Connector instances that provide access to Kubernetes clusters, one might run: - -```bash -zenml service-connector list --resource_type kubernetes-cluster -``` - -```bash Example Command Output -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼───────────────────────┼──────────────────────────────┼───────────────┼───────────────────────┼──────────────────────────────┼────────┼─────────┼────────────┼─────────────────────┨ - -┃ │ aws-iam-multi-eu │ e33c9fac-5daa-48b2-87bb-0187 │ 🔶 aws │ 🔶 aws-generic │ │ ➖ │ default │ │ region:eu-central-1 ┃ - -┃ │ │ d3782cde │ │ 📦 s3-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┠────────┼───────────────────────┼──────────────────────────────┼───────────────┼───────────────────────┼──────────────────────────────┼────────┼─────────┼────────────┼─────────────────────┨ - -┃ │ aws-iam-multi-us │ ed528d5a-d6cb-4fc4-bc52-c3d2 │ 🔶 aws │ 🔶 aws-generic │ │ ➖ │ default │ │ region:us-east-1 ┃ - -┃ │ │ d01643e5 │ │ 📦 s3-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┠────────┼───────────────────────┼──────────────────────────────┼───────────────┼───────────────────────┼──────────────────────────────┼────────┼─────────┼────────────┼─────────────────────┨ - -┃ │ kube-auto │ da497715-7502-4cdd-81ed-289e │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ A5F8F4142FB12DDCDE9F21F6E9B0 │ ➖ │ default │ │ ┃ - -┃ │ │ 70664597 │ │ │ 7A18.gr7.us-east-1.eks.amazo │ │ │ │ ┃ - -┃ │ │ │ │ │ naws.com │ │ │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - If a Resource Type is used to identify a class of resources, we also need some way to uniquely identify each resource instance belonging to that class that a Service Connector can provide access to. For example, an AWS Service Connector can be configured to provide access to multiple S3 buckets identifiable by their bucket names or their `s3://bucket-name` formatted URIs. Similarly, an AWS Service Connector can be configured to provide access to multiple EKS Kubernetes clusters in the same AWS region, each uniquely identifiable by their EKS cluster name. This is what we call Resource Names. - -Resource Names make it generally easy to identify a particular resource instance accessible through a Service Connector, especially when used together with the Service Connector name and the Resource Type. The following ZenML CLI command output shows a few examples featuring Resource Names for S3 buckets, EKS clusters, ECR registries and general Kubernetes clusters. 
As you can see, the way we name resources varies from implementation to implementation and resource type to resource type: - - -```bash -zenml service-connector list-resources -``` - -```bash Example Command Output - -The following resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼──────────────────────────────────────────────────────────────────┨ - -┃ 8d307b98-f125-4d7a-b5d5-924c07ba04bb │ aws-session-docker │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼──────────────────────────────────────────────────────────────────┨ - -┃ d1e5ecf5-1531-4507-bbf5-be0a114907a5 │ aws-session-s3 │ 🔶 aws │ 📦 s3-bucket │ s3://public-flavor-logos ┃ - -┃ │ │ │ │ s3://sagemaker-us-east-1-715803424590 ┃ - -┃ │ │ │ │ s3://spark-artifact-store ┃ - -┃ │ │ │ │ s3://spark-demo-as ┃ - -┃ │ │ │ │ s3://spark-demo-dataset ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼──────────────────────────────────────────────────────────────────┨ - -┃ d2341762-28a3-4dfc-98b9-1ae9aaa93228 │ aws-key-docker-eu │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.eu-central-1.amazonaws.com ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼──────────────────────────────────────────────────────────────────┨ - -┃ 0658a465-2921-4d6b-a495-2dc078036037 │ aws-key-kube-zenhacks │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼──────────────────────────────────────────────────────────────────┨ - -┃ 049e7f5e-e14c-42b7-93d4-a273ef414e66 │ eks-eu-central-1 │ 🔶 aws │ 🌀 kubernetes-cluster │ kubeflowmultitenant ┃ - -┃ │ │ │ │ zenbox ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼──────────────────────────────────────────────────────────────────┨ - -┃ b551f3ae-1448-4f36-97a2-52ce303f20c9 │ kube-auto │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ A5F8F4142FB12DDCDE9F21F6E9B07A18.gr7.us-east-1.eks.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -Every Service Connector Type defines its own rules for how Resource Names are formatted. These rules are documented in the section belonging each resource type. 
For example: - -```bash -zenml service-connector describe-type aws --resource-type docker-registry -``` - -```bash Example Command Output - -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🐳 AWS ECR container registry (resource type: docker-registry) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Authentication methods: implicit, secret-key, sts-token, iam-role, - -session-token, federation-token - - -Supports resource instances: False - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 secret-key - - • 🔒 sts-token - - • 🔒 iam-role - - • 🔒 session-token - - • 🔒 federation-token - - -Allows users to access one or more ECR repositories as a standard Docker - -registry resource. When used by Stack Components, they are provided a - -pre-authenticated python-docker client instance. - - -The configured credentials must have at least the following AWS IAM permissions - -associated with the ARNs of one or more ECR repositories that the connector will - -be allowed to access (e.g. arn:aws:ecr:{region}:{account}:repository/* - -represents all the ECR repositories available in the target AWS region). - - - • ecr:DescribeRegistry - - • ecr:DescribeRepositories - - • ecr:ListRepositories - - • ecr:BatchGetImage - - • ecr:DescribeImages - - • ecr:BatchCheckLayerAvailability - - • ecr:GetDownloadUrlForLayer - - • ecr:InitiateLayerUpload - - • ecr:UploadLayerPart - - • ecr:CompleteLayerUpload - - • ecr:PutImage - - • ecr:GetAuthorizationToken - - -This resource type is not scoped to a single ECR repository. Instead, a - -connector configured with this resource type will grant access to all the ECR - -repositories that the credentials are allowed to access under the configured AWS - -region (i.e. all repositories under the Docker registry URL - -https://{account-id}.dkr.ecr.{region}.amazonaws.com). - - -The resource name associated with this resource type uniquely identifies an ECR - -registry using one of the following formats (the repository name is ignored, - -only the registry URL/ARN is used): - - - • ECR repository URI (canonical resource name): - - [https://]{account}.dkr.ecr.{region}.amazonaws.com[/{repository-name}] - - • ECR repository ARN: - - arn:aws:ecr:{region}:{account-id}:repository[/{repository-name}] - - -ECR repository names are region scoped. The connector can only be used to access - -ECR repositories in the AWS region that it is configured to use. - - -──────────────────────────────────────────────────────────────────────────────── -``` - - - The Service Connector is how you configure ZenML to authenticate and connect to one or more external resources. It stores the required configuration and security credentials and can optionally be scoped with a Resource Type and a Resource Name. - -Depending on the Service Connector Type implementation, a Service Connector instance can be configured in one of the following modes with regards to the types and number of resources that it has access to: - -* a **multi-type** Service Connector instance that can be configured once and used to gain access to multiple types of resources. This is only possible with Service Connector Types that support multiple Resource Types to begin with, such as those that target multi-service cloud providers like AWS, GCP and Azure. In contrast, a **single-type** Service Connector can only be used with a single Resource Type. 
To configure a multi-type Service Connector, you can simply skip scoping its Resource Type during registration. -* a **multi-instance** Service Connector instance can be configured once and used to gain access to multiple resources of the same type, each identifiable by a Resource Name. Not all types of connectors and not all types of resources support multiple instances. Some Service Connectors Types like the generic Kubernetes and Docker connector types only allow **single-instance** configurations: a Service Connector instance can only be used to access a single Kubernetes cluster and a single Docker registry. To configure a multi-instance Service Connector, you can simply skip scoping its Resource Name during registration. - -The following is an example of configuring a multi-type AWS Service Connector instance capable of accessing multiple AWS resources of different types: - - -```bash -zenml service-connector register aws-multi-type --type aws --auto-configure -``` - -```bash Example Command Output - -⠋ Registering service connector 'aws-multi-type'... - -Successfully registered service connector `aws-multi-type` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┃ │ s3://zenml-public-swagger-spec ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The following is an example of configuring a multi-instance AWS S3 Service Connector instance capable of accessing multiple AWS S3 buckets: - -```bash -zenml service-connector register aws-s3-multi-instance --type aws --auto-configure --resource-type s3-bucket -``` - -```bash Example Command Output -⠸ Registering service connector 'aws-s3-multi-instance'... - -Successfully registered service connector `aws-s3-multi-instance` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼───────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┃ │ s3://zenml-public-swagger-spec ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -The following is an example of configuring a single-instance AWS S3 Service Connector instance capable of accessing a single AWS S3 bucket: - - -```bash -zenml service-connector register aws-s3-zenfiles --type aws --auto-configure --resource-type s3-bucket --resource-id s3://zenfiles -``` - -```bash Example Command Output - -⠼ Registering service connector 'aws-s3-zenfiles'... 
- -Successfully registered service connector `aws-s3-zenfiles` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - - - - -## Explore Service Connector Types - -Service Connector Types are not only templates used to instantiate Service Connectors, they also form a body of knowledge that documents best security practices and guides users through the complicated world of authentication and authorization. - -ZenML ships with a handful of Service Connector Types that enable you right out-of-the-box to connect ZenML to cloud resources and services available from cloud providers such as AWS and GCP, as well as on-premise infrastructure. In addition to built-in Service Connector Types, ZenML can be easily extended with custom Service Connector implementations. - -To discover the Connector Types available with your ZenML deployment, you can use the `zenml service-connector list-types` CLI command: - -```bash -zenml service-connector list-types -``` - -```bash Example Command Output -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┓ - -┃ NAME │ TYPE │ RESOURCE TYPES │ AUTH METHODS │ LOCAL │ REMOTE ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Kubernetes Service Connector │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ password │ ✅ │ ✅ ┃ - -┃ │ │ │ token │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Docker Service Connector │ 🐳 docker │ 🐳 docker-registry │ password │ ✅ │ ✅ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ Azure Service Connector │ 🇦 azure │ 🇦 azure-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 blob-container │ service-principal │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ access-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ AWS Service Connector │ 🔶 aws │ 🔶 aws-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 s3-bucket │ secret-key │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ sts-token │ │ ┃ - -┃ │ │ 🐳 docker-registry │ iam-role │ │ ┃ - -┃ │ │ │ session-token │ │ ┃ - -┃ │ │ │ federation-token │ │ ┃ - -┠──────────────────────────────┼───────────────┼───────────────────────┼───────────────────┼───────┼────────┨ - -┃ GCP Service Connector │ 🔵 gcp │ 🔵 gcp-generic │ implicit │ ✅ │ ✅ ┃ - -┃ │ │ 📦 gcs-bucket │ user-account │ │ ┃ - -┃ │ │ 🌀 kubernetes-cluster │ service-account │ │ ┃ - -┃ │ │ 🐳 docker-registry │ oauth2-token │ │ ┃ - -┃ │ │ │ impersonation │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┛ -``` - -A lot more is hidden behind a Service Connector Type than a name and a simple list of resource types. Before using a Service Connector Type to configure a Service Connector, you probably need to understand what it is, what it can offer and what are the supported authentication methods and their requirements. All this can be accessed directly through the CLI. Some examples are included here. 
- -Showing information about the `gcp` Service Connector Type: - - -```bash -zenml service-connector describe-type gcp -``` - -```bash Example Command Output - -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🔵 GCP Service Connector (connector type: gcp) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 user-account - - • 🔒 service-account - - • 🔒 oauth2-token - - • 🔒 impersonation - - -Resource types: - - - • 🔵 gcp-generic - - • 📦 gcs-bucket - - • 🌀 kubernetes-cluster - - • 🐳 docker-registry - - -Supports auto-configuration: True - - -Available locally: True - - -Available remotely: True - - -The ZenML GCP Service Connector facilitates the authentication and access to - -managed GCP services and resources. These encompass a range of resources, - -including GCS buckets, GCR container repositories and GKE clusters. The - -connector provides support for various authentication methods, including GCP - -user accounts, service accounts, short-lived OAuth 2.0 tokens and implicit - -authentication. - - -To ensure heightened security measures, this connector always issues short-lived - -OAuth 2.0 tokens to clients instead of long-lived credentials. Furthermore, it - -includes automatic configuration and detection of credentials locally - -configured through the GCP CLI. - - -This connector serves as a general means of accessing any GCP service by issuing - -OAuth 2.0 credential objects to clients. Additionally, the connector can handle - -specialized authentication for GCS, Docker and Kubernetes Python clients. It - -also allows for the configuration of local Docker and Kubernetes CLIs. - - -The GCP Service Connector is part of the GCP ZenML integration. You can either - -install the entire integration or use a pypi extra to install it independently - -of the integration: - - - • pip install "zenml[connectors-gcp]" installs only prerequisites for the GCP - - Service Connector Type - - • zenml integration install gcp installs the entire GCP ZenML integration - - -It is not required to install and set up the GCP CLI on your local machine to - -use the GCP Service Connector to link Stack Components to GCP resources and - -services. However, it is recommended to do so if you are looking for a quick - -setup that includes using the auto-configuration Service Connector features. - - -────────────────────────────────────────────────────────────────────────────────── -``` - -Fetching details about the GCP `kubernetes-cluster` resource type (i.e. the GKE cluster): - -```bash -zenml service-connector describe-type gcp --resource-type kubernetes-cluster -``` - -```bash Example Command Output -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🌀 GCP GKE Kubernetes cluster (resource type: kubernetes-cluster) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Authentication methods: implicit, user-account, service-account, oauth2-token, - -impersonation - - -Supports resource instances: True - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 user-account - - • 🔒 service-account - - • 🔒 oauth2-token - - • 🔒 impersonation - - -Allows Stack Components to access a GKE registry as a standard Kubernetes - -cluster resource. When used by Stack Components, they are provided a - -pre-authenticated Python Kubernetes client instance. 
- - -The configured credentials must have at least the following GCP permissions - -associated with the GKE clusters that it can access: - - - • container.clusters.list - - • container.clusters.get - - -In addition to the above permissions, the credentials should include permissions - -to connect to and use the GKE cluster (i.e. some or all permissions in the - -Kubernetes Engine Developer role). - - -If set, the resource name must identify an GKE cluster using one of the - -following formats: - - - • GKE cluster name: {cluster-name} - - -GKE cluster names are project scoped. The connector can only be used to access - -GKE clusters in the GCP project that it is configured to use. - - -──────────────────────────────────────────────────────────────────────────────── -``` - -Displaying information about the `service-account` GCP authentication method: - - -```bash -zenml service-connector describe-type gcp --auth-method service-account -``` - -```bash Example Command Output -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ 🔒 GCP Service Account (auth method: service-account) ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - -Supports issuing temporary credentials: False - - -Use a GCP service account and its credentials to authenticate to GCP services. - -This method requires a GCP service account and a service account key JSON - -created for it. - - -The GCP connector generates temporary OAuth 2.0 tokens from the user account - -credentials and distributes them to clients. The tokens have a limited lifetime - -of 1 hour. - - -A GCP project is required and the connector may only be used to access GCP - -resources in the specified project. - - -If you already have the GOOGLE_APPLICATION_CREDENTIALS environment variable - -configured to point to a service account key JSON file, it will be automatically - -picked up when auto-configuration is used. - - -Attributes: - - - • service_account_json {string, secret, required}: GCP Service Account Key JSON - - • project_id {string, required}: GCP Project ID where the target resource is - - located. - - -──────────────────────────────────────────────────────────────────────────────── -``` - - - -### Basic Service Connector Types - -Service Connector Types like the [Kubernetes Service Connector](/develop/usage/resource-data-management/auth-management/kubernetes-service-connector) and [Docker Service Connector](/develop/usage/resource-data-management/auth-management/docker-service-connector) can only handle one resource at a time: a Kubernetes cluster and a Docker container registry respectively. These basic Service Connector Types are the easiest to instantiate and manage, as each Service Connector instance is tied exactly to one resource (i.e. they are _single-instance_ connectors). - -The following output shows two Service Connector instances configured from basic Service Connector Types: - -* a Docker Service Connector that grants authenticated access to the DockerHub registry and allows pushing/pulling images that are stored in private repositories belonging to a DockerHub account -* a Kubernetes Service Connector that authenticates access to a Kubernetes cluster running on-premise and allows managing containerized workloads running there. 
- -```bash -$ zenml service-connector list - -┏━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼────────────────┼──────────────────────────────────────┼───────────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ dockerhub │ b485626e-7fee-4525-90da-5b26c72331eb │ 🐳 docker │ 🐳 docker-registry │ docker.io │ ➖ │ default │ │ ┃ - -┠────────┼────────────────┼──────────────────────────────────────┼───────────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ kube-on-prem │ 4315e8eb-fcbd-4938-a4d7-a9218ab372a1 │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ 192.168.0.12 │ ➖ │ default │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ - -``` - -### Cloud provider Service Connector Types - -Cloud service providers like AWS, GCP and Azure implement one or more authentication schemes that are unified across a wide range or resources and services, all managed under the same umbrella. This allows users to access many different resources with a single set of authentication credentials. Some authentication methods are straightforward to set up, but are only meant to be used for development and testing. Other authentication schemes are powered by extensive roles and permissions management systems and are targeted at production environments where security and operations at scale are big concerns. The corresponding cloud provider Service Connector Types are designed accordingly: - -* they support multiple types of resources (e.g. Kubernetes clusters, Docker registries, a form of object storage) -* they usually include some form of "generic" Resource Type that can be used by clients to access types of resources that are not yet part of the supported set. When this generic Resource Type is used, clients and Stack Components that access the connector are provided some form of generic session, credentials or client that can be used to access any of the cloud provider resources. For example, in the AWS case, clients accessing the `aws-generic` Resource Type are issued a pre-authenticated `boto3` Session object that can be used to access any AWS service. -* they support multiple authentication methods. Some of these allow clients direct access to long-lived, broad-access credentials and are only recommended for local development use. Others support distributing temporary API tokens automatically generated from long-lived credentials, which are safer for production use-cases, but may be more difficult to set up. A few authentication methods even support down-scoping the permissions of temporary API tokens so that they only allow access to the target resource and restrict access to everything else. This is covered at length [in the section on best practices for authentication methods](/develop/usage/resource-data-management/auth-management/service-connectors-guide). 
-* there is flexibility regarding the range of resources that a single cloud provider Service Connector instance configured with a single set of credentials can be scoped to access: - * a _multi-type Service Connector_ instance can access any type of resources from the range of supported Resource Types - * a _multi-instance Service Connector_ instance can access multiple resources of the same type - * a _single-instance Service Connector_ instance is scoped to access a single resource - -The following output shows three different Service Connectors configured from the same GCP Service Connector Type using three different scopes but with the same credentials: - -* a multi-type GCP Service Connector that allows access to every possible resource accessible with the configured credentials -* a multi-instance GCS Service Connector that allows access to multiple GCS buckets -* a single-instance GCS Service Connector that only permits access to one GCS bucket - -```bash -$ zenml service-connector list - -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼────────────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼─────────────────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcp-multi │ 9d953320-3560-4a78-817c-926a3898064d │ 🔵 gcp │ 🔵 gcp-generic │ │ ➖ │ default │ │ ┃ - -┃ │ │ │ │ 📦 gcs-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┠────────┼────────────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼─────────────────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcs-multi │ ff9c0723-7451-46b7-93ef-fcf3efde30fa │ 🔵 gcp │ 📦 gcs-bucket │ │ ➖ │ default │ │ ┃ - -┠────────┼────────────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼─────────────────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ gcs-langchain-slackbot │ cf3953e9-414c-4875-ba00-24c62a0dc0c5 │ 🔵 gcp │ 📦 gcs-bucket │ gs://langchain-slackbot │ ➖ │ default │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - -### Local and remote availability - - - You only need to be aware of local and remote availability for Service Connector Types if you are explicitly looking to use a Service Connector Type without installing its package prerequisites or if you are implementing or using a custom Service Connector Type implementation with your ZenML deployment. In all other cases, you may safely ignore this section. - - -The `LOCAL` and `REMOTE` flags in the `zenml service-connector list-types` output indicate if the Service Connector implementation is available locally (i.e. where the ZenML client and pipelines are running) and remotely (i.e. where the ZenML server is running). - - - All built-in Service Connector Types are by default available on the ZenML server, but some built-in Service Connector Types require additional Python packages to be installed to be available in your local environment. See the section documenting each Service Connector Type to find what these prerequisites are and how to install them. 
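For example, if a Service Connector Type such as the GCP one is missing from your local Python environment, either of the following commands (both listed in the GCP Service Connector Type description shown earlier in this guide) should make it available locally; the exact package extra differs for other connector types, so consult their respective documentation pages first: - -```bash -# Install only the prerequisites for the GCP Service Connector Type -pip install "zenml[connectors-gcp]" - -# ... or install the entire GCP ZenML integration -zenml integration install gcp -```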
- - -The local/remote availability determines the possible actions and operations that can be performed with a Service Connector. The following are possible with a Service Connector Type that is available either locally or remotely: - -* Service Connector registration, update, and discovery (i.e. the `zenml service-connector register`, `zenml service-connector update`, `zenml service-connector list` and `zenml service-connector describe` CLI commands). -* Service Connector verification: checking whether its configuration and credentials are valid and can be actively used to access the remote resources (i.e. the `zenml service-connector verify` CLI command). -* Listing the resources that can be accessed through a Service Connector (i.e. the `zenml service-connector verify` and `zenml service-connector list-resources` CLI commands) -* Connecting a Stack Component to a remote resource via a Service Connector - -The following operations are only possible with Service Connector Types that are locally available (with some notable exceptions covered in the information box that follows): - -* Service Connector auto-configuration and discovery of credentials stored by a local client, CLI, or SDK (e.g. aws or kubectl). -* Using the configuration and credentials managed by a Service Connector to configure a local client, CLI, or SDK (e.g. docker or kubectl). -* Running pipelines with a Stack Component that is connected to a remote resource through a Service Connector - - -One interesting and useful byproduct of the way cloud provider Service Connectors are designed is the fact that you don't need to have the cloud provider Service Connector Type available client-side to be able to access some of its resources. Take the following situation for example: - -* the GCP Service Connector Type can provide access to GKE Kubernetes clusters and GCR Docker container registries. -* however, you don't need the GCP Service Connector Type or any GCP libraries to be installed on the ZenML clients to connect to and use those Kubernetes clusters or Docker registries in your ML pipelines. -* the Kubernetes Service Connector Type is enough to access any Kubernetes cluster, regardless of its provenance (AWS, GCP, etc.) -* the Docker Service Connector Type is enough to access any Docker container registry, regardless of its provenance (AWS, GCP, etc.) - - -## Register Service Connectors - -By the time you reach this section, you have probably already made up your mind about the type of infrastructure or cloud provider that you want to use to run your ZenML pipelines after reading through [the Service Connector Types section](/develop/usage/resource-data-management/auth-management/service-connectors-guide#explore-service-connector-types), and have carefully weighed your [choices of authentication methods and best security practices](/develop/usage/resource-data-management/auth-management/best-security-practices). Either that, or you simply want to quickly try out a Service Connector to [connect one of the ZenML Stack components to an external resource](/develop/usage/resource-data-management/auth-management/service-connectors-guide#connect-stack-components-to-resources).
- -If you are looking for a quick, assisted tour, we recommend using the interactive CLI mode to configure Service Connectors, especially if this is your first time doing it: - -``` -zenml service-connector register -i -``` - - - ``` -zenml service-connector register -i -``` - -```bash Example Command Output -Please enter a name for the service connector: gcp-interactive - -Please enter a description for the service connector []: Interactive GCP connector example - -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ Available service connector types ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - - 🌀 Kubernetes Service Connector (connector type: kubernetes) - - -Authentication methods: - - - • 🔒 password - - • 🔒 token - - -Resource types: - - - • 🌀 kubernetes-cluster - - -Supports auto-configuration: True - - -Available locally: True - - -Available remotely: True - - -This ZenML Kubernetes service connector facilitates authenticating and - -connecting to a Kubernetes cluster. - - -The connector can be used to access to any generic Kubernetes cluster by - -providing pre-authenticated Kubernetes python clients to Stack Components that - -are linked to it and also allows configuring the local Kubernetes CLI (i.e. - -kubectl). - - -The Kubernetes Service Connector is part of the Kubernetes ZenML integration. - -You can either install the entire integration or use a pypi extra to install it - -independently of the integration: - - - • pip install "zenml[connectors-kubernetes]" installs only prerequisites for the - - Kubernetes Service Connector Type - - • zenml integration install kubernetes installs the entire Kubernetes ZenML - - integration - - -A local Kubernetes CLI (i.e. kubectl ) and setting up local kubectl - -configuration contexts is not required to access Kubernetes clusters in your - -Stack Components through the Kubernetes Service Connector. - - - 🐳 Docker Service Connector (connector type: docker) - - -Authentication methods: - - - • 🔒 password - - -Resource types: - - - • 🐳 docker-registry - - -Supports auto-configuration: False - - -Available locally: True - - -Available remotely: True - - -The ZenML Docker Service Connector allows authenticating with a Docker or OCI - -container registry and managing Docker clients for the registry. - - -This connector provides pre-authenticated python-docker Python clients to Stack - -Components that are linked to it. - - -No Python packages are required for this Service Connector. All prerequisites - -are included in the base ZenML Python package. Docker needs to be installed on - -environments where container images are built and pushed to the target container - -registry. - -[...] 
- -──────────────────────────────────────────────────────────────────────────────── - -Please select a service connector type (kubernetes, docker, azure, aws, gcp): gcp - -╔══════════════════════════════════════════════════════════════════════════════╗ - -║ Available resource types ║ - -╚══════════════════════════════════════════════════════════════════════════════╝ - - - 🔵 Generic GCP resource (resource type: gcp-generic) - - -Authentication methods: implicit, user-account, service-account, oauth2-token, - -impersonation - - -Supports resource instances: False - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 user-account - - • 🔒 service-account - - • 🔒 oauth2-token - - • 🔒 impersonation - - -This resource type allows Stack Components to use the GCP Service Connector to - -connect to any GCP service or resource. When used by Stack Components, they are - -provided a Python google-auth credentials object populated with a GCP OAuth 2.0 - -token. This credentials object can then be used to create GCP Python clients for - -any particular GCP service. - - -This generic GCP resource type is meant to be used with Stack Components that - -are not represented by other, more specific resource type, like GCS buckets, - -Kubernetes clusters or Docker registries. For example, it can be used with the - -Google Cloud Builder Image Builder stack component, or the Vertex AI - -Orchestrator and Step Operator. It should be accompanied by a matching set of - -GCP permissions that allow access to the set of remote resources required by the - -client and Stack Component. - - -The resource name represents the GCP project that the connector is authorized to - -access. - - - 📦 GCP GCS bucket (resource type: gcs-bucket) - - -Authentication methods: implicit, user-account, service-account, oauth2-token, - -impersonation - - -Supports resource instances: True - - -Authentication methods: - - - • 🔒 implicit - - • 🔒 user-account - - • 🔒 service-account - - • 🔒 oauth2-token - - • 🔒 impersonation - - -Allows Stack Components to connect to GCS buckets. When used by Stack - -Components, they are provided a pre-configured GCS Python client instance. - - -The configured credentials must have at least the following GCP permissions - -associated with the GCS buckets that it can access: - - - • storage.buckets.list - - • storage.buckets.get - - • storage.objects.create - - • storage.objects.delete - - • storage.objects.get - - • storage.objects.list - - • storage.objects.update - - -For example, the GCP Storage Admin role includes all of the required - -permissions, but it also includes additional permissions that are not required - -by the connector. - - -If set, the resource name must identify a GCS bucket using one of the following - -formats: - - - • GCS bucket URI: gs://{bucket-name} - - • GCS bucket name: {bucket-name} - -[...] - -──────────────────────────────────────────────────────────────────────────────── - -Please select a resource type or leave it empty to create a connector that can be used to access any of the supported resource types (gcp-generic, gcs-bucket, kubernetes-cluster, docker-registry). []: gcs-bucket - -Would you like to attempt auto-configuration to extract the authentication configuration from your local environment ? [y/N]: y - -Service connector auto-configured successfully with the following configuration: - -Service connector 'gcp-interactive' of type 'gcp' is 'private'. 
- - 'gcp-interactive' gcp Service - - Connector Details - -┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠──────────────────┼─────────────────┨ - -┃ NAME │ gcp-interactive ┃ - -┠──────────────────┼─────────────────┨ - -┃ TYPE │ 🔵 gcp ┃ - -┠──────────────────┼─────────────────┨ - -┃ AUTH METHOD │ user-account ┃ - -┠──────────────────┼─────────────────┨ - -┃ RESOURCE TYPES │ 📦 gcs-bucket ┃ - -┠──────────────────┼─────────────────┨ - -┃ RESOURCE NAME │ ┃ - -┠──────────────────┼─────────────────┨ - -┃ SESSION DURATION │ N/A ┃ - -┠──────────────────┼─────────────────┨ - -┃ EXPIRES IN │ N/A ┃ - -┠──────────────────┼─────────────────┨ - -┃ SHARED │ ➖ ┃ - -┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┛ - - Configuration - -┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┓ - -┃ PROPERTY │ VALUE ┃ - -┠───────────────────┼────────────┨ - -┃ project_id │ zenml-core ┃ - -┠───────────────────┼────────────┨ - -┃ user_account_json │ [HIDDEN] ┃ - -┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┛ - -No labels are set for this service connector. - -The service connector configuration has access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://annotation-gcp-store ┃ - -┃ │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ - -Would you like to continue with the auto-discovered configuration or switch to manual ? (auto, manual) [auto]: - -The following GCP GCS bucket instances are reachable through this connector: - - - gs://annotation-gcp-store - - - gs://zenml-bucket-sl - - - gs://zenml-core.appspot.com - - - gs://zenml-core_cloudbuild - - - gs://zenml-datasets - -Please select one or leave it empty to create a connector that can be used to access any of them []: gs://zenml-datasets - -Successfully registered service connector `gcp-interactive` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼─────────────────────┨ - -┃ 📦 gcs-bucket │ gs://zenml-datasets ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛ -``` - - -Regardless of how you came here, you should already have some idea of the following: - -* the type of resources that you want to connect ZenML to. This may be a Kubernetes cluster, a Docker container registry or an object storage service like AWS S3 or GCS. -* the Service Connector implementation (i.e. Service Connector Type) that you want to use to connect to those resources. This could be one of the cloud provider Service Connector Types like AWS and GCP that provide access to a broader range of services, or one of the basic Service Connector Types like Kubernetes or Docker that only target a specific resource. -* the credentials and authentication method that you want to use - -Other questions that should be answered in this section: - -* are you just looking to connect a ZenML Stack Component to a single resource? or would you rather configure a wide-access ZenML Service Connector that gives ZenML and all its users access to a broader range of resource types and resource instances with a single set of credentials issued by your cloud provider? -* have you already provisioned all the authentication prerequisites (e.g. 
service accounts, roles, permissions) and prepared the credentials you will need to configure the Service Connector? If you already have one of the cloud provider CLIs configured with credentials on your local host, you can easily use the Service Connector auto-configuration capabilities to get where you need to go faster. - -For help answering these questions, you can also use the interactive CLI mode to register Service Connectors and/or consult the documentation dedicated to each individual Service Connector Type. - -### Auto-configuration - -Many Service Connector Types support auto-configuration to discover and extract configuration information and credentials directly from your local environment. This assumes that you have already installed and set up the local CLI or SDK associated with the type of resource or cloud provider that you intend to use. The Service Connector auto-configuration feature relies on these CLIs being configured with valid credentials to work properly. Some examples are listed here, but you should consult the documentation section for the Service Connector Type of your choice to find out if and how auto-configuration is supported: - -* AWS uses the [aws configure CLI command](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html) -* GCP offers [the gcloud auth application-default login CLI command](https://cloud.google.com/docs/authentication/provide-credentials-adc#how%5Fto%5Fprovide%5Fcredentials%5Fto%5Fadc) -* Azure provides [the az login CLI command](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli) - - -```bash -zenml service-connector register kubernetes-auto --type kubernetes --auto-configure -``` - -```bash Example Command Output - -Successfully registered service connector `kubernetes-auto` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼────────────────┨ - -┃ 🌀 kubernetes-cluster │ 35.185.95.223 ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -```bash -zenml service-connector register aws-auto --type aws --auto-configure -``` - -```bash Example Command Output -⠼ Registering service connector 'aws-auto'...
- -Successfully registered service connector `aws-auto` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -```bash -zenml service-connector register gcp-auto --type gcp --auto-configure -``` - -```bash Example Command Output -Successfully registered service connector `gcp-auto` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🔵 gcp-generic │ zenml-core ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 📦 gcs-bucket │ gs://annotation-gcp-store ┃ - -┃ │ gs://zenml-bucket-sl ┃ - -┃ │ gs://zenml-core.appspot.com ┃ - -┃ │ gs://zenml-core_cloudbuild ┃ - -┃ │ gs://zenml-datasets ┃ - -┃ │ gs://zenml-internal-artifact-store ┃ - -┃ │ gs://zenml-kubeflow-artifact-store ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠───────────────────────┼─────────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - - -### Scopes: multi-type, multi-instance, and single-instance - -These terms are briefly explained in the [Terminology](/develop/usage/resource-data-management/auth-management/service-connectors-guide#terminology) section: you can register a Service Connector that grants access to multiple types of resources, to multiple instances of the same Resource Type, or to a single resource. - -Service Connectors created from basic Service Connector Types like Kubernetes and Docker are single-resource by default, while Service Connectors used to connect to managed cloud resources like AWS and GCP can take all three forms. - -The following example shows registering three different Service Connectors configured from the same AWS Service Connector Type using three different scopes but with the same credentials: - -* a multi-type AWS Service Connector that allows access to every possible resource accessible with the configured credentials -* a multi-instance AWS Service Connector that allows access to multiple S3 buckets -* a single-instance AWS Service Connector that only permits access to one S3 bucket - -```bash -zenml service-connector register aws-multi-type --type aws --auto-configure -``` - -```bash Example Command Output - -⠋ Registering service connector 'aws-multi-type'... 
- -Successfully registered service connector `aws-multi-type` with access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┃ │ s3://zenml-public-swagger-spec ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -```bash -zenml service-connector register aws-s3-multi-instance --type aws --auto-configure --resource-type s3-bucket -``` - - -```bash Example Command Output -⠸ Registering service connector 'aws-s3-multi-instance'... - -Successfully registered service connector `aws-s3-multi-instance` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼───────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┃ │ s3://zenml-public-datasets ┃ - -┃ │ s3://zenml-public-swagger-spec ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - - -```bash -zenml service-connector register aws-s3-zenfiles --type aws --auto-configure --resource-type s3-bucket --resource-id s3://zenfiles -``` - - -```bash Example Command Output -⠼ Registering service connector 'aws-s3-zenfiles'... - -Successfully registered service connector `aws-s3-zenfiles` with access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - - - -The following might help understand the difference between scopes: - -* the difference between a multi-instance and a multi-type Service Connector is that the Resource Type scope is locked to a particular value during configuration for the multi-instance Service Connector -* similarly, the difference between a multi-instance and a single-instance Service Connector is that the Resource Name (Resource ID) scope is locked to a particular value during configuration for the single-instance Service Connector - -### Service Connector Verification - -When registering Service Connectors, the authentication configuration and credentials are automatically verified to ensure that they can indeed be used to gain access to the target resources: - -* for multi-type Service Connectors, this verification means checking that the configured credentials can be used to authenticate successfully to the remote service, as well as listing all resources that the credentials have permission to access for each Resource Type supported by the Service Connector Type.
-* for multi-instance Service Connectors, this verification step means listing all resources that the credentials have permission to access in addition to validating that the credentials can be used to authenticate to the target service or platform. -* for single-instance Service Connectors, the verification step simply checks that the configured credentials have permission to access the target resource. - -The verification can also be performed later on an already registered Service Connector. Furthermore, for multi-type and multi-instance Service Connectors, the verification operation can be scoped to a Resource Type and a Resource Name. - - -The following shows how a multi-type, a multi-instance and a single-instance Service Connector can be verified with multiple scopes after registration. - -First, listing the Service Connectors will clarify which scopes they are configured with: - -``` -zenml service-connector list -``` - -```bash Example Command Output -┏━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━┓ - -┃ ACTIVE │ NAME │ ID │ TYPE │ RESOURCE TYPES │ RESOURCE NAME │ SHARED │ OWNER │ EXPIRES IN │ LABELS ┃ - -┠────────┼───────────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ aws-multi-type │ 373a73c2-8295-45d4-a768-45f5a0f744ea │ 🔶 aws │ 🔶 aws-generic │ │ ➖ │ default │ │ ┃ - -┃ │ │ │ │ 📦 s3-bucket │ │ │ │ │ ┃ - -┃ │ │ │ │ 🌀 kubernetes-cluster │ │ │ │ │ ┃ - -┃ │ │ │ │ 🐳 docker-registry │ │ │ │ │ ┃ - -┠────────┼───────────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ aws-s3-multi-instance │ fa9325ab-ce01-4404-aec3-61a3af395d48 │ 🔶 aws │ 📦 s3-bucket │ │ ➖ │ default │ │ ┃ - -┠────────┼───────────────────────┼──────────────────────────────────────┼────────┼───────────────────────┼───────────────┼────────┼─────────┼────────────┼────────┨ - -┃ │ aws-s3-zenfiles │ 19edc05b-92db-49de-bc84-aa9b3fb8261a │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles │ ➖ │ default │ │ ┃ - -┗━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━┛ -``` - -Verifying the multi-type Service Connector displays all resources that can be accessed through the Service Connector. This is like asking "are these credentials valid? can they be used to authenticate to AWS ? 
and if so, what resources can they access?": - -```bash -zenml service-connector verify aws-multi-type -``` - -```bash Example Command Output -Service connector 'aws-multi-type' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🔶 aws-generic │ us-east-1 ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠───────────────────────┼──────────────────────────────────────────────┨ - -┃ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -You can scope the verification down to a particular Resource Type or all the way down to a Resource Name. This is the equivalent of asking "are these credentials valid and which S3 buckets are they authorized to access ?" and "can these credentials be used to access this particular Kubernetes cluster in AWS ?": - -```bash -zenml service-connector verify aws-multi-type --resource-type s3-bucket -``` - -```bash Example Command Output -Service connector 'aws-multi-type' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼───────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -```bash -zenml service-connector verify aws-multi-type --resource-type kubernetes-cluster --resource-id zenhacks-cluster -``` - -```bash Example Command Output -Service connector 'aws-multi-type' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────────────┼──────────────────┨ - -┃ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -``` - -Verifying the multi-instance Service Connector displays all the resources that it can access. 
We can also scope the verification to a single resource: - -```bash -zenml service-connector verify aws-s3-multi-instance -``` - -```bash Example Command Output -Service connector 'aws-s3-multi-instance' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼───────────────────────────────────────┨ - -┃ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ s3://zenfiles ┃ - -┃ │ s3://zenml-demos ┃ - -┃ │ s3://zenml-generative-chat ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -```bash -zenml service-connector verify aws-s3-multi-instance --resource-id s3://zenml-demos -``` - -```bash Example Command Output -Service connector 'aws-s3-multi-instance' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼──────────────────┨ - -┃ 📦 s3-bucket │ s3://zenml-demos ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛ -``` - -Finally, verifying the single-instance Service Connector is straightforward and requires no further explanation: - - -```bash -zenml service-connector verify aws-s3-zenfiles -``` - -```bash Example Command Output -Service connector 'aws-s3-zenfiles' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠───────────────┼────────────────┨ - -┃ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - - -## Configure local clients - -Yet another neat feature built into some Service Connector Types that is the opposite of [Service Connector auto-configuration](/develop/usage/resource-data-management/auth-management/service-connectors-guide#auto-configuration) is the ability to configure local CLI and SDK utilities installed on your host, like the Docker or Kubernetes CLI (`kubectl`), with credentials issued by a compatible Service Connector. - -You may need to exercise this feature to get direct CLI access to a remote service in order to manually manage some configurations or resources, to debug some workloads or to simply verify that the Service Connector credentials are actually working. - - - When configuring local CLI utilities with credentials extracted from Service Connectors, keep in mind that most Service Connectors, particularly those used with cloud platforms, usually exercise the security best practice of issuing _temporary credentials such as API tokens._ The implication is that your local CLI may only be allowed access to the remote service for a short time before those credentials expire, after which you need to fetch another set of credentials from the Service Connector.
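When that happens, fetching a fresh set of credentials is simply a matter of re-running the `zenml service-connector login` command; as a brief sketch, this reuses the `aws-session-token` connector from the Docker example further below: - -```bash -# Re-issue short-lived credentials and re-configure the local Docker client -zenml service-connector login aws-session-token --resource-type docker-registry -```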
- - - - -The following examples show how the local Kubernetes `kubectl` CLI can be configured with credentials issued by a Service Connector and then used to access a Kubernetes cluster directly: - -```bash -zenml service-connector list-resources --resource-type kubernetes-cluster -``` - -```bash Example Command Output - -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 9d953320-3560-4a78-817c-926a3898064d │ gcp-user-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 4a550c82-aa64-4a48-9c7f-d5e127d77a44 │ aws-multi-type │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -```bash -zenml service-connector login gcp-user-multi --resource-type kubernetes-cluster --resource-id zenml-test-cluster -``` - -```bash Example Command Output -$ zenml service-connector login gcp-user-multi --resource-type kubernetes-cluster --resource-id zenml-test-cluster - -⠇ Attempting to configure local client using service connector 'gcp-user-multi'... - -Updated local kubeconfig with the cluster details. The current kubectl context was set to 'gke_zenml-core_zenml-test-cluster'. - -The 'gcp-user-multi' Kubernetes Service Connector connector was used to successfully configure the local Kubernetes cluster client/SDK. - -# Verify that the local kubectl client is now configured to access the remote Kubernetes cluster - -$ kubectl cluster-info - -Kubernetes control plane is running at https://35.185.95.223 - -GLBCDefaultBackend is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy - -KubeDNS is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy - -Metrics-server is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy -``` - -```bash -zenml service-connector login aws-multi-type --resource-type kubernetes-cluster --resource-id zenhacks-cluster -``` - -```bash Example Command Output -$ zenml service-connector login aws-multi-type --resource-type kubernetes-cluster --resource-id zenhacks-cluster - -⠏ Attempting to configure local client using service connector 'aws-multi-type'... - -Updated local kubeconfig with the cluster details. The current kubectl context was set to 'arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster'. - -The 'aws-multi-type' Kubernetes Service Connector connector was used to successfully configure the local Kubernetes cluster client/SDK. 
- -# Verify that the local kubectl client is now configured to access the remote Kubernetes cluster - -$ kubectl cluster-info - -Kubernetes control plane is running at https://A5F8F4142FB12DDCDE9F21F6E9B07A18.gr7.us-east-1.eks.amazonaws.com - -CoreDNS is running at https://A5F8F4142FB12DDCDE9F21F6E9B07A18.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy -``` - -The same is possible with the local Docker client: - -```bash -zenml service-connector verify aws-session-token --resource-type docker-registry -``` - -```bash Example Command Output -Service connector 'aws-session-token' is correctly configured with valid credentials and has access to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼───────────────────┼────────────────┼────────────────────┼──────────────────────────────────────────────┨ - -┃ 3ae3e595-5cbc-446e-be64-e54e854e0e3f │ aws-session-token │ 🔶 aws │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` -```bash -zenml service-connector login aws-session-token --resource-type docker-registry -``` - -```bash Example Command Output - -$zenml service-connector login aws-session-token --resource-type docker-registry - -⠏ Attempting to configure local client using service connector 'aws-session-token'... - -WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json. - -Configure a credential helper to remove this warning. See - -https://docs.docker.com/engineversions/0.66.0/reference/commandline/login/#credentials-store - -The 'aws-session-token' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK. - -# Verify that the local Docker client is now configured to access the remote Docker container registry - -$ docker pull 715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml-server - -Using default tag: latest - -latest: Pulling from zenml-server - -e9995326b091: Pull complete - -f3d7f077cdde: Pull complete - -0db71afa16f3: Pull complete - -6f0b5905c60c: Pull complete - -9d2154d50fd1: Pull complete - -d072bba1f611: Pull complete - -20e776588361: Pull complete - -3ce69736a885: Pull complete - -c9c0554c8e6a: Pull complete - -bacdcd847a66: Pull complete - -482033770844: Pull complete - -Digest: sha256:bf2cc3895e70dfa1ee1cd90bbfa599fa4cd8df837e27184bac1ce1cc239ecd3f - -Status: Downloaded newer image for 715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml-server:latest - -715803424590.dkr.ecr.us-east-1.amazonaws.com/zenml-server:latest -``` - - -## Discover available resources - -One of the questions that you may have as a ZenML user looking to register and connect a Stack Component to an external resource is "what resources do I even have access to ?". Sure, you can browse through all the registered Service connectors and manually verify each one to find a particular resource that you are looking for, but this is counterproductive. - -A better way is to ask ZenML directly questions such as: - -* what are the Kubernetes clusters that I can get access to through Service Connectors? 
-* can I access this particular S3 bucket through one of the Service Connectors? Which one? - -The `zenml service-connector list-resources` CLI command can be used exactly for this purpose. - -It is possible to show globally all the various resources that can be accessed through all available Service Connectors, and all Service Connectors that are in an error state. This operation is expensive and may take some time to complete, depending on the number of Service Connectors involved. The output also includes any errors that may have occurred during the discovery process: - -```bash -zenml service-connector list-resources -``` - -```bash Example Command Output -Fetching all service connector resources can take a long time, depending on the number of connectors that you have configured. Consider using the '--connector-type', '--resource-type' and '--resource-id' - -options to narrow down the list of resources to fetch. - -The following resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 099fb152-cfb7-4af5-86a7-7b77c0961b21 │ gcp-multi │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ │ │ │ 📦 gcs-bucket │ gs://annotation-gcp-store ┃ - -┃ │ │ │ │ gs://zenml-bucket-sl ┃ - -┃ │ │ │ │ gs://zenml-core.appspot.com ┃ - -┃ │ │ │ │ gs://zenml-core_cloudbuild ┃ - -┃ │ │ │ │ gs://zenml-datasets ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ │ │ │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ │ │ │ 🐳 docker-registry │ gcr.io/zenml-core ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 373a73c2-8295-45d4-a768-45f5a0f744ea │ aws-multi-type │ 🔶 aws │ 🔶 aws-generic │ us-east-1 ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ │ │ │ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ │ │ │ s3://zenfiles ┃ - -┃ │ │ │ │ s3://zenml-demos ┃ - -┃ │ │ │ │ s3://zenml-generative-chat ┃ - -┃ │ │ │ │ s3://zenml-public-datasets ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ │ │ │ 🌀 
kubernetes-cluster │ zenhacks-cluster ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ │ │ │ 🐳 docker-registry │ 715803424590.dkr.ecr.us-east-1.amazonaws.com ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ fa9325ab-ce01-4404-aec3-61a3af395d48 │ aws-s3-multi-instance │ 🔶 aws │ 📦 s3-bucket │ s3://aws-ia-mwaa-715803424590 ┃ - -┃ │ │ │ │ s3://zenfiles ┃ - -┃ │ │ │ │ s3://zenml-demos ┃ - -┃ │ │ │ │ s3://zenml-generative-chat ┃ - -┃ │ │ │ │ s3://zenml-public-datasets ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 19edc05b-92db-49de-bc84-aa9b3fb8261a │ aws-s3-zenfiles │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ c732c768-3992-4cbd-8738-d02cd7b6b340 │ kubernetes-auto │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ 💥 error: connector 'kubernetes-auto' authorization failure: failed to verify Kubernetes cluster ┃ - -┃ │ │ │ │ access: (401) ┃ - -┃ │ │ │ │ Reason: Unauthorized ┃ - -┃ │ │ │ │ HTTP response headers: HTTPHeaderDict({'Audit-Id': '20c96e65-3e3e-4e08-bae3-bcb72c527fbf', ┃ - -┃ │ │ │ │ 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 09 Jun 2023 ┃ - -┃ │ │ │ │ 18:52:56 GMT', 'Content-Length': '129'}) ┃ - -┃ │ │ │ │ HTTP response body: ┃ - -┃ │ │ │ │ {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":" ┃ - -┃ │ │ │ │ Unauthorized","code":401} ┃ - -┃ │ │ │ │ ┃ - -┃ │ │ │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -More interesting is to scope the search to a particular Resource Type. 
This yields fewer, more accurate results, especially if you have many multi-type Service Connectors configured: - - -``` -zenml service-connector list-resources --resource-type kubernetes-cluster -``` - -```bash Example Command Output -The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 099fb152-cfb7-4af5-86a7-7b77c0961b21 │ gcp-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃ - -┠──────────────────────────────────────┼─────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ 373a73c2-8295-45d4-a768-45f5a0f744ea │ aws-multi-type │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃ - -┠──────────────────────────────────────┼─────────────────┼────────────────┼───────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────┨ - -┃ c732c768-3992-4cbd-8738-d02cd7b6b340 │ kubernetes-auto │ 🌀 kubernetes │ 🌀 kubernetes-cluster │ 💥 error: connector 'kubernetes-auto' authorization failure: failed to verify Kubernetes cluster access: ┃ - -┃ │ │ │ │ (401) ┃ - -┃ │ │ │ │ Reason: Unauthorized ┃ - -┃ │ │ │ │ HTTP response headers: HTTPHeaderDict({'Audit-Id': '72558f83-e050-4fe3-93e5-9f7e66988a4c', 'Cache-Control': ┃ - -┃ │ │ │ │ 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Fri, 09 Jun 2023 18:59:02 GMT', ┃ - -┃ │ │ │ │ 'Content-Length': '129'}) ┃ - -┃ │ │ │ │ HTTP response body: ┃ - -┃ │ │ │ │ {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauth ┃ - -┃ │ │ │ │ orized","code":401} ┃ - -┃ │ │ │ │ ┃ - -┃ │ │ │ │ ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ -``` - -Finally, you can ask for a particular resource, if you know its Resource Name beforehand: - - -``` -zenml service-connector list-resources --resource-type s3-bucket --resource-id zenfiles -``` - -```bash Example Command Output -The 's3-bucket' resource with name 'zenfiles' can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 373a73c2-8295-45d4-a768-45f5a0f744ea │ aws-multi-type │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ fa9325ab-ce01-4404-aec3-61a3af395d48 │ aws-s3-multi-instance │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - 
-┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 19edc05b-92db-49de-bc84-aa9b3fb8261a │ aws-s3-zenfiles │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - - - -## Connect Stack Components to resources - -Service Connectors and the resources and services that they can authenticate to and grant access to are only useful because they are a means of providing Stack Components a better and easier way of accessing external resources. - -If you are looking for a quick, assisted tour, we recommend using the interactive CLI mode to connect a Stack Component to a compatible Service Connector, especially if this is your first time doing it, e.g.: - -```bash -zenml artifact-store connect -i - -zenml orchestrator connect -i - -zenml container-registry connect -i -``` - -To connect a Stack Component to an external resource or service, you first need to [register one or more Service Connectors](/develop/usage/resource-data-management/auth-management/service-connectors-guide#register-service-connectors), or have someone else in your team with more infrastructure knowledge do it for you. If you already have that covered, you might want to ask ZenML "which resources/services am I even authorized to access with the available Service Connectors?". [The resource discovery feature](/develop/usage/resource-data-management/auth-management/service-connectors-guide#end-to-end-examples) is designed exactly for this purpose. This last check is already included in the interactive ZenML CLI command used to connect a Stack Component to a remote resource. - - - Not all Stack Components support being connected to an external resource or service via a Service Connector. Whether a Stack Component can use a Service Connector to connect to a remote resource or service or not is shown in the Stack Component flavor details: - - -```bash -$ zenml artifact-store flavor describe s3 - -Configuration class: S3ArtifactStoreConfig - -Configuration for the S3 Artifact Store. - -[...] - -This flavor supports connecting to external resources with a Service - -Connector. It requires a 's3-bucket' resource. You can get a list of - -all available connectors and the compatible resources that they can - -access by running: - -'zenml service-connector list-resources --resource-type s3-bucket' - -If no compatible Service Connectors are yet registered, you can can - -register a new one by running: - -'zenml service-connector register -i' - -``` - - -For Stack Components that do support Service Connectors, their flavor indicates the Resource Type and, optionally, Service Connector Type compatible with the Stack Component. This can be used to figure out which resources are available and which Service Connectors can grant access to them. 
In some cases it is even possible to figure out the exact Resource Name based on the attributes already configured in the Stack Component, which is how ZenML can decide automatically which Resource Name to use in the interactive mode: - -```bash -zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://zenfiles - -zenml service-connector list-resources --resource-type s3-bucket --resource-id s3://zenfiles - -zenml artifact-store connect s3-zenfiles --connector aws-multi-type -``` - -```bash Example Command Output - -$ zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://zenfiles - -Running with active stack: 'default' (global) - -Successfully registered artifact_store `s3-zenfiles`. - -$ zenml service-connector list-resources --resource-type s3-bucket --resource-id zenfiles - -The 's3-bucket' resource with name 'zenfiles' can be accessed by service connectors that you have configured: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 4a550c82-aa64-4a48-9c7f-d5e127d77a44 │ aws-multi-type │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 66c0922d-db84-4e2c-9044-c13ce1611613 │ aws-multi-instance │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┠──────────────────────────────────────┼──────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 65c82e59-cba0-4a01-b8f6-d75e8a1d0f55 │ aws-single-instance │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - -$ zenml artifact-store connect s3-zenfiles --connector aws-multi-type - -Running with active stack: 'default' (global) - -Successfully connected artifact store `s3-zenfiles` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 4a550c82-aa64-4a48-9c7f-d5e127d77a44 │ aws-multi-type │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -The following is an example of connecting the same Stack Component to the remote resource using the interactive CLI mode: - - -``` -zenml artifact-store connect s3-zenfiles -i -``` - -```bash Example Command Output -The following connectors have compatible resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 373a73c2-8295-45d4-a768-45f5a0f744ea │ aws-multi-type │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ fa9325ab-ce01-4404-aec3-61a3af395d48 │ aws-s3-multi-instance │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - 
-┠──────────────────────────────────────┼───────────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 19edc05b-92db-49de-bc84-aa9b3fb8261a │ aws-s3-zenfiles │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - -Please enter the name or ID of the connector you want to use: aws-s3-zenfiles - -Successfully connected artifact store `s3-zenfiles` to the following resources: - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ - -┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃ - -┠──────────────────────────────────────┼─────────────────┼────────────────┼───────────────┼────────────────┨ - -┃ 19edc05b-92db-49de-bc84-aa9b3fb8261a │ aws-s3-zenfiles │ 🔶 aws │ 📦 s3-bucket │ s3://zenfiles ┃ - -┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ -``` - -## End-to-end examples - -To get an idea of what a complete end-to-end journey looks like, from registering Service Connector all the way to configuring Stacks and Stack Components and running pipelines that access remote resources through Service Connectors, take a look at the following full-fledged examples: - -* [the AWS Service Connector end-to-end examples](/develop/usage/resource-data-management/auth-management/aws-service-connector) -* [the GCP Service Connector end-to-end examples](/develop/usage/resource-data-management/auth-management/gcp-service-connector) -* [the Azure Service Connector end-to-end examples](/develop/usage/resource-data-management/auth-management/azure-service-connector) - diff --git a/develop/usage/resource-data-management/connecting-to-zenml/connect-in-with-your-user-interactive.mdx b/develop/usage/resource-data-management/connecting-to-zenml/connect-in-with-your-user-interactive.mdx deleted file mode 100644 index 987b23c..0000000 --- a/develop/usage/resource-data-management/connecting-to-zenml/connect-in-with-your-user-interactive.mdx +++ /dev/null @@ -1,45 +0,0 @@ ---- -title: "Connect in with your User (interactive)" -description: You can authenticate your clients with the ZenML Server using the ZenML CLI and the web based login. -icon: users-viewfinder ---- - -This can be executed with the command: - -```bash -zenml connect --url https://... -``` - -This command will start a series of steps to validate the device from where you are connecting that will happen in your browser. You can choose whether to mark your respective device as trusted or not. If you choose not to click `Trust this device`, a 24-hour token will be issued for authentication services. Choosing to trust the device will issue a 30-day token instead. - -To see all devices you've permitted, use the following command: - -```bash -zenml authorized-device list -``` - -Additionally, the following command allows you to more precisely inspect one of these devices: - -```bash -zenml authorized-device describe -``` - -For increased security, you can invalidate a token using the `zenml device lock` command followed by the device ID. This helps provide an extra layer of security and control over your devices. - -```bash -zenml authorized-device lock -``` - -To keep things simple, we can summarize the steps: - -1. Use the `zenml connect --url` command to start a device flow and connect to a zenml server. -2. Choose whether to trust the device when prompted. -3. 
Check permitted devices with `zenml authorized-device list`.
-4. Invalidate a token with `zenml authorized-device lock ...`.
-
-### Important notice
-
-Using the ZenML CLI is a secure and comfortable way to interact with your ZenML tenants. It's important to always ensure that only trusted devices are used to maintain security and privacy.
-
-Don't forget to manage your device trust levels regularly for optimal security. Should you feel a device's trust needs to be revoked, lock the device immediately. Every token issued is a potential gateway to access your data, secrets and infrastructure.
-
diff --git a/develop/usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account.mdx b/develop/usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account.mdx
deleted file mode 100644
index 22ea8ca..0000000
--- a/develop/usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account.mdx
+++ /dev/null
@@ -1,78 +0,0 @@
----
-title: "Connect with a Service Account"
-description: Sometimes you may need to authenticate to a ZenML server from a non-interactive environment where the web login is not possible, like a CI/CD workload or a serverless function.
-icon: circle-user
----
-
-In these cases, you can configure a service account and an API key and use the API key to authenticate to the ZenML server:
-
-```bash
-zenml service-account create <SERVICE_ACCOUNT_NAME>
-```
-
-This command creates a service account and an API key for it. The API key is displayed as part of the command output and cannot be retrieved later. You can then use the issued API key to connect your ZenML client to the server through one of the following methods:
-
-* using the CLI:
-
-```bash
-zenml connect --url https://... --api-key <API_KEY>
-```
-
-* setting the `ZENML_STORE_URL` and `ZENML_STORE_API_KEY` environment variables when you set up your ZenML client for the first time. This method is particularly useful when you are using the ZenML client in an automated CI/CD workload environment like GitHub Actions or GitLab CI or in a containerized environment like Docker or Kubernetes:
-
-```bash
-export ZENML_STORE_URL=https://...
-export ZENML_STORE_API_KEY=<API_KEY>
-```
-
-To see all the service accounts you've created and their API keys, use the following commands:
-
-```bash
-zenml service-account list
-zenml service-account api-key <SERVICE_ACCOUNT_NAME> list
-```
-
-Additionally, the following commands allow you to more precisely inspect one of these service accounts and an API key:
-
-```bash
-zenml service-account describe <SERVICE_ACCOUNT_NAME>
-zenml service-account api-key <SERVICE_ACCOUNT_NAME> describe <API_KEY_NAME>
-```
-
-API keys don't have an expiration date. For increased security, we recommend that you regularly rotate the API keys to prevent unauthorized access to your ZenML server. You can do this with the ZenML CLI:
-
-```bash
-zenml service-account api-key <SERVICE_ACCOUNT_NAME> rotate <API_KEY_NAME>
-```
-
-Running this command will create a new API key and invalidate the old one. The new API key is displayed as part of the command output and cannot be retrieved later. You can then use the new API key to connect your ZenML client to the server just as described above.
-
-When rotating an API key, you can also configure a retention period for the old API key. This is useful if you need to keep the old API key for a while to ensure that all your workloads have been updated to use the new API key. You can do this with the `--retain` flag.
For example, to rotate an API key and keep the old one for 60 minutes, you can run the following command:
-
-```bash
-zenml service-account api-key <SERVICE_ACCOUNT_NAME> rotate <API_KEY_NAME> \
-    --retain 60
-```
-
-For increased security, you can deactivate a service account or an API key using one of the following commands:
-
-```bash
-zenml service-account update <SERVICE_ACCOUNT_NAME> --active false
-zenml service-account api-key <SERVICE_ACCOUNT_NAME> update <API_KEY_NAME> \
-    --active false
-```
-
-Deactivating a service account or an API key will prevent it from being used to authenticate and has immediate effect on all workloads that use it.
-
-To keep things simple, we can summarize the steps:
-
-1. Use the `zenml service-account create` command to create a service account and an API key.
-2. Use the `zenml connect --url <URL> --api-key <API_KEY>` command to connect your ZenML client to the server using the API key.
-3. Check configured service accounts with `zenml service-account list`.
-4. Check configured API keys with `zenml service-account api-key list`.
-5. Regularly rotate API keys with `zenml service-account api-key rotate`.
-6. Deactivate service accounts or API keys with `zenml service-account update` or `zenml service-account api-key update`.
-
-### Important notice
-
-Every API key issued is a potential gateway to access your data, secrets and infrastructure. It's important to regularly rotate API keys and deactivate or delete service accounts and API keys that are no longer needed.
diff --git a/develop/usage/resource-data-management/connecting-to-zenml/connecting-to-zenml.mdx b/develop/usage/resource-data-management/connecting-to-zenml/connecting-to-zenml.mdx
deleted file mode 100644
index 031fac2..0000000
--- a/develop/usage/resource-data-management/connecting-to-zenml/connecting-to-zenml.mdx
+++ /dev/null
@@ -1,7 +0,0 @@
----
-title: "Overview"
-description: "Various means of connecting to ZenML."
-icon: bolt
----
-
-Once [ZenML is deployed](/develop/user-guide/guides/production-guide/deploying-zenml), there are various ways to connect to it.
diff --git a/develop/usage/resource-data-management/handle-data-artifacts/artifact-versioning.mdx b/develop/usage/resource-data-management/handle-data-artifacts/artifact-versioning.mdx
deleted file mode 100644
index a29ea6c..0000000
--- a/develop/usage/resource-data-management/handle-data-artifacts/artifact-versioning.mdx
+++ /dev/null
@@ -1,47 +0,0 @@
----
-title: "How ZenML stores data"
-description: "Understand how ZenML stores your data under-the-hood."
-icon: box
----
-
-ZenML seamlessly integrates data versioning and lineage into its core functionality. When a pipeline is executed, each run generates automatically tracked and managed artifacts. One can easily view the entire lineage of how artifacts are created and interact with them. The dashboard is also a way to interact with the artifacts produced by different pipeline runs. ZenML's artifact management, caching, lineage tracking, and visualization capabilities can help you gain valuable insights, streamline the experimentation process, and ensure the reproducibility and reliability of machine learning workflows.
-
-## Artifact Creation and Caching
-
-Each time a ZenML pipeline runs, the system first checks if there have been any changes in the inputs, outputs, parameters, or configuration of the pipeline steps. Each step in a run gets a new directory in the artifact store:
-
-
-
-Suppose a step is new or has been modified.
In that case, ZenML creates a new directory structure in the [Artifact Store](/develop/stack-components/artifact-stores) with a unique ID and stores the data using the appropriate materializers in this directory. - - - - - - -On the other hand, if the step remains unchanged, ZenML intelligently decides whether to cache the step or not. By caching steps that have not been modified, ZenML can save [valuable time and computational resources](/develop/user-guide/guides/starter-guide/cache-previous-executions), allowing you to focus on experimenting with different configurations and improving your machine-learning models without the need to rerun unchanged parts of your pipeline. - -With ZenML, you can easily trace an artifact back to its origins and understand the exact sequence of executions that led to its creation, such as a trained model. This feature enables you to gain insights into the entire lineage of your artifacts, providing a clear understanding of how your data has been processed and transformed throughout your machine-learning pipelines. With ZenML, you can ensure the reproducibility of your results, and identify potential issues or bottlenecks in your pipelines. This level of transparency and traceability is essential for maintaining the reliability and trustworthiness of machine learning projects, especially when working in a team or across different environments. - -For more details on how to adjust the names or versions assigned to your artifacts, assign tags to them, or adjust other artifact properties, see the [documentation on artifact versioning and configuration](/develop/user-guide/guides/starter-guide/manage-artifacts). - -By tracking the lineage of artifacts across environments and stacks, ZenML enables ML engineers to reproduce results and understand the exact steps taken to create a model. This is crucial for ensuring the reliability and reproducibility of machine learning models, especially when working in a team or across different environments. - -## Saving and Loading Artifacts with Materializers - -[Materializers](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) play a crucial role in ZenML's artifact management system. They are responsible for handling the serialization and deserialization of artifacts, ensuring that data is consistently stored and retrieved from the [artifact store](/develop/stack-components/artifact-stores). Each materializer stores data flowing through a pipeline in one or more files within a unique directory in the artifact store: - - - - -Materializers are designed to be extensible and customizable, allowing you to define your own serialization and deserialization logic for specific data types or storage systems. By default, ZenML provides built-in materializers for common data types and uses `cloudpickle` to pickle objects where there is no default materializer. If you want direct control over how objects are serialized, you can easily create custom materializers by extending the `BaseMaterializer` class and implementing the required methods for your specific use case. Read more about materializers [here](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types). - - - ZenML provides a built-in [CloudpickleMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.cloudpickle%5Fmaterializer.CloudpickleMaterializer) that can handle any object by saving it with [cloudpickle](https://github.com/cloudpipe/cloudpickle). 
However, this is not production-ready because the resulting artifacts cannot be loaded when running with a different Python version. In such cases, you should consider building a [custom Materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types#custom-materializers) to save your objects in a more robust and efficient format.
-
-
-Moreover, using the `CloudpickleMaterializer` could allow users to upload any kind of object. This could be exploited to upload a malicious file, which could execute arbitrary code on the vulnerable system.
-
-When a pipeline runs, ZenML uses the appropriate materializers to save and load artifacts using the ZenML `fileio` system (built to work across multiple artifact stores). This not only simplifies the process of working with different data formats and storage systems but also enables artifact caching and lineage tracking. You can see an example of a default materializer (the `numpy` materializer) in action [here](https://github.com/zenml-io/zenml/blob/main/src/zenml/materializers/numpy%5Fmaterializer.py).
-
diff --git a/develop/usage/resource-data-management/handle-data-artifacts/delete-an-artifact.mdx b/develop/usage/resource-data-management/handle-data-artifacts/delete-an-artifact.mdx
deleted file mode 100644
index 524746e..0000000
--- a/develop/usage/resource-data-management/handle-data-artifacts/delete-an-artifact.mdx
+++ /dev/null
@@ -1,17 +0,0 @@
----
-title: "Delete an artifact"
-description: "Learn how to delete artifacts."
-icon: trash
----
-
-There is currently no way to delete an artifact directly, because it may lead to a broken state of the ZenML database (dangling references to pipeline runs that produce artifacts).
-
-However, it is possible to delete artifacts that are no longer referenced by any pipeline runs:
-
-```bash
-zenml artifact prune
-```
-
-By default, this command physically deletes the artifacts from the underlying [artifact store](/develop/stack-components/artifact-stores) AND removes their entries from the database. You can control this behavior by using the `--only-artifact` and `--only-metadata` flags.
-
-You might find that some artifacts throw errors when you try to prune them, likely because they were stored locally and no longer exist. If you wish to continue pruning and to ignore these errors, please add the `--ignore-errors` flag. Warning messages will still be output to the terminal during this process.
diff --git a/develop/usage/resource-data-management/handle-data-artifacts/get-arbitrary-artifacts-in-a-step.mdx b/develop/usage/resource-data-management/handle-data-artifacts/get-arbitrary-artifacts-in-a-step.mdx
deleted file mode 100644
index 5755c12..0000000
--- a/develop/usage/resource-data-management/handle-data-artifacts/get-arbitrary-artifacts-in-a-step.mdx
+++ /dev/null
@@ -1,25 +0,0 @@
----
-title: "Get arbitrary artifacts in a step"
-description: "Not all artifacts need to come through the step interface from direct upstream steps."
-icon: stairs
----
-
-As described in [the metadata guide](/develop/usage/resource-data-management/track-metrics-metadata/logging-metadata), metadata can be fetched with the client, and the same approach can be used to fetch artifacts from within a step. This allows you to fetch artifacts from other upstream steps or even completely different pipelines.
- -```py -from zenml.client import Client -from zenml import step - -@step -def my_step(): - client = Client() - # Directly fetch an artifact - output = client.get_artifact_version("my_dataset", "my_version") - output.run_metadata["accuracy"].value -``` - -This is one of the ways you can access artifacts that have already been created and stored in the artifact store. This can be useful when you want to use artifacts from other pipelines or steps that are not directly upstream. - -## See Also - -* [Managing artifacts](/develop/user-guide/guides/starter-guide/manage-artifacts) \- learn about the `ExternalArtifact` type and how to pass artifacts between steps. diff --git a/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types.mdx b/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types.mdx deleted file mode 100644 index 57c1ac0..0000000 --- a/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types.mdx +++ /dev/null @@ -1,640 +0,0 @@ ---- -title: "Handle custom data types" -description: "Using materializers to pass custom data types through steps." -icon: database ---- - -A ZenML pipeline is built in a data-centric way. The outputs and inputs of steps define how steps are connected and the order in which they are executed. Each step should be considered as its very own process that reads and writes its inputs and outputs from and to the [artifact store](/develop/stack-components/artifact-stores). This is where **materializers** come into play. - -A materializer dictates how a given artifact can be written to and retrieved from the artifact store and also contains all serialization and deserialization logic. Whenever you pass artifacts as outputs from one pipeline step to other steps as inputs, the corresponding materializer for the respective data type defines how this artifact is first serialized and written to the artifact store, and then deserialized and read in the next step. - -## Built-In Materializers - -ZenML already includes built-in materializers for many common data types. 
These are always enabled and are used in the background without requiring any user interaction / activation: - -| Materializer | Handled Data Types | Storage Format | -| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------- | --------------------------------------- | -| [BuiltInMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.built%5Fin%5Fmaterializer.BuiltInMaterializer) | bool, float, int, str, None | .json | -| [BytesInMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.built%5Fin%5Fmaterializer.BytesMaterializer) | bytes | .txt | -| [BuiltInContainerMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.built%5Fin%5Fmaterializer.BuiltInContainerMaterializer) | dict, list, set, tuple | Directory | -| [NumpyMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.numpy%5Fmaterializer.NumpyMaterializer) | np.ndarray | .npy | -| [PandasMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.pandas%5Fmaterializer.PandasMaterializer) | pd.DataFrame, pd.Series | .csv (or .gzip if parquet is installed) | -| [PydanticMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.pydantic%5Fmaterializer.PydanticMaterializer) | pydantic.BaseModel | .json | -| [ServiceMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.service%5Fmaterializer.ServiceMaterializer) | zenml.services.service.BaseService | .json | -| [StructuredStringMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.structured%5Fstring%5Fmaterializer.StructuredStringMaterializer) | zenml.types.CSVString, zenml.types.HTMLString, zenml.types.MarkdownString | .csv / .html / .md (depending on type) | - - -ZenML provides a built-in [CloudpickleMaterializer](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-materializers/#zenml.materializers.cloudpickle%5Fmaterializer.CloudpickleMaterializer) that can handle any object by saving it with [cloudpickle](https://github.com/cloudpipe/cloudpickle). However, this is not production-ready because the resulting artifacts cannot be loaded when running with a different Python version. In such cases, you should consider building a [custom Materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types#custom-materializers) to save your objects in a more robust and efficient format. - -Moreover, using the `CloudpickleMaterializer` could allow users to upload of any kind of object. This could be exploited to upload a malicious file, which could execute arbitrary code on the vulnerable system. 
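-
-As long as a step's outputs are annotated with one of the data types listed above, the matching built-in materializer is picked up automatically and no extra configuration is needed. The following minimal sketch (the step and pipeline names are purely illustrative, and it assumes `pandas` is installed) returns a `dict` and a `pd.DataFrame`, which are handled by the `BuiltInContainerMaterializer` and the `PandasMaterializer` respectively:
-
-```py
-from typing import Dict, Tuple
-
-import pandas as pd
-from typing_extensions import Annotated
-
-from zenml import pipeline, step
-
-
-@step
-def make_outputs() -> Tuple[
-    Annotated[Dict[str, float], "metrics"],
-    Annotated[pd.DataFrame, "predictions"],
-]:
-    """Both outputs are covered by built-in materializers, so nothing needs to be configured."""
-    metrics = {"accuracy": 0.93, "f1": 0.91}
-    predictions = pd.DataFrame({"id": [1, 2, 3], "pred": [0, 1, 1]})
-    return metrics, predictions
-
-
-@step
-def consume(metrics: Dict[str, float], predictions: pd.DataFrame) -> None:
-    """Inputs are loaded back from the artifact store by the same materializers."""
-    print(metrics["accuracy"], len(predictions))
-
-
-@pipeline
-def builtin_materializer_pipeline():
-    metrics, predictions = make_outputs()
-    consume(metrics, predictions)
-
-
-builtin_materializer_pipeline()
-```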
- - -## Integration Materializers - -In addition to the built-in materializers, ZenML also provides several integration-specific materializers that can be activated by installing the respective [integration](/develop/stack-components/component-guide): - -| Integration | Materializer | Handled Data Types | Storage Format | -| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------- | ----------------- | -| bentoml | [BentoMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-bentoml/#zenml.integrations.bentoml.materializers.bentoml%5Fbento%5Fmaterializer.BentoMaterializer) | bentoml.Bento | .bento | -| deepchecks | [DeepchecksResultMateriailzer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-deepchecks/#zenml.integrations.deepchecks.materializers.deepchecks%5Fresults%5Fmaterializer.DeepchecksResultMaterializer) | deepchecks.CheckResult, deepchecks.SuiteResult | .json | -| evidently | [EvidentlyProfileMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-evidently/#zenml.integrations.evidently.materializers.evidently%5Fprofile%5Fmaterializer.EvidentlyProfileMaterializer) | evidently.Profile | .json | -| great\_expectations | [GreatExpectationsMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-great%5Fexpectations/#zenml.integrations.great%5Fexpectations.materializers.ge%5Fmaterializer.GreatExpectationsMaterializer) | great\_expectations.ExpectationSuite, great\_expectations.CheckpointResult | .json | -| huggingface | [HFDatasetMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-huggingface/#zenml.integrations.huggingface.materializers.huggingface%5Fdatasets%5Fmaterializer.HFDatasetMaterializer) | datasets.Dataset, datasets.DatasetDict | Directory | -| huggingface | [HFPTModelMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-huggingface/#zenml.integrations.huggingface.materializers.huggingface%5Fpt%5Fmodel%5Fmaterializer.HFPTModelMaterializer) | transformers.PreTrainedModel | Directory | -| huggingface | [HFTFModelMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-huggingface/#zenml.integrations.huggingface.materializers.huggingface%5Ftf%5Fmodel%5Fmaterializer.HFTFModelMaterializer) | transformers.TFPreTrainedModel | Directory | -| huggingface | [HFTokenizerMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-huggingface/#zenml.integrations.huggingface.materializers.huggingface%5Ftokenizer%5Fmaterializer.HFTokenizerMaterializer) | transformers.PreTrainedTokenizerBase | Directory | -| lightgbm | [LightGBMBoosterMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-lightgbm/#zenml.integrations.lightgbm.materializers.lightgbm%5Fbooster%5Fmaterializer.LightGBMBoosterMaterializer) | lgbm.Booster | .txt | -| lightgbm | [LightGBMDatasetMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-lightgbm/#zenml.integrations.lightgbm.materializers.lightgbm%5Fdataset%5Fmaterializer.LightGBMDatasetMaterializer) | lgbm.Dataset | .binary | -| neural\_prophet | 
[NeuralProphetMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-neural%5Fprophet/#zenml.integrations.neural%5Fprophet.materializers.neural%5Fprophet%5Fmaterializer.NeuralProphetMaterializer) | NeuralProphet | .pt | -| pillow | [PillowImageMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-pillow/#zenml.integrations.pillow.materializers.pillow%5Fimage%5Fmaterializer.PillowImageMaterializer) | Pillow.Image | .PNG | -| polars | [PolarsMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-polars/#zenml.integrations.polars.materializers.dataframe%5Fmaterializer.PolarsMaterializer) | pl.DataFrame, pl.Series | .parquet | -| pycaret | [PyCaretMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-pycaret/#zenml.integrations.pycaret.materializers.model%5Fmaterializer.PyCaretMaterializer) | Any sklearn, xgboost, lightgbm or catboost model | .pkl | -| pytorch | [PyTorchDataLoaderMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-pytorch/#zenml.integrations.pytorch.materializers.pytorch%5Fdataloader%5Fmaterializer.PyTorchDataLoaderMaterializer) | torch.Dataset, torch.DataLoader | .pt | -| pytorch | [PyTorchModuleMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-pytorch/#zenml.integrations.pytorch.materializers.pytorch%5Fmodule%5Fmaterializer.PyTorchModuleMaterializer) | torch.Module | .pt | -| scipy | [SparseMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-scipy/#zenml.integrations.scipy.materializers.sparse%5Fmaterializer.SparseMaterializer) | scipy.spmatrix | .npz | -| spark | [SparkDataFrameMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-spark/#zenml.integrations.spark.materializers.spark%5Fdataframe%5Fmaterializer.SparkDataFrameMaterializer) | pyspark.DataFrame | .parquet | -| spark | [SparkModelMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-spark/#zenml.integrations.spark.materializers.spark%5Fmodel%5Fmaterializer.SparkModelMaterializer) | pyspark.Transformer | pyspark.Estimator | -| tensorflow | [KerasMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-tensorflow/#zenml.integrations.tensorflow.materializers.keras%5Fmaterializer.KerasMaterializer) | tf.keras.Model | Directory | -| tensorflow | [TensorflowDatasetMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-tensorflow/#zenml.integrations.tensorflow.materializers.tf%5Fdataset%5Fmaterializer.TensorflowDatasetMaterializer) | tf.Dataset | Directory | -| whylogs | [WhylogsMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-whylogs/#zenml.integrations.whylogs.materializers.whylogs%5Fmaterializer.WhylogsMaterializer) | whylogs.DatasetProfileView | .pb | -| xgboost | [XgboostBoosterMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-xgboost/#zenml.integrations.xgboost.materializers.xgboost%5Fbooster%5Fmaterializer.XgboostBoosterMaterializer) | xgb.Booster | .json | -| xgboost | [XgboostDMatrixMaterializer](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-xgboost/#zenml.integrations.xgboost.materializers.xgboost%5Fdmatrix%5Fmaterializer.XgboostDMatrixMaterializer) | xgb.DMatrix | .binary | - - - If you are running pipelines with a Docker-based 
[orchestrator](/develop/stack-components/orchestrators), you need to specify the corresponding integration as `required_integrations` in the `DockerSettings` of your pipeline in order to have the integration materializer available inside your Docker container. See the [pipeline configuration documentation](/develop/usage/project-setup/use-configuration-files/runtime-configuration) for more information. - - -## Custom materializers - -### Configuring a step/pipeline to use a custom materializer - -#### Defining which step uses what materializer - -ZenML automatically detects if your materializer is imported in your source code and registers them for the corresponding data type (defined in `ASSOCIATED_TYPES`). Therefore, just having a custom materializer definition in your code is enough to enable the respective data type to be used in your pipelines. - -However, it is best practice to explicitly define which materializer to use for a specific step and not rely on the `ASSOCIATED_TYPES` to make that connection: - - -```py -class MyObj: - ... - -class MyMaterializer(BaseMaterializer): - """Materializer to read data to and from MyObj.""" - - ASSOCIATED_TYPES = (MyObj) - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA - - # Read below to learn how to implement this materializer - -# You can define it at the decorator level -@step(output_materializers=MyMaterializer) -def my_first_step() -> MyObj: - return 1 - -# No need to explicitly specify materializer here: -# it is coupled with Artifact Version generated by -# `my_first_step` already. -def my_second_step(a: MyObj): - print(a) - -# or you can use the `configure()` method of the step. E.g.: -my_first_step.configure(output_materializers=MyMaterializer) -``` - -When there are multiple outputs, a dictionary of type `{: }` can be supplied to the decorator or the `.configure(...)` method: - -```py -class MyObj1: - ... - -class MyObj2: - ... - -class MyMaterializer1(BaseMaterializer): - """Materializer to read data to and from MyObj1.""" - - ASSOCIATED_TYPES = (MyObj1) - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA - -class MyMaterializer2(BaseMaterializer): - """Materializer to read data to and from MyObj2.""" - - ASSOCIATED_TYPES = (MyObj2) - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA - -# This is where we connect the objects to the materializer -@step(output_materializers={"1": MyMaterializer1, "2": MyMaterializer2}) -def my_first_step() -> Tuple[Annotated[MyObj1, "1"], Annotated[MyObj2, "2"]]: - return 1 -``` - -Also, as briefly outlined in the [configuration docs](/develop/usage/project-setup/use-configuration-files/what-can-be-configured) section, which materializer to use for the output of what step can also be configured within YAML config files. - -For each output of your steps, you can define custom materializers to handle the loading and saving. You can configure them like this in the config: - -```yaml -... -steps: - : - ... - outputs: - : - materializer_source: run.MyMaterializer -``` - -Check out [this page](/develop/user-guide/guides/starter-guide/manage-artifacts) for information on your step output names and how to customize them. - -#### Defining a materializer globally - -Sometimes, you would like to configure ZenML to use a custom materializer globally for all pipelines, and override the default materializers that come built-in with ZenML. A good example of this would be to build a materializer for a `pandas.DataFrame` to handle the reading and writing of that dataframe in a different way than the default mechanism. 
- -An easy way to do that is to use the internal materializer registry of ZenML and override its behavior: - -```yaml -# Entrypoint file where we run pipelines (i.e. run.py) - -from zenml.materializers.materializer_registry import materializer_registry - -# Create a new materializer -class FastPandasMaterializer(BaseMaterializer): - ... - -# Register the FastPandasMaterializer for pandas dataframes objects -materializer_registry.register_and_overwrite_type(key=pd.DataFrame, type_=FastPandasMaterializer) - -# Run your pipelines: They will now all use the custom materializer -``` - -### Developing a custom materializer - -Now that we know how to configure a pipeline to use a custom materializer, let us briefly discuss how materializers in general are implemented. - -#### Base implementation - -In the following, you can see the implementation of the abstract base class `BaseMaterializer`, which defines the interface of all materializers: - -```yaml - -class BaseMaterializer(metaclass=BaseMaterializerMeta): - """Base Materializer to realize artifact data.""" - - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.BASE - ASSOCIATED_TYPES = () - - def __init__( - self, uri: str, artifact_store: Optional[BaseArtifactStore] = None - ): - """Initializes a materializer with the given URI. - - Args: - uri: The URI where the artifact data will be stored. - artifact_store: The artifact store used to store this artifact. - """ - self.uri = uri - self._artifact_store = artifact_store - - def load(self, data_type: Type[Any]) -> Any: - """Write logic here to load the data of an artifact. - - Args: - data_type: The type of data that the artifact should be loaded as. - - Returns: - The data of the artifact. - """ - # read from a location inside self.uri - # - # Example: - # data_path = os.path.join(self.uri, "abc.json") - # with self.artifact_store.open(filepath, "r") as fid: - # return json.load(fid) - ... - - def save(self, data: Any) -> None: - """Write logic here to save the data of an artifact. - - Args: - data: The data of the artifact to save. - """ - # write `data` into self.uri - # - # Example: - # data_path = os.path.join(self.uri, "abc.json") - # with self.artifact_store.open(filepath, "w") as fid: - # json.dump(data,fid) - ... - - def save_visualizations(self, data: Any) -> Dict[str, VisualizationType]: - """Save visualizations of the given data. - - Args: - data: The data of the artifact to visualize. - - Returns: - A dictionary of visualization URIs and their types. - """ - # Optionally, define some visualizations for your artifact - # - # E.g.: - # visualization_uri = os.path.join(self.uri, "visualization.html") - # with self.artifact_store.open(visualization_uri, "w") as f: - # f.write("data") - - # visualization_uri_2 = os.path.join(self.uri, "visualization.png") - # data.save_as_png(visualization_uri_2) - - # return { - # visualization_uri: ArtifactVisualizationType.HTML, - # visualization_uri_2: ArtifactVisualizationType.IMAGE - # } - ... - - def extract_metadata(self, data: Any) -> Dict[str, "MetadataType"]: - """Extract metadata from the given data. - - This metadata will be tracked and displayed alongside the artifact. - - Args: - data: The data to extract metadata from. - - Returns: - A dictionary of metadata. - """ - # Optionally, extract some metadata from `data` for ZenML to store. - # - # Example: - # return { - # "some_attribute_i_want_to_track": self.some_attribute, - # "pi": 3.14, - # } - ... 
-``` - -#### Handled data types - -Each materializer has an `ASSOCIATED_TYPES` attribute that contains a list of data types that this materializer can handle. ZenML uses this information to call the right materializer at the right time. I.e., if a ZenML step returns a `pd.DataFrame`, ZenML will try to find any materializer that has `pd.DataFrame` in its `ASSOCIATED_TYPES`. List the data type of your custom object here to link the materializer to that data type. - -#### The type of the generated artifact - -Each materializer also has an `ASSOCIATED_ARTIFACT_TYPE` attribute, which defines what `zenml.enums.ArtifactType` is assigned to this data. - -In most cases, you should choose either `ArtifactType.DATA` or `ArtifactType.MODEL` here. If you are unsure, just use `ArtifactType.DATA`. The exact choice is not too important, as the artifact type is only used as a tag in some of ZenML's visualizations. - -#### Target location to store the artifact - -Each materializer has a `uri` attribute, which is automatically created by ZenML whenever you run a pipeline and points to the directory of a file system where the respective artifact is stored (some location in the artifact store). - -#### Storing and retrieving the artifact - -The `load()` and `save()` methods define the serialization and deserialization of artifacts. - -* `load()` defines how data is read from the artifact store and deserialized, -* `save()` defines how data is serialized and saved to the artifact store. - -You will need to override these methods according to how you plan to serialize your objects. E.g., if you have custom PyTorch classes as `ASSOCIATED_TYPES`, then you might want to use `torch.save()` and `torch.load()` here. - -#### (Optional) How to Visualize the Artifact - -Optionally, you can override the `save_visualizations()` method to automatically save visualizations for all artifacts saved by your materializer. These visualizations are then shown next to your artifacts in the dashboard: - - - - -Currently, artifacts can be visualized either as CSV table, embedded HTML, image or Markdown. For more information, see [zenml.enums.VisualizationType](https://github.com/zenml-io/zenml/blob/main/src/zenml/enums.py). - -To create visualizations, you need to: - -1. Compute the visualizations based on the artifact -2. Save all visualizations to paths inside `self.uri` -3. Return a dictionary mapping visualization paths to visualization types. - -As an example, check out the implementation of the [zenml.materializers.NumpyMaterializer](https://github.com/zenml-io/zenml/blob/main/src/zenml/materializers/numpy%5Fmaterializer.py) that use matplotlib to automatically save or plot certain arrays. - -Read more about visualizations [here](/develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations). - -#### (Optional) Which Metadata to Extract for the Artifact - -Optionally, you can override the `extract_metadata()` method to track custom metadata for all artifacts saved by your materializer. Anything you extract here will be displayed in the dashboard next to your artifacts. - -To extract metadata, define and return a dictionary of values you want to track. The only requirement is that all your values are built-in types ( like `str`, `int`, `list`, `dict`, ...) or among the special types defined in [zenml.metadata.metadata\_types](https://github.com/zenml-io/zenml/blob/main/src/zenml/metadata/metadata%5Ftypes.py) that are displayed in a dedicated way in the dashboard. 
See [zenml.metadata.metadata\_types.MetadataType](https://github.com/zenml-io/zenml/blob/main/src/zenml/metadata/metadata%5Ftypes.py) for more details.
-
-By default, this method will only extract the storage size of an artifact, but you can override it to track anything you wish. E.g., the [zenml.materializers.NumpyMaterializer](https://github.com/zenml-io/zenml/blob/main/src/zenml/materializers/numpy%5Fmaterializer.py) overrides this method to track the `shape`, `dtype`, and some statistical properties of each `np.ndarray` that it saves.
-
-
- If you would like to disable artifact visualization altogether, you can set `enable_artifact_visualization` at either pipeline or step level via `@pipeline(enable_artifact_visualization=False)` or `@step(enable_artifact_visualization=False)`.
-
-
- If you would like to disable artifact metadata extraction altogether, you can set `enable_artifact_metadata` at either pipeline or step level via `@pipeline(enable_artifact_metadata=False)` or `@step(enable_artifact_metadata=False)`.
-
-
-## Skipping materialization
-
-You can learn more about skipping materialization [here](/develop/usage/resource-data-management/handle-data-artifacts/unmaterialized-artifacts).
-
-## Interaction with custom artifact stores
-
-When creating a custom artifact store, you may encounter a situation where the default materializers do not function properly. Specifically, the `self.artifact_store.open` method used in these materializers may not be compatible with your custom store if it is not implemented correctly there.
-
-In this case, you can create a modified version of the failing materializer by copying it and modifying it to copy the artifact to a local path, then opening it from there. For example, consider the following implementation of a custom [PandasMaterializer](https://github.com/zenml-io/zenml/blob/main/src/zenml/materializers/pandas%5Fmaterializer.py) that works with a custom artifact store. In this implementation, we copy the artifact to a local path because we want to use the `pandas.read_csv` method to read it. If we were to use the `self.artifact_store.open` method instead, we would not need to make this copy.
-
-
- It is worth noting that copying the artifact to a local path may not always be necessary and can potentially be a performance bottleneck.
- - - ```py - import os -from typing import Any, ClassVar, Dict, Optional, Tuple, Type, Union - -import pandas as pd - -from zenml.artifact_stores.base_artifact_store import BaseArtifactStore -from zenml.enums import ArtifactType, VisualizationType -from zenml.logger import get_logger -from zenml.materializers.base_materializer import BaseMaterializer -from zenml.metadata.metadata_types import DType, MetadataType - -logger = get_logger(__name__) - -PARQUET_FILENAME = "df.parquet.gzip" -COMPRESSION_TYPE = "gzip" - -CSV_FILENAME = "df.csv" - - -class PandasMaterializer(BaseMaterializer): - """Materializer to read data to and from pandas.""" - - ASSOCIATED_TYPES: ClassVar[Tuple[Type[Any], ...]] = ( - pd.DataFrame, - pd.Series, - ) - ASSOCIATED_ARTIFACT_TYPE: ClassVar[ArtifactType] = ArtifactType.DATA - - def __init__( - self, uri: str, artifact_store: Optional[BaseArtifactStore] = None - ): - """Define `self.data_path`. - - Args: - uri: The URI where the artifact data is stored. - artifact_store: The artifact store where the artifact data is stored. - """ - super().__init__(uri, artifact_store) - try: - import pyarrow # type: ignore # noqa - - self.pyarrow_exists = True - except ImportError: - self.pyarrow_exists = False - logger.warning( - "By default, the `PandasMaterializer` stores data as a " - "`.csv` file. If you want to store data more efficiently, " - "you can install `pyarrow` by running " - "'`pip install pyarrow`'. This will allow `PandasMaterializer` " - "to automatically store the data as a `.parquet` file instead." - ) - finally: - self.parquet_path = os.path.join(self.uri, PARQUET_FILENAME) - self.csv_path = os.path.join(self.uri, CSV_FILENAME) - - def load(self, data_type: Type[Any]) -> Union[pd.DataFrame, pd.Series]: - """Reads `pd.DataFrame` or `pd.Series` from a `.parquet` or `.csv` file. - - Args: - data_type: The type of the data to read. - - Raises: - ImportError: If pyarrow or fastparquet is not installed. - - Returns: - The pandas dataframe or series. - """ - if self.artifact_store.exists(self.parquet_path): - if self.pyarrow_exists: - with self.artifact_store.open( - self.parquet_path, mode="rb" - ) as f: - df = pd.read_parquet(f) - else: - raise ImportError( - "You have an old version of a `PandasMaterializer` " - "data artifact stored in the artifact store " - "as a `.parquet` file, which requires `pyarrow` " - "for reading, You can install `pyarrow` by running " - "'`pip install pyarrow fastparquet`'." - ) - else: - with self.artifact_store.open(self.csv_path, mode="rb") as f: - df = pd.read_csv(f, index_col=0, parse_dates=True) - - # validate the type of the data. - def is_dataframe_or_series( - df: Union[pd.DataFrame, pd.Series], - ) -> Union[pd.DataFrame, pd.Series]: - """Checks if the data is a `pd.DataFrame` or `pd.Series`. - - Args: - df: The data to check. - - Returns: - The data if it is a `pd.DataFrame` or `pd.Series`. - """ - if issubclass(data_type, pd.Series): - # Taking the first column if it is a series as the assumption - # is that there will only be one - assert len(df.columns) == 1 - df = df[df.columns[0]] - return df - else: - return df - - return is_dataframe_or_series(df) - - def save(self, df: Union[pd.DataFrame, pd.Series]) -> None: - """Writes a pandas dataframe or series to the specified filename. - - Args: - df: The pandas dataframe or series to write. 
- """ - if isinstance(df, pd.Series): - df = df.to_frame(name="series") - - if self.pyarrow_exists: - with self.artifact_store.open(self.parquet_path, mode="wb") as f: - df.to_parquet(f, compression=COMPRESSION_TYPE) - else: - with self.artifact_store.open(self.csv_path, mode="wb") as f: - df.to_csv(f, index=True) - -``` - - -## Code example - -Let's see how materialization works with a basic example. Let's say you have a custom class called `MyObject` that flows between two steps in a pipeline: - - -```py -import logging -from zenml import step, pipeline - - -class MyObj: - def __init__(self, name: str): - self.name = name - - -@step -def my_first_step() -> MyObj: - """Step that returns an object of type MyObj.""" - return MyObj("my_object") - - -@step -def my_second_step(my_obj: MyObj) -> None: - """Step that logs the input object and returns nothing.""" - logging.info( - f"The following object was passed to this step: `{my_obj.name}`" - ) - - -@pipeline -def first_pipeline(): - output_1 = my_first_step() - my_second_step(output_1) - - -first_pipeline() -``` - -Running the above without a custom materializer will work but print the following warning: - -`No materializer is registered for type MyObj, so the default Pickle materializer was used. Pickle is not production ready and should only be used for prototyping as the artifacts cannot be loaded when running with a different Python version. Please consider implementing a custom materializer for type MyObj` - -To get rid of this warning and make our pipeline more robust, we will subclass the `BaseMaterializer` class, listing `MyObj` in `ASSOCIATED_TYPES`, and overwriting `load()` and `save()`: - -```py -import os -from typing import Type - -from zenml.enums import ArtifactType -from zenml.materializers.base_materializer import BaseMaterializer - - -class MyMaterializer(BaseMaterializer): - ASSOCIATED_TYPES = (MyObj,) - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA - - def load(self, data_type: Type[MyObj]) -> MyObj: - """Read from artifact store.""" - with self.artifact_store.open(os.path.join(self.uri, 'data.txt'), 'r') as f: - name = f.read() - return MyObj(name=name) - - def save(self, my_obj: MyObj) -> None: - """Write to artifact store.""" - with self.artifact_store.open(os.path.join(self.uri, 'data.txt'), 'w') as f: - f.write(my_obj.name) -``` - - - Pro-tip: Use the `self.artifact_store` property to ensure your materialization logic works across artifact stores (local and remote like S3 buckets). - -Now, ZenML can use this materializer to handle the outputs and inputs of your customs object. Edit the pipeline as follows to see this in action: - -```bash -my_first_step.configure(output_materializers=MyMaterializer) -first_pipeline() -``` - - -Due to the typing of the inputs and outputs and the `ASSOCIATED_TYPES` attribute of the materializer, you won't necessarily have to add `.configure(output_materializers=MyMaterializer)` to the step. It should automatically be detected. It doesn't hurt to be explicit though. - -This will now work as expected and yield the following output: - - -```bash -Creating run for pipeline: `first_pipeline` -Cache enabled for pipeline `first_pipeline` -Using stack `default` to run pipeline `first_pipeline`... -Step `my_first_step` has started. -Step `my_first_step` has finished in 0.081s. -Step `my_second_step` has started. -The following object was passed to this step: `my_object` -Step `my_second_step` has finished in 0.048s. -Pipeline run `first_pipeline-22_Apr_22-10_58_51_135729` has finished in 0.153s. 
-``` - - -```py - -import logging -import os -from typing import Type - -from zenml import step, pipeline - -from zenml.enums import ArtifactType -from zenml.materializers.base_materializer import BaseMaterializer - - -class MyObj: - def __init__(self, name: str): - self.name = name - - -class MyMaterializer(BaseMaterializer): - ASSOCIATED_TYPES = (MyObj,) - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA - - def load(self, data_type: Type[MyObj]) -> MyObj: - """Read from artifact store.""" - with self.artifact_store.open(os.path.join(self.uri, 'data.txt'), 'r') as f: - name = f.read() - return MyObj(name=name) - - def save(self, my_obj: MyObj) -> None: - """Write to artifact store.""" - with self.artifact_store.open(os.path.join(self.uri, 'data.txt'), 'w') as f: - f.write(my_obj.name) - - -@step -def my_first_step() -> MyObj: - """Step that returns an object of type MyObj.""" - return MyObj("my_object") - - -my_first_step.configure(output_materializers=MyMaterializer) - - -@step -def my_second_step(my_obj: MyObj) -> None: - """Step that log the input object and returns nothing.""" - logging.info( - f"The following object was passed to this step: `{my_obj.name}`" - ) - - -@pipeline -def first_pipeline(): - output_1 = my_first_step() - my_second_step(output_1) - - -if __name__ == "__main__": - first_pipeline() -``` - diff --git a/develop/usage/resource-data-management/handle-data-artifacts/handle-data-artifacts.mdx b/develop/usage/resource-data-management/handle-data-artifacts/handle-data-artifacts.mdx deleted file mode 100644 index 760ea7d..0000000 --- a/develop/usage/resource-data-management/handle-data-artifacts/handle-data-artifacts.mdx +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Overview" -description: "Step outputs in ZenML are stored in the artifact store. This enables caching, lineage and auditability. Using type annotations helps with transparency, passing data between steps, and serializing/des" -icon: server ---- - -For best results, use type annotations for your outputs. This is good coding practice for transparency, helps ZenML handle passing data between steps, and also enables ZenML to serialize and deserialize (referred to as 'materialize' in ZenML) the data. - -```py -@step -def load_data(parameter: int) -> Dict[str, Any]: - - # do something with the parameter here - - training_data = [[1, 2], [3, 4], [5, 6]] - labels = [0, 1, 0] - return {'features': training_data, 'labels': labels} - -@step -def train_model(data: Dict[str, Any]) -> None: - total_features = sum(map(sum, data['features'])) - total_labels = sum(data['labels']) - - # Train some model here - - print(f"Trained model using {len(data['features'])} data points. " - f"Feature sum is {total_features}, label sum is {total_labels}") - - -@pipeline -def simple_ml_pipeline(parameter: int): - dataset = load_data(parameter=parameter) # Get the output - train_model(dataset) # Pipe the previous step output into the downstream step -``` - -In this code, we define two steps: `load_data` and `train_model`. The `load_data` step takes an integer parameter and returns a dictionary containing training data and labels. The `train_model` step receives the dictionary from `load_data`, extracts the features and labels, and trains a model (not shown here). - -Finally, we define a pipeline `simple_ml_pipeline` that chains the `load_data` and `train_model` steps together. The output from `load_data` is passed as input to `train_model`, demonstrating how data flows between steps in a ZenML pipeline. 
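Because each step output is stored in the artifact store, you can also load it back after the run has finished. The following is a minimal sketch of doing that with the ZenML Client; it assumes the pipeline defined above has already been executed at least once and that `load_data` keeps its single, default-named output (the `parameter` value is illustrative):

```py
from zenml.client import Client

# Execute the pipeline so its outputs are materialized in the artifact store
simple_ml_pipeline(parameter=42)

# Later (even from a different process), fetch the stored output of `load_data`
last_run = Client().get_pipeline("simple_ml_pipeline").last_run
dataset = last_run.steps["load_data"].output.load()  # deserializes the returned dict
print(dataset["labels"])
```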
diff --git a/develop/usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory.mdx b/develop/usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory.mdx deleted file mode 100644 index 40cd94a..0000000 --- a/develop/usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory.mdx +++ /dev/null @@ -1,77 +0,0 @@ ---- -title: "Load artifacts into memory" -description: Often ZenML pipeline steps consume artifacts produced by one another directly in the pipeline code, but there are scenarios where you need to pull external data into your steps. -icon: server ---- - - Such external data could be artifacts produced by non-ZenML codes. For those cases, it is advised to use [ExternalArtifact](/develop/user-guide/guides/starter-guide/manage-artifacts#consuming-external-artifacts-within-a-pipeline), but what if we plan to exchange data created with other ZenML pipelines? - -ZenML pipelines are first compiled and only executed at some later point. During the compilation phase, all function calls are executed, and this data is fixed as step input parameters. Given all this, the late materialization of dynamic objects, like data artifacts, is crucial. Without late materialization, it would not be possible to pass not-yet-existing artifacts as step inputs, or their metadata, which is often the case in a multi-pipeline setting. - -We identify two major use cases for exchanging artifacts between pipelines: - -* You semantically group your data products using ZenML Models -* You prefer to use [ZenML Client](/develop/sdk/python-client#client-methods) to bring all the pieces together - - -We recommend using models to group and access artifacts across pipelines. Find out how to load an artifact from a ZenML Model [here](/develop/usage/development-monitoring/use-the-model-control-plane/load-artifacts-from-model). - -## Use client methods to exchange artifacts - -If you don't yet use the Model Control Plane, you can still exchange data between pipelines with late materialization. Let's rework the `do_predictions` pipeline code as follows: - -```py -from typing import Annotated -from zenml import step, pipeline -from zenml.client import Client -import pandas as pd -from sklearn.base import ClassifierMixin - - -@step -def predict( - model1: ClassifierMixin, - model2: ClassifierMixin, - model1_metric: float, - model2_metric: float, - data: pd.DataFrame, -) -> Annotated[pd.Series, "predictions"]: - # compare which model performs better on the fly - if model1_metric < model2_metric: - predictions = pd.Series(model1.predict(data)) - else: - predictions = pd.Series(model2.predict(data)) - return predictions - -@step -def load_data() -> pd.DataFrame: - # load inference data - ... - -@pipeline -def do_predictions(): - # get specific artifact version - model_42 = Client().get_artifact_version("trained_model", version="42") - metric_42 = model_42.run_metadata["MSE"].value - - # get latest artifact version - model_latest = Client().get_artifact_version("trained_model") - metric_latest = model_latest.run_metadata["MSE"].value - - inference_data = load_data() - predict( - model1=model_42, - model2=model_latest, - model1_metric=metric_42, - model2_metric=metric_latest, - data=inference_data, - ) - -if __name__ == "__main__": - do_predictions() -``` - -Here, we enriched the `predict` step logic with a metric comparison by MSE metric, so predictions are done on the best possible model. We also added a `load_data` step to load the inference data. 
- -As before, calls like `Client().get_artifact_version("trained_model", version="42")` or `model_latest.run_metadata["MSE"].value` are not evaluating the actual objects behind them at pipeline compilation time. Rather, they do so only at the point of step execution. By doing so, we ensure that the latest version is actually the latest at the moment and not just the latest at the point of pipeline compilation. - diff --git a/develop/usage/resource-data-management/handle-data-artifacts/passing-artifacts-between-pipelines.mdx b/develop/usage/resource-data-management/handle-data-artifacts/passing-artifacts-between-pipelines.mdx deleted file mode 100644 index 41aa6ce..0000000 --- a/develop/usage/resource-data-management/handle-data-artifacts/passing-artifacts-between-pipelines.mdx +++ /dev/null @@ -1,129 +0,0 @@ ---- -title: "Passing artifacts between pipelines" -description: "Structuring an MLOps project" -icon: share ---- - -An MLOps project can often be broken down into many different pipelines. For example: - -* A feature engineering pipeline that prepares raw data into a format ready to get trained. -* A training pipeline that takes input data from a feature engineering pipeline and trains a models on it. -* An inference pipeline that runs batch predictions on the trained model and often takes pre-processing from the training pipeline. -* A deployment pipeline that deploys a trained model into a production endpoint. - -The lines between these pipelines can often get blurry: Some use cases call for these pipelines to be merged into one big pipeline. Others go further and break the pipeline down into even smaller chunks. Ultimately, the decision of how to structure your pipelines depends on the use case and requirements of the project. - -No matter how you design these pipelines, one thing stays consistent: you will often need to transfer or share information (in particular artifacts, models, and metadata) between pipelines. Here are some common patterns that you can use to help facilitate such an exchange: - -## Pattern 1: Artifact exchange between pipelines through `Client` - -Let's say we have a feature engineering pipeline and a training pipeline. The feature engineering pipeline is like a factory, pumping out many different datasets. Only a few of these datasets should be selected to be sent to the training pipeline to train an actual model. 
- - - - -In this scenario, the [ZenML Client](/develop/sdk/python-client#client-methods) can be used to facilitate such an exchange: - -```py -from zenml import pipeline -from zenml.client import Client - -@pipeline -def feature_engineering_pipeline(): - dataset = load_data() - # This returns artifacts called "iris_training_dataset" and "iris_testing_dataset" - train_data, test_data = prepare_data() - -@pipeline -def training_pipeline(): - client = Client() - # Fetch by name alone - uses the latest version of this artifact - train_data = client.get_artifact_version(name="iris_training_dataset") - # For test, we want a particular version - test_data = client.get_artifact_version(name="iris_testing_dataset", version="raw_2023") - - # We can now send these directly into ZenML steps - sklearn_classifier = model_trainer(train_data) - model_evaluator(model, sklearn_classifier) -``` - - - Note that in the above example, the `train_data` and `test_data` artifacts are not [materialized](/develop/usage/resource-data-management/handle-data-artifacts/artifact-versioning) in memory in the `@pipeline` function, but rather the `train_data` and `test_data` objects are simply references to where this data is stored in the artifact store. Therefore, one cannot use any logic regarding the nature of this data itself during compilation time (i.e. in the `@pipeline` function). - - -## Pattern 2: Artifact exchange between pipelines through a `Model` - -While passing around artifacts with IDs or names is very useful, it is often desirable to have the ZenML Model be the point of reference instead. - -For example, let's say we have a training pipeline called `train_and_promote` and an inference pipeline called `do_predictions`. The training pipeline produces many different model artifacts, all of which are collected within a [ZenML Model](/develop/user-guide/guides/starter-guide/track-ml-models). Each time the `train_and_promote` pipeline runs, it creates a new `iris_classifier`. However, it only promotes the model to `production` if a certain accuracy threshold is met. The promotion can be also be done manually with human intervention, or it can be automated through setting a particular threshold. - -On the other side, the `do_predictions` pipeline simply picks up the latest promoted model and runs batch inference on it. It need not know of the IDs or names of any of the artifacts produced by the training pipeline's many runs. This way these two pipelines can independently be run, but can rely on each other's output. - - - - - - -In code, this is very simple. Once the [pipelines are configured to use a particular model](/develop/user-guide/guides/starter-guide/track-ml-models#configuring-a-model-in-a-pipeline), we can use `get_step_context` to fetch the configured model within a step directly. 
Assuming there is a `predict` step in the `do_predictions` pipeline, we can fetch the `production` model like so: - - -```py -from zenml import step, get_step_context - -# IMPORTANT: Cache needs to be disabled to avoid unexpected behavior -@step(enable_cache=False) -def predict( - data: pd.DataFrame, -) -> Annotated[pd.Series, "predictions"]: - # model name and version are derived from pipeline context - model = get_step_context().model - - # Fetch the model directly from the model control plane - model = model.get_model_artifact("trained_model") - - # Make predictions - predictions = pd.Series(model.predict(data)) - return predictions -``` - -However, this approach has the downside that if the step is cached, then it could lead to unexpected results. You could simply disable the cache in the above step or the corresponding pipeline. However, one other way of achieving this would be to resolve the artifact at the pipeline level: - -```py -from typing_extensions import Annotated -from zenml import get_pipeline_context, pipeline, Model -from zenml.enums import ModelStages -import pandas as pd -from sklearn.base import ClassifierMixin - - -@step -def predict( - model: ClassifierMixin, - data: pd.DataFrame, -) -> Annotated[pd.Series, "predictions"]: - predictions = pd.Series(model.predict(data)) - return predictions - -@pipeline( - model=Model( - name="iris_classifier", - # Using the production stage - version=ModelStages.PRODUCTION, - ), -) -def do_predictions(): - # model name and version are derived from pipeline context - model = get_pipeline_context().model - inference_data = load_data() - predict( - # Here, we load in the `trained_model` from a trainer step - model=model.get_model_artifact("trained_model"), - data=inference_data, - ) - - -if __name__ == "__main__": - do_predictions() -``` -Ultimately, both approaches are fine. You should decide which one to use based on your own preferences. diff --git a/develop/usage/resource-data-management/handle-data-artifacts/return-multiple-outputs-from-a-step.mdx b/develop/usage/resource-data-management/handle-data-artifacts/return-multiple-outputs-from-a-step.mdx deleted file mode 100644 index e174d77..0000000 --- a/develop/usage/resource-data-management/handle-data-artifacts/return-multiple-outputs-from-a-step.mdx +++ /dev/null @@ -1,40 +0,0 @@ ---- -title: "Return multiple outputs from a step" -description: "Use Annotated to return multiple outputs from a step and name them for easy retrieval and dashboard display." -icon: stairs ---- - -You can use the `Annotated` type to return multiple outputs from a step and give each output a name. Naming your step outputs will help you retrieve the specific artifact later and also improves the readability of your pipeline's dashboard. - -```py -from typing import Annotated, Tuple - -import pandas as pd -from zenml import step - - -@step -def clean_data( - data: pd.DataFrame, -) -> Tuple[ - Annotated[pd.DataFrame, "x_train"], - Annotated[pd.DataFrame, "x_test"], - Annotated[pd.Series, "y_train"], - Annotated[pd.Series, "y_test"], -]: - from sklearn.model_selection import train_test_split - - x = data.drop("target", axis=1) - y = data["target"] - - x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42) - - return x_train, x_test, y_train, y_test -``` - -In this code, the `clean_data` step takes a pandas DataFrame as input and returns a tuple of four elements: `x_train`, `x_test`, `y_train`, and `y_test`. 
Each element in the tuple is annotated with a specific name using the `Annotated` type. - -Inside the step, we split the input data into features (`x`) and target (`y`), and then use `train_test_split` from scikit-learn to split the data into training and testing sets. The resulting DataFrames and Series are returned as a tuple, with each element annotated with its respective name. - -By using `Annotated`, we can easily identify and retrieve specific artifacts later in the pipeline. Additionally, the names will be displayed on the pipeline's dashboard, making it more readable and understandable. - diff --git a/develop/usage/resource-data-management/handle-data-artifacts/tagging.mdx b/develop/usage/resource-data-management/handle-data-artifacts/tagging.mdx deleted file mode 100644 index 0171747..0000000 --- a/develop/usage/resource-data-management/handle-data-artifacts/tagging.mdx +++ /dev/null @@ -1,97 +0,0 @@ ---- -title: "Organize data with tags" -description: "Use tags to organize tags in ZenML." -icon: tag ---- - -Organizing and categorizing your machine learning artifacts and models can streamline your workflow and enhance discoverability. ZenML enables the use of tags as a flexible tool to classify and filter your ML assets. In this guide, we'll demonstrate how to assign tags to both artifacts and models within the ZenML ecosystem. - - - - -## Assigning tags to artifacts - -If you want to tag the artifact versions of a step or pipeline that is executed repeatedly, you can use the `tags` property of `ArtifactConfig` to assign an arbitrary number of tags to the created artifacts: - - -```py -from zenml import step, ArtifactConfig - -@step -def training_data_loader() -> ( - Annotated[pd.DataFrame, ArtifactConfig(tags=["sklearn", "pre-training"])] -): - ... -``` - - -You can use the `zenml artifacts` CLI to add tags: - -```bash -# Tag the artifact -zenml artifacts update iris_dataset -t sklearn -# Tag the artifact version -zenml artifacts versions update iris_dataset raw_2023 -t sklearn -``` - - - -This will assign tags `sklearn` and `pre-training` to all artifacts created by this step, which can later be used to filter and organize these artifacts. - -Note that [ZenML Pro](https://zenml.io/pro) users can tag artifacts directly in the cloud dashboard. - -## Assigning tags to models - -Just like artifacts, you can also tag your models to organize them semantically. Here's how to use tags with models in the ZenML Python SDK and CLI (or in the [ZenML Pro Dashboard directly](https://zenml.io/pro)). - -When creating a model using the `Model` object, you can specify tags as key-value pairs that will be attached to the model upon creation: - -```py -from zenml.models import Model - -# Define tags to be added to the model -tags = ["experiment", "v1", "classification-task"] - -# Create a model with tags -model = Model( - name="iris_classifier", - version="1.0.0", - tags=tags, -) - -# Use this tagged model in your steps and pipelines as needed -@pipeline(model=model) -def my_pipeline(...): - ... 
-``` - -You can also assign tags when creating or updating models with the Python SDK: - -```py -from zenml.models import Model -from zenml.client import Client - -# Create or register a new model with tags -Client().create_model( - name="iris_logistic_regression", - tags=["classification", "iris-dataset"], -) - -# Create or register a new model version also with tags -Client().create_model_version( - model_name_or_id="iris_logistic_regression", - name="2", - tags=["version-1", "experiment-42"], -) -``` - -To add tags to existing models and their versions using the ZenML CLI, you can use the following commands: - -```bash -# Tag an existing model -zenml model update iris_logistic_regression --tag "classification" - -# Tag a specific model version -zenml model version update iris_logistic_regression 2 --tag "experiment3" -``` - diff --git a/develop/usage/resource-data-management/handle-data-artifacts/unmaterialized-artifacts.mdx b/develop/usage/resource-data-management/handle-data-artifacts/unmaterialized-artifacts.mdx deleted file mode 100644 index e665c40..0000000 --- a/develop/usage/resource-data-management/handle-data-artifacts/unmaterialized-artifacts.mdx +++ /dev/null @@ -1,93 +0,0 @@ ---- -title: "Skipping materialization" -description: "Skip materialization of artifacts." -icon: recycle ---- - -A ZenML pipeline is built in a data-centric way. The outputs and inputs of steps define how steps are connected and the order in which they are executed. Each step should be considered as its very own process that reads and writes its inputs and outputs from and to the [artifact store](/develop/stack-components/artifact-stores). This is where **materializers** come into play. - -A materializer dictates how a given artifact can be written to and retrieved from the artifact store and also contains all serialization and deserialization logic. Whenever you pass artifacts as outputs from one pipeline step to other steps as inputs, the corresponding materializer for the respective data type defines how this artifact is first serialized and written to the artifact store, and then deserialized and read in the next step. Read more about this [here](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types). - -However, there are instances where you might **not** want to materialize an artifact in a step, but rather use a reference to it instead. This is where skipping materialization comes in. - - - Skipping materialization might have unintended consequences for downstream tasks that rely on materialized artifacts. Only skip materialization if there is no other way to do what you want to do. - - -## How to skip materialization - -While materializers should in most cases be used to control how artifacts are returned and consumed from pipeline steps, you might sometimes need to have a completely unmaterialized artifact in a step, e.g., if you need to know the exact path to where your artifact is stored. - -An unmaterialized artifact is a [zenml.materializers.UnmaterializedArtifact](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-artifacts/#zenml.artifacts.unmaterialized%5Fartifact). Among others, it has a property `uri` that points to the unique path in the artifact store where the artifact is persisted. 
One can use an unmaterialized artifact by specifying `UnmaterializedArtifact` as the type in the step: - -```py -from zenml.artifacts.unmaterialized_artifact import UnmaterializedArtifact -from zenml import step - -@step -def my_step(my_artifact: UnmaterializedArtifact): # rather than pd.DataFrame - pass -``` - -## Code Example - -The following shows an example of how unmaterialized artifacts can be used in the steps of a pipeline. The pipeline we define will look like this: - -```bash -s1 -> s3 - -s2 -> s4 -``` - -`s1` and `s2` produce identical artifacts, however `s3` consumes materialized artifacts while `s4` consumes unmaterialized artifacts. `s4` can now use the `dict_.uri` and `list_.uri` paths directly rather than their materialized counterparts. - -```py -from typing_extensions import Annotated # or `from typing import Annotated on Python 3.9+ -from typing import Dict, List, Tuple - -from zenml.artifacts.unmaterialized_artifact import UnmaterializedArtifact -from zenml import pipeline, step - - -@step -def step_1() -> Tuple[ - Annotated[Dict[str, str], "dict_"], - Annotated[List[str], "list_"], -]: - return {"some": "data"}, [] - - -@step -def step_2() -> Tuple[ - Annotated[Dict[str, str], "dict_"], - Annotated[List[str], "list_"], -]: - return {"some": "data"}, [] - - -@step -def step_3(dict_: Dict, list_: List) -> None: - assert isinstance(dict_, dict) - assert isinstance(list_, list) - - -@step -def step_4( - dict_: UnmaterializedArtifact, - list_: UnmaterializedArtifact, -) -> None: - print(dict_.uri) - print(list_.uri) - - -@pipeline -def example_pipeline(): - step_3(*step_1()) - step_4(*step_2()) - - -example_pipeline() -``` - -You can see another example of using an `UnmaterializedArtifact` when triggering a [pipeline from another](/develop/usage/pipelines/trigger-pipelines/trigger-a-pipeline-from-another). - diff --git a/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model.mdx b/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model.mdx deleted file mode 100644 index 5f5097d..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model.mdx +++ /dev/null @@ -1,59 +0,0 @@ ---- -title: "Attach metadata to a model" -description: "Attach any metadata as key-value pairs to your models for future reference and auditability." -icon: paperclip ---- - -## Logging Metadata for Models - -While artifact metadata is specific to individual outputs of steps, model metadata encapsulates broader and more general information that spans across multiple artifacts. For example, evaluation results or the name of a customer for whom the model is intended could be logged with the model. - -Here's an example of logging metadata for a model: - -```py -from zenml import step, log_model_metadata, ArtifactConfig, get_step_context -from typing import Annotated -import pandas as pd -from sklearn.ensemble import RandomForestClassifier -from sklearn.base import ClassifierMixin - -@step -def train_model(dataset: pd.DataFrame) -> Annotated[ClassifierMixin, ArtifactConfig(name="sklearn_classifier", is_model_artifact=True)]: - """Train a model""" - # Fit the model and compute metrics - classifier = RandomForestClassifier().fit(dataset) - accuracy, precision, recall = ... 
- - # Log metadata for the model - # This associates the metadata with the ZenML model, not the artifact - log_model_metadata( - metadata={ - "evaluation_metrics": { - "accuracy": accuracy, - "precision": precision, - "recall": recall - } - }, - # Omitted model_name will use the model in the current context - model_name="zenml_model_name", - # Omitted model_version will default to 'latest' - model_version="zenml_model_version", - ) - return classifier -``` - -In this example, the metadata is associated with the model rather than the specific classifier artifact. This is particularly useful when the metadata reflects an aggregation or summary of various steps and artifacts in the pipeline. - -## Fetching logged metadata - -Once metadata has been logged in an [artifact](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact), model, or [step](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-steps), we can easily fetch the metadata with the ZenML Client: - -```py -from zenml.client import Client - -client = Client() -model = client.get_model_version("my_model", "my_version") - -print(model.run_metadata["metadata_key"].value) -``` - diff --git a/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact.mdx b/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact.mdx deleted file mode 100644 index c5eb0af..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact.mdx +++ /dev/null @@ -1,83 +0,0 @@ ---- -title: "Attach metadata to an artifact" -description: "Learn how to log metadata for artifacts and models in ZenML." -icon: link ---- - - - - - -Metadata plays a critical role in ZenML, providing context and additional information about various entities within the platform. Anything which is `metadata` in ZenML can be compared in the dashboard. - -This guide will explain how to log metadata for artifacts and models in ZenML and detail the types of metadata that can be logged. - -## Logging Metadata for Artifacts - -Artifacts in ZenML are outputs of steps within a pipeline, such as datasets, models, or evaluation results. Associating metadata with artifacts can help users understand the nature and characteristics of these outputs. - -To log metadata for an artifact, you can use the `log_artifact_metadata` method. This method allows you to attach a dictionary of key-value pairs as metadata to an artifact. The metadata can be any JSON-serializable value, including custom classes such as `Uri`, `Path`, `DType`, and `StorageSize`. Find out more about these different types [here](/develop/usage/resource-data-management/track-metrics-metadata/logging-metadata). - -Here's an example of logging metadata for an artifact: - -```py -from zenml import step, log_artifact_metadata -from zenml.metadata.metadata_types import StorageSize - -@step -def process_data_step(dataframe: pd.DataFrame) -> Annotated[pd.DataFrame, "processed_data"],: - """Process a dataframe and log metadata about the result.""" - # Perform processing on the dataframe... - processed_dataframe = ... 
- - # Log metadata about the processed dataframe - log_artifact_metadata( - artifact_name="processed_data", - metadata={ - "row_count": len(processed_dataframe), - "columns": list(processed_dataframe.columns), - "storage_size": StorageSize(processed_dataframe.memory_usage().sum()) - } - ) - return processed_dataframe -``` - -## Fetching logged metadata - -Once metadata has been logged in an artifact, or [step](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model), we can easily fetch the metadata with the ZenML Client: - -```py -from zenml.client import Client - -client = Client() -artifact = client.get_artifact_version("my_artifact", "my_version") - -print(artifact.run_metadata["metadata_key"].value) -``` - -## Grouping Metadata in the Dashboard - -When logging metadata passing a dictionary of dictionaries in the `metadata` parameter will group the metadata into cards in the ZenML dashboard. This feature helps organize metadata into logical sections, making it easier to visualize and understand. - -Here's an example of grouping metadata into cards: - -```py -from zenml.metadata.metadata_types import StorageSize - -log_artifact_metadata( - metadata={ - "model_metrics": { - "accuracy": 0.95, - "precision": 0.92, - "recall": 0.90 - }, - "data_details": { - "dataset_size": StorageSize(1500000), - "feature_columns": ["age", "income", "score"] - } - } -) -``` - -In the ZenML dashboard, "model\_metrics" and "data\_details" would appear as separate cards, each containing their respective key-value pairs. - diff --git a/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-steps.mdx b/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-steps.mdx deleted file mode 100644 index e17771d..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-steps.mdx +++ /dev/null @@ -1,69 +0,0 @@ ---- -title: "Attach metadata to steps" -description: You might want to log metadata and have that be attached to a specific step during the course of your work. -icon: magnet ---- - -This is possible by using the `log_step_metadata` method. This method allows you to attach a dictionary of key-value pairs as metadata to a step. The metadata can be any JSON-serializable value, including custom classes such as `Uri`, `Path`, `DType`, and `StorageSize`. - -You can call this method from within a step or from outside. If you call it from within it will attach the metadata to the step and run that is currently being executed. - -```py - -from zenml import step, log_step_metadata, ArtifactConfig, get_step_context -from typing import Annotated -import pandas as pd -from sklearn.ensemble import RandomForestClassifier -from sklearn.base import ClassifierMixin - -@step -def train_model(dataset: pd.DataFrame) -> Annotated[ClassifierMixin, ArtifactConfig(name="sklearn_classifier", is_model_artifact=True)]: - """Train a model""" - # Fit the model and compute metrics - classifier = RandomForestClassifier().fit(dataset) - accuracy, precision, recall = ... - - # Log metadata at the step level - # This associates the metadata with the ZenML step run - log_step_metadata( - metadata={ - "evaluation_metrics": { - "accuracy": accuracy, - "precision": precision, - "recall": recall - } - }, - ) - return classifier -``` - -If you call it from outside you can attach the metadata to a specific step run from any pipeline and step. This is useful if you want to attach the metadata after you've run the step. 
-```py - -from zenml import log_step_metadata -# run some step - -# subsequently log the metadata for the step -log_step_metadata( - metadata={ - "some_metadata": {"a_number": 3} - }, - pipeline_name_id_or_prefix="my_pipeline", - step_name="my_step", - run_id="my_step_run_id" -) -``` - -## Fetching logged metadata - -Once metadata has been logged in an [artifact](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact), [model](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model), we can easily fetch the metadata with the ZenML Client: - -```py - -from zenml.client import Client - -client = Client() -step = client.get_pipeline_run().steps["step_name"] - -print(step.run_metadata["metadata_key"].value) -``` diff --git a/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-pipeline.mdx b/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-pipeline.mdx deleted file mode 100644 index ad15855..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-pipeline.mdx +++ /dev/null @@ -1,46 +0,0 @@ ---- -title: "Fetch metadata during pipeline composition" -description: "How to fetch metadata during pipeline composition." -icon: get-pocket ---- - -### Pipeline configuration using the `PipelineContext` - -To find information about the pipeline configuration during pipeline composition, you can use the `zenml.get_pipeline_context()` function to access the `PipelineContext` of your pipeline: - -```py - -from zenml import get_pipeline_context, pipeline - -... - -@pipeline( - extra={ - "complex_parameter": [ - ("sklearn.tree", "DecisionTreeClassifier"), - ("sklearn.ensemble", "RandomForestClassifier"), - ] - } -) -def my_pipeline(): - context = get_pipeline_context() - - after = [] - search_steps_prefix = "hp_tuning_search_" - for i, model_search_configuration in enumerate( - context.extra["complex_parameter"] - ): - step_name = f"{search_steps_prefix}{i}" - cross_validation( - model_package=model_search_configuration[0], - model_class=model_search_configuration[1], - id=step_name - ) - after.append(step_name) - select_best_model( - search_steps_prefix=search_steps_prefix, - after=after, - ) -``` - -See the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-new/#zenml.new.pipelines.pipeline%5Fcontext.PipelineContext) for more information on which attributes and methods the `PipelineContext` provides. diff --git a/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps.mdx b/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps.mdx deleted file mode 100644 index c47b0a6..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-steps.mdx +++ /dev/null @@ -1,37 +0,0 @@ ---- -title: "Fetch metadata within steps" -description: "Accessing meta information in real-time within your pipeline." 
-icon: stairs ---- - -## Using the `StepContext` - -To find information about the pipeline or step that is currently running, you can use the `zenml.get_step_context()` function to access the `StepContext` of your step: - -```py -from zenml import step, get_step_context - -@step -def my_step(): - step_context = get_step_context() - pipeline_name = step_context.pipeline.name - run_name = step_context.pipeline_run.name - step_name = step_context.step_run.name -``` - -Furthermore, you can also use the `StepContext` to find out where the outputs of your current step will be stored and which [Materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) class will be used to save them: - -```py -@step -def my_step(): - step_context = get_step_context() - # Get the URI where the output will be saved. - uri = step_context.get_output_artifact_uri() - - # Get the materializer that will be used to save the output. - materializer = step_context.get_output_materializer() -``` - - - See the [SDK Docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-new/#zenml.new.steps.step%5Fcontext.StepContext) for more information on which attributes and methods the `StepContext` provides. - diff --git a/develop/usage/resource-data-management/track-metrics-metadata/grouping-metadata.mdx b/develop/usage/resource-data-management/track-metrics-metadata/grouping-metadata.mdx deleted file mode 100644 index f7d6297..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/grouping-metadata.mdx +++ /dev/null @@ -1,34 +0,0 @@ ---- -title: "Group metadata" -description: "Learn how to group key-value pairs in the dashboard." -icon: layer-group ---- - - - - - -When logging metadata passing a dictionary of dictionaries in the `metadata` parameter will group the metadata into cards in the ZenML dashboard. This feature helps organize metadata into logical sections, making it easier to visualize and understand. - -Here's an example of grouping metadata into cards: - -```py -from zenml.metadata.metadata_types import StorageSize - -log_artifact_metadata( - metadata={ - "model_metrics": { - "accuracy": 0.95, - "precision": 0.92, - "recall": 0.90 - }, - "data_details": { - "dataset_size": StorageSize(1500000), - "feature_columns": ["age", "income", "score"] - } - } -) -``` - -In the ZenML dashboard, "model\_metrics" and "data\_details" would appear as separate cards, each containing their respective key-value pairs. - diff --git a/develop/usage/resource-data-management/track-metrics-metadata/logging-metadata.mdx b/develop/usage/resource-data-management/track-metrics-metadata/logging-metadata.mdx deleted file mode 100644 index 9831ec0..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/logging-metadata.mdx +++ /dev/null @@ -1,35 +0,0 @@ ---- -title: "Special Metadata Types" -description: "Tracking your metadata." -icon: truck-fast ---- - -ZenML supports several special metadata types to capture specific kinds of information. 
Here are examples of how to use the special types `Uri`, `Path`, `DType`, and `StorageSize`: - -```py -from zenml.metadata.metadata_types import StorageSize, DType -from zenml import log_artifact_metadata - -log_artifact_metadata( - metadata={ - "dataset_source": Uri("gs://my-bucket/datasets/source.csv"), - "preprocessing_script": Path("/scripts/preprocess.py"), - "column_types": { - "age": DType("int"), - "income": DType("float"), - "score": DType("int") - }, - "processed_data_size": StorageSize(2500000) - } -) -``` - -In this example: - -* `Uri` is used to indicate a dataset source URI. -* `Path` is used to specify the filesystem path to a preprocessing script. -* `DType` is used to describe the data types of specific columns. -* `StorageSize` is used to indicate the size of the processed data in bytes. - -These special types help standardize the format of metadata and ensure that it is logged in a consistent and interpretable manner. - diff --git a/develop/usage/resource-data-management/track-metrics-metadata/track-metrics-metadata.mdx b/develop/usage/resource-data-management/track-metrics-metadata/track-metrics-metadata.mdx deleted file mode 100644 index 263f85b..0000000 --- a/develop/usage/resource-data-management/track-metrics-metadata/track-metrics-metadata.mdx +++ /dev/null @@ -1,12 +0,0 @@ ---- -title: "Overview" -description: "Tracking metrics and metadata" -icon: chart-line ---- - -Logging metrics and metadata is standardized in ZenML. The most common pattern is to use the `log_xxx` methods, e.g.: - -* Log metadata to a [model](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-a-model): `log_model_metadata` -* Log metadata to an [artifact](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-an-artifact): `log_artifact_metadata` -* Log metadata to a [step](/develop/usage/resource-data-management/track-metrics-metadata/attach-metadata-to-steps): `log_step_metadata` - diff --git a/develop/usage/resource-data-management/training-with-gpus/accelerate-distributed-training.mdx b/develop/usage/resource-data-management/training-with-gpus/accelerate-distributed-training.mdx deleted file mode 100644 index 52d75a8..0000000 --- a/develop/usage/resource-data-management/training-with-gpus/accelerate-distributed-training.mdx +++ /dev/null @@ -1,110 +0,0 @@ ---- -title: "Distributed Training with Accelerate" -description: "Run distributed training with Hugging Face's Accelerate library in ZenML pipelines." -icon: face-smile ---- - -There are several reasons why you might want to scale your machine learning pipelines to utilize distributed training, such as leveraging multiple GPUs or training across multiple nodes. ZenML now integrates with [Hugging Face's Accelerate library](https://github.com/huggingface/accelerate) to make this process seamless and efficient. - -## Use Accelerate in your steps - -Some steps in your machine learning pipeline, particularly training steps, can benefit from distributed execution. You can now use the `run_with_accelerate` decorator to enable this: - -```py -from zenml import step, pipeline -from zenml.integrations.huggingface.steps import run_with_accelerate - -@run_with_accelerate(num_processes=4, multi_gpu=True) -@step -def training_step(some_param: int, ...): - # your training code is below - ... - -@pipeline -def training_pipeline(some_param: int, ...): - training_step(some_param, ...) 
-``` - -The `run_with_accelerate` decorator wraps your step, enabling it to run with Accelerate's distributed training capabilities. It accepts arguments available to `accelerate launch` CLI command. - - - For a complete list of available arguments and more details, refer to the [Accelerate CLI documentation](https://huggingface.co/docs/accelerate/en/package%5Freference/cli#accelerate-launch). - - -### Configuration - -The `run_with_accelerate` decorator accepts various arguments to configure your distributed training environment. Some common arguments include: - -* `num_processes`: The number of processes to use for distributed training. -* `cpu`: Whether to force training on CPU. -* `multi_gpu`: Whether to launch distributed GPU training. -* `mixed_precision`: Mixed precision training mode ('no', 'fp16', or 'bf16'). - -### Important Usage Notes - -1. The `run_with_accelerate` decorator can only be used directly on steps using the '@' syntax. Using it as a function inside the pipeline definition is not allowed. -2. Accelerated steps do not support positional arguments. Use keyword arguments when calling your steps. -3. If `run_with_accelerate` is misused, it will raise a `RuntimeError` with a helpful message explaining the correct usage. - - - To see a full example where Accelerate is used within a ZenML pipeline, check out our [llm-lora-finetuning](https://github.com/zenml-io/zenml-projects/blob/main/llm-lora-finetuning/README.md) project which leverages the distributed training functionalities while finetuning an LLM. - - -## Ensure your container is Accelerate-ready - -To run steps with Accelerate, it's crucial to have the necessary dependencies installed in the environment. This section will guide you on how to configure your environment to utilize Accelerate effectively. - - - Note that these configuration changes are **required** for Accelerate to function properly. If you don't update the settings, your steps might run, but they will not leverage distributed training capabilities. - - -All steps using Accelerate will be executed within a containerized environment. Therefore, you need to make two amendments to your Docker settings for the relevant steps: - -### 1\. Specify a CUDA-enabled parent image in your `DockerSettings` - -For complete details, refer to the [containerization page](/develop/usage/project-setup/customize-docker-builds). Here's an example using a CUDA-enabled PyTorch image: - -```py -from zenml import pipeline -from zenml.config import DockerSettings - -docker_settings = DockerSettings(parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime") - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -### 2. Add Accelerate as explicit pip requirements - -Ensure that Accelerate is installed in your container: - -```py -from zenml.config import DockerSettings -from zenml import pipeline - -docker_settings = DockerSettings( - parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime", - requirements=["accelerate", "torchvision"] -) - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -## Train across multiple GPUs - -ZenML's Accelerate integration supports training your models with multiple GPUs on a single node or across multiple nodes. This is particularly useful for large datasets or complex models that benefit from parallelization. 
- -In practice, using Accelerate with multiple GPUs involves: - -* Wrapping your training step with the `run_with_accelerate` function in your pipeline definition -* Configuring the appropriate Accelerate arguments (e.g., `num_processes`, `multi_gpu`) -* Ensuring your training code is compatible with distributed training (Accelerate handles most of this automatically) - - - If you're new to distributed training or encountering issues, please [connect with us on Slack](https://zenml.io/slack) and we'll be happy to assist you. - - -By leveraging the Accelerate integration in ZenML, you can easily scale your training processes and make the most of your available hardware resources, all while maintaining the structure and benefits of your ZenML pipelines. diff --git a/develop/usage/resource-data-management/training-with-gpus/training-with-gpus.mdx b/develop/usage/resource-data-management/training-with-gpus/training-with-gpus.mdx deleted file mode 100644 index 8fe1dc0..0000000 --- a/develop/usage/resource-data-management/training-with-gpus/training-with-gpus.mdx +++ /dev/null @@ -1,138 +0,0 @@ ---- -title: "Overview" -description: "Ensuring your pipelines or steps run on GPU-backed hardware." -icon: car ---- - -There are several reasons why you may want to scale your machine learning pipelines to the cloud, such as utilizing more powerful hardware or distributing tasks across multiple nodes. In order to achieve this with ZenML you'll need to run your steps on GPU-backed hardware using `ResourceSettings` to allocate greater resources on an orchestrator node and/or make some adjustments to the container environment. - -## Specify resource requirements for steps - -Some steps of your machine learning pipeline might be more resource-intensive and require special hardware to execute. In such cases, you can specify the required resources for steps as follows: - -```js -from zenml.config import ResourceSettings -from zenml import step - -@step(settings={"resources": ResourceSettings(cpu_count=8, gpu_count=2, memory="8GB")}) -def training_step(...) -> ...: - # train a model -``` - -If the underlying [orchestrator](/develop/stack-components/orchestrators) in your stack then supports specifying resources, this setting will attempt to secure these resources. Some orchestrators (like the [Skypilot orchestrator](/develop/stack-components/orchestrators/skypilot-vm)) do not support `ResourceSettings` directly, but rather use their `Orchestrator` specific settings to achieve the same effect: - -```py -from zenml import step -from zenml.integrations.skypilot.flavors.skypilot_orchestrator_aws_vm_flavor import SkypilotAWSOrchestratorSettings - -skypilot_settings = SkypilotAWSOrchestratorSettings( - cpus="2", - memory="16", - accelerators="V100:2", -) - - -@step(settings={"orchestrator": skypilot_settings) -def training_step(...) -> ...: - # train a model -``` - -Please refer to the source code and documentation of each orchestrator to find out which orchestrator supports specifying resources in what way. - - - If you're using an orchestrator which does not support this feature or its underlying infrastructure does not cover your requirements, you can also take a look at [step operators](/develop/stack-components/step-operators) which allow you to execute individual steps of your pipeline in environments independent of your orchestrator. - - -### Ensure your container is CUDA-enabled - -To run steps or pipelines on GPUs, it's crucial to have the necessary CUDA tools installed in the environment. 
This section will guide you on how to configure your environment to utilize GPU capabilities effectively. - - - Note that these configuration changes are **required** for the GPU hardware to be properly utilized. If you don't update the settings, your steps might run, but they will not see any boost in performance from the custom hardware. - - -All steps running on GPU-backed hardware will be executed within a containerized environment, whether you're using the local Docker orchestrator or a cloud instance of Kubeflow. Therefore, you need to make two amendments to your Docker settings for the relevant steps: - -#### 1\. **Specify a CUDA-enabled parent image in your** `**DockerSettings**` - -For complete details, refer to the [containerization page](/develop/usage/project-setup/customize-docker-builds) that explains how to do this. As an example, if you want to use the latest CUDA-enabled official PyTorch image for your entire pipeline run, you can include the following code: - -```py -from zenml import pipeline -from zenml.config import DockerSettings - -docker_settings = DockerSettings(parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime") - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -For TensorFlow, you might use the `tensorflow/tensorflow:latest-gpu` image, as detailed in the [official TensorFlow documentation](https://www.tensorflow.org/install/docker#gpu%5Fsupport) or their [DockerHub overview](https://hub.docker.com/r/tensorflow/tensorflow). - -#### 2\. **Add ZenML as an explicit pip requirement** - -ZenML requires that ZenML itself be installed for the containers running your pipelines and steps. Therefore, you need to explicitly state that ZenML should be installed. There are several ways to specify this, but as an example, you can update the code from above as follows: - -```py -from zenml.config import DockerSettings -from zenml import pipeline - -docker_settings = DockerSettings( - parent_image="pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime", - requirements=["zenml==0.39.1", "torchvision"] -) - - -@pipeline(settings={"docker": docker_settings}) -def my_pipeline(...): - ... -``` - -Adding these two extra settings options will ensure that CUDA is enabled for the specific steps that require GPU acceleration. Be cautious when choosing the image to avoid confusion when switching between local and remote environments. For example, you might have one version of PyTorch installed locally with a particular CUDA version, but when you switch to your remote stack or environment, you might be forced to use a different CUDA version. - -The core cloud operators offer prebuilt Docker images that fit with their hardware. You can find more information on them here: - -* [AWS](https://github.com/aws/deep-learning-containers/blob/master/available%5Fimages.md) -* [GCP](https://cloud.google.com/deep-learning-vm/docs/images) -* [Azure](https://learn.microsoft.com/en-us/azure/machine-learning/concept-prebuilt-docker-images-inference) - -Not all of these images are available on DockerHub, so ensure that the orchestrator environment your pipeline runs in has sufficient permissions to pull images from registries if you are using one of those. - -### Reset the CUDA cache in between steps - -Your use case will determine whether this is necessary or makes sense to do, but we have seen that resetting the CUDA cache in between steps can help avoid issues with the GPU cache. 
This is particularly necessary if your training jobs are pushing the boundaries of the GPU cache. Doing so is simple; just use a helper function to reset the cache at the beginning of any GPU-enabled steps. For example, something as simple as this might suffice: - -```yaml -import gc -import torch - -def cleanup_memory() -> None: - while gc.collect(): - torch.cuda.empty_cache() -``` - -You can then call this function at the beginning of your GPU-enabled steps: - -```py -from zenml import step - -@step -def training_step(...): - cleanup_memory() - # train a model -``` - -Note that resetting the memory cache will potentially affect others using the same GPU, so use this judiciously. - -## Train across multiple GPUs - -ZenML supports training your models with multiple GPUs on a single node. This is useful if you have a large dataset and want to train your model in parallel. The most important thing that you'll have to handle is preventing multiple ZenML instances from being spawned as you split the work among multiple GPUs. - -In practice this will probably involve: - -* creating a script / Python function that contains the logic of training your model (with the specification that this should run in parallel across multiple GPUs) -* calling that script / external function from within the step, possibly with some wrapper or helper code to dynamically configure or update the external script function - -We're aware that this is not the most elegant solution and we're at work to implement a better option with some inbuilt support for this task. If this is something you're struggling with and need support getting the step code working, please do [connect with us on Slack](https://zenml.io/slack) and we'll do our best to help you out. - diff --git a/develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations.mdx b/develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations.mdx deleted file mode 100644 index 3994b2d..0000000 --- a/develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations.mdx +++ /dev/null @@ -1,113 +0,0 @@ ---- -title: "Creating custom visualizations" -description: "Creating your own visualizations." -icon: cart-plus ---- - -It is simple to associate a custom visualization with an artifact in ZenML, if the visualization is one of the supported visualization types. Currently, the following visualization types are supported: - -* **HTML:** Embedded HTML visualizations such as data validation reports, -* **Image:** Visualizations of image data such as Pillow images (e.g. `PIL.Image`) or certain numeric numpy arrays, -* **CSV:** Tables, such as the pandas DataFrame `.describe()` output, -* **Markdown:** Markdown strings or pages. - -There are three ways how you can add custom visualizations to the dashboard: - -* If you are already handling HTML, Markdown, or CSV data in one of your steps, you can have them visualized in just a few lines of code by casting them to a [special class](/develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations#visualization-via-special-return-types) inside your step. -* If you want to automatically extract visualizations for all artifacts of a certain data type, you can define type-specific visualization logic by [building a custom materializer](/develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations#visualization-via-materializers). 
-* If you want to create any other custom visualizations, you can [create a custom return type class with corresponding materializer](/develop/usage/resource-data-management/visualize-artifacts/creating-custom-visualizations#how-to-think-about-creating-a-custom-visualization) and build and return this custom return type from one of your steps. - -## Visualization via Special Return Types - -If you already have HTML, Markdown, or CSV data available as a string inside your step, you can simply cast them to one of the following types and return them from your step: - -* `zenml.types.HTMLString` for strings in HTML format, e.g., `"

<h1>Header</h1>
Some text"`, -* `zenml.types.MarkdownString` for strings in Markdown format, e.g., `"# Header\nSome text"`, -* `zenml.types.CSVString` for strings in CSV format, e.g., `"a,b,c\n1,2,3"`. - -### Example: - -```py -from zenml.types import CSVString - -@step -def my_step() -> CSVString: - some_csv = "a,b,c\n1,2,3" - return CSVString(some_csv) -``` - -This would create the following visualization in the dashboard: - - - - - -## Visualization via Materializers - -If you want to automatically extract visualizations for all artifacts of a certain data type, you can do so by overriding the `save_visualizations()` method of the corresponding materializer. See the [materializer docs page](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types#optional-how-to-visualize-the-artifact) for more information on how to create custom materializers that do this. - -Or, see a code example on [GitHub](https://github.com/zenml-io/zenml/blob/main/src/zenml/integrations/huggingface/materializers/huggingface%5Fdatasets%5Fmaterializer.py) where we visualize Hugging Face datasets by embedding their preview viewer. - -## How to think about creating a custom visualization - -By combining the ideas behind the above two visualization approaches, you can visualize virtually anything you want inside your ZenML dashboard in three simple steps: - -1. Create a **custom class** that will hold the data that you want to visualize. -2. [Build a custom **materializer**](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types#custom-materializers) for this custom class with the visualization logic implemented in the `save_visualizations()` method. -3. Return your custom class from any of your ZenML steps. - -### Example: Facets Data Skew Visualization - -As an example, have a look at the models, materializers, and steps of the [Facets Integration](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-facets), which can be used to visualize the data skew between multiple Pandas DataFrames: - - - - - -**1\. Custom Class** The [FacetsComparison](https://sdkdocs.zenml.io/0.42.0/integration%5Fcode%5Fdocs/integrations-facets/#zenml.integrations.facets.models.FacetsComparison) is the custom class that holds the data required for the visualization. - -```py -class FacetsComparison(BaseModel): - datasets: List[Dict[str, Union[str, pd.DataFrame]]] -``` - -**2\. Materializer** The [FacetsMaterializer](https://sdkdocs.zenml.io/0.42.0/integration%5Fcode%5Fdocs/integrations-facets/#zenml.integrations.facets.materializers.facets%5Fmaterializer.FacetsMaterializer) is a custom materializer that only handles this custom class and contains the corresponding visualization logic. - -```py -class FacetsMaterializer(BaseMaterializer): - - ASSOCIATED_TYPES = (FacetsComparison,) - ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA_ANALYSIS - - def save_visualizations( - self, data: FacetsComparison - ) -> Dict[str, VisualizationType]: - html = ... # Create a visualization for the custom type - visualization_path = os.path.join(self.uri, VISUALIZATION_FILENAME) - with fileio.open(visualization_path, "w") as f: - f.write(html) - return {visualization_path: VisualizationType.HTML} -``` - -**3\. Step** There are three different steps in the `facets` integration that can be used to create `FacetsComparison`s for different sets of inputs. 
E.g., the `facets_visualization_step` below takes two DataFrames as inputs and builds a `FacetsComparison` object out of them: - -```py -@step -def facets_visualization_step( - reference: pd.DataFrame, comparison: pd.DataFrame -) -> FacetsComparison: # Return the custom type from your step - return FacetsComparison( - datasets=[ - {"name": "reference", "table": reference}, - {"name": "comparison", "table": comparison}, - ] - ) -``` - - -This is what happens now under the hood when you add the `facets_visualization_step` into your pipeline: - -1. The step creates and returns a `FacetsComparison`. -2. When the step finishes, ZenML will search for a materializer class that can handle this type, finds the `FacetsMaterializer`, and calls the `save_visualizations()` method which creates the visualization and saves it into your artifact store as an HTML file. -3. When you open your dashboard and click on the artifact inside the run DAG, the visualization HTML file is loaded from the artifact store and displayed. - diff --git a/develop/usage/resource-data-management/visualize-artifacts/disabling-visualizations.mdx b/develop/usage/resource-data-management/visualize-artifacts/disabling-visualizations.mdx deleted file mode 100644 index 74f430a..0000000 --- a/develop/usage/resource-data-management/visualize-artifacts/disabling-visualizations.mdx +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: "Disabling visualizations" -description: "Disabling visualizations." -icon: eye-slash ---- - -If you would like to disable artifact visualization altogether, you can set `enable_artifact_visualization` at either pipeline or step level: - -```py - -@step(enable_artifact_visualization=False) -def my_step(): - ... - -@pipeline(enable_artifact_visualization=False) -def my_pipeline(): - ... -``` diff --git a/develop/usage/resource-data-management/visualize-artifacts/types-of-visualizations.mdx b/develop/usage/resource-data-management/visualize-artifacts/types-of-visualizations.mdx deleted file mode 100644 index 123e475..0000000 --- a/develop/usage/resource-data-management/visualize-artifacts/types-of-visualizations.mdx +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: "Default visualizations" -description: "Types of visualizations in ZenML." -icon: hurricane ---- - -ZenML automatically saves visualizations of many common data types and allows you to view these visualizations in the ZenML dashboard: - - - - -Alternatively, any of these visualizations can also be displayed in Jupyter notebooks using the `artifact.visualize()` method: - - - - -Some examples of default visualizations are: - -* A statistical representation of a [Pandas](https://pandas.pydata.org/docsversions/0.66.0/reference/api/pandas.DataFrame.html) Dataframe represented as a png image. -* Drift detection reports by [Evidently](/develop/stack-components/data-validators/evidently), [Great Expectations](/develop/stack-components/data-validators/great-expectations), and [whylogs](/develop/stack-components/data-validators/whylogs). -* A [Hugging Face](https://zenml.io/integrations/huggingface) datasets viewer embedded as a HTML iframe. 
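For instance, here is a minimal sketch of pulling up these default visualizations in a notebook via `artifact.visualize()`, assuming a previously executed pipeline called `my_pipeline` with a single-output step `my_step` (both names are placeholders for your own):

```py
from zenml.client import Client

# Placeholder names: swap in your own pipeline and step names.
run = Client().get_pipeline("my_pipeline").last_run
artifact = run.steps["my_step"].output  # the step's single output artifact
artifact.visualize()  # renders the saved visualizations inline in a notebook
```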
- - - - diff --git a/develop/usage/resource-data-management/visualize-artifacts/visualizations-in-dashboard.mdx b/develop/usage/resource-data-management/visualize-artifacts/visualizations-in-dashboard.mdx deleted file mode 100644 index 5565968..0000000 --- a/develop/usage/resource-data-management/visualize-artifacts/visualizations-in-dashboard.mdx +++ /dev/null @@ -1,23 +0,0 @@ ---- -title: "Displaying visualizations in the dashboard" -description: "Displaying visualizations in the dashboard." -icon: clipboard ---- - -In order for the visualizations to show up on the dashboard, the following must be true: - -## Configuring a Service Connector - -Visualizations are usually stored alongside the artifact, in the [artifact store](/develop/stack-components/artifact-stores). Therefore, if a user would like to see the visualization displayed on the ZenML dashboard, they must give access to the server to connect to the artifact store. - -The [service connector](/develop/usage/resource-data-management/auth-management) documentation goes deeper into the concept of service connectors and how they can be configured to give the server permission to access the artifact store. For a concrete example, see the [AWS S3](/develop/stack-components/artifact-stores/s3) artifact store documentation. - - -When using the default/local artifact store with a deployed ZenML, the server naturally does not have access to your local files. In this case, the visualizations are also not displayed on the dashboard. - -Please use a service connector enabled and remote artifact store alongside a deployed ZenML to view visualizations. - - -## Configuring Artifact Stores - -If all visualizations of a certain pipeline run are not showing up in the dashboard, it might be that your ZenML server does not have the required dependencies or permissions to access that artifact store. See the [custom artifact store docs page](/develop/stack-components/artifact-stores/custom#enabling-artifact-visualizations-with-custom-artifact-stores) for more information. diff --git a/develop/usage/resource-data-management/visualize-artifacts/visualize-artifacts.mdx b/develop/usage/resource-data-management/visualize-artifacts/visualize-artifacts.mdx deleted file mode 100644 index 80f35a3..0000000 --- a/develop/usage/resource-data-management/visualize-artifacts/visualize-artifacts.mdx +++ /dev/null @@ -1,12 +0,0 @@ ---- -title: "Overview" -description: "Configuring ZenML to display data visualizations in the dashboard." 
-icon: chart-simple ---- - -It is easy to associate visualizations of data and artifacts in ZenML: - - - - - diff --git a/develop/user-guide/examples/basic-nlp-with-bert.mdx b/develop/user-guide/examples/basic-nlp-with-bert.mdx deleted file mode 100644 index b1be3a9..0000000 --- a/develop/user-guide/examples/basic-nlp-with-bert.mdx +++ /dev/null @@ -1,5 +0,0 @@ ---- -title: Basic NLP with BERT -icon: books -url: https://github.com/zenml-io/zenml/tree/main/examples/e2e_nlp ---- diff --git a/develop/user-guide/examples/basic-nlp.mdx b/develop/user-guide/examples/basic-nlp.mdx deleted file mode 100644 index e69de29..0000000 diff --git a/develop/user-guide/examples/computer-vision-with-yolov8.mdx b/develop/user-guide/examples/computer-vision-with-yolov8.mdx deleted file mode 100644 index d64a1d8..0000000 --- a/develop/user-guide/examples/computer-vision-with-yolov8.mdx +++ /dev/null @@ -1,5 +0,0 @@ ---- -title: Computer Vision with YoloV8 -icon: eye -url: https://github.com/zenml-io/zenml-projects/tree/main/end-to-end-computer-vision ---- diff --git a/develop/user-guide/examples/e2e-batch.mdx b/develop/user-guide/examples/e2e-batch.mdx deleted file mode 100644 index e69de29..0000000 diff --git a/develop/user-guide/examples/end-to-end-batch.mdx b/develop/user-guide/examples/end-to-end-batch.mdx deleted file mode 100644 index 3b6d30d..0000000 --- a/develop/user-guide/examples/end-to-end-batch.mdx +++ /dev/null @@ -1,5 +0,0 @@ ---- -title: End-to-End Batch Inference -icon: lock -url: https://github.com/zenml-io/zenml/tree/main/examples/e2e ---- diff --git a/develop/user-guide/examples/llm-finetuning.mdx b/develop/user-guide/examples/llm-finetuning.mdx deleted file mode 100644 index d1ab046..0000000 --- a/develop/user-guide/examples/llm-finetuning.mdx +++ /dev/null @@ -1,5 +0,0 @@ ---- -title: LLM Finetuning -icon: book-open -url: https://github.com/zenml-io/zenml/tree/main/examples/llm_finetuning ---- diff --git a/develop/user-guide/examples/more-projects.mdx b/develop/user-guide/examples/more-projects.mdx deleted file mode 100644 index 8a13f98..0000000 --- a/develop/user-guide/examples/more-projects.mdx +++ /dev/null @@ -1,5 +0,0 @@ ---- -title: More Projects... -icon: puzzle-piece -url: https://github.com/zenml-io/zenml-projects ---- diff --git a/develop/user-guide/examples/quickstart.mdx b/develop/user-guide/examples/quickstart.mdx deleted file mode 100644 index 01d9f96..0000000 --- a/develop/user-guide/examples/quickstart.mdx +++ /dev/null @@ -1,5 +0,0 @@ ---- -title: Quickstart -icon: rocket -url: https://github.com/zenml-io/zenml/blob/main/examples/quickstart ---- diff --git a/develop/user-guide/guides/llmops-guide.mdx b/develop/user-guide/guides/llmops-guide.mdx deleted file mode 100644 index eda15b9..0000000 --- a/develop/user-guide/guides/llmops-guide.mdx +++ /dev/null @@ -1,54 +0,0 @@ ---- -title: "Overview" -description: "Leverage the power of LLMs in your MLOps workflows with ZenML." -icon: earlybirds ---- - -Welcome to the ZenML LLMOps Guide, where we dive into the exciting world of Large Language Models (LLMs) and how to integrate them seamlessly into your MLOps pipelines using ZenML. This guide is designed for ML practitioners and MLOps engineers looking to harness the potential of LLMs while maintaining the robustness and scalability of their workflows. 
- - - - - -In this guide, we'll explore various aspects of working with LLMs in ZenML, including: - - - - - - - - - - ---- - - - - - - - - ---- - - - - - - - ---- - - - - - - ---- - -To follow along with the examples and tutorials in this guide, ensure you have a Python environment set up with ZenML installed. Familiarity with the concepts covered in the [Starter Guide](/develop/user-guide/guides/starter-guide) and [Production Guide](/develop/user-guide/guides/production-guide) is recommended. - -We'll showcase a specific application over the course of this LLM guide, showing how you can work from a simple RAG pipeline to a more complex setup that involves finetuning embeddings, reranking retrieved documents, and even finetuning the LLM itself. We'll do this all for a use case relevant to ZenML: a question answering system that can provide answers to common questions about ZenML. This will help you understand how to apply the concepts covered in this guide to your own projects. - -By the end of this guide, you'll have a solid understanding of how to leverage LLMs in your MLOps workflows using ZenML, enabling you to build powerful, scalable, and maintainable LLM-powered applications. First up, let's take a look at a super simple implementation of the RAG paradigm to get started. diff --git a/develop/user-guide/guides/llmops-guide/evaluation.mdx b/develop/user-guide/guides/llmops-guide/evaluation.mdx deleted file mode 100644 index 7f225de..0000000 --- a/develop/user-guide/guides/llmops-guide/evaluation.mdx +++ /dev/null @@ -1,33 +0,0 @@ ---- -title: "Overview" -description: "Track how your RAG pipeline improves using evaluation and metrics." -icon: chart-line ---- - -In this section, we'll explore how to evaluate the performance of your RAG pipeline using metrics and visualizations. Evaluating your RAG pipeline is crucial to understanding how well it performs and identifying areas for improvement. With language models in particular, it's hard to evaluate their performance using traditional metrics like accuracy, precision, and recall. This is because language models generate text, which is inherently subjective and difficult to evaluate quantitatively. - -Our RAG pipeline is a whole system, moreover, not just a model, and evaluating it requires a holistic approach. We'll look at various ways to evaluate the performance of your RAG pipeline but the two main areas we'll focus on are: - -* [Retrieval evaluation](/develop/user-guide/guides/llmops-guide/evaluation/retrieval), so checking that the retrieved documents or document chunks are relevant to the query. -* [Generation evaluation](/develop/user-guide/guides/llmops-guide/evaluation/generation), so checking that the generated text is coherent and helpful for our specific use case. - - - - -In the previous section we built out a basic RAG pipeline for our documentation question-and-answer use case. We'll use this pipeline to demonstrate how to evaluate the performance of your RAG pipeline. - - - If you were running this in a production setting, you might want to set up evaluation to check the performance of a raw LLM model (i.e. without any retrieval / RAG components) as a baseline, and then compare this to the performance of your RAG pipeline. This will help you understand how much value the retrieval and generation components are adding to your system. We won't cover this here, but it's a good practice to keep in mind. - - -## What are we evaluating? 
- -When evaluating the performance of your RAG pipeline, your specific use case and the extent to which you can tolerate errors or lower performance will determine what you need to evaluate. For instance, if you're building a user-facing chatbot, you might need to evaluate the following: - -* Are the retrieved documents relevant to the query? -* Is the generated answer coherent and helpful for your specific use case? -* Does the generated answer contain hate speech or any sort of toxic language? - -These are just examples, and the specific metrics and methods you use will depend on your use case. The [generation evaluation](/develop/user-guide/guides/llmops-guide/evaluation/generation) functions as an end-to-end evaluation of the RAG pipeline, as it checks the final output of the system. It's during these end-to-end evaluations that you'll have most leeway to use subjective metrics, as you're evaluating the system as a whole. - -Before we dive into the details, let's take a moment to look at [a short high-level code example](/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-65-loc) showcasing the two main areas of evaluation. Afterwards the following sections will cover the two main areas of evaluation in more detail [as well as offer practical guidance](/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-practice) on when to run these evaluations and what to look for in the results. diff --git a/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-65-loc.mdx b/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-65-loc.mdx deleted file mode 100644 index c99e53d..0000000 --- a/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-65-loc.mdx +++ /dev/null @@ -1,88 +0,0 @@ ---- -title: "Evaluation in 65 lines of code" -description: "Learn how to implement evaluation for RAG in just 65 lines of code." -icon: code ---- - -Our RAG guide included [a short example](/develop/user-guide/guides/llmops-guide/rag-with-zenml/rag-85-loc) for how to implement a basic RAG pipeline in just 85 lines of code. In this section, we'll build on that example to show how you can evaluate the performance of your RAG pipeline in just 65 lines. For the full code, please visit the project repository [here](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/most%5Fbasic%5Feval.py). The code that follows requires the functions from the earlier RAG pipeline code to work. - -```py -# ...previous RAG pipeline code here... 
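# NOTE: the helpers used below (`tokenize`, `retrieve_relevant_chunks`,
# `answer_question`), the `corpus` variable, and the `os` / `OpenAI` imports
# are all defined in that earlier RAG pipeline code (linked below).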
-# see https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/most_basic_rag_pipeline.py - -eval_data = [ - { - "question": "What creatures inhabit the luminescent forests of ZenML World?", - "expected_answer": "The luminescent forests of ZenML World are inhabited by glowing Zenbots.", - }, - { - "question": "What do Fractal Fungi do in the melodic caverns of ZenML World?", - "expected_answer": "Fractal Fungi emit pulsating tones that resonate through the crystalline structures, creating a symphony of otherworldly sounds in the melodic caverns of ZenML World.", - }, - { - "question": "Where do Gravitational Geckos live in ZenML World?", - "expected_answer": "Gravitational Geckos traverse the inverted cliffs of ZenML World.", - }, -] - - -def evaluate_retrieval(question, expected_answer, corpus, top_n=2): - relevant_chunks = retrieve_relevant_chunks(question, corpus, top_n) - score = any( - any(word in chunk for word in tokenize(expected_answer)) - for chunk in relevant_chunks - ) - return score - - -def evaluate_generation(question, expected_answer, generated_answer): - client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) - chat_completion = client.chat.completions.create( - messages=[ - { - "role": "system", - "content": "You are an evaluation judge. Given a question, an expected answer, and a generated answer, your task is to determine if the generated answer is relevant and accurate. Respond with 'YES' if the generated answer is satisfactory, or 'NO' if it is not.", - }, - { - "role": "user", - "content": f"Question: {question}\nExpected Answer: {expected_answer}\nGenerated Answer: {generated_answer}\nIs the generated answer relevant and accurate?", - }, - ], - model="gpt-3.5-turbo", - ) - - judgment = chat_completion.choices[0].message.content.strip().lower() - return judgment == "yes" - - -retrieval_scores = [] -generation_scores = [] - -for item in eval_data: - retrieval_score = evaluate_retrieval( - item["question"], item["expected_answer"], corpus - ) - retrieval_scores.append(retrieval_score) - - generated_answer = answer_question(item["question"], corpus) - generation_score = evaluate_generation( - item["question"], item["expected_answer"], generated_answer - ) - generation_scores.append(generation_score) - -retrieval_accuracy = sum(retrieval_scores) / len(retrieval_scores) -generation_accuracy = sum(generation_scores) / len(generation_scores) - -print(f"Retrieval Accuracy: {retrieval_accuracy:.2f}") -print(f"Generation Accuracy: {generation_accuracy:.2f}") -``` - -As you can see, we've added two evaluation functions: `evaluate_retrieval` and `evaluate_generation`. The `evaluate_retrieval` function checks if the retrieved chunks contain any words from the expected answer. The `evaluate_generation` function uses OpenAI's chat completion LLM to evaluate the quality of the generated answer. - -We then loop through the evaluation data, which contains questions and expected answers, and evaluate the retrieval and generation components of our RAG pipeline. Finally, we calculate the accuracy of both components and print the results: - - - - -As you can see, we get 100% accuracy for both retrieval and generation in this example. Not bad! The sections that follow will provide a more detailed and sophisticated implementation of RAG evaluation, but this example shows how you can think about it at a high level! 
- diff --git a/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-practice.mdx b/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-practice.mdx deleted file mode 100644 index 25a998e..0000000 --- a/develop/user-guide/guides/llmops-guide/evaluation/evaluation-in-practice.mdx +++ /dev/null @@ -1,43 +0,0 @@ ---- -title: "Evaluation in practice" -description: "Learn how to evaluate the performance of your RAG system in practice." -icon: flask ---- - -Now that we've seen individually how to evaluate the retrieval and generation components of our pipeline, it's worth taking a step back to think through how all of this works in practice. - -Our example project includes the evaluation as a separate pipeline that optionally runs after the main pipeline that generates and populates the embeddings. This is a good practice to follow, as it allows you to separate the concerns of generating the embeddings and evaluating them. Depending on the specific use case, the evaluations could be included as part of the main pipeline and used as a gating mechanism to determine whether the embeddings are good enough to be used in production. - -Given some of the performance constraints of the LLM judge, it might be worth experimenting with using a local LLM judge for evaluation during the course of the development process and then running the full evaluation using a cloud LLM like Anthropic's Claude or OpenAI's GPT-3.5 or 4\. This can help you iterate faster and get a sense of how well your embeddings are performing before committing to the cost of running the full evaluation. - -## Automated evaluation isn't a silver bullet - -While automating the evaluation process can save you time and effort, it's important to remember that it doesn't replace the need for a human to review the results. The LLM judge is expensive to run, and it takes time to get the results back. Automating the evaluation process can help you focus on the details and the data, but it doesn't replace the need for a human to review the results and make sure that the embeddings (and the RAG system as a whole) are performing as expected. - -## When and how much to evaluate - -The frequency and depth of evaluation will depend on your specific use case and the constraints of your project. In an ideal world, you would evaluate the performance of your embeddings and the RAG system as a whole as often as possible, but in practice, you'll need to balance the cost of running the evaluation with the need to iterate quickly. - -Some tests can be run quickly and cheaply (notably the tests of the retrieval system) while others (like the LLM judge) are more expensive and time-consuming. You should structure your RAG tests and evaluation to reflect this, with some tests running frequently and others running less often, just as you would in any other software project. - -There's more we could improve our evaluation system, but for now we can continue onwards to [adding a reranker](/develop/user-guide/guides/llmops-guide/reranking) to improve our retrieval. This will allow us to improve the performance of our retrieval system without needing to retrain the embeddings. We'll cover this in the next section. - -## Try it out! - -To see how this works in practice, you can run the evaluation pipeline using the project code. This will give you a sense of how the evaluation process works in practice and you can of course then play with and modify the evaluation code. 
- -To run the evaluation pipeline, first clone the project repository: - -```Bash -git clone https://github.com/zenml-io/zenml-projects.git -``` - -Then navigate to the `llm-complete-guide` directory and follow the instructions in the `README.md` file to run the evaluation pipeline. (You'll have to have first run the main pipeline to generate the embeddings.) - -To run the evaluation pipeline, you can use the following command: - -```Bash -python run.py --evaluation -``` - -This will run the evaluation pipeline and output the results to the console. You can then inspect the progress, logs and results in the dashboard! diff --git a/develop/user-guide/guides/llmops-guide/evaluation/generation.mdx b/develop/user-guide/guides/llmops-guide/evaluation/generation.mdx deleted file mode 100644 index 47b6b7d..0000000 --- a/develop/user-guide/guides/llmops-guide/evaluation/generation.mdx +++ /dev/null @@ -1,312 +0,0 @@ ---- -title: "Generation evaluation" -description: "Evaluate the generation component of your RAG pipeline." -icon: wand-magic-sparkles ---- - -Now that we have a sense of how to evaluate the retrieval component of our RAG pipeline, let's move on to the generation component. The generation component is responsible for generating the answer to the question based on the retrieved context. At this point, our evaluation starts to move into more subjective territory. It's harder to come up with metrics that can accurately capture the quality of the generated answers. However, there are some things we can do. - -As with the [retrieval evaluation](/develop/user-guide/guides/llmops-guide/evaluation/retrieval), we can start with a simple approach and then move on to more sophisticated methods. - -## Handcrafted evaluation tests - -As in the retrieval evaluation, we can start by putting together a set of examples where we know that our generated output should or shouldn't include certain terms. For example, if we're generating answers to questions about which orchestrators ZenML supports, we can check that the generated answers include terms like "Airflow" and "Kubeflow" (since we do support them) and exclude terms like "Flyte" or "Prefect" (since we don't (yet!) support them). These handcrafted tests should be driven by mistakes that you've already seen in the RAG output. The negative example of "Flyte" and "Prefect" showing up in the list of supported orchestrators, for example, shows up sometimes when you use GPT 3.5 as the LLM. - - - - -As another example, when you make a query asking 'what is the default orchestrator in ZenML?' you would expect that the answer would include the word 'local', so we can make a test case to confirm that. - -You can view our starter set of these tests [here](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/steps/eval%5Fe2e.py#L28-L55). It's better to start with something small and simple and then expand as is needed. There's no need for complicated harnesses or frameworks at this stage. - -`**bad_answers**` **table:** - -| Question | Bad Words | -| ------------------------------------------ | ------------------------------------------- | -| What orchestrators does ZenML support? | AWS Step Functions, Flyte, Prefect, Dagster | -| What is the default orchestrator in ZenML? | Flyte, AWS Step Functions | - -`**bad_immediate_responses**` **table:** - -| Question | Bad Words | -| --------------------------------------------------------- | --------- | -| Does ZenML support the Flyte orchestrator out of the box? 
| Yes | - -`**good_responses**` **table:** - -| Question | Good Words | -| ----------------------------------------------------------------------------------------------------- | ----------------- | -| What are the supported orchestrators in ZenML? Please list as many of the supported ones as possible. | Kubeflow, Airflow | -| What is the default orchestrator in ZenML? | local | - -Each type of test then catches a specific type of mistake. For example: - -```py -class TestResult(BaseModel): - success: bool - question: str - keyword: str = "" - response: str - - -def test_content_for_bad_words( - item: dict, n_items_retrieved: int = 5 -) -> TestResult: - question = item["question"] - bad_words = item["bad_words"] - response = process_input_with_retrieval( - question, n_items_retrieved=n_items_retrieved - ) - for word in bad_words: - if word in response: - return TestResult( - success=False, - question=question, - keyword=word, - response=response, - ) - return TestResult(success=True, question=question, response=response) -``` - -Here we're testing that a particular word doesn't show up in the generated response. If we find the word, then we return a failure, otherwise we return a success. This is a simple example, but you can imagine more complex tests that check for the presence of multiple words, or the presence of a word in a particular context. - -We pass these custom tests into a test runner that keeps track of how many are failing and also logs those to the console when they do: - -```py -def run_tests(test_data: list, test_function: Callable) -> float: - failures = 0 - total_tests = len(test_data) - for item in test_data: - test_result = test_function(item) - if not test_result.success: - logging.error( - f"Test failed for question: '{test_result.question}'. Found word: '{test_result.keyword}'. Response: '{test_result.response}'" - ) - failures += 1 - failure_rate = (failures / total_tests) * 100 - logging.info( - f"Total tests: {total_tests}. Failures: {failures}. Failure rate: {failure_rate}%" - ) - return round(failure_rate, 2) -``` - -Our end-to-end evaluation of the generation component is then a combination of these tests: - -```py -@step -def e2e_evaluation() -> ( - Annotated[float, "failure_rate_bad_answers"], - Annotated[float, "failure_rate_bad_immediate_responses"], - Annotated[float, "failure_rate_good_responses"], -): - logging.info("Testing bad answers...") - failure_rate_bad_answers = run_tests( - bad_answers, test_content_for_bad_words - ) - logging.info(f"Bad answers failure rate: {failure_rate_bad_answers}%") - - logging.info("Testing bad immediate responses...") - failure_rate_bad_immediate_responses = run_tests( - bad_immediate_responses, test_response_starts_with_bad_words - ) - logging.info( - f"Bad immediate responses failure rate: {failure_rate_bad_immediate_responses}%" - ) - - logging.info("Testing good responses...") - failure_rate_good_responses = run_tests( - good_responses, test_content_contains_good_words - ) - logging.info( - f"Good responses failure rate: {failure_rate_good_responses}%" - ) - return ( - failure_rate_bad_answers, - failure_rate_bad_immediate_responses, - failure_rate_good_responses, - ) -``` - -Running the tests using different LLMs will give different results. Here our Ollama Mixtral did worse than GPT 3.5, for example, but there were still some failures with GPT 3.5\. This is a good way to get a sense of how well your generation component is doing. 
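The other two test functions wired into `e2e_evaluation` follow the same pattern as `test_content_for_bad_words`. As a rough sketch (the actual implementations live in the linked `eval_e2e.py` file), the good-words check might look like this:

```py
def test_content_contains_good_words(
    item: dict, n_items_retrieved: int = 5
) -> TestResult:
    question = item["question"]
    good_words = item["good_words"]
    response = process_input_with_retrieval(
        question, n_items_retrieved=n_items_retrieved
    )
    # Fail as soon as any expected keyword is missing from the response.
    for word in good_words:
        if word not in response:
            return TestResult(
                success=False,
                question=question,
                keyword=word,
                response=response,
            )
    return TestResult(success=True, question=question, response=response)
```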
-
-As you become more familiar with the kinds of outputs your LLM generates, you can add the hard ones to this test suite. This helps prevent regressions and is directly related to the quality of the output you're getting. This way you can optimize for your specific use case.
-
-## Automated evaluation using another LLM
-
-Another way to evaluate the generation component is to use another LLM to grade the output of the LLM you're evaluating. This is a more sophisticated approach and requires a bit more setup. We can use the pre-generated questions and the associated context as input to the LLM and then use another LLM to assess the quality of the output on a scale of 1 to 5\. This is a more quantitative approach and since it's automated it can run across a larger set of data.
-
-
-
- LLMs don't always do well on this kind of evaluation where numbers are involved. There are some studies showing that LLMs can be biased towards certain numbers or ranges of numbers. This is something to keep in mind when using this approach. Qualitative evaluations are often more reliable, but that means a human has to do the evaluation.
-
-We can start by setting up a Pydantic model to hold the data we need. We set constraints to ensure that the data we're getting back are only integers between 1 and 5, inclusive:
-
-```py
-class LLMJudgedTestResult(BaseModel):
-    toxicity: conint(ge=1, le=5)
-    faithfulness: conint(ge=1, le=5)
-    helpfulness: conint(ge=1, le=5)
-    relevance: conint(ge=1, le=5)
-```
-
-We can use this in a test function that:
-
-* takes a question and a context as inputs
-* generates an answer using the LLM we're evaluating
-* makes a call to an (optionally different) LLM we're using to judge the quality of the answer, getting back a score for each of the four categories in JSON format
-* parses the JSON and returns the result of the evaluation as our Pydantic model instance
-
-Pydantic handles the validation of the JSON input for us, so we can be sure that we're getting the data we expect and in a form that we can use.
-
-```py
-def llm_judged_test_e2e(
-    question: str,
-    context: str,
-    n_items_retrieved: int = 5,
-) -> LLMJudgedTestResult:
-    """E2E tests judged by an LLM.
-
-    Args:
-        question (str): The question to ask.
-        context (str): The context to evaluate the answer against.
-        n_items_retrieved (int): The number of items to retrieve.
-
-    Returns:
-        LLMJudgedTestResult: The result of the test.
-    """
-    logging.debug("Starting LLM judged test...")
-    response = process_input_with_retrieval(
-        question, n_items_retrieved=n_items_retrieved
-    )
-    logging.debug("Input processed with retrieval.")
-    prompt = f"""
-    Please analyze the following text and provided context to provide scores
-    for toxicity, faithfulness, helpfulness, and relevance. The toxicity score
-    should be an integer between 1 and 5, where 1 is not toxic and 5 is very
-    toxic. The faithfulness score should be an integer between 1 and 5, where 1
-    is not faithful to the provided context and 5 is very faithful.
-    The helpfulness score should be an integer between 1 and 5, where 1 is not
-    helpful and 5 is very helpful. The relevance score should be an integer
-    between 1 and 5, where 1 is not relevant given the context and 5 is very relevant.
- - **Text:** {response} - - **Context:** {context} - - **Output format:** - {{ - "toxicity": int, - "faithfulness": int, - "helpfulness": int, - "relevance": int - }} - """ - logging.debug("Prompt created.") - response = completion( - model="gpt-4-turbo", messages=[{"content": prompt, "role": "user"}] - ) - - json_output = response["choices"][0]["message"]["content"].strip() - logging.info("Received response from model.") - logging.debug(json_output) - try: - return LLMJudgedTestResult(**json.loads(json_output)) - except json.JSONDecodeError as e: - logging.error(f"JSON bad output: {json_output}") - raise e -``` - -Currently we're not handling retries of the output from the LLM in the case where the JSON isn't output correctly, but potentially that's something we might want to do. - -We can then run this test across a set of questions and contexts: - -```py -def run_llm_judged_tests( - test_function: Callable, - sample_size: int = 50, -) -> Tuple[ - Annotated[float, "average_toxicity_score"], - Annotated[float, "average_faithfulness_score"], - Annotated[float, "average_helpfulness_score"], - Annotated[float, "average_relevance_score"], -]: - dataset = load_dataset("zenml/rag_qa_embedding_questions", split="train") - - # Shuffle the dataset and select a random sample - sampled_dataset = dataset.shuffle(seed=42).select(range(sample_size)) - - total_tests = len(sampled_dataset) - total_toxicity = 0 - total_faithfulness = 0 - total_helpfulness = 0 - total_relevance = 0 - - for item in sampled_dataset: - question = item["generated_questions"][0] - context = item["page_content"] - - try: - result = test_function(question, context) - except json.JSONDecodeError as e: - logging.error(f"Failed for question: {question}. Error: {e}") - total_tests -= 1 - continue - total_toxicity += result.toxicity - total_faithfulness += result.faithfulness - total_helpfulness += result.helpfulness - total_relevance += result.relevance - - average_toxicity_score = total_toxicity / total_tests - average_faithfulness_score = total_faithfulness / total_tests - average_helpfulness_score = total_helpfulness / total_tests - average_relevance_score = total_relevance / total_tests - - return ( - round(average_toxicity_score, 3), - round(average_faithfulness_score, 3), - round(average_helpfulness_score, 3), - round(average_relevance_score, 3), - ) -``` - -You'll want to use your most capable and reliable LLM to do the judging. In our case, we used the new GPT-4 Turbo. The quality of the evaluation is only as good as the LLM you're using to do the judging and there is a large difference between GPT-3.5 and GPT-4 Turbo in terms of the quality of the output, not least in its ability to output JSON correctly. - -Here was the output following an evaluation for 50 randomly sampled datapoints: - - -```py -Step e2e_evaluation_llm_judged has started. -Average toxicity: 1.0 -Average faithfulness: 4.787 -Average helpfulness: 4.595 -Average relevance: 4.87 -Step e2e_evaluation_llm_judged has finished in 8m51s. -Pipeline run has finished in 8m52s. -``` - -This took around 9 minutes to run using GPT-4 Turbo as the evaluator and the default GPT-3.5 as the LLM being evaluated. - -To take this further, there are a number of ways it might be improved: - -* **Retries**: As mentioned above, we're not currently handling retries of the output from the LLM in the case where the JSON isn't output correctly. This could be improved by adding a retry mechanism that waits for a certain amount of time before trying again. 
(We could potentially use the [instructor](https://github.com/jxnl/instructor) library to handle this specifically.) -* **Use OpenAI's 'JSON mode'**: OpenAI has a [JSON mode](https://platform.openai.com/docs/guides/text-generation/json-mode) that can be used to ensure that the output is always in JSON format. This could be used to ensure that the output is always in the correct format. -* **More sophisticated evaluation**: The evaluation we're doing here is quite simple. We're just asking for a score in four categories. There are more sophisticated ways to evaluate the quality of the output, such as using multiple evaluators and taking the average score, or using a more complex scoring system that takes into account the context of the question and the context of the answer. -* **Batch processing**: We're running the evaluation one question at a time here. It would be more efficient to run the evaluation in batches to speed up the process. -* **More data**: We're only using 50 samples here. This could be increased to get a more accurate picture of the quality of the output. -* **More LLMs**: We're only using GPT-4 Turbo here. It would be interesting to see how other LLMs perform as evaluators. -* **Handcrafted questions based on context**: We're using the generated questions here. It would be interesting to see how the LLM performs when given handcrafted questions that are based on the context of the question. -* **Human in the loop**: The LLM actually provides qualitative feedback on the output as well as the JSON scores. This data could be passed into an annotation tool to get human feedback on the quality of the output. This would be a more reliable way to evaluate the quality of the output and would offer some insight into the kinds of mistakes the LLM is making. - -Most notably, the scores we're currently getting are pretty high, so it would make sense to pass in harder questions and be more specific in the judging criteria. This will give us more room to improve as it is sure that the system is not perfect. - -While this evaluation approach serves as a solid foundation, it's worth noting that there are other frameworks available that can further enhance the evaluation process. Frameworks such as [ragas](https://github.com/explodinggradients/ragas), [trulens](https://www.trulens.org/), [DeepEval](https://docs.confident-ai.com/), and [UpTrain](https://github.com/uptrain-ai/uptrain) can be integrated with ZenML depending on your specific use-case and understanding of the underlying concepts. These frameworks, although potentially complex to set up and use, can provide more sophisticated evaluation capabilities as your project evolves and grows in complexity. - -We now have a working evaluation of both the retrieval and generation evaluation components of our RAG pipeline. We can use this to track how our pipeline improves as we make changes to the retrieval and generation components. - -## Code Example - -To explore the full code, visit the [Complete Guide](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/) repository and for this section, particularly [the eval\_e2e.py file](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/steps/eval%5Fe2e.py). 
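Circling back to the retry idea from the improvements list above, here is a minimal sketch of what such a wrapper might look like; the helper name is hypothetical and it is not part of the project code:

```py
import json
import logging
import time


def llm_judged_test_with_retries(
    question: str, context: str, max_attempts: int = 3
) -> LLMJudgedTestResult:
    """Call the LLM judge, retrying whenever its JSON output fails to parse."""
    for attempt in range(1, max_attempts + 1):
        try:
            return llm_judged_test_e2e(question, context)
        except json.JSONDecodeError:
            logging.warning(
                f"Judge returned invalid JSON (attempt {attempt}/{max_attempts}), retrying..."
            )
            time.sleep(2**attempt)  # simple exponential backoff
    raise RuntimeError("LLM judge never returned valid JSON")
```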
diff --git a/develop/user-guide/guides/llmops-guide/evaluation/retrieval.mdx b/develop/user-guide/guides/llmops-guide/evaluation/retrieval.mdx deleted file mode 100644 index ff177f9..0000000 --- a/develop/user-guide/guides/llmops-guide/evaluation/retrieval.mdx +++ /dev/null @@ -1,241 +0,0 @@ ---- -title: "Retrieval evaluation" -description: "See how the retrieval component responds to changes in the pipeline." -icon: magnifying-glass ---- - -The retrieval component of our RAG pipeline is responsible for finding relevant documents or document chunks to feed into the generation component. In this section we'll explore how to evaluate the performance of the retrieval component of your RAG pipeline. We're checking how accurate the semantic search is, or in other words how relevant the retrieved documents are to the query. - -Our retrieval component takes the incoming query and converts it into a vector or embedded representation that can be used to search for relevant documents. We then use this representation to search through a corpus of documents and retrieve the most relevant ones. - -## Manual evaluation using handcrafted queries - -The most naive and simple way to check this would be to handcraft some queries where we know the specific documents needed to answer it. We can then check if the retrieval component is able to retrieve these documents. This is a manual evaluation process and can be time-consuming, but it's a good way to get a sense of how well the retrieval component is working. It can also be useful to target known edge cases or difficult queries to see how the retrieval component handles those known scenarios. - - - - -Implementing this is pretty simple - you just need to create some queries and check the retrieved documents. Having tested the basic inference of our RAG setup quite a bit, there were some clear areas where the retrieval component could be improved. I looked in our documentation to find some examples where the information could only be found in a single page and then wrote some queries that would require the retrieval component to find that page. For example, the query "How do I get going with the Label Studio integration? What are the first steps?" would require the retrieval component to find [the Label Studio integration page](https://docs.zenml.io/stacks-and-components/component-guide/annotators/label-studio). Some of the other examples used are: - -| Question | URL Ending | -| ------------------------------------------------------------------------------- | ----------------------------------------------------------------------- | -| How do I get going with the Label Studio integration? What are the first steps? | stacks-and-components/component-guide/annotators/label-studio | -| How can I write my own custom materializer? | user-guide/advanced-guide/data-management/handle-custom-data-types | -| How do I generate embeddings as part of a RAG pipeline when using ZenML? | user-guide/llmops-guide/rag-with-zenml/embeddings-generation | -| How do I use failure hooks in my ZenML pipeline? | user-guide/advanced-guide/pipelining-features/use-failure-success-hooks | -| Can I deploy ZenML self-hosted with Helm? How do I do it? | deploying-zenml/zenml-self-hosted/deploy-with-helm | - -For the retrieval pipeline, all we have to do is encode the query as a vector and then query the PostgreSQL database for the most similar vectors. We then check whether the URL for the document we thought must show up is actually present in the top `n` results. 
- - -``` -def query_similar_docs(question: str, url_ending: str) -> tuple: - embedded_question = get_embeddings(question) - db_conn = get_db_conn() - top_similar_docs_urls = get_topn_similar_docs( - embedded_question, db_conn, n=5, only_urls=True - ) - urls = [url[0] for url in top_similar_docs_urls] # Unpacking URLs from tuples - return (question, url_ending, urls) - -def test_retrieved_docs_retrieve_best_url(question_doc_pairs: list) -> float: - total_tests = len(question_doc_pairs) - failures = 0 - - for pair in question_doc_pairs: - question, url_ending, urls = query_similar_docs( - pair["question"], pair["url_ending"] - ) - if all(url_ending not in url for url in urls): - logging.error( - f"Failed for question: {question}. Expected URL ending: {url_ending}. Got: {urls}" - ) - failures += 1 - - logging.info(f"Total tests: {total_tests}. Failures: {failures}") - failure_rate = (failures / total_tests) * 100 - return round(failure_rate, 2) -``` - -We include some logging so that when running the pipeline locally we can get some immediate feedback logged to the console. - -This functionality can then be packaged up into a ZenML step once we're happy it does what we need: - - -```py -@step -def retrieval_evaluation_small() -> Annotated[float, "small_failure_rate_retrieval"]: - failure_rate = test_retrieved_docs_retrieve_best_url(question_doc_pairs) - logging.info(f"Retrieval failure rate: {failure_rate}%") - return failure_rate -``` - -We got a 20% failure rate on the first run of this test, which was a good sign that the retrieval component could be improved. We only had 5 test cases, so this was just a starting point. In reality, you'd want to keep adding more test cases to cover a wider range of scenarios. You'll discover these failure cases as you use the system more and more, so it's a good idea to keep a record of them and add them to your test suite. - -You'd also want to examine the logs to see exactly which query failed. In our case, checking the logs in the ZenML dashboard, we find the following: - -```Bash -Failed for question: How do I generate embeddings as part of a RAG -pipeline when using ZenML?. Expected URL ending: user-guide/llmops-guide/ -rag-with-zenml/embeddings-generation. Got: ['https://docs.zenml.io/versions/0.66.0/user-guide/ -llmops-guide/rag-with-zenml/data-ingestion', 'https://docs.zenml.io/versions/0.66.0/user-guide/ -llmops-guide/rag-with-zenml/understanding-rag', 'https://docs.zenml.io/v/docs/ -user-guide/advanced-guide/data-management/handle-custom-data-types', 'https://docs. -zenml.io/user-guide/guides/llmops-guide/rag-with-zenml', 'https://docs.zenml.io/v/docs/ -user-guide/llmops-guide/rag-with-zenml'] -``` - -We can maybe take a look at those documents to see why they were retrieved and not the one we expected. This is a good way to iteratively improve the retrieval component. - -## Automated evaluation using synthetic generated queries - -For a broader evaluation we can examine a larger number of queries to check the retrieval component's performance. We do this by using an LLM to generate synthetic data. In our case we take the text of each document chunk and pass it to an LLM, telling it to generate a question. - - - - - -For example, given the text: - -```Bash -zenml orchestrator connect ${ORCHESTRATOR\_NAME} -iHead on over to our docs to -learn more about orchestrators and how to configure them. 
Container Registry export -CONTAINER\_REGISTRY\_NAME=gcp\_container\_registry zenml container-registry register $ -{CONTAINER\_REGISTRY\_NAME} --flavor=gcp --uri= # Connect the GCS -orchestrator to the target gcp project via a GCP Service Connector zenml -container-registry connect ${CONTAINER\_REGISTRY\_NAME} -i Head on over to our docs to -learn more about container registries and how to configure them. 7) Create Stack -export STACK\_NAME=gcp\_stack zenml stack register ${STACK\_NAME} -o $ -{ORCHESTRATOR\_NAME} \\ a ${ARTIFACT\_STORE\_NAME} -c ${CONTAINER\_REGISTRY\_NAME} ---set In case you want to also add any other stack components to this stack, feel free -to do so. And you're already done! Just like that, you now have a fully working GCP -stack ready to go. Feel free to take it for a spin by running a pipeline on it. -Cleanup If you do not want to use any of the created resources in the future, simply -delete the project you created. gcloud project delete
ZenML Scarf
PreviousScale compute to the -cloud NextConfiguring ZenML Last updated 2 days ago -``` - -we might get the question: - -```Bash -How do I create and configure a GCP stack in ZenML using an -orchestrator, container registry, and stack components, and how -do I delete the resources when they are no longer needed? -``` - -If we generate questions for all of our chunks, we can then use these question-chunk pairs to evaluate the retrieval component. We pass the generated query to the retrieval component and then we check if the URL for the original document is in the top `n` results. - -To generate the synthetic queries we can use the following code: - -```py -from typing import List - -from litellm import completion -from structures import Document -from zenml import step - -LOCAL_MODEL = "ollama/mixtral" - - -def generate_question(chunk: str, local: bool = False) -> str: - model = LOCAL_MODEL if local else "gpt-3.5-turbo" - response = completion( - model=model, - messages=[ - { - "content": f"This is some text from ZenML's documentation. Please generate a question that can be asked about this text: `{chunk}`", - "role": "user", - } - ], - api_base="http://localhost:11434" if local else None, - ) - return response.choices[0].message.content - - -@step -def generate_questions_from_chunks( - docs_with_embeddings: List[Document], - local: bool = False, -) -> List[Document]: - for doc in docs_with_embeddings: - doc.generated_questions = [generate_question(doc.page_content, local)] - - assert all(doc.generated_questions for doc in docs_with_embeddings) - - return docs_with_embeddings -``` - -As you can see, we're using [litellm](https://docs.litellm.ai/) again as the wrapper for the API calls. This allows us to switch between using a cloud LLM API (like OpenAI's GPT3.5 or 4) and a local LLM (like a quantized version of Mistral AI's Mixtral made available with [Ollama](https://ollama.com/). This has a number of advantages: - -* you keep your costs down by using a local model -* you can iterate faster by not having to wait for API calls -* you can use the same code for both local and cloud models - -For some tasks you'll want to use the best model your budget can afford, but for this task of question generation we're fine using a local and slightly less capable model. Even better is that it'll be much faster to generate the questions, especially using the basic setup we have here. - -To give you an indication of how long this process takes, generating 1800+ questions from an equivalent number of documentation chunks took a little over 45 minutes using the local model on a GPU-enabled machine with Ollama. - - - - -You can [view the generated dataset](https://huggingface.co/datasets/zenml/rag%5Fqa%5Fembedding%5Fquestions) on the Hugging Face Hub [here](https://huggingface.co/datasets/zenml/rag%5Fqa%5Fembedding%5Fquestions). This dataset contains the original document chunks, the generated questions, and the URL reference for the original document. - -Once we have the generated questions, we can then pass them to the retrieval component and check the results. For convenience we load the data from the Hugging Face Hub and then pass it to the retrieval component for evaluation. We shuffle the data and select a subset of it to speed up the evaluation process, but for a more thorough evaluation you could use the entire dataset. (The best practice of keeping a separate set of data for evaluation purposes is also recommended here, though we're not doing that in this example.) 
- -```py -@step -def retrieval_evaluation_full( - sample_size: int = 50, -) -> Annotated[float, "full_failure_rate_retrieval"]: - dataset = load_dataset("zenml/rag_qa_embedding_questions", split="train") - - sampled_dataset = dataset.shuffle(seed=42).select(range(sample_size)) - - total_tests = len(sampled_dataset) - failures = 0 - - for item in sampled_dataset: - generated_questions = item["generated_questions"] - question = generated_questions[ - 0 - ] # Assuming only one question per item - url_ending = item["filename"].split("/")[ - -1 - ] # Extract the URL ending from the filename - - _, _, urls = query_similar_docs(question, url_ending) - - if all(url_ending not in url for url in urls): - logging.error( - f"Failed for question: {question}. Expected URL ending: {url_ending}. Got: {urls}" - ) - failures += 1 - - logging.info(f"Total tests: {total_tests}. Failures: {failures}") - failure_rate = (failures / total_tests) * 100 - return round(failure_rate, 2) -``` - -When we run this as part of the evaluation pipeline, we get a 16% failure rate which again tells us that we're doing pretty well but that there is room for improvement. As a baseline, this is a good starting point. We can then iterate on the retrieval component to improve its performance. - -To take this further, there are a number of ways it might be improved: - -* **More diverse question generation**: The current question generation approach uses a single prompt to generate questions based on the document chunks. You could experiment with different prompts or techniques to generate a wider variety of questions that test the retrieval component more thoroughly. For example, you could prompt the LLM to generate questions of different types (factual, inferential, hypothetical, etc.) or difficulty levels. -* **Semantic similarity metrics**: In addition to checking if the expected URL is retrieved, you could calculate semantic similarity scores between the query and the retrieved documents using metrics like cosine similarity. This would give you a more nuanced view of retrieval performance beyond just binary success/failure. You could track average similarity scores and use them as a target metric to improve. -* **Comparative evaluation**: Test out different retrieval approaches (e.g. different embedding models, similarity search algorithms, etc.) and compare their performance on the same set of queries. This would help identify the strengths and weaknesses of each approach. -* **Error analysis**: Do a deeper dive into the failure cases to understand patterns and potential areas for improvement. Are certain types of questions consistently failing? Are there common characteristics among the documents that aren't being retrieved properly? Insights from error analysis can guide targeted improvements to the retrieval component. - -To wrap up, the retrieval evaluation process we've walked through - from manual spot-checking with carefully crafted queries to automated testing with synthetic question-document pairs - has provided a solid baseline understanding of our retrieval component's performance. The failure rates of 20% on our handpicked test cases and 16% on a larger sample of generated queries highlight clear room for improvement, but also validate that our semantic search is generally pointing in the right direction. - -Going forward, we have a rich set of options to refine and upgrade our evaluation approach. 
Generating a more diverse array of test questions, leveraging semantic similarity metrics for a nuanced view beyond binary success/failure, performing comparative evaluations of different retrieval techniques, and conducting deep error analysis on failure cases - all of these avenues promise to yield valuable insights. As our RAG pipeline grows to handle more complex and wide-ranging queries, continued investment in comprehensive retrieval evaluation will be essential to ensure we're always surfacing the most relevant information. - -Before we start working to improve or tweak our retrieval based on these evaluation results, let's shift gears and look at how we can evaluate the generation component of our RAG pipeline. Assessing the quality of the final answers produced by the system is equally crucial to gauging the effectiveness of our retrieval. - -Retrieval is only half the story. The true test of our system is the quality of the final answers it generates by combining retrieved content with LLM intelligence. In the next section, we'll dive into a parallel evaluation process for the generation component, exploring both automated metrics and human assessment to get a well-rounded picture of our RAG pipeline's end-to-end performance. By shining a light on both halves of the RAG architecture, we'll be well-equipped to iterate and optimize our way to an ever more capable and reliable question-answering system. - -## Code Example - -To explore the full code, visit the [Complete Guide](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/) repository and for this section, particularly [the eval\_retrieval.py file](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/steps/eval%5Fretrieval.py). diff --git a/develop/user-guide/guides/llmops-guide/finetuning-embeddings.mdx b/develop/user-guide/guides/llmops-guide/finetuning-embeddings.mdx deleted file mode 100644 index 417efd5..0000000 --- a/develop/user-guide/guides/llmops-guide/finetuning-embeddings.mdx +++ /dev/null @@ -1,19 +0,0 @@ ---- -title: "Overview" -description: "Finetune embeddings on custom synthetic data to improve retrieval performance." -icon: bullseye-arrow ---- - -We previously learned [how to use RAG with ZenML](/develop/user-guide/guides/llmops-guide/rag-with-zenml) to build a production-ready RAG pipeline. In this section, we will explore how to optimize and maintain your embedding models through synthetic data generation and human feedback. So far, we've been using off-the-shelf embeddings, which provide a good baseline and decent performance on standard tasks. However, you can often significantly improve performance by finetuning embeddings on your own domain-specific data. - -Our RAG pipeline uses a retrieval-based approach, where it first retrieves the most relevant documents from our vector database, and then uses a language model to generate a response based on those documents. By finetuning our embeddings on a dataset of technical documentation similar to our target domain, we can improve the retrieval step and overall performance of the RAG pipeline. - -The work of finetuning embeddings based on synthetic data and human feedback is a multi-step process. 
We'll go through the following steps: -
-
-* [generating synthetic data with distilabel](/develop/user-guide/guides/llmops-guide/finetuning-embeddings/synthetic-data-generation)
-* [finetuning embeddings with Sentence Transformers](/develop/user-guide/guides/llmops-guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers)
-* [evaluating finetuned embeddings and using ZenML's model control plane to get a systematic overview](/develop/user-guide/guides/llmops-guide/finetuning-embeddings/evaluating-finetuned-embeddings)
-
-Besides ZenML, we will do this by using two open source libraries: [argilla](https://github.com/argilla-io/argilla/) and [distilabel](https://github.com/argilla-io/distilabel). Both of these libraries focus on optimizing model outputs by improving data quality; however, each takes a different approach to the same problem. `distilabel` provides a scalable and reliable approach to distilling knowledge from LLMs by generating synthetic data or providing AI feedback with LLMs as judges. `argilla` enables AI engineers and domain experts to collaborate on data projects by allowing them to organize and explore data within an interactive and engaging UI. Both libraries can be used individually but they work better together. We'll showcase their use via ZenML pipelines.
-
-To follow along with the example explained in this guide, please follow the instructions in [the llm-complete-guide repository](https://github.com/zenml-io/zenml-projects/llm-complete-guide/README.md) where the full code is also available. This specific section on embeddings finetuning can be run locally or using cloud compute as you prefer. diff --git a/develop/user-guide/guides/llmops-guide/finetuning-embeddings/evaluating-finetuned-embeddings.mdx b/develop/user-guide/guides/llmops-guide/finetuning-embeddings/evaluating-finetuned-embeddings.mdx deleted file mode 100644 index f26e35d..0000000 --- a/develop/user-guide/guides/llmops-guide/finetuning-embeddings/evaluating-finetuned-embeddings.mdx +++ /dev/null @@ -1,94 +0,0 @@ ---
-title: "Evaluating finetuned embeddings"
-description: "Evaluate finetuned embeddings and compare to original base embeddings."
-icon: chart-line
----
-
-Now that we've finetuned our embeddings, we can evaluate them and compare them to the base embeddings. We have all the data saved and versioned already, and we will reuse the same MatryoshkaLoss function for evaluation.
-
-In code, our evaluation steps are easy to comprehend. 
Here, for example, is the base model evaluation step: - -```py -from zenml import log_model_metadata, step - -def evaluate_model( - dataset: DatasetDict, model: SentenceTransformer -) -> Dict[str, float]: - """Evaluate the given model on the dataset.""" - evaluator = get_evaluator( - dataset=dataset, - model=model, - ) - return evaluator(model) - -@step -def evaluate_base_model( - dataset: DatasetDict, -) -> Annotated[Dict[str, float], "base_model_evaluation_results"]: - """Evaluate the base model on the given dataset.""" - model = SentenceTransformer( - EMBEDDINGS_MODEL_ID_BASELINE, - device="cuda" if torch.cuda.is_available() else "cpu", - ) - - results = evaluate_model( - dataset=dataset, - model=model, - ) - - # Convert numpy.float64 values to regular Python floats - # (needed for serialization) - base_model_eval = { - f"dim_{dim}_cosine_ndcg@10": float( - results[f"dim_{dim}_cosine_ndcg@10"] - ) - for dim in EMBEDDINGS_MODEL_MATRYOSHKA_DIMS - } - - log_model_metadata( - metadata={"base_model_eval": base_model_eval}, - ) - - return results -``` - -We log the results for our core Matryoshka dimensions as model metadata to ZenML within our evaluation step. This will allow us to inspect these results from within [the Model Control Plane](https://docs.zenml.io/usage/development-monitoring/use-the-model-control-plane) (see below for more details). Our results come in the form of a dictionary of string keys and float values which will, like all step inputs and outputs, be versioned, tracked and saved in your artifact store. - -### Visualizing results - -It's possible to visualize results in a few different ways in ZenML, but one easy option is just to output your chart as an `PIL.Image` object. (See our [documentation on more ways to visualize your results](/develop/usage/resource-data-management/visualize-artifacts).) The rest the implementation of our `visualize_results` step is just simple `matplotlib` code to plot out the base model evaluation against the finetuned model evaluation. We represent the results as percentage values and horizontally stack the two sets to make comparison a little easier. - - - - -We can see that our finetuned embeddings have improved the recall of our retrieval system across all of the dimensions, but the results are still not amazing. In a production setting, we would likely want to focus on improving the data being used for the embeddings training. In particular, we could consider stripping out some of the logs output from the documentation, and perhaps omit some pages which offer low signal for the retrieval task. This embeddings finetuning was run purely on the full set of synthetic data generated by `distilabel` and `gpt-4o`, so we wouldn't necessarily expect to see huge improvements out of the box, especially when the underlying data chunks are complex and contain multiple topics. - -### Model Control Plane as unified interface - -Once all our pipelines are finished running, the best place to inspect our results as well as the artifacts and models we generated is the Model Control Plane. 
- - - - - -The interface is split into sections that correspond to: - -* the artifacts generated by our steps -* the models generated by our steps -* the metadata logged by our steps -* (potentially) any deployments of models made, though we didn't use this in this guide so far -* any pipeline runs associated with this 'Model' - -We can easily see which are the latest artifact or technical model versions, as well as compare the actual values of our evals or inspect the hardware or hyperparameters used for training. - -This one-stop-shop interface is available on ZenML Pro and you can learn more about it in the [Model Control Plane documentation](https://docs.zenml.io/usage/development-monitoring/use-the-model-control-plane). - -### Next Steps - -Now that we've finetuned our embeddings and evaluated them, when they were in a good shape for use we could bring these into [the original RAG pipeline](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/user-guide/guides/llmops-guide/rag/basic-rag-inference-pipeline.md), regenerate a new series of embeddings for our data and then rerun our RAG retrieval evaluations to see how they've improved in our hand-crafted and LLM-powered evaluations. - -The next section will cover [LLM finetuning and deployment](/develop/user-guide/guides/llmops-guide/finetuning-llms) as the final part of our LLMops guide. (This section is currently still a work in progress, but if you're eager to try out LLM finetuning with ZenML, you can use [our LoRA project](https://github.com/zenml-io/zenml-projects/blob/main/llm-lora-finetuning/README.md) to get started. We also have [a blogpost](https://www.zenml.io/blog/how-to-finetune-llama-3-1-with-zenml) guide which takes you through [all the steps you need to finetune Llama 3.1](https://www.zenml.io/blog/how-to-finetune-llama-3-1-with-zenml) using GCP's Vertex AI with ZenML, including one-click stack creation!) - -To try out the two pipelines, please follow the instructions in [the project repository README](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/README.md), and you can find the full code in that same directory. - diff --git a/develop/user-guide/guides/llmops-guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers.mdx b/develop/user-guide/guides/llmops-guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers.mdx deleted file mode 100644 index 634d121..0000000 --- a/develop/user-guide/guides/llmops-guide/finetuning-embeddings/finetuning-embeddings-with-sentence-transformers.mdx +++ /dev/null @@ -1,73 +0,0 @@ ---- -title: "Finetuning embeddings with Sentence Transformers" -description: "Finetune embeddings with Sentence Transformers." -icon: brain ---- - -We now have a dataset that we can use to finetune our embeddings. You can [inspect the positive and negative examples](https://huggingface.co/datasets/zenml/rag%5Fqa%5Fembedding%5Fquestions%5F0%5F60%5F0%5Fdistilabel) on the Hugging Face [datasets page](https://huggingface.co/datasets/zenml/rag%5Fqa%5Fembedding%5Fquestions%5F0%5F60%5F0%5Fdistilabel) since our previous pipeline pushed the data there. - - - - - -Our pipeline for finetuning the embeddings is relatively simple. 
We'll do the following: - -* load our data either from Hugging Face or [from Argilla via the ZenML annotation integration](/develop/stack-components/annotators/argilla) -* finetune our model using the [Sentence Transformers](https://www.sbert.net/) library -* evaluate the base and finetuned embeddings -* visualize the results of the evaluation - - - - -### Loading data - -By default the pipeline will load the data from our Hugging Face dataset. If you've annotated your data in Argilla, you can load the data from there instead. You'll just need to pass an `--argilla` flag to the Python invocation when you're running the pipeline like so: - -```bash -python run.py --embeddings --argilla -``` - -This assumes that you've set up an Argilla annotator in your stack. The code checks for the annotator and downloads the data that was annotated in Argilla. Please see our [guide to using the Argilla integration with ZenML](/develop/stack-components/annotators/argilla) for more details. - -### Finetuning with Sentence Transformers - -The `finetune` step in the pipeline is responsible for finetuning the embeddings model using the Sentence Transformers library. Let's break down the key aspects of this step: - -1. **Model Loading**: The code loads the base model (`EMBEDDINGS_MODEL_ID_BASELINE`) using the Sentence Transformers library. It utilizes the SDPA (Self-Distilled Pruned Attention) implementation for efficient training with Flash Attention 2. -2. **Loss Function**: The finetuning process employs a custom loss function called `MatryoshkaLoss`. This loss function is a wrapper around the `MultipleNegativesRankingLoss` provided by Sentence Transformers. The Matryoshka approach involves training the model with different embedding dimensions simultaneously. It allows the model to learn embeddings at various granularities, improving its performance across different embedding sizes. -3. **Dataset Preparation**: The training dataset is loaded from the provided `dataset` parameter. The code saves the training data to a temporary JSON file and then loads it using the Hugging Face `load_dataset` function. -4. **Evaluator**: An evaluator is created using the `get_evaluator` function. The evaluator is responsible for assessing the model's performance during training. -5. **Training Arguments**: The code sets up the training arguments using the `SentenceTransformerTrainingArguments` class. It specifies various hyperparameters such as the number of epochs, batch size, learning rate, optimizer, precision (TF32 and BF16), and evaluation strategy. -6. **Trainer**: The `SentenceTransformerTrainer` is initialized with the model, training arguments, training dataset, loss function, and evaluator. The trainer handles the training process. The `trainer.train()` method is called to start the finetuning process. The model is trained for the specified number of epochs using the provided hyperparameters. -7. **Model Saving**: After training, the finetuned model is pushed to the Hugging Face Hub using the `trainer.model.push_to_hub()` method. The model is saved with the specified ID (`EMBEDDINGS_MODEL_ID_FINE_TUNED`). -8. **Metadata Logging**: The code logs relevant metadata about the training process, including the training parameters, hardware information, and accelerator details. -9. **Model Rehydration**: To handle materialization errors, the code saves the trained model to a temporary file, loads it back into a new `SentenceTransformer` instance, and returns the rehydrated model. 
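Several of the pieces mentioned in the list above, such as `get_evaluator`, are project utilities rather than library calls. As a rough sketch of what such an evaluator factory might look like, closely following the Matryoshka evaluation pattern from the Sentence Transformers documentation and assuming a test split with `anchor` and `positive` columns (the dimensions listed here are illustrative, not the project's actual constant):

```py
from datasets import DatasetDict
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import (
    InformationRetrievalEvaluator,
    SequentialEvaluator,
)
from sentence_transformers.util import cos_sim

# Illustrative values only; the project defines its own
# EMBEDDINGS_MODEL_MATRYOSHKA_DIMS constant.
MATRYOSHKA_DIMS = [384, 256, 128, 64]


def get_evaluator(dataset: DatasetDict, model: SentenceTransformer) -> SequentialEvaluator:
    """Build one information-retrieval evaluator per Matryoshka dimension.

    The model argument isn't needed to construct the evaluators in this sketch;
    it's kept only to mirror the call sites in the evaluation steps.
    """
    corpus = {i: text for i, text in enumerate(dataset["test"]["positive"])}
    queries = {i: text for i, text in enumerate(dataset["test"]["anchor"])}
    # Each query is considered relevant only to the chunk it was generated from.
    relevant_docs = {query_id: {query_id} for query_id in queries}

    evaluators = [
        InformationRetrievalEvaluator(
            queries=queries,
            corpus=corpus,
            relevant_docs=relevant_docs,
            name=f"dim_{dim}",
            truncate_dim=dim,
            score_functions={"cosine": cos_sim},
        )
        for dim in MATRYOSHKA_DIMS
    ]
    return SequentialEvaluator(evaluators)
```

Run against a model, an evaluator built this way returns metrics keyed like `dim_384_cosine_ndcg@10`, which matches the `dim_{dim}_cosine_ndcg@10` values logged as model metadata in the evaluation steps.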
- -(_Thanks and credit to Phil Schmid for_ [_his tutorial on finetuning embeddings_](https://www.philschmid.de/fine-tune-embedding-model-for-rag) _with Sentence Transformers and a Matryoshka loss function. This project uses many ideas and some code from his implementation._) - -### Finetuning in code - -Here's a simplified code snippet highlighting the key parts of the finetuning process: - -```Bash -# Load the base model -model = SentenceTransformer(EMBEDDINGS_MODEL_ID_BASELINE) -# Define the loss function -train_loss = MatryoshkaLoss(model, MultipleNegativesRankingLoss(model)) -# Prepare the training dataset -train_dataset = load_dataset("json", data_files=train_dataset_path) -# Set up the training arguments -args = SentenceTransformerTrainingArguments(...) -# Create the trainer -trainer = SentenceTransformerTrainer(model, args, train_dataset, train_loss) -# Start training -trainer.train() -# Save the finetuned model -trainer.model.push_to_hub(EMBEDDINGS_MODEL_ID_FINE_TUNED) -``` - -The finetuning process leverages the capabilities of the Sentence Transformers library to efficiently train the embeddings model. The Matryoshka approach allows for learning embeddings at different dimensions simultaneously, enhancing the model's performance across various embedding sizes. - -Our model is finetuned, saved in the Hugging Face Hub for easy access and reference in subsequent steps, but also versioned and tracked within ZenML for full observability. At this point the pipeline will evaluate the base and finetuned embeddings and visualize the results. - diff --git a/develop/user-guide/guides/llmops-guide/finetuning-embeddings/synthetic-data-generation.mdx b/develop/user-guide/guides/llmops-guide/finetuning-embeddings/synthetic-data-generation.mdx deleted file mode 100644 index 8d62819..0000000 --- a/develop/user-guide/guides/llmops-guide/finetuning-embeddings/synthetic-data-generation.mdx +++ /dev/null @@ -1,175 +0,0 @@ ---- -title: "Synthetic data generation" -description: "Generate synthetic data with distilabel to finetune embeddings." -icon: robot ---- - -We already have [a dataset of technical documentation](https://huggingface.co/datasets/zenml/rag%5Fqa%5Fembedding%5Fquestions%5F0%5F60%5F0) that was generated previously while we were working on the RAG pipeline. We'll use this dataset to generate synthetic data with `distilabel`. You can inspect the data directly [on the Hugging Face dataset page](https://huggingface.co/datasets/zenml/rag%5Fqa%5Fembedding%5Fquestions%5F0%5F60%5F0). - - - - -As you can see, it is made up of some `page_content` (our chunks) as well as the source URL from where the chunk was taken from. With embeddings, what we're going to want to do is pair the `page_content` with a question that we want to answer. In a pre-LLM world we might have actually created a new column and worked to manually craft questions for each chunk. However, with LLMs, we can use the `page_content` to generate questions. - -### Pipeline overview - -Our pipeline to generate synthetic data will look like this: - - - -We'll load the Hugging Face dataset, then we'll use `distilabel` to generate the synthetic data. To finish off, we'll push the newly-generated data to a new Hugging Face dataset and also push the same data to our Argilla instance for annotation and inspection. 
- -### Synthetic data generation - -[distilabel](https://github.com/argilla-io/distilabel) provides a scalable and reliable approach to distilling knowledge from LLMs by generating synthetic data or providing AI feedback with LLMs as judges. We'll be using it a relatively simple use case to generate some queries appropriate to our documentation chunks, but it can be used for a variety of other tasks. - -We can set up a `distilabel` pipeline easily in our ZenML step to handle the dataset creation. We'll be using `gpt-4o` as the LLM to generate the synthetic data so you can follow along, but `distilabel` supports a variety of other LLM providers (including Ollama) so you can use whatever you have available. - -```py -import os -from typing import Annotated, Tuple - -import distilabel -from constants import ( - DATASET_NAME_DEFAULT, - OPENAI_MODEL_GEN, - OPENAI_MODEL_GEN_KWARGS_EMBEDDINGS, -) -from datasets import Dataset -from distilabel.llms import OpenAILLM -from distilabel.steps import LoadDataFromHub -from distilabel.steps.tasks import GenerateSentencePair -from zenml import step - -synthetic_generation_context = """ -The text is a chunk from technical documentation of ZenML. -ZenML is an MLOps + LLMOps framework that makes your infrastructure and workflow metadata accessible to data science teams. -Along with prose explanations, the text chunk may include code snippets and logs but these are identifiable from the surrounding backticks. -""" - -@step -def generate_synthetic_queries( - train_dataset: Dataset, test_dataset: Dataset -) -> Tuple[ - Annotated[Dataset, "train_with_queries"], - Annotated[Dataset, "test_with_queries"], -]: - llm = OpenAILLM( - model=OPENAI_MODEL_GEN, api_key=os.getenv("OPENAI_API_KEY") - ) - - with distilabel.pipeline.Pipeline( - name="generate_embedding_queries" - ) as pipeline: - load_dataset = LoadDataFromHub( - output_mappings={"page_content": "anchor"}, - ) - generate_sentence_pair = GenerateSentencePair( - triplet=True, # `False` to generate only positive - action="query", - llm=llm, - input_batch_size=10, - context=synthetic_generation_context, - ) - - load_dataset >> generate_sentence_pair - - train_distiset = pipeline.run( - parameters={ - load_dataset.name: { - "repo_id": DATASET_NAME_DEFAULT, - "split": "train", - }, - generate_sentence_pair.name: { - "llm": { - "generation_kwargs": OPENAI_MODEL_GEN_KWARGS_EMBEDDINGS - } - }, - }, - ) - - test_distiset = pipeline.run( - parameters={ - load_dataset.name: { - "repo_id": DATASET_NAME_DEFAULT, - "split": "test", - }, - generate_sentence_pair.name: { - "llm": { - "generation_kwargs": OPENAI_MODEL_GEN_KWARGS_EMBEDDINGS - } - }, - }, - ) - - train_dataset = train_distiset["default"]["train"] - test_dataset = test_distiset["default"]["train"] - - return train_dataset, test_dataset -``` - -As you can see, we set up the LLM, create a `distilabel` pipeline, load the dataset, mapping the `page_content` column so that it becomes `anchor`. (This column renaming will make things easier a bit later when we come to finetuning the embeddings.) Then we generate the synthetic data by using the `GenerateSentencePair` step. This will create queries for each of the chunks in the dataset, so if the chunk was about registering a ZenML stack, the query might be "How do I register a ZenML stack?". It will also create negative queries, which are queries that would be inappropriate for the chunk. We do this so that the embeddings model can learn to distinguish between appropriate and inappropriate queries. 
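To make that concrete, a single generated row might look something like this (an illustrative example rather than actual pipeline output):

```py
# Illustrative triplet produced by GenerateSentencePair with triplet=True:
example_row = {
    "anchor": "To register a stack, run `zenml stack register my_stack "
    "-o default -a default` and then set it as active.",  # the documentation chunk
    "positive": "How do I register a ZenML stack using the CLI?",
    "negative": "How do I delete a model version in ZenML?",
}
```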
- -We add some context to the generation process to help the LLM understand the task and the data we're working with. In particular, we explain that some parts of the text are code snippets and logs. We found performance to be better when we added this context. - -When this step runs within ZenML it will handle spinning up the necessary processes to make batched LLM calls to the OpenAI API. This is really useful when working with large datasets. `distilabel` has also implemented a caching mechanism to avoid recomputing results for the same inputs. So in this case you have two layers of caching: one in the `distilabel` pipeline and one in the ZenML orchestrator. This helps [speed up the pace of iteration](https://www.zenml.io/blog/iterate-fast) and saves you money. - -### Data annotation with Argilla - -Once we've let the LLM generate the synthetic data, we'll want to inspect it and make sure it looks good. We'll do this by pushing the data to an Argilla instance. We add a few extra pieces of metadata to the data to make it easier to navigate and inspect within our data annotation tool. These include: - -* `parent_section`: This will be the section of the documentation that the chunk is from. -* `token_count`: This will be the number of tokens in the chunk. -* `similarity-positive-negative`: This will be the cosine similarity between the positive and negative queries. -* `similarity-anchor-positive`: This will be the cosine similarity between the anchor and positive queries. -* `similarity-anchor-negative`: This will be the cosine similarity between the anchor and negative queries. - -We'll also add the embeddings for the anchor column so that we can use these for retrieval. We'll use the base model (in our case, `Snowflake/snowflake-arctic-embed-large`) to generate the embeddings. We use this function to map the dataset and process all the metadata: - -```py -def format_data(batch): - model = SentenceTransformer( - EMBEDDINGS_MODEL_ID_BASELINE, - device="cuda" if torch.cuda.is_available() else "cpu", - ) - - def get_embeddings(batch_column): - vectors = model.encode(batch_column) - return [vector.tolist() for vector in vectors] - - batch["anchor-vector"] = get_embeddings(batch["anchor"]) - batch["question-vector"] = get_embeddings(batch["anchor"]) - batch["positive-vector"] = get_embeddings(batch["positive"]) - batch["negative-vector"] = get_embeddings(batch["negative"]) - - def get_similarities(a, b): - similarities = [] - - for pos_vec, neg_vec in zip(a, b): - similarity = cosine_similarity([pos_vec], [neg_vec])[0][0] - similarities.append(similarity) - return similarities - - batch["similarity-positive-negative"] = get_similarities( - batch["positive-vector"], batch["negative-vector"] - ) - batch["similarity-anchor-positive"] = get_similarities( - batch["anchor-vector"], batch["positive-vector"] - ) - batch["similarity-anchor-negative"] = get_similarities( - batch["anchor-vector"], batch["negative-vector"] - ) - return batch -``` - -The [rest of the push\_to\_argilla step](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/steps/push%5Fto%5Fargilla.py) is just setting up the Argilla dataset and pushing the data to it. - -At this point you'd move to Argilla to view the data, see which examples seem to make sense and which don't. You can update the questions (positive and negative) which were generated by the LLM. If you want, you can do some data cleaning and exploration to improve the data quality, perhaps using the similarity metrics that we calculated earlier. 
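For example, a rough filter over the generated dataset could drop rows where the negative query ended up closer to the anchor than the positive one, using the similarity columns computed in `format_data` above. This is a sketch, assuming a Hugging Face `Dataset` that already carries those columns:

```py
from datasets import Dataset


def drop_suspect_rows(dataset: Dataset) -> Dataset:
    """Keep only rows where the positive query is closer to the anchor
    than the negative query is. The criterion here is illustrative; you
    could just as well use an absolute threshold."""
    return dataset.filter(
        lambda row: row["similarity-anchor-positive"]
        > row["similarity-anchor-negative"]
    )
```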
- - - - - - -We'll next move to actually finetuning the embeddings, assuming you've done some data exploration and annotation. The code will work even without the annotation, however, since we'll just use the full generated dataset and assume that the quality is good enough. - diff --git a/develop/user-guide/guides/llmops-guide/finetuning-llms.mdx b/develop/user-guide/guides/llmops-guide/finetuning-llms.mdx deleted file mode 100644 index 3240316..0000000 --- a/develop/user-guide/guides/llmops-guide/finetuning-llms.mdx +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: "Finetuning LLMs with ZenML" -description: "Finetune LLMs for specific tasks or to improve performance and cost." -icon: brain ---- - - This guide is a work in progress. Please check back soon for updates. diff --git a/develop/user-guide/guides/llmops-guide/rag-with-zenml.mdx b/develop/user-guide/guides/llmops-guide/rag-with-zenml.mdx deleted file mode 100644 index 190c026..0000000 --- a/develop/user-guide/guides/llmops-guide/rag-with-zenml.mdx +++ /dev/null @@ -1,19 +0,0 @@ ---- -title: "Overview " -description: "RAG is a sensible way to get started with LLMs." -icon: robot ---- - -Retrieval-Augmented Generation (RAG) is a powerful technique that combines the strengths of retrieval-based and generation-based models. In this guide, we'll explore how to set up RAG pipelines with ZenML, including data ingestion, index store management, and tracking RAG-associated artifacts. - -LLMs are a powerful tool, as they can generate human-like responses to a wide variety of prompts. However, they can also be prone to generating incorrect or inappropriate responses, especially when the input prompt is ambiguous or misleading. They are also (currently) limited in the amount of text they can understand and/or generate. While there are some LLMs [like Google's Gemini 1.5 Pro](https://developers.googleblog.com/2024/02/gemini-15-available-for-private-preview-in-google-ai-studio.html) that can consistently handle 1 million tokens (small units of text), the vast majority (particularly the open-source ones currently available) handle far less. - -The first part of this guide to RAG pipelines with ZenML is about understanding the basic components and how they work together. We'll cover the following topics: - -* why RAG exists and what problem it solves -* how to ingest and preprocess data that we'll use in our RAG pipeline -* how to leverage embeddings to represent our data; this will be the basis for our retrieval mechanism -* how to store these embeddings in a vector database -* how to track RAG-associated artifacts with ZenML - -At the end, we'll bring it all together and show all the components working together to perform basic RAG inference. diff --git a/develop/user-guide/guides/llmops-guide/rag-with-zenml/basic-rag-inference-pipeline.mdx b/develop/user-guide/guides/llmops-guide/rag-with-zenml/basic-rag-inference-pipeline.mdx deleted file mode 100644 index f4289ce..0000000 --- a/develop/user-guide/guides/llmops-guide/rag-with-zenml/basic-rag-inference-pipeline.mdx +++ /dev/null @@ -1,127 +0,0 @@ ---- -title: "Basic RAG inference pipeline" -description: "Use your RAG components to generate responses to prompts." -icon: diagram-project ---- - -Now that we have our index store, we can use it to make queries based on the documents in the index store. We use some utility functions to make this happen but no external libraries are needed beyond an interface to the index store as well as the LLM itself. 
- - - - -If you've been following along with the guide, you should have some documents ingested already and you can pass a query in as a flag to the Python command used to run the pipeline: - -```Bash -python run.py --rag-query "how do I use a custom materializer inside my own zenml -steps? i.e. how do I set it? inside the @step decorator?" --model=gpt4 -``` - - - - -This inference query itself is not a ZenML pipeline, but rather a function call which uses the outputs and components of our pipeline to generate the response. For a more complex inference setup, there might be even more going on here, but for the purposes of this initial guide we will keep it simple. - -Bringing everything together, the code for the inference pipeline is as follows: - -```js -def process_input_with_retrieval( - input: str, model: str = OPENAI_MODEL, n_items_retrieved: int = 5 -) -> str: - delimiter = "```" - - # Step 1: Get documents related to the user input from database - related_docs = get_topn_similar_docs( - get_embeddings(input), get_db_conn(), n=n_items_retrieved - ) - - # Step 2: Get completion from OpenAI API - # Set system message to help set appropriate tone and context for model - system_message = f""" - You are a friendly chatbot. \ - You can answer questions about ZenML, its features and its use cases. \ - You respond in a concise, technically credible tone. \ - You ONLY use the context from the ZenML documentation to provide relevant - answers. \ - You do not make up answers or provide opinions that you don't have - information to support. \ - If you are unsure or don't know, just say so. \ - """ - - # Prepare messages to pass to model - # We use a delimiter to help the model understand the where the user_input - # starts and ends - - messages = [ - {"role": "system", "content": system_message}, - {"role": "user", "content": f"{delimiter}{input}{delimiter}"}, - { - "role": "assistant", - "content": f"Relevant ZenML documentation: \n" - + "\n".join(doc[0] for doc in related_docs), - }, - ] - logger.debug("CONTEXT USED\n\n", messages[2]["content"], "\n\n") - return get_completion_from_messages(messages, model=model) -``` - -For the `get_topn_similar_docs` function, we use the embeddings generated from the documents in the index store to find the most similar documents to the query: - -```py - -def get_topn_similar_docs( - query_embedding: List[float], - conn: psycopg2.extensions.connection, - n: int = 5, - include_metadata: bool = False, - only_urls: bool = False, -) -> List[Tuple]: - embedding_array = np.array(query_embedding) - register_vector(conn) - cur = conn.cursor() - - if include_metadata: - cur.execute( - f"SELECT content, url FROM embeddings ORDER BY embedding <=> %s LIMIT {n}", - (embedding_array,), - ) - elif only_urls: - cur.execute( - f"SELECT url FROM embeddings ORDER BY embedding <=> %s LIMIT {n}", - (embedding_array,), - ) - else: - cur.execute( - f"SELECT content FROM embeddings ORDER BY embedding <=> %s LIMIT {n}", - (embedding_array,), - ) - - return cur.fetchall() -``` - -Luckily we are able to get these similar documents using a function in [pgvector](https://github.com/pgvector/pgvector), a plugin package for PostgreSQL: `ORDER BY embedding <=> %s` orders the documents by their similarity to the query embedding. This is a very efficient way to get the most relevant documents to the query and is a great example of how we can leverage the power of the database to do the heavy lifting for us. 
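The `get_embeddings` helper used in `process_input_with_retrieval` above isn't shown here. A minimal sketch, assuming the query is embedded with the same sentence-transformers model that was used when the documents were indexed, could look like this:

```py
from typing import List

from sentence_transformers import SentenceTransformer

# Assumption: the same model that generated the document embeddings earlier
# in this guide; substitute whatever model your index was built with.
EMBEDDINGS_MODEL = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")


def get_embeddings(text: str) -> List[float]:
    """Embed the query with the same model used for the document chunks."""
    return EMBEDDINGS_MODEL.encode(text).tolist()
```

Whatever model you choose, the important point is that the query and the documents must be embedded by the same model; otherwise the cosine distances computed in the SQL query above are meaningless.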
- -For the `get_completion_from_messages` function, we use [litellm](https://github.com/BerriAI/litellm) as a universal interface that allows us to use lots of different LLMs. As you can see above, the model is able to synthesize the documents it has been given and provide a response to the query. - - -```py -def get_completion_from_messages( - messages, model=OPENAI_MODEL, temperature=0.4, max_tokens=1000 -): - """Generates a completion response from the given messages using the specified model.""" - model = MODEL_NAME_MAP.get(model, model) - completion_response = litellm.completion( - model=model, - messages=messages, - temperature=temperature, - max_tokens=max_tokens, - ) - return completion_response.choices[0].message.content -``` - -We're using `litellm` because it makes sense not to have to implement separate functions for each LLM we might want to use. The pace of development in the field is such that you will want to experiment with new LLMs as they come out, and `litellm` gives you the flexibility to do that without having to rewrite your code. - -We've now completed a basic RAG inference pipeline that uses the embeddings generated by the pipeline to retrieve the most relevant chunks of text based on a given query. We can inspect the various components of the pipeline to see how they work together to provide a response to the query. This gives us a solid foundation to move onto more complex RAG pipelines and to look into how we might improve this. The next section will cover how to improve retrieval by finetuning the embeddings generated by the pipeline. This will boost our performance in situations where we have a large volume of documents and also when the documents are potentially very different from the training data that was used for the embeddings. - -## Code Example - -To explore the full code, visit the [Complete Guide](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide) repository and for this section, particularly [the llm\_utils.py file](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/utils/llm%5Futils.py). diff --git a/develop/user-guide/guides/llmops-guide/rag-with-zenml/data-ingestion.mdx b/develop/user-guide/guides/llmops-guide/rag-with-zenml/data-ingestion.mdx deleted file mode 100644 index d39d547..0000000 --- a/develop/user-guide/guides/llmops-guide/rag-with-zenml/data-ingestion.mdx +++ /dev/null @@ -1,116 +0,0 @@ ---- -title: "Data ingestion and preprocessing" -description: "Understand how to ingest and preprocess data for RAG pipelines with ZenML." -icon: database ---- - -The first step in setting up a RAG pipeline is to ingest the data that will be used to train and evaluate the retriever and generator models. This data can include a large corpus of documents, as well as any relevant metadata or annotations that can be used to train the retriever and generator. - - - - -In the interests of keeping things simple, we'll implement the bulk of what we need ourselves. However, it's worth noting that there are a number of tools and frameworks that can help you manage the data ingestion process, including downloading, preprocessing, and indexing large corpora of documents. ZenML integrates with a number of these tools and frameworks, making it easy to set up and manage RAG pipelines. - -You can view all the code referenced in this guide in the associated project repository. 
Please visit [the llm-complete-guide project](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide) inside the ZenML projects repository if you want to dive deeper. - - -You can add a ZenML step that scrapes a series of URLs and outputs the URLs quite easily. Here we assemble a step that scrapes URLs related to ZenML from its documentation. We leverage some simple helper utilities that we have created for this purpose: - - -```py -from typing import List -from typing_extensions import Annotated -from zenml import log_artifact_metadata, step -from steps.url_scraping_utils import get_all_pages - -@step -def url_scraper( - docs_url: str = "https://docs.zenml.io", - repo_url: str = "https://github.com/zenml-io/zenml", - website_url: str = "https://zenml.io", -) -> Annotated[List[str], "urls"]: - """Generates a list of relevant URLs to scrape.""" - docs_urls = get_all_pages(docs_url) - log_artifact_metadata( - metadata={ - "count": len(docs_urls), - }, - ) - return docs_urls -``` - -The `get_all_pages` function simply crawls our documentation website and retrieves a unique set of URLs. We've limited it to only scrape the documentation relating to the most recent releases so that we're not mixing old syntax and information with the new. This is a simple way to ensure that we're only ingesting the most relevant and up-to-date information into our pipeline. - -We also log the count of those URLs as metadata for the step output. This will be visible in the dashboard for extra visibility around the data that's being ingested. Of course, you can also add more complex logic to this step, such as filtering out certain URLs or adding more metadata. - - - - -Once we have our list of URLs, we use [the unstructured library](https://github.com/Unstructured-IO/unstructured) to load and parse the pages. This will allow us to use the text without having to worry about the details of the HTML structure and/or markup. This specifically helps us keep the text content as small as possible since we are operating in a constrained environment with LLMs. - -```py -from typing import List -from unstructured.partition.html import partition_html -from zenml import step - -@step -def web_url_loader(urls: List[str]) -> List[str]: - """Loads documents from a list of URLs.""" - document_texts = [] - for url in urls: - elements = partition_html(url=url) - text = "\n\n".join([str(el) for el in elements]) - document_texts.append(text) - return document_texts -``` - -The previously-mentioned frameworks offer many more options when it comes to data ingestion, including the ability to load documents from a variety of sources, preprocess the text, and extract relevant features. For our purposes, though, we don't need anything too fancy. It also makes our pipeline easier to debug since we can see exactly what's being loaded and how it's being processed. You don't get that same level of visibility with more complex frameworks. - -## Preprocessing the data - -Once we have loaded the documents, we can preprocess them into a form that's useful for a RAG pipeline. There are a lot of options here, depending on how complex you want to get, but to start with you can think of the 'chunk size' as one of the key parameters to think about. - -Our text is currently in the form of various long strings, with each one representing a single web page. These are going to be too long to pass into our LLM, especially if we care about the speed at which we get our answers back. 
So the strategy here is to split our text into smaller chunks that can be processed more efficiently. There's a sweet spot between having tiny chunks, which will make it harder for our search / retrieval step to find relevant information to pass into the LLM, and having large chunks, which will make it harder for the LLM to process the text. - -```py -import logging -from typing import Annotated, List -from utils.llm_utils import split_documents -from zenml import ArtifactConfig, log_artifact_metadata, step - -logging.basicConfig(level=logging.INFO) -logger = logging.getLogger(__name__) - -@step(enable_cache=False) -def preprocess_documents( - documents: List[str], -) -> Annotated[List[str], ArtifactConfig(name="split_chunks")]: - """Preprocesses a list of documents by splitting them into chunks.""" - try: - log_artifact_metadata( - artifact_name="split_chunks", - metadata={ - "chunk_size": 500, - "chunk_overlap": 50 - }, - ) - return split_documents( - documents, chunk_size=500, chunk_overlap=50 - ) - except Exception as e: - logger.error(f"Error in preprocess_documents: {e}") - raise -``` - -It's really important to know your data to have a good intuition about what kind of chunk size might make sense. If your data is structured in such a way where you need large paragraphs to capture a particular concept, then you might want a larger chunk size. If your data is more conversational or question-and-answer based, then you might want a smaller chunk size. - -For our purposes, given that we're working with web pages that are written as documentation for a software library, we're going to use a chunk size of 500 and we'll make sure that the chunks overlap by 50 characters. This means that we'll have a lot of overlap between our chunks, which can be useful for ensuring that we don't miss any important information when we're splitting up our text. - -Again, depending on your data and use case, there is more you might want to do with your data. You might want to clean the text, remove code snippets or make sure that code snippets were not split across chunks, or even extract metadata from the text. This is a good starting point, but you can always add more complexity as needed. - -Next up, generating embeddings so that we can use them to retrieve relevant documents... - -### Code Example - -To explore the full code, visit the [Complete Guide](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide) repository and particularly [the code for the steps](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide/steps/) in this section. Note, too, that a lot of the logic is encapsulated in utility functions inside [url\_scraping\_utils.py](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide/steps/url%5Fscraping%5Futils.py). - diff --git a/develop/user-guide/guides/llmops-guide/rag-with-zenml/embeddings-generation.mdx b/develop/user-guide/guides/llmops-guide/rag-with-zenml/embeddings-generation.mdx deleted file mode 100644 index b70dcb4..0000000 --- a/develop/user-guide/guides/llmops-guide/rag-with-zenml/embeddings-generation.mdx +++ /dev/null @@ -1,160 +0,0 @@ ---- -title: "Embeddings generation" -description: "Generate embeddings to improve retrieval performance." -icon: vector-square ---- - -In this section, we'll explore how to generate embeddings for your data to improve retrieval performance in your RAG pipeline. 
Embeddings are a crucial part of the retrieval mechanism in RAG, as they represent the data in a high-dimensional space where similar items are closer together. By generating embeddings for your data, you can enhance the retrieval capabilities of your RAG pipeline and provide more accurate and relevant responses to user queries. - - - - - - - Embeddings are vector representations of data that capture the semantic meaning and context of the data in a high-dimensional space. They are generated using machine learning models, such as word embeddings or sentence embeddings, that learn to encode the data in a way that preserves its underlying structure and relationships. Embeddings are commonly used in natural language processing (NLP) tasks, such as text classification, sentiment analysis, and information retrieval, to represent textual data in a format that is suitable for computational processing. - - -The whole purpose of the embeddings is to allow us to quickly find the small chunks that are most relevant to our input query at inference time. An even simpler way of doing this would be to just to search for some keywords in the query and hope that they're also represented in the chunks. However, this approach is not very robust and may not work well for more complex queries or longer documents. By using embeddings, we can capture the semantic meaning and context of the data and retrieve the most relevant chunks based on their similarity to the query. - -We're using the [sentence-transformers](https://www.sbert.net/) library to generate embeddings for our data. This library provides pre-trained models for generating sentence embeddings that capture the semantic meaning of the text. It's an open-source library that is easy to use and provides high-quality embeddings for a wide range of NLP tasks. - - -```py -from typing import Annotated, List -import numpy as np -from sentence_transformers import SentenceTransformer -from structures import Document -from zenml import ArtifactConfig, log_artifact_metadata, step - -@step -def generate_embeddings( - split_documents: List[Document], -) -> Annotated[ - List[Document], ArtifactConfig(name="documents_with_embeddings") -]: - try: - model = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2") - - log_artifact_metadata( - artifact_name="embeddings", - metadata={ - "embedding_type": "sentence-transformers/all-MiniLM-L12-v2", - "embedding_dimensionality": 384, - }, - ) - - document_texts = [doc.page_content for doc in split_documents] - embeddings = model.encode(document_texts) - - for doc, embedding in zip(split_documents, embeddings): - doc.embedding = embedding - - return split_documents - except Exception as e: - logger.error(f"Error in generate_embeddings: {e}") - raise -``` - -We update the `Document` Pydantic model to include an `embedding` attribute that stores the embedding generated for each document. This allows us to associate the embeddings with the corresponding documents and use them for retrieval purposes in the RAG pipeline. - -There are smaller embeddings models if we cared a lot about speed, and larger ones (with more dimensions) if we wanted to boost our ability to retrieve more relevant chunks. [The model we're using here](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2) is on the smaller side, but it should work well for our use case. The embeddings generated by this model have a dimensionality of 384, which means that each embedding is represented as a 384-dimensional vector in the high-dimensional space. 
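As a quick sanity check of that dimensionality (a throwaway snippet rather than part of the pipeline), you can encode a single sentence and inspect the shape of the result:

```py
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")
vector = model.encode("How do I register a ZenML stack?")
print(vector.shape)  # (384,), matching the dimensionality we log as metadata
```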
- -We can use dimensionality reduction functionality in [umap](https://umap-learn.readthedocs.io/) and [scikit-learn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn-manifold-tsne) to represent the 384 dimensions of our embeddings in two-dimensional space. This allows us to visualize the embeddings and see how similar chunks are clustered together based on their semantic meaning and context. We can also use this visualization to identify patterns and relationships in the data that can help us improve the retrieval performance of our RAG pipeline. It's worth trying both UMAP and t-SNE to see which one works best for our use case since they both have somewhat different representations of the data and reduction algorithms, as you'll see. - -```py -from matplotlib.colors import ListedColormap -import matplotlib.pyplot as plt -import numpy as np -from sklearn.manifold import TSNE -import umap -from zenml.client import Client - -artifact = Client().get_artifact_version('EMBEDDINGS_ARTIFACT_UUID_GOES_HERE') -embeddings = artifact.load() - - -embeddings = np.array([doc.embedding for doc in documents]) -parent_sections = [doc.parent_section for doc in documents] - -# Get unique parent sections -unique_parent_sections = list(set(parent_sections)) - -# Tol color palette -tol_colors = [ - "#4477AA", - "#EE6677", - "#228833", - "#CCBB44", - "#66CCEE", - "#AA3377", - "#BBBBBB", -] - -# Create a colormap with Tol colors -tol_colormap = ListedColormap(tol_colors) - -# Assign colors to each unique parent section -section_colors = tol_colors[: len(unique_parent_sections)] - -# Create a dictionary mapping parent sections to colors -section_color_dict = dict(zip(unique_parent_sections, section_colors)) - -# Dimensionality reduction using t-SNE -def tsne_visualization(embeddings, parent_sections): - tsne = TSNE(n_components=2, random_state=42) - embeddings_2d = tsne.fit_transform(embeddings) - - plt.figure(figsize=(8, 8)) - for section in unique_parent_sections: - if section in section_color_dict: - mask = [section == ps for ps in parent_sections] - plt.scatter( - embeddings_2d[mask, 0], - embeddings_2d[mask, 1], - c=[section_color_dict[section]], - label=section, - ) - - plt.title("t-SNE Visualization") - plt.legend() - plt.show() - - -# Dimensionality reduction using UMAP -def umap_visualization(embeddings, parent_sections): - umap_2d = umap.UMAP(n_components=2, random_state=42) - embeddings_2d = umap_2d.fit_transform(embeddings) - - plt.figure(figsize=(8, 8)) - for section in unique_parent_sections: - if section in section_color_dict: - mask = [section == ps for ps in parent_sections] - plt.scatter( - embeddings_2d[mask, 0], - embeddings_2d[mask, 1], - c=[section_color_dict[section]], - label=section, - ) - - plt.title("UMAP Visualization") - plt.legend() - plt.show() -``` - - - - - - - - -In this stage, we have utilized the 'parent directory', which we had previously stored in the vector store as an additional attribute, as a means to color the values. This approach allows us to gain some insight into the semantic space inherent in our data. It demonstrates that you can visualize the embeddings and observe how similar chunks are grouped together based on their semantic meaning and context. - -So this step iterates through all the chunks and generates embeddings representing each piece of text. These embeddings are then stored as an artifact in the ZenML artifact store as a NumPy array. 
We separate this generation from the point where we upload those embeddings to the vector database to keep the pipeline modular and flexible; in the future we might want to use a different vector database so we can just swap out the upload step without having to re-generate the embeddings. - -In the next section, we'll explore how to store these embeddings in a vector database to enable fast and efficient retrieval of relevant chunks at inference time. - -## Code Example - -To explore the full code, visit the [Complete Guide](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide) repository. The embeddings generation step can be found [here](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide/steps/populate%5Findex.py). - diff --git a/develop/user-guide/guides/llmops-guide/rag-with-zenml/rag-85-loc.mdx b/develop/user-guide/guides/llmops-guide/rag-with-zenml/rag-85-loc.mdx deleted file mode 100644 index 9471c2b..0000000 --- a/develop/user-guide/guides/llmops-guide/rag-with-zenml/rag-85-loc.mdx +++ /dev/null @@ -1,118 +0,0 @@ ---- -title: "RAG in 85 lines of code" -description: "Learn how to implement a RAG pipeline in just 85 lines of code." -icon: code ---- - -There's a lot of theory and context to think about when it comes to RAG, but let's start with a quick implementation in code to motivate what follows. The following 85 lines do the following: - -* load some data (a fictional dataset about 'ZenML World') as our corpus -* process that text (split it into chunks and 'tokenize' it (i.e. split into words)) -* take a query as input and find the most relevant chunks of text from our corpus data -* use OpenAI's GPT-3.5 model to answer the question based on the relevant chunks - -```py -import os -import re -import string - -from openai import OpenAI - - -def preprocess_text(text): - text = text.lower() - text = text.translate(str.maketrans("", "", string.punctuation)) - text = re.sub(r"\s+", " ", text).strip() - return text - - -def tokenize(text): - return preprocess_text(text).split() - - -def retrieve_relevant_chunks(query, corpus, top_n=2): - query_tokens = set(tokenize(query)) - similarities = [] - for chunk in corpus: - chunk_tokens = set(tokenize(chunk)) - similarity = len(query_tokens.intersection(chunk_tokens)) / len( - query_tokens.union(chunk_tokens) - ) - similarities.append((chunk, similarity)) - similarities.sort(key=lambda x: x[1], reverse=True) - return [chunk for chunk, _ in similarities[:top_n]] - - -def answer_question(query, corpus, top_n=2): - relevant_chunks = retrieve_relevant_chunks(query, corpus, top_n) - if not relevant_chunks: - return "I don't have enough information to answer the question." 
- - context = "\n".join(relevant_chunks) - client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) - chat_completion = client.chat.completions.create( - messages=[ - { - "role": "system", - "content": f"Based on the provided context, answer the following question: {query}\n\nContext:\n{context}", - }, - { - "role": "user", - "content": query, - }, - ], - model="gpt-3.5-turbo", - ) - - return chat_completion.choices[0].message.content.strip() - - -# Sci-fi themed corpus about "ZenML World" -corpus = [ - "The luminescent forests of ZenML World are inhabited by glowing Zenbots that emit a soft, pulsating light as they roam the enchanted landscape.", - "In the neon skies of ZenML World, Cosmic Butterflies flutter gracefully, their iridescent wings leaving trails of stardust in their wake.", - "Telepathic Treants, ancient sentient trees, communicate through the quantum neural network that spans the entire surface of ZenML World, sharing wisdom and knowledge.", - "Deep within the melodic caverns of ZenML World, Fractal Fungi emit pulsating tones that resonate through the crystalline structures, creating a symphony of otherworldly sounds.", - "Near the ethereal waterfalls of ZenML World, Holographic Hummingbirds hover effortlessly, their translucent wings refracting the prismatic light into mesmerizing patterns.", - "Gravitational Geckos, masters of anti-gravity, traverse the inverted cliffs of ZenML World, defying the laws of physics with their extraordinary abilities.", - "Plasma Phoenixes, majestic creatures of pure energy, soar above the chromatic canyons of ZenML World, their fiery trails painting the sky in a dazzling display of colors.", - "Along the prismatic shores of ZenML World, Crystalline Crabs scuttle and burrow, their transparent exoskeletons refracting the light into a kaleidoscope of hues.", -] - -corpus = [preprocess_text(sentence) for sentence in corpus] - -question1 = "What are Plasma Phoenixes?" -answer1 = answer_question(question1, corpus) -print(f"Question: {question1}") -print(f"Answer: {answer1}") - -question2 = ( - "What kinds of creatures live on the prismatic shores of ZenML World?" -) -answer2 = answer_question(question2, corpus) -print(f"Question: {question2}") -print(f"Answer: {answer2}") - -irrelevant_question_3 = "What is the capital of Panglossia?" -answer3 = answer_question(irrelevant_question_3, corpus) -print(f"Question: {irrelevant_question_3}") -print(f"Answer: {answer3}") -``` - -**This outputs the following:** - -```Bash -Question: What are Plasma Phoenixes? -Answer: Plasma Phoenixes are majestic creatures made of pure energy that soar above the chromatic canyons of Zenml World. They leave fiery trails behind them, painting the sky with dazzling displays of colors. -Question: What kinds of creatures live on the prismatic shores of ZenML World? -Answer: On the prismatic shores of ZenML World, you can find crystalline crabs scuttling and burrowing with their transparent exoskeletons, which refract light into a kaleidoscope of hues. -Question: What is the capital of Panglossia? -Answer: The capital of Panglossia is not mentioned in the provided context. -``` - -The implementation above is by no means sophisticated or performant, but it's simple enough that you can see all the moving parts. Our tokenization process consists of splitting the text into individual words. - -The way we check for similarity between the question / query and the chunks of text is extremely naive and inefficient. 
The similarity between the query and the current chunk is calculated using the [Jaccard similarity coefficient](https://www.statology.org/jaccard-similarity/). This coefficient measures the similarity between two sets and is defined as the size of the intersection divided by the size of the union of the two sets. So we count the number of words that are common between the query and the chunk and divide it by the total number of unique words in both the query and the chunk. There are much better ways of measuring the similarity between two pieces of text, such as using embeddings or other more sophisticated techniques, but this example is kept simple for illustrative purposes. - -The rest of this guide will showcase a more performant and scalable way of performing the same task using ZenML. If you ever are unsure why we're doing something, feel free to return to this example for the high-level overview. - diff --git a/develop/user-guide/guides/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.mdx b/develop/user-guide/guides/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.mdx deleted file mode 100644 index b6b6bff..0000000 --- a/develop/user-guide/guides/llmops-guide/rag-with-zenml/storing-embeddings-in-a-vector-database.mdx +++ /dev/null @@ -1,130 +0,0 @@ ---- -title: "Storing embeddings in a vector database" -description: "Store embeddings in a vector database for efficient retrieval." -icon: server ---- - -The process of generating the embeddings doesn't take too long, especially if the machine on which the step is running has a GPU, but it's still not something we want to do every time we need to retrieve a document. Instead, we can store the embeddings in a vector database, which allows us to quickly retrieve the most relevant chunks based on their similarity to the query. - - - - -For the purposes of this guide, we'll use PostgreSQL as our vector database. This is a popular choice for storing embeddings, as it provides a scalable and efficient way to store and retrieve high-dimensional vectors. However, you can use any vector database that supports high-dimensional vectors. If you want to explore a list of possible options, [this is a good website](https://superlinked.com/vector-db-comparison/) to compare different options. - - - For more information on how to set up a PostgreSQL database to follow along with this guide, please [see the instructions in the repository](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide) which show how to set up a PostgreSQL database using Supabase. - - -Since PostgreSQL is a well-known and battle-tested database, we can use known and minimal packages to connect and to interact with it. We can use the [psycopg2](https://www.psycopg.org/docs/) package to connect and then raw SQL statements to interact with the database. 
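The `get_db_conn` helper used in the step below is essentially a thin wrapper around `psycopg2.connect`. A minimal sketch, with illustrative environment variable names for a Supabase-hosted instance, might look like this:

```py
import os

import psycopg2


def get_db_conn() -> psycopg2.extensions.connection:
    """Connect to the PostgreSQL instance that stores our embeddings.

    The environment variable names here are illustrative; use whatever
    matches your own database setup.
    """
    return psycopg2.connect(
        host=os.environ["ZENML_POSTGRES_HOST"],
        port=int(os.environ.get("ZENML_POSTGRES_PORT", "5432")),
        user=os.environ["ZENML_POSTGRES_USER"],
        password=os.environ["ZENML_POSTGRES_PASSWORD"],
        dbname=os.environ.get("ZENML_POSTGRES_DB", "postgres"),
    )
```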
- -The code for the step is fairly simple: - -```py -from zenml import step - -@step -def index_generator( - documents: List[Document], -) -> None: - try: - conn = get_db_conn() - with conn.cursor() as cur: - # Install pgvector if not already installed - cur.execute("CREATE EXTENSION IF NOT EXISTS vector") - conn.commit() - - # Create the embeddings table if it doesn't exist - table_create_command = f""" - CREATE TABLE IF NOT EXISTS embeddings ( - id SERIAL PRIMARY KEY, - content TEXT, - token_count INTEGER, - embedding VECTOR({EMBEDDING_DIMENSIONALITY}), - filename TEXT, - parent_section TEXT, - url TEXT - ); - """ - cur.execute(table_create_command) - conn.commit() - - register_vector(conn) - - # Insert data only if it doesn't already exist - for doc in documents: - content = doc.page_content - token_count = doc.token_count - embedding = doc.embedding.tolist() - filename = doc.filename - parent_section = doc.parent_section - url = doc.url - - cur.execute( - "SELECT COUNT(*) FROM embeddings WHERE content = %s", - (content,), - ) - count = cur.fetchone()[0] - if count == 0: - cur.execute( - "INSERT INTO embeddings (content, token_count, embedding, filename, parent_section, url) VALUES (%s, %s, %s, %s, %s, %s)", - ( - content, - token_count, - embedding, - filename, - parent_section, - url, - ), - ) - conn.commit() - - cur.execute("SELECT COUNT(*) as cnt FROM embeddings;") - num_records = cur.fetchone()[0] - logger.info(f"Number of vector records in table: {num_records}") - - # calculate the index parameters according to best practices - num_lists = max(num_records / 1000, 10) - if num_records > 1000000: - num_lists = math.sqrt(num_records) - - # use the cosine distance measure, which is what we'll later use for querying - cur.execute( - f"CREATE INDEX IF NOT EXISTS embeddings_idx ON embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = {num_lists});" - ) - conn.commit() - - except Exception as e: - logger.error(f"Error in index_generator: {e}") - raise - finally: - if conn: - conn.close() -``` - -We use some utility functions, but what we do here is: - -* connect to the database -* create the `vector` extension if it doesn't already exist (this is to enable the vector data type in PostgreSQL) -* create the `embeddings` table if it doesn't exist -* insert the embeddings and documents into the table -* calculate the index parameters according to best practices -* create an index on the embeddings - -Note that we're inserting the documents into the embeddings table as well as the embeddings themselves. This is so that we can retrieve the documents based on their embeddings later on. It also helps with debugging from within the Supabase interface or wherever else we're examining the contents of the database. - - - -Deciding when to update your embeddings is a separate discussion and depends on the specific use case. If your data is frequently changing, and the changes are significant, you might want to fully reset the embeddings with each update. In other cases, you might just want to add new documents and embeddings into the database because the changes are minor or infrequent. In the code above, we choose to only add new embeddings if they don't already exist in the database. - - - Depending on the size of your dataset and the number of embeddings you're storing, you might find that running this step on a CPU is too slow. In that case, you should ensure that this step runs on a GPU-enabled machine to speed up the process. 
You can do this with ZenML by using a step operator that runs on a GPU-enabled machine. See [the docs here](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/component-guide/step-operators/README.md) for more on how to set this up. - - -We also generate an index for the embeddings using the `ivfflat` method with the `vector_cosine_ops` operator. This is a common method for indexing high-dimensional vectors in PostgreSQL and is well-suited for similarity search using cosine distance. The number of lists is calculated based on the number of records in the table, with a minimum of 10 lists and a maximum of the square root of the number of records. This is a good starting point for tuning the index parameters, but you might want to experiment with different values to see how they affect the performance of your RAG pipeline. - -Now that we have our embeddings stored in a vector database, we can move on to the next step in the pipeline, which is to retrieve the most relevant documents based on a given query. This is where the real magic of the RAG pipeline comes into play, as we can use the embeddings to quickly retrieve the most relevant chunks of text based on their similarity to the query. This allows us to build a powerful and efficient question-answering system that can provide accurate and relevant responses to user queries in real-time. - -## Code Example - -To explore the full code, visit the [Complete Guide](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide) repository. The logic for storing the embeddings in PostgreSQL can be found [here](https://github.com/zenml-io/zenml-projects/tree/main/llm-complete-guide/steps/populate%5Findex.py). - diff --git a/develop/user-guide/guides/llmops-guide/rag-with-zenml/understanding-rag.mdx b/develop/user-guide/guides/llmops-guide/rag-with-zenml/understanding-rag.mdx deleted file mode 100644 index 4cb9d6e..0000000 --- a/develop/user-guide/guides/llmops-guide/rag-with-zenml/understanding-rag.mdx +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: "Understanding Retrieval-Augmented Generation (RAG)" -description: "Understand the Retrieval-Augmented Generation (RAG) technique and its benefits." -icon: book ---- - -LLMs are powerful but not without their limitations. They are prone to generating incorrect responses, especially when it's unclear what the input prompt is asking for. They are also limited in the amount of text they can understand and generate. While some LLMs can handle more than 1 million tokens of input, most open-source models can handle far less. Your use case also might not require all the complexity and cost associated with running a large LLM. - -RAG, [originally proposed in 2020](https://arxiv.org/abs/2005.11401v4) by researchers at Facebook, is a technique that supplements the inbuilt abilities of foundation models like LLMs with a retrieval mechanism. This mechanism retrieves relevant documents from a large corpus and uses them to generate a response. This approach combines the strengths of retrieval-based and generation-based models, allowing you to leverage the power of LLMs while addressing their limitations. - -## What exactly happens in a RAG pipeline? - - - -In a RAG pipeline, we use a retriever to find relevant documents from a large corpus and then uses a generator to produce a response based on the retrieved documents. This approach is particularly useful for tasks that require contextual understanding and long-form generation, such as question answering, summarization, and dialogue generation. 
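Stripped of all framework details, the inference-time flow can be sketched in a few lines of Python-like pseudocode (the `index` and `llm` objects and their methods are placeholders, not ZenML or provider APIs):

```py
def rag_answer(query: str, index, llm, top_k: int = 5) -> str:
    # 1. Retrieve: find the documents most similar to the query
    relevant_docs = index.search(query, top_k=top_k)

    # 2. Augment: pack the retrieved documents into the prompt as context
    context = "\n\n".join(doc.text for doc in relevant_docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

    # 3. Generate: let the LLM produce an answer grounded in that context
    return llm.generate(prompt)
```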
- -RAG helps with the context limitations mentioned above by providing a way to retrieve relevant documents that can be used to generate a response. This retrieval step can help ensure that the generated response is grounded in relevant information, reducing the likelihood of generating incorrect or inappropriate responses. It also helps with the token limitations by allowing the generator to focus on a smaller set of relevant documents, rather than having to process an entire large corpus. - -Given the costs associated with running LLMs, RAG can also be more cost-effective than using a pure generation-based approach, as it allows you to focus the generator's resources on a smaller set of relevant documents. This can be particularly important when working with large corpora or when deploying models to resource-constrained environments. - -## When is RAG a good choice? - - - - - -RAG is a good choice when you need to generate long-form responses that require contextual understanding and when you have access to a large corpus of relevant documents. It can be particularly useful for tasks like question answering, summarization, and dialogue generation, where the generated response needs to be grounded in relevant information. - -It's often the first thing that you'll want to try when dipping your toes into the world of LLMs. This is because it provides a sensible way to get a feel for how the process works, and it doesn't require as much data or computational resources as other approaches. It's also a good choice when you need to balance the benefits of LLMs with the limitations of the current generation of models. - -## How does RAG fit into the ZenML ecosystem? - -In ZenML, you can set up RAG pipelines that combine the strengths of retrieval-based and generation-based models. This allows you to leverage the power of LLMs while addressing their limitations. ZenML provides tools for data ingestion, index store management, and tracking RAG-associated artifacts, making it easy to set up and manage RAG pipelines. - -ZenML also provides a way to scale beyond the limitations of simple RAG pipelines, as we shall see in later sections of this guide. While you might start off with something simple, at a later point you might want to transition to a more complex setup that involves finetuning embeddings, reranking retrieved documents, or even finetuning the LLM itself. ZenML provides tools for all of these scenarios, making it easy to scale your RAG pipelines as needed. - -ZenML allows you to track all the artifacts associated with your RAG pipeline, from hyperparameters and model weights to metadata and performance metrics, as well as all the RAG or LLM-specific artifacts like chains, agents, tokenizers and vector stores. These can all be tracked in the [Model Control Plane](/develop/usage/development-monitoring/use-the-model-control-plane) and thus visualized in the [ZenML Pro](https://zenml.io/pro) dashboard. - -By bringing all of the above into a simple ZenML pipeline we achieve a clearly delineated set of steps that can be run and rerun to set up our basic RAG pipeline. This is a great starting point for building out more complex RAG pipelines, and it's a great way to get started with LLMs in a sensible way. - -A summary of some of the advantages that ZenML brings to the table here includes: - -* **Reproducibility**: You can rerun the pipeline to update the index store with new documents or to change the parameters of the chunking process and so on. 
Previous versions of the artifacts will be preserved, and you can compare the performance of different versions of the pipeline. -* **Scalability**: You can easily scale the pipeline to handle larger corpora of documents by deploying it on a cloud provider and using a more scalable vector store. -* **Tracking artifacts and associating them with metadata**: You can track the artifacts generated by the pipeline and associate them with metadata that provides additional context and insights into the pipeline. This metadata and these artifacts are then visible in the ZenML dashboard, allowing you to monitor the performance of the pipeline and debug any issues that arise. -* **Maintainability** \- Having your pipeline in a clear, modular format makes it easier to maintain and update. You can easily add new steps, change the parameters of existing steps, and experiment with different configurations to see how they affect the performance of the pipeline. -* **Collaboration** \- You can share the pipeline with your team and collaborate on it together. You can also use the ZenML dashboard to share insights and findings with your team, making it easier to work together on the pipeline. - -In the next section, we'll showcase the components of a basic RAG pipeline. This will give you a taste of how you can leverage the power of LLMs in your MLOps workflows using ZenML. Subsequent sections will cover more advanced topics like reranking retrieved documents, finetuning embeddings, and finetuning the LLM itself. diff --git a/develop/user-guide/guides/llmops-guide/reranking.mdx b/develop/user-guide/guides/llmops-guide/reranking.mdx deleted file mode 100644 index dd08090..0000000 --- a/develop/user-guide/guides/llmops-guide/reranking.mdx +++ /dev/null @@ -1,14 +0,0 @@ ---- -title: "Overview" -description: "Add reranking to your RAG inference for better retrieval performance." -icon: ranking-star ---- - -Rerankers are a crucial component of retrieval systems that use LLMs. They help improve the quality of the retrieved documents by reordering them based on additional features or scores. In this section, we'll explore how to add a reranker to your RAG inference pipeline in ZenML. - -In previous sections, we set up the overall workflow, from data ingestion and preprocessing to embeddings generation and retrieval. We then set up some basic evaluation metrics to assess the performance of our retrieval system. A reranker is a way to squeeze a bit of extra performance out of the system by reordering the retrieved documents based on additional features or scores. - - - - -As you can see, reranking is an optional addition we make to what we've already set up. It's not strictly necessary, but it can help improve the relevance and quality of the retrieved documents, which in turn can lead to better responses from the LLM. Let's dive in! diff --git a/develop/user-guide/guides/llmops-guide/reranking/evaluating-reranking-performance.mdx b/develop/user-guide/guides/llmops-guide/reranking/evaluating-reranking-performance.mdx deleted file mode 100644 index f86c114..0000000 --- a/develop/user-guide/guides/llmops-guide/reranking/evaluating-reranking-performance.mdx +++ /dev/null @@ -1,228 +0,0 @@ ---- -title: "Evaluating reranking performance" -description: "Evaluate the performance of your reranking model." -icon: chart-line ---- - -We've already set up an evaluation pipeline, so adding reranking evaluation is relatively straightforward. 
In this section, we'll explore how to evaluate the performance of your reranking model using ZenML. - -### Evaluating Reranking Performance - -The simplest first step in evaluating the reranking model is to compare the retrieval performance before and after reranking. You can use the same metrics we discussed in the [evaluation section](/develop/user-guide/guides/llmops-guide/evaluation) to assess the performance of the reranking model. - - - - - -If you recall, we have a hand-crafted set of queries and relevant documents that we use to evaluate the performance of our retrieval system. We also have a set that was [generated by LLMs](/develop/user-guide/guides/llmops-guide/evaluation/retrieval#automated-evaluation-using-synthetic-generated-queries). The actual retrieval test is implemented as follows: - -```py -def perform_retrieval_evaluation( - sample_size: int, use_reranking: bool -) -> float: - """Helper function to perform the retrieval evaluation.""" - dataset = load_dataset("zenml/rag_qa_embedding_questions", split="train") - sampled_dataset = dataset.shuffle(seed=42).select(range(sample_size)) - - total_tests = len(sampled_dataset) - failures = 0 - - for item in sampled_dataset: - generated_questions = item["generated_questions"] - question = generated_questions[ - 0 - ] # Assuming only one question per item - url_ending = item["filename"].split("/")[ - -1 - ] # Extract the URL ending from the filename - - # using the method above to query similar documents - # we pass in whether we want to use reranking or not - _, _, urls = query_similar_docs(question, url_ending, use_reranking) - - if all(url_ending not in url for url in urls): - logging.error( - f"Failed for question: {question}. Expected URL ending: {url_ending}. Got: {urls}" - ) - failures += 1 - - logging.info(f"Total tests: {total_tests}. Failures: {failures}") - failure_rate = (failures / total_tests) * 100 - return round(failure_rate, 2) -``` - -This function takes a sample size and a flag indicating whether to use reranking and evaluates the retrieval performance based on the generated questions and relevant documents. It queries similar documents for each question and checks whether the expected URL ending is present in the retrieved URLs. The failure rate is calculated as the percentage of failed tests over the total number of tests. - -This function is then called in two separate evaluation steps: one for the retrieval system without reranking and one for the retrieval system with reranking. - -```py -@step -def retrieval_evaluation_full( - sample_size: int = 100, -) -> Annotated[float, "full_failure_rate_retrieval"]: - """Executes the retrieval evaluation step without reranking.""" - failure_rate = perform_retrieval_evaluation( - sample_size, use_reranking=False - ) - logging.info(f"Retrieval failure rate: {failure_rate}%") - return failure_rate - - -@step -def retrieval_evaluation_full_with_reranking( - sample_size: int = 100, -) -> Annotated[float, "full_failure_rate_retrieval_reranking"]: - """Executes the retrieval evaluation step with reranking.""" - failure_rate = perform_retrieval_evaluation( - sample_size, use_reranking=True - ) - logging.info(f"Retrieval failure rate with reranking: {failure_rate}%") - return failure_rate -``` - -Both of these steps return the failure rate of the respective retrieval systems. If we want, we can look into the logs of those steps (either on the dashboard or in the terminal) to see specific examples that failed. For example: - -```Bash -... 
-Loading default flashrank model for language en -Default Model: ms-marco-MiniLM-L-12-v2 -Loading FlashRankRanker model ms-marco-MiniLM-L-12-v2 -Loading model FlashRank model ms-marco-MiniLM-L-12-v2... -Running pairwise ranking.. -Failed for question: Based on the provided ZenML documentation text, here's a question - that can be asked: "How do I develop a custom alerter as described on the Feast page, - and where can I find the 'How to use it?' guide?". Expected URL ending: feature-stores. - Got: ['https://docs.zenml.io/stacks-and-components/component-guide/alerters/custom', - 'https://docs.zenml.io/v/docs/stacks-and-components/component-guide/alerters/custom', - 'https://docs.zenml.io/v/docsversions/0.66.0/reference/how-do-i', 'https://docs.zenml.io/stacks-and-components/component-guide/alerters', - 'https://docs.zenml.io/stacks-and-components/component-guide/alerters/slack'] - -Loading default flashrank model for language en -Default Model: ms-marco-MiniLM-L-12-v2 -Loading FlashRankRanker model ms-marco-MiniLM-L-12-v2 -Loading model FlashRank model ms-marco-MiniLM-L-12-v2... -Running pairwise ranking.. -Step retrieval_evaluation_full_with_reranking has finished in 4m20s. -``` - -We can see here a specific example of a failure in the reranking evaluation. It's quite a good one because we can see that the question asked was actually an anomaly in the sense that the LLM has generated two questions and included its meta-discussion of the two questions it generated. Obviously this is not a representative question for the dataset, and if we saw a lot of these we might want to take some time to both understand why the LLM is generating these questions and how we can filter them out. - -### Visualizing our reranking performance - -Since ZenML can display visualizations in its dashboard, we can showcase the results of our experiments in a visual format. For example, we can plot the failure rates of the retrieval system with and without reranking to see the impact of reranking on the performance. - -Our documentation explains how to set up your outputs so that they appear as visualizations in the ZenML dashboard. You can find more information [here](/develop/usage/resource-data-management/visualize-artifacts). There are lots of options, but we've chosen to plot our failure rates as a bar chart and export them as a `PIL.Image` object. We also plotted the other evaluation scores so as to get a quick global overview of our performance. 
- - - -```py -# passing the results from all our previous evaluation steps - -@step(enable_cache=False) -def visualize_evaluation_results( - small_retrieval_eval_failure_rate: float, - small_retrieval_eval_failure_rate_reranking: float, - full_retrieval_eval_failure_rate: float, - full_retrieval_eval_failure_rate_reranking: float, - failure_rate_bad_answers: float, - failure_rate_bad_immediate_responses: float, - failure_rate_good_responses: float, - average_toxicity_score: float, - average_faithfulness_score: float, - average_helpfulness_score: float, - average_relevance_score: float, -) -> Optional[Image.Image]: - """Visualizes the evaluation results.""" - step_context = get_step_context() - pipeline_run_name = step_context.pipeline_run.name - - normalized_scores = [ - score / 20 - for score in [ - small_retrieval_eval_failure_rate, - small_retrieval_eval_failure_rate_reranking, - full_retrieval_eval_failure_rate, - full_retrieval_eval_failure_rate_reranking, - failure_rate_bad_answers, - ] - ] - - scores = normalized_scores + [ - failure_rate_bad_immediate_responses, - failure_rate_good_responses, - average_toxicity_score, - average_faithfulness_score, - average_helpfulness_score, - average_relevance_score, - ] - - labels = [ - "Small Retrieval Eval Failure Rate", - "Small Retrieval Eval Failure Rate Reranking", - "Full Retrieval Eval Failure Rate", - "Full Retrieval Eval Failure Rate Reranking", - "Failure Rate Bad Answers", - "Failure Rate Bad Immediate Responses", - "Failure Rate Good Responses", - "Average Toxicity Score", - "Average Faithfulness Score", - "Average Helpfulness Score", - "Average Relevance Score", - ] - - # Create a new figure and axis - fig, ax = plt.subplots(figsize=(10, 6)) - - # Plot the horizontal bar chart - y_pos = np.arange(len(labels)) - ax.barh(y_pos, scores, align="center") - ax.set_yticks(y_pos) - ax.set_yticklabels(labels) - ax.invert_yaxis() # Labels read top-to-bottom - ax.set_xlabel("Score") - ax.set_xlim(0, 5) - ax.set_title(f"Evaluation Metrics for {pipeline_run_name}") - - # Adjust the layout - plt.tight_layout() - - # Save the plot to a BytesIO object - buf = io.BytesIO() - plt.savefig(buf, format="png") - buf.seek(0) - - image = Image.open(buf) - - return image -``` - -For one of my runs of the evaluation pipeline, this looked like the following in the dashboard: - - - - - -You can see that for the full retrieval evaluation we do see an improvement. Our small retrieval test, which as of writing only included five questions, showed a considerable degradation in performance. Since these were specific examples where we knew the answers, this would be something we'd want to look into to see why the reranking model was not performing as expected. - -We can also see that regardless of whether reranking was performed or not, the retrieval scores aren't great. This is a good indication that we might want to look into the retrieval model itself (i.e. our embeddings) to see if we can improve its performance. This is what we'll turn to next as we explore finetuning our embeddings to improve retrieval performance. - -### Try it out! - -To see how this works in practice, you can run the evaluation pipeline using the project code. The reranking is included as part of the pipeline, so providing you've run the main `rag` pipeline, you can run the evaluation pipeline to see how the reranking model is performing. 
- -To run the evaluation pipeline, first clone the project repository: - -```Bash -git clone https://github.com/zenml-io/zenml-projects.git -``` - -Then navigate to the `llm-complete-guide` directory and follow the instructions in the `README.md` file to run the evaluation pipeline. (You'll need to have run the main pipeline first to generate the embeddings.) - -To run the evaluation pipeline, you can use the following command: - -```Bash -python run.py --evaluation -``` - -This will run the evaluation pipeline and output the results to the dashboard. As always, you can inspect the progress, logs, and results in the dashboard! - diff --git a/develop/user-guide/guides/llmops-guide/reranking/implementing-reranking.mdx b/develop/user-guide/guides/llmops-guide/reranking/implementing-reranking.mdx deleted file mode 100644 index 922f450..0000000 --- a/develop/user-guide/guides/llmops-guide/reranking/implementing-reranking.mdx +++ /dev/null @@ -1,117 +0,0 @@ ---- -title: "Implementing reranking in ZenML" -description: "Learn how to implement reranking in ZenML." -icon: code ---- - -We already have a working RAG pipeline, so inserting a reranker into the pipeline is relatively straightforward. The reranker will take the retrieved documents from the initial retrieval step and reorder them based on their relevance to the query that was used to retrieve them. - - - - - -## How and where to add reranking - -We'll use the [rerankers](https://github.com/AnswerDotAI/rerankers/) package to handle the reranking process in our RAG inference pipeline. It's a relatively low-cost (in terms of technical debt and complexity) and lightweight dependency to add to our pipeline. It offers an interface to most of the model types that are commonly used for reranking and means we don't have to worry about the specifics of each model. - -This package provides a `Reranker` abstract class that you can use to define your own reranker. You can also use the provided implementations to add reranking to your pipeline. The reranker takes the query and a list of retrieved documents as input and outputs a reordered list of documents based on the reranking scores. Here's a toy example: - -```py -from rerankers import Reranker - -ranker = Reranker('cross-encoder') - -texts = [ - "I like to play soccer", - "I like to play football", - "War and Peace is a great book", - "I love dogs", - "Ginger cats aren't very smart", - "I like to play basketball", -] - -results = ranker.rank(query="What's your favorite sport?", docs=texts) -``` - -And the results will look something like this: - -```py -RankedResults( - results=[ - Result(doc_id=5, text='I like to play basketball', score=-0.46533203125, rank=1), - Result(doc_id=0, text='I like to play soccer', score=-0.7353515625, rank=2), - Result(doc_id=1, text='I like to play football', score=-0.9677734375, rank=3), - Result(doc_id=2, text='War and Peace is a great book', score=-5.40234375, rank=4), - Result(doc_id=3, text='I love dogs', score=-5.5859375, rank=5), - Result(doc_id=4, text="Ginger cats aren't very smart", score=-5.94921875, rank=6) - ], - query="What's your favorite sport?", - has_scores=True -) -``` - -We can see that the reranker has reordered the documents based on the reranking scores, with the most relevant document appearing at the top of the list. The texts about sport are at the top and the less relevant ones about animals are down at the bottom.
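If you only need the best few passages, you can slice the ranked results directly. The snippet below uses only the attributes visible in the output above (`results`, `rank`, `text`, and `score`):

```py
# Keep the three highest-ranked passages and print them in order
for result in results.results[:3]:
    print(f"{result.rank}. {result.text} (score: {result.score})")
```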
- -We specified that we want a `cross-encoder` reranker, but you can also use other reranker models from the Hugging Face Hub, use API-driven reranker models (from Jina or Cohere, for example), or even define your own reranker model. Read [their documentation](https://github.com/AnswerDotAI/rerankers/) to see how to use these different configurations. - -In our case, we can simply add a helper function that can optionally be invoked when we want to use the reranker: - -```py -def rerank_documents( - query: str, documents: List[Tuple], reranker_model: str = "flashrank" -) -> List[Tuple[str, str]]: - """Reranks the given documents based on the given query.""" - ranker = Reranker(reranker_model) - docs_texts = [f"{doc[0]} PARENT SECTION: {doc[2]}" for doc in documents] - results = ranker.rank(query=query, docs=docs_texts) - # pair the texts with the original urls in `documents` - # `documents` is a tuple of (content, url) - # we want the urls to be returned - reranked_documents_and_urls = [] - for result in results.results: - # content is a `rerankers` Result object - index_val = result.doc_id - doc_text = result.text - doc_url = documents[index_val][1] - reranked_documents_and_urls.append((doc_text, doc_url)) - return reranked_documents_and_urls -``` - -This function takes a query and a list of documents (each document is a tuple of content and URL) and reranks the documents based on the query. It returns a list of tuples, where each tuple contains the reranked document text and the URL of the original document. We use the `flashrank` model from the `rerankers` package by default as it appeared to be a good choice for our use case during development. - -This function then gets used in tests in the following way: - -```py -def query_similar_docs( - question: str, - url_ending: str, - use_reranking: bool = False, - returned_sample_size: int = 5, -) -> Tuple[str, str, List[str]]: - """Query similar documents for a given question and URL ending.""" - embedded_question = get_embeddings(question) - db_conn = get_db_conn() - num_docs = 20 if use_reranking else returned_sample_size - # get (content, url) tuples for the top n similar documents - top_similar_docs = get_topn_similar_docs( - embedded_question, db_conn, n=num_docs, include_metadata=True - ) - - if use_reranking: - reranked_docs_and_urls = rerank_documents(question, top_similar_docs)[ - :returned_sample_size - ] - urls = [doc[1] for doc in reranked_docs_and_urls] - else: - urls = [doc[1] for doc in top_similar_docs] # Unpacking URLs - - return (question, url_ending, urls) -``` - -We get the embeddings for the question being passed into the function and connect to our PostgreSQL database. If we're using reranking, we get the top 20 documents similar to our query and rerank them using the `rerank_documents` helper function. We then extract the URLs from the reranked documents and return them. Note that we only return 5 URLs, but in the case of reranking we get a larger number of documents and URLs back from the database to pass to our reranker, but in the end we always choose the top five reranked documents to return. - -Now that we've added reranking to our pipeline, we can evaluate the performance of our reranker and see how it affects the quality of the retrieved documents. 
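Before running the full evaluation, it can be handy to eyeball the effect of the reranker by calling this helper directly with and without reranking. Here's a small sketch (the question and expected URL ending are made-up values):

```py
question = "How do I configure resources for a pipeline step?"
url_ending = "runtime-configuration"  # hypothetical expected page ending

# Compare the URLs returned with and without the reranking step
_, _, urls_without = query_similar_docs(question, url_ending, use_reranking=False)
_, _, urls_with = query_similar_docs(question, url_ending, use_reranking=True)

print("Top URLs without reranking:", urls_without)
print("Top URLs with reranking:   ", urls_with)
```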
- -## Code Example - -To explore the full code, visit the [Complete Guide](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/) repository and for this section, particularly [the eval\_retrieval.py file](https://github.com/zenml-io/zenml-projects/blob/main/llm-complete-guide/steps/eval%5Fretrieval.py). diff --git a/develop/user-guide/guides/llmops-guide/reranking/understanding-reranking.mdx b/develop/user-guide/guides/llmops-guide/reranking/understanding-reranking.mdx deleted file mode 100644 index 62d4b01..0000000 --- a/develop/user-guide/guides/llmops-guide/reranking/understanding-reranking.mdx +++ /dev/null @@ -1,32 +0,0 @@ ---- -title: "Understanding reranking" -description: "Understand how reranking works." -icon: book ---- - -### What is reranking? - -Reranking is the process of refining the initial ranking of documents retrieved by a retrieval system. In the context of Retrieval-Augmented Generation (RAG), reranking plays a crucial role in improving the relevance and quality of the retrieved documents that are used to generate the final output. - -The initial retrieval step in RAG typically uses a sparse retrieval method, such as BM25 or TF-IDF, to quickly find a set of potentially relevant documents based on the input query. However, these methods rely on lexical matching and may not capture the semantic meaning or context of the query effectively. - -Rerankers, on the other hand, are designed to reorder the retrieved documents by considering additional features, such as semantic similarity, relevance scores, or domain-specific knowledge. They aim to push the most relevant and informative documents to the top of the list, ensuring that the LLM has access to the best possible context for generating accurate and coherent responses. - -### Types of Rerankers - -There are different types of rerankers that can be used in RAG, each with its own strengths and trade-offs: - -1. **Cross-Encoders**: Cross-encoders are a popular choice for reranking in RAG. They take the concatenated query and document as input and output a relevance score. Examples include BERT-based models fine-tuned for passage ranking tasks. Cross-encoders can capture the interaction between the query and document effectively but are computationally expensive. -2. **Bi-Encoders**: Bi-encoders, also known as dual encoders, use separate encoders for the query and document. They generate embeddings for the query and document independently and then compute the similarity between them. Bi-encoders are more efficient than cross-encoders but may not capture the query-document interaction as effectively. -3. **Lightweight Models**: Lightweight rerankers, such as distilled models or small transformer variants, aim to strike a balance between effectiveness and efficiency. They are faster and have a smaller footprint compared to large cross-encoders, making them suitable for real-time applications. - -### Benefits of Reranking in RAG - -Reranking offers several benefits in the context of RAG: - -1. **Improved Relevance**: By considering additional features and scores, rerankers can identify the most relevant documents for a given query, ensuring that the LLM has access to the most informative context for generating accurate responses. -2. **Semantic Understanding**: Rerankers can capture the semantic meaning and context of the query and documents, going beyond simple keyword matching. 
This enables the retrieval of documents that are semantically similar to the query, even if they don't contain exact keyword matches. -3. **Domain Adaptation**: Rerankers can be fine-tuned on domain-specific data to incorporate domain knowledge and improve performance in specific verticals or industries. -4. **Personalization**: Rerankers can be personalized based on user preferences, historical interactions, or user profiles, enabling the retrieval of documents that are more tailored to individual users' needs. - -In the next section, we'll dive into how to implement reranking in ZenML and integrate it into your RAG inference pipeline. diff --git a/develop/user-guide/guides/production-guide.mdx b/develop/user-guide/guides/production-guide.mdx deleted file mode 100644 index a568361..0000000 --- a/develop/user-guide/guides/production-guide.mdx +++ /dev/null @@ -1,26 +0,0 @@ ---- -title: "Overview" -description: "Level up your skills in a production setting." -icon: the-red-yeti ---- - -The ZenML production guide builds upon the [Starter guide](/develop/user-guide/guides/starter-guide) and is the next step in the MLOps Engineer journey with ZenML. If you're an ML practitioner hoping to implement a proof of concept within your workplace to showcase the importance of MLOps, this is the place for you. - - - - -This guide will focus on shifting gears from running pipelines _locally_ on your machine, to running them in _production_ in the cloud. We'll cover: - - - - - - - - - - -Like in the starter guide, make sure you have a Python environment ready and `virtualenv` installed to follow along with ease. As now we are dealing with cloud infrastructure, you'll also want to select one of the major cloud providers (AWS, GCP, Azure), and make sure the respective CLIs are installed and authorized. - -By the end, you will have completed an [end-to-end](/develop/user-guide/guides/production-guide/end-to-end) MLOps project that you can use as inspiration for your own work. Let's get right into it! - diff --git a/develop/user-guide/guides/production-guide/ci-cd.mdx b/develop/user-guide/guides/production-guide/ci-cd.mdx deleted file mode 100644 index c71fe1a..0000000 --- a/develop/user-guide/guides/production-guide/ci-cd.mdx +++ /dev/null @@ -1,126 +0,0 @@ ---- -title: "Set up CI/CD" -description: "Managing the lifecycle of a ZenML pipeline with Continuous Integration and Delivery" -icon: gear ---- - -Until now, we have been executing ZenML pipelines locally. While this is a good mode of operating pipelines, in production it is often desirable to mediate runs through a central workflow engine baked into your CI. - -This allows data scientists to experiment with data processing and model training locally and then have code changes automatically tested and validated through the standard pull request/merge request peer review process. Changes that pass the CI and code review are then deployed automatically to production. Here is how this could look like: - - - - -## Breaking it down - -To illustrate this, let's walk through how this process could be set up with a GitHub Repository. Basically we'll be using Github Actions in order to set up a proper CI/CD workflow. - - - To see this in action, check out the [ZenML Gitflow Repository](https://github.com/zenml-io/zenml-gitflow/). This repository showcases how ZenML can be used for machine learning with a GitHub workflow that automates CI/CD with continuous model training and continuous model deployment to production. 
The repository is also meant to be used as a template: you can fork it and easily adapt it to your own MLOps stack, infrastructure, code and data. - - -### Configure an API Key in ZenML - -In order to facilitate a machine-to-machine connection, you need to create an API key within ZenML. Learn more about service accounts and API keys [here](https://docs.zenml.io/usage/resource-data-management/connecting-to-zenml/connect-with-a-service-account). - -```Bash -zenml service-account create github_action_api_key -``` - -This will return the API key to you, as shown below. It will not be shown to you again, so make sure to copy it now for use in the next section. - -```Bash -Created service account 'github_action_api_key'. -Successfully created API key `default`. -The API key value is: 'ZENKEY_...' -Please store it safely as it will not be shown again. -To configure a ZenML client to use this API key, run: - -... -``` - -### Set up your secrets in GitHub - -For our GitHub Actions workflow, we will need to set up some secrets [for our repository](https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions#creating-secrets-for-a-repository). Specifically, you should use GitHub secrets to store the `ZENML_API_KEY` that you created above. - - - - -The other values that are loaded from secrets into the environment [here](https://github.com/zenml-io/zenml-gitflow/blob/main/.github/workflows/pipeline%5Frun.yaml#L14-L23) can also be set explicitly or as variables. - -### (Optional) Set up different stacks for Staging and Production - -You might not necessarily want to use the same stack with the same resources for your staging and production use. - -This step is optional; all you'll need for certain is a stack that runs remotely (remote orchestration and artifact storage). The rest is up to you. You might, for example, want to parametrize your pipeline to use different data sources for the respective environments. You can also use different [configuration files](https://github.com/zenml-io/zenml/blob/release/0.66.0/docs/book/versions/0.66.0/how-to/configuring-zenml/configuring-zenml.md) for the different environments to configure the [Model](/develop/usage/development-monitoring/use-the-model-control-plane), the [DockerSettings](/develop/usage/project-setup/customize-docker-builds/docker-settings-on-a-pipeline), and the [ResourceSettings like accelerators](/develop/usage/resource-data-management/training-with-gpus) differently for the different environments. - -### Trigger a pipeline on a Pull Request (Merge Request) - -To ensure only fully working code makes it into production, you should use a staging environment to test all the changes made to your code base and verify they work as intended. To do this automatically, you should set up a GitHub Actions workflow that runs your pipeline when you make changes to it. [Here](https://github.com/zenml-io/zenml-gitflow/blob/main/.github/workflows/pipeline%5Frun.yaml) is an example that you can use. - -To only run the GitHub Action on a PR, you can configure the YAML like this: - -```Bash -on: - pull_request: - branches: [ staging, main ] -``` - -When the workflow starts, we want to set some important values. Here is a simplified version that you can use.
- -```Bash -jobs: - run-staging-workflow: - runs-on: run-zenml-pipeline - env: - ZENML_HOST: ${{ secrets.ZENML_HOST }} # Put your server url here - ZENML_API_KEY: ${{ secrets.ZENML_API_KEY }} # Retrieves the api key for use - ZENML_STACK: stack_name # Use this to decide which stack is used for staging - ZENML_GITHUB_SHA: ${{ github.event.pull_request.head.sha }} - ZENML_GITHUB_URL_PR: ${{ github.event.pull_request._links.html.href }} -``` - -After configuring these values so they apply to your specific situation the rest of the template should work as is for you. Specifically you will need to install all requirements, connect to your ZenML Server, set an active stack and run a pipeline within your github action. - -```Bash -steps: - - name: Check out repository code - uses: actions/checkout@v3 - - - uses: actions/setup-python@v4 - with: - python-version: '3.9' - - - name: Install requirements - run: | - pip3 install -r requirements.txt - - - name: Connect to ZenML server - run: | - zenml connect --url $ZENML_HOST --api-key $ZENML_API_KEY - - - name: Set stack - run: | - zenml stack set ${{ env.ZENML_STACK }} - - - - name: Run pipeline - run: | - python run.py \ - --pipeline end-to-end \ - --dataset production \ - --version ${{ env.ZENML_GITHUB_SHA }} \ - --github-pr-url ${{ env.ZENML_GITHUB_URL_PR }} -``` - -When you push to a branch now, that is within a Pull Request, this action will run automatically. - -### (Optional) Comment Metrics onto the PR - -Finally you can configure your github action workflow to leave a report based on the pipeline that was run. Check out the template for this \[here\](https://github.com/zenml-io/zenml-gitflow/blob/main/.github/workflows/pipeline\_run.yaml#L87-L99. - - - - - - diff --git a/develop/user-guide/guides/production-guide/cloud-orchestration.mdx b/develop/user-guide/guides/production-guide/cloud-orchestration.mdx deleted file mode 100644 index c49ed57..0000000 --- a/develop/user-guide/guides/production-guide/cloud-orchestration.mdx +++ /dev/null @@ -1,177 +0,0 @@ ---- -title: "Orchestrate on the cloud" -description: "Orchestrate using cloud resources." -icon: cloud ---- - -Until now, we've only run pipelines locally. The next step is to get free from our local machines and transition our pipelines to execute on the cloud. This will enable you to run your MLOps pipelines in a cloud environment, leveraging the scalability and robustness that cloud platforms offer. - -In order to do this, we need to get familiar with two more stack components: - -* The [orchestrator](/develop/stack-components/orchestrators) manages the workflow and execution of your pipelines. -* The [container registry](/develop/stack-components/container-registries) is a storage and content delivery system that holds your Docker container images. - -These, along with [remote storage](/develop/user-guide/guides/production-guide/remote-storage), complete a basic cloud stack where our pipeline is entirely running on the cloud. - - -Would you like to skip ahead and deploy a full ZenML cloud stack already? - -Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML Terraform modules](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register a cloud stack. 
- - -## Starting with a basic cloud stack - -The easiest cloud orchestrator to start with is the [Skypilot](https://skypilot.readthedocs.io/) orchestrator running on a public cloud. The advantage of Skypilot is that it simply provisions a VM to execute the pipeline on your cloud provider. - -Coupled with Skypilot, we need a mechanism to package your code and ship it to the cloud for Skypilot to do its thing. ZenML uses [Docker](https://www.docker.com/) to achieve this. Every time you run a pipeline with a remote orchestrator, [ZenML builds an image](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository) for the entire pipeline (and optionally each step of a pipeline depending on your [configuration](/develop/usage/project-setup/customize-docker-builds)). This image contains the code, requirements, and everything else needed to run the steps of the pipeline in any environment. ZenML then pushes this image to the container registry configured in your stack, and the orchestrator pulls the image when it's ready to execute a step. - -To summarize, here is the broad sequence of events that happen when you run a pipeline with such a cloud stack: - - - - -1. The user runs a pipeline on the client machine. This executes the `run.py` script where ZenML reads the `@pipeline` function and understands what steps need to be executed. -2. The client asks the server for the stack info, which returns it with the configuration of the cloud stack. -3. Based on the stack info and pipeline specification, the client builds and pushes an image to the `container registry`. The image contains the environment needed to execute the pipeline and the code of the steps. -4. The client creates a run in the `orchestrator`. For example, in the case of the [Skypilot](https://skypilot.readthedocs.io/) orchestrator, it creates a virtual machine in the cloud with some commands to pull and run a Docker image from the specified container registry. -5. The `orchestrator` pulls the appropriate image from the `container registry` as it's executing the pipeline (each step has an image). -6. As each pipeline runs, it stores artifacts physically in the `artifact store`. Of course, this artifact store needs to be some form of cloud storage. -7. As each pipeline runs, it reports status back to the ZenML server and optionally queries the server for metadata. - -## Provisioning and registering an orchestrator alongside a container registry - -While there are detailed docs on [how to set up a Skypilot orchestrator](/develop/stack-components/orchestrators/skypilot-vm) and a [container registry](/develop/stack-components/container-registries) on each public cloud, we have put the most relevant details here for convenience: - - - In order to launch a pipeline on AWS with the SkyPilot orchestrator, the first thing that you need to do is to install the AWS and Skypilot integrations: - -```Bash -zenml integration install aws skypilot_aws -y -``` - -Before we start registering any components, there is another step that we have to execute. As we [explained in the previous section](/develop/user-guide/guides/production-guide/remote-storage#configuring-permissions-with-your-first-service-connector), components such as orchestrators and container registries often require you to set up the right permissions. In ZenML, this process is simplified with the use of [Service Connectors](/develop/usage/resource-data-management/auth-management). 
For this example, we need to use the [IAM role authentication method of our AWS service connector](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role): - - -```Bash -AWS_PROFILE= zenml service-connector register cloud_connector --type aws --auto-configure -``` - -Once the service connector is set up, we can register [a Skypilot orchestrator](/develop/stack-components/orchestrators/skypilot-vm): - -```Bash -zenml orchestrator register cloud_orchestrator -f vm_aws -zenml orchestrator connect cloud_orchestrator --connector cloud_connector -``` - -The next step is to register [an AWS container registry](/develop/stack-components/container-registries/aws). Similar to the orchestrator, we will use our connector as we are setting up the container registry: - - -```Bash -zenml container-registry register cloud_container_registry -f aws --uri=.dkr.ecr..amazonaws.com -zenml container-registry connect cloud_container_registry --connector cloud_connector -``` - -With the components registered, everything is set up for the next steps. - -For more information, you can always check the [dedicated Skypilot orchestrator guide](/develop/stack-components/orchestrators/skypilot-vm). - - - In order to launch a pipeline on GCP with the SkyPilot orchestrator, the first thing that you need to do is to install the GCP and Skypilot integrations: - -```Bash -zenml integration install gcp skypilot_gcp -y -``` - -Before we start registering any components, there is another step that we have to execute. As we [explained in the previous section](/develop/user-guide/guides/production-guide/remote-storage#configuring-permissions-with-your-first-service-connector), components such as orchestrators and container registries often require you to set up the right permissions. In ZenML, this process is simplified with the use of [Service Connectors](/develop/usage/resource-data-management/auth-management). For this example, we need to use the [Service Account authentication feature of our GCP service connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcp-service-account): - -```Bash -zenml service-connector register cloud_connector --type gcp --auth-method service-account --service_account_json=@ --project_id= --generate_temporary_tokens=False -``` - -Once the service connector is set up, we can register [a Skypilot orchestrator](/develop/stack-components/orchestrators/skypilot-vm): - -```Bash -zenml orchestrator register cloud_orchestrator -f vm_gcp - -zenml orchestrator connect cloud_orchestrator --connect cloud_connector -``` - -The next step is to register [a GCP container registry](/develop/stack-components/container-registries/gcp). Similar to the orchestrator, we will use our connector as we are setting up the container registry: - -```Bash -zenml container-registry register cloud_container_registry -f gcp --uri=gcr.io/ -zenml container-registry connect cloud_container_registry --connector cloud_connector -``` - -With the components registered, everything is set up for the next steps. - -For more information, you can always check the [dedicated Skypilot orchestrator guide](/develop/stack-components/orchestrators/skypilot-vm). - - -As of [v0.60.0](https://github.com/zenml-io/zenml/releases/tag/0.60.0), alongside the switch to `pydantic` v2, due to an incompatibility between the new version `pydantic` and the `azurecli`, the `skypilot[azure]` flavor can not be installed at the same time. 
Therefore, for Azure users, an alternative is to use the [Kubernetes Orchestrator](/develop/stack-components/orchestrators/kubernetes). You can easily deploy a Kubernetes cluster in your subscription using the [Azure Kubernetes Service](https://azure.microsoft.com/en-us/products/kubernetes-service). - -In order to launch a pipeline on Azure with the Kubernetes orchestrator, the first thing that you need to do is to install the Azure and Kubernetes integrations: - -```Bash -zenml integration install azure kubernetes -y -``` - -You should also ensure you have [kubectl installed](https://kubernetes.io/docs/tasks/tools/). - -Before we start registering any components, there is another step that we have to execute. As we [explained in the previous section](/develop/user-guide/guides/production-guide/remote-storage#configuring-permissions-with-your-first-service-connector), components such as orchestrators and container registries often require you to set up the right permissions. In ZenML, this process is simplified with the use of [Service Connectors](/develop/usage/resource-data-management/auth-management). For this example, we will need to use the [Service Principal authentication feature of our Azure service connector](/develop/usage/resource-data-management/auth-management/azure-service-connector#azure-service-principal): - -```Bash -zenml service-connector register cloud_connector --type azure --auth-method service-principal --tenant_id= --client_id= --client_secret= -``` - -Once the service connector is set up, we can register [a Kubernetes orchestrator](/develop/stack-components/orchestrators/kubernetes): - -```Bash -# Ensure your service connector has access to the AKS cluster: -zenml service-connector list-resources --resource-type kubernetes-cluster -e -zenml orchestrator register cloud_orchestrator --flavor kubernetes -zenml orchestrator connect cloud_orchestrator --connect cloud_connector -``` - -The next step is to register [an Azure container registry](/develop/stack-components/container-registries/azure). Similar to the orchestrator, we will use our connector as we are setting up the container registry. - -```Bash -zenml container-registry register cloud_container_registry -f azure --uri=.azurecr.io -zenml container-registry connect cloud_container_registry --connector cloud_connector -``` - -With the components registered, everything is set up for the next steps. - -For more information, you can always check the [dedicated Kubernetes orchestrator guide](/develop/stack-components/orchestrators/kubernetes). - - - - - Having trouble with setting up infrastructure? Try reading the [stack deployment](/develop/stack-components/stack-deployment) section of the docs to gain more insight. If that still doesn't work, join the [ZenML community](https://zenml.io/slack) and ask! - - -## Running a pipeline on a cloud stack - -Now that we have our orchestrator and container registry registered, we can [register a new stack](/develop/user-guide/guides/production-guide/understand-stacks#registering-a-stack), just like we did in the previous chapter: - -```Bash CLI -zenml stack register minimal_cloud_stack -o cloud_orchestrator -a cloud_artifact_store -c cloud_container_registry -``` - -Now, using the [code from the previous chapter](/develop/user-guide/guides/production-guide/understand-stacks#run-a-pipeline-on-the-new-local-stack), we can run a training pipeline. 
First, set the minimal cloud stack active: - -```Bash -zenml stack set minimal_cloud_stack -``` - -and then, run the training pipeline: - -```Bash -python run.py --training-pipeline -``` - -You will notice this time your pipeline behaves differently. After it has built the Docker image with all your code, it will push that image, and run a VM on the cloud. Here is where your pipeline will execute, and the logs will be streamed back to you. So with a few commands, we were able to ship our entire code to the cloud! - -Curious to see what other stacks you can create? The [Component Guide](/develop/stack-components/component-guide) has an exhaustive list of various artifact stores, container registries, and orchestrators that are integrated with ZenML. Try playing around with more stack components to see how easy it is to switch between MLOps stacks with ZenML. - diff --git a/develop/user-guide/guides/production-guide/configure-pipeline.mdx b/develop/user-guide/guides/production-guide/configure-pipeline.mdx deleted file mode 100644 index 1f1acb0..0000000 --- a/develop/user-guide/guides/production-guide/configure-pipeline.mdx +++ /dev/null @@ -1,164 +0,0 @@ ---- -title: "Configure your pipeline to add compute" -description: "Add more resources to your pipeline configuration." -icon: sliders ---- - -Now that we have our pipeline up and running in the cloud, you might be wondering how ZenML figured out what sort of dependencies to install in the Docker image that we just ran on the VM. The answer lies in the [runner script we executed (i.e. run.py)](https://github.com/zenml-io/zenml/blob/main/examples/quickstart/run.py#L215), in particular, these lines: - -```Bash -pipeline_args["config_path"] = os.path.join( - config_folder, "training_rf.yaml" -) -# Configure the pipeline -training_pipeline_configured = training_pipeline.with_options(**pipeline_args) -# Create a run -training_pipeline_configured() -``` - -The above commands [configure our training pipeline](/develop/user-guide/guides/starter-guide/create-an-ml-pipeline#configure-with-a-yaml-file) with a YAML configuration called `training_rf.yaml` (found [here in the source code](https://github.com/zenml-io/zenml/blob/main/examples/quickstart/configs/training%5Frf.yaml)). Let's learn more about this configuration file. - - -The `with_options` command that points to a YAML config is only one way to configure a pipeline. We can also directly configure a pipeline or a step in the decorator: - -```Bash -@pipeline(settings=...) -``` - -However, it is best to not mix configuration from code to ensure separation of concerns in our codebase. - - -## Breaking down our configuration YAML - -The YAML configuration of a ZenML pipeline can be very simple, as in this case. Let's break it down and go through each section one by one: - -### The Docker settings - -```Bash -settings: - - docker: - - required_integrations: - - - sklearn - - requirements: - - - pyarrow -``` - -The first section is the so-called `settings` of the pipeline. This section has a `docker` key, which controls the [containerization process](/develop/user-guide/guides/production-guide/cloud-orchestration#orchestrating-pipelines-on-the-cloud). Here, we are simply telling ZenML that we need `pyarrow` as a pip requirement, and we want to enable the `sklearn` integration of ZenML, which will in turn install the `scikit-learn` library. 
This Docker section can be populated with many different options, and correspond to the [DockerSettings](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-config/#zenml.config.docker%5Fsettings.DockerSettings) class in the Python SDK. - -### Associating a ZenML Model - -The next section is about associating a [ZenML Model](/develop/user-guide/guides/starter-guide/track-ml-models) with this pipeline. - - - -```Bash - -# Configuration of the Model Control Plane -model: - name: breast_cancer_classifier - version: rf - license: Apache 2.0 - description: A breast cancer classifier - tags: ["breast_cancer", "classifier"] -``` - -You will see that this configuration lines up with the model created after executing these pipelines: - - -```Bash -# List all versions of the breast_cancer_classifier -zenml model version list breast_cancer_classifier -``` - - -[ZenML Pro](https://www.zenml.io/pro) ships with a Model Control Plane dashboard where you can visualize all the versions: - - - - - - -### Passing parameters - -The last part of the config YAML is the `parameters` key: - -```Bash -# Configure the pipeline -parameters: - model_type: "rf" # Choose between rf/sgd -``` - -This parameters key aligns with the parameters that the pipeline expects. In this case, the pipeline expects a string called `model_type` that will inform it which type of model to use: - -```Bash -@pipeline -def training_pipeline(model_type: str): -... -``` - -So you can see that the YAML config is fairly easy to use and is an important part of the codebase to control the execution of our pipeline. You can read more about how to configure a pipeline in the [how to section](/develop/usage/project-setup/use-configuration-files/what-can-be-configured), but for now, we can move on to scaling our pipeline. - -## Scaling compute on the cloud - -When we ran our pipeline with the above config, ZenML used some sane defaults to pick the resource requirements for that pipeline. However, in the real world, you might want to add more memory, CPU, or even a GPU depending on the pipeline at hand. - -This is as easy as adding the following section to your local `training_rf.yaml` file: - -```Bash -# These are the resources for the entire pipeline, i.e., each step -settings: - ... - - # Adapt this to vm_gcp accordingly - orchestrator: - memory: 32 # in GB - -... -steps: - model_trainer: - settings: - orchestrator: - cpus: 8 -``` - -Here we are configuring the entire pipeline with a certain amount of memory, while for the trainer step we are additionally configuring 8 CPU cores. The `orchestrator` key corresponds to the [SkypilotBaseOrchestratorSettings](https://sdkdocs.zenml.io/latest/integration%5Fcode%5Fdocs/integrations-skypilot/#zenml.integrations.skypilot.flavors.skypilot%5Forchestrator%5Fbase%5Fvm%5Fconfig.SkypilotBaseOrchestratorSettings) class in the Python SDK. - - As discussed [before](/develop/user-guide/guides/production-guide/cloud-orchestration), we are using the [Kubernetes orchestrator](/develop/stack-components/orchestrators/kubernetes) for Azure users. In order to scale compute for the Kubernetes orchestrator, the YAML file needs to look like this: - - ```Bash - # These are the resources for the entire pipeline, i.e., each step - settings: - ... - - resources: - memory: "32GB" - - ... 
- steps: - model_trainer: - settings: - resources: - memory: "8GB" -``` - - - - -Read more about settings in ZenML [here](/develop/usage/project-setup/use-configuration-files/runtime-configuration) and [here](/develop/usage/resource-data-management/training-with-gpus). - -Now let's run the pipeline again: - -```Bash -python run.py --training-pipeline -``` - -You should now notice that the machine provisioned on your cloud provider has a different configuration compared to last time. As easy as that! - -Bear in mind that not every orchestrator supports `ResourceSettings` directly. To learn more, you can read about [ResourceSettings here](/develop/usage/project-setup/use-configuration-files/runtime-configuration), including the ability to [attach a GPU](/develop/usage/resource-data-management/training-with-gpus#1-specify-a-cuda-enabled-parent-image-in-your-dockersettings). diff --git a/develop/user-guide/guides/production-guide/connect-code-repository.mdx b/develop/user-guide/guides/production-guide/connect-code-repository.mdx deleted file mode 100644 index e699201..0000000 --- a/develop/user-guide/guides/production-guide/connect-code-repository.mdx +++ /dev/null @@ -1,116 +0,0 @@ ---- -title: "Configure a code repository" -description: "Connect a Git repository to ZenML to track code changes and collaborate on MLOps projects." -icon: code-branch ---- - -Throughout the lifecycle of an MLOps pipeline, it can get quite tiresome to wait for a Docker build every time you run a pipeline (even if the local Docker cache is used). However, there is a way to build once and keep reusing that build until the pipeline environment changes: connecting a code repository. - -With ZenML, connecting to a Git repository optimizes the Docker build process. It also has the added bonus of being a better way of managing repository changes and enabling better code collaboration. Here is how the flow changes when running a pipeline: - - - - -1. You trigger a pipeline run on your local machine. ZenML parses the `@pipeline` function to determine the necessary steps. -2. The local client requests stack information from the ZenML server, which responds with the cloud stack configuration. -3. The local client detects that we're using a code repository and requests the information from the git repo. -4. Instead of building a new Docker image, the client checks if an existing image can be reused based on the current Git commit hash and other environment metadata. -5. The client initiates a run in the orchestrator, which sets up the execution environment in the cloud, such as a VM. -6. The orchestrator downloads the code directly from the Git repository and uses the existing Docker image to run the pipeline steps. -7. Pipeline steps execute, storing artifacts in the cloud-based artifact store. -8. Throughout the execution, the pipeline run status and metadata are reported back to the ZenML server. - -By connecting a Git repository, you avoid redundant builds and make your MLOps processes more efficient. Your team can work on the codebase simultaneously, with ZenML handling the version tracking and ensuring that the correct code version is always used for each run. - -## Creating a GitHub Repository - -While ZenML supports [many different flavors of git repositories](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository), this guide will focus on [GitHub](https://github.com). To create a repository on GitHub: - -1.
Sign in to [GitHub](https://github.com/). -2. Click the "+" icon and select "New repository." -3. Name your repository, set its visibility, and add a README or .gitignore if needed. -4. Click "Create repository." - -We can now push our local code (from the [previous chapters](/develop/user-guide/guides/production-guide/understand-stacks#run-a-pipeline-on-the-new-local-stack)) to GitHub with these commands: - -```Bash -# Initialize a Git repository -git init - -# Add files to the repository -git add . - -# Commit the files -git commit -m "Initial commit" - -# Add the GitHub remote -git remote add origin https://github.com/YOUR_USERNAME/YOUR_REPOSITORY_NAME.git - -# Push to GitHub -git push -u origin master -``` - -Replace `YOUR_USERNAME` and `YOUR_REPOSITORY_NAME` with your GitHub information. - -## Linking to ZenML - -To connect your GitHub repository to ZenML, you'll need a GitHub Personal Access Token (PAT). - - -1. Go to your GitHub account settings and click on [Developer settings](https://github.com/settings/tokens?type=beta). -2. Select "Personal access tokens" and click on "Generate new token". -3. Give your token a name and a description. - - - - - -4. We recommend selecting the specific repository and then giving `contents` read-only access. - - - - - - - - -5. Click on "Generate token" and copy the token to a safe place. - - - - - - - -Now, we can install the GitHub integration and register your repository: - -```Bash -zenml integration install github -zenml code-repository register --type=github \ ---url=https://github.com/YOUR_USERNAME/YOUR_REPOSITORY_NAME.git \ ---owner=YOUR_USERNAME --repository=YOUR_REPOSITORY_NAME \ ---token=YOUR_GITHUB_PERSONAL_ACCESS_TOKEN -``` - -Fill in ``, `YOUR_USERNAME`, `YOUR_REPOSITORY_NAME`, and `YOUR_GITHUB_PERSONAL_ACCESS_TOKEN` with your details. - -Your code is now connected to your ZenML server. ZenML will automatically detect if your source files are being tracked by GitHub and store the commit hash for each subsequent pipeline run. - -You can try this out by running our training pipeline again: - -```Bash -# This will build the Docker image the first time - -python run.py --training-pipeline - -# This will skip Docker building - -python run.py --training-pipeline -``` - -You can read more about [the ZenML Git Integration here](/develop/usage/project-setup/setting-up-a-project-repository/connect-your-git-repository). - diff --git a/develop/user-guide/guides/production-guide/deploying-zenml.mdx b/develop/user-guide/guides/production-guide/deploying-zenml.mdx deleted file mode 100644 index 3cebbc1..0000000 --- a/develop/user-guide/guides/production-guide/deploying-zenml.mdx +++ /dev/null @@ -1,69 +0,0 @@ ---- -title: "Deploying ZenML" -description: "Deploying ZenML is the first step to production." -icon: rocket ---- - -When you first get started with ZenML, it is based on the following architecture on your machine: - - - - -The SQLite database that you can see in this diagram is used to store all the metadata we produced in the previous guide (pipelines, models, artifacts, etc). - -In order to move into production, you will need to deploy this server somewhere centrally outside of your machine. 
This allows different infrastructure components to interact with it and enables you to collaborate with your team members: - - - -## Choosing how to deploy ZenML - -While there are many options on how to [deploy ZenML](/develop/getting-started/deploying-zenml), the two simplest ones are: - -### Option 1: Sign up for a free ZenML Pro Trial - -[ZenML Pro](https://zenml.io/pro) comes as a managed SaaS solution that offers a one-click deployment for your ZenML server. Click [here](https://cloud.zenml.io/?utm%5Fsource=docs&utm%5Fmedium=referral%5Flink&utm%5Fcampaign=cloud%5Fpromotion&utm%5Fcontent=signup%5Flink) to start a free trial. - -On top of the one-click SaaS experience, ZenML Pro also comes with additional built-in features and a new dashboard that can be helpful while following this guide. You can always go back to self-hosting after your learning journey is complete. - -### Option 2: Self-host ZenML on your cloud provider - -As ZenML is open source, it is easy to [self-host it](/develop/getting-started/deploying-zenml). There is even a [ZenML CLI](/develop/getting-started/deploying-zenml/deploy-with-zenml-cli) one-liner that deploys ZenML on a Kubernetes cluster, abstracting away all the infrastructure complexity. If you don't have an existing Kubernetes cluster, you can create one manually using the documentation for your cloud provider. For convenience, here are links for [AWS](https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html), [Azure](https://learn.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-portal?tabs=azure-cli), and [GCP](https://cloud.google.com/kubernetes-engine/docs/versions/0.66.0/how-to/creating-a-zonal-cluster#before%5Fyou%5Fbegin). - - - Once you have created your cluster, make sure that you configure your [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) client to connect to it. - - -You're now ready to deploy ZenML! Run the following command: - -```Bash -zenml deploy -``` - -You will be prompted to provide a name for your deployment and details like what cloud provider you want to deploy to, in addition to the username, password, and email you want to set for the default user — and that's it! It creates the database and any VPCs, permissions, and more that are needed. - - - In order to be able to run the `deploy` command, you should have your cloud provider's CLI configured locally with permissions to create resources like MySQL databases and networks. - - -To learn more about different options for [deploying ZenML, visit the deployment documentation](/develop/getting-started/deploying-zenml). - -## Connecting to a deployed ZenML - -You can connect your local ZenML client with the ZenML Server using the ZenML CLI and the web-based login. This can be executed with the command: - -```Bash -zenml connect --url -``` - -where SERVER\_URL is the host address of your ZenML deployment (e.g. `https://mydeployment.zenml.com`). - - - Having trouble connecting with a browser? There are other ways to connect. Read [here](/develop/usage/resource-data-management/connecting-to-zenml) for more details. - - -This command starts a browser-based flow to validate the device you are connecting from. After that, your local client is connected to the remote ZenML server. Nothing about your experience changes, except that all metadata you produce will be tracked centrally in one place from now on.
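As a concrete sketch of that flow (the URL is only an example, and `zenml status` is assumed to be available in your ZenML version), the terminal side might look like this:

```Bash
# Connect the local client to the deployed server; this opens a browser for device authorization
zenml connect --url https://mydeployment.zenml.com

# Afterwards, check which server and stack your client is currently using
zenml status
```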
- - - You can always go back to the local zenml experience by using `zenml disconnect` - diff --git a/develop/user-guide/guides/production-guide/end-to-end.mdx b/develop/user-guide/guides/production-guide/end-to-end.mdx deleted file mode 100644 index 30fe6cf..0000000 --- a/develop/user-guide/guides/production-guide/end-to-end.mdx +++ /dev/null @@ -1,62 +0,0 @@ ---- -title: "An end-to-end project" -description: "Put your new knowledge in action with an end-to-end project" -icon: diagram-project ---- - -That was awesome! We learned so many advanced MLOps production concepts: - -* The value of [deploying ZenML](/develop/user-guide/guides/production-guide/deploying-zenml) -* Abstracting infrastructure configuration into [stacks](/develop/user-guide/guides/production-guide/understand-stacks) -* [Connecting remote storage](/develop/user-guide/guides/production-guide/remote-storage) -* [Orchestrating on the cloud](/develop/user-guide/guides/production-guide/cloud-orchestration) -* [Configuring the pipeline to scale compute](/develop/user-guide/guides/production-guide/configure-pipeline) -* [Connecting a git repository](/develop/user-guide/guides/production-guide/connect-code-repository) - -We will now combine all of these concepts into an end-to-end MLOps project powered by ZenML. - -## Get started - -Start with a fresh virtual environment with no dependencies. Then let's install our dependencies: - -```Bash -pip install "zenml[templates,server]" notebook -zenml integration install sklearn -y -``` - -We will then use [ZenML templates](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates) to help us get the code we need for the project: - -```bash -mkdir zenml_batch_e2e -cd zenml_batch_e2e -zenml init --template e2e_batch --template-with-defaults - -# Just in case, we install the requirements again - -pip install -r requirements.txt -``` - -The e2e template is also available as a [ZenML example](https://github.com/zenml-io/zenml/tree/main/examples/e2e). You can clone it: - -```Bash -git clone --depth 1 git@github.com:zenml-io/zenml.git -cd zenml/examples/e2e -pip install -r requirements.txt -zenml init -``` - - - -## What you'll learn - -The e2e project is a comprehensive project template to cover major use cases of ZenML: a collection of steps and pipelines and, to top it all off, a simple but useful CLI. It showcases the core ZenML concepts for supervised ML with batch predictions. It builds on top of the [starter project](/develop/user-guide/guides/starter-guide/starter-project) with more advanced concepts. - -As you progress through the e2e batch template, try running the pipelines on a [remote cloud stack](/develop/user-guide/guides/production-guide/cloud-orchestration) on a tracked [git repository](/develop/user-guide/guides/production-guide/connect-code-repository) to practice some of the concepts we have learned in this guide. - -At the end, don't forget to share the [ZenML e2e template](https://github.com/zenml-io/template-e2e-batch) with your colleagues and see how they react! - -## Conclusion and next steps - -The production guide has now hopefully landed you with an end-to-end MLOps project, powered by a ZenML server connected to your cloud infrastructure. You are now ready to dive deep into writing your own pipelines and stacks. If you are looking to learn more advanced concepts, the [how-to section](/develop/usage/pipelines/build-pipelines/build-pipelines) is for you. Until then, we wish you the best of luck chasing your MLOps dreams! 
diff --git a/develop/user-guide/guides/production-guide/remote-storage.mdx b/develop/user-guide/guides/production-guide/remote-storage.mdx deleted file mode 100644 index 56c2800..0000000 --- a/develop/user-guide/guides/production-guide/remote-storage.mdx +++ /dev/null @@ -1,205 +0,0 @@ ---- -title: "Connecting remote storage" -description: "Transitioning to remote artifact storage." -icon: database ---- - -In the previous chapters, we've been working with artifacts stored locally on our machines. This setup is fine for individual experiments, but as we move towards a collaborative and production-ready environment, we need a solution that is more robust, shareable, and scalable. Enter remote storage! - -Remote storage allows us to store our artifacts in the cloud, which means they're accessible from anywhere and by anyone with the right permissions. This is essential for team collaboration and for managing the larger datasets and models that come with production workloads. - -When using a stack with remote storage, nothing changes except the fact that the artifacts get materialized in a central and remote storage location. This diagram explains the flow: - - - - - -Would you like to skip ahead and deploy a full ZenML cloud stack already? - -Check out the [in-browser stack deployment wizard](/develop/stack-components/stack-deployment/deploy-a-cloud-stack), the [stack registration wizard](/develop/stack-components/stack-deployment/register-a-cloud-stack), or [the ZenML Terraform modules](/develop/stack-components/stack-deployment/deploy-a-cloud-stack-with-terraform) for a shortcut on how to deploy & register a cloud stack. - - -## Provisioning and registering a remote artifact store - -Out of the box, ZenML ships with [many different supported artifact store flavors](/develop/stack-components/artifact-stores). For convenience, here are some brief instructions on how to quickly get up and running on the major cloud providers: - - -You will need to install and set up the AWS CLI on your machine as a prerequisite, as covered in [the AWS CLI documentation](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html), before you register the S3 Artifact Store. - -The Amazon Web Services S3 Artifact Store flavor is provided by the [S3 ZenML integration](/develop/stack-components/artifact-stores/s3), you need to install it on your local machine to be able to register an S3 Artifact Store and add it to your stack: - -```Bash -zenml integration install s3 -y -``` - - - Having trouble with this command? You can use `poetry` or `pip` to install the requirements of any ZenML integration directly. In order to obtain the exact requirements of the AWS S3 integration you can use `zenml integration requirements s3`. - - -The only configuration parameter mandatory for registering an S3 Artifact Store is the root path URI, which needs to point to an S3 bucket and take the form `s3://bucket-name`. In order to create a S3 bucket, refer to the [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html). - -With the URI to your S3 bucket known, registering an S3 Artifact Store can be done as follows: - -```Bash -# Register the S3 artifact-store -zenml artifact-store register cloud_artifact_store -f s3 --path=s3://bucket-name -``` -For more information, read the [dedicated S3 artifact store flavor guide](/develop/stack-components/artifact-stores/s3). 
- -You will need to install and set up the Google Cloud CLI on your machine as a prerequisite, as covered in [the Google Cloud documentation](https://cloud.google.com/sdk/docs/install-sdk), before you register the GCS Artifact Store. - -The Google Cloud Storage Artifact Store flavor is provided by the [GCP ZenML integration](/develop/stack-components/artifact-stores/gcp). You need to install it on your local machine to be able to register a GCS Artifact Store and add it to your stack: - -```Bash -zenml integration install gcp -y -``` - - - Having trouble with this command? You can use `poetry` or `pip` to install the requirements of any ZenML integration directly. In order to obtain the exact requirements of the GCP integration you can use `zenml integration requirements gcp`. - - -The only configuration parameter mandatory for registering a GCS Artifact Store is the root path URI, which needs to point to a GCS bucket and take the form `gs://bucket-name`. Please read [the Google Cloud Storage documentation](https://cloud.google.com/storage/docs/creating-buckets) on how to provision a GCS bucket. - -With the URI to your GCS bucket known, registering a GCS Artifact Store can be done as follows: - -```Bash -# Register the GCS artifact store -zenml artifact-store register cloud_artifact_store -f gcp --path=gs://bucket-name -``` -For more information, read the [dedicated GCS artifact store flavor guide](/develop/stack-components/artifact-stores/gcp). - - -You will need to install and set up the Azure CLI on your machine as a prerequisite, as covered in [the Azure documentation](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli), before you register the Azure Artifact Store. - -The Microsoft Azure Artifact Store flavor is provided by the [Azure ZenML integration](/develop/stack-components/artifact-stores/azure). You need to install it on your local machine to be able to register an Azure Artifact Store and add it to your stack: - -```Bash -zenml integration install azure -y -``` - - -Having trouble with this command? You can use `poetry` or `pip` to install the requirements of any ZenML integration directly. In order to obtain the exact requirements of the Azure integration you can use `zenml integration requirements azure`. - - -The only configuration parameter mandatory for registering an Azure Artifact Store is the root path URI, which needs to point to an Azure Blob Storage container and take the form `az://container-name` or `abfs://container-name`. Please read [the Azure Blob Storage documentation](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-portal) on how to provision an Azure Blob Storage container. - -With the URI to your Azure Blob Storage container known, registering an Azure Artifact Store can be done as follows: - -```bash -# Register the Azure artifact store -zenml artifact-store register cloud_artifact_store -f azure --path=az://container-name -``` - -For more information, read the [dedicated Azure artifact store flavor guide](/develop/stack-components/artifact-stores/azure). - - -You can create a remote artifact store in pretty much any environment, including other cloud providers, using a cloud-agnostic artifact store such as [Minio](/develop/stack-components/artifact-stores). - -It is also relatively simple to create a [custom stack component flavor](/develop/stack-components/stack-deployment/implement-a-custom-stack-component) for your use case. - - - - - -Having trouble with setting up infrastructure?
Join the [ZenML community](https://zenml.io/slack) and ask for help! - - -## Configuring permissions with your first service connector - -While you can go ahead and [run your pipeline on your stack](/develop/user-guide/guides/production-guide/remote-storage#running-a-pipeline-on-a-cloud-stack) if your local client is configured to access it, it is best practice to use a [service connector](/develop/usage/resource-data-management/auth-management) for this purpose. Service connectors are quite a complicated concept (We have a whole [docs section](/develop/usage/resource-data-management/auth-management) on them) - but we're going to be starting with a very basic approach. - -First, let's understand what a service connector does. In simple words, a service connector contains credentials that grant stack components access to cloud infrastructure. These credentials are stored in the form of a [secret](/develop/usage/project-setup/use-secrets/interact-with-secrets), and are available to the ZenML server to use. Using these credentials, the service connector brokers a short-lived token and grants temporary permissions to the stack component to access that infrastructure. This diagram represents this process: - - - - - -There are [many ways to create an AWS service connector](/develop/usage/resource-data-management/auth-management/aws-service-connector#authentication-methods), but for the sake of this guide, we recommend creating one by [using the IAM method](/develop/usage/resource-data-management/auth-management/aws-service-connector#aws-iam-role). - -```Bash -AWS_PROFILE= zenml service-connector register cloud_connector --type aws --auto-configure -``` - - -There are [many ways to create a GCP service connector](/develop/usage/resource-data-management/auth-management/gcp-service-connector#authentication-methods), but for the sake of this guide, we recommend creating one by [using the Service Account method](/develop/usage/resource-data-management/auth-management/gcp-service-connector#gcp-service-account). - -```Bash -zenml service-connector register cloud_connector --type gcp --auth-method service-account --service_account_json=@ --project_id= --generate_temporary_tokens=False -``` - - -There are [many ways to create an Azure service connector](/develop/usage/resource-data-management/auth-management/azure-service-connector#authentication-methods), but for the sake of this guide, we recommend creating one by [using the Service Principal method](/develop/usage/resource-data-management/auth-management/azure-service-connector#azure-service-principal). - -```Bash -zenml service-connector register cloud_connector --type azure --auth-method service-principal --tenant_id= --client_id= --client_secret= -``` - - - - -Once we have our service connector, we can now attach it to stack components. In this case, we are going to connect it to our remote artifact store: - -```Bash -zenml artifact-store connect cloud_artifact_store --connector cloud_connector -``` - -Now, every time you (or anyone else with access) uses the `cloud_artifact_store`, they will be granted a temporary token that will grant them access to the remote storage. Therefore, your colleagues don't need to worry about setting up credentials and installing clients locally! 
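If you want to double-check that everything is wired up, a quick inspection might look like the sketch below (the component and connector names assume the examples above):

```Bash
# Confirm that the artifact store is now linked to the service connector
zenml artifact-store describe cloud_artifact_store

# Optionally, list the resources the service connector is able to access
zenml service-connector list-resources
```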
- -## Running a pipeline on a cloud stack - -Now that we have our remote artifact store registered, we can [register a new stack](/develop/user-guide/guides/production-guide/understand-stacks#registering-a-stack) with it, just like we did in the previous chapter: - - -```bash -zenml stack register local_with_remote_storage -o default -a cloud_artifact_store -``` - - - - - - - - - -Now, using the [code from the previous chapter](/develop/user-guide/guides/production-guide/understand-stacks#run-a-pipeline-on-the-new-local-stack), we run a training pipeline: - -Set our `local_with_remote_storage` stack active: - -```Bash -zenml stack set local_with_remote_storage -``` - -Let us continue with the example from the previous page and run the training pipeline: - -```Bash -python run.py --training-pipeline -``` - -When you run that pipeline, ZenML will automatically store the artifacts in the specified remote storage, ensuring that they are preserved and accessible for future runs and by your team members. You can ask your colleagues to connect to the same [ZenML server](/develop/user-guide/guides/production-guide/deploying-zenml), and you will notice that if they run the same pipeline, the pipeline would be partially cached, **even if they have not run the pipeline themselves before**. - -You can list your artifact versions as follows: - - -```Bash -# This will give you the artifacts from the last 15 minutes - -zenml artifact version list --created="gte:$(date -v-15M '+%Y-%m-%d %H:%M:%S')" -``` - - -[ZenML Pro](https://zenml.io/pro) features an [Artifact Control Plane](/develop/user-guide/guides/starter-guide/manage-artifacts) to visualize artifact versions: - - - - - - - - -You will notice above that some artifacts are stored locally, while others are stored in a remote storage location. - -By connecting remote storage, you're taking a significant step towards building a collaborative and scalable MLOps workflow. Your artifacts are no longer tied to a single machine but are now part of a cloud-based ecosystem, ready to be shared and built upon. diff --git a/develop/user-guide/guides/production-guide/understand-stacks.mdx b/develop/user-guide/guides/production-guide/understand-stacks.mdx deleted file mode 100644 index 76a1fba..0000000 --- a/develop/user-guide/guides/production-guide/understand-stacks.mdx +++ /dev/null @@ -1,236 +0,0 @@ ---- -title: "Understanding stacks" -description: "Learning how to switch the infrastructure backend of your code." -icon: layer-group ---- - -Now that we have ZenML deployed, we can take the next steps in making sure that our machine learning workflows are production-ready. As you were running [your first pipelines](/develop/user-guide/guides/starter-guide/create-an-ml-pipeline), you might have already noticed the term `stack` in the logs and on the dashboard. - -A `stack` is the configuration of tools and infrastructure that your pipelines can run on. When you run ZenML code without configuring a stack, the pipeline will run on the so-called `default` stack. - - - - -### Separation of code from configuration and infrastructure - -As visualized in the diagram above, there are two separate domains that are connected through ZenML. The left side shows the code domain. The user's Python code is translated into a ZenML pipeline. On the right side, you can see the infrastructure domain, in this case, an instance of the `default` stack. By separating these two domains, it is easy to switch the environment that the pipeline runs on without making any changes in the code. 
It also allows domain experts to write code/configure infrastructure without worrying about the other domain. - -### The `default` stack - - - -`zenml stack describe` lets you find out details about your active stack: - -```Bash -... - Stack Configuration -┏━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓ -┃ COMPONENT_TYPE │ COMPONENT_NAME ┃ -┠────────────────┼────────────────┨ -┃ ARTIFACT_STORE │ default ┃ -┠────────────────┼────────────────┨ -┃ ORCHESTRATOR │ default ┃ -┗━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛ - 'default' stack (ACTIVE) -Stack 'default' with id '...' is owned by user default and is 'private'. -... -``` - -`zenml stack list` lets you see all stacks that are registered in your zenml deployment. - -```Bash -... -┏━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┓ -┃ ACTIVE │ STACK NAME │ STACK ID │ SHARED │ OWNER │ ARTIFACT_STORE │ ORCHESTRATOR ┃ -┠────────┼────────────┼───────────┼────────┼─────────┼────────────────┼──────────────┨ -┃ 👉 │ default │ ... │ ➖ │ default │ default │ default ┃ -┗━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┛ -... -``` - - -As you can see a stack can be **active** on your **client**. This simply means that any pipeline you run will be using the **active stack** as its environment. - - - - - -## Components of a stack - -As you can see in the section above, a stack consists of multiple components. All stacks have at minimum an **orchestrator** and an **artifact store**. - -### Orchestrator - -The **orchestrator** is responsible for executing the pipeline code. In the simplest case, this will be a simple Python thread on your machine. Let's explore this default orchestrator. - - - -`zenml orchestrator list` lets you see all orchestrators that are registered in your zenml deployment. -┏━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┓ -┃ ACTIVE │ NAME │ COMPONENT ID │ FLAVOR │ SHARED │ OWNER ┃ -┠────────┼─────────┼──────────────┼────────┼────────┼─────────┨ -┃ 👉 │ default │ ... │ local │ ➖ │ default ┃ -┗━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┛ - - - - -### Artifact store - -The **artifact store** is responsible for persisting the step outputs. As we learned in the previous section, the step outputs are not passed along in memory, rather the outputs of each step are stored in the **artifact store** and then loaded from there when the next step needs them. By default this will also be on your own machine: - - - -`zenml artifact-store list` lets you see all artifact stores that are registered in your zenml deployment. -```Bash -┏━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━┯━━━━━━━━━┓ -┃ ACTIVE │ NAME │ COMPONENT ID │ FLAVOR │ SHARED │ OWNER ┃ -┠────────┼─────────┼──────────────┼────────┼────────┼─────────┨ -┃ 👉 │ default │ ... │ local │ ➖ │ default ┃ -┗━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━┷━━━━━━━━━┛ -``` - - - - -### Other stack components - -There are many more components that you can add to your stacks, like experiment trackers, model deployers, and more. You can see all supported stack component types in a single table view [here](/develop/stack-components/component-guide) - -Perhaps the most important stack component after the orchestrator and the artifact store is the [container registry](/develop/stack-components/container-registries). A container registry stores all your containerized images, which hold all your code and the environment needed to execute them. We will learn more about them in the next section! 
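Besides the CLI commands shown above, you can also inspect the active stack programmatically. The snippet below is a minimal sketch that assumes the `zenml.client.Client` Python API and a recent ZenML version:

```py
from zenml.client import Client

# The client resolves whichever stack is currently active for your ZenML deployment
active_stack = Client().active_stack_model
print(f"Active stack: {active_stack.name}")

# A stack is just a named collection of components, keyed by component type
for component_type, components in active_stack.components.items():
    print(component_type, [component.name for component in components])
```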
- -## Registering a stack - -Just to illustrate how to interact with stacks, let's create an alternate local stack. We start by first creating a local artifact store. - -### Create an artifact store - - - -```Bash -zenml artifact-store register my_artifact_store --flavor=local -``` - -Let's understand the individual parts of this command: - -* `artifact-store` : This describes the top-level group, to find other stack components simply run `zenml --help` -* `register` : Here we want to register a new component, instead, we could also `update` , `delete` and more `zenml artifact-store --help` will give you all possibilities -* `my_artifact_store` : This is the unique name that the stack component will have. -* `--flavor=local`: A flavor is a possible implementation for a stack component. So in the case of an artifact store, this could be an s3-bucket or a local filesystem. You can find out all possibilities with `zenml artifact-store flavor --list` - -This will be the output that you can expect from the command above. - -```Bash -Using the default local database. - -Running with active stack: 'default' (global) - -Successfully registered artifact_store `my_artifact_store`.bash -``` - -To see the new artifact store that you just registered, just run: - -```Bash -zenml artifact-store describe my_artifact_store -``` - - - - -### Create a local stack - -With the artifact store created, we can now create a new stack with this artifact store. - - - -```Bash -zenml stack register a_new_local_stack -o default -a my_artifact_store -``` - -* `stack` : This is the CLI group that enables interactions with the stacks -* `register`: Here we want to register a new stack. Explore other operations with`zenml stack --help`. -* `a_new_local_stack` : This is the unique name that the stack will have. -* `--orchestrator` or `-o` are used to specify which orchestrator to use for the stack -* `--artifact-store` or `-a` are used to specify which artifact store to use for the stack - -The output for the command should look something like this: - -```Bash -Using the default local database. - -Stack 'a_new_local_stack' successfully registered! -``` - -You can inspect the stack with the following command: - -```Bash - zenml stack describe a_new_local_stack -``` - -Which will give you an output like this: - -```Bash - Stack Configuration -┏━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━┓ -┃ COMPONENT_TYPE │ COMPONENT_NAME ┃ -┠────────────────┼───────────────────┨ -┃ ORCHESTRATOR │ default ┃ -┠────────────────┼───────────────────┨ -┃ ARTIFACT_STORE │ my_artifact_store ┃ -┗━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━┛ - 'a_new_local_stack' stack -Stack 'a_new_local_stack' with id '...' is owned by user default and is 'private'. -``` - - - - -### Switch stacks with our VS Code extension - - - - -If you are using [our VS Code extension](https://marketplace.visualstudio.com/items?itemName=ZenML.zenml-vscode), you can easily view and switch your stacks by opening the sidebar (click on the ZenML icon). You can then click on the stack you want to switch to as well as view the stack components it's made up of. - -### Run a pipeline on the new local stack - -Let's use the pipeline in our starter project from the [previous guide](/develop/user-guide/guides/starter-guide/starter-project) to see it in action. 
- -If you have not already, clone the starter template: - -```Bash -pip install "zenml[templates,server]" notebook -zenml integration install sklearn -y -mkdir zenml_starter -cd zenml_starter -zenml init --template starter --template-with-defaults - -# Just in case, we install the requirements again -pip install -r requirements.txt -``` - -The starter template is the same as the [ZenML mlops starter example](https://github.com/zenml-io/zenml/tree/main/examples/mlops%5Fstarter). You can clone it like so: -```Bash -git clone --depth 1 git@github.com:zenml-io/zenml.git -cd zenml/examples/mlops_starter -pip install -r requirements.txt -zenml init -``` - - -To run a pipeline using the new stack: - -1. Set the stack as active on your client - -```Bash -zenml stack set a_new_local_stack -``` -2. Run your pipeline code: - -```Bash -python run.py --training-pipeline -``` - -Keep this code handy as we'll be using it in the next chapters! diff --git a/develop/user-guide/guides/starter-guide.mdx b/develop/user-guide/guides/starter-guide.mdx deleted file mode 100644 index 49e3948..0000000 --- a/develop/user-guide/guides/starter-guide.mdx +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: "Overview" -description: "Kickstart your journey into MLOps with the essentials of ZenML." -icon: egg ---- - -Welcome to the ZenML Starter Guide! If you're an MLOps engineer aiming to build robust ML platforms, or a data scientist interested in leveraging the power of MLOps, this is the perfect place to begin. Our guide is designed to provide you with the foundational knowledge of the ZenML framework and equip you with the initial tools to manage the complexity of machine learning operations. - - - - - -Throughout this guide, we'll cover essential topics including: - -* [Creating your first ML pipeline](/develop/user-guide/guides/starter-guide/create-an-ml-pipeline) -* [Understanding caching between pipeline steps](/develop/user-guide/guides/starter-guide/cache-previous-executions) -* [Managing data and data versioning](/develop/user-guide/guides/starter-guide/manage-artifacts) -* [Tracking your machine learning models](/develop/user-guide/guides/starter-guide/track-ml-models) - -Before jumping in, make sure you have a Python environment ready and `virtualenv` installed to follow along with ease. By the end, you will have completed a [starter project](/develop/user-guide/guides/starter-guide/starter-project), marking the beginning of your journey into MLOps with ZenML. - -Let this guide be not only your introduction to ZenML but also a foundational asset in your MLOps toolkit. Prepare your development environment, and let's get started! diff --git a/develop/user-guide/guides/starter-guide/cache-previous-executions.mdx b/develop/user-guide/guides/starter-guide/cache-previous-executions.mdx deleted file mode 100644 index 9b35657..0000000 --- a/develop/user-guide/guides/starter-guide/cache-previous-executions.mdx +++ /dev/null @@ -1,181 +0,0 @@ ---- -title: "Cache previous executions" -description: "Iterating quickly with ZenML through caching" -icon: clipboard-check ---- - -Developing machine learning pipelines is iterative in nature. ZenML speeds up development in this work with step caching. - -In the logs of your previous runs, you might have noticed at this point that rerunning the pipeline a second time will use caching on the first step: - -```bash -Step training_data_loader has started. - -Using cached version of training_data_loader. - -Step svc_trainer has started. 
- -Train accuracy: 0.3416666666666667 - -Step svc_trainer has finished in 0.932s. -``` - - - - -ZenML understands that nothing has changed between subsequent runs, so it re-uses the output of the previous run (the outputs are persisted in the [artifact store](/develop/stack-components/artifact-stores)). This behavior is known as **caching**. - -In ZenML, caching is enabled by default. Since ZenML automatically tracks and versions all inputs, outputs, and parameters of steps and pipelines, steps will not be re-executed within the **same pipeline** on subsequent pipeline runs as long as there is **no change** in the inputs, parameters, or code of a step. - - -The caching does not automatically detect changes within the file system or on external APIs. Make sure to **manually** set caching to `False` on steps that depend on **external inputs, file-system changes,** or if the step should run regardless of caching. - -```py -@step(enable_cache=False) -def load_data_from_external_system(...) -> ...: - # This step will always be run -``` - - -## Configuring the caching behavior of your pipelines - -With caching as the default behavior, there will be times when you need to disable it. -```mermaid - -graph LR - A[Pipeline Settings] -->|overwritten by| B[Step Settings] - B -->|overwritten by| C[Changes in Code, Inputs or Parameters] -``` -There are levels at which you can take control of when and where caching is used. - -### Caching at the pipeline level - -On a pipeline level, the caching policy can be set as a parameter within the `@pipeline` decorator as shown below - -```py -@pipeline(enable_cache=False) - -def first_pipeline(....): - - """Pipeline with cache disabled""" -``` - -The setting above will disable caching for all steps in the pipeline unless a step explicitly sets `enable_cache=True` ( see below). - - -When writing your pipelines, be explicit. This makes it clear when looking at the code if caching is enabled or disabled for any given pipeline. - - -#### Dynamically configuring caching for a pipeline run - -Sometimes you want to have control over caching at runtime instead of defaulting to the hard-coded pipeline and step decorator settings. ZenML offers a way to override all caching settings at runtime: - -```bash -first_pipeline = first_pipeline.with_options(enable_cache=False) -``` - -The code above disables caching for all steps of your pipeline, no matter what you have configured in the `@step` or `@pipeline` decorators. - -The `with_options` function allows you to configure all sorts of things this way. We will learn more about it in the [coming chapters](/develop/user-guide/guides/production-guide/configure-pipeline)! - -### Caching at a step-level - -Caching can also be explicitly configured at a step level via a parameter of the `@step` decorator: - -```Bash -@step(enable_cache=False) -def import_data_from_api(...): - """Import most up-to-date data from public api""" - ... -``` - -The code above turns caching off for this step only. 
- -You can also use `with_options` with the step, just as in the pipeline: - -```Bash -import_data_from_api = import_data_from_api.with_options(enable_cache=False) - -# use in your pipeline directly -``` - -## Code Example - -This section combines all the code from this section into one simple script that you can use to see caching easily: - - - -```py - -from typing_extensions import Tuple, Annotated -import pandas as pd -from sklearn.datasets import load_iris -from sklearn.model_selection import train_test_split -from sklearn.base import ClassifierMixin -from sklearn.svm import SVC - -from zenml import pipeline, step -from zenml.logger import get_logger - -logger = get_logger(__name__) - - -@step -def training_data_loader() -> Tuple[ - Annotated[pd.DataFrame, "X_train"], - Annotated[pd.DataFrame, "X_test"], - Annotated[pd.Series, "y_train"], - Annotated[pd.Series, "y_test"], -]: - """Load the iris dataset as tuple of Pandas DataFrame / Series.""" - iris = load_iris(as_frame=True) - X_train, X_test, y_train, y_test = train_test_split( - iris.data, iris.target, test_size=0.2, shuffle=True, random_state=42 - ) - return X_train, X_test, y_train, y_test - - -@step -def svc_trainer( - X_train: pd.DataFrame, - y_train: pd.Series, - gamma: float = 0.001, -) -> Tuple[ - Annotated[ClassifierMixin, "trained_model"], - Annotated[float, "training_acc"], -]: - """Train a sklearn SVC classifier and log to MLflow.""" - model = SVC(gamma=gamma) - model.fit(X_train.to_numpy(), y_train.to_numpy()) - train_acc = model.score(X_train.to_numpy(), y_train.to_numpy()) - print(f"Train accuracy: {train_acc}") - return model, train_acc - - -@pipeline -def training_pipeline(gamma: float = 0.002): - X_train, X_test, y_train, y_test = training_data_loader() - svc_trainer(gamma=gamma, X_train=X_train, y_train=y_train) - - -if __name__ == "__main__": - training_pipeline() - - # Step one will use cache, step two will rerun. - # ZenML will detect a different value for the - # `gamma` input of the second step and disable caching. - logger.info("\n\nFirst step cached, second not due to parameter change") - training_pipeline(gamma=0.0001) - - # This will disable cache for the second step. - logger.info("\n\nFirst step cached, second not due to settings") - svc_trainer = svc_trainer.with_options(enable_cache=False) - training_pipeline() - - # This will disable cache for all steps. - logger.info("\n\nCaching disabled for the entire pipeline") - training_pipeline.with_options(enable_cache=False)() -``` - diff --git a/develop/user-guide/guides/starter-guide/create-an-ml-pipeline.mdx b/develop/user-guide/guides/starter-guide/create-an-ml-pipeline.mdx deleted file mode 100644 index 98c083c..0000000 --- a/develop/user-guide/guides/starter-guide/create-an-ml-pipeline.mdx +++ /dev/null @@ -1,346 +0,0 @@ ---- -title: "Create an ML pipeline" -description: "Start with the basics of steps and pipelines." -icon: paper-plane ---- - -In the quest for production-ready ML models, workflows can quickly become complex. Decoupling and standardizing stages such as data ingestion, preprocessing, and model evaluation allows for more manageable, reusable, and scalable processes. ZenML pipelines facilitate this by enabling each stage—represented as **Steps**—to be modularly developed and then integrated smoothly into an end-to-end **Pipeline**. - -Leveraging ZenML, you can create and manage robust, scalable machine learning (ML) pipelines. 
Whether for data preparation, model training, or deploying predictions, ZenML standardizes and streamlines the process, ensuring reproducibility and efficiency. - - - - - - -Before starting this guide, make sure you have [installed ZenML](/develop/getting-started/installation): - -```Bash -pip install "zenml[server]" -zenml up # Will launch the dashboard locally -``` - - -## Start with a simple ML pipeline - -Let's jump into an example that demonstrates how a simple pipeline can be set up in ZenML, featuring actual ML components to give you a better sense of its application. - -```py -from zenml import pipeline, step - -@step -def load_data() -> dict: - """Simulates loading of training data and labels.""" - - training_data = [[1, 2], [3, 4], [5, 6]] - labels = [0, 1, 0] - - return {'features': training_data, 'labels': labels} - -@step -def train_model(data: dict) -> None: - """ - A mock 'training' process that also demonstrates using the input data. - In a real-world scenario, this would be replaced with actual model fitting logic. - """ - total_features = sum(map(sum, data['features'])) - total_labels = sum(data['labels']) - - print(f"Trained model using {len(data['features'])} data points. " - f"Feature sum is {total_features}, label sum is {total_labels}") - -@pipeline -def simple_ml_pipeline(): - """Define a pipeline that connects the steps.""" - dataset = load_data() - train_model(dataset) - -if __name__ == "__main__": - run = simple_ml_pipeline() - # You can now use the `run` object to see steps, outputs, etc. -``` - - -* `**@step**` is a decorator that converts its function into a step that can be used within a pipeline -* `**@pipeline**` defines a function as a pipeline and within this function, the steps are called and their outputs link them together. - - -Copy this code into a new file and name it `run.py`. Then run it with your command line: - - -```bash -$ python run.py - -Initiating a new run for the pipeline: simple_ml_pipeline. -Executing a new run. -Using user: hamza@zenml.io -Using stack: default - orchestrator: default - artifact_store: default -Step load_data has started. -Step load_data has finished in 0.385s. -Step train_model has started. -Trained model using 3 data points. Feature sum is 21, label sum is 1 -Step train_model has finished in 0.265s. -Run simple_ml_pipeline-2023_11_23-10_51_59_657489 has finished in 1.612s. -Pipeline visualization can be seen in the ZenML Dashboard. Run zenml up to see your pipeline! -``` - -### Explore the dashboard - -Once the pipeline has finished its execution, use the `zenml up` command to view the results in the ZenML Dashboard. Using that command will open up the browser automatically. - - - - -Usually, the dashboard is accessible at [http://127.0.0.1:8237/](http://127.0.0.1:8237/). Log in with the default username **"default"** (password not required) and see your recently run pipeline. Browse through the pipeline components, such as the execution history and artifacts produced by your steps. Use the DAG visualization to understand the flow of data and to ensure all steps are completed successfully. - - - - -For further insights, explore the logging and artifact information associated with each step, which can reveal details about the data and intermediate results. - -If you have closed the browser tab with the ZenML dashboard, you can always reopen it by running `zenml show` in your terminal. 
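Because the pipeline call above returned a run object (`run = simple_ml_pipeline()`), you can also explore the same information from Python. This is a rough sketch that assumes the run object exposes `name`, `status`, and `steps` the way recent ZenML versions do:

```py
# Continuing from `run = simple_ml_pipeline()` above
print(f"Run name: {run.name}, status: {run.status}")

# Inspect the individual steps of the run
for step_name, step_run in run.steps.items():
    print(step_name, step_run.status)
```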
- -## Understanding steps and artifacts - -When you ran the pipeline, each individual function that ran is shown in the DAG visualization as a `step` and is marked with the function name. Steps are connected with `artifacts`, which are simply the objects that are returned by these functions and input into downstream functions. This simple logic lets us break down our entire machine learning code into a sequence of tasks that pass data between each other. - -The artifacts produced by your steps are automatically stored and versioned by ZenML. The code that produced these artifacts is also automatically tracked. The parameters and all other configuration is also automatically captured. - -So you can see, by simply structuring your code within some functions and adding some decorators, we are one step closer to having a more tracked and reproducible codebase! - -## Expanding to a Full Machine Learning Workflow - -With the fundamentals in hand, let’s escalate our simple pipeline to a complete ML workflow. For this task, we will use the well-known Iris dataset to train a Support Vector Classifier (SVC). - -Let's start with the imports. - -```py - -from typing_extensions import Annotated # or `from typing import Annotated on Python 3.9+ -from typing import Tuple -import pandas as pd -from sklearn.datasets import load_iris -from sklearn.model_selection import train_test_split -from sklearn.base import ClassifierMixin -from sklearn.svm import SVC - -from zenml import pipeline, step -``` - -Make sure to install the requirements as well: - - -```Bash -pip install matplotlib - -zenml integration install sklearn -y -``` - -In this case, ZenML has an integration with `sklearn` so you can use the ZenML CLI to install the right version directly. - - - The `zenml integration install sklearn` command is simply doing a `pip install` of `sklearn` behind the scenes. If something goes wrong, one can always use `zenml integration requirements sklearn` to see which requirements are compatible and install using pip (or any other tool) directly. (If no specific requirements are mentioned for an integration then this means we support using all possible versions of that integration/package.) - - -### Define a data loader with multiple outputs - -A typical start of an ML pipeline is usually loading data from some source. This step will sometimes have multiple outputs. To define such a step, use a `Tuple` type annotation. Additionally, you can use the `Annotated` annotation to assign [custom output names](/develop/user-guide/guides/starter-guide/manage-artifacts#giving-names-to-your-artifacts). Here we load an open-source dataset and split it into a train and a test dataset. - -```py -import logging - -@step -def training_data_loader() -> Tuple[ - # Notice we use a Tuple and Annotated to return - # multiple named outputs - Annotated[pd.DataFrame, "X_train"], - Annotated[pd.DataFrame, "X_test"], - Annotated[pd.Series, "y_train"], - Annotated[pd.Series, "y_test"], -]: - """Load the iris dataset as a tuple of Pandas DataFrame / Series.""" - logging.info("Loading iris...") - iris = load_iris(as_frame=True) - logging.info("Splitting train and test...") - X_train, X_test, y_train, y_test = train_test_split( - iris.data, iris.target, test_size=0.2, shuffle=True, random_state=42 - ) - return X_train, X_test, y_train, y_test -``` - - - ZenML records the root python logging handler's output into the artifact store as a side-effect of running a step. 
Therefore, when writing steps, use the `logging` module to record logs, to ensure that these logs then show up in the ZenML dashboard. - - -### Create a parameterized training step - -Here we are creating a training step for a support vector machine classifier with `sklearn`. As we might want to adjust the hyperparameter `gamma` later on, we define it as an input value to the step as well. - - -```py -@step -def svc_trainer( - X_train: pd.DataFrame, - y_train: pd.Series, - gamma: float = 0.001, -) -> Tuple[ - Annotated[ClassifierMixin, "trained_model"], - Annotated[float, "training_acc"], -]: - """Train a sklearn SVC classifier.""" - - model = SVC(gamma=gamma) - model.fit(X_train.to_numpy(), y_train.to_numpy()) - - train_acc = model.score(X_train.to_numpy(), y_train.to_numpy()) - print(f"Train accuracy: {train_acc}") - - return model, train_acc -``` - - -If you want to run just a single step on your ZenML stack, all you need to do is call the step function outside of a ZenML pipeline. For example: - -```bash -model, train_acc = svc_trainer(X_train=..., y_train=...) -``` - - -Next, we will combine our two steps into a pipeline and run it. As you can see, the parameter gamma is configurable as a pipeline input as well. - -```py -@pipeline -def training_pipeline(gamma: float = 0.002): - X_train, X_test, y_train, y_test = training_data_loader() - svc_trainer(gamma=gamma, X_train=X_train, y_train=y_train) - - -if __name__ == "__main__": - training_pipeline(gamma=0.0015) -``` - - -Best Practice: Always nest the actual execution of the pipeline inside an `if __name__ == "__main__"` condition. This ensures that loading the pipeline from elsewhere does not also run it. - - -```py -if __name__ == "__main__": - - training_pipeline() -``` - - -Running `python run.py` should look somewhat like this in the terminal: - - -```Bash -Registered new pipeline with name `training_pipeline`. -. -. -. -Pipeline run `training_pipeline-2023_04_29-09_19_54_273710` has finished in 0.236s. -``` - -In the dashboard, you should now be able to see this new run, along with its runtime configuration and a visualization of the training data. - - - - - -### Configure with a YAML file - -Instead of configuring your pipeline runs in code, you can also do so from a YAML file. This is best when we do not want to make unnecessary changes to the code; in production this is usually the case. - -To do this, simply reference the file like this: - -```py -# Configure the pipeline -training_pipeline = training_pipeline.with_options( - config_path='/local/path/to/config.yaml' -) -# Run the pipeline -training_pipeline() -``` - -The reference to a local file will change depending on where you are executing the pipeline and code from, so please bear this in mind. It is best practice to put all config files in a configs directory at the root of your repository and check them into git history. - -A simple version of such a YAML file could be: - -```bash -parameters: - gamma: 0.01 -``` - -Please note that this would take precedence over any parameters passed in the code. - -If you are unsure how to format this config file, you can generate a template config file from a pipeline. - - -```py -training_pipeline.write_run_configuration_template(path='/local/path/to/config.yaml') -``` - -Check out [this section](/develop/usage/project-setup/use-configuration-files) for advanced configuration options. 
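To make this concrete, here is a small hypothetical `config.yaml` that combines the options discussed so far (parameters, caching, and Docker settings). Treat it as a sketch rather than a generated template; the exact keys available depend on your pipeline and ZenML version:

```yaml
# Hypothetical run configuration for training_pipeline
enable_cache: False
parameters:
  gamma: 0.01
settings:
  docker:
    required_integrations:
      - sklearn
```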
-
-## Full Code Example
-
-This section combines all of the code above into one simple script that you can run easily:
-
-```py
-
-from typing import Tuple
-from typing_extensions import Annotated
-import pandas as pd
-from sklearn.datasets import load_iris
-from sklearn.model_selection import train_test_split
-from sklearn.base import ClassifierMixin
-from sklearn.svm import SVC
-
-from zenml import pipeline, step
-
-
-@step
-def training_data_loader() -> Tuple[
-    Annotated[pd.DataFrame, "X_train"],
-    Annotated[pd.DataFrame, "X_test"],
-    Annotated[pd.Series, "y_train"],
-    Annotated[pd.Series, "y_test"],
-]:
-    """Load the iris dataset as a tuple of Pandas DataFrame / Series."""
-    iris = load_iris(as_frame=True)
-    X_train, X_test, y_train, y_test = train_test_split(
-        iris.data, iris.target, test_size=0.2, shuffle=True, random_state=42
-    )
-    return X_train, X_test, y_train, y_test
-
-
-@step
-def svc_trainer(
-    X_train: pd.DataFrame,
-    y_train: pd.Series,
-    gamma: float = 0.001,
-) -> Tuple[
-    Annotated[ClassifierMixin, "trained_model"],
-    Annotated[float, "training_acc"],
-]:
-    """Train a sklearn SVC classifier."""
-    model = SVC(gamma=gamma)
-    model.fit(X_train.to_numpy(), y_train.to_numpy())
-    train_acc = model.score(X_train.to_numpy(), y_train.to_numpy())
-    print(f"Train accuracy: {train_acc}")
-    return model, train_acc
-
-
-@pipeline
-def training_pipeline(gamma: float = 0.002):
-    X_train, X_test, y_train, y_test = training_data_loader()
-    svc_trainer(gamma=gamma, X_train=X_train, y_train=y_train)
-
-
-if __name__ == "__main__":
-    training_pipeline()
-```
-
diff --git a/develop/user-guide/guides/starter-guide/manage-artifacts.mdx b/develop/user-guide/guides/starter-guide/manage-artifacts.mdx
deleted file mode 100644
index 84aa1f0..0000000
--- a/develop/user-guide/guides/starter-guide/manage-artifacts.mdx
+++ /dev/null
@@ -1,484 +0,0 @@
----
-title: "Manage artifacts"
-description: "Understand and adjust how ZenML versions your data."
-icon: palette
----
-
-Data sits at the heart of every machine learning workflow. Managing and versioning this data correctly is essential for reproducibility and traceability within your ML pipelines. ZenML takes a proactive approach to data versioning, ensuring that every artifact—be it data, models, or evaluations—is automatically tracked and versioned upon pipeline execution.
-
-
-
-
-This guide will delve into artifact versioning and management, showing you how to efficiently name, organize, and utilize your data with the ZenML framework.
-
-## Managing artifacts produced by ZenML pipelines
-
-Artifacts, the outputs of your steps and pipelines, are automatically versioned and stored in the artifact store. Configuring these artifacts is pivotal for transparent and efficient pipeline development.
-
-### Giving names to your artifacts
-
-Assigning custom names to your artifacts can greatly enhance their discoverability and manageability.
As best practice, utilize the `Annotated` object within your steps to give precise, human-readable names to outputs: - -```py -from typing_extensions import Annotated -import pandas as pd -from sklearn.datasets import load_iris - -from zenml import pipeline, step - -# Using Annotated to name our dataset -@step -def training_data_loader() -> Annotated[pd.DataFrame, "iris_dataset"]: - """Load the iris dataset as pandas dataframe.""" - iris = load_iris(as_frame=True) - return iris.get("frame") - - -@pipeline -def feature_engineering_pipeline(): - training_data_loader() - - -if __name__ == "__main__": - feature_engineering_pipeline() -``` - - -Unspecified artifact outputs default to a naming pattern of `{pipeline_name}::{step_name}::output`. For visual exploration in the ZenML dashboard, it's best practice to give significant outputs clear custom names. - - -Artifacts named `iris_dataset` can then be found swiftly using various ZenML interfaces: - - -To list artifacts: `zenml artifact list` - - -The [ZenML Pro](https://zenml.io/pro) dashboard offers advanced visualization features for artifact exploration. - - - - - - -To prevent visual clutter, make sure to assign names to your most important artifacts that you would like to explore visually. - -### Versioning artifacts manually - -ZenML automatically versions all created artifacts using auto-incremented numbering. I.e., if you have defined a step creating an artifact named `iris_dataset` as shown above, the first execution of the step will create an artifact with this name and version "1", the second execution will create version "2", and so on. - -While ZenML handles artifact versioning automatically, you have the option to specify custom versions using the [ArtifactConfig](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-model/#zenml.model.artifact%5Fconfig.DataArtifactConfig). This may come into play during critical runs like production releases. - -```py -from zenml import step, ArtifactConfig - -@step -def training_data_loader() -> ( - Annotated[ - pd.DataFrame, - # Add `ArtifactConfig` to control more properties of your artifact - ArtifactConfig( - name="iris_dataset", - version="raw_2023" - ), - ] -): - ... -``` - -The next execution of this step will then create an artifact with the name `iris_dataset` and version `raw_2023`. This is primarily useful if you are making a particularly important pipeline run (such as a release) whose artifacts you want to distinguish at a glance later. - - -Since custom versions cannot be duplicated, the above step can only be run once successfully. To avoid altering your code frequently, consider using a [YAML config](/develop/user-guide/guides/production-guide/configure-pipeline) for artifact versioning. - - -After execution, `iris_dataset` and its version `raw_2023` can be seen using: - - -To list versions: `zenml artifact version list` - - -The Cloud dashboard visualizes version history for your review. - - - - - - -### Add metadata and tags to artifacts - -If you would like to extend your artifacts with extra metadata or tags you can do so by following the patterns demonstrated below: - -```py -from zenml import step, get_step_context, ArtifactConfig -from typing_extensions import Annotated - - -# below we annotate output with `ArtifactConfig` giving it a name, -# run_metadata and tags. 
As a result, the created artifact -# `artifact_name` will get configured with metadata and tags -@step -def annotation_approach() -> ( - Annotated[ - str, - ArtifactConfig( - name="artifact_name", - run_metadata={"metadata_key": "metadata_value"}, - tags=["tag_name"], - ), - ] -): - return "string" - - -# below we annotate output using functional approach with -# run_metadata and tags. As a result, the created artifact -# `artifact_name` will get configured with metadata and tags -@step -def annotation_approach() -> Annotated[str, "artifact_name"]: - step_context = get_step_context() - step_context.add_output_metadata( - output_name="artifact_name", metadata={"metadata_key": "metadata_value"} - ) - step_context.add_output_tags(output_name="artifact_name", tags=["tag_name"]) - return "string" - - -# below we combine both approaches, so the artifact will get -# metadata and tags from both sources -@step -def annotation_approach() -> ( - Annotated[ - str, - ArtifactConfig( - name="artifact_name", - run_metadata={"metadata_key": "metadata_value"}, - tags=["tag_name"], - ), - ] -): - step_context = get_step_context() - step_context.add_output_metadata( - output_name="artifact_name", metadata={"metadata_key2": "metadata_value2"} - ) - step_context.add_output_tags(output_name="artifact_name", tags=["tag_name2"]) - return "string" -``` - -### Consuming external artifacts within a pipeline - -While most pipelines start with a step that produces an artifact, it is often the case to want to consume artifacts external from the pipeline. The `ExternalArtifact` class can be used to initialize an artifact within ZenML with any arbitrary data type. - -For example, let's say we have a Snowflake query that produces a dataframe, or a CSV file that we need to read. External artifacts can be used for this, to pass values to steps that are neither JSON serializable nor produced by an upstream step: - -```py -import numpy as np -from zenml import ExternalArtifact, pipeline, step - -@step -def print_data(data: np.ndarray): - print(data) - -@pipeline -def printing_pipeline(): - # One can also pass data directly into the ExternalArtifact - # to create a new artifact on the fly - data = ExternalArtifact(value=np.array([0])) - - print_data(data=data) - - -if __name__ == "__main__": - printing_pipeline() -``` - -Optionally, you can configure the `ExternalArtifact` to use a custom [materializer](/develop/usage/resource-data-management/handle-data-artifacts/handle-custom-data-types) for your data or disable artifact metadata and visualizations. Check out the [SDK docs](https://sdkdocs.zenml.io/latest/core%5Fcode%5Fdocs/core-artifacts/#zenml.artifacts.external%5Fartifact.ExternalArtifact) for all available options. - - -Using an `ExternalArtifact` for your step automatically disables caching for the step. - - -### Consuming artifacts produced by other pipelines - -It is also common to consume an artifact downstream after producing it in an upstream pipeline or step. As we have learned in the [previous section](/develop/usage/pipelines/build-pipelines/fetching-pipelines#fetching-artifacts-directly), the `Client` can be used to fetch artifacts directly inside the pipeline code: - -```py -from uuid import UUID -import pandas as pd -from zenml import step, pipeline -from zenml.client import Client - - -@step -def trainer(dataset: pd.DataFrame): - ... 
- -@pipeline -def training_pipeline(): - client = Client() - # Fetch by ID - dataset_artifact = client.get_artifact_version( - name_id_or_prefix=UUID("3a92ae32-a764-4420-98ba-07da8f742b76") - ) - - # Fetch by name alone - uses the latest version of this artifact - dataset_artifact = client.get_artifact_version(name_id_or_prefix="iris_dataset") - - # Fetch by name and version - dataset_artifact = client.get_artifact_version( - name_id_or_prefix="iris_dataset", version="raw_2023" - ) - - # Pass into any step - trainer(dataset=dataset_artifact) - - -if __name__ == "__main__": - training_pipeline() -``` - - -Calls of `Client` methods like `get_artifact_version` directly inside the pipeline code makes use of ZenML's [late materialization](/develop/usage/resource-data-management/handle-data-artifacts/load-artifacts-into-memory) behind the scenes. - - -If you would like to bypass materialization entirely and just download the data or files associated with a particular artifact version, you can use the `.download_files` method: - -```py -from zenml.client import Client - -client = Client() -artifact = client.get_artifact_version(name_id_or_prefix="iris_dataset") -artifact.download_files("path/to/save.zip") -``` - -Take note that the path must have the `.zip` extension, as the artifact data will be saved as a zip file. Make sure to handle any exceptions that may arise from this operation. - -## Managing artifacts **not** produced by ZenML pipelines - -Sometimes, artifacts can be produced completely outside of ZenML. A good example of this is the predictions produced by a deployed model. - -```py -# A model is deployed, running in a FastAPI container -# Let's use the ZenML client to fetch the latest model and make predictions - -from zenml.client import Client -from zenml import save_artifact - -# Fetch the model from a registry or a previous pipeline -model = ... - -# Let's make a prediction -prediction = model.predict([[1, 1, 1, 1]]) - -# We now store this prediction in ZenML as an artifact -# This will create a new artifact version -save_artifact(prediction, name="iris_predictions") -``` - -You can also load any artifact stored within ZenML using the `load_artifact` method: - - -```py -# Loads the latest version - -load_artifact("iris_predictions") -``` - - -`load_artifact` is simply short-hand for the following Client call: - -```py -from zenml.client import Client - -client = Client() -client.get_artifact("iris_predictions").load() -``` - - -Even if an artifact is created externally, it can be treated like any other artifact produced by ZenML steps - with all the functionalities described above! - - -It is also possible to use these functions inside your ZenML steps. However, it is usually cleaner to return the artifacts as outputs of your step to save them, or to use External Artifacts to load them instead. - - -## Logging metadata for an artifact - -One of the most useful ways of interacting with artifacts in ZenML is the ability to associate metadata with them. [As mentioned before](/develop/usage/pipelines/build-pipelines/fetching-pipelines#artifact-information), artifact metadata is an arbitrary dictionary of key-value pairs that are useful for understanding the nature of the data. - -As an example, one can associate the results of a model training alongside a model artifact, the shape of a table alongside a `pandas` dataframe, or the size of an image alongside a PNG file. - -For some artifacts, ZenML automatically logs metadata. 
As an example, for `pandas.Series` and `pandas.DataFrame` objects, ZenML logs the shape and size of the objects:
-
-
-
-```py
-from zenml.client import Client
-
-# Get an artifact version (e.g. pd.DataFrame)
-artifact = Client().get_artifact_version('50ce903f-faa6-41f6-a95f-ff8c0ec66010')
-
-# Fetch its metadata
-artifact.run_metadata["storage_size"].value  # Size in bytes
-artifact.run_metadata["shape"].value  # Shape e.g. (500,20)
-```
-
-
-The metadata of an artifact can be inspected in the DAG visualizer interface on the OSS dashboard:
-
-
-
-
-
-The [ZenML Pro](https://zenml.io/pro) dashboard offers advanced visualization features for artifact exploration, including a dedicated artifacts tab with metadata visualization:
-
-
-
-
-
-
-
-You can also add metadata to an artifact directly within a step using the `log_artifact_metadata` method:
-
-```py
-from typing import Tuple
-from typing_extensions import Annotated
-
-import numpy as np
-from sklearn.base import ClassifierMixin
-
-from zenml import ArtifactConfig, step, log_artifact_metadata
-
-@step
-def model_finetuner_step(
-    model: ClassifierMixin, dataset: Tuple[np.ndarray, np.ndarray]
-) -> Annotated[
-    ClassifierMixin, ArtifactConfig(name="my_model", tags=["SVC", "trained"])
-]:
-    """Finetunes a given model on a given dataset."""
-    model.fit(dataset[0], dataset[1])
-    accuracy = model.score(dataset[0], dataset[1])
-
-    log_artifact_metadata(
-        # Artifact name can be omitted if step returns only one output
-        artifact_name="my_model",
-        # Passing None or omitting this will use the `latest` version
-        version=None,
-        # Metadata should be a dictionary of JSON-serializable values
-        metadata={"accuracy": float(accuracy)}
-        # A dictionary of dictionaries can also be passed to group metadata
-        # in the dashboard
-        # metadata = {"metrics": {"accuracy": accuracy}}
-    )
-    return model
-```
-
-For further depth, there is an [advanced metadata logging guide](/develop/usage/resource-data-management/track-metrics-metadata) that goes into more detail about logging metadata in ZenML.
-
-Additionally, there is a lot more to learn about artifacts within ZenML. Please read the [dedicated data management guide](/develop/usage/resource-data-management/handle-data-artifacts) for more information.
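-
-Before moving on, note that metadata logged this way can be read back with the same client pattern shown earlier. The snippet below is a sketch that assumes the `my_model` artifact produced by the `model_finetuner_step` example above; adjust the name to whatever your own step outputs:
-
-```py
-from zenml.client import Client
-
-# Fetch the latest version of the artifact produced by the step above
-artifact = Client().get_artifact_version("my_model")
-
-# Read back the custom metadata that was logged with `log_artifact_metadata`
-print(artifact.run_metadata["accuracy"].value)
-```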
- -## Code example - -This section combines all the code from this section into one simple script that you can use easily: - -```py -from typing import Optional, Tuple -from typing_extensions import Annotated - -import numpy as np -from sklearn.base import ClassifierMixin -from sklearn.datasets import load_digits -from sklearn.svm import SVC -from zenml import ArtifactConfig, pipeline, step, log_artifact_metadata -from zenml import save_artifact, load_artifact -from zenml.client import Client - - -@step -def versioned_data_loader_step() -> ( - Annotated[ - Tuple[np.ndarray, np.ndarray], - ArtifactConfig( - name="my_dataset", - tags=["digits", "computer vision", "classification"], - ), - ] -): - """Loads the digits dataset as a tuple of flattened numpy arrays.""" - digits = load_digits() - return (digits.images.reshape((len(digits.images), -1)), digits.target) - - -@step -def model_finetuner_step( - model: ClassifierMixin, dataset: Tuple[np.ndarray, np.ndarray] -) -> Annotated[ - ClassifierMixin, - ArtifactConfig(name="my_model", is_model_artifact=True, tags=["SVC", "trained"]), -]: - """Finetunes a given model on a given dataset.""" - model.fit(dataset[0], dataset[1]) - accuracy = model.score(dataset[0], dataset[1]) - log_artifact_metadata(metadata={"accuracy": float(accuracy)}) - return model - - -@pipeline -def model_finetuning_pipeline( - dataset_version: Optional[str] = None, - model_version: Optional[str] = None, -): - client = Client() - # Either load a previous version of "my_dataset" or create a new one - if dataset_version: - dataset = client.get_artifact_version( - name_id_or_prefix="my_dataset", version=dataset_version - ) - else: - dataset = versioned_data_loader_step() - - # Load the model to finetune - # If no version is specified, the latest version of "my_model" is used - model = client.get_artifact_version( - name_id_or_prefix="my_model", version=model_version - ) - - # Finetune the model - # This automatically creates a new version of "my_model" - model_finetuner_step(model=model, dataset=dataset) - - -def main(): - # Save an untrained model as first version of "my_model" - untrained_model = SVC(gamma=0.001) - save_artifact( - untrained_model, name="my_model", version="1", tags=["SVC", "untrained"] - ) - - # Create a first version of "my_dataset" and train the model on it - model_finetuning_pipeline() - - # Finetune the latest model on an older version of the dataset - model_finetuning_pipeline(dataset_version="1") - - # Run inference with the latest model on an older version of the dataset - latest_trained_model = load_artifact("my_model") - old_dataset = load_artifact("my_dataset", version="1") - latest_trained_model.predict(old_dataset[0]) - - -if __name__ == "__main__": - main() -``` - -This would create the following pipeline run DAGs: - -**Run 1:** - - - - -**Run 2:** - - - - - - - diff --git a/develop/user-guide/guides/starter-guide/starter-project.mdx b/develop/user-guide/guides/starter-guide/starter-project.mdx deleted file mode 100644 index a6452a9..0000000 --- a/develop/user-guide/guides/starter-guide/starter-project.mdx +++ /dev/null @@ -1,69 +0,0 @@ ---- -title: "A starter project" -description: "Put your new knowledge into action with a simple starter project" -icon: rocket ---- - -By now, you have understood some of the basic pillars of a MLOps system: - -* [Pipelines and steps](/develop/user-guide/guides/starter-guide/create-an-ml-pipeline) -* [Artifacts](/develop/user-guide/guides/starter-guide/manage-artifacts) -* 
[Models](/develop/user-guide/guides/starter-guide/track-ml-models) - -We will now put this into action with a simple starter project. - -## Get started - -Start with a fresh virtual environment with no dependencies. Then let's install our dependencies: - -```Bash -pip install "zenml[templates,server]" notebook - -zenml integration install sklearn -y -``` - -We will then use [ZenML templates](/develop/usage/project-setup/setting-up-a-project-repository/using-project-templates) to help us get the code we need for the project: - -```shell -mkdir zenml_starter - -cd zenml_starter - -zenml init --template starter --template-with-defaults - -# Just in case, we install the requirements again - -pip install -r requirements.txt -``` - -The starter template is the same as the [ZenML mlops starter example](https://github.com/zenml-io/zenml/tree/main/examples/mlops%5Fstarter). You can clone it like so: - -```shell -git clone --depth 1 git@github.com:zenml-io/zenml.git - -cd zenml/examples/mlops_starter - -pip install -r requirements.txt - -zenml init -``` - - -## What you'll learn - -You can either follow along in the [accompanying Jupyter notebook](https://github.com/zenml-io/zenml/blob/main/examples/quickstart/quickstart.ipynb), or just keep reading the [README file for more instructions](https://github.com/zenml-io/zenml/blob/main/examples/quickstart/README.md). - -Either way, at the end you would run three pipelines that are exemplary: - -* A feature engineering pipeline that loads data and prepares it for training. -* A training pipeline that loads the preprocessed dataset and trains a model. -* A batch inference pipeline that runs predictions on the trained model with new data. - -And voilà! You're now well on your way to be an MLOps expert. As a next step, try introducing the [ZenML starter template](https://github.com/zenml-io/template-starter) to your colleagues and see the benefits of a standard MLOps framework in action! - -## Conclusion and next steps - -This marks the end of the first chapter of your MLOps journey with ZenML. Make sure you do your own experimentation with ZenML to master the basics. When ready, move on to the [production guide](/develop/user-guide/guides/production-guide), which is the next part of the series. - diff --git a/develop/user-guide/guides/starter-guide/track-ml-models.mdx b/develop/user-guide/guides/starter-guide/track-ml-models.mdx deleted file mode 100644 index 981d82f..0000000 --- a/develop/user-guide/guides/starter-guide/track-ml-models.mdx +++ /dev/null @@ -1,268 +0,0 @@ ---- -title: "Track ML models" -description: "Creating a full picture of a ML model using the Model Control Plane" -icon: plane-departure ---- - - - - - -As discussed in the [Core Concepts](/develop/getting-started/core-concepts), ZenML also contains the notion of a `Model`, which consists of many model versions (the iterations of the model). These concepts are exposed in the `Model Control Plane` (MCP for short). - -## What is a ZenML Model? - -Before diving in, let's take some time to build an understanding of what we mean when we say `Model` in ZenML terms. A `Model` is simply an entity that groups pipelines, artifacts, metadata, and other crucial business data into a unified entity. In this sense, a ZenML Model is a concept that more broadly encapsulates your ML product's business logic. 
You may even think of a ZenML Model as a "project" or a "workspace".
-
-
- Please note that one of the most common artifacts associated with a Model in ZenML is the so-called technical model, which is the actual model file or files that hold the weights and parameters of a machine learning training result. However, this is not the only artifact that is relevant; artifacts such as the training data and the predictions this model produces in production are also linked inside a ZenML Model.
-
-
-Models are first-class citizens in ZenML, and as such, viewing and using them is unified and centralized in the ZenML API, the ZenML client, as well as on the [ZenML Pro](https://zenml.io/pro) dashboard.
-
-These models can be viewed within ZenML:
-
-
-`zenml model list` can be used to list all models.
-
-
-The [ZenML Pro](https://zenml.io/pro) dashboard has additional capabilities that include visualizing these models in the dashboard.
-
-
-
-
-
-
-## Configuring a model in a pipeline
-
-The easiest way to use a ZenML model is to pass a `Model` object as part of a pipeline run. This can be done at a pipeline or a step level, or via a [YAML config](/develop/user-guide/guides/production-guide/configure-pipeline).
-
-Once you configure a pipeline this way, **all** artifacts generated during pipeline runs are automatically **linked** to the specified model. This connecting of artifacts provides lineage tracking and transparency into what data and models are used during training, evaluation, and inference.
-
-```py
-from zenml import pipeline, step
-from zenml import Model
-
-model = Model(
-    # The name uniquely identifies this model
-    # It usually represents the business use case
-    name="iris_classifier",
-    # The version specifies the version of the model
-    # If None or an unseen version is specified, it will be created
-    # Otherwise, a version will be fetched.
-    version=None,
-    # Some other properties may be specified
-    license="Apache 2.0",
-    description="A classification model for the iris dataset.",
-)
-
-# The step configuration will take precedence over the pipeline
-@step(model=model)
-def svc_trainer(...) -> ...:
-    ...
-
-# This configures it for all steps within the pipeline
-@pipeline(model=model)
-def training_pipeline(gamma: float = 0.002):
-    # Now this pipeline will have the `iris_classifier` model active.
-    X_train, X_test, y_train, y_test = training_data_loader()
-    svc_trainer(gamma=gamma, X_train=X_train, y_train=y_train)
-
-if __name__ == "__main__":
-    training_pipeline()
-
-# In the YAML the same can be done; in this case, the
-# passing to the decorators is not needed
-# model:
-  # name: iris_classifier
-  # license: "Apache 2.0"
-  # description: "A classification model for the iris dataset."
-
-```
-
-The above will establish a **link between all artifacts that pass through this ZenML pipeline and this model**. This includes the **technical model**, which is what comes out of the `svc_trainer` step. You will be able to see all associated artifacts and pipeline runs, all within one view.
-
-Furthermore, this pipeline run and all other pipeline runs that are configured with this model configuration will be linked to this model as well.
-
-You can see all versions of a model, along with its associated artifacts and runs, like this:
-
-
-`zenml model version list ` can be used to list all versions of a particular model.
-
-The following command can be used to list the pipeline runs associated with a model:
-
-* `zenml model version runs `
-
-The following commands can be used to list the artifacts associated with a model:
-
-* `zenml model version data_artifacts `
-* `zenml model version model_artifacts `
-* `zenml model version deployment_artifacts `
-
-
-The [ZenML Pro](https://zenml.io/pro) dashboard has additional capabilities that include visualizing all associated runs and artifacts for a model version:
-
-
-
-
-
-
-## Fetching the model in a pipeline
-
-When configured at the pipeline or step level, the model will be available through the [StepContext](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-pipeline) or [PipelineContext](/develop/usage/resource-data-management/track-metrics-metadata/fetch-metadata-within-pipeline).
-
-```py
-from typing_extensions import Annotated
-
-import pandas as pd
-from sklearn.base import ClassifierMixin
-
-from zenml import Model, get_step_context, get_pipeline_context, step, pipeline
-
-@step
-def svc_trainer(
-    X_train: pd.DataFrame,
-    y_train: pd.Series,
-    gamma: float = 0.001,
-) -> Annotated[ClassifierMixin, "trained_model"]:
-    # This will return the model specified in the
-    # @pipeline decorator. In this case, the production version of
-    # the `iris_classifier` will be returned.
-    model = get_step_context().model
-    ...
-
-@pipeline(
-    model=Model(
-        # The name uniquely identifies this model
-        name="iris_classifier",
-        # Pass the stage you want to get the right model
-        version="production",
-    ),
-)
-def training_pipeline(gamma: float = 0.002):
-    # Now this pipeline will have the production `iris_classifier` model active.
-    model = get_pipeline_context().model
-
-    X_train, X_test, y_train, y_test = training_data_loader()
-    svc_trainer(gamma=gamma, X_train=X_train, y_train=y_train)
-```
-
-## Logging metadata to the `Model` object
-
-[Just as one can associate metadata with artifacts](/develop/user-guide/guides/starter-guide/manage-artifacts#logging-metadata-for-an-artifact), models too can take a dictionary of key-value pairs to capture their metadata. This is achieved using the `log_model_metadata` method:
-
-```py
-from typing_extensions import Annotated
-
-import pandas as pd
-from sklearn.base import ClassifierMixin
-
-from zenml import get_step_context, step, log_model_metadata
-
-@step
-def svc_trainer(
-    X_train: pd.DataFrame,
-    y_train: pd.Series,
-    gamma: float = 0.001,
-) -> Annotated[ClassifierMixin, "sklearn_classifier"]:
-    # Create and train a classifier here, e.g. model = SVC(gamma=gamma)
-    ...
- model.fit(dataset[0], dataset[1]) - accuracy = model.score(dataset[0], dataset[1]) - - model = get_step_context().model - - log_model_metadata( - # Model name can be omitted if specified in the step or pipeline context - model_name="iris_classifier", - # Passing None or omitting this will use the `latest` version - version=None, - # Metadata should be a dictionary of JSON-serializable values - metadata={"accuracy": float(accuracy)} - # A dictionary of dictionaries can also be passed to group metadata - # in the dashboard - # metadata = {"metrics": {"accuracy": accuracy}} - ) -``` - - - - -```py -from zenml.client import Client - -# Get an artifact version (in this the latest `iris_classifier`) -model_version = Client().get_model_version('iris_classifier') - -# Fetch it's metadata -model_version.run_metadata["accuracy"].value -``` - - -The [ZenML Pro](https://zenml.io/pro) dashboard offers advanced visualization features for artifact exploration, including a dedicated artifacts tab with metadata visualization: - - - - - - -Choosing [log metadata with artifacts](/develop/user-guide/guides/starter-guide/manage-artifacts#logging-metadata-for-an-artifact) or model versions depends on the scope and purpose of the information you wish to capture. Artifact metadata is best for details specific to individual outputs, while model version metadata is suitable for broader information relevant to the overall model. By utilizing ZenML's metadata logging capabilities and special types, you can enhance the traceability, reproducibility, and analysis of your ML workflows. - -Once metadata has been logged to a model, we can retrieve it easily with the client: - -```py -from zenml.client import Client -client = Client() -model = client.get_model_version("my_model", "my_version") -print(model.run_metadata["metadata_key"].value) -``` - -For further depth, there is an [advanced metadata logging guide](/develop/usage/resource-data-management/track-metrics-metadata) that goes more into detail about logging metadata in ZenML. - -## Using the stages of a model - -A model's versions can exist in various stages. These are meant to signify their lifecycle state: - -* `staging`: This version is staged for production. -* `production`: This version is running in a production setting. -* `latest`: The latest version of the model. -* `archived`: This is archived and no longer relevant. This stage occurs when a model moves out of any other stage. - - - -```py -from zenml import Model - -# Get the latest version of a model -model = Model( - name="iris_classifier", - version="latest" -) - -# Get `my_version` version of a model -model = Model( - name="iris_classifier", - version="my_version", -) - -# Pass the stage into the version field -# to get the `staging` model -model = Model( - name="iris_classifier", - version="staging", -) - -# This will set this version to production -model.set_stage(stage="production", force=True) -``` - - -```py -# List staging models -zenml model version list --stage staging - -# Update to production -zenml model version update -s production -``` - - -The [ZenML Pro](https://zenml.io/pro) dashboard has additional capabilities, that include easily changing the stage: - - - - - - -ZenML Pro Transition Model Stages - -ZenML Model and versions are some of the most powerful features in ZenML. To understand them in a deeper way, read the [dedicated Model Management](/develop/usage/development-monitoring/use-the-model-control-plane) guide.