diff --git a/README.md b/README.md index ff26193..bc47aca 100644 --- a/README.md +++ b/README.md @@ -1,69 +1,172 @@ -# 💫 ZenML End-to-End NLP Training and Deployment Project Template +# 🌟 ZenML End-to-End NLP Training, Promotion, and Deployment Project Template + +Welcome to the ZenML NLP project template, a starting point for building and deploying NLP models with the ZenML framework. This template streamlines training, promoting, and deploying NLP models with a focus on reproducibility, scalability, and ease of use. + +## 🚀 Quickstart + +To get started with this template, you'll need to have ZenML installed along with its `templates` extras: + +```bash +pip install zenml[templates] +``` + +Once you have ZenML installed, you can initialize a new project using this template: + +```bash +zenml init --template +# example: zenml init --template template-nlp +``` + +Running the command above will prompt you for values for the template parameters. To use the default values instead, add `--template-with-defaults` to the same command: + +```bash +zenml init --template --template-with-defaults +# example: zenml init --template template-nlp --template-with-defaults +``` -This project template is designed to help you get started with training and deploying NLP models using the ZenML framework. It provides a comprehensive set of steps and pipelines to cover major use cases of NLP model development, including dataset loading, tokenization, model training, model registration, and deployment. 
## 📃 Template Parameters +The template can be configured using the following parameters: + | Parameter | Description | Default | |-----------|-------------|---------| -| Name | The name of the person/entity holding the copyright | ZenML GmbH | -| Email | The email of the person/entity holding the copyright | info@zenml.io | | Project Name | Short name for your project | ZenML NLP project | | Project Version | The version of your project | 0.0.1 | | Project License | The license under which your project will be released | Apache Software License 2.0 | -| Technical product name | The technical name to prefix all tech assets (pipelines, models, etc.) | nlp_use_case | -| Target environment | The target environment for deployments/promotions | staging | -| Use metric-based promotion | Whether to compare metric of interest to make model version promotion | True | -| Notifications on failure | Whether to notify about pipeline failures | True | -| Notifications on success | Whether to notify about pipeline successes | False | -| ZenML Server URL | Optional URL of a remote ZenML server for support scripts | - | - -## 🚀 Generate a ZenML Project - -To generate a project from this template, make sure you have ZenML and its `templates` extras installed: +| Full Name | The name of the copyright holder | ZenML GmbH | +| Email | The email of the copyright holder | info@zenml.io | +| Product Name | The technical name for your NLP product | nlp_use_case | +| Target Environment | The target environment for deployments | staging | +| Accelerator | The accelerator to use for training | gpu | +| Deploy Locally | Whether to deploy the model locally | True | +| Deploy to HuggingFace | Whether to deploy to HuggingFace Hub | False | +| Deploy to SkyPilot | Whether to deploy to SkyPilot | False | +| Dataset | The dataset to use from HuggingFace Datasets | airline_reviews | +| Model | The model to use from HuggingFace Models | roberta-base | +| Cloud Provider | The cloud provider to use (AWS 
or GCP) | aws | +| Metric-Based Promotion | Whether to promote models based on metrics | True | +| Notifications on Failure | Whether to notify about pipeline failures | True | +| Notifications on Success | Whether to notify about pipeline successes | False | +| ZenML Server URL | URL of a remote ZenML server (optional) | | + +After configuring your project, you can run the following commands to set up your environment and start the ZenML Dashboard: ```bash -pip install zenml[templates] +cd +make setup +make install-local-stack +zenml up +python run.py ``` -Then, run the following command to generate the project: +For more details, check the `README.md` file in the generated project directory. -```bash -zenml init --template template-nlp -``` +## 🛠️ Project Structure + +This NLP project template includes three main pipelines: + +### Training Pipeline + +The training pipeline is designed to handle the end-to-end process of training an NLP model. It includes steps for data loading, tokenization, model training, and model registration. The pipeline is parameterized to allow for customization of the training process, such as sequence length, batch size, and learning rate. + +### Promotion Pipeline + +The promotion pipeline is responsible for promoting a model version to a specified stage. It can be configured to promote based on a metric comparison between the latest and currently promoted model versions or to simply promote the latest model version. + +### Deployment Pipeline + +The deployment pipeline handles the deployment of the model to various environments. It can be configured to deploy locally, to HuggingFace Hub, or to SkyPilot, depending on your requirements. + +## 📚 Core Concepts + +This template showcases several core concepts of ZenML: + +- Designing and running ZenML pipelines for NLP tasks. +- Parameterizing steps for flexibility and reusability. +- Model promotion strategies based on metrics or latest versions. 
+- Deploying models to different environments, including local and cloud-based platforms. +- Best practices for implementing and running reproducible and reliable ML pipelines with ZenML. + +## 📦 Additional Features + +- Integration with HuggingFace Datasets and Models for easy access to pre-trained models and datasets. +- Support for different accelerators (GPU/CPU) to speed up training. +- Conditional deployment steps based on user configuration. +- Notification hooks for pipeline successes and failures. + +## 🌐 Deployment Options + +The template supports multiple deployment options, including local deployment for testing and cloud-based deployment for production use. You can choose to deploy to HuggingFace Hub for easy sharing and collaboration or to SkyPilot for managed deployment services. + +## 📋 License + +This project is licensed under the Apache License 2.0, ensuring that you have the freedom to use, modify, and distribute the software as you see fit. + +### 🛠️ How this template is implemented + +Let's dive into the implementation details and share some tips and best practices for using this template effectively. + +### Training Pipeline + +The training pipeline is the heart of the NLP project. It is responsible for preparing the data, training the model, and registering the model in the model registry. The pipeline is designed to ensure that only models that meet quality standards are promoted for inference. + +The training pipeline is configured using the `{{product_name}}_training_pipeline` function, which includes steps for data loading, tokenization, model training, and model registration. The pipeline can be customized with parameters such as `lower_case`, `padding`, `max_seq_length`, and others to tailor the tokenization and training process to your specific NLP use case. + +### Training Pipeline: Data and Tokenization + +[📂 Code folder](template/steps/data_tokenization/) +

+ +The first stage of the training pipeline involves loading the dataset and preparing it for the model. The `data_loader` step fetches the dataset, which is then passed to the `tokenizer_loader` and `tokenization_step` to convert the raw text data into a format suitable for the NLP model. + +Tokenization is a critical step in NLP pipelines, as it converts text into tokens that the model can understand. The tokenizer can be configured to handle case sensitivity, padding strategies, and sequence lengths, ensuring that the input data is consistent and optimized for training. + +### Training Pipeline: Model Training -You will be prompted to provide values for the template parameters. If you want to use the default values, you can add the `--template-with-defaults` flag to the command. +[📂 Code folder](template/steps/model_training/) +

-## 🧰 How this template is implemented +Once the data is tokenized, the `model_trainer` step takes over to train the NLP model. This step utilizes the tokenized dataset and the tokenizer itself to fine-tune the model on the specific task, such as sentiment analysis, text classification, or named entity recognition. -This template provides a set of pipelines and steps to cover the end-to-end process of training and deploying NLP models. Here is an overview of the main components: +The model training step can be configured with parameters like `train_batch_size`, `eval_batch_size`, `num_epochs`, `learning_rate`, and `weight_decay` to control the training process. After training, the model is evaluated, and if it meets the quality criteria, it is registered in the model registry with a unique name. -### Dataset Loading +### Promotion Pipeline -The template includes a step for loading the dataset from the HuggingFace Datasets library. You can choose from three available datasets: financial_news, airline_reviews, and imdb_reviews. +[📂 Code folder](template/steps/promotion/) +

-### Tokenization +The promotion pipeline is responsible for promoting the best model to the chosen stage, such as Production or Staging. The pipeline can be configured to promote models based on metric comparison or simply promote the latest model version. -The tokenization step preprocesses the dataset by tokenizing the text data using the tokenizer provided by the HuggingFace Models library. You can choose from three available models: bert-base-uncased, roberta-base, and distilbert-base-cased. +The `{{product_name}}_promote_pipeline` function orchestrates the promotion process. If `metric_compare_promotion` is enabled, the pipeline retrieves the metrics for the latest and currently promoted model versions and compares them to decide which model to promote. -### Model Training +### Deployment Pipeline -The training pipeline consists of several steps, including model architecture search, hyperparameter tuning, model training, and model evaluation. The best model architecture and hyperparameters are selected based on the performance on the validation set. The trained model is then evaluated on the holdout set to assess its performance. +[📂 Code folder](template/steps/deployment/) +

-### Model Registration and Promotion +The deployment pipeline handles the deployment of the model to various environments. It can be configured to deploy locally, to HuggingFace Hub, or to SkyPilot, depending on the project's needs. -After training, the best model version is registered in the ZenML Model Registry. The template provides an option to promote the model version based on a specified metric of interest. If metric-based promotion is enabled, the template compares the metric value of the new model version with the metric value of the current production model version and promotes the new version if it performs better. +The `{{product_name}}_deploy_pipeline` function manages the deployment process, which includes saving the model locally and deploying it to the chosen target. The pipeline can be customized with parameters like `labels`, `title`, `description`, `model_name_or_path`, and others to provide metadata for the deployed model. -### Batch Inference +### 📝 Configuration and Customization -The template includes a batch inference pipeline that loads the inference dataset, preprocesses it using the same tokenizer as during training, and runs predictions using the deployed model version. The predictions are stored as an artifact for future use. +The template can be configured using the provided YAML configuration file. You can specify the project name, version, license, target environment, dataset, model, cloud provider, and other options to tailor the template to your specific requirements. -### Deployment Options +### 📚 Further Reading -The template provides options to deploy the trained model locally or to the HuggingFace Hub. You can choose whether to deploy locally or to the HuggingFace Hub by setting the `deploy_locally` and `deploy_to_huggingface` parameters. +For more information on ZenML and how to use it for NLP projects, check out the [ZenML Documentation](https://docs.zenml.io/). 
You'll find guides on pipelining, stack management, and best practices for MLOps. -## Next Steps +Happy coding, and may your NLP models be ever accurate and insightful! -Once you have generated the project using this template, you can explore the generated code and customize it to fit your specific NLP use case. The README.md file in the generated project provides further instructions on how to set up and run the project. +## 📩 Contact -Happy coding with ZenML and NLP! +For any questions or support, please reach out to ZenML GmbH at info@zenml.io. diff --git a/template/README.md b/template/README.md index ba7c492..2361d74 100644 --- a/template/README.md +++ b/template/README.md @@ -1,10 +1,10 @@ # {{project_name}} -This is a comprehensive NLP project built with the +This is a comprehensive supervised ML project built with the ZenML framework and its integration. The project trains one or more scikit-learn classification models to make predictions on the tabular classification datasets provided by the scikit-learn library. The project was -generated from the [NLP ZenML project template](https://github.com/zenml-io/nlp-template) +generated from the [NLP ZenML project template](https://github.com/zenml-io/template-nlp). 
with the following properties: - Project name: {{project_name}} - Technical Name: {{product_name}} @@ -18,9 +18,30 @@ with the following properties: {%- endif %} Settings of your project are: +- Accelerator: `{{accelerator}}` +{%- if metric_compare_promotion %} +- Trained model promotion to `{{target_environment}}` based on accuracy metric vs currently deployed model +{%- else %} +- Every trained model will be promoted to `{{target_environment}}` +{%- endif %} +{%- if deploy_locally %} +- Local deployment enabled +{%- endif %} +{%- if deploy_to_huggingface %} +- Deployment to HuggingFace Hub enabled +{%- endif %} +{%- if deploy_to_skypilot %} +- Deployment to SkyPilot enabled +{%- endif %} +{%- if dataset %} - Dataset: `{{dataset}}` +{%- endif %} +{%- if model %} - Model: `{{model}}` -- Every trained model will be promoted to `{{target_environment}}` +{%- endif %} +{%- if cloud_provider %} +- Cloud provider: `{{cloud_provider}}` +{%- endif %} {%- if notify_on_failures and notify_on_successes %} - Notifications about failures and successes enabled {%- elif notify_on_failures %} @@ -53,7 +74,7 @@ source .venv/bin/activate # Install requirements & integrations make setup # Optionally, provision default local stack -make install-stack +make install-local-stack # Start the ZenML UI locally (recommended, but optional); # the default username is "admin" with an empty password zenml up @@ -71,3 +92,142 @@ Next, you should: ```bash python run.py --help ``` +* go back and [try out different parameters](https://github.com/zenml-io/template-nlp#-template-parameters) +for your generated project. For example, you could disable hyperparameters +tuning and use your favorite model architecture or promote every trained model, +if you haven't already! +* take a look at [the project structure](#📜-project-structure) and the code +itself. The code is heavily commented and should be easy to follow. 
+* read the [ZenML documentation](https://docs.zenml.io) to learn more about +various ZenML concepts referenced in the code and to get a better sense of +what you can do with ZenML. +* start building your own ZenML project by modifying this code + +## 📦 What's in the box? + +The {{project_name}} project demonstrates how the most important steps of +the ML Production Lifecycle can be implemented in a reusable way remaining +agnostic to the underlying infrastructure for a Natural Language Processing +(NLP) task. + +This template uses one of these datasets: +* [IMDB Movie Reviews](https://huggingface.co/datasets/imdb) +* [Financial News](https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment) +* [Airlines Reviews](https://huggingface.co/datasets/Shayanvsf/US_Airline_Sentiment) + +and one of these models: +* [DistilBERT](https://huggingface.co/distilbert-base-uncased) +* [RoBERTa](https://huggingface.co/roberta-base) +* [BERT](https://huggingface.co/bert-base-uncased) + +It consists of three pipelines with the following high-level setup: +

+ +All pipelines leverage the Model Control Plane to bring all parts together: the training pipeline creates and promotes a new Model Control Plane version with a trained model object in it; the deployment pipeline uses the inference Model Control Plane version (the one promoted during training) to create a deployment service; and the inference pipeline uses the deployment service from the inference Model Control Plane version and stores the new set of predictions as a versioned data artifact for future use. This keeps the pipelines closely connected while ensuring that only quality-assured Model Control Plane versions are used to produce predictions delivered to stakeholders. +* [CT] Training + * Load the training dataset from HuggingFace Datasets + * Load the tokenizer from HuggingFace Models based on the model name + * Tokenize the training dataset and store the tokenizer as an artifact + * Train and evaluate a model object using the training dataset and store it as an artifact + * Register the model object as a new inference Model Control Plane version +* [CD] Promotion + * Evaluate the latest Model Control Plane version using the evaluation metric + * Compare the evaluation metric of the latest Model Control Plane version with the evaluation metric of the currently promoted Model Control Plane version + * If the evaluation metric of the latest Model Control Plane version is better than the evaluation metric of the currently promoted Model Control Plane version, promote the latest Model Control Plane version to the specified stage + * If the evaluation metric of the latest Model Control Plane version is worse than the evaluation metric of the currently promoted Model Control Plane version, do not promote the latest Model Control Plane version +* [CD] Deployment + * Load the inference Model Control Plane version + * Save the Model locally (this requires running the pipeline on the local machine) + * Deploy the Model to the specified environment + * If the specified 
environment is HuggingFace Hub, upload the Model to the HuggingFace Hub + * If the specified environment is SkyPilot, deploy the Model to the SkyPilot + * If the specified environment is local, do not deploy the Model + +In [the repository documentation](https://github.com/zenml-io/template-nlp#-how-this-template-is-implemented), +you can find more details about every step of this template. + +The project code is meant to be used as a template for your projects. For +this reason, you will find several places in the code specifically marked +to indicate where you can add your code: + +```python +### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### +... +### YOUR CODE ENDS HERE ### +``` + +## 📜 Project Structure + +The project loosely follows [the recommended ZenML project structure](https://docs.zenml.io/user-guide/starter-guide/follow-best-practices): + +``` +. +├── gradio # Gradio app for inference +│ ├── __init__.py # Gradio app initialization +│ ├── app.py # Gradio app entrypoint +│ ├── Dockerfile # Gradio app Dockerfile +│ ├── requirements.txt # Gradio app Python dependencies +│ └── serve.yaml # Gradio app SkyPilot deployment configuration +├── pipelines # `zenml.pipeline` implementations +│ ├── __init__.py +│ ├── deployment.py # deployment pipeline +│ ├── promotion.py # promotion pipeline +│ └── training.py # training pipeline +├── steps # `zenml.steps` implementations +│ ├── __init__.py +│ ├── alerts # `zenml.steps.alerts` implementations +│ │ ├── __init__.py +│ │ └── notify_on.py # notify step +│ ├── dataset_loader # `zenml.steps.dataset_loader` implementations +│ │ ├── __init__.py +│ │ └── data_loader.py # data loader step +│ ├── deploying # `zenml.steps.deploying` implementations +│ │ ├── __init__.py +│ │ ├── save_model.py # save model step +│ │ ├── deploy_locally.py # deploy locally step +│ │ ├── deploy_to_huggingface.py # deploy to HuggingFace Hub step +│ │ └── deploy_to_skypilot.py # deploy to SkyPilot step +│ ├── promotion # `zenml.steps.promotion` 
implementations +│ │ ├── __init__.py +│ │ ├── promote_latest.py # promote latest step +│ │ ├── promote_get_version.py # get model version step +│ │ ├── promote_metric_compare_promoter.py # metric compare promoter step +│ │ └── promote_get_metrics.py # get metric step +│ ├── registrer # `zenml.steps.registrer` implementations +│ │ ├── __init__.py +│ │ └── model_log_registrer.py # model log registrer step +│ ├── tokenization # `zenml.steps.tokenization` implementations +│ │ ├── __init__.py +│ │ └── tokenization.py # tokenization step +│ ├── tokenizer_loader # `zenml.steps.tokenizer_loader` implementations +│ │ ├── __init__.py +│ │ └── tokenizer_loader.py # tokenizer loader step +│ └── training # `zenml.steps.training` implementations +│ ├── __init__.py +│ └── trainer.py # train step +└── utils # `zenml.utils` implementations +│ └── misc.py # miscellaneous utilities +├── README.md # this file +├── requirements.txt # extra Python dependencies +├── config.yaml # ZenML configuration file +└── run.py # CLI tool to run pipelines on ZenML Stack +``` + + + + + + + + + + + + +integrations +├── README.md # this file +├── requirements.txt # extra Python dependencies +└── run.py # CLI tool to run pipelines on ZenML Stack +``` \ No newline at end of file diff --git a/template/config.yaml b/template/config.yaml index 277fbe4..d8f3e23 100644 --- a/template/config.yaml +++ b/template/config.yaml @@ -4,6 +4,8 @@ settings: docker: {%- if accelerator == 'gpu' %} parent_image: 'huggingface/transformers-pytorch-gpu' + build_options: + - --platform linux/amd64 {%- endif %} required_integrations: {%- if cloud_of_choice == 'aws' %} @@ -26,9 +28,9 @@ settings: extra: mlflow_model_name: nlp_use_case_model {%- if target_environment == 'production' %} - target_env: Production + target_env: production {%- else %} - target_env: Staging + target_env: staging {%- endif %} notify_on_success: False notify_on_failure: True \ No newline at end of file diff --git a/template/pipelines/deploying.py 
b/template/pipelines/deploying.py index 8540664..8b62333 100644 --- a/template/pipelines/deploying.py +++ b/template/pipelines/deploying.py @@ -26,10 +26,11 @@ orchestrator = Client().active_stack.orchestrator # Check if orchestrator flavor is local -if orchestrator.flavor not in ["local"]: +if orchestrator.flavor not in ["local", "vm_aws", "vm_gcp", "vm_azure"]: raise RuntimeError( - "Your active stack needs to contain a local orchestrator for " - "the deployment pipeline to work." + "Your active stack needs to contain a local orchestrator or a VM " + "orchestrator to run this pipeline. However, we recommend using " + "the local orchestrator for this pipeline." ) @pipeline( @@ -67,10 +68,7 @@ def {{product_name}}_deploy_pipeline( pipeline_extra = get_pipeline_context().extra ########## Save Model locally ########## - save_model_to_deploy( - mlflow_model_name=pipeline_extra["mlflow_model_name"], - stage=pipeline_extra["target_env"], - ) + save_model_to_deploy() {%- if deploy_locally %} ########## Deploy Locally ########## diff --git a/template/pipelines/promoting.py b/template/pipelines/promoting.py index 896871f..69a82c2 100644 --- a/template/pipelines/promoting.py +++ b/template/pipelines/promoting.py @@ -6,12 +6,11 @@ notify_on_failure, notify_on_success, {%- if metric_compare_promotion %} - promote_get_metric, + promote_get_metrics, promote_metric_compare_promoter, {%- else %} - promote_latest, + promote_current, {%- endif %} - promote_get_versions, ) from zenml import pipeline, get_pipeline_context from zenml.logger import get_logger @@ -37,32 +36,17 @@ def {{product_name}}_promote_pipeline(): pipeline_extra = get_pipeline_context().extra ########## Promotion stage ########## - latest_version, current_version = promote_get_versions() {%- if metric_compare_promotion %} - latest_metric = promote_get_metric( - name=pipeline_extra["mlflow_model_name"], - metric="eval_loss", - version=latest_version, - ) - current_metric = promote_get_metric( - 
name=pipeline_extra["mlflow_model_name"], - metric="eval_loss", - version=current_version, - ) + latest_metrics, current_metrics = promote_get_metrics() promote_metric_compare_promoter( - latest_metric=latest_metric, - current_metric=current_metric, - latest_version=latest_version, - current_version=current_version, + latest_metrics=latest_metrics, + current_metrics=current_metrics, ) last_step_name = "promote_metric_compare_promoter" {%- else %} - promote_latest( - latest_version=latest_version, - current_version=current_version, - ) - last_step_name = "promote_latest" + promote_current() + last_step_name = "promote_current" {%- endif %} notify_on_success(after=[last_step_name]) diff --git a/template/pipelines/training.py b/template/pipelines/training.py index 502785e..cb306a7 100644 --- a/template/pipelines/training.py +++ b/template/pipelines/training.py @@ -12,7 +12,7 @@ model_trainer, register_model, {%- if metric_compare_promotion %} - promote_get_metric, + promote_get_metrics, promote_metric_compare_promoter, {%- else %} promote_latest, diff --git a/template/run.py b/template/run.py index 0d7f430..537db3a 100644 --- a/template/run.py +++ b/template/run.py @@ -10,7 +10,8 @@ {{product_name}}_deploy_pipeline, ) from zenml.logger import get_logger - +from zenml.model import ModelConfig +from zenml.enums import ModelStages logger = get_logger(__name__) @@ -83,6 +84,12 @@ type=click.FLOAT, help="Weight decay for training the model.", ) +@click.option( + "--training-pipeline", + is_flag=True, + default=True, + help="Whether to run the pipeline that trains the model.", +) @click.option( "--promoting-pipeline", is_flag=True, @@ -119,6 +126,12 @@ type=click.STRING, help="Comma-separated list of examples to show in the Gradio interface.", ) +@click.option( + "--zenml-model-name", + default="sentiment_analysis", + type=click.STRING, + help="Name of the ZenML Model.", +) def main( no_cache: bool = True, num_epochs: int = 3, @@ -126,12 +139,14 
@@ def main( eval_batch_size: int = 8, learning_rate: float = 2e-5, weight_decay: float = 0.01, + training_pipeline: bool = True, promoting_pipeline: bool = True, - deploying_pipeline: bool = True, - depployment_app_title: str = "Sentiment Analyzer", - depployment_app_description: str = "Sentiment Analyzer", - depployment_app_interpretation: str = "default", - depployment_app_example: str = "", + deploying_pipeline: bool = False, + deployment_app_title: str = "Sentiment Analyzer", + deployment_app_description: str = "Sentiment Analyzer", + deployment_app_interpretation: str = "default", + deployment_app_example: str = "", + zenml_model_name: str = "sentiment_analysis", ): """Main entry point for the pipeline execution. @@ -157,24 +172,38 @@ def main( if no_cache: pipeline_args["enable_cache"] = False - # Execute Training Pipeline - run_args_train = { - "num_epochs": num_epochs, - "train_batch_size": train_batch_size, - "eval_batch_size": eval_batch_size, - "learning_rate": learning_rate, - "weight_decay": weight_decay, - } - - pipeline_args[ - "run_name" - ] = f"{{product_name}}_run_{dt.now().strftime('%Y_%m_%d_%H_%M_%S')}" - {{product_name}}_training_pipeline.with_options(**pipeline_args)(**run_args_train) - logger.info("Training pipeline finished successfully!") + if training_pipeline: + # Execute Training Pipeline + run_args_train = { + "num_epochs": num_epochs, + "train_batch_size": train_batch_size, + "eval_batch_size": eval_batch_size, + "learning_rate": learning_rate, + "weight_decay": weight_decay, + } + + model_config = ModelConfig( + name=zenml_model_name, + license="{{open_source_license}}", + description="Showcase of the Model Control Plane.", + create_new_model_version=True, + delete_new_version_on_failure=True, + tags=["sentiment_analysis", "huggingface"], + ) + + pipeline_args["model_config"] = model_config + + pipeline_args[ + "run_name" + ] = f"{{product_name}}_run_{dt.now().strftime('%Y_%m_%d_%H_%M_%S')}" + 
{{product_name}}_training_pipeline.with_options(**pipeline_args)(**run_args_train) + logger.info("Training pipeline finished successfully!") # Execute Promoting Pipeline if promoting_pipeline: run_args_promoting = {} + model_config = ModelConfig(name=zenml_model_name) + pipeline_args["model_config"] = model_config pipeline_args[ "run_name" ] = f"{{product_name}}_promoting_pipeline_run_{dt.now().strftime('%Y_%m_%d_%H_%M_%S')}" @@ -183,11 +212,17 @@ def main( if deploying_pipeline: pipeline_args["enable_cache"] = False + # Deploying pipeline has new ZenML model config + model_config = ModelConfig( + name=zenml_model_name, + version=ModelStages("{{target_environment}}"), + ) + pipeline_args["model_config"] = model_config run_args_deploying = { - "title": depployment_app_title, - "description": depployment_app_description, - "interpretation": depployment_app_interpretation, - "example": depployment_app_example, + "title": deployment_app_title, + "description": deployment_app_description, + "interpretation": deployment_app_interpretation, + "example": deployment_app_example, } pipeline_args[ "run_name" diff --git a/template/steps/__init__.py b/template/steps/__init__.py index f1ba5a5..62b6cb0 100644 --- a/template/steps/__init__.py +++ b/template/steps/__init__.py @@ -7,12 +7,11 @@ ) from .promotion import ( {%- if metric_compare_promotion %} - promote_get_metric, + promote_get_metrics, promote_metric_compare_promoter, {%- else %} - promote_latest, + promote_current, {%- endif %} - promote_get_versions, ) from .registrer import register_model from .tokenizer_loader import ( diff --git a/template/steps/dataset_loader/{% if custom_training %}prepare_data_loaders.py{% endif %} b/template/steps/dataset_loader/{% if custom_training %}prepare_data_loaders.py{% endif %} deleted file mode 100644 index 7c6f5ac..0000000 --- a/template/steps/dataset_loader/{% if custom_training %}prepare_data_loaders.py{% endif %} +++ /dev/null @@ -1,64 +0,0 @@ -# {% include 
'template/license_header' %} - - -from typing_extensions import Annotated - -from torch.utils.data import DataLoader - -from zenml import step -from zenml.logger import get_logger - -from config import HFSentimentAnalysisDataset - -logger = get_logger(__name__) - - -@step -def prepare_dataloaders_step( - tokenizer: PreTrainedTokenizerBase, - dataset: DatasetDict, -) -> ( - Tuple[ - Annotated[DataLoader, "train_dataloader"], - Annotated[DataLoader, "validation_dataloader"], - ] -): - """Data splitter step. - - This is an example of a data splitter step that splits the dataset into - training and dev subsets to be used for model training and evaluation. It - takes in a dataset as an step input artifact and returns the training and - dev subsets as two separate step output artifacts. - - Data splitter steps should have a deterministic behavior, i.e. they should - use a fixed random seed and always return the same split when called with - the same input dataset. This is to ensure reproducibility of your pipeline - runs. - - This step is parameterized using the `DataSplitterStepParameters` class, - which allows you to configure the step independently of the step code, - before running it in a pipeline. In this example, the step can be configured - to use a different random seed, change the split ratio, or control whether - to shuffle or stratify the split. See the documentation for more - information: - - https://docs.zenml.io/user-guide/starter-guide/cache-previous-executions - - Args: - params: Parameters for the data splitter step. - dataset: The dataset to split. - - Returns: - The resulting training and dev subsets. 
- """ - proccessed_datasets = dataset.remove_columns(["text"]) - proccessed_datasets = proccessed_datasets.rename_column("label", "labels") - proccessed_datasets.set_format("torch") - data_collator = DataCollatorWithPadding(tokenizer=tokenizer) - train_dataloader = DataLoader( - proccessed_datasets["train"], shuffle=True, batch_size=16, collate_fn=data_collator - ) - validation_dataloader = DataLoader( - proccessed_datasets["validation"], batch_size=16, collate_fn=data_collator - ) - return train_dataloader, validation_dataloader \ No newline at end of file diff --git a/template/steps/deploying/save_model.py b/template/steps/deploying/save_model.py index a0c8f76..09f8e2e 100644 --- a/template/steps/deploying/save_model.py +++ b/template/steps/deploying/save_model.py @@ -1,24 +1,18 @@ # {% include 'template/license_header' %} -import mlflow -from zenml import step +from zenml import get_step_context, step from zenml.client import Client from zenml.logger import get_logger -from zenml.model_registries.base_model_registry import ModelVersionStage +from zenml.enums import ModelStages # Initialize logger logger = get_logger(__name__) -# Get experiment tracker -model_registry = Client().active_stack.model_registry @step() -def save_model_to_deploy( - mlflow_model_name: str, - stage: str, -): +def save_model_to_deploy(): """ This step saves the latest model and tokenizer to the local filesystem. @@ -32,27 +26,28 @@ def save_model_to_deploy( stage: The stage of the model in MLFlow. """ ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### + pipeline_extra = get_step_context().pipeline_run.config.extra + zenml_client = Client() + logger.info( - f" Loading latest version of model {mlflow_model_name} for stage {stage}..." + f" Loading latest version of the model for stage {pipeline_extra['target_env']}..." 
) - # Load model from MLFlow registry - model_version = model_registry.get_latest_model_version( - name=mlflow_model_name, - stage=ModelVersionStage(stage), - ).version - # Load model from MLFlow registry - model_version = model_registry.get_model_version( - name=mlflow_model_name, - version=model_version, + # Get latest saved model version in target environment + model_config = get_step_context().model_config + latest_version = zenml_client.get_model_version( + model_name_or_id=model_config.name, + model_version_name_or_number_or_id=ModelStages(pipeline_extra["target_env"]), ) - transformer_model = mlflow.transformers.load_model(model_version.model_source_uri) + # Load model and tokenizer from Model Control Plane + model = latest_version.get_artifact_object(name="model").load() + tokenizer = latest_version.get_artifact_object(name="tokenizer").load() # Save the model and tokenizer locally model_path = "./gradio/model" # replace with the actual path tokenizer_path = "./gradio/tokenizer" # replace with the actual path # Save model locally - transformer_model.model.save_pretrained(model_path) - transformer_model.tokenizer.save_pretrained(tokenizer_path) + model.model.save_pretrained(model_path) + tokenizer.tokenizer.save_pretrained(tokenizer_path) logger.info( f" Model and tokenizer saved to {model_path} and {tokenizer_path} respectively." 
) diff --git a/template/steps/deploying/{% if deploy_locally %}local_deployment.py{% endif %} b/template/steps/deploying/{% if deploy_locally %}local_deployment.py{% endif %} index 235b4ea..0ae9bf1 100644 --- a/template/steps/deploying/{% if deploy_locally %}local_deployment.py{% endif %} +++ b/template/steps/deploying/{% if deploy_locally %}local_deployment.py{% endif %} @@ -1,6 +1,7 @@ # {% include 'template/license_header' %} from typing import Optional, List +from typing_extensions import Annotated import os import subprocess diff --git a/template/steps/inference/__init__.py b/template/steps/inference/__init__.py deleted file mode 100644 index fc63455..0000000 --- a/template/steps/inference/__init__.py +++ /dev/null @@ -1,5 +0,0 @@ -# {% include 'template/license_header' %} - - -from .inference_get_current_version import inference_get_current_version -from .inference_predict import inference_predict diff --git a/template/steps/inference/inference_get_current_version.py b/template/steps/inference/inference_get_current_version.py deleted file mode 100644 index a04c2ba..0000000 --- a/template/steps/inference/inference_get_current_version.py +++ /dev/null @@ -1,33 +0,0 @@ -# {% include 'template/license_header' %} - - -from config import MetaConfig -from typing_extensions import Annotated -from zenml import step -from zenml.client import Client -from zenml.logger import get_logger - -logger = get_logger(__name__) - -model_registry = Client().active_stack.model_registry - - -@step -def inference_get_current_version() -> Annotated[str, "model_version"]: - """Get currently tagged model version for deployment. - - Returns: - The model version of currently tagged model in Registry. 
- """ - - ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### - - current_version = model_registry.list_model_versions( - name=MetaConfig.mlflow_model_name, - stage=MetaConfig.target_env, - )[0].version - logger.info( - f"Current model version in `{MetaConfig.target_env.value}` is `{current_version}`" - ) - - return current_version diff --git a/template/steps/inference/inference_predict.py b/template/steps/inference/inference_predict.py deleted file mode 100644 index 851b473..0000000 --- a/template/steps/inference/inference_predict.py +++ /dev/null @@ -1,42 +0,0 @@ -# {% include 'template/license_header' %} - - -import pandas as pd -from typing_extensions import Annotated -from zenml import step -from zenml.integrations.mlflow.model_deployers.mlflow_model_deployer import ( - MLFlowDeploymentService, -) - - -@step -def inference_predict( - deployment_service: MLFlowDeploymentService, - dataset_inf: pd.DataFrame, -) -> Annotated[pd.Series, "predictions"]: - """Predictions step. - - This is an example of a predictions step that takes the data in and returns - predicted values. - - This step is parameterized, which allows you to configure the step - independently of the step code, before running it in a pipeline. - In this example, the step can be configured to use different input data - and model version in registry. See the documentation for more information: - - https://docs.zenml.io/user-guide/advanced-guide/configure-steps-pipelines - - Args: - deployment_service: Deployed model service. - dataset_inf: The inference dataset. - - Returns: - The processed dataframe: dataset_inf. 
- """ - ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### - predictions = deployment_service.predict(request=dataset_inf) - predictions = pd.Series(predictions, name="predicted") - deployment_service.deprovision(force=True) - ### YOUR CODE ENDS HERE ### - - return predictions diff --git a/template/steps/promotion/__init__.py b/template/steps/promotion/__init__.py index 039bfe8..eb8a1db 100644 --- a/template/steps/promotion/__init__.py +++ b/template/steps/promotion/__init__.py @@ -1,10 +1,9 @@ # {% include 'template/license_header' %} -from .promote_get_versions import promote_get_versions {%- if metric_compare_promotion %} -from .promote_get_metric import promote_get_metric +from .promote_get_metrics import promote_get_metrics from .promote_metric_compare_promoter import promote_metric_compare_promoter {%- else %} -from .promote_latest import promote_latest +from .promote_current import promote_current {%- endif %} diff --git a/template/steps/promotion/promote_get_versions.py b/template/steps/promotion/promote_get_versions.py deleted file mode 100644 index 96dc344..0000000 --- a/template/steps/promotion/promote_get_versions.py +++ /dev/null @@ -1,53 +0,0 @@ -# {% include 'template/license_header' %} - - -from typing import Tuple - -from typing_extensions import Annotated -from zenml import get_step_context, step -from zenml.client import Client -from zenml.logger import get_logger -from zenml.model_registries.base_model_registry import ModelVersionStage - -logger = get_logger(__name__) - -model_registry = Client().active_stack.model_registry - - -@step -def promote_get_versions() -> ( - Tuple[Annotated[str, "latest_version"], Annotated[str, "current_version"]] -): - """Step to get latest and currently tagged model version from Model Registry. - - This is an example of a model version extraction step. It will retrieve 2 model - versions from Model Registry: latest and currently promoted to target - environment (Production, Staging, etc). 
- - Returns: - The model versions: latest and current. If not current version - returns same - for both. - """ - - ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### - pipeline_extra = get_step_context().pipeline_run.config.extra - none_versions = model_registry.list_model_versions( - name=pipeline_extra["mlflow_model_name"], - stage=None, - ) - latest_versions = none_versions[0].version - logger.info(f"Latest model version is {latest_versions}") - - target_versions = model_registry.list_model_versions( - name=pipeline_extra["mlflow_model_name"], - stage=ModelVersionStage(pipeline_extra["target_env"]), - ) - current_version = latest_versions - if target_versions: - current_version = target_versions[0].version - logger.info(f"Currently promoted model version is {current_version}") - else: - logger.info("No currently promoted model version found.") - ### YOUR CODE ENDS HERE ### - - return latest_versions, current_version diff --git a/template/steps/promotion/promote_latest.py b/template/steps/promotion/promote_latest.py deleted file mode 100644 index 1a16f53..0000000 --- a/template/steps/promotion/promote_latest.py +++ /dev/null @@ -1,48 +0,0 @@ -# {% include 'template/license_header' %} - - -from config import MetaConfig -from zenml import step -from zenml.client import Client -from zenml.logger import get_logger -from zenml.model_registries.base_model_registry import ModelVersionStage - -logger = get_logger(__name__) - -model_registry = Client().active_stack.model_registry - - -@step -def promote_latest(latest_version: str, current_version: str): - """Promote latest trained model. - - This is an example of a model promotion step, which promotes the - latest trained model to the current version. - - Args: - latest_version: Recently trained model version. - current_version: Current model version, if present. 
- - """ - - ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### - logger.info(f"Promoting latest model version `{latest_version}`") - if latest_version != current_version: - model_registry.update_model_version( - name=MetaConfig.mlflow_model_name, - version=current_version, - stage=ModelVersionStage.ARCHIVED, - metadata={}, - ) - model_registry.update_model_version( - name=MetaConfig.mlflow_model_name, - version=latest_version, - stage=MetaConfig.target_env, - metadata={}, - ) - promoted_version = latest_version - - logger.info( - f"Current model version in `{MetaConfig.target_env.value}` is `{promoted_version}`" - ) - ### YOUR CODE ENDS HERE ### diff --git a/template/steps/promotion/{% if metric_compare_promotion %}promote_get_metric.py{% endif %} b/template/steps/promotion/{% if metric_compare_promotion %}promote_get_metric.py{% endif %} deleted file mode 100644 index b444ed7..0000000 --- a/template/steps/promotion/{% if metric_compare_promotion %}promote_get_metric.py{% endif %} +++ /dev/null @@ -1,45 +0,0 @@ -# {% include 'template/license_header' %} - - -from typing_extensions import Annotated -import mlflow -from zenml import get_step_context, step -from zenml.client import Client -from zenml.logger import get_logger - -logger = get_logger(__name__) - -model_registry = Client().active_stack.model_registry - -@step -def promote_get_metric( - name: str, - metric: str, - version: str, -) -> Annotated[float, "metric"]: - """Get metric for comparison for promoting a model. - - This is an example of a metric retrieval step. It is used to retrieve - a metric from an MLFlow run, that is linked to a model version in the - model registry. This step is used in the `promote_model` pipeline. - - Args: - name: Name of the model registered in the model registry. - metric: Name of the metric to be retrieved. - version: Version of the model to be retrieved. - - Returns: - Metric value for a given model version. 
- """ - - ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### - model_version = model_registry.get_model_version( - name = name, - version = version - ) - mlflow_run = mlflow.get_run(run_id=model_version.metadata.mlflow_run_id) - logger.info("Getting metric from MLFlow run %s", mlflow_run.info.run_id) - - metric = mlflow_run.data.metrics.get(metric) - ### YOUR CODE ENDS HERE ### - return metric diff --git a/template/steps/promotion/{% if metric_compare_promotion %}promote_get_metrics.py{% endif %} b/template/steps/promotion/{% if metric_compare_promotion %}promote_get_metrics.py{% endif %} new file mode 100644 index 0000000..45846c7 --- /dev/null +++ b/template/steps/promotion/{% if metric_compare_promotion %}promote_get_metrics.py{% endif %} @@ -0,0 +1,59 @@ +# {% include 'template/license_header' %} + +from typing import Tuple, Any +from typing_extensions import Annotated +from zenml import get_step_context, step +from zenml.client import Client +from zenml.logger import get_logger +from zenml.enums import ModelStages + +logger = get_logger(__name__) + +model_registry = Client().active_stack.model_registry + +@step +def promote_get_metrics( +) -> ( + Tuple[Annotated[dict[str, Any], "latest_metrics"], Annotated[dict[str, Any], "current_metrics"]] +): + """Get metrics for comparison for promoting a model. + + This is an example of a metric retrieval step. It is used to retrieve + a metric from an MLFlow run, that is linked to a model version in the + model registry. This step is used in the `promote_model` pipeline. + + Args: + name: Name of the model registered in the model registry. + metric: Name of the metric to be retrieved. + version: Version of the model to be retrieved. + + Returns: + Metric value for a given model version. 
+ """ + ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### + pipeline_extra = get_step_context().pipeline_run.config.extra + zenml_client = Client() + + # Get current model version metric in current run + model_config = get_step_context().model_config + current_version = model_config._get_model_version() + current_metrics = current_version.get_artifact_object(name="model").metadata["metrics"].value + logger.info(f"Current model version metrics are {current_metrics}") + + # Get latest saved model version metric in target environment + try: + latest_version = zenml_client.get_model_version( + model_name_or_id=model_config.name, + model_version_name_or_number_or_id=ModelStages(pipeline_extra["target_env"]), + ) + except KeyError: + latest_version = None + if latest_version: + latest_metrics = latest_version.get_artifact_object(name="model").metadata["metrics"].value + logger.info(f"Latest model version metrics are {latest_metrics}") + else: + logger.info("No currently promoted model version found.") + latest_metrics = current_metrics + ### YOUR CODE ENDS HERE ### + + return latest_metrics, current_metrics diff --git a/template/steps/promotion/{% if metric_compare_promotion %}promote_metric_compare_promoter.py{% endif %} b/template/steps/promotion/{% if metric_compare_promotion %}promote_metric_compare_promoter.py{% endif %} index 0152776..8582bd0 100644 --- a/template/steps/promotion/{% if metric_compare_promotion %}promote_metric_compare_promoter.py{% endif %} +++ b/template/steps/promotion/{% if metric_compare_promotion %}promote_metric_compare_promoter.py{% endif %} @@ -1,7 +1,8 @@ # {% include 'template/license_header' %} - +from typing import Any from zenml import get_step_context, step +from zenml.enums import ModelStages from zenml.client import Client from zenml.logger import get_logger from zenml.model_registries.base_model_registry import ModelVersionStage @@ -13,10 +14,9 @@ model_registry = Client().active_stack.model_registry @step def 
promote_metric_compare_promoter( - latest_metric: float, - current_metric: float, - latest_version: str, - current_version: str, + latest_metrics: dict[str, Any], + current_metrics: dict[str, Any], + metric_to_compare: str = "accuracy", ): """Try to promote trained model. @@ -37,50 +37,39 @@ def promote_metric_compare_promoter( https://docs.zenml.io/user-guide/advanced-guide/configure-steps-pipelines Args: - latest_metric: Recently trained model metric results. - current_metric: Previously promoted model metric results. - latest_version: Recently trained model version. - current_version: Previously promoted model version. - + latest_metrics: Metrics of the model version currently promoted to the target environment. + current_metrics: Metrics of the model version trained in the current run. """ + ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### pipeline_extra = get_step_context().pipeline_run.config.extra should_promote = True - if latest_version == current_version: + if latest_metrics == current_metrics: logger.info("No current model version found - promoting latest") else: logger.info( - f"Latest model metric={latest_metric:.6f}\n" - f"Current model metric={current_metric:.6f}" + f"Latest model metric={latest_metrics[metric_to_compare]:.6f}\n" + f"Current model metric={current_metrics[metric_to_compare]:.6f}" ) - if latest_metric <= current_metric: + if latest_metrics[metric_to_compare] < current_metrics[metric_to_compare]: logger.info( - "Latest model versions outperformed current versions - promoting latest" + "Current model versions outperformed latest versions - promoting current" ) + else: logger.info( - "Current model versions outperformed latest versions - keeping current" + "Latest model versions outperformed current versions - keeping latest" ) should_promote = False - promoted_version = current_version if should_promote: - if latest_version != current_version: - model_registry.update_model_version( - 
name=pipeline_extra["mlflow_model_name"], - version=current_version, - 
stage=ModelVersionStage.ARCHIVED, - ) - model_registry.update_model_version( - name=pipeline_extra["mlflow_model_name"], - version=latest_version, - stage=ModelVersionStage(pipeline_extra["target_env"]), - ) - promoted_version = latest_version + model_config = get_step_context().model_config + model_version = model_config._get_model_version() + model_version.set_stage(pipeline_extra["target_env"], force=True) logger.info( - f"Current model version in `{pipeline_extra['target_env']}` is `{promoted_version}`" + f"Promoted current model version to {pipeline_extra['target_env']} environment" ) ### YOUR CODE ENDS HERE ### diff --git a/template/steps/promotion/{% if not metric_compare_promotion %}promote_current.py{% endif %} b/template/steps/promotion/{% if not metric_compare_promotion %}promote_current.py{% endif %} new file mode 100644 index 0000000..5d36808 --- /dev/null +++ b/template/steps/promotion/{% if not metric_compare_promotion %}promote_current.py{% endif %} @@ -0,0 +1,30 @@ +# {% include 'template/license_header' %} + + +from zenml import get_step_context, step +from zenml.client import Client +from zenml.logger import get_logger +from zenml.model_registries.base_model_registry import ModelVersionStage + +logger = get_logger(__name__) + +model_registry = Client().active_stack.model_registry + + +@step +def promote_current(): + """Promote current trained model. + + This is an example of a model promotion step, which promotes the + current trained model to the target environment. 
+ """ + ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### + pipeline_extra = get_step_context().pipeline_run.config.extra + logger.info("Promoting current model version") + model_config = get_step_context().model_config + model_version = model_config._get_model_version() + model_version.set_stage(pipeline_extra["target_env"], force=True) + logger.info( + f"Current model version promoted to {pipeline_extra['target_env']}" + ) + ### YOUR CODE ENDS HERE ### diff --git a/template/steps/promotion/{% if not metric_compare_promotion %}promote_latest.py{% endif %} b/template/steps/promotion/{% if not metric_compare_promotion %}promote_latest.py{% endif %} deleted file mode 100644 index 7c335b8..0000000 --- a/template/steps/promotion/{% if not metric_compare_promotion %}promote_latest.py{% endif %} +++ /dev/null @@ -1,48 +0,0 @@ -# {% include 'template/license_header' %} - - -from zenml import get_step_context, step -from zenml.client import Client -from zenml.logger import get_logger -from zenml.model_registries.base_model_registry import ModelVersionStage - -logger = get_logger(__name__) - -model_registry = Client().active_stack.model_registry - - -@step -def promote_latest(latest_version:str, current_version:str): - """Promote latest trained model. - - This is an example of a model promotion step, which promotes the - latest trained model to the current version. - - Args: - latest_version: Recently trained model version. - current_version: Current model version, if present. 
- - """ - - ### ADD YOUR OWN CODE HERE - THIS IS JUST AN EXAMPLE ### - logger.info(f"Promoting latest model version `{latest_version}`") - pipeline_extra = get_step_context().pipeline_run.config.extra - if latest_version != current_version: - model_registry.update_model_version( - name=pipeline_extra["mlflow_model_name"], - version=current_version, - stage=ModelVersionStage.ARCHIVED, - metadata={}, - ) - model_registry.update_model_version( - name=pipeline_extra["mlflow_model_name"], - version=latest_version, - stage=ModelVersionStage(pipeline_extra["target_env"]), - metadata={}, - ) - promoted_version = latest_version - - logger.info( - f"Current model version in `{pipeline_extra['target_env']}` is `{promoted_version}`" - ) - ### YOUR CODE ENDS HERE ### diff --git a/template/steps/training/model_trainer.py b/template/steps/training/model_trainer.py index 6da8649..4cd1b76 100644 --- a/template/steps/training/model_trainer.py +++ b/template/steps/training/model_trainer.py @@ -13,7 +13,7 @@ TrainingArguments, AutoModelForSequenceClassification, ) -from zenml import step +from zenml import log_artifact_metadata, step from zenml.client import Client from zenml.integrations.mlflow.experiment_trackers import MLFlowExperimentTracker from zenml.logger import get_logger @@ -132,7 +132,10 @@ def model_trainer( # Train and evaluate the model trainer.train() - trainer.evaluate() + eval_results = trainer.evaluate(metric_key_prefix="") + + # Log the evaluation results in model control plane + log_artifact_metadata(output_name="model", metrics=eval_results) ### YOUR CODE ENDS HERE ### return model, tokenizer diff --git a/template/steps/training/{% if custom_training %}full_evaluation.py{% endif %} b/template/steps/training/{% if custom_training %}full_evaluation.py{% endif %} deleted file mode 100644 index 71d2e67..0000000 --- a/template/steps/training/{% if custom_training %}full_evaluation.py{% endif %} +++ /dev/null @@ -1,80 +0,0 @@ -# {% include 'template/license_header' %} - 
- -import torch -from torch import nn -from transformers import AdamW -from transformers import get_scheduler -from tqdm.auto import tqdm - -from zenml import step -from zenml.enums import StrEnum -from zenml.client import Client - -experiment_tracker = Client().active_stack.experiment_tracker - - -@step -def full_evaluation_step( - evaluation_dataloader: DataLoader, - model: nn.Module, -) -> nn.Module: - """Data splitter step. - - This is an example of a data splitter step that splits the dataset into - training and dev subsets to be used for model training and evaluation. It - takes in a dataset as an step input artifact and returns the training and - dev subsets as two separate step output artifacts. - - Data splitter steps should have a deterministic behavior, i.e. they should - use a fixed random seed and always return the same split when called with - the same input dataset. This is to ensure reproducibility of your pipeline - runs. - - This step is parameterized using the `DataSplitterStepParameters` class, - which allows you to configure the step independently of the step code, - before running it in a pipeline. In this example, the step can be configured - to use a different random seed, change the split ratio, or control whether - to shuffle or stratify the split. See the documentation for more - information: - - https://docs.zenml.io/user-guide/starter-guide/cache-previous-executions - - Args: - params: Parameters for the data splitter step. - dataset: The dataset to split. - - Returns: - The resulting training and dev subsets. 
- """ - mlflow.pytorch.autolog() - model = BertForSequenceClassification.from_pretrained(hf_pretrained_model.value, num_labels=3) - optimizer = AdamW(model.parameters(), lr=5e-5) - num_epochs = 5 - num_training_steps = num_epochs * len(train_dataloader) - lr_scheduler = get_scheduler( - "linear", - optimizer=optimizer, - num_warmup_steps=0, - num_training_steps=num_training_steps, - ) - device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") - model.to(device) - progress_bar = tqdm(range(num_training_steps)) - model.train() - for epoch in range(num_epochs): - for batch in train_dataloader: - batch = {k: v.to(device) for k, v in batch.items()} - outputs = model(**batch) - loss = outputs.loss - loss.backward() - - optimizer.step() - lr_scheduler.step() - optimizer.zero_grad() - progress_bar.update(1) - - loss, current = loss.item(), batch * len(X) - logger.info(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]") - - return model \ No newline at end of file diff --git a/template/steps/training/{% if custom_training %}full_training.py{% endif %} b/template/steps/training/{% if custom_training %}full_training.py{% endif %} deleted file mode 100644 index 823cc80..0000000 --- a/template/steps/training/{% if custom_training %}full_training.py{% endif %} +++ /dev/null @@ -1,79 +0,0 @@ -# {% include 'template/license_header' %} - - -import torch -from torch import nn -from transformers import AdamW -from transformers import get_scheduler -from tqdm.auto import tqdm - -from zenml import step -from zenml.enums import StrEnum -from zenml.client import Client - -experiment_tracker = Client().active_stack.experiment_tracker - -@step -def full_training_step( - train_dataloader: DataLoader, - hf_pretrained_model: HFPretrainedModel, -) -> nn.Module: - """Data splitter step. - - This is an example of a data splitter step that splits the dataset into - training and dev subsets to be used for model training and evaluation. 
It - takes in a dataset as an step input artifact and returns the training and - dev subsets as two separate step output artifacts. - - Data splitter steps should have a deterministic behavior, i.e. they should - use a fixed random seed and always return the same split when called with - the same input dataset. This is to ensure reproducibility of your pipeline - runs. - - This step is parameterized using the `DataSplitterStepParameters` class, - which allows you to configure the step independently of the step code, - before running it in a pipeline. In this example, the step can be configured - to use a different random seed, change the split ratio, or control whether - to shuffle or stratify the split. See the documentation for more - information: - - https://docs.zenml.io/user-guide/starter-guide/cache-previous-executions - - Args: - params: Parameters for the data splitter step. - dataset: The dataset to split. - - Returns: - The resulting training and dev subsets. - """ - mlflow.pytorch.autolog() - model = BertForSequenceClassification.from_pretrained(hf_pretrained_model.value, num_labels=3) - optimizer = AdamW(model.parameters(), lr=5e-5) - num_epochs = 5 - num_training_steps = num_epochs * len(train_dataloader) - lr_scheduler = get_scheduler( - "linear", - optimizer=optimizer, - num_warmup_steps=0, - num_training_steps=num_training_steps, - ) - device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") - model.to(device) - progress_bar = tqdm(range(num_training_steps)) - model.train() - for epoch in range(num_epochs): - for batch in train_dataloader: - batch = {k: v.to(device) for k, v in batch.items()} - outputs = model(**batch) - loss = outputs.loss - loss.backward() - - optimizer.step() - lr_scheduler.step() - optimizer.zero_grad() - progress_bar.update(1) - - loss, current = loss.item(), batch * len(X) - logger.info(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]") - - return model \ No newline at end of file diff --git 
a/template/utils/misc.py b/template/utils/misc.py index 03a64a2..228c234 100644 --- a/template/utils/misc.py +++ b/template/utils/misc.py @@ -25,7 +25,7 @@ def compute_metrics(eval_pred: tuple[np.ndarray, np.ndarray]) -> dict[str, float precision = load_metric("precision").compute( predictions=predictions, references=labels, average="weighted" ) - return {"accuracy": accuracy, "f1": f1, "precision": precision} + return {"accuracy": accuracy["accuracy"], "f1": f1["f1"], "precision": precision["precision"]} def find_max_length(dataset: list[str]) -> int: