MLOps Zoomcamp

Our MLOps Zoomcamp course

Overview

Objective

Teach practical aspects of productionizing ML services — from collecting requirements to model deployment and monitoring.

Target audience

Data scientists and ML engineers, as well as software engineers and data engineers interested in putting ML into production.

Pre-requisites

  • Python
  • Docker
  • Being comfortable with the command line
  • Prior exposure to machine learning (at work or from other courses, e.g. from ML Zoomcamp)
  • Prior programming experience (at least 1 year)

Timeline

Course start: May 16

Asking for help in Slack

The best way to get support is to use DataTalks.Club's Slack. Join the #course-mlops-zoomcamp channel.

To make discussions in Slack more organized:

Syllabus

This is a draft and will change.

Module 1: Introduction

  • What is MLOps
  • MLOps maturity model
  • Running example: NY Taxi trips dataset
  • Why do we need MLOps
  • Course overview
  • Environment preparation
  • Homework

More details

Module 2: Experiment Tracking

  • Experiment tracking intro
  • Getting started with MLflow
  • Experiment tracking with MLflow
  • Saving and loading models with MLflow
  • Model registry
  • MLflow in practice
  • Homework

More details

Module 3: Orchestration and ML Pipelines

  • ML Pipelines: introduction
  • Prefect
  • Turning a notebook into a pipeline
  • Kubeflow Pipelines
  • Homework

Module 4: Model Deployment

  • Batch vs online
  • For online: web services vs streaming
  • Serving models in batch mode
  • Web services
  • Streaming (Kinesis/SQS + AWS Lambda)
  • Homework

Module 5: Model Monitoring

  • ML monitoring vs software monitoring
  • Data quality monitoring
  • Data drift / concept drift
  • Batch vs real-time monitoring
  • Tools: Evidently, Prometheus and Grafana
  • Homework

Module 6: Best Practices

  • DevOps
  • Virtual environments and Docker
  • Python: logging, linting
  • Testing: unit, integration, regression
  • CI/CD (GitHub Actions)
  • Infrastructure as code (Terraform, CloudFormation)
  • Cookiecutter
  • Makefiles
  • Homework

Module 7: Processes

  • CRISP-DM, CRISP-ML
  • ML Canvas
  • Data Landscape canvas
  • MLOps Stack Canvas
  • Documentation practices in ML projects (Model Cards Toolkit)

Project

  • End-to-end project with all the things above

Running example

To make it easier to connect different modules together, we’d like to use the same running example throughout the course.

Possible candidates:

Instructors

  • Larysa Visengeriyeva
  • Cristian Martinez
  • Kevin Kho
  • Theofilos Papapanagiotou
  • Alexey Grigorev
  • Emeli Dral
  • Sejal Vaidya

Other courses from DataTalks.Club:

FAQ

I want to start preparing for the course. What can I do?

If you haven't used Flask or Docker

If you have no previous experience with ML

  • Check Module 1 from ML Zoomcamp for an overview
  • Module 3 will also be helpful if you want to learn Scikit-Learn (we'll use it in this course)
  • We'll also use XGBoost. You don't have to know it well, but if you want to learn more about it, refer to module 6 of ML Zoomcamp

I registered but haven't received an invite link. Is this normal?

Yes, we haven't automated it. You'll get an email from us eventually, so don't worry.

If you want to make sure you don't miss anything:

Is it going to be live?

No and yes. There will be two parts:

  • Lectures: pre-recorded, so you can watch them whenever it's convenient for you.
  • Office hours: live on Mondays (17:00 CET) and recorded, so you can watch them later.

Supporters and partners

Thanks to the course sponsors for making it possible to create this course.

Thanks to our friends for spreading the word about the course.