Fix example #23

Merged 2 commits on Sep 23, 2024
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -55,5 +55,5 @@ jobs:
with:
stack-name: ${{ matrix.stack-name }}
python-version: ${{ matrix.python-version }}
-ref-zenml: ${{ inputs.ref-zenml || 'feature/PRD-566-dependency-cleanup' }}
+ref-zenml: ${{ inputs.ref-zenml || 'develop' }}
ref-template: ${{ inputs.ref-template || github.ref }}
8 changes: 4 additions & 4 deletions template/README.md
@@ -24,7 +24,7 @@ Along the way we will also show you how to:

You can use Google Colab to see ZenML in action, no signup / installation required!

-<a href="https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/quickstart/quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+<a href="https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/mlops_starter/quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## :computer: Run Locally

@@ -36,7 +36,7 @@ pip install "zenml[server]"

# clone the ZenML repository
git clone https://github.com/zenml-io/zenml.git
-cd zenml/examples/quickstart
+cd zenml/examples/mlops_starter
```

Now we're ready to start. You have two options for running the quickstart locally:
@@ -45,13 +45,13 @@ Now we're ready to start. You have two options for running the quickstart locall
```bash
pip install notebook
jupyter notebook
-# open notebooks/quickstart.ipynb
+# open quickstart.ipynb
```

#### Option 2 - Execute the whole ML pipeline from a Python script:
```bash
# Install required zenml integrations
-zenml integration install sklearn -y
+zenml integration install sklearn pandas -y

# Initialize ZenML
zenml init
1 change: 1 addition & 0 deletions template/configs/feature_engineering.yaml
@@ -3,6 +3,7 @@ settings:
docker:
required_integrations:
- sklearn
+- pandas
Tip

Codebase Verification

Inconsistent Inclusion of pandas in required_integrations

The pandas library is already included in multiple configuration files (training_sgd.yaml, training_rf.yaml, inference.yaml). To maintain consistency, consider adding pandas to the required_integrations list in these files as well.

Analysis chain

LGTM: Addition of pandas to required_integrations.

The inclusion of pandas as a required integration is a good addition, as it's commonly used alongside sklearn for data manipulation in feature engineering tasks.

To ensure consistency across the project, let's verify the usage of pandas in other files:

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for pandas usage in Python files and other YAML configs

echo "Checking for pandas imports in Python files:"
rg --type python 'import\s+pandas'

echo "\nChecking for pandas in other YAML config files:"
rg --type yaml 'pandas' --glob '!template/configs/feature_engineering.yaml'

Length of output: 497


Script:

#!/bin/bash
# Description: Check for pandas imports in Python files by explicitly searching .py files

echo "Checking for pandas imports in Python files:"
rg 'import\s+pandas' --glob '*.py'

echo "\nChecking for pandas in other YAML config files:"
rg 'pandas' --type yaml --glob '!template/configs/feature_engineering.yaml'

Length of output: 896
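
The same consistency check can be sketched in Python. This is a minimal sketch, not what the review tooling ran: the helper names `required_integrations` and `missing_integration` are hypothetical, and the line-based scan assumes block-style YAML lists like the ones in the template configs touched by this PR. A real check would more likely use a proper YAML parser.

```python
from typing import Dict, List


def required_integrations(yaml_text: str) -> List[str]:
    """Return the items listed under a `required_integrations:` key.

    Line-based sketch that assumes a block-style list; not a full YAML parser.
    """
    items: List[str] = []
    key_indent = None  # indentation of the `required_integrations:` line
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        indent = len(line) - len(line.lstrip())
        if key_indent is None:
            if stripped == "required_integrations:":
                key_indent = indent
            continue
        # Inside the list: items start with "- "; anything else ends the list.
        if stripped.startswith("- ") and indent >= key_indent:
            items.append(stripped[2:].strip())
        else:
            break
    return items


def missing_integration(configs: Dict[str, str], name: str) -> List[str]:
    """Return the config names whose required_integrations list lacks `name`."""
    return sorted(
        fname
        for fname, text in configs.items()
        if name not in required_integrations(text)
    )


# Assumed layout of a template config after this PR (indentation is a guess).
SAMPLE = """\
settings:
  docker:
    required_integrations:
      - sklearn
      - pandas
    requirements:
      - pyarrow
"""

if __name__ == "__main__":
    print(required_integrations(SAMPLE))
```

Passing a mapping of file names to file contents to `missing_integration(configs, "pandas")` would list any config that still omits the integration.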

requirements:
- pyarrow

1 change: 1 addition & 0 deletions template/configs/inference.yaml
@@ -3,6 +3,7 @@ settings:
docker:
required_integrations:
- sklearn
+- pandas
requirements:
- pyarrow

1 change: 1 addition & 0 deletions template/configs/training_rf.yaml
@@ -3,6 +3,7 @@ settings:
docker:
required_integrations:
- sklearn
+- pandas
requirements:
- pyarrow

1 change: 1 addition & 0 deletions template/configs/training_sgd.yaml
@@ -3,6 +3,7 @@ settings:
docker:
required_integrations:
- sklearn
+- pandas
requirements:
- pyarrow

2 changes: 1 addition & 1 deletion template/quickstart.ipynb
@@ -31,7 +31,7 @@
"required!\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](\n",
-"https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/quickstart/quickstart.ipynb)"
+"https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/mlops_starter/quickstart.ipynb)"
]
},
{
1 change: 1 addition & 0 deletions template/requirements.txt
@@ -2,3 +2,4 @@ zenml[server]>=0.50.0
notebook
scikit-learn
pyarrow
+pandas
Approve pandas addition, but consider version constraint

The addition of pandas is a good choice for data manipulation tasks and aligns well with the existing data-related packages. However, to ensure long-term stability and reproducibility, consider adding a version constraint.

Consider updating the line to include a version constraint:

-pandas
+pandas>=1.3.0

Replace 1.3.0 with the minimum version that works for your project.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-pandas
+pandas>=1.3.0
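
To see which other entries in `requirements.txt` lack a version constraint, something like the following could be used. It is a minimal sketch under stated assumptions: the function name `unpinned_requirements` is hypothetical, the regular expression only recognises plain package names with optional extras, and the sample text is the file content as it appears in this diff.

```python
import re
from typing import List

# A bare requirement: a package name, optional extras, and nothing else.
_BARE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?$")


def unpinned_requirements(requirements_text: str) -> List[str]:
    """Return requirement lines that carry no version specifier."""
    unpinned: List[str] = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and _BARE.match(line):
            unpinned.append(line)
    return unpinned


# Contents of template/requirements.txt as shown in the diff above.
REQUIREMENTS = """\
zenml[server]>=0.50.0
notebook
scikit-learn
pyarrow
pandas
"""

if __name__ == "__main__":
    for name in unpinned_requirements(REQUIREMENTS):
        print(f"no version constraint: {name}")
```

On the file above this would flag every entry except `zenml[server]>=0.50.0`, so the same reasoning about pinning applies to more than just `pandas`.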
