Upgraded the ml-metadata version and dvc version to the latest (#178)
* Upgraded the ml-metadata version and dvc version to the latest

* Upgraded the ml-metadata version and dvc version to the latest

* fixing merge issues

* Doc updates
annmary-roy committed May 31, 2024
1 parent e852960 commit b9c8538
Showing 11 changed files with 21 additions and 22 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/deploy_docs_to_gh_pages.yaml
@@ -28,7 +28,7 @@ jobs:
- name: Setup Python Environment
uses: actions/setup-python@v3
with:
-python-version: '3.8'
+python-version: '3.10'

- name: Install Python Dependencies
run: |
2 changes: 1 addition & 1 deletion docs/cmf_client/Getting Started with cmf.md
@@ -13,7 +13,7 @@ Follow the below-mentioned steps for the end-to-end setup of cmf-client:-

**Pre-Requisites**

-- Python 3.8+
+- Python 3.9+
- Git latest version

**Install cmf library i.e. cmflib**
4 changes: 2 additions & 2 deletions docs/examples/getting_started.md
@@ -4,7 +4,7 @@
> [anaconda](https://docs.anaconda.com/anaconda/install/linux/) to manage python virtual environments.
> This example was tested in the following environments:
>
-> - `Ubuntu-22.04 with python-3.8.15`
+> - `Ubuntu-22.04 with python-3.10`
This example demonstrates how CMF tracks a metadata associated with executions of various machine learning (ML)
pipelines. ML pipelines differ from other pipelines (e.g., data Extract-Transform-Load pipelines) by the presence of
@@ -43,7 +43,7 @@ mkdir cmf_getting_started_example
cd cmf_getting_started_example

# Create and activate Python virtual environment (the Python version may need to be adjusted depending on your system)
-conda create -n cmf_getting_started_example python=3.8
+conda create -n cmf_getting_started_example python=3.10
conda activate cmf_getting_started_example

# Clone the CMF project from GitHub and install CMF
9 changes: 4 additions & 5 deletions docs/index.md
@@ -7,20 +7,20 @@ models and performance metrics) recorded by the framework are versioned and iden
## Installation

#### 1. Pre-Requisites:
-* 3.8>= Python <=3.9
+* 3.9>= Python <=3.11
* Git latest version

#### 2. Set up Python Virtual Environment:

=== "Using Conda"
```shell
-conda create -n cmf python=3.8
+conda create -n cmf python=3.10
conda activate cmf
```

=== "Using VirtualEnv"
```shell
-virtualenv --python=3.8 .cmf
+virtualenv --python=3.10 .cmf
source .cmf/bin/activate
```

@@ -40,8 +40,6 @@ models and performance metrics) recorded by the framework are versioned and iden
After installing CMF, proceed to configure CMF server and client. For detailed configuration instructions, refer to the [Quick start with cmf-client](./cmf_client/step-by-step.md) page.


-### [Jupyter Lab docker container with CMF pre-installed](#docker-section)
-
## Introduction
Complex ML projects rely on `ML pipelines` to train and test ML models. An ML pipeline is a sequence of stages where
each stage performs a particular task, such as data loading, pre-processing, ML model training and testing stages.
@@ -247,6 +245,7 @@ cmf = cmf.Cmf(
)
```

+### [Jupyter Lab docker container with CMF pre-installed](#docker-section)
## <a name="docker-section"></a> Use a Jupyterlab Docker environment with CMF pre-installed
CMF has a docker-compose file which creates two docker containers,
- JupyterLab Notebook Environment with CMF pre installed.
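The pre-requisites in docs/index.md now document a 3.9–3.11 Python window. For illustration only, a small hypothetical guard along those lines (not part of the changed files) could be:

```python
import sys

# Hypothetical guard reflecting the documented support window (Python 3.9-3.11).
if not ((3, 9) <= sys.version_info[:2] <= (3, 11)):
    raise SystemExit(
        f"Python {sys.version.split()[0]} detected; cmflib 0.0.9 targets Python 3.9-3.11."
    )
print("Python version is within the supported range.")
```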
2 changes: 1 addition & 1 deletion examples/example-get-started/src/featurize.py
@@ -74,7 +74,7 @@ def featurize(input_dir: str, output_dir: str) -> None:
output_ds = Dataset(train=os.path.join(output_dir, "train.pkl"), test=os.path.join(output_dir, "test.pkl"))
graph_env = os.getenv("NEO4J", "False")
graph = True if graph_env == "True" or graph_env == "TRUE" else False
-metawriter = cmf.Cmf(filename="mlmd", pipeline_name="Test-env", graph=graph)
+metawriter = cmf.Cmf(filepath="mlmd", pipeline_name="Test-env", graph=graph)

_ = metawriter.create_context(pipeline_stage="Featurize")
_ = metawriter.create_execution(execution_type="Featurize-execution", custom_properties=params)
2 changes: 1 addition & 1 deletion examples/example-get-started/src/parse.py
@@ -61,7 +61,7 @@ def parse(input_file: str, output_dir: str) -> None:
random.seed(params["seed"])
graph_env = os.getenv("NEO4J", "False")
graph = True if graph_env == "True" or graph_env == "TRUE" else False
-metawriter = cmf.Cmf(filename="mlmd", pipeline_name="Test-env", graph=graph)
+metawriter = cmf.Cmf(filepath="mlmd", pipeline_name="Test-env", graph=graph)
_ = metawriter.create_context(pipeline_stage="Prepare", custom_properties={"user-metadata1": "metadata_value"})
_ = metawriter.create_execution(execution_type="Prepare", custom_properties=params)
_ = metawriter.log_dataset(input_file, "input", custom_properties={"user-metadata1": "metadata_value"})
2 changes: 1 addition & 1 deletion examples/example-get-started/src/test.py
@@ -48,7 +48,7 @@ def test(model_dir: str, dataset_dir: str, output_dir: str) -> None:
)
graph_env = os.getenv("NEO4J", "False")
graph = True if graph_env == "True" or graph_env == "TRUE" else False
-metawriter = cmf.Cmf(filename="mlmd", pipeline_name="Test-env", graph=graph)
+metawriter = cmf.Cmf(filepath="mlmd", pipeline_name="Test-env", graph=graph)
_ = metawriter.create_context(pipeline_stage="Evaluate")
_ = metawriter.create_execution(execution_type="Evaluate-execution")

2 changes: 1 addition & 1 deletion examples/example-get-started/src/train.py
@@ -39,7 +39,7 @@ def train(input_dir: str, output_dir: str) -> None:
params = yaml.safe_load(open("params.yaml"))["train"]
graph_env = os.getenv("NEO4J", "False")
graph = True if graph_env == "True" or graph_env == "TRUE" else False
-metawriter = cmf.Cmf(filename="mlmd", pipeline_name="Test-env", graph=graph)
+metawriter = cmf.Cmf(filepath="mlmd", pipeline_name="Test-env", graph=graph)
_ = metawriter.create_context(pipeline_stage="Train")
_ = metawriter.create_execution(execution_type="Train-execution", custom_properties=params)

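Across the four example stage scripts above, the only change is the keyword passed to the Cmf constructor: the metadata file is now given as `filepath` rather than `filename`. A minimal sketch of the resulting call pattern (assuming the `from cmflib import cmf` import used by these example scripts) looks like this:

```python
import os
from cmflib import cmf  # assumed import path for the Cmf metadata writer

# Same NEO4J toggle as the example stages: graph lineage only when explicitly enabled.
graph = os.getenv("NEO4J", "False") in ("True", "TRUE")

# After this change the metadata store is passed via `filepath` (previously `filename`).
metawriter = cmf.Cmf(filepath="mlmd", pipeline_name="Test-env", graph=graph)
_ = metawriter.create_context(pipeline_stage="Featurize")
_ = metawriter.create_execution(execution_type="Featurize-execution")
```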
8 changes: 4 additions & 4 deletions pyproject.toml
@@ -1,9 +1,9 @@
[project]
name = "cmflib"
version = "0.0.8"
version = "0.0.9"
dependencies = [
"ml-metadata==1.11.0",
"dvc[ssh,s3]==2.27.0",
"ml-metadata==1.15.0",
"dvc[ssh,s3]==3.51.1",
"pandas",
"retrying",
"pyarrow",
@@ -19,7 +19,7 @@ authors = [
]
description = "Track metadata for AI pipeline"
readme = "README.md"
requires-python = ">=3.8,<3.10"
requires-python = ">=3.9,<=3.11"
classifiers = [
"Programming Language :: Python :: 3",
"Operating System :: POSIX :: Linux",
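A quick, optional way to confirm an environment picked up the new pins from pyproject.toml (hypothetical helper, not included in the commit) is to query the installed package versions:

```python
from importlib import metadata

# Expected versions taken from the updated pyproject.toml.
EXPECTED = {"cmflib": "0.0.9", "ml-metadata": "1.15.0", "dvc": "3.51.1"}

for package, expected in EXPECTED.items():
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        print(f"{package}: not installed (expected {expected})")
    else:
        status = "OK" if installed == expected else "MISMATCH"
        print(f"{package}: installed {installed}, expected {expected} [{status}]")
```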
4 changes: 2 additions & 2 deletions server/Dockerfile
@@ -14,8 +14,8 @@
# limitations under the License.
###

-# Use python 3.8-slim-buster as base image.
-FROM python:3.8-slim-buster
+# Use python 3.10-slim-buster as base image.
+FROM python:3.10-slim-buster

#Update the proxy if needed
#ENV http_proxy http://web-proxy.corp.hpecorp.net:8080
6 changes: 3 additions & 3 deletions setup.py
@@ -1,6 +1,6 @@
from setuptools import setup, find_packages

-VERSION = '0.0.8'
+VERSION = '0.0.9'
DESCRIPTION = 'Metadata Python Package'
LONG_DESCRIPTION = 'Metadata framework storing AI metadata into MLMD'

@@ -13,8 +13,8 @@
description=DESCRIPTION,
long_description=LONG_DESCRIPTION,
packages=find_packages(),
install_requires=["ml-metadata==1.11.0",
"dvc[ssh,s3]==2.27.0", "pandas", "retrying", "pyarrow", "neo4j", \
install_requires=["ml-metadata==1.15.0",
"dvc[ssh,s3]==3.51.1", "pandas", "retrying", "pyarrow", "neo4j", \
"scikit-learn", "tabulate", "click", "minio", "paramiko"], # add any additional packages that
# needs to be installed along with your package. Eg: 'caer'
