Merge pull request #65 from GoogleCloudPlatform/feature/github-default
Feature/GitHub default
srastatter authored Nov 7, 2024
2 parents 4f04d3f + 7ae461a commit fca32f5
Showing 27 changed files with 421 additions and 110 deletions.
Binary file modified AutoMLOps_User_Guide.pdf
Binary file not shown.
22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,28 @@
# Change Log
All notable changes to this project will be documented in this file.

## [1.3.1] - 2024-10-30

### Added

- Guide explaining how to set up Workload Identity Federation for AutoMLOps
- Created new docs/ directory
- Check that prohibits using Github and Cloud Build (this is currently not supported)

### Changed

- Made `github` and `github-actions` the default options for `source_repo_type` and `deployment_framework` respectively
- Updated example notebooks so that they are set to `use_ci=False` by default
- Provided samples in the example notebooks that demonstrate how to use the `github` and `github_actions` setup
- Added a guide for setting up Workload Identity Federation
- Updated github actions op versions (checkout and auth)

### Fixed

- Fixed broken default of now deprecated Cloud Source Repositories
- Updated sklearn version of training example00 to use v1.3
- Removed `build_trigger_location` from resources manifest output

## [1.3.0] - 2024-5-22

### Added
37 changes: 12 additions & 25 deletions README.md
@@ -64,8 +64,8 @@ Inferencing
- Artifact Registry

**Deployment Frameworks**: Builds component docker containers, compiles pipelines, and submits Pipeline Jobs
- Cloud Build
- Github Actions
- Cloud Build
- [coming soon] Gitlab CI
- [coming soon] Bitbucket Pipelines
- [coming soon] Jenkins
@@ -87,10 +87,10 @@ Inferencing
- [coming soon] pulumi

**Source Code Repositories**: Repository for versioning generated MLOps code
- [deprecating soon] Cloud Source Repositories
- Bitbucket
- Github
- Bitbucket
- Gitlab
- [deprecated] Cloud Source Repositories

# Prerequisites
### Generate
@@ -122,11 +122,6 @@ In order to use `AutoMLOps.deploy(...)` with `use_ci=True`, the following are re
git config --global user.name "Your Name"
```
- Registered and setup your SSH key if you are using Github, Gitlab, or Bitbucket
- [Application Default Credentials (ADC)](https://cloud.google.com/docs/authentication/provide-credentials-adc) are set up if you are using Cloud Source Repositories. This can be done through the following commands:
```
gcloud auth application-default login
gcloud config set account <[email protected]>
```

### Monitor
In order to use `AutoMLOps.monitor(...)`, the following are required:
@@ -170,10 +165,7 @@ AutoMLOps will makes use of the following products based on user selected option
6. if `use_ci=True` and `schedule_pattern` is specified, AutoMLOps will use:
- [Cloud Scheduler](https://cloud.google.com/scheduler/docs/overview)

7. if `use_ci=True` and `source_repo_type='cloud-source-repositories'`, AutoMLOps will use:
- [Cloud Source Repositories](https://cloud.google.com/source-repositories/docs)

8. if `use_ci=True` and `setup_model_monitoring=True`, AutoMLOps will use:
7. if `use_ci=True` and `setup_model_monitoring=True`, AutoMLOps will use:
- [Vertex AI Model Monitoring](https://cloud.google.com/vertex-ai/docs/model-monitoring/overview)
- [Cloud Logging](https://cloud.google.com/logging/docs/overview)

@@ -193,7 +185,6 @@ Based on the above user selection, AutoMLOps will enable up to the following API
- [pubsub.googleapis.com](https://cloud.google.com/pubsub/docs/reference/rest)
- [run.googleapis.com](https://cloud.google.com/run/docs/reference/rest)
- [storage.googleapis.com](https://cloud.google.com/storage/docs/apis)
- [sourcerepo.googleapis.com](https://cloud.google.com/source-repositories/docs/reference/rest)


AutoMLOps will create the following service account and update [IAM permissions](https://cloud.google.com/iam/docs/understanding-roles) during the provision step:
@@ -232,7 +223,7 @@ Optional parameters (defaults shown):
5. `build_trigger_location: str = 'us-central1'`
6. `build_trigger_name: str = f'{naming_prefix}-build-trigger'`
7. `custom_training_job_specs: list[dict] = None`
8. `deployment_framework: str = 'cloud-build'`
8. `deployment_framework: str = 'github-actions'`
9. `naming_prefix: str = 'automlops-default-prefix'`
10. `orchestration_framework: str = 'kfp'`
11. `pipeline_job_runner_service_account: str = f'vertex-pipelines@{project_id}.iam.gserviceaccount.com'`
@@ -249,7 +240,7 @@ Optional parameters (defaults shown):
22. `setup_model_monitoring: Optional[bool] = False`
23. `source_repo_branch: str = 'automlops'`
24. `source_repo_name: str = f'{naming_prefix}-repository'`
25. `source_repo_type: str = 'cloud-source-repositories'`
25. `source_repo_type: str = 'github'`
26. `storage_bucket_location: str = 'us-central1'`
27. `storage_bucket_name: str = f'{project_id}-{naming_prefix}-bucket'`
28. `use_ci: bool = False`
@@ -262,8 +253,8 @@ Parameter Options:
- `artifact_repo_type=`:
- 'artifact-registry' (default)
- `deployment_framework=`:
- 'cloud-build' (default)
- 'github-actions'
- 'github-actions' (default)
- 'cloud-build'
- [coming soon] 'gitlab-ci'
- [coming soon] 'bitbucket-pipelines'
- [coming soon] 'jenkins'
@@ -281,8 +272,7 @@ Parameter Options:
- 'terraform'
- [coming soon] 'pulumi'
- `source_repo_type=`:
- 'cloud-source-repositories' (default)
- 'github'
- 'github' (default)
- 'gitlab'
- 'bitbucket'

@@ -324,7 +314,7 @@ A description of the parameters is below:
- `workload_identity_provider`: Provider for workload identity federation.
- `workload_identity_service_account`: Service account for workload identity federation (specify the full string).

AutoMLOps will generate the resources specified by these parameters (e.g. Artifact Registry, Cloud Source Repo, etc.). If use_ci is set to True, AutoMLOps will turn the outputted AutoMLOps/ directory into a Git repo and use it for the source repo. If a cron formatted str is given as an arg for `schedule_pattern` then it will set up a Cloud Schedule to run accordingly. If `setup_model_monitoring` is set to true, a model_monitoring/ directory will be created and a monitoring section will be added to config/defaults.yaml with empty values. These values are then set by running `AutoMLOps.monitor()`.
AutoMLOps will generate the resources specified by these parameters (e.g. Artifact Registry, GCS bucket, etc.). If use_ci is set to True, AutoMLOps will turn the outputted AutoMLOps/ directory into a Git repo and use it for the source repo. If a cron formatted str is given as an arg for `schedule_pattern` then it will set up a Cloud Schedule to run accordingly. If `setup_model_monitoring` is set to true, a model_monitoring/ directory will be created and a monitoring section will be added to config/defaults.yaml with empty values. These values are then set by running `AutoMLOps.monitor()`.

# Generating Code

@@ -385,11 +375,6 @@ AutoMLOps currently provides 2 primary options for provisioning infrastructure:
### Cloud Continuous Integration and Continuous Deployment Workflow
If `use_ci=True`, AutoMLOps will generate and use a fully featured CI/CD environment for the pipeline. Otherwise, it will use the local scripts to build and run the pipeline. In the diagrams below dashed boxes show areas users can select and customize their tooling.

**<center>Cloud Build option:</center>**
<p align="center">
<img src="https://raw.githubusercontent.com/GoogleCloudPlatform/automlops/main/assets/deploy/CICD-default.png" alt="CICD" width="1000"/>
</p>

**<center>Github Actions option:</center>**
<p align="center">
<img src="https://raw.githubusercontent.com/GoogleCloudPlatform/automlops/main/assets/deploy/CICD-github.png" alt="CICD" width="1000"/>
@@ -478,6 +463,8 @@ AutoMLOps.generate(project_id=PROJECT_ID,
workload_identity_service_account='workload_identity_sa')
```

More specific details for setting up AutoMLOps to use Github and Github Actions can be found in [this doc](docs/Using%20Github%20With%20AMO.md).

**Set scheduled run:**

Use the `schedule_pattern` parameter to specify a cron job schedule to run the pipeline job on a recurring basis.
Binary file added assets/deploy/github_actions/image1.png
Binary file added assets/deploy/github_actions/image2.png
Binary file added assets/deploy/github_actions/image3.png
Binary file added assets/deploy/github_actions/image4.png
Binary file added assets/deploy/github_actions/image5.png
Binary file added assets/deploy/github_actions/image6.png
149 changes: 149 additions & 0 deletions docs/Using Github With AMO.md
@@ -0,0 +1,149 @@
**Using Github With AMO**

The steps below are required to deploy a model using the AutoMLOps package with Github and Github Actions. They assume that the [AutoMLOps](https://github.com/GoogleCloudPlatform/automlops) package is already installed in the environment where the AutoMLOps execution cells are run.

## **Create a Repository**

Go to Github, and create a new repository leaving fields as default. Let’s call it automlops-test.

<p align="left">
<img src="../assets/deploy/github_actions/image1.png" alt="image1" width="600"/>
</p>

\[Optional\] Create repository via Terraform

```
resource "github_repository" "my_repo" {
  name        = "my_cool_repo"
  description = "My awesome codebase"
  visibility  = "private"
}
```

## **Set up SSH connection**

Follow [this guide](https://docs.github.com/en/authentication/connecting-to-github-with-ssh) on how to add your SSH keys to the environment from where you are running the AutoMLOps commands.

## **Pre-requisites**

Ensure that you have the following IAM Roles to provision an Identity Pool:

* roles/iam.workloadIdentityPoolAdmin

## **Set up [Workload Identity Federation](https://cloud.google.com/iam/docs/workload-identity-federation) (WIF) for GitHub Actions**

1. Create an Identity Pool in GCP
<p align="left">
<img src="../assets/deploy/github_actions/image2.png" alt="image2" width="600"/>
</p>
   - Enter a pool name and click ‘Continue’. Example details:
     - Name: automlops-1-github-pool
     - Pool ID: automlops-1-github-pool
     - \<check Enabled pool\>

2. Create an Identity Provider and add it to the pool:

<p align="left">
<img src="../assets/deploy/github_actions/image3.png" alt="image3" width="600"/>
</p>

<p align="left">
<img src="../assets/deploy/github_actions/image4.png" alt="image4" width="600"/>
</p>

   - Fill in the attributes and conditions for the provider with the following details:
     - Provider type: OpenID Connect (OIDC)
     - Provider Name: automlops-github-provider
     - Provider ID: automlops-github-provider
     - Issuer (URL): https://token.actions.githubusercontent.com/
     - JWK file (JSON): \<leave empty\>
     - Audiences: \<select Default audience\>
     - Attribute Mappings:
       - Google 1: google.subject
       - OIDC 1: assertion.sub
       - Google 2: attribute.actor
       - OIDC 2: assertion.actor
       - Google 3: attribute.repository
       - OIDC 3: assertion.repository
     - Attribute Conditions: google.subject.contains("\<my\_repo\_name\>")
       - e.g. google.subject.contains("Akhan221/automlops-test")


3. Create a service account
Go to the service accounts tab and click on new service account in the top right. Provide a unique name for the service account (SA).

<p align="left">
<img src="../assets/deploy/github_actions/image5.png" alt="image5" width="600"/>
</p>

\[Optional\] Create the Service Account using Terraform



```
resource "google_service_account" "service_account" {
  account_id   = "service-account-id"
  display_name = "Service Account"
}
```

4. Grant the necessary roles to the service account:
   1. *roles/artifactregistry.writer*
   2. *roles/pubsub.publisher*
   3. *roles/iam.workloadIdentityUser*
   4. *roles/storage.admin*

<p align="left">
<img src="../assets/deploy/github_actions/image6.png" alt="image6" width="600"/>
</p>

5. Grant identity pool access to the SA, following the instructions at [this link](https://cloud.google.com/iam/docs/workload-download-cred-and-grant-access#service-account-in-the-same-project).
   1. Option 1
      1. Select “+ Grant Access” at the top of the Identity Pool view
      2. Check “Grant access using Service Account impersonation”
      3. Select the service account you just created
      4. For the attribute name, select “repository”
      5. For the attribute value, write the full name of your repo, e.g. “Akhan221/automlops-test”
      6. Click Save
   2. Option 2
      1. Navigate to IAM \-\> Service Accounts and select the desired service account from the list
      2. Select “Permissions” from the top menu
      3. Select “+ Grant Access”
      4. Add the principal that aligns to the Identity Pool
         1. principalSet://iam.googleapis.com/projects/PROJECT\_NUMBER/locations/global/workloadIdentityPools/POOL\_ID/attribute.repository/FULL\_REPO\_STRING
   3. Option 3 (grant full access to the pool)
      1. Run the following command:

gcloud iam service-accounts add-iam-policy-binding "FULL\_SERVICE\_ACCOUNT\_STRING" \\

\--project="PROJECT\_ID" \\

\--role="roles/iam.workloadIdentityUser" \\

\--member=principalSet://iam.googleapis.com/projects/PROJECT\_NUMBER/locations/global/workloadIdentityPools/POOL\_ID/\*
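
The attribute mappings, the attribute condition, and the `principalSet` strings above can be sketched in plain Python. This is only an illustration of how the pieces relate, not a Google Cloud or GitHub API: the helper names (`map_claims`, `condition_holds`, `principal_set`) and the example claim values are assumptions made for this sketch, and the real evaluation happens inside Workload Identity Federation.

```python
from typing import Dict, Optional

# Mirrors the attribute mappings listed above (Google attribute -> OIDC claim).
ATTRIBUTE_MAPPING: Dict[str, str] = {
    "google.subject": "sub",
    "attribute.actor": "actor",
    "attribute.repository": "repository",
}


def map_claims(claims: Dict[str, str]) -> Dict[str, str]:
    """Apply the mapping table to a dict of OIDC token claims."""
    return {target: claims[source] for target, source in ATTRIBUTE_MAPPING.items()}


def condition_holds(mapped: Dict[str, str], repo_full_name: str) -> bool:
    """Plain-Python stand-in for google.subject.contains("<repo>")."""
    return repo_full_name in mapped["google.subject"]


def principal_set(project_number: str, pool_id: str,
                  repo_full_name: Optional[str] = None) -> str:
    """Build the member string for Option 2 (one repo) or Option 3 (whole pool)."""
    base = (
        f"principalSet://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
    )
    if repo_full_name:
        return f"{base}/attribute.repository/{repo_full_name}"
    return f"{base}/*"


# Hypothetical claims for a workflow run on the main branch of the example repo.
example_claims = {
    "sub": "repo:Akhan221/automlops-test:ref:refs/heads/main",
    "actor": "Akhan221",
    "repository": "Akhan221/automlops-test",
}
mapped = map_claims(example_claims)
print(condition_holds(mapped, "Akhan221/automlops-test"))
print(principal_set("99800000432", "automlops-1-github-pool",
                    mapped["attribute.repository"]))
```

Scoping the member to `attribute.repository` (Options 1 and 2) limits impersonation to workflows from one repository, whereas the `/*` form in Option 3 admits every identity in the pool.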

## **Run the Generate Step with the WIF Details**

When running the generate step in an AutoMLOps example notebook, include these fields along with the other required parameters:

```py
AutoMLOps.generate(
    # ... other required parameters (e.g. project_id, pipeline_params)
    use_ci=True,
    deployment_framework='github-actions',
    project_number='<project_number>',  # e.g. 99800000432
    source_repo_type='github',
    source_repo_name='source/repo/string',  # e.g. Akhan221/automlops-test
    workload_identity_pool='identity_pool_string',  # ID string
    workload_identity_provider='identity_provider_string',  # ID string
    workload_identity_service_account='workload_identity_sa',  # full SA string
)
```

Note that `source_repo_name` is the full name of the repo, including the owner prefix. For example, Akhan221/my-test-repo.
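
As a rough pre-flight check, the Github-related arguments can be validated before they are passed to `AutoMLOps.generate(...)`. The helper below is hypothetical and not part of AutoMLOps; the repo-name pattern is a simplifying assumption about what an `<owner>/<repo>` string looks like, not Github's exact naming rules.

```python
import re
from typing import Dict, List

# Assumed shape of a full repo name: "<owner>/<repo>" with no extra slashes.
REPO_FULL_NAME = re.compile(r"[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+")

# Parameters the guide above asks for when using Github Actions with WIF.
WIF_FIELDS = (
    "project_number",
    "workload_identity_pool",
    "workload_identity_provider",
    "workload_identity_service_account",
)


def check_github_params(params: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    if not REPO_FULL_NAME.fullmatch(params.get("source_repo_name", "")):
        problems.append("source_repo_name should look like '<owner>/<repo>'")
    for field in WIF_FIELDS:
        if not params.get(field):
            problems.append(f"{field} is missing")
    return problems


# Example: the owner prefix was forgotten and no WIF details were supplied.
print(check_github_params({"source_repo_name": "automlops-test"}))
```

Returning a list rather than raising on the first problem lets a notebook user see every missing field in one pass.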

## **Run the provision and deploy steps**

Finish running the remainder of the commands and verify that the pipeline actually gets created in Vertex AI.
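
One way to check that the pipeline was created is to list recent pipeline jobs with the Vertex AI SDK instead of using the console. The sketch below assumes the `google-cloud-aiplatform` package is installed and Application Default Credentials are available; the region and the name filter are placeholders chosen for illustration, and `summarize_jobs` is a hypothetical helper, not an AutoMLOps function.

```python
from types import SimpleNamespace
from typing import Iterable, List


def list_recent_pipeline_jobs(project_id: str, location: str):
    """Fetch pipeline jobs, newest first (needs google-cloud-aiplatform and credentials)."""
    # Imported lazily so the summarizing helper below can be tried offline.
    from google.cloud import aiplatform

    aiplatform.init(project=project_id, location=location)
    return aiplatform.PipelineJob.list(order_by="create_time desc")


def summarize_jobs(jobs: Iterable, name_contains: str = "") -> List[str]:
    """Return 'display_name: state' lines for jobs whose name contains the given text."""
    return [
        f"{job.display_name}: {job.state}"
        for job in jobs
        if name_contains in job.display_name
    ]


if __name__ == "__main__":
    # Offline demo with a stand-in object; replace with
    # list_recent_pipeline_jobs("<project_id>", "<region>") in a real project.
    fake_jobs = [SimpleNamespace(display_name="automlops-test-pipeline",
                                 state="PIPELINE_STATE_SUCCEEDED")]
    for line in summarize_jobs(fake_jobs, name_contains="automlops"):
        print(line)
```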
27 changes: 21 additions & 6 deletions examples/inferencing/00_batch_prediction_example.ipynb
@@ -107,7 +107,6 @@
"- [pubsub.googleapis.com](https://cloud.google.com/pubsub/docs/reference/rest)\n",
"- [run.googleapis.com](https://cloud.google.com/run/docs/reference/rest)\n",
"- [storage.googleapis.com](https://cloud.google.com/storage/docs/apis)\n",
"- [sourcerepo.googleapis.com](https://cloud.google.com/source-repositories/docs/reference/rest)\n",
"\n",
"\n",
"AutoMLOps will create the following service account and update [IAM permissions](https://cloud.google.com/iam/docs/understanding-roles) during the provision step:\n",
@@ -130,7 +129,6 @@
"- Vertex AI\n",
"- Artifact Registry\n",
"- Cloud Storage\n",
"- Cloud Source Repository\n",
"- Cloud Build\n",
"- Cloud Run\n",
"- Cloud Scheduler\n",
@@ -562,7 +560,7 @@
"metadata": {},
"source": [
"## Generate and Run the pipeline\n",
"`AutoMLOps.generate(...)` generates the MLOps codebase. Users can specify the tooling and technologies they would like to use in their MLOps pipeline."
"`AutoMLOps.generate(...)` generates the MLOps codebase. Users can specify the tooling and technologies they would like to use in their MLOps pipeline. If you are interested in integrating with Github and Github Actions, please follow the setup steps in [this doc](../../docs/Using%20Github%20With%20AMO.md) and uncomment the relevant code block below."
]
},
{
@@ -588,12 +586,29 @@
}
],
"source": [
"# Setup using local scripts and cloudbuild:\n",
"AutoMLOps.generate(project_id=PROJECT_ID,\n",
" pipeline_params=pipeline_params,\n",
" use_ci=True,\n",
" use_ci=False,\n",
" naming_prefix=MODEL_ID,\n",
" schedule_pattern='59 11 * * 0' # rerun every Sunday at Midnight\n",
")"
" deployment_framework='cloud-build',\n",
")\n",
"\n",
"# # Setup using Github, Github Actions, and Terraform:\n",
"# AutoMLOps.generate(project_id=PROJECT_ID,\n",
"# pipeline_params=pipeline_params,\n",
"# naming_prefix=MODEL_ID,\n",
"# schedule_pattern='59 11 * * 0', # retrain every Sunday at 11:59\n",
"# use_ci=True,\n",
"# deployment_framework='github-actions',\n",
"# provisioning_framework='terraform', \n",
"# source_repo_type='github',\n",
"# project_number='<project_number>',\n",
"# source_repo_name='<source/repo/string>',\n",
"# workload_identity_pool='<identity_pool_string>',\n",
"# workload_identity_provider='<identity_provider_string>',\n",
"# workload_identity_service_account='<workload_identity_sa>'\n",
"# )"
]
},
{