Welcome to the tutorial for Gable! In this tutorial, you will:
- Set up a forked tutorial repository with Gable and Github Actions
- Validate and publish a new data contract
- Attempt to make a breaking change to a data asset under contract
This tutorial utilizes Github Actions to validate & publish data contracts, register data assets, and detect contract violations all using the Gable platform. If you would like to run a similar tutorial using just the CLI, you can check out the CLI tutorial.
The tutorial repository is a small, self contained sample repository that lets you run through the end-to-end process of publishing a data contract and seeing that contract be enforced with Gable's platform. In this tutorial, you'll create a fork of the repository, configure your Gable credentials, and run the CI/CD workflows in Github Actions. If you're using the credentials from your sandbox environment, you can even run through this tutorial by making a personal fork of this repository.
Navigate to the tutorial repository and click on the "Fork" button in the top-right corner of the repository's page.
A dialog may appear, asking you where to fork the repository (if you're part of any organizations). Choose your personal account (or the desired organization) to create the fork.
After clicking the create button, wait a moment as GitHub creates a copy of the repository under your account.
Github is cautious when running workflows for forked repositories so you need to grant explicit permission to run the workflows for the tutorial. Navigate to your forked repository in Github and click on the Actions tab. You'll see a message like the following:
Click on the "I understand my workflows, go ahead and enable them" button. This will give Github permission to run the CI/CD checks that come bundled with the tutorial.
In order to connect the Github Actions to Gable, you need:
- The API endpoint associated with your organization
- An API key that corresponds to the endpoint
You can find your API endpoint and key by navigating to the Settings -> API Keys
page of Gable. Under API Keys you can click View
to reveal your API key.
Next, you will need to create the GABLE_API_KEY
and GABLE_API_ENDPOINT
secrets in your repository settings to configure Github Actions to talk to Gable.
Follow the Github instructions for Creating encrypted secrets for a repository to create the GABLE_API_KEY
and GABLE_API_ENDPOINT
repository secrets.
The last setup step is to clone the forked repository to your local machine. Navigate to the main page of the forked repository within your GitHub account and click the "Code" button and copy either the HTTPS or SSH url for the repository.
Now open a terminal and run:
git clone <COPIED_REPO_URL> --single-branch
cd <FORKED_REPO_NAME>
For the tutorial, there are two types of data assets: event schemas and database tables. The event schemas are Protobuf and Avro files stored within the repository. The database tables are generated by migration files which apply changes to a test database in Github Actions.
In order to kick off the first workflow to register the data assets, you'll need to push an empty commit to the main
branch of the repository. Run:
git commit --allow-empty -m "Run workflow"
git push
The workflow will start after you have created a Pull Request using the GitHub web interface.
Once the workflow completes, you should be able to see the Orders
contract in the Gable UI.
Congratulations! You've set up your tutorial repository and are ready to try out Gable's platform!
The tutorial repository includes many different data assets for a generic e-commerce company. It includes database tables and event schemas. For this tutorial, we are going to create a contract on the order_details
database table.
You are going to create a new data contract in the tutorial repository. Just like code changes, rather than creating the contract in the main
branch of the repository, it's best practice to create a new branch, and open a Pull Request for the changes. The PR allows others to comment on your changes, and also allows the Github Actions to validate the contract is syntactically correct before pushing changes.
You can either create a new branch using the Github web interface or using the git command line tools. This tutorial will walk through creating the branch using the git command line. In the tutorial repository on your local machine:
-
Open a Terminal Window: Navigate to the directory where your cloned tutorial repository is located
-
Checkout a New Branch: Use the following command to create and switch to a new branch called
first_contract
:git checkout -b first_contract
Great! Now you can start writing the contract!
You are going to create a data contract for the order_details
table of the database, which represents the individual items included in a customer order. Writing a data contract involves creating a YAML file that declares the schema and semantics of the data following the data contract specification.
Contracts are associated with a specific data asset. When you first enabled Github Actions, it created several data assets for database tables and event schemas. You can navigate to the Data Assets
page to view a list of your organization's assets.
You can click on the data asset where you can view its details including the ID Gable uses for the asset. You can click on the clipboard next to the ID to copy it. For this tutorial, we are using the tutorial.public.order_details
data asset.
In the contracts
directory of your local repository, create a file called order_details.yaml
. NOTE: The CLI can upload both .yaml
and .yml
files but the tutorial is set up to only upload .yaml
files. Please ensure that the file extension is .yaml
for the purposes of this tutorial.
Copy and paste the following into the contents of order_details.yaml
:
id: 7b7eabe6-a52a-4e19-8bf3-9015eadbbed0
dataAssetResourceName: postgres://prod.store.com:5432:tutorial.public.order_details
spec-version: 0.1.0
name: OrderDetails
namespace: Tutorial
doc: Details of items for an order. Each row represents the different items that were part of an order including their quantity and price.
owner: [email protected]
schema:
- name: order_id
doc: The identifier of the order the item is associated with
type: int32
- name: product_id
doc: The identifier of the product that was ordered
type: int32
- name: quantity
doc: The quantity of the product in the order
type: int32
- name: total_price
doc: The total price of the item in the order
type: decimal256
- name: vendor_id
doc: The identifier of the vendor that the product was ordered from
type: int32
This contract contains information on what data the contract applies to, who owns the contract, as well as the minimum expected schema for the data from the order_details
table.
Remark: You can generate a Data Contract template for a Data Asset using the Gable UI.
Now that you have created the contract, it is time to commit the change to the repository. To stage and commit your changes, run:
git add .
git commit -m "Add order details data contract"
git push origin first_contract
Before merging your changes back to the main
branch, it is a good idea to create a Pull Request. Creating the Pull Request will serve two purposes:
- Allow others to review your newly-created data contract
- Allow the Github Action to validate the data contract is syntactically correct
To create the Pull Request:
-
Navigate to the main page of your forked tutorial repository in Github
-
Click on the "Pull requests" tab near the top of the page
-
Click the "New Pull Request" button
-
In the "base repository" dropdown, select your forked tutorial repository
-
In the "base" dropdown, select the
main
branch -
In the "compare" dropdown, select the
first_contract
branch that contains your new data contract -
In the "Title" field, add
Add order details data contract
-
In the "Leave a comment" field, add the following:
Adds a new data contract for `order_details` table.
-
Click the "Create Pull Request" button.
Once your PR is open, Gable's contract validation CI/CD check runs automatically to check for syntax errors in your data contracts. You can open the Validate Data Contracts
check to see the status, if there were any errors in your contract they would appear here.
Merge the Pull Request you opened. This will kick off the Github Action which publishes the contract.
Once the Github Action on the main
branch has been run and the data contract has been published, navigate to the Gable UI and you should see your new data contract!
NOTE: This section relies on the order_details
data contract created in the previous step. Please complete that step if you have not already.
Now that there is a data contract in place for the order_details
data, every time a change affecting the underlying order_details
table, Gable checks to ensure that changes do not violate the contract. Let's try making a breaking change to the order_details
table.
First off, pull changes down from the main branch to ensure that your repository is up-to-date. Create a new branch called breaking_data_change
:
git checkout main
git pull
git checkout -b breaking_data_change
We are going to add a new migration file for our database to attempt to break the contract.
Create a new file called 7a341d712aa5_order_detail_changes.py
in the ./db_migrations/versions/
folder. Paste the following contents into the new 7a341d712aa5_order_detail_changes.py
file:
"""order_detail changes
Revision ID: 7a341d712aa5
Revises: 78da3a78bfd6
Create Date: 2023-09-29 13:22:35.204135
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = '7a341d712aa5'
down_revision: Union[str, None] = '78da3a78bfd6'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Make the vendor optional
op.alter_column(
'order_details',
'vendor_id',
nullable=True,
)
# Convert existing values to integers (for example, if representing cents)
op.execute('UPDATE order_details SET unit_price = ROUND(unit_price * 100)')
op.execute('UPDATE order_details SET total_price = ROUND(total_price * 100)')
# Alter the column type
op.alter_column('order_details', 'unit_price', type_=sa.Integer)
op.alter_column('order_details', 'total_price', type_=sa.Integer)
def downgrade() -> None:
op.alter_column(
'order_details',
'vendor_id',
nullable=False,
)
# Alter the column type back to Numeric
op.alter_column('order_details', 'unit_price', type_=sa.Numeric(10, 2))
op.alter_column('order_details', 'total_price', type_=sa.Numeric(10, 2))
# Convert integer values back to their decimal representations
op.execute('UPDATE order_details SET unit_price = total_price / 100.0')
op.execute('UPDATE order_details SET total_price = total_price / 100.0')
This migration file does a couple of things. First, it converts the unit_price
and total_price
columns to an integer field and makes the vendor_id
optional.
Both the total_price
and vendor_id
columns are protected in the contract. Changing the type of total_price
and making vendor_id
represent breaking changes.
Once the change has been saved, commit the change and push it to Github:
git add .
git commit -m "Update order details table"
git push origin breaking_data_change
Now open a Pull Request for the proposed breaking change:
-
Navigate to the main page of your forked tutorial repository in Github
-
Click on the "Pull requests" tab near the top of the page
-
Click the "New Pull Request" button
-
In the "base repository" dropdown, select your forked tutorial repository
-
In the "base" dropdown, select the
main
branch -
In the "compare" dropdown, select the
breaking_data_change
branch that contains the breaking data change -
In the "Title" field, add
Update order details table
-
In the "Leave a comment" field, add the following:
Make updates to the order details table
-
Click the "Create Pull Request" button.
The Github Action will validate the changes against existing data contracts. It will detect the change to the order_details
table and post a message in the PR that this change breaks the existing data contract.
Congratulations on creating your first data contract and validating data asset changes! Be sure to check out more of Gable's documentation for more information on our platform!