Merge pull request #1 from Snowflake-Labs/jhilgart/initial-commit-add-semantic-model-generator

Initial commit to push semantic model generator
sfc-gh-jhilgart authored Apr 9, 2024
2 parents 3f74b8d + 4f0c976 commit dfaf0bd
Showing 26 changed files with 3,456 additions and 1 deletion.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
* @sfc-gh-nsehrawat @sfc-gh-rehuang @sfc-gh-nlimtiaco @sfc-gh-jhilgart
59 changes: 59 additions & 0 deletions .github/workflows/lint.yaml
@@ -0,0 +1,59 @@
name: Semantic Model Format & Lint

on:
  pull_request:
    paths:
      - "semantic_model_generator/**"

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]

    steps:
      - name: Check out the code
        uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      # Caching dependencies using Poetry
      - name: Cache Poetry virtualenv
        uses: actions/cache@v2
        with:
          path: ~/.cache/pypoetry/virtualenvs
          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
          restore-keys: |
            ${{ runner.os }}-poetry-
      - name: Install Poetry
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
      - name: Configure Poetry
        run: |
          $HOME/.local/bin/poetry config virtualenvs.create false
      - name: Install dependencies using Poetry
        run: |
          $HOME/.local/bin/poetry install --no-interaction
      - name: Run mypy
        run: |
          make run_mypy
      - name: Check with black
        run: |
          make check_black
      - name: Check with isort
        run: |
          make check_isort
      - name: Run flake8
        run: |
          make run_flake8
47 changes: 47 additions & 0 deletions .github/workflows/release.yaml
@@ -0,0 +1,47 @@
name: Build and Attach Wheel to GitHub Release

on:
  push:
    branches:
      - 'main'
    paths:
      - '**.py'
      - 'pyproject.toml'
      - 'CHANGELOG.md'

jobs:
  build:
    if: contains(github.ref, 'refs/heads/release/v')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.10'

      - name: Install Poetry
        run: pip install poetry

      - name: Get the version from pyproject.toml
        run: echo "VERSION=$(poetry version -s)" >> $GITHUB_ENV

      - name: Build Wheel
        run: poetry build -f wheel

      - name: Extract Changelog for the Version
        run: |
          VERSION=${{ env.VERSION }}
          CHANGELOG=$(awk '/^## \['"${VERSION//./\\.}"'\]/ {flag=1; next} /^## \[/ {flag=0} flag' CHANGELOG.md)
          echo "CHANGELOG<<EOF" >> $GITHUB_ENV
          echo "$CHANGELOG" >> $GITHUB_ENV
          echo "EOF" >> $GITHUB_ENV
      - name: Upload Wheel to Release
        uses: softprops/action-gh-release@v1
        with:
          files: dist/*.whl
          body: ${{ env.CHANGELOG }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
46 changes: 46 additions & 0 deletions .github/workflows/test.yaml
@@ -0,0 +1,46 @@
name: Semantic Model Generator Test

on:
  pull_request:
    paths:
      - "semantic_model_generator/**"

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]
    steps:
      - name: Check out the code
        uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      # Caching dependencies using Poetry
      - name: Cache Poetry virtualenv
        uses: actions/cache@v2
        with:
          path: ~/.cache/pypoetry/virtualenvs
          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
          restore-keys: |
            ${{ runner.os }}-poetry-
      - name: Install Poetry
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
      - name: Configure Poetry
        run: |
          $HOME/.local/bin/poetry config virtualenvs.create false
      - name: Install dependencies using Poetry
        run: |
          $HOME/.local/bin/poetry install --no-interaction
      - name: Test
        run: |
          make test_github_workflow
37 changes: 37 additions & 0 deletions .gitignore
@@ -0,0 +1,37 @@
# Local python environment
pyvenv

# Notebook intermediate state
*.ipynb_checkpoints

# Mypy
.mypy_cache

# Pytest
.pytest_cache

# pycache
**/__pycache__

# Python package builds
*.egg-info

# VSCode
.vscode/settings.json
.vscode/.ropeproject
.vscode/*.log

# Envs
.env
.venv
.direnv
.envrc

# Output semantic models
semantic_model_generator/output_models/*

# Whls
dist/

# test coverage
.coverage
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.10
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,7 @@
# Changelog

You must follow the format of `## [VERSION-NUMBER]` for the GitHub workflow to pick up the text.

## [1.0.0] - 2024-04-08
### Released
- Initial release of the project.
49 changes: 49 additions & 0 deletions Makefile
@@ -0,0 +1,49 @@



check-deps: ## Check if poetry is installed on your system.
	@command -v poetry >/dev/null 2>&1 || { echo >&2 "Poetry is required but it's not installed. Please install Poetry by following the instructions at: https://python-poetry.org/docs/#installation"; exit 1; }
	@command -v pyenv >/dev/null 2>&1 || { echo >&2 "pyenv is recommended for managing Python versions but it's not installed. Install via 'brew install pyenv'"; exit 1; }
	@echo "Setting Python version to 3.10 using pyenv."
	@pyenv local 3.10

shell: check-deps ## Get into a poetry shell
	poetry shell

setup: check-deps shell ## Install dependencies into your poetry environment.
	poetry install

# Linting and formatting below.
run_mypy: ## Run mypy
	mypy --config-file=mypy.ini .

run_flake8: ## Run flake8
	flake8 --ignore=E203,E501,W503 --exclude=pyvenv,tmp,*_pb2.py,*_pb2.pyi,images/*/src .

check_black: ## Check to see if files would be updated with black.
# Exclude pyvenv and all generated protobuf code.
	black --check --exclude="pyvenv|.*_pb2.py|.*_pb2.pyi" .

run_black: ## Run black to format files.
# Exclude pyvenv, tmp, and all generated protobuf code.
	black --exclude="pyvenv|tmp|.*_pb2.py|.*_pb2.pyi" .

check_isort: ## Check if files would be updated with isort.
	isort --profile black --check --skip=pyvenv --skip-glob='*_pb2.py*' .

run_isort: ## Run isort to update imports.
	isort --profile black --skip=pyvenv --skip=tmp --skip-glob='*_pb2.py*' .


fmt_lint: shell ## lint/fmt in current python environment
	make run_mypy run_black run_isort run_flake8

# Test below
test: shell ## Run tests.
	python -m pytest -vvs semantic_model_generator

test_github_workflow: ## For use on github workflow.
	python -m pytest -vvs semantic_model_generator

help: ## Show this help.
	@fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//' | sed -e 's
63 changes: 62 additions & 1 deletion README.md
@@ -1 +1,62 @@
# copilot-semantic-model-generator
# semantic-model-generator

The `Semantic Model Generator` is used to generate a semantic model for use in your Snowflake account.

## Setup

This project uses `poetry` to manage dependencies, and we recommend `pyenv` for Python version management.

1. `make setup`

Next, export your credentials as environment variables. Note that `SNOWFLAKE_HOST` is optional, depending on your Snowflake deployment.

```bash
export SNOWFLAKE_ROLE="<your-snowflake-role>"
export SNOWFLAKE_WAREHOUSE="<your-snowflake-warehouse>"
export SNOWFLAKE_USER="<your-snowflake-user>"
export SNOWFLAKE_PASSWORD="<your-snowflake-password>"
export SNOWFLAKE_HOST="<your-snowflake-host>"
```
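
If you want to confirm the variables are set before running the generator, a check along these lines may help. This is a minimal sketch and not part of the package: `read_snowflake_env` is a hypothetical helper, and treating `SNOWFLAKE_HOST` as the only optional variable is an assumption based on the note above.

```python
import os
from typing import Dict, Mapping, Optional

# Variable names taken from the export block above.
REQUIRED = (
    "SNOWFLAKE_ROLE",
    "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_USER",
    "SNOWFLAKE_PASSWORD",
)
OPTIONAL = ("SNOWFLAKE_HOST",)  # assumption: only the host may be left unset


def read_snowflake_env(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: collect the credentials, failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings: Dict[str, Optional[str]] = {name: env[name] for name in REQUIRED}
    for name in OPTIONAL:
        # Unset or empty optional values are normalized to None.
        settings[name] = env.get(name) or None
    return settings
```

Calling `read_snowflake_env()` with no arguments reads from the current process environment; passing a dictionary makes the check easy to exercise in tests.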

## Usage

### Generation

You may generate a semantic model for a given list of fully qualified tables following the `{database}.{schema}.{table}` format. Each table in this list should be a physical table or a view present in your database.

By default, all generated semantic models are saved under `semantic_model_generator/output_models`.


```bash
python -m semantic_model_generator.main \
--fqn_tables "['<your-database-name-1>.<your-schema-name-1>.<your-physical-table-or-view-name-1>','<your-database-name-2>.<your-schema-name-2>.<your-physical-table-or-view-name-2>']" \
--semantic_model_name "<a-meaningful-semantic-model-name>" \
--snowflake_account="<your-snowflake-account>"
```
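
To illustrate the `{database}.{schema}.{table}` format, the sketch below splits a fully qualified name into its three parts and rejects anything else. `split_fqn` is a hypothetical helper, not a function from this package, and it assumes plain identifiers (quoted identifiers that contain dots are not handled).

```python
from typing import Tuple


def split_fqn(fqn: str) -> Tuple[str, str, str]:
    """Hypothetical helper: split 'database.schema.table' into its parts."""
    parts = [part.strip() for part in fqn.split(".")]
    # Exactly three non-empty components are expected.
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"Expected '{{database}}.{{schema}}.{{table}}', got: {fqn!r}"
        )
    database, schema, table = parts
    return database, schema, table
```

For example, `split_fqn("MY_DB.MY_SCHEMA.ORDERS")` yields the three components, while `split_fqn("MY_SCHEMA.ORDERS")` raises a `ValueError`.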

### Post-Generation

**Important**: After generation, your YAML files will have a series of lines with `# <FILL-OUT>`. Please take the time to fill these out with your business context.

By default, the generated semantic model will contain all columns from the provided tables/views. However, we strongly encourage you to keep only the relevant columns and drop any unwanted columns from the generated semantic model.

In addition, consider adding the following elements to your semantic model:

1. Logical columns for a given table/view that are expressions over physical columns.
   * Example: `col1 - col2` could be the `expr` for a logical column.
2. Synonyms. Any additional synonyms for column names.
3. Filters. Additional filters with their relevant `expr`.
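
To find the lines that still need your business context, you could scan the generated YAML for the `# <FILL-OUT>` marker. The following is a minimal sketch under that assumption; `find_fill_out_lines` is a hypothetical helper and is not shipped with this package.

```python
from typing import List, Tuple


def find_fill_out_lines(
    yaml_text: str, marker: str = "# <FILL-OUT>"
) -> List[Tuple[int, str]]:
    """Hypothetical helper: return (line number, line) for each marked line."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(yaml_text.splitlines(), start=1)
        if marker in line
    ]
```

Reading a generated file with `Path(...).read_text()` and passing the text to this function gives a checklist of 1-based line numbers to revisit.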

## Release

In order to push a new build and release, follow the steps below.

1. Check out a new branch from `main`. You must name this branch `release/vYYYY-MM-DD`. The `release/v` prefix is used to trigger a GitHub workflow post-merge.
2. Bump the Poetry version with one of:
   * `poetry version patch`
   * `poetry version minor`
   * `poetry version major`
3. Update `CHANGELOG.md`, adding a header for your version number along with a description of the changes made.
4. Commit the updated `pyproject.toml` and `CHANGELOG.md` and push.
5. Merge your branch, which will trigger the release workflow.
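
To preview the text the release workflow would pick up for a version, the sketch below mirrors the `awk` filter in `.github/workflows/release.yaml`: it keeps the lines after the `## [VERSION]` header and stops at the next `## [` header. It is an illustrative Python equivalent, not code used by the workflow, and `extract_changelog` is a hypothetical name.

```python
import re


def extract_changelog(changelog_text: str, version: str) -> str:
    """Hypothetical helper: return the body of the '## [version]' section."""
    version_header = re.compile(r"^## \[" + re.escape(version) + r"\]")
    any_header = re.compile(r"^## \[")
    keep = False
    kept = []
    for line in changelog_text.splitlines():
        if version_header.match(line):
            keep = True  # start collecting after the matching header
            continue
        if any_header.match(line):
            keep = False  # a different version header ends the section
        if keep:
            kept.append(line)
    return "\n".join(kept)
```

If the header in `CHANGELOG.md` does not follow the `## [VERSION-NUMBER]` format, this returns an empty string, which is a quick way to spot a malformed entry before merging.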

10 changes: 10 additions & 0 deletions mypy.ini
@@ -0,0 +1,10 @@
[mypy]
plugins = pydantic.mypy

ignore_missing_imports = True
strict = True
disallow_untyped_defs = True
warn_unused_ignores = False
disallow_any_generics = True

exclude = pyvenv|(_test\.py|test_.*\.py)|_pb2\.py|_pb2\.pyi