Initial commit to push semantic model generator #1

Merged
59 changes: 59 additions & 0 deletions .github/workflows/lint.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
name: Semantic Model Format & Lint

on:
  pull_request:
    paths:
      - "semantic_model_generator/**"

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]

    steps:
      - name: Check out the code
        uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      # Caching dependencies using Poetry
      - name: Cache Poetry virtualenv
        uses: actions/cache@v2
        with:
          path: ~/.cache/pypoetry/virtualenvs
          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
          restore-keys: |
            ${{ runner.os }}-poetry-

      - name: Install Poetry
        run: |
          curl -sSL https://install.python-poetry.org | python3 -

      - name: Configure Poetry
        run: |
          $HOME/.local/bin/poetry config virtualenvs.create false

      - name: Install dependencies using Poetry
        run: |
          $HOME/.local/bin/poetry install --no-interaction

      - name: Run mypy
        run: |
          make run_mypy

      - name: Check with black
        run: |
          make check_black

      - name: Check with isort
        run: |
          make check_isort

      - name: Run flake8
        run: |
          make run_flake8
47 changes: 47 additions & 0 deletions .github/workflows/release.yaml
@@ -0,0 +1,47 @@
name: Build and Attach Wheel to GitHub Release

on:
  push:
    branches:
      - 'main'
    paths:
      - '**.py'
      - 'pyproject.toml'
      - 'CHANGELOG.md'

jobs:
  build:
    if: contains(github.ref, 'refs/heads/release/v')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.10'

      - name: Install Poetry
        run: pip install poetry

      - name: Get the version from pyproject.toml
        run: echo "VERSION=$(poetry version -s)" >> $GITHUB_ENV

      - name: Build Wheel
        run: poetry build -f wheel

      - name: Extract Changelog for the Version
        run: |
          VERSION=${{ env.VERSION }}
          CHANGELOG=$(awk '/^## \['"${VERSION//./\\.}"'\]/ {flag=1; next} /^## \[/ {flag=0} flag' CHANGELOG.md)
          echo "CHANGELOG<<EOF" >> $GITHUB_ENV
          echo "$CHANGELOG" >> $GITHUB_ENV
          echo "EOF" >> $GITHUB_ENV

      - name: Upload Wheel to Release
        uses: softprops/action-gh-release@v1
        with:
          files: dist/*.whl
          body: ${{ env.CHANGELOG }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
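
The awk command in the "Extract Changelog for the Version" step keeps the lines between the `## [VERSION]` header and the next `## [` header. The same logic can be sketched in Python for local testing. This is an illustration only: the function name `extract_changelog` is hypothetical and not part of this PR, and the `1.0.1` entry in the sample text is invented for the example.

```python
import re


def extract_changelog(changelog_text: str, version: str) -> str:
    """Return the lines under the '## [version]' header, up to the next '## [' header."""
    header = re.compile(r"^## \[" + re.escape(version) + r"\]")
    collected = []
    in_section = False
    for line in changelog_text.splitlines():
        if header.match(line):
            in_section = True  # start collecting after the matching header
            continue
        if line.startswith("## ["):
            in_section = False  # any other version header ends the section
        if in_section:
            collected.append(line)
    return "\n".join(collected)


# Sample changelog; the 1.0.1 entry is made up for illustration.
sample = """# Changelog

## [1.0.1] - 2024-04-09
### Fixed
- Hypothetical fix.

## [1.0.0] - 2024-04-08
### Released
- Initial release of the project.
"""

print(extract_changelog(sample, "1.0.0"))
```

A version with no matching header yields an empty string, which in the workflow would presumably mean an empty release body.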
46 changes: 46 additions & 0 deletions .github/workflows/test.yaml
@@ -0,0 +1,46 @@
name: Semantic Model Generator Test

on:
  pull_request:
    paths:
      - "semantic_model_generator/**"

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]
    steps:
      - name: Check out the code
        uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      # Caching dependencies using Poetry
      - name: Cache Poetry virtualenv
        uses: actions/cache@v2
        with:
          path: ~/.cache/pypoetry/virtualenvs
          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
          restore-keys: |
            ${{ runner.os }}-poetry-

      - name: Install Poetry
        run: |
          curl -sSL https://install.python-poetry.org | python3 -

      - name: Configure Poetry
        run: |
          $HOME/.local/bin/poetry config virtualenvs.create false

      - name: Install dependencies using Poetry
        run: |
          $HOME/.local/bin/poetry install --no-interaction

      - name: Test
        run: |
          make test_github_workflow
37 changes: 37 additions & 0 deletions .gitignore
@@ -0,0 +1,37 @@
# Local python environment
pyvenv

# Notebook intermediate state
*.ipynb_checkpoints

# Mypy
.mypy_cache

# Pytest
.pytest_cache

# pycache
**/__pycache__

# Python package builds
*.egg-info

# VSCode
.vscode/settings.json
.vscode/.ropeproject
.vscode/*.log

# Envs
.env
.venv
.direnv
.envrc

# Output semantic models
semantic_model_generator/output_models/*

# Whls
dist/

# test coverage
.coverage
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.10
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,7 @@
# Changelog

You must follow the format of `## [VERSION-NUMBER]` for the GitHub workflow to pick up the text.

## [1.0.0] - 2024-04-08
### Released
- Initial release of the project.
49 changes: 49 additions & 0 deletions Makefile
@@ -0,0 +1,49 @@



check-deps: ## Check if poetry is installed on your system.
	@command -v poetry >/dev/null 2>&1 || { echo >&2 "Poetry is required but it's not installed. Please install Poetry by following the instructions at: https://python-poetry.org/docs/#installation"; exit 1; }
	@command -v pyenv >/dev/null 2>&1 || { echo >&2 "pyenv is recommended for managing Python versions but it's not installed. Install via 'brew install pyenv'"; exit 1; }
	@echo "Setting Python version to 3.10 using pyenv."
	@pyenv local 3.10

shell: check-deps ## Get into a poetry shell
	poetry shell

setup: check-deps shell ## Install dependencies into your poetry environment.
	poetry install

# Linting and formatting below.
run_mypy: ## Run mypy
	mypy --config-file=mypy.ini .

run_flake8: ## Run flake8
	flake8 --ignore=E203,E501,W503 --exclude=pyvenv,tmp,*_pb2.py,*_pb2.pyi,images/*/src .

check_black: ## Check to see if files would be updated with black.
	# Exclude pyvenv and all generated protobuf code.
	black --check --exclude="pyvenv|.*_pb2.py|.*_pb2.pyi" .

run_black: ## Run black to format files.
	# Exclude pyvenv, tmp, and all generated protobuf code.
	black --exclude="pyvenv|tmp|.*_pb2.py|.*_pb2.pyi" .

check_isort: ## Check if files would be updated with isort.
	isort --profile black --check --skip=pyvenv --skip-glob='*_pb2.py*' .

run_isort: ## Run isort to update imports.
	isort --profile black --skip=pyvenv --skip=tmp --skip-glob='*_pb2.py*' .


fmt_lint: shell ## lint/fmt in current python environment
	make run_mypy run_black run_isort run_flake8

# Test below
test: shell ## Run tests.
	python -m pytest -vvs semantic_model_generator

test_github_workflow: ## For use on github workflow.
	python -m pytest -vvs semantic_model_generator

help: ## Show this help.
	@fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//' | sed -e 's
61 changes: 60 additions & 1 deletion README.md
@@ -1 +1,60 @@
# copilot-semantic-model-generator
# semantic-model-generator
Collaborator Author

@sfc-gh-pverhoeven Please review


The `Semantic Model Generator` is used to generate a semantic model for use in your Snowflake account.

## Setup

This project uses `poetry` to manage dependencies, and we recommend `pyenv` for Python version management.

1. `make setup`

Next, export your credentials as environment variables. Note that `SNOWFLAKE_HOST` is optional, depending on your Snowflake deployment.

```bash
export SNOWFLAKE_ROLE="<your-snowflake-role>"
export SNOWFLAKE_WAREHOUSE="<your-snowflake-warehouse>"
export SNOWFLAKE_USER="<your-snowflake-user>"
export SNOWFLAKE_PASSWORD="<your-snowflake-password>"
export SNOWFLAKE_HOST="<your-snowflake-host>"
```
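
How the generator itself consumes these variables is not shown in this diff. As a rough sketch, a script could check that they are set before connecting, treating `SNOWFLAKE_HOST` as optional. The function name `read_snowflake_env` and the error message are assumptions made for this example.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = (
    "SNOWFLAKE_ROLE",
    "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_USER",
    "SNOWFLAKE_PASSWORD",
)
OPTIONAL_VARS = ("SNOWFLAKE_HOST",)


def read_snowflake_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect the Snowflake settings from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    for name in OPTIONAL_VARS:
        if env.get(name):  # only include the host when it is actually set
            settings[name] = env[name]
    return settings


# Example with an explicit mapping instead of the real process environment.
example_env = {
    "SNOWFLAKE_ROLE": "analyst",
    "SNOWFLAKE_WAREHOUSE": "wh",
    "SNOWFLAKE_USER": "me",
    "SNOWFLAKE_PASSWORD": "secret",
}
print(sorted(read_snowflake_env(example_env)))
```

Passing a mapping explicitly keeps the check testable without touching the real environment.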

## Usage

### Generation

You may generate a semantic model for a given list of fully qualified tables following the `{database}.{schema}.{table}` format. Every table should be a physical table present in the given database and schema.

By default, all generated semantic models are saved under `semantic_model_generator/output_models`.


```bash
python -m semantic_model_generator.main \
--fqn_tables "['<your-database-name-1>.<your-schema-name-1>.<your-physical-table-name-1>','<your-database-name-2>.<your-schema-name-2>.<your-physical-table-name-2>']" \
--semantic_model_name "<a-meaningful-semantic-name-for-your-team>" \
--snowflake_account="<your-snowflake-account>"
```

Collaborator

WDYT about s/a-meaningful-semantic-name-for-your-team/a-meaningful-semantic-model-name?

Collaborator Author

updated!
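
The `{database}.{schema}.{table}` format and the bracketed, single-quoted list passed to `--fqn_tables` can be checked and assembled with a few lines of Python. This is a minimal sketch: `split_fqn` and `format_fqn_tables_arg` are hypothetical helpers, not functions from this PR, and the check assumes plain identifiers that contain no dots.

```python
from typing import List, Tuple


def split_fqn(fqn_table: str) -> Tuple[str, str, str]:
    """Split 'database.schema.table' into its three parts, or raise ValueError."""
    parts = fqn_table.split(".")
    if len(parts) != 3 or not all(part.strip() for part in parts):
        raise ValueError(
            f"Expected {{database}}.{{schema}}.{{table}}, got: {fqn_table!r}"
        )
    database, schema, table = parts
    return database, schema, table


def format_fqn_tables_arg(fqn_tables: List[str]) -> str:
    """Build the list string shown in the command above, validating each entry first."""
    for fqn in fqn_tables:
        split_fqn(fqn)
    return "[" + ",".join(f"'{fqn}'" for fqn in fqn_tables) + "]"


# Placeholder table names, for illustration only.
print(format_fqn_tables_arg(["db1.schema1.table1", "db2.schema2.table2"]))
```

Validating before building the argument surfaces a malformed table name locally, before any command is run against Snowflake.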

### Post-Generation

**Important**: After generation, your YAML files will have a series of lines with `# <FILL-OUT>`. Please take the time to fill these out with your business context. In addition, if there are columns included that are not useful for your internal teams, please remove them from the semantic model.
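
To find the placeholders that still need attention, one option is to scan the generated file for the `# <FILL-OUT>` marker. The sketch below assumes nothing about the generated schema beyond that marker; the function name `find_fill_out_lines` and the field names in the sample text are made up for illustration.

```python
from typing import List, Tuple

PLACEHOLDER = "# <FILL-OUT>"


def find_fill_out_lines(yaml_text: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for every line that still has the placeholder."""
    return [
        (number, line.strip())
        for number, line in enumerate(yaml_text.splitlines(), start=1)
        if PLACEHOLDER in line
    ]


# Hypothetical generated output; the real field names may differ.
generated = """\
name: example_model
description: # <FILL-OUT>
columns:
  - name: col1
    description: # <FILL-OUT>
  - name: col2
    description: Order total.
"""

for number, line in find_fill_out_lines(generated):
    print(f"line {number}: {line}")
```

An empty result suggests every placeholder has been filled in.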

In addition, consider adding the following elements to your semantic model:

1. Logical columns for a given table.
Collaborator

Should we say, "Logical columns that are expressions over physical columns"? Since all columns in a semantic model are logical columns.

Collaborator Author

good point - done

* Example: `col1 - col2` could be the `expr` for a logical col
2. Synonyms. Any additional synonyms for column names.
3. Metrics. Additional metrics with their relevant `expr`.

## Release

To push a new build and release, follow the steps below.

1. Check out a new branch from main. You must name this branch `release/vYYYY-MM-DD`. The `release/v` prefix is used to trigger a GitHub workflow post-merge.
2. Bump the version with Poetry, using whichever command matches the size of the change:
    * `poetry version patch`
    * `poetry version minor`
    * `poetry version major`
3. Update `CHANGELOG.md`, adding a header for your version number along with a description of the changes made.
4. Commit the updated `pyproject.toml` and `CHANGELOG.md`, then push.
5. Merge your branch, which will trigger the release workflow.
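
The branch naming rule and the effect of the three bump commands can be illustrated with a small Python sketch. It is a simplified model that assumes a plain `MAJOR.MINOR.PATCH` version string; Poetry itself also handles pre-release and other version forms. The helper names `release_branch_name`, `is_release_branch`, and `bump_version` are hypothetical.

```python
import re
from datetime import date

RELEASE_BRANCH = re.compile(r"^release/v\d{4}-\d{2}-\d{2}$")


def release_branch_name(day: date) -> str:
    """Build a branch name in the release/vYYYY-MM-DD form."""
    return f"release/v{day.isoformat()}"


def is_release_branch(name: str) -> bool:
    """Check that a branch name follows the release/vYYYY-MM-DD pattern."""
    return RELEASE_BRANCH.match(name) is not None


def bump_version(version: str, part: str) -> str:
    """Simplified model of a patch, minor, or major bump on MAJOR.MINOR.PATCH."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown bump part: {part!r}")


print(release_branch_name(date(2024, 4, 8)))
print(bump_version("1.0.0", "patch"))
print(bump_version("1.0.0", "minor"))
print(bump_version("1.0.0", "major"))
```

Whichever bump is chosen, the resulting version number is the one that needs a matching `## [VERSION-NUMBER]` header in `CHANGELOG.md`.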

10 changes: 10 additions & 0 deletions mypy.ini
@@ -0,0 +1,10 @@
[mypy]
plugins = pydantic.mypy

ignore_missing_imports = True
strict = True
disallow_untyped_defs = True
warn_unused_ignores = False
disallow_any_generics = True

exclude = pyvenv|(_test\.py|test_.*\.py)|_pb2\.py|_pb2\.pyi
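
To see which files the `exclude` pattern would skip, the regular expression can be tried against a few paths. This sketch assumes mypy applies the pattern as an unanchored search over each file path; the paths below are hypothetical examples, not files from this PR.

```python
import re

# Same pattern as the exclude setting in mypy.ini above.
EXCLUDE = re.compile(r"pyvenv|(_test\.py|test_.*\.py)|_pb2\.py|_pb2\.pyi")


def is_excluded(path: str) -> bool:
    """Return True when the pattern matches anywhere in the path."""
    return EXCLUDE.search(path) is not None


# Hypothetical paths, for illustration only.
for path in [
    "semantic_model_generator/main.py",
    "semantic_model_generator/tests/main_test.py",
    "semantic_model_generator/protos/model_pb2.py",
    "pyvenv/lib/site.py",
    "semantic_model_generator/latest_report.py",
]:
    print(path, is_excluded(path))
```

Because the pattern is unanchored, a name such as `latest_report.py` also matches through the substring `test_report.py`, which may or may not be intended.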