Commit
Merge pull request #1 from Snowflake-Labs/jhilgart/initial-commit-add-semantic-model-generator
Initial commit to push semantic model generator
Showing 26 changed files with 3,456 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
* @sfc-gh-nsehrawat @sfc-gh-rehuang @sfc-gh-nlimtiaco @sfc-gh-jhilgart |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
name: Semantic Model Format & Lint | ||
|
||
on: | ||
pull_request: | ||
paths: | ||
- "semantic_model_generator/**" | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: | ||
python-version: ["3.10"] | ||
|
||
steps: | ||
- name: Check out the code | ||
uses: actions/checkout@v4 | ||
|
||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
|
||
# Caching dependencies using Poetry | ||
- name: Cache Poetry virtualenv | ||
uses: actions/cache@v2 | ||
with: | ||
path: ~/.cache/pypoetry/virtualenvs | ||
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }} | ||
restore-keys: | | ||
${{ runner.os }}-poetry- | ||
- name: Install Poetry | ||
run: | | ||
curl -sSL https://install.python-poetry.org | python3 - | ||
- name: Configure Poetry | ||
run: | | ||
$HOME/.local/bin/poetry config virtualenvs.create false | ||
- name: Install dependencies using Poetry | ||
run: | | ||
$HOME/.local/bin/poetry install --no-interaction | ||
- name: Run mypy | ||
run: | | ||
make run_mypy | ||
- name: Check with black | ||
run: | | ||
make check_black | ||
- name: Check with isort | ||
run: | | ||
make check_isort | ||
- name: Run flake8 | ||
run: | | ||
make run_flake8 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
name: Build and Attach Wheel to GitHub Release | ||
|
||
on: | ||
push: | ||
branches: | ||
- 'main' | ||
paths: | ||
- '**.py' | ||
- 'pyproject.toml' | ||
- 'CHANGELOG.md' | ||
|
||
jobs: | ||
build: | ||
if: contains(github.ref, 'refs/heads/release/v') | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: '3.10' | ||
|
||
- name: Install Poetry | ||
run: pip install poetry | ||
|
||
- name: Get the version from pyproject.toml | ||
run: echo "VERSION=$(poetry version -s)" >> $GITHUB_ENV | ||
|
||
- name: Build Wheel | ||
run: poetry build -f wheel | ||
|
||
- name: Extract Changelog for the Version | ||
run: | | ||
VERSION=${{ env.VERSION }} | ||
CHANGELOG=$(awk '/^## \['"${VERSION//./\\.}"'\]/ {flag=1; next} /^## \[/ {flag=0} flag' CHANGELOG.md) | ||
echo "CHANGELOG<<EOF" >> $GITHUB_ENV | ||
echo "$CHANGELOG" >> $GITHUB_ENV | ||
echo "EOF" >> $GITHUB_ENV | ||
- name: Upload Wheel to Release | ||
uses: softprops/action-gh-release@v1 | ||
with: | ||
files: dist/*.whl | ||
body: ${{ env.CHANGELOG }} | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} |
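
The `awk` one-liner in the "Extract Changelog for the Version" step can be hard to read at a glance. The following is a minimal Python sketch of the same selection logic, offered only as an illustration: it starts capturing after the `## [VERSION]` header and stops at the next `## [` header. The function name `extract_changelog` and the sample text are assumptions made for this example; the workflow itself uses `awk`, not Python.

```python
import re


def extract_changelog(text: str, version: str) -> str:
    """Return the lines under `## [version]`, up to the next `## [` header."""
    # Escape the version so that dots match literally, as the awk command does.
    version_header = re.compile(r"^## \[" + re.escape(version) + r"\]")
    any_header = re.compile(r"^## \[")

    collected = []
    capturing = False
    for line in text.splitlines():
        if version_header.match(line):
            capturing = True  # start after the matching header line
            continue
        if any_header.match(line):
            capturing = False  # a different version header ends the section
        if capturing:
            collected.append(line)
    return "\n".join(collected)


# Hypothetical sample mirroring the CHANGELOG.md added in this commit.
sample = """# Changelog

## [1.0.0] - 2024-04-08
### Released
- Initial release of the project.
"""

if __name__ == "__main__":
    print(extract_changelog(sample, "1.0.0"))
```

If no header matches the requested version, the function returns an empty string, which corresponds to the workflow producing an empty release body.
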
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
name: Semantic Model Generator Test | ||
|
||
on: | ||
pull_request: | ||
paths: | ||
- "semantic_model_generator/**" | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: | ||
python-version: ["3.10"] | ||
steps: | ||
- name: Check out the code | ||
uses: actions/checkout@v4 | ||
|
||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
|
||
# Caching dependencies using Poetry | ||
- name: Cache Poetry virtualenv | ||
uses: actions/cache@v2 | ||
with: | ||
path: ~/.cache/pypoetry/virtualenvs | ||
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }} | ||
restore-keys: | | ||
${{ runner.os }}-poetry- | ||
- name: Install Poetry | ||
run: | | ||
curl -sSL https://install.python-poetry.org | python3 - | ||
- name: Configure Poetry | ||
run: | | ||
$HOME/.local/bin/poetry config virtualenvs.create false | ||
- name: Install dependencies using Poetry | ||
run: | | ||
$HOME/.local/bin/poetry install --no-interaction | ||
- name: Test | ||
run: | | ||
make test_github_workflow |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
# Local python environment | ||
pyvenv | ||
|
||
# Notebook intermediate state | ||
*.ipynb_checkpoints | ||
|
||
# Mypy | ||
.mypy_cache | ||
|
||
# Pytest | ||
.pytest_cache | ||
|
||
# pycache | ||
**/__pycache__ | ||
|
||
# Python package builds | ||
*.egg-info | ||
|
||
# VSCode | ||
.vscode/settings.json | ||
.vscode/.ropeproject | ||
.vscode/*.log | ||
|
||
# Envs | ||
.env | ||
.venv | ||
.direnv | ||
.envrc | ||
|
||
# Output semantic models | ||
semantic_model_generator/output_models/* | ||
|
||
# Whls | ||
dist/ | ||
|
||
# test coverage | ||
.coverage |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
3.10 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Changelog | ||
|
||
You must follow the format of `## [VERSION-NUMBER]` for the GitHub workflow to pick up the text. | ||
|
||
## [1.0.0] - 2024-04-08 | ||
### Released | ||
- Initial release of the project. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
|
||
|
||
|
||
check-deps: ## Check if poetry is installed on your system. | ||
@command -v poetry >/dev/null 2>&1 || { echo >&2 "Poetry is required but it's not installed. Please install Poetry by following the instructions at: https://python-poetry.org/docs/#installation"; exit 1; } | ||
@command -v pyenv >/dev/null 2>&1 || { echo >&2 "pyenv is recommended for managing Python versions but it's not installed. Install via 'brew install pyenv'."; exit 1; } | ||
@echo "Setting Python version to 3.10 using pyenv." | ||
@pyenv local 3.10 | ||
|
||
shell: check-deps ## Get into a poetry shell | ||
poetry shell | ||
|
||
setup: check-deps shell ## Install dependencies into your poetry environment. | ||
poetry install | ||
|
||
# Linting and formatting below. | ||
run_mypy: ## Run mypy | ||
mypy --config-file=mypy.ini . | ||
|
||
run_flake8: ## Run flake8 | ||
flake8 --ignore=E203,E501,W503 --exclude=pyvenv,tmp,*_pb2.py,*_pb2.pyi,images/*/src . | ||
|
||
check_black: ## Check to see if files would be updated with black. | ||
# Exclude pyvenv and all generated protobuf code. | ||
black --check --exclude="pyvenv|.*_pb2.py|.*_pb2.pyi" . | ||
|
||
run_black: ## Run black to format files. | ||
# Exclude pyvenv, tmp, and all generated protobuf code. | ||
black --exclude="pyvenv|tmp|.*_pb2.py|.*_pb2.pyi" . | ||
|
||
check_isort: ## Check if files would be updated with isort. | ||
isort --profile black --check --skip=pyvenv --skip-glob='*_pb2.py*' . | ||
|
||
run_isort: ## Run isort to update imports. | ||
isort --profile black --skip=pyvenv --skip=tmp --skip-glob='*_pb2.py*' . | ||
|
||
|
||
fmt_lint: shell ## lint/fmt in current python environment | ||
make run_mypy run_black run_isort run_flake8 | ||
|
||
# Test below | ||
test: shell ## Run tests. | ||
python -m pytest -vvs semantic_model_generator | ||
|
||
test_github_workflow: ## For use on github workflow. | ||
python -m pytest -vvs semantic_model_generator | ||
|
||
help: ## Show this help. | ||
@fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//' | sed -e 's |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,62 @@ | ||
# copilot-semantic-model-generator | ||
# semantic-model-generator | ||
|
||
The `Semantic Model Generator` is used to generate a semantic model for use in your Snowflake account. | ||
|
||
## Setup | ||
|
||
This project uses `poetry` to manage dependencies, and we recommend `pyenv` for Python version management. | ||
|
||
1. `make setup` | ||
|
||
Next, export your credentials as environment variables. Note that `SNOWFLAKE_HOST` is optional, depending on your Snowflake deployment. | ||
|
||
```bash | ||
export SNOWFLAKE_ROLE="<your-snowflake-role>" | ||
export SNOWFLAKE_WAREHOUSE="<your-snowflake-warehouse>" | ||
export SNOWFLAKE_USER="<your-snowflake-user>" | ||
export SNOWFLAKE_PASSWORD="<your-snowflake-password>" | ||
export SNOWFLAKE_HOST="<your-snowflake-host>" | ||
``` | ||
|
||
## Usage | ||
|
||
### Generation | ||
|
||
You may generate a semantic model for a given list of fully qualified tables following the `{database}.{schema}.{table}` format. Each table in this list should be a physical table or a view present in your database. | ||
|
||
By default, all generated semantic models are saved under `semantic_model_generator/output_models`. | ||
|
||
|
||
```bash | ||
python -m semantic_model_generator.main \ | ||
--fqn_tables "['<your-database-name-1>.<your-schema-name-1>.<your-physical-table-or-view-name-1>','<your-database-name-2>.<your-schema-name-2>.<your-physical-table-or-view-name-2>']" \ | ||
--semantic_model_name "<a-meaningful-semantic-model-name>" \ | ||
--snowflake_account="<your-snowflake-account>" | ||
``` | ||
|
||
### Post-Generation | ||
|
||
**Important**: After generation, your YAML files will have a series of lines with `# <FILL-OUT>`. Please take the time to fill these out with your business context. | ||
|
||
By default, the generated semantic model will contain all columns from the provided tables/views. However, we strongly encourage keeping only the relevant columns and dropping any unwanted columns from the generated semantic model. | ||
|
||
In addition, consider adding the following elements to your semantic model: | ||
|
||
1. Logical columns for a given table/view that are expressions over physical columns. | ||
* Example: `col1 - col2` could be the `expr` for a logical column. | ||
2. Synonyms. Any additional synonyms for column names. | ||
3. Filters. Additional filters with their relevant `expr`. | ||
|
||
## Release | ||
|
||
In order to push a new build and release, follow the steps below. | ||
|
||
1. Check out a new branch from main. You must name this branch `release/vYYYY-MM-DD`. The `release/v` prefix is used to trigger a GitHub workflow post-merge. | ||
2. Bump the poetry version with one of: | ||
* `poetry version patch` | ||
* `poetry version minor` | ||
* `poetry version major` | ||
3. Update `CHANGELOG.md`, adding a `## [VERSION-NUMBER]` header for your version along with a description of the changes made. | ||
4. Commit the updated `pyproject.toml` and `CHANGELOG.md` and push. | ||
5. Merge your branch, which triggers the release workflow. | ||
|
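
The Generation section of the README asks for table names in the `{database}.{schema}.{table}` format. As a rough illustration of what that format implies, here is a minimal Python sketch that splits such a name into its three parts and rejects anything else. The names `TableName` and `parse_fqn` are hypothetical and are not taken from the `semantic_model_generator` package; the sketch also assumes simple identifiers and does not handle quoted identifiers that contain dots.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    """Hypothetical container for the three parts of a fully qualified name."""

    database: str
    schema: str
    table: str


def parse_fqn(fqn: str) -> TableName:
    """Split `database.schema.table` into parts, raising on any other shape."""
    parts = fqn.split(".")
    # Exactly three non-empty parts are expected.
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"Expected {{database}}.{{schema}}.{{table}}, got: {fqn!r}"
        )
    return TableName(*parts)


if __name__ == "__main__":
    name = parse_fqn("my_db.my_schema.my_table")
    print(name.database, name.schema, name.table)
```

A check like this, run before calling the generator, would surface a malformed entry in `--fqn_tables` early rather than at query time.
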
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
[mypy] | ||
plugins = pydantic.mypy | ||
|
||
ignore_missing_imports = True | ||
strict = True | ||
disallow_untyped_defs = True | ||
warn_unused_ignores = False | ||
disallow_any_generics = True | ||
|
||
exclude = pyvenv|(_test\.py|test_.*\.py)|_pb2\.py|_pb2\.pyi |
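
To get a feel for which paths the `exclude` pattern above skips, the pattern can be tried against a few sample paths. The sketch below is illustrative only: it assumes mypy applies the pattern with search semantics (a match anywhere in the path), and the sample paths and the helper name `is_excluded` are made up for this example.

```python
import re

# The exclude pattern copied from the mypy configuration above.
EXCLUDE = re.compile(r"pyvenv|(_test\.py|test_.*\.py)|_pb2\.py|_pb2\.pyi")


def is_excluded(path: str) -> bool:
    """Return True if the pattern matches anywhere in the path (assumed semantics)."""
    return EXCLUDE.search(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, chosen to exercise each alternative in the pattern.
    for path in [
        "semantic_model_generator/main.py",
        "semantic_model_generator/tests/test_main.py",
        "semantic_model_generator/protos/model_pb2.pyi",
        "pyvenv/lib/site.py",
    ]:
        print(path, is_excluded(path))
```

Because the alternatives are unanchored, any path that merely contains one of them is matched, so it can be worth testing a pattern like this against real file names before relying on it.
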