Queries the ACCESS_HISTORY and QUERY_HISTORY views, from the SNOWFLAKE.ACCOUNT_USAGE schema, and generates two interactive GraphViz visual diagrams online, in a visual editor:
- one for column lineage
- the other for table lineage
The generated DOT Graphviz models are also saved in the output/ folder.
To repro, you may run the sql/create-script.sql file in Snowflake, which is similar to the one described in the documentation samples. Wait a few hours, until the changes are propagated in the ACCOUNT_USAGE views.
Then run the sql/query-access-history.sql query to get lineage info about the created test tables. {{database}} must be replaced by TEST_DB for our use case here:
This diagram shows how data moved between the table columns in the TEST_DB database. You may go over the links and the SQL query that made that transformation appears as tooltip.
The table lineage graph is a simplified derived diagram, in which there are no columns, and all column dependencies appear as one single link at the container table level:
To connect to Snowflake, create a profiles_db.conf copy of the profiles_db_template.conf file, and customize it with your own Snowflake connection parameters, the user name and the account. Your top [default] profile is the active profile, considered by our tool.
Save your password in a SNOWFLAKE_PASSWORD local environment variable. Never add the password or any other sensitive information to your code or to profile files. All names must be case sensitive, with no quotes.
Without an executable, you can use the source file directly:
python data-lineage.py TEST_DB
To compile into a CLI executable:
pip install pyinstaller
pyinstaller --onefile data-lineage.py
dist/data-lineage TEST_DB