Skip to content

Queries the ACCESS_HISTORY and QUERY_HISTORY views, from the SNOWFLAKE.ACCOUNT_USAGE schema, and generates two interactive GraphViz visual diagrams online, in a visual editor: one for column lineage, the other for table lineage.

License

Notifications You must be signed in to change notification settings

cristiscu/data-lineage-tool

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Aug 28, 2023
8a89499 · Aug 28, 2023

History

5 Commits
Aug 22, 2023
Aug 22, 2023
Aug 28, 2023
Aug 22, 2023
Aug 22, 2023
Aug 22, 2023
Aug 22, 2023
Aug 22, 2023

Repository files navigation

Data Lineage Viewer for Snowflake

Queries the ACCESS_HISTORY and QUERY_HISTORY views, from the SNOWFLAKE.ACCOUNT_USAGE schema, and generates two interactive GraphViz visual diagrams online, in a visual editor:

The generated DOT Graphviz models are also saved in the output/ folder.

The Query Result for the Lineage Graph

To repro, you may run the sql/create-script.sql file in Snowflake, which is similar to the one described in the documentation samples. Wait a few hours, until the changes are propagated in the ACCOUNT_USAGE views.

Then run the sql/query-access-history.sql query to get lineage info about the created test tables. {{database}} must be replaced by TEST_DB for our use case here:

Table Lineage

The Column Lineage Graph

This diagram shows how data moved between the table columns in the TEST_DB database. You may go over the links and the SQL query that made that transformation appears as tooltip.

Table Lineage

The Table Lineage Graph

The table lineage graph is a simplified derived diagram, in which there are no columns, and all column dependencies appear as one single link at the container table level:

Table Lineage

Database Profile File

To connect to Snowflake, create a profiles_db.conf copy of the profiles_db_template.conf file, and customize it with your own Snowflake connection parameters, the user name and the account. Your top [default] profile is the active profile, considered by our tool.

Save your password in a SNOWFLAKE_PASSWORD local environment variable. Never add the password or any other sensitive information to your code or to profile files. All names must be case sensitive, with no quotes.

CLI Executable File

Without an executable, you can use the source file directly:

python data-lineage.py TEST_DB

To compile into a CLI executable:

pip install pyinstaller
pyinstaller --onefile data-lineage.py
dist/data-lineage TEST_DB

About

Queries the ACCESS_HISTORY and QUERY_HISTORY views, from the SNOWFLAKE.ACCOUNT_USAGE schema, and generates two interactive GraphViz visual diagrams online, in a visual editor: one for column lineage, the other for table lineage.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages