- Avoid double execution of tests on run command
- Don't execute tests when the task failed
- Snowflake: temporary objects are created as `temp` where possible (ie: stages, incremental tables, ...)
- Added flag to run command (`--with-tests`) that combines the execution of both tests and tasks
- Added new flag to run and test commands (--fail-fast) that terminates the execution upon an error on any task or test
- Allow changing the target database for tests
- Fixes load_data code when checking if the table exists
- Fixes issue with Redshift region setting for IAM connections
- Update driver versions for Redshift, Snowflake and BigQuery
- Issue with incremental sql models on first load not creating the table
- Colons (:) in SQL are no longer interpreted as bind parameters
- Update Redshift dependencies
- Fixes to Redshift load_data when using an S3 bucket
- Add support for 3rd level (eg: projects in BigQuery or databases in Snowflake) when referencing database objects with `src` and `out`
- Add support for python 3.11
- Switch Redshift driver to use AWS' redshift-connector
- Improve support for Redshift IAM authentications
- Allow data types other than strings in `allowed_values` data test
- Fixes to ddl parsing for view materialisations
- Drop support for Python 3.7
- Upgrades numpy version
- Improved messaging upon load_data errors
- Removed Jinja caching to allow more code reusability
- CLI command status code changed to error when tasks fail
- Fixed error in src or out to allow for missing schema
- Fixes serialisation to json that prevented copy tasks from working on BigQuery
- Fixes character encoding issue on windows
- sql tasks replace autosql with new materialisation "script"
- Allows custom tests to be defined from project.yaml as a group with the type: test
- Root for custom tests changed to the sql folder, rather than sql/tests
- UUID support for copy to BigQuery
- Adds support for missing task properties to config macro and python task decorator
- Fixes to introspection with missing connections
- Upstream prod and from_prod are recognised with sayn test
- Allows connections for decorator based python tasks to be missing from settings when the task is not in the current execution
- Allows adding tags from decorator based python tasks
- Adds src and out to jinja environment in python tasks
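
  As a rough illustration of the decorator-based python tasks referenced in the entries above, the sketch below declares dependencies and tags in code and resolves object names through the task context. The decorator arguments and `context` methods shown are assumptions based on these entries, not a confirmed API; check the SAYN documentation for the current signatures.

  ```python
  from sayn import task


  # Decorator arguments (sources, outputs, tags) are assumed names for illustration.
  @task(sources=["logs.raw_events"], outputs=["analytics.daily_events"], tags=["daily"])
  def daily_events(context):
      # src/out resolve full database object names (including 3rd level and
      # prod redirection), mirroring the src and out jinja macros.
      source = context.src("logs.raw_events")
      target = context.out("analytics.daily_events")
      context.default_db.execute(
          f"CREATE OR REPLACE TABLE {target} AS SELECT * FROM {source}"
      )
  ```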
- Supports `src` in custom tests
- Issue with copy tasks from the `default_db`
- Test results completion improvements
- New `from_prod` setting allows marking tables that are always read from production
- New `default_run` setting allows defining a default run filter
- Apply db object transformations only to `default_db`
- Allow columns in copy to be specified with just the name
- Renamed `values` to `allowed_values` in data testing
- Various improvements to cli messaging
- Refactoring of db objects internal code
- Allows SAYN to run when unused credentials are missing from settings
- Pin dependency version to resolve issue with MarkupSafe
- Fixes to bigquery introspection
- Task groups can be generated automatically from a path specification
- Task dependencies can be set in code using src and out macros without YAML
- Simpler pattern for creating python tasks based on decorators
- BigQuery driver upgraded to sqlalchemy-bigquery 1.3.0, deprecating pybigquery
- Fixes to incremental autosql tasks in BigQuery
- Fixes to credentials_path property for BigQuery credentials
- Enables installation on Python 3.10
- Fixes colour of last message in SAYN cli
- Allow max_merge_rows with append copy tasks
- Adds append only mode for copy tasks
- Make parameter and credential names lowercase, allowing environment variables on Windows
- Bigquery support for changing autosql models between views and tables
- Better error messages when additional properties are specified in a task definition
- Fixes issue with NaN values when loading to Snowflake
- Fixes issue with filters in the cli
- Adds on_fail functionality to tasks allowing children to run when parent tasks fail
- Improvements to cli to allow lists with a single -t or -x flag
- Adds support for renaming columns on copy tasks
- Improvements to BigQuery data load of nested fields
- Adds staging area based batch copy for snowflake
- Returns number of records loaded in load_data
- Adds support for copy to merge frequently to target table
- Fixes to columns without names in ddl
- Adds sorting to copy's get_data_query
- Refactoring of database code to improve performance by adding introspection after all tasks have been setup
- Improvements to BigQuery process
- Adds support for testing all databases
- Adds support for environment variables specified in YAML (JSON still supported as well)
- Duplication of data in sample project
- Fixes unicode issues with load_data in BigQuery
- Adds BigQuery support
- Renamed task attribute task_group to group
- Changed concept of dags to tasks
- Added automated detection of task files in tasks folder
- Added option to change database destination
- Fixed issues with primary key DDLs
- Fixed bug preventing SQL task execution introduced in 0.4.1
- Database methods raise exceptions rather than return Result objects
- load_data automatically creates tables
- max_batch_rows introduced to allow manipulating the size of batches in load_data
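
  A rough sketch of how these `load_data` and error-handling changes look from a python task; keyword argument names and helper methods here are assumptions for illustration, not the exact API.

  ```python
  from sayn import PythonTask


  class LoadUsers(PythonTask):
      def run(self):
          rows = [
              {"id": 1, "name": "alice"},
              {"id": 2, "name": "bob"},
          ]
          try:
              # load_data creates the target table if needed and returns the
              # number of records loaded; max_batch_rows (assumed keyword)
              # controls the batch size.
              loaded = self.default_db.load_data(
                  "users", rows, schema="analytics", max_batch_rows=1000
              )
              self.info(f"Loaded {loaded} records")
          except Exception as exc:
              # Database methods raise exceptions rather than returning Result
              # objects, so failures are handled with try/except.
              return self.fail(str(exc))
          return self.success()
  ```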
- Switched to pydantic for project validation
- Removed Config and Dag objects and split that functionality into separate modules improving the ability for automated testing
- Created an App object to encapsulate most of the running logic
- Created a TaskWrapper object to isolate the task lifetime logic from the execution
- Added new Result type for error reporting
- Switched from standard python logging to an event reporting model
- Major changes to console UI
- Added the concept of task step to improve feedback to the user
- With `-d`, sayn will write all sql related to every step in the compile folder
- Allows indexes definition without column definition under ddl for autosql
- Reworks the db credentials specifications
- Adds Redshift distribution and sort table attributes
- Adds Redshift connection through IAM temporary passwords
- Added MySQL support
- Updated copy task to latest changes of db drivers
- Added select_stream to improve performance in copy tasks
- Added load_data_stream to postgresql for bulk loading in copy tasks
- Changed underlying structure of logging
- Added first tests
- Re-releasing due to issue when uploading to PyPI
- Fixed errors when missing temporary schema in autosql tasks
- Fixed autocommit issues with snowflake
- Fixed crashing bug when parsing credentials
- Renames the following:
- models.yaml > project.yaml
- groups > presets
- models > dags
- to > destination (copy and autosql task types)
- from > source (copy task types)
- staging_schema > tmp_schema (copy and autosql task types)
- Allows specifying presets in project.yaml
- Presets in the dags can reference a preset in project.yaml with the preset property
- Compilation output is saved in a folder named as the dag within the compile folder
- `module` in python tasks is deprecated and the class should now point at the full class path (ie: `class: my_module.MyTask` points at a sayn python task called `MyTask` within `python/my_module.py`)
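
  A minimal sketch of this class path convention; the `setup`/`run` methods and `self.success()` return values follow the usual SAYN python task pattern, though details may differ between versions.

  ```python
  # python/my_module.py, referenced from the dag as "class: my_module.MyTask"
  from sayn import PythonTask


  class MyTask(PythonTask):
      def setup(self):
          # any validation or preparation for the task
          return self.success()

      def run(self):
          # task logic goes here
          return self.success()
  ```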
- Task and preset names are restricted by the regex `^[a-zA-Z0-9][-_a-zA-Z0-9]+$`