Releases: acryldata/datahub
Acryl DataHub v0.8.26.4
Changelog
- datahub-project#4110 @treff7es feat(ingest): Glue - Support for domains and containers
- datahub-project#4154 @jjoyce0510 fix(ui): Fix cutoff profiling axis labels
- datahub-project#4149 @anshbansal fix(docs): fix example of delta lake
- datahub-project#4150 @anshbansal chore(cli): update default cli version pinned in the UI based ingestion
- datahub-project#4135 @anshbansal fix(cli): add timeout for telemetry calls
- datahub-project#4128 @RyanHolstien Feature/oss/update to v2 endpoints
- datahub-project#4139 @gabe-lyons Update README.md
- datahub-project#4028 @claudio-benfatto feat(ingestion): Kafka stateful ingestion
- datahub-project#4148 @satyamkrishna docs : markdown fixes for db retention table
- datahub-project#4133 @satyamkrishna Fix the markdown for tables
- datahub-project#4136 @dexter-mh-lee Fix logging
- datahub-project#4123 @treff7es feat(ingest) Athena: Getting table properties for Athena datasets
Full Changelog: v0.8.26.3...v0.8.26.4
Acryl DataHub v0.8.26.3
Changelog
- datahub-project#4102 @gabe-lyons fix(cypress): force clicks on tag mutation test
- datahub-project#4112 @daha fix(docs) Fix doc on modelDocUpload
- datahub-project#4125 @aditya-radhakrishnan fix(ui) - move book logo to right of glossary term
- datahub-project#4127 @treff7es fix(ingest) Athena: db filter was not applied
- datahub-project#4039 @anshbansal fix(docs): make intro to metadata ingestion easier for beginners
- datahub-project#3917 @anshbansal docs(backup): add doc for taking backup
- datahub-project#4096 @jjoyce0510 feat(Tags/Terms): Backend support for tag & term mutations
- datahub-project#4122 @swaroopjagadish fix(ci): fix formatting for action yaml
- datahub-project#4121 @swaroopjagadish fix(ci): fix formatting in doc generation action yaml
- datahub-project#4120 @swaroopjagadish fix(docs): fixing metadata model doc generation script and updating png
- datahub-project#4011 @hsheth2 chore(ingest): remove unused groupby_unsorted utility
- datahub-project#4077 @aditya-radhakrishnan fix(ingest): okta - better use of asyncio and additional debug logging
Full Changelog: v0.8.26.2...v0.8.26.3
Acryl DataHub v0.8.26.2
Full Changelog: v0.8.26.1...v0.8.26.2
Acryl DataHub v0.8.26.1
Release Highlights
Bugfix release
- Fix issue with missing dependency definition on psutil (affects snowflake connector on 0.8.26.0)
- Fix bug with evaluating commit policy for stateful ingestion
Full Changelog
- datahub-project#4099 @rslanka fix(ingest): dependencies - Add psutil dependency for stateful ingestion reporting.
- datahub-project#4082 @wangqinghuan docs(adoption): add Haibo corp
- datahub-project#4095 @dexter-mh-lee fix(analytics): fix NPE in aggregate api
- datahub-project#4092 @claudio-benfatto fix(ingest): enforce correct behaviour for commit policy
- datahub-project#4072 @Ankit-Keshari-Vituity Fixed auto complete pr comments
- datahub-project#4073 @jjoyce0510 feat(deprecation): Entity Deprecation Backend
Full Changelog: v0.8.26.0...v0.8.26.1
DataHub v0.8.26.0
Release Highlights
- Adding [BETA] Tableau Ingestion Source! See source docs for more.
DataHub v0.8.25.2
Release Notes
This release fixes a critical client-side ingestion bug in 0.8.25.1, which we'll be yanking from PyPI.
DataHub v0.8.25.1
Release Highlights
Buckle up, folks! v0.8.25 brings some very exciting (and highly-requested!) updates.
Notable UI-Based Features
UI-based Ingestion - as demoed in the December Town Hall, we now support creating, configuring, scheduling, & executing batch metadata ingestion using the DataHub user interface. This makes getting metadata into DataHub easier by minimizing the overhead required to operate custom integration pipelines.
Data Domains - DataHub now supports grouping data assets into logical collections called Domains. Domains are curated, top-level folders or categories where related assets can be explicitly grouped. Read the guide here!
Data Containers are now supported! This is the physical grouping of entities, e.g. a Schema is a container of one or more Datasets; a Dashboard is a container of one or more Charts.
Notable Metadata Model & Ingestion-Based Features
Data Quality test results are now supported in the DataHub metadata model. This is the first milestone toward surfacing Dataset & Column-level Data Quality results in the UI (read full scope of work here). Future releases will include a Great Expectations integration & UI support - we’re on track to complete this in Q1 as planned.
Avro files are now supported in the Data Lake File ingestion source
Ingest metadata from multiple instances of the same platform type. This has been a very common use case within the Community - you can now differentiate multiple instances of the same platform type! If you have pre-existing entries, use the datahub migrate command to migrate them over to platform instances.
Ignore users from Top Users calculation
feat(ingestion): Adding ability to ignore users from top users calculation by @treff7es in datahub-project#3735
BigQuery - Data Profiling on only the latest partition/shard
feat(ingestion) bigquery: Profiling only the latest partition/shard on bigquery by @treff7es in datahub-project#3930
(feat)(Business Glossary) add tabular schema and new UI for business glossary by @saxo-lalrishav in datahub-project#3813
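For the multiple-platform-instance feature above, here is a minimal sketch of why an instance name keeps otherwise identical datasets apart. The helper dataset_urn and the instance names are hypothetical, and the URN layout (instance name prefixed to the dataset name) is an assumption made for illustration, not a statement of DataHub's exact API.

```python
from typing import Optional


def dataset_urn(
    platform: str,
    name: str,
    env: str = "PROD",
    platform_instance: Optional[str] = None,
) -> str:
    """Build a dataset URN string (hypothetical helper, assumed layout)."""
    # Assumption: the platform instance is prefixed to the dataset name.
    qualified = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{qualified},{env})"


# Two MySQL clusters that both contain a table called sales.orders.
us = dataset_urn("mysql", "sales.orders", platform_instance="us_cluster")
eu = dataset_urn("mysql", "sales.orders", platform_instance="eu_cluster")

# Without an instance, both clusters would collapse into one URN.
plain = dataset_urn("mysql", "sales.orders")

print(us)
print(eu)
print(plain)
```

With an instance set, the two tables get distinct identifiers; without one, they would share a single entry.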
Notable Fixes
Fix to support View in Looker
feat(looker): Adding optional Looker external url base url config by @jjoyce0510 in datahub-project#3985
fix(graphql): support group display name in ownership by @thomasplarsson in datahub-project#3979
fix(profiling): Enabling profiling for low cardinality number columns by @treff7es in datahub-project#3990
fix(ingestion): match default username for Azure OIDC and Azure ingestion source by @iasoon in datahub-project#3926
DataHub Usage Guides
docs(domains): Adding a User Guide for Domains by @jjoyce0510 in datahub-project#4038
docs(ingest): Adding UI ingestion guide by @jjoyce0510 in datahub-project#4048
What's Changed
fix(vulnerability): Upgrade gms base image by @dexter-mh-lee in datahub-project#3962
logging(frontend): Improve OIDC debug logs by @jjoyce0510 in datahub-project#3967
docs(delete): add curl request example to delete entity by @anshbansal in datahub-project#3928
fix(ingestion): match default username for Azure OIDC and Azure ingestion source by @iasoon in datahub-project#3926
Feature/dynamic platform icons by @RyanHolstien in datahub-project#3968
refactor(ingestion): remove duplicate aspect type by @hsheth2 in datahub-project#3972
fix(example): fix typo by @anshbansal in datahub-project#3907
fix(ingestion): Restrict python to <=3.9.9 by @treff7es in datahub-project#3961
feat(build): remove requirement for git directory for builds by @swaroopjagadish in datahub-project#3977
fix(ingestion): tighten conditions for restli json transformation by @hsheth2 in datahub-project#3973
fix(ingestion): don't dump variables for config errors by @hsheth2 in datahub-project#3974
Bugfix/increase socket timeout by @RyanHolstien in datahub-project#3982
feat(ingest): support for Avro data lake files by @kevinhu in datahub-project#3913
fix(build): exclude old log4j core by @RickardCardell in datahub-project#3966
fix(quickstart): Pin Quickstart version to v0.8.23. by @jjoyce0510 in datahub-project#3983
feat(looker): Adding optional Looker external url base url config by @jjoyce0510 in datahub-project#3985
fix(graphql): support group display name in ownership by @thomasplarsson in datahub-project#3979
fix(quickstart): Assign correct mysql-setup container for M1 and remove "head" default version. by @jjoyce0510 in datahub-project#3987
feat(embedded search results): support custom endpoints in embedded search result by @gabe-lyons in datahub-project#3986
fix(docker): datahub-gms - build in native, copy to target by @swaroopjagadish in datahub-project#3992
fix(ci): moving defaults back to head now that docker builds are green by @swaroopjagadish in datahub-project#3993
feat(ui): UI-based ingestion (as featured in Dec Townhall) by @jjoyce0510 in datahub-project#3975
quickstart: Adding UI ingestion to quickstart YAML by @jjoyce0510 in datahub-project#3994
feat(domains): Adding backend for Asset Domains (p1) by @jjoyce0510 in datahub-project#3952
Bug: a bug fix to bigquery_to_datahub.yml file by @dipeshmaurya in datahub-project#3988
fix(ingest): check if feature data type is present by @maaaikoool in datahub-project#3932
feat(platform-instance): a simple client-only change to support platf… by @swaroopjagadish in datahub-project#3996
docs(metadata-model): Adding to Metadata model docs by @jjoyce0510 in datahub-project#3998
Add Stash Logo & new Source Icons by @maggiehays in datahub-project#4002
feat(domains): UI for Asset Domains (p2) by @jjoyce0510 in datahub-project#3995
docs: add missing back tick for metadata-ingestion/README.md by @nickwu241 in datahub-project#4003
Bugfix/add missing classes by @RyanHolstien in datahub-project#4000
fix(superset): fix connection for redshift by @anshbansal in datahub-project#3944
fix(setup): fix setup for M1 by @anshbansal in datahub-project#3958
docs:add Optum logo by @maggiehays in datahub-project#4005
Refining Metadata Model docs further by @jjoyce0510 in datahub-project#4001
fix(docker): Alpine based multiplatform docker build for kafka-setup by @treff7es in datahub-project#3991
Bugfix/graph concurrency issue by @RyanHolstien in datahub-project#4007
feat(ingest): Add additional snowflake auth by @MikeSchlosser16 in datahub-project#4009
fix(ci): Reverting unnecessary domain test changes by @jjoyce0510 in datahub-project#4013
fix(metrics): Add metrics for mcl hooks by @dexter-mh-lee in datahub-project#4008
feat(platform) - Update FabricType enum to represent more fabrics by @aditya-radhakrishnan in datahub-project#3997
feat(ingest): emit flags and stats for profiling telemetry by @kevinhu in datahub-project#3969
fix(formatting): fix linting lib version requirement by @anshbansal in datahub-project#3939
fix(docs): fix business glossary docs by @anshbansal in datahub-project#3916
fix(profiling): Enabling profiling for low cardinality number columns by @treff7es in datahub-project#3990
fix(docs): update gms link by @lhvubtqn in datahub-project#3927
fix(ingest): lint fix a few files by @swaroopjagadish in datahub-project#4016
fix(ingest): adding platform instance urn to data platform instance aspects by @swaroopjagadish in datahub-project#4015
feat(ingest): use trino python client for sqlalchemy, supports python… by @mayurinehate in datahub-project#3888
fix(spark-lineage): select mock server port dynamically for unit test by @MugdhaHardikar-GSLab in datahub-project#4018
(feat)(Business Glossary) add tabular schema and new UI for business glossary by @saxo-lalrishav in datahub-project#3813
Test/add concurrency issue smoke test by @RyanHolstien in datahub-project#4014
feat(glossary-terms): Index glossary term custom properties by @jjoyce0510 in datahub-project#3960
feat(ingestion): Adding ability to ignore users from top users calculation by @treff7es in datahub-project#3735
Docs/remote deploy and auto render by @RyanHolstien in datahub-project#4020
fix(ingest): snowflake - Run authentication validation if default value used by @treff7es in datahub-project#4024
feat(nifi): handle provenance api variation for older versions by @mayurinehate in ...
DataHub v0.8.25.0
Release Highlights
Buckle up, folks! v0.8.25 brings some very exciting (and highly-requested!) updates.
Notable UI-Based Features
UI-based Ingestion - as demoed in the December Town Hall, we now support creating, configuring, scheduling, & executing batch metadata ingestion using the DataHub user interface. This makes getting metadata into DataHub easier by minimizing the overhead required to operate custom integration pipelines.
Data Domains - DataHub now supports grouping data assets into logical collections called Domains. Domains are curated, top-level folders or categories where related assets can be explicitly grouped. Read the guide here!
Data Containers are now supported! This is the physical grouping of entities, e.g. a Schema is a container of one or more Datasets; a Dashboard is a container of one or more Charts.
Notable Metadata Model & Ingestion-Based Features
Data Quality test results are now supported in the DataHub metadata model. This is the first milestone toward surfacing Dataset & Column-level Data Quality results in the UI (read full scope of work here). Future releases will include a Great Expectations integration & UI support - we’re on track to complete this in Q1 as planned.
Avro files are now supported in the Data Lake File ingestion source
Ingest metadata from multiple instances of the same platform type. This has been a very common use case within the Community - you can now differentiate multiple instances of the same platform type! If you have pre-existing entries, use the datahub migrate command to migrate them over to platform instances.
Ignore users from Top Users calculation
feat(ingestion): Adding ability to ignore users from top users calculation by @treff7es in datahub-project#3735
BigQuery - Data Profiling on only the latest partition/shard
feat(ingestion) bigquery: Profiling only the latest partition/shard on bigquery by @treff7es in datahub-project#3930
(feat)(Business Glossary) add tabular schema and new UI for business glossary by @saxo-lalrishav in datahub-project#3813
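To illustrate the BigQuery latest partition/shard item above, here is a rough sketch of picking the newest date-suffixed shard for each table. It assumes the common BigQuery convention of naming shards with a _YYYYMMDD suffix. latest_shards is a hypothetical helper and is not the code from datahub-project#3930.

```python
import re
from typing import Dict, Iterable

# Assumption: shards follow the "<base>_YYYYMMDD" naming convention.
SHARD_RE = re.compile(r"^(?P<base>.+)_(?P<date>\d{8})$")


def latest_shards(table_names: Iterable[str]) -> Dict[str, str]:
    """Map each sharded table's base name to its most recent shard."""
    latest: Dict[str, str] = {}
    for name in table_names:
        match = SHARD_RE.match(name)
        if match is None:
            continue  # not a date-sharded table; ignored by this sketch
        base, date = match.group("base"), match.group("date")
        current = latest.get(base)
        # YYYYMMDD strings sort chronologically, so string comparison works.
        if current is None or date > current[-8:]:
            latest[base] = name
    return latest


tables = ["events_20220101", "events_20220115", "clicks_20211231", "users"]
to_profile = latest_shards(tables)
print(to_profile)
```

Profiling only the returned shards limits the work to one table per base name instead of every historical shard.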
Notable Fixes
Fix to support View in Looker
feat(looker): Adding optional Looker external url base url config by @jjoyce0510 in datahub-project#3985
fix(graphql): support group display name in ownership by @thomasplarsson in datahub-project#3979
fix(profiling): Enabling profiling for low cardinality number columns by @treff7es in datahub-project#3990
fix(ingestion): match default username for Azure OIDC and Azure ingestion source by @iasoon in datahub-project#3926
DataHub Usage Guides
docs(domains): Adding a User Guide for Domains by @jjoyce0510 in datahub-project#4038
docs(ingest): Adding UI ingestion guide by @jjoyce0510 in datahub-project#4048
What's Changed
fix(vulnerability): Upgrade gms base image by @dexter-mh-lee in datahub-project#3962
logging(frontend): Improve OIDC debug logs by @jjoyce0510 in datahub-project#3967
docs(delete): add curl request example to delete entity by @anshbansal in datahub-project#3928
fix(ingestion): match default username for Azure OIDC and Azure ingestion source by @iasoon in datahub-project#3926
Feature/dynamic platform icons by @RyanHolstien in datahub-project#3968
refactor(ingestion): remove duplicate aspect type by @hsheth2 in datahub-project#3972
fix(example): fix typo by @anshbansal in datahub-project#3907
fix(ingestion): Restrict python to <=3.9.9 by @treff7es in datahub-project#3961
feat(build): remove requirement for git directory for builds by @swaroopjagadish in datahub-project#3977
fix(ingestion): tighten conditions for restli json transformation by @hsheth2 in datahub-project#3973
fix(ingestion): don't dump variables for config errors by @hsheth2 in datahub-project#3974
Bugfix/increase socket timeout by @RyanHolstien in datahub-project#3982
feat(ingest): support for Avro data lake files by @kevinhu in datahub-project#3913
fix(build): exclude old log4j core by @RickardCardell in datahub-project#3966
fix(quickstart): Pin Quickstart version to v0.8.23. by @jjoyce0510 in datahub-project#3983
feat(looker): Adding optional Looker external url base url config by @jjoyce0510 in datahub-project#3985
fix(graphql): support group display name in ownership by @thomasplarsson in datahub-project#3979
fix(quickstart): Assign correct mysql-setup container for M1 and remove "head" default version. by @jjoyce0510 in datahub-project#3987
feat(embedded search results): support custom endpoints in embedded search result by @gabe-lyons in datahub-project#3986
fix(docker): datahub-gms - build in native, copy to target by @swaroopjagadish in datahub-project#3992
fix(ci): moving defaults back to head now that docker builds are green by @swaroopjagadish in datahub-project#3993
feat(ui): UI-based ingestion (as featured in Dec Townhall) by @jjoyce0510 in datahub-project#3975
quickstart: Adding UI ingestion to quickstart YAML by @jjoyce0510 in datahub-project#3994
feat(domains): Adding backend for Asset Domains (p1) by @jjoyce0510 in datahub-project#3952
Bug: a bug fix to bigquery_to_datahub.yml file by @dipeshmaurya in datahub-project#3988
fix(ingest): check if feature data type is present by @maaaikoool in datahub-project#3932
feat(platform-instance): a simple client-only change to support platf… by @swaroopjagadish in datahub-project#3996
docs(metadata-model): Adding to Metadata model docs by @jjoyce0510 in datahub-project#3998
Add Stash Logo & new Source Icons by @maggiehays in datahub-project#4002
feat(domains): UI for Asset Domains (p2) by @jjoyce0510 in datahub-project#3995
docs: add missing back tick for metadata-ingestion/README.md by @nickwu241 in datahub-project#4003
Bugfix/add missing classes by @RyanHolstien in datahub-project#4000
fix(superset): fix connection for redshift by @anshbansal in datahub-project#3944
fix(setup): fix setup for M1 by @anshbansal in datahub-project#3958
docs:add Optum logo by @maggiehays in datahub-project#4005
Refining Metadata Model docs further by @jjoyce0510 in datahub-project#4001
fix(docker): Alpine based multiplatform docker build for kafka-setup by @treff7es in datahub-project#3991
Bugfix/graph concurrency issue by @RyanHolstien in datahub-project#4007
feat(ingest): Add additional snowflake auth by @MikeSchlosser16 in datahub-project#4009
fix(ci): Reverting unnecessary domain test changes by @jjoyce0510 in datahub-project#4013
fix(metrics): Add metrics for mcl hooks by @dexter-mh-lee in datahub-project#4008
feat(platform) - Update FabricType enum to represent more fabrics by @aditya-radhakrishnan in datahub-project#3997
feat(ingest): emit flags and stats for profiling telemetry by @kevinhu in datahub-project#3969
fix(formatting): fix linting lib version requirement by @anshbansal in datahub-project#3939
fix(docs): fix business glossary docs by @anshbansal in datahub-project#3916
fix(profiling): Enabling profiling for low cardinality number columns by @treff7es in datahub-project#3990
fix(docs): update gms link by @lhvubtqn in datahub-project#3927
fix(ingest): lint fix a few files by @swaroopjagadish in datahub-project#4016
fix(ingest): adding platform instance urn to data platform instance aspects by @swaroopjagadish in datahub-project#4015
feat(ingest): use trino python client for sqlalchemy, supports python… by @mayurinehate in datahub-project#3888
fix(spark-lineage): select mock server port dynamically for unit test by @MugdhaHardikar-GSLab in datahub-project#4018
(feat)(Business Glossary) add tabular schema and new UI for business glossary by @saxo-lalrishav in datahub-project#3813
Test/add concurrency issue smoke test by @RyanHolstien in datahub-project#4014
feat(glossary-terms): Index glossary term custom properties by @jjoyce0510 in datahub-project#3960
feat(ingestion): Adding ability to ignore users from top users calculation by @treff7es in datahub-project#3735
Docs/remote deploy and auto render by @RyanHolstien in datahub-project#4020
fix(ingest): snowflake - Run authentication validation if default value used by @treff7es in datahub-project#4024
feat(nifi): handle provenance api variation for older versions by @mayurinehate in ...
Acryl DataHub v0.8.24.3
Highlights
- Fix BigQuery profiling: check whether each table is partitioned to avoid hitting the table quota
- Add support for domain and container ingestion in common sources
- Add --force option to the ingest rollback subcommand
Full Changelog
- datahub-project#4074 @treff7es fix(profile):bigquery - Check for every table if it is partitioned to not hit table quota
- datahub-project#3929 @iasoon docs(ingestion) glue: document required IAM permissions
- datahub-project#4051 @treff7es Data domain containers ingestion
- datahub-project#4026 @anshbansal fix(analytics): fix missing events from UI
- datahub-project#4032 @danilopeixoto feat(cli): add --force option to ingest rollback subcommand
- datahub-project#4064 @jjoyce0510 refactor(model): refactor new Assertion models
- datahub-project#4065 @eburairu feat(ui): Add svg datahub loading logo
- datahub-project#4054 @aditya-radhakrishnan refactor(ingest) - remove snowflake_common dependency on aws_common
- datahub-project#4060 @kevinhu fix(ingest): data-lake - add aws dependencies
- datahub-project#4062 @kevinhu feat(ingest): log CLI invocations and completions
- datahub-project#4061 @pedro93 Mark data lake metadata source as Beta
Acryl DataHub v0.8.24.2
Highlights
- Fix BigQuery ingestion hitting a limit when there are too many partitioned tables
Full Changelog
- datahub-project#4056 @treff7es fix(ingest): bigquery - fix for hitting limit if there are too many partitioned tables
- datahub-project#4059 @jjoyce0510 feat(container): Add domains aspect to container.
- datahub-project#4055 @swaroopjagadish feat(ci): pin tox requirements to speed up ci runs, remove airflow-1 suite until we can pin it
- datahub-project#4052 @dexter-mh-lee fix(mae-consumer-docker): Fix condition for skipping elasticsearch check
- datahub-project#4048 @jjoyce0510 docs(ingest): Adding UI ingestion guide
- datahub-project#4040 @anshbansal feat(analytics): add more analytics for entities
- datahub-project#4042 @MugdhaHardikar-GSLab refactor(spark-lineage): support better parsing of config for future features
- datahub-project#4045 @RyanHolstien fix(platform): prevent invalid urns during ingestion
- datahub-project#3787 @ksrinath feat(model): data quality model
- datahub-project#4047 @swaroopjagadish feat(ingest): add tests for platform instance
- datahub-project#4033 @gabe-lyons feat(users): adding user graphql mutation
- datahub-project#4037 @jjoyce0510 feat(containers): Adding Containers UI (as demo'd in Jan Townhall)
- datahub-project#3807 @rslanka feat(ingest): framework - client side changes for monitoring and reporting
- datahub-project#4038 @jjoyce0510 docs(domains): Adding a User Guide for Domains