DataHub v0.8.33
dexter-mh-lee released this on 15 Apr 18:46
Release Highlights
User Experience
- Refreshed the ML Entity page to match the look and feel of the other entity types, and improved ML lineage functionality
Ingestion Improvements
- Airflow improvements, as demoed in the March Town Hall:
  - Add support for capturing Airflow execution runs from the lineage backend
  - Introduce a new high-level API for generating DataFlow, DataJob, and DataProcessInstance entities
- MS SQL ingestion now captures table & column descriptions
- Trino platform support for Great Expectations
- New Presto-on-Hive ingestion source
- BigQuery ingestion now supports extraction of usage info from audit logs
- Fix to Looker ingestion to extract Explore Views from join names
- Fix to Tableau ingestion to avoid duplicating schema in URNs for upstream tables
- Simplify & annotate Redshift Usage source
Full Commit Log
- feat(gms): Expose kafka listener concurrency as a GMS setting by @jjoyce0510 in #4536
- feat(ingest): add option for external Spark cluster by @kevinhu in #4571
- fix(upgrade): Renaming kafka producer since it clashes with spring-internal by @dexter-mh-lee in #4573
- feat(GraphQL): Add data platform query to GraphQL API by @jjoyce0510 in #4574
- build(ui): Fix Windows UI lint by @mattmatravers in #4556
- doc: make note prominent on quickstart by @anshbansal in #4558
- fix(protobuf) minor bugfixes for protobuf by @leifker in #4553
- feat(docs) Improves docs around developing datahub, removes deprecated docs on building metadata service by @pedro93 in #4552
- chore: cleanup extra file by @anshbansal in #4541
- feat(snowflake): reduce permissions provisioned by default by @anshbansal in #4543
- fix(ingestion): Redshift usage refactoring - simplify, annotate, fix bugs by @rslanka in #4572
- fix(graphql): Adding PRE FabricType to GraphQL by @jjoyce0510 in #4582
- feat(search) - add DATETIME FieldType by @aditya-radhakrishnan in #4407
- fix(tableau): fix for incorrect schema returned by tableau api for sn… by @mayurinehate in #4577
- chore: update default cli for managed ingestion by @anshbansal in #4581
- feat(okta) - add support for filtering/searching when ingesting Okta groups and users by @aditya-radhakrishnan in #4586
- doc(snowflake): add example of table pattern by @anshbansal in #4580
- fix(doc): try to fix broken link by @daha in #4593
- fix(bigquery): incorrect lineage when views are present by @anshbansal in #4568
- feat(metadata-service): Supporting a configurable Authorizer Chain by @jjoyce0510 in #4584
- fix(search): Make sure home page and search pages are consistent by @dexter-mh-lee in #4588
- fix(browse): Reduce browse aggregation size by @dexter-mh-lee in #4601
- doc: add page for handling deprecations, breaking changes etc. by @anshbansal in #4590
- docs(GraphQL): fix typo by @Falci in #4605
- feat(search): Add SearchScore annotation to use fields for search ranking by @dexter-mh-lee in #4596
- feat(ingestion): Redshift Usage Source - simplify OperationalStats workunit generation. by @rslanka in #4585
- feat(tableau): add some logic to normalize table names in tableau by @gabe-lyons in #4609
- fix: urlencode slash in urns too by @daha in #4527
- fix(bigquery): fix lineage bug, improve docs, add dataset filter config by @anshbansal in #4607
- fix(protobuf) fix test instability by @leifker in #4612
- fix(ui): Fix dashboard tags display by @jjoyce0510 in #4611
- feat(ui): Adding GraphQL queries to fetch entity deprecation status by @jjoyce0510 in #4614
- feat(ingest): enable connection string for all sqlalchemy datasources by @ms32035 in #4508
- fix(docs): add grant statements for redshift-ingestion by @Abhiram98 in #4559
- chore: fix lint and remove incorrect integration mark from unit tests by @anshbansal in #4621
- feat: adding gradle, pip cache via gh cache, docker cache via dockerhub by @anshbansal in #4387
- doc(scheduling): make it easier to find ui ingestion by @anshbansal in #4610
- feat(glue): add CatalogId parameter for cross-account access by @BoyuanZhangDE in #4608
- doc(cli): add env variables and options for ingest command by @anshbansal in #4598
- fix(ingest): Restricting pytest docker version to <0.12 by @treff7es in #4639
- fix(cypress) - add waits for cypress search test to remove flakiness by @aditya-radhakrishnan in #4640
- Revert "feat: adding gradle, pip cache via gh cache, docker cache via dockerhub" by @dexter-mh-lee in #4637
- feat(search): Only reindex if the mappings for an existing field changed by @dexter-mh-lee in #4629
- feat: add presto-on-hive metadata ingestion source by @jchen0824 in #4625
- feat(ingest): add trino platform for great expectations by @ms32035 in #4594
- fix(kafka): Stop overriding kafka registry props with empty values by @jsotelo in #4604
- [model]: Dataprocess instance entity to model datajob/jobflow runs by @treff7es in #4459
- feat(ingest): add Urn python library for DataJob, DataFlow, Domain and Tag by @tc350981 in #4618
- fix(ingestion): ensure source/sink reports are always logged by @anshbansal in #4592
- fix(ingestion): extract explore views from join name in Looker by @dyanarose in #4627
- feat(ingestion): Enable lower-casing of the name part of dataset urn if env variable is set. by @rslanka in #4649
- feat: Enable the ingestion of bigquery audit logs to parse usage info… by @tha23rd in #4441
- fix(ingest): Fix snowflake KEY_PAIR auth by @mkamalas in #4638
- fix(home): Fix issue where some browse cards are missing by @dexter-mh-lee in #4652
- fix(tableau): avoid duplicate schema in URNs for upstream tables by @maaaikoool in #4645
- feat(ingest): capture MSSQL table+column descriptions by @kevinhu in #4579
- feat(ml): bringing ml screens up to date w/ the modern ui layout & improving ml lineage by @gabe-lyons in #4651
- (feat:airflow) Add support to capture airflow executions + high level dataflow/jobs api by @treff7es in #4615
- fix(ingestion): add missing workunit ids by @anshbansal in #4657
- fix(ingestion): Adding missing init.py by @anshbansal in #4659
- fix(bigquery-usage): missing dependency by @anshbansal in #4661
- feat(cypress) - add cypress dashboard view to CI by @aditya-radhakrishnan in #4654
- feat(autocomplete): show fully qualified name in autocomplete by @gabe-lyons in #4663
- feat(ingestion) dbt: Fixing issue with strip_user_ids_from_email and adding owner_naming_pattern by @arunvasudevan in #4587
- fix(sqlparser): fix sqlparser breaking due to # sign by @anshbansal in #4662
- fix(ingestion): validate datasource in Tableau connector, before creating its upstream by @nandacamargo in #4613
- Added Relative Routing on the Users & Groups screen by @Ankit-Keshari-Vituity in #4664
- fix(airflow): Not importing emitters directly to eliminate unneeded dependency by @treff7es in #4668
- docs: remove ingestion source summary table by @maggiehays in #4670
- feat(ml): some machine learning followups by @gabe-lyons in #4669
- fix(search): Fix urn component settings by @dexter-mh-lee in #4672
- fix(ingestion): update example recipes by @anshbansal in #4660
- feat(theming): set custom logo without rebuilding by @gabe-lyons in #4674
- feat(data-platform): Add platform entities for the connectors we support by @dexter-mh-lee in #4676
- refactor(authorization): Add authorizedActor function to Authorizer interface by @dexter-mh-lee in #4678
- docs(tags) - add tags usage guide by @aditya-radhakrishnan in #4677
- fix(cli): Suppress printing variables to logs during ingestion failure by @atulsaurav in #4566
- fix(docs): Improving Add Users Doc by @jjoyce0510 in #4679
- Fix/modal validations by @ShubhamThakre in #4673
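As a rough illustration of the kind of issue behind "fix: urlencode slash in urns too" (#4527), the sketch below shows how a URN containing slashes can be percent-encoded before it is placed in a URL path. It uses only the Python standard library and is not the code from that PR; the helper name `encode_urn_for_path` and the sample URN are made up for this example.

```python
from urllib.parse import quote, unquote


def encode_urn_for_path(urn: str) -> str:
    """Percent-encode a URN so it can be used as a single URL path segment.

    Hypothetical helper for illustration only. quote() treats "/" as safe by
    default, so safe="" is passed to make sure slashes are escaped as well.
    """
    return quote(urn, safe="")


# Sample dataset URN (assumed for this example) whose name contains slashes.
urn = "urn:li:dataset:(urn:li:dataPlatform:s3,my-bucket/path/to/table,PROD)"

encoded = encode_urn_for_path(urn)
print(encoded)

# Decoding restores the original URN unchanged.
print(unquote(encoded) == urn)
```

With the default `safe="/"`, the slashes in the dataset name would be left as-is and could be read as extra path separators, which is why the sketch overrides it.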
New Contributors
- @Falci made their first contribution in #4605
- @ms32035 made their first contribution in #4508
- @jchen0824 made their first contribution in #4625
- @dyanarose made their first contribution in #4627
- @mkamalas made their first contribution in #4638
- @atulsaurav made their first contribution in #4566
Full Changelog: v0.8.32...v0.8.33