From edb82ad91fba8a401c56b82bc4c2916a39a6a6dd Mon Sep 17 00:00:00 2001
From: Andrew Sikowitz
Date: Tue, 24 Oct 2023 18:56:14 -0400
Subject: [PATCH 1/2] docs(ingest/bigquery): Add docs for breaking change: match_fully_qualified_names (#9094)

---
 docs/how/updating-datahub.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/docs/how/updating-datahub.md b/docs/how/updating-datahub.md
index 3af3b2bdda215..7d8c25b06255a 100644
--- a/docs/how/updating-datahub.md
+++ b/docs/how/updating-datahub.md
@@ -11,11 +11,17 @@ This file documents any backwards-incompatible changes in DataHub and assists pe
   by Looker and LookML source connectors.
 - #8853 - The Airflow plugin no longer supports Airflow 2.0.x or Python 3.7. See the docs for more details.
 - #8853 - Introduced the Airflow plugin v2. If you're using Airflow 2.3+, the v2 plugin will be enabled by default, and so you'll need to switch your requirements to include `pip install 'acryl-datahub-airflow-plugin[plugin-v2]'`. To continue using the v1 plugin, set the `DATAHUB_AIRFLOW_PLUGIN_USE_V1_PLUGIN` environment variable to `true`.
-- #8943 The Unity Catalog ingestion source has a new option `include_metastore`, which will cause all urns to be changed when disabled.
+- #8943 - The Unity Catalog ingestion source has a new option `include_metastore`, which will cause all urns to be changed when disabled.
 This is currently enabled by default to preserve compatibility, but will be disabled by default and then removed in the future.
 If stateful ingestion is enabled, simply setting `include_metastore: false` will perform all required cleanup.
 Otherwise, we recommend soft deleting all databricks data via the DataHub CLI:
 `datahub delete --platform databricks --soft` and then reingesting with `include_metastore: false`.
+- #9077 - The BigQuery ingestion source by default sets `match_fully_qualified_names: true`.
+This means that any `dataset_pattern` or `schema_pattern` specified will be matched on the fully
+qualified dataset name, i.e. `<project_id>.<dataset_name>`. If this is not the case, please
+update your pattern (e.g. prepend your old dataset pattern with `.*\.`, which matches the project part),
+or set `match_fully_qualified_names: false` in your recipe. However, note that
+setting this to `false` is deprecated and this flag will be removed entirely in a future release.

 ### Potential Downtime

From fe18532b29e35af1cd3007e6affc102042b1af61 Mon Sep 17 00:00:00 2001
From: skrydal
Date: Wed, 25 Oct 2023 00:58:56 +0200
Subject: [PATCH 2/2] docs(update): Added info on breaking change for policies (#9093)

Co-authored-by: Pedro Silva
---
 docs/how/updating-datahub.md | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/docs/how/updating-datahub.md b/docs/how/updating-datahub.md
index 7d8c25b06255a..57193ea69f2be 100644
--- a/docs/how/updating-datahub.md
+++ b/docs/how/updating-datahub.md
@@ -16,6 +16,39 @@ This is currently enabled by default to preserve compatibility, but will be disa
 If stateful ingestion is enabled, simply setting `include_metastore: false` will perform all required cleanup.
 Otherwise, we recommend soft deleting all databricks data via the DataHub CLI:
 `datahub delete --platform databricks --soft` and then reingesting with `include_metastore: false`.
+- #8846 - Changed enum values in resource filters used by policies. `RESOURCE_TYPE` became `TYPE` and `RESOURCE_URN` became `URN`.
+Any existing policies using these filters (i.e. defined for particular `urns` or `types` such as `dataset`) need to be upgraded
+manually, for example by retrieving their respective `dataHubPolicyInfo` aspect and changing the filter section, i.e. from
+```yaml
+  "resources": {
+    "filter": {
+      "criteria": [
+        {
+          "field": "RESOURCE_TYPE",
+          "condition": "EQUALS",
+          "values": [
+            "dataset"
+          ]
+        }
+      ]
+    }
+  }
+```
+into
+```yaml
+  "resources": {
+    "filter": {
+      "criteria": [
+        {
+          "field": "TYPE",
+          "condition": "EQUALS",
+          "values": [
+            "dataset"
+          ]
+        }
+      ]
+    }
+  }
+```
+e.g. using the `datahub put` command. Policies can also be removed and re-created via the UI.
 - #9077 - The BigQuery ingestion source by default sets `match_fully_qualified_names: true`.
 This means that any `dataset_pattern` or `schema_pattern` specified will be matched on the fully
 qualified dataset name, i.e. `<project_id>.<dataset_name>`. If this is not the case, please