New snapshot_meta_column_names config for dbt snapshots (#6211)
Resolves #6112

## What are you changing in this pull request and why?

In Versionless and dbt v1.9, we have a new `snapshot_meta_column_names`
config for dbt snapshots that allows for customizing the [column names
of the snapshot meta
fields](https://docs.getdbt.com/docs/build/snapshots#snapshot-meta-fields)
(`dbt_valid_from`, `dbt_valid_to`, `dbt_scd_id`, `dbt_updated_at`) to
align with team or organizational conventions.

For example, they can now have _whichever_ column names you choose
(e.g., `valid_start`, `valid_end`, `scd_identifier`, `last_modified`).

## Previews

-
[`snapshot_meta_column_names`](https://docs-getdbt-com-git-dbeatty10-patch-1-dbt-labs.vercel.app/reference/resource-configs/snapshot_meta_column_names)
- [Snapshot
configurations](https://docs-getdbt-com-git-dbeatty10-patch-1-dbt-labs.vercel.app/reference/snapshot-configs#snapshot-specific-configurations)
- [Snapshot
meta-fields](https://docs-getdbt-com-git-dbeatty10-patch-1-dbt-labs.vercel.app/docs/build/snapshots#snapshot-meta-fields)
- [dbt Cloud release
notes](https://docs-getdbt-com-git-dbeatty10-patch-1-dbt-labs.vercel.app/docs/dbt-versions/dbt-cloud-release-notes#october-2024)

## Checklist
- [x] I have reviewed the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [x] The topic I'm writing about is for specific dbt version(s) and I
have versioned it according to the [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and/or [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content)
guidelines.
- [x] Add a note to the [release
notes](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/release-notes.md)
- [ ] Add a note to the prerelease version [Migration
Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
- [x] Add page in `website/sidebars.js`
- [x] Provide a unique filename for new pages

---------

Co-authored-by: Mirna Wong
Co-authored-by: Leona B. Campbell
Co-authored-by: Grace Goheen
4 people authored Oct 3, 2024
1 parent c8b4993 commit f90259f
Showing 6 changed files with 120 additions and 1 deletion.
4 changes: 4 additions & 0 deletions website/dbt-versions.js
@@ -42,6 +42,10 @@ exports.versions = [
* @property {string} lastVersion The last version the page is visible in the sidebar
*/
exports.versionedPages = [
  {
    page: "reference/resource-configs/snapshot_meta_column_names",
    firstVersion: "1.9",
  },
  {
    page: "reference/resource-configs/target_database",
    lastVersion: "1.8",
2 changes: 2 additions & 0 deletions website/docs/docs/build/snapshots.md
@@ -409,6 +409,8 @@ Basically – keep your query as simple as possible! Some reasonable exceptions

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.

Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless), these column names can be customized to your team or organizational conventions via the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.

| Field | Meaning | Usage |
| -------------- | ------- | ----- |
| dbt_valid_from | The timestamp when this snapshot row was first inserted | This column can be used to order the different "versions" of a record. |
2 changes: 1 addition & 1 deletion website/docs/docs/dbt-versions/release-notes.md
@@ -26,12 +26,12 @@ Release notes are grouped by month for both multi-tenant and virtual private clo
- **Behavior change:** Set [`state_modified_compare_more_unrendered`](/reference/global-configs/behavior-changes#source-definitions-for-state) to true to reduce false positives for `state:modified` when configs differ between `dev` and `prod` environments.
- **Behavior change:** Set the [`skip_nodes_if_on_run_start_fails`](/reference/global-configs/behavior-changes#failures-in-on-run-start-hooks) flag to `True` to skip all selected resources from running if there is a failure on an `on-run-start` hook.
- **Enhancement**: In dbt Cloud Versionless, snapshots defined in SQL files can now use `config` defined in `schema.yml` YAML files. This update resolves the previous limitation that required snapshot properties to be defined exclusively in `dbt_project.yml` and/or a `config()` block within the SQL file. This enhancement will be included in the upcoming dbt Core v1.9 release.
- **New**: In dbt Cloud Versionless, the `snapshot_meta_column_names` config allows for customizing the names of the snapshot metadata columns. This feature allows an organization to align these automatically generated column names with their conventions, and will be included in the upcoming dbt Core 1.9 release.
- **Enhancement**: In May 2024, dbt Cloud versionless began inferring a model's `primary_key` based on configured data tests and/or constraints within `manifest.json`. The inferred `primary_key` is visible in dbt Explorer and utilized by the dbt Cloud [compare changes](/docs/deploy/run-visibility#compare-tab) feature. This will also be released in dbt Core 1.9.
Read about the [order dbt infers columns can be used as primary key of a model](https://github.com/dbt-labs/dbt-core/blob/7940ad5c7858ff11ef100260a372f2f06a86e71f/core/dbt/contracts/graph/nodes.py#L534-L541).
- **New:** dbt Explorer now includes trust signal icons, which is currently available as a [Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud). Trust signals offer a quick, at-a-glance view of data health when browsing your dbt models in Explorer. These icons indicate whether a model is **Healthy**, **Caution**, **Degraded**, or **Unknown**. For accurate health data, ensure the resource is up-to-date and has had a recent job run. Refer to [Trust signals](/docs/collaborate/explore-projects#trust-signals-for-resources) for more information.
- **New:** Auto exposures are now available in Preview in dbt Cloud. Auto-exposures helps users understand how their models are used in downstream analytics tools to inform investments and reduce incidents. It imports and auto-generates exposures based on Tableau dashboards, with user-defined curation. To learn more, refer to [Auto exposures](/docs/collaborate/auto-exposures).


## September 2024

- **New**: Use the new recommended syntax for [defining `foreign_key` constraints](/reference/resource-properties/constraints) using `refs`, available in dbt Cloud Versionless. This will soon be released in dbt Core v1.9. This new syntax will capture dependencies and works across different environments.
109 changes: 109 additions & 0 deletions website/docs/reference/resource-configs/snapshot_meta_column_names.md
@@ -0,0 +1,109 @@
---
resource_types: [snapshots]
description: "Snapshot meta column names"
datatype: "{<dictionary>}"
default_value: {"dbt_valid_from": "dbt_valid_from", "dbt_valid_to": "dbt_valid_to", "dbt_scd_id": "dbt_scd_id", "dbt_updated_at": "dbt_updated_at"}
id: "snapshot_meta_column_names"
---

Available starting in dbt Core v1.9 or with [versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud.

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: <snapshot_name>
    config:
      snapshot_meta_column_names:
        dbt_valid_from: <string>
        dbt_valid_to: <string>
        dbt_scd_id: <string>
        dbt_updated_at: <string>

```

</File>

<File name='snapshots/<filename>.sql'>

```jinja2
{{
    config(
        snapshot_meta_column_names={
            "dbt_valid_from": "<string>",
            "dbt_valid_to": "<string>",
            "dbt_scd_id": "<string>",
            "dbt_updated_at": "<string>",
        }
    )
}}
```

</File>

<File name='dbt_project.yml'>

```yml
snapshots:
  [<resource-path>](/reference/resource-configs/resource-path):
    +snapshot_meta_column_names:
      dbt_valid_from: <string>
      dbt_valid_to: <string>
      dbt_scd_id: <string>
      dbt_updated_at: <string>

```

</File>

## Description

In order to align with an organization's naming conventions, the `snapshot_meta_column_names` config can be used to customize the names of the [metadata columns](/docs/build/snapshots#snapshot-meta-fields) within each snapshot.

## Default

By default, dbt snapshots use the following column names to track change history using [Type 2 slowly changing dimension](https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row) records:

| Field | Meaning | Notes |
| -------------- | ------- | ----- |
| `dbt_valid_from` | The timestamp when this snapshot row was first inserted and became valid. | The value is affected by the [`strategy`](/reference/resource-configs/strategy). |
| `dbt_valid_to` | The timestamp when this row is no longer valid. | |
| `dbt_scd_id` | A unique key generated for each snapshot row. | This is used internally by dbt. |
| `dbt_updated_at` | The `updated_at` timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt. |

However, these column names can be customized using the `snapshot_meta_column_names` config.

:::warning

To avoid unintentional data modification, dbt does **not** automatically apply column renames. If you apply the `snapshot_meta_column_names` config to a snapshot without first updating the pre-existing table, you will get an error. We recommend either using this config only for net-new snapshots, or updating pre-existing tables before committing a column name change.

:::
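
One way to arrange that update is to rename the existing meta columns before enabling the config. The following is a minimal sketch, not dbt functionality: it uses Python's built-in `sqlite3` module as a stand-in for a warehouse and assumes the warehouse supports `ALTER TABLE ... RENAME COLUMN` (syntax varies by platform). The `orders_snapshot` table and the `rename_statements` and `column_names` helpers are hypothetical names introduced for illustration.

```python
import sqlite3

# Default dbt meta column names mapped to the custom names you plan to
# configure in snapshot_meta_column_names (illustrative values).
RENAMES = {
    "dbt_valid_from": "start_date",
    "dbt_valid_to": "end_date",
    "dbt_scd_id": "scd_id",
    "dbt_updated_at": "modified_date",
}


def rename_statements(table, renames):
    """Build one ALTER TABLE statement per meta column to rename."""
    return [
        f"ALTER TABLE {table} RENAME COLUMN {old} TO {new}"
        for old, new in renames.items()
    ]


def column_names(conn, table):
    """Return the table's column names in their stored order."""
    # PRAGMA table_info is SQLite-specific; index 1 of each row is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# Hypothetical pre-existing snapshot table that still uses the default names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders_snapshot ("
    "id INTEGER, dbt_scd_id TEXT, dbt_updated_at TEXT, "
    "dbt_valid_from TEXT, dbt_valid_to TEXT)"
)

for statement in rename_statements("orders_snapshot", RENAMES):
    conn.execute(statement)

print(column_names(conn, "orders_snapshot"))
```

Running the renames in the same deployment step as the config change keeps the table and the config from drifting apart between snapshot runs.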

## Example

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: orders_snapshot
    relation: ref("orders")
    config:
      unique_key: id
      strategy: check
      check_cols: all
      snapshot_meta_column_names:
        dbt_valid_from: start_date
        dbt_valid_to: end_date
        dbt_scd_id: scd_id
        dbt_updated_at: modified_date
```
</File>

The resulting snapshot table contains the configured meta column names:

| id | scd_id | modified_date | start_date | end_date |
| -- | -------------------- | -------------------- | -------------------- | -------------------- |
| 1 | 60a1f1dbdf899a4dd... | 2024-10-02 ... | 2024-10-02 ... | 2024-10-02 ... |
| 2 | b1885d098f8bcff51... | 2024-10-02 ... | 2024-10-02 ... | |
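
Downstream queries need to reference the renamed columns. As a rough sketch, assuming an empty (`NULL`) `end_date` marks the currently valid version of a record, the snippet below selects current rows from a table shaped like the one above. It uses Python's built-in `sqlite3` module as a stand-in for a warehouse; the `current_rows` helper and the row values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders_snapshot ("
    "id INTEGER, scd_id TEXT, modified_date TEXT, "
    "start_date TEXT, end_date TEXT)"
)

# Illustrative rows only; these values are not real snapshot output.
conn.executemany(
    "INSERT INTO orders_snapshot VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a1", "2024-10-01", "2024-10-01", "2024-10-02"),  # superseded version
        (1, "a2", "2024-10-02", "2024-10-02", None),  # current version
        (2, "b1", "2024-10-02", "2024-10-02", None),  # current version
    ],
)


def current_rows(conn, valid_to_column="end_date"):
    """Return (id, scd_id) for rows whose validity window is still open."""
    query = (
        "SELECT id, scd_id FROM orders_snapshot "
        f"WHERE {valid_to_column} IS NULL ORDER BY id"
    )
    return conn.execute(query).fetchall()


print(current_rows(conn))
```

Passing the column name as a parameter mirrors the config: if the snapshot kept the default name, the same helper would be called with `valid_to_column="dbt_valid_to"`.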
3 changes: 3 additions & 0 deletions website/docs/reference/snapshot-configs.md
@@ -78,6 +78,7 @@ snapshots:
[+](/reference/resource-configs/plus-prefix)[strategy](/reference/resource-configs/strategy): timestamp | check
[+](/reference/resource-configs/plus-prefix)[updated_at](/reference/resource-configs/updated_at): <column_name>
[+](/reference/resource-configs/plus-prefix)[check_cols](/reference/resource-configs/check_cols): [<column_name>] | all
[+](/reference/resource-configs/plus-prefix)[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {<dictionary>}

```

@@ -111,6 +112,7 @@ snapshots:
[strategy](/reference/resource-configs/strategy): timestamp | check
[updated_at](/reference/resource-configs/updated_at): <column_name>
[check_cols](/reference/resource-configs/check_cols): [<column_name>] | all
[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {<dictionary>}

```
</File>
@@ -138,6 +140,7 @@ Configurations can be applied to snapshots using [YAML syntax](/docs/build/snaps
[strategy](/reference/resource-configs/strategy)="timestamp" | "check",
[updated_at](/reference/resource-configs/updated_at)="<column_name>",
[check_cols](/reference/resource-configs/check_cols)=["<column_name>"] | "all",
[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names)={<dictionary>}
) }}
```
1 change: 1 addition & 0 deletions website/sidebars.js
@@ -974,6 +974,7 @@ const sidebarSettings = {
"reference/resource-configs/unique_key",
"reference/resource-configs/updated_at",
"reference/resource-configs/invalidate_hard_deletes",
"reference/resource-configs/snapshot_meta_column_names",
],
},
{