[Feature] Allow snapshots to have unique_key as a list #9992

Closed
3 tasks done
Tracked by #10151
AGPapa opened this issue Apr 22, 2024 · 4 comments · Fixed by #10795
Labels
enhancement (New feature or request), snapshots (Issues related to dbt's snapshot functionality), user docs [docs.getdbt.com] (Needs better documentation)

Comments

@AGPapa
Contributor

AGPapa commented Apr 22, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

This is a clone of dbt-labs/dbt-adapters#181.

Right now the 'unique_key' for a snapshot must be a single column or an expression. A user who needs a combination of columns to be unique has to write an expression that concatenates those columns together however they like.

Ideally users would be able to enter a list of columns, just like they can do for incremental models.

Describe alternatives you've considered

I considered adding additional parameters instead of changing the existing unique_key parameter. I think using the existing parameter is best because it's the most consistent with how incremental tables work.

Who will this benefit?

Snapshot users who work in Postgres or Redshift can take advantage of performance optimizations in their databases with this change.

There are performance benefits from using multiple columns rather than an expression. In Postgres, users would be able to pass in partition keys or indexed columns to speed up their snapshots without needing to create an additional expression index. In Redshift, users could pass in a separate distribution key to get similar performance benefits.

These benefits can't be realized unless dbt allows the unique key to natively be a combination of columns, rather than a concatenation/combination expression.
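To make the difference concrete, here is a minimal Python sketch of how a list-valued unique key could be turned into a column-by-column join predicate. This is an illustration only, not dbt's implementation: the function name `unique_key_join_predicate` and the aliases `src` and `snap` are hypothetical, and a string value is assumed to name a single column.

```python
from typing import List, Union


def unique_key_join_predicate(
    unique_key: Union[str, List[str]],
    source_alias: str = "src",    # hypothetical alias for the source query
    target_alias: str = "snap",   # hypothetical alias for the snapshot table
) -> str:
    """Build an illustrative join predicate from a unique key.

    A string is treated as one column name; a list yields one equality
    per column, so each column is compared directly rather than through
    a concatenated expression.
    """
    columns = [unique_key] if isinstance(unique_key, str) else list(unique_key)
    if not columns:
        raise ValueError("unique_key must name at least one column")
    return " and ".join(
        f"{source_alias}.{col} = {target_alias}.{col}" for col in columns
    )


# A single column behaves as before.
print(unique_key_join_predicate("id"))
# A list produces one equality per column.
print(unique_key_join_predicate(["id", "region"]))
```

Because each column appears on its own in the predicate, the database can match it against an existing index, partition key, or distribution key on that column, which is the benefit described above.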

Are you interested in contributing this feature?

Yes, I have a PR in dbt-adapters and in dbt-core coming shortly

Anything else?

No response

AGPapa added the enhancement and triage labels on Apr 22, 2024
dbeatty10 added the snapshots label on Apr 22, 2024
@graciegoheen
Contributor

Thanks so much for opening - agree that ideally users would be able to enter a list of columns for unique_key for snapshots, just like they can do for incremental models.

Will queue up your PRs for review from our team

@ikhudur

ikhudur commented Jul 5, 2024

@graciegoheen Any chance the PR will be merged soon?

@hope-wat

hope-wat commented Oct 21, 2024

@graciegoheen can we call this one done, since it moved over to adapters and was pulled into main?

dbeatty10 added the user docs [docs.getdbt.com] label on Oct 23, 2024
@FishtownBuildBot
Collaborator

Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#6343

dbeatty10 changed the title from "[Feature] Allows snapshots to have a list as a unique key" to "[Feature] Allow snapshots to have unique_key as a list" on Dec 10, 2024
6 participants