introduce the hubspot engagement table to adjust the joins in int_hub… #13

fivetran-reneeli · 2024-11-20T20:26:47Z

PR Overview

This PR will address the following Issue/Feature: #11 #10 #12

This PR will result in the following new package version:

v0.1.0-a4

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

Breaking Changes

Added the hubspot engagement table to the package and made the following updates:
- Added stg_rag_hubspot__engagement model as part of the hubspot staging models.
- Updated int_rag_hubspot__deal_document to change how the hubspot_engagement_* models are joined: the hubspot__engagement table now serves as the intermediary joining table for the engagement_contact and engagement_company tables.
- Updated int_rag_hubspot__deal_document to retrieve engagement_type from the hubspot engagement table instead of the engagement_emails and engagement_notes tables. The references to those two tables were removed, since this model no longer uses them.
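
As a rough illustration of the join change described above, here is a minimal sketch using an in-memory SQLite database. The table and column names (`engagement_deal`, `contact_id`, `company_id`, and so on) are simplified assumptions and the rows are made up; the join types are also assumed and may not match what the model actually uses.

```python
import sqlite3

# Hypothetical, heavily simplified tables; the real staging models have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table engagement (engagement_id integer, engagement_type text);
    create table engagement_deal (deal_id integer, engagement_id integer);
    create table engagement_contact (engagement_id integer, contact_id integer);
    create table engagement_company (engagement_id integer, company_id integer);

    insert into engagement values (1, 'EMAIL'), (2, 'NOTE');
    insert into engagement_deal values (100, 1), (100, 2);
    insert into engagement_contact values (1, 501);
    insert into engagement_company values (2, 901);
""")

# The engagement table sits in the middle: deals join to it, and the contact
# and company tables join through it. engagement_type is read from it directly.
rows = conn.execute("""
    select
        engagement_deal.deal_id,
        engagement.engagement_id,
        engagement.engagement_type,
        engagement_contact.contact_id,
        engagement_company.company_id
    from engagement_deal
    join engagement
        on engagement_deal.engagement_id = engagement.engagement_id
    left join engagement_contact
        on engagement.engagement_id = engagement_contact.engagement_id
    left join engagement_company
        on engagement.engagement_id = engagement_company.engagement_id
    order by engagement.engagement_id
""").fetchall()

for row in rows:
    print(row)
```

With this toy data, each engagement on the deal appears once, with its type taken from `engagement` and the contact or company filled in where a matching row exists.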

Under the Hood

Updated the unique key in rag__unified_document to include chunk_index. Previously, the unique key was a combination of only document_id, platform, and source_relation, which was not guaranteed to be unique when a document had multiple chunks associated with it.
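
To make the unique key issue concrete, here is a small sketch with made-up rows. The `surrogate_key` helper and its MD5 hashing scheme are assumptions for illustration only and are not necessarily how the package builds its key.

```python
import hashlib
from collections import Counter


def surrogate_key(*parts):
    """Hypothetical key builder: hash the pipe-joined parts (an assumption)."""
    joined = "|".join(str(part) for part in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Made-up rows: document D1 is split into two chunks, D2 has a single chunk.
rows = [
    {"document_id": "D1", "platform": "hubspot", "source_relation": "", "chunk_index": 0},
    {"document_id": "D1", "platform": "hubspot", "source_relation": "", "chunk_index": 1},
    {"document_id": "D2", "platform": "hubspot", "source_relation": "", "chunk_index": 0},
]

# Old key: document_id, platform, source_relation only.
old_keys = [
    surrogate_key(r["document_id"], r["platform"], r["source_relation"]) for r in rows
]
# New key: the same fields plus chunk_index.
new_keys = [
    surrogate_key(r["document_id"], r["platform"], r["source_relation"], r["chunk_index"])
    for r in rows
]

# Count how many rows collide with an earlier row under each key definition.
old_duplicates = sum(count - 1 for count in Counter(old_keys).values())
new_duplicates = sum(count - 1 for count in Counter(new_keys).values())

print("duplicates without chunk_index:", old_duplicates)
print("duplicates with chunk_index:", new_duplicates)
```

In this toy case the two chunks of D1 collide under the old key and are distinct under the new one, which is the situation a uniqueness test on the model would flag.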

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

dbt run --full-refresh && dbt test
[na] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

The appropriate issue has been linked, tagged, and properly assigned
All necessary documentation and version upgrades have been applied
docs were regenerated (unless this PR does not include any code or yml updates)
BuildKite integration tests are passing
Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

If you had to summarize this PR in an emoji, which would it be?

💃

…spot__deal_document

fivetran-reneeli · 2024-11-20T20:29:33Z

models/intermediate/hubspot/int_rag_hubspot__deal_document.sql

Actually, I realized that now that the engagement table exists and the engagement_type field is available in it, do we still need the coalesce between "engagement_emails.engagement_type" and "engagement_notes.engagement_type" here

dbt_unified_rag/models/intermediate/hubspot/int_rag_hubspot__deal_document.sql

Line 56 in 96524a7

{{ unified_rag.coalesce_cast(["engagement_emails.engagement_type", "engagement_notes.engagement_type", "'UNKNOWN'"], dbt.type_string()) }} as engagement_type,

in order to grab engagement_type?

Update: I have since removed the previous "engagement_emails.engagement_type" and "engagement_notes.engagement_type" references and swapped in engagement.engagement_type.
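
For context, the difference between the two approaches can be sketched as follows. This uses SQLite's plain `coalesce` as a stand-in for the `unified_rag.coalesce_cast` macro, and the tables and rows are hypothetical; the CALL engagement is an assumed example of a type that has no row in either the emails or notes table.

```python
import sqlite3

# Hypothetical, simplified tables with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table engagement (engagement_id integer, engagement_type text);
    create table engagement_emails (engagement_id integer, engagement_type text);
    create table engagement_notes (engagement_id integer, engagement_type text);

    insert into engagement values (1, 'EMAIL'), (2, 'NOTE'), (3, 'CALL');
    insert into engagement_emails values (1, 'EMAIL');
    insert into engagement_notes values (2, 'NOTE');
""")

# old_engagement_type mimics the previous coalesce across emails and notes;
# new_engagement_type reads the field straight from the engagement table.
rows = conn.execute("""
    select
        engagement.engagement_id,
        cast(coalesce(
            engagement_emails.engagement_type,
            engagement_notes.engagement_type,
            'UNKNOWN'
        ) as text) as old_engagement_type,
        engagement.engagement_type as new_engagement_type
    from engagement
    left join engagement_emails
        on engagement.engagement_id = engagement_emails.engagement_id
    left join engagement_notes
        on engagement.engagement_id = engagement_notes.engagement_id
    order by engagement.engagement_id
""").fetchall()

for row in rows:
    print(row)
```

Under these assumptions, the coalesce falls back to 'UNKNOWN' for the engagement that is neither an email nor a note, while reading from the engagement table returns its actual type.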

fivetran-joemarkiewicz

@fivetran-reneeli thanks for working through these changes. I have a few comments to be addressed before approval. Let me know if you have any questions. Thanks!

fivetran-joemarkiewicz · 2024-11-25T20:42:22Z

CHANGELOG.md

+## Under the Hood
+- Updated the unique key in `rag__unified_document` to include `chunk_index`. Previously, the unique key was a combination of only `document_id`, `platform`, and `source_relation`, which was potentially inaccurate if there were multiple chunks associated with a document.


I would recommend we classify this as a bug fix instead of an under-the-hood change. It will have an effect on customer data, so it warrants more than an under-the-hood classification.

Good point, updated

README.md

dbt_project.yml

models/intermediate/hubspot/int_rag_hubspot__deal_document.sql

fivetran-joemarkiewicz

@fivetran-reneeli approved with just two small requests before moving to release review.

CHANGELOG.md

dbt_project.yml

integration_tests/dbt_project.yml

Co-authored-by: Joe Markiewicz <[email protected]>

fivetran-avinash · 2024-11-27T22:32:07Z

integration_tests/seeds/hubspot_engagement.csv

@@ -0,0 +1,2 @@
+id,type,_fivetran_synced,portal_id


Is there a reason only a subset of seed columns are being run and tested here? Can we add all the relevant columns brought into the staging model?

That's a great point. I initially mocked this up from the internal schema, using the new version of the tables that I was testing with. I added in the rest of the columns!

fivetran-avinash · 2024-11-27T22:34:02Z

models/staging/hubspot_staging/stg_rag_hubspot.yml

+          The ID of the engagement's owner.
+
+          PLEASE NOTE - This field will only be populated for pre HubSpot v3 API versions. This field is only included to allow for backwards compatibility between HubSpot API versions. This field will be deprecated in the near future.
+      - name: portal_id


Don't forget to document source_relation!

thanks! updated

CHANGELOG.md

fivetran-avinash

@fivetran-reneeli Thanks for the PR! A few comments and a question about the seed file before approving.

Co-authored-by: Avinash Kunnath <[email protected]>

fivetran-reneeli · 2024-11-28T02:06:54Z

Thanks @fivetran-avinash! Addressed comments. Also, I noticed that I hadn't updated the hubspot seed data with the new category field. Even though it doesn't play a part downstream, I decided to update the seed data and get-columns macros anyway. Additionally, I noticed some missing field descriptions in the rest of the hubspot staging models, so I added those in as well.

fivetran-avinash · 2024-11-28T03:15:21Z

CHANGELOG.md

+- Updated the `unique_id` in `rag__unified_document` to include `chunk_index`. Previously, the unique key was a combination of only `document_id`, `platform`, and `source_relation`, which was potentially inaccurate if there were multiple chunks associated with a document.
+
+## Under the Hood
+- Updated the *hubspot_x* seed data and *get_hubspot_x_columns* macros with the new `category` field where relevant.


Suggested change

-- Updated the *hubspot_x* seed data and *get_hubspot_x_columns* macros with the new `category` field where relevant.
+- Updated the `hubspot_*` seed data and `get_hubspot_*_columns` macros with the new `category` field where relevant.

Small suggestion update.

fivetran-avinash

Thanks @fivetran-reneeli for doing this quick review!

Good call out on adding category to the relevant seed files. However, I noticed category is not being brought into the staging models, and we need to account for the joins in int_rag_hubspot__deal_document, as was discussed in [#10] (I presume it's part of this PR since it's linked to the issue). Sorry for not catching this in the first go-around.

@fivetran-joemarkiewicz Let's discuss Monday if we want to try and fold these changes back in this sprint or just release as is.

introduce the hubspot engagement table to adjust the joins in int_hub…

c3218b4

…spot__deal_document


fivetran-reneeli added 5 commits November 20, 2024 17:07

fix

5b7920a

switch out engagement_type field source, update changelog

51a7484

update unique id in unified_document, update docs

ed1e205

update versioning

2db9b0a

docs and update readme format

57a1017

fivetran-reneeli self-assigned this Nov 22, 2024

fivetran-reneeli added 3 commits November 25, 2024 11:28

update jira comment seed file to test for chunk_index

0829812

switch versions

938807b

try new schema

b175bfe

fivetran-reneeli requested a review from fivetran-joemarkiewicz November 25, 2024 19:22

fivetran-joemarkiewicz requested changes Nov 25, 2024


updates

f846b71

fivetran-reneeli requested a review from fivetran-joemarkiewicz November 26, 2024 16:14

fivetran-joemarkiewicz approved these changes Nov 26, 2024


fivetran-reneeli and others added 4 commits November 26, 2024 14:22

Update CHANGELOG.md

c8cd90a

Co-authored-by: Joe Markiewicz <[email protected]>

Update dbt_project.yml

ea23ba5

Co-authored-by: Joe Markiewicz <[email protected]>

Update integration_tests/dbt_project.yml

588d324

Co-authored-by: Joe Markiewicz <[email protected]>

new schema

dda981c

fivetran-avinash reviewed Nov 27, 2024


Update CHANGELOG.md

2768c4d

Co-authored-by: Avinash Kunnath <[email protected]>

fivetran-reneeli and others added 4 commits November 27, 2024 18:50

Update CHANGELOG.md

e484cfd

Co-authored-by: Avinash Kunnath <[email protected]>

updates

528f43c

Update CHANGELOG.md

67cb540

Co-authored-by: Avinash Kunnath <[email protected]>

docs

67dff0d

fivetran-avinash reviewed Nov 28, 2024
