Feature/add union data #123

Open · wants to merge 16 commits into `main`
43 changes: 11 additions & 32 deletions .github/PULL_REQUEST_TEMPLATE/maintainer_pull_request_template.md
@@ -4,48 +4,27 @@
**This PR will result in the following new package version:**
<!--- Please add details around your decision for breaking vs non-breaking version upgrade. If this is a breaking change, were backwards-compatible options explored? -->

**Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:**
**Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:**
<!--- Copy/paste the CHANGELOG for this version below. -->

## PR Checklist
### Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
- [ ] dbt compile
- [ ] dbt run --full-refresh
- [ ] dbt run
- [ ] dbt test
- [ ] dbt run --vars (if applicable)
- [ ] dbt run --full-refresh && dbt test
- [ ] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:
- [ ] The appropriate issue has been linked and tagged
- [ ] You are assigned to the corresponding issue and this PR
- [ ] The appropriate issue has been linked, tagged, and properly assigned
- [ ] All necessary documentation and version upgrades have been applied
<!--- Be sure to update the package version in the dbt_project.yml, integration_tests/dbt_project.yml, and README if necessary. -->
- [ ] docs were regenerated (unless this PR does not include any code or yml updates)
- [ ] BuildKite integration tests are passing
- [ ] Detailed validation steps have been provided below

### Detailed Validation
Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":
- [ ] You have validated these changes and assure this PR will address the respective Issue/Feature.
- [ ] You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
- [ ] You have provided details below around the validation steps performed to gain confidence in these changes.
Please share any and all of your validation steps:
<!--- Provide the steps you took to validate your changes below. -->

### Standard Updates
Please acknowledge that your PR contains the following standard updates:
- Package versioning has been appropriately indexed in the following locations:
- [ ] indexed within dbt_project.yml
- [ ] indexed within integration_tests/dbt_project.yml
- [ ] CHANGELOG has individual entries for each respective change in this PR
<!--- If there is a parallel upstream change, remember to reference the corresponding CHANGELOG as an individual entry. -->
- [ ] README updates have been applied (if applicable)
<!--- Remember to check the following README locations for common updates. -->
<!--- Suggested install range (needed for breaking changes) -->
<!--- Dependency matrix is appropriately updated (if applicable) -->
<!--- New variable documentation (if applicable) -->
- [ ] DECISIONLOG updates have been applied (if applicable)
- [ ] Appropriate yml documentation has been added (if applicable)

### dbt Docs
Please acknowledge that after the above were all completed the below were applied to your branch:
- [ ] docs were regenerated (unless this PR does not include any code or yml updates)

### If you had to summarize this PR in an emoji, which would it be?
<!--- For a complete list of markdown compatible emojis check out this git repo (https://gist.github.com/rxaviers/7360908) -->
:dancer:
13 changes: 13 additions & 0 deletions .github/workflows/auto-release.yml
@@ -0,0 +1,13 @@
name: 'auto release'
on:
pull_request:
types:
- closed
branches:
- main

jobs:
call-workflow-passing-data:
if: github.event.pull_request.merged
uses: fivetran/dbt_package_automations/.github/workflows/auto-release.yml@main
secrets: inherit
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,13 @@
# dbt_hubspot_source v0.15.0
[PR #123](https://github.com/fivetran/dbt_hubspot_source/pull/123) includes the following updates:

## 🎉 Feature Update 🎉
- This release supports running the package on multiple Hubspot sources at once! See the [README](https://github.com/fivetran/dbt_hubspot_source?tab=readme-ov-file#step-3-define-database-and-schema-variables) for details on how to leverage this feature.

## 🛠️ Under the Hood 🛠️
- Included auto-releaser GitHub Actions workflow to automate future releases.
- Updated the maintainer PR template to resemble the most up to date format.

# dbt_hubspot_source v0.14.0
[PR #122](https://github.com/fivetran/dbt_hubspot_source/pull/122) includes the following updates:

55 changes: 50 additions & 5 deletions README.md
@@ -44,17 +44,62 @@ Include the following hubspot_source package version in your `packages.yml` file
```yaml
packages:
- package: fivetran/hubspot_source
version: [">=0.14.0", "<0.15.0"]
version: [">=0.15.0", "<0.16.0"]
```

## Step 3: Define database and schema variables
### Option 1: Single connector 💃
By default, this package runs using your destination and the `hubspot` schema. If this is not where your HubSpot data is (for example, if your HubSpot schema is named `hubspot_fivetran`), add the following configuration to your root `dbt_project.yml` file:

```yml
vars:
hubspot_database: your_destination_name
hubspot_schema: your_schema_name
```
> **Note**: If you are running the package on one source connector, each model will have a `source_relation` column that is just an empty string.

### Option 2: Union multiple connectors 👯
If you have multiple Hubspot connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations, and the `source_relation` column of each model will tell you which source a record came from. To use this functionality, set either the `hubspot_union_schemas` OR the `hubspot_union_databases` variable (you cannot set both, though a more flexible approach is in the works...) in your root `dbt_project.yml` file:

```yml
# dbt_project.yml

vars:
hubspot_union_schemas: ['hubspot_usa','hubspot_canada'] # use this if the data is in different schemas/datasets of the same database/project
hubspot_union_databases: ['hubspot_usa','hubspot_canada'] # use this if the data is in different databases/projects but uses the same schema name
```
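Under the hood, the union produces a shape like the following for each source table (a simplified sketch, not the package's exact generated SQL; the database and schema names are illustrative):

```sql
select company.*, 'your_database.hubspot_usa' as _dbt_source_relation
from your_database.hubspot_usa.company as company

union all

select company.*, 'your_database.hubspot_canada' as _dbt_source_relation
from your_database.hubspot_canada.company as company
```

The staging models then surface `_dbt_source_relation` as the more nicely named `source_relation` column.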

#### Recommended: Incorporate unioned sources into DAG
By default, this package defines one single-connector source, called `hubspot`, which will be disabled if you are unioning multiple connectors. This means that your DAG will not include your Hubspot sources, though the package will run successfully.

To properly incorporate all of your Hubspot connectors into your project's DAG:
1. Define each of your sources in a `.yml` file in your project. Utilize the following template for the `source`-level configurations, and, **most importantly**, copy and paste the table and column-level definitions from the package's `src_hubspot.yml` [file](https://github.com/fivetran/dbt_hubspot_source/blob/main/models/src_hubspot.yml#L9-L1313).

```yml
# a .yml file in your root project
sources:
- name: <name> # ex: hubspot_usa
schema: <schema_name> # one of var('hubspot_union_schemas') if unioning schemas, otherwise just 'hubspot'
database: <database_name> # one of var('hubspot_union_databases') if unioning databases, otherwise whatever DB your hubspot schemas all live in
loader: Fivetran
loaded_at_field: _fivetran_synced
tables: # copy and paste from models/src_hubspot.yml
```

> **Note**: If there are source tables you do not have (see [Step 4](https://github.com/fivetran/dbt_hubspot_source?tab=readme-ov-file#step-4-disable-models-for-non-existent-sources)), you may still include them here, as long as you have set the right variables to `False`. Otherwise, you may remove them from your source definitions.

2. Set the `has_defined_sources` variable (scoped to the `hubspot_source` package) to `True`, like such:
```yml
# dbt_project.yml
vars:
hubspot_source:
has_defined_sources: true
```
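Putting steps 1 and 2 together, a minimal two-schema setup might look like the following (the `hubspot_usa`/`hubspot_canada` names are illustrative, and the `tables` entries must be copied from the package's `src_hubspot.yml`):

```yml
# a .yml file in your root project
version: 2
sources:
  - name: hubspot_usa
    schema: hubspot_usa
    database: your_destination_name
    loader: Fivetran
    loaded_at_field: _fivetran_synced
    tables: [] # copy and paste from models/src_hubspot.yml
  - name: hubspot_canada
    schema: hubspot_canada
    database: your_destination_name
    loader: Fivetran
    loaded_at_field: _fivetran_synced
    tables: [] # copy and paste from models/src_hubspot.yml
```

```yml
# dbt_project.yml
vars:
  hubspot_union_schemas: ['hubspot_usa', 'hubspot_canada']
  hubspot_source:
    has_defined_sources: true
```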

## Step 4: Disable models for non-existent sources

> _This step is unnecessary (but still available for use) if you are unioning multiple connectors together in the previous step. That is, the `union_data` macro we use will create completely empty staging models for sources that are not found in any of your Hubspot schemas/databases. However, you can still leverage the below variables if you would like to avoid this behavior._

When setting up your Hubspot connection in Fivetran, it is possible that not every table this package expects will be synced. This can occur because you either don't use that functionality in Hubspot or have actively decided not to sync some tables. Therefore, we have added enable/disable configs in the `src.yml` to let you disable sources that are not present; downstream models are automatically disabled as well. To disable the relevant functionality in the package, add the relevant variables to your root `dbt_project.yml`. By default, all variables are assumed to be `true` (with the exception of `hubspot_service_enabled`, `hubspot_ticket_deal_enabled`, and `hubspot_contact_merge_audit_enabled`). You only need to add variables for the tables that differ from the default:

```yml
@@ -111,10 +156,8 @@
vars:
hubspot_ticket_deal_enabled: true
```

### Dbt-core Version Requirement for disabling freshness tests
If you are not using a source table that involves freshness tests, please be aware that the feature to disable freshness was only introduced in dbt-core 1.1.0. Therefore ensure the dbt version you're using is v1.1.0 or greater for this config to work.

## (Optional) Step 5: Additional configurations
<details open><summary>Expand/collapse configurations</summary>

### Adding passthrough columns
This package includes all source columns defined in the macros folder. Models by default only bring in a few fields for the `company`, `contact`, `deal`, and `ticket` tables. You can add more columns using our pass-through column variables. These variables allow for the pass-through fields to be aliased (`alias`) and casted (`transform_sql`) if desired, but not required. Datatype casting is configured via a sql snippet within the `transform_sql` key. You may add the desired sql while omitting the `as field_name` at the end and your custom pass-though fields will be casted accordingly. Use the below format for declaring the respective pass-through variables within your root `dbt_project.yml`.
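As a sketch of this format (the property names and cast below are illustrative, not from your schema — substitute fields that exist in your destination):

```yml
# dbt_project.yml
vars:
  hubspot__company_pass_through_columns:
    - name: "property_industry"
      alias: "industry"
    - name: "property_annualrevenue"
      alias: "annual_revenue"
      transform_sql: "cast(annual_revenue as int)" # note: no trailing `as field_name`
```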
@@ -206,7 +249,7 @@
models:
+schema: my_new_schema_name # leave blank for just the target_schema
```

### Change the source table references
### Change the source table references (only if using a single connector)
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
> IMPORTANT: See this project's [`dbt_project.yml`](https://github.com/fivetran/dbt_hubspot_source/blob/main/dbt_project.yml) variable declarations to see the expected names.

@@ -215,6 +258,8 @@
vars:
hubspot_<default_source_table_name>_identifier: your_table_name
```
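For example (the table name here is purely illustrative), if your `company` table landed in your destination as `company_v2`, the variable would follow the `hubspot_<default_source_table_name>_identifier` pattern described above:

```yml
# dbt_project.yml
vars:
  hubspot_company_identifier: "company_v2"
```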

</details>

## (Optional) Step 6: Orchestrate your models with Fivetran Transformations for dbt Core™

Fivetran offers the ability for you to orchestrate your dbt project through [Fivetran Transformations for dbt Core™](https://fivetran.com/docs/transformations/dbt). Learn how to set up your project for orchestration through Fivetran in our [Transformations for dbt Core setup guides](https://fivetran.com/docs/transformations/dbt#setupguide).
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'hubspot_source'
version: '0.14.0'
version: '0.15.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: hubspot_source_integration_tests_999
schema: hubspot_source_integration_tests_001
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: hubspot_source_integration_tests_999
schema: hubspot_source_integration_tests_001
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: hubspot_source_integration_tests_999
schema: hubspot_source_integration_tests_001
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: hubspot_source_integration_tests_999
schema: hubspot_source_integration_tests_001
threads: 8
databricks:
catalog: null
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: hubspot_source_integration_tests_999
schema: hubspot_source_integration_tests_001
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
4 changes: 2 additions & 2 deletions integration_tests/dbt_project.yml
@@ -1,12 +1,12 @@
name: 'hubspot_source_integration_tests'
version: '0.14.0'
version: '0.15.0'
profile: 'integration_tests'
config-version: 2
models:
hubspot_source:
+schema:
vars:
hubspot_schema: hubspot_source_integration_tests_999
hubspot_schema: hubspot_source_integration_tests_001
hubspot_source:
hubspot_service_enabled: true
# hubspot_sales_enabled: true # enable when generating docs
5 changes: 4 additions & 1 deletion macros/add_property_labels.sql
@@ -22,16 +22,19 @@ select {{ cte_name }}.*
left join -- create subset of property and property_options for property in question
(select
property_option.property_option_value,
property_option.property_option_label
property_option.property_option_label,
property_option.source_relation
from {{ ref('stg_hubspot__property_option') }} as property_option
join {{ ref('stg_hubspot__property') }} as property
on property_option.property_id = property._fivetran_id
and property_option.source_relation = property.source_relation
where property.property_name = '{{ col.name.replace('property_', '') }}'
and property.hubspot_object = '{{ source_name }}'
) as {{ col.name }}_option

on cast({{ cte_name }}.{{ col_alias }} as {{ dbt.type_string() }})
= cast({{ col.name }}_option.property_option_value as {{ dbt.type_string() }})
and {{ cte_name }}.source_relation = {{ col.name }}_option.source_relation

{% endif -%}
{%- endfor %}
2 changes: 1 addition & 1 deletion macros/all_passthrough_column_check.sql
@@ -2,7 +2,7 @@

{% set available_passthrough_columns = fivetran_utils.remove_prefix_from_columns(
columns=adapter.get_columns_in_relation(ref(relation)),
prefix='property_', exclude=get_macro_columns(get_columns))
prefix='property_', exclude=(get_macro_columns(get_columns) + ['_dbt_source_relation']))
%}

{{ return(available_passthrough_columns|length) }}
4 changes: 4 additions & 0 deletions models/docs.md
@@ -2,6 +2,10 @@
Timestamp of when Fivetran synced a record.
{% enddocs %}

{% docs source_relation %}
The schema or database this record came from if you are unioning multiple connectors together in this package. If you are running the package on a single connector, this will be its schema name.
{% enddocs %}

{% docs _fivetran_deleted %}
Boolean indicating whether a record has been deleted in Hubspot and/or inferred deleted in Hubspot by Fivetran; _fivetran_deleted and is_deleted fields are equivalent.
{% enddocs %}
2 changes: 2 additions & 0 deletions models/src_hubspot.yml
@@ -6,6 +6,8 @@ sources:
database: "{% if target.type != 'spark'%}{{ var('hubspot_database', target.database) }}{% endif %}"
loader: Fivetran
loaded_at_field: _fivetran_synced
config:
enabled: "{{ var('hubspot_union_schemas', []) == [] and var('hubspot_union_databases', []) == [] }}"
tables:
- name: calendar_event
identifier: "{{ var('hubspot_calendar_event_identifier', 'calendar_event')}}"
21 changes: 19 additions & 2 deletions models/stg_hubspot__company.sql
@@ -14,6 +14,14 @@ with base as (
staging_columns=get_company_columns()
)
}}

{{
fivetran_utils.source_relation(
union_schema_variable='hubspot_union_schemas',
union_database_variable='hubspot_union_databases'
)
}}

from base

), fields as (
@@ -27,12 +35,20 @@ with base as (
staging_columns=get_company_columns()
)
}}

{{
fivetran_utils.source_relation(
union_schema_variable='hubspot_union_schemas',
union_database_variable='hubspot_union_databases'
)
}}

{% if all_passthrough_column_check('stg_hubspot__company_tmp',get_company_columns()) > 0 %}
-- just pass everything through if extra columns are present, but ensure required columns are present.
,{{
fivetran_utils.remove_prefix_from_columns(
columns=adapter.get_columns_in_relation(ref('stg_hubspot__company_tmp')),
prefix='property_', exclude=get_macro_columns(get_company_columns()))
prefix='property_', exclude=(get_macro_columns(get_company_columns()) + ['_dbt_source_relation']))
> **Contributor:** Can you explain the need for this addition?
>
> **Contributor:** This comment applies to the other models with the similar code update.
>
> **Contributor (Author):** yeah essentially we don't want to include `_dbt_source_relation` (which is created by `union_data`/`union_relation`) in the `remove_prefix_from_columns` macro call.
>
> without adding it to the exclude list, users passing through all columns would end up with both a `source_relation` and `_dbt_source_relation` column, which is redundant and a lil confusing. thus, this change makes sure that these users just have the more-nicely-named `source_relation` field.

}}
{% endif %}
from base
@@ -52,7 +68,8 @@ with base as (
city,
state,
country,
company_annual_revenue
company_annual_revenue,
source_relation

--The below macro adds the fields defined within your hubspot__company_pass_through_columns variable into the staging model
{{ fivetran_utils.fill_pass_through_columns('hubspot__company_pass_through_columns') }}
10 changes: 9 additions & 1 deletion models/stg_hubspot__company.yml
@@ -19,9 +19,16 @@
description: '{{ doc("history_name") }}'
- name: new_value
description: '{{ doc("history_value") }}'
- name: source_relation
description: '{{ doc("source_relation") }}'

- name: stg_hubspot__company
description: Each record represents a company in Hubspot.
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- company_id
- source_relation
columns:
- name: _fivetran_synced
description: '{{ doc("_fivetran_synced") }}'
@@ -30,7 +37,6 @@
- name: company_id
description: The ID of the company.
tests:
- unique
- not_null
- name: company_name
description: The name of the company.
@@ -52,3 +58,5 @@
description: The country where the company is located.
- name: company_annual_revenue
description: The actual or estimated annual revenue of the company.
- name: source_relation
description: '{{ doc("source_relation") }}'
11 changes: 10 additions & 1 deletion models/stg_hubspot__company_property_history.sql
@@ -14,6 +14,14 @@ with base as (
staging_columns=get_company_property_history_columns()
)
}}

{{
fivetran_utils.source_relation(
union_schema_variable='hubspot_union_schemas',
union_database_variable='hubspot_union_databases'
)
}}

from base

), fields as (
@@ -25,7 +33,8 @@ with base as (
source as change_source,
source_id as change_source_id,
cast(change_timestamp as {{ dbt.type_timestamp() }}) as change_timestamp, -- source field name = timestamp ; alias declared in macros/get_company_property_history_columns.sql
value as new_value
value as new_value,
source_relation
from macro

)