feat(source-recharge): add fivetran dbt converter #56

Merged
merged 15 commits into from
Aug 31, 2024
116 changes: 110 additions & 6 deletions connectors/source_recharge/README.md
@@ -1,9 +1,113 @@
# Airbyte source_recharge dbt Package
# Recharge Airbyte dbt Package

This package contains dbt models for Airbyte source_recharge source.
---

What it includes:
- This package contains dbt models for working with the Airbyte Recharge connector.
- The package is compatible with the latest version of the Airbyte Recharge connector.
- Currently, it is limited to transformations that make the connector output compatible with [Fivetran's modeling dbt package](https://github.com/fivetran/dbt_recharge/tree/main).
- In the future, specific models will be applied directly to the Airbyte connector output. If you have an idea or want to propose an analytical model for this source, please refer to the contributing guide, which explains how to propose a new transformation model.
- This package was tested with the BigQuery, Snowflake, and Postgres data warehouses.

* A complete source description
* ERD model for the source
* Diagram documentation for the source
---

## 🎯 Instructions for use

### Airbyte dbt Package

Airbyte dbt packages aren't versioned yet, so you must install this one through git, pointing at its subdirectory. The package doesn't yet apply any transformation models of its own, but you can generate docs and run tests with dbt.

Create the following files:

**`dbt_project.yml`**

```yaml
vars:
  using_fivetran_model: False
  airbyte_database: "airbyte_db_default"
  airbyte_schema: "airbyte_dbt_source_recharge"
```

**`packages.yml`**

```yaml
packages:
  - git: "https://github.com/airbytehq/airbyte-dbt-models.git"
    subdirectory: "connectors/source_recharge"
```

After installing the package with `dbt deps`, you can run `dbt test` or `dbt docs generate` to preview the Airbyte output data.

### Fivetran Recharge Modeling dbt package

This package transforms Airbyte connector output data, making it compatible with Fivetran's Recharge dbt package. You can check the analytical models Fivetran creates [here](https://github.com/fivetran/dbt_recharge/tree/main?tab=readme-ov-file#-what-does-this-dbt-package-do). The link also provides information about how the package works and what is configurable.

Create the required files to use the Airbyte and Fivetran dbt packages:

**`packages.yml`**

```yaml
packages:
  - git: "https://github.com/airbytehq/airbyte-dbt-models.git"
    subdirectory: "connectors/source_recharge"

  - package: fivetran/recharge
    version: [">=0.3.0", "<0.4.0"] # matches the 0.3.0 release pinned in integration_tests
```

The following variables are the default definition; you must configure them for the models to be created.

**`dbt_project.yml`**

```yaml
vars:
  # Required by Airbyte dbt model
  using_fivetran_model: True
  airbyte_database: "airbyte_db_default"
  airbyte_schema: "airbyte_dbt_recharge"

  # Required by Fivetran dbt model
  recharge_database: "airbyte_db_default"
  recharge_schema: "airbyte_dbt_recharge"

  recharge__one_time_product_enabled: true # Set to false if you do not have the ONE_TIME_PRODUCT table. Default is true.
  recharge__charge_tax_line_enabled: true # Set to false if you do not have the CHARGE_TAX_LINE table. Default is true.
  recharge__checkout_enabled: false # Set to true if you do have the CHECKOUT table. Default is false.

  recharge__standardized_billing_model_enabled: false # Default is false.

  recharge__using_orders: true # Default is true, which uses the `orders` version of the source.

  recharge_first_date: "yyyy-mm-dd"
  recharge_last_date: "yyyy-mm-dd"

  recharge_address_identifier: "addresses"
  recharge_address_discounts_identifier: "addresses"
  recharge_address_shipping_line_identifier: "addresses"
  recharge_charge_identifier: "charges"
  recharge_charge_line_item_identifier: "charges"
  recharge_charge_order_attribute_identifier: "charges"
  recharge_charge_shipping_line_identifier: "charges"
  recharge_charge_tax_line_identifier: "charges"
  recharge_customer_identifier: "customers"
  recharge_discount_identifier: "discounts"
  recharge_one_time_product_identifier: "onetimes"
  recharge_order_identifier: "orders"
  recharge_order_line_item_identifier: "orders"
  recharge_subscription_identifier: "subscriptions"
  recharge_subscription_history_identifier: "subscriptions"
```

You need to run the models in steps:

```shell
dbt run --model +source_recharge # create tables needed by Fivetran from Airbyte
dbt run --model +recharge_source # staging tables
dbt run --model +recharge # final analytical model.
```
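
The three steps must run in this order, and a later step is pointless if an earlier one fails. A minimal Python sketch of that sequencing follows; the helper names `build_commands` and `run_in_order` are hypothetical and not part of this package, and the runner is passed in so the logic can be exercised without dbt installed.

```python
from typing import Callable, List, Sequence

# Selectors copied from the three commands above, in the order they must run.
SELECTORS = ("+source_recharge", "+recharge_source", "+recharge")


def build_commands(selectors: Sequence[str] = SELECTORS) -> List[List[str]]:
    """Return one `dbt run --model <selector>` argument list per step."""
    return [["dbt", "run", "--model", selector] for selector in selectors]


def run_in_order(runner: Callable[[List[str]], int]) -> int:
    """Run each step with `runner`; stop at the first non-zero exit code."""
    for command in build_commands():
        exit_code = runner(command)
        if exit_code != 0:
            return exit_code
    return 0


# Example runner (assumes the dbt CLI is on PATH):
#   import subprocess
#   run_in_order(lambda command: subprocess.run(command).returncode)
```

Stopping at the first failure keeps the Fivetran staging and analytical models from running against converter tables that were never built.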

---

## :package: Package Maintenance

- This package is maintained by the Airbyte Community.
- Contributions are welcome at any time: please read the Contributing Guidelines or join the Airbyte Slack channel `#airbyte-dbt-packages`.
48 changes: 48 additions & 0 deletions connectors/source_recharge/integration_tests/dbt_project.yml
@@ -0,0 +1,48 @@
name: integration_test_recharge

config-version: 2

version: 0.1.0

profile: integration_tests

model-paths:
  - models

macro-paths:
  - macros

target-path: target

clean-targets:
  - target
  - dbt_modules
  - logs

require-dbt-version:
  - ">=1.0.0"
  - <2.0.0

models:
  airbyte_dbt_source_recharge:
    materialized: view
    +schema: dbt_recharge
    staging:
      materialized: view
    tmp:
      materialized: view

vars:
  # Required by Airbyte dbt model
  using_fivetran_model: True
  airbyte_database: "airbyte_db_default"
  airbyte_schema: "airbyte_dbt_source_recharge"

  # Required by Fivetran dbt model
  recharge_database: "airbyte_db_default"
  recharge_schema: "airbyte_dbt_source_recharge"
  recharge_source:
    order: "{{ ref('order_extended') }}"
    orders: "{{ ref('order_extended') }}"

  recharge_first_date: "2021-01-01"
13 changes: 13 additions & 0 deletions connectors/source_recharge/integration_tests/package-lock.yml
@@ -0,0 +1,13 @@
packages:
  - local: ../
  - package: fivetran/recharge
    version: 0.3.0
  - package: fivetran/recharge_source
    version: 0.3.1
  - package: fivetran/fivetran_utils
    version: 0.4.10
  - package: dbt-labs/spark_utils
    version: 0.3.0
  - package: dbt-labs/dbt_utils
    version: 1.2.0
sha1_hash: 7e7cc8ea09f670fb388cbe069b5850d11625fe0f
5 changes: 5 additions & 0 deletions connectors/source_recharge/integration_tests/packages.yml
@@ -0,0 +1,5 @@
packages:
  - local: ../

  - package: fivetran/recharge
    version: ["0.3.0"]
1 change: 1 addition & 0 deletions connectors/source_recharge/integration_tests/vars
@@ -0,0 +1 @@
{airbyte_database: $AB_DB, recharge_database: $AB_DB}
85 changes: 85 additions & 0 deletions connectors/source_recharge/models/fivetran_converter/address.sql
@@ -0,0 +1,85 @@
{% if target.type == "snowflake" %}

with tmp as
(
select
id as address_id,
customer_id,
first_name,
last_name,
cast(created_at as {{ dbt.type_timestamp() }}) as address_created_at, -- Snowflake: dbt.type_timestamp() should work as expected
cast(updated_at as {{ dbt.type_timestamp() }}) as address_updated_at,
address1 as address_line_1,
address2 as address_line_2,
city,
province,
zip,
country_code,
company,
phone

FROM
{{ source('source_recharge', 'addresses') }}

)

select *
from tmp

{% elif target.type == "bigquery" %}

with tmp as
(
select
id as address_id,
customer_id,
first_name,
last_name,
cast(created_at as {{ dbt.type_timestamp() }}) as address_created_at, -- BigQuery: dbt.type_timestamp() should work as expected
cast(updated_at as {{ dbt.type_timestamp() }}) as address_updated_at,
address1 as address_line_1,
address2 as address_line_2,
city,
province,
zip,
country_code,
company,
phone

FROM
{{ source('source_recharge', 'addresses') }}

)

select *
from tmp

{% elif target.type == "postgres" %}

with tmp as
(
select
id as address_id,
customer_id,
first_name,
last_name,
cast(created_at as {{ dbt.type_timestamp() }}) as address_created_at, -- Postgres: dbt.type_timestamp() should work as expected
cast(updated_at as {{ dbt.type_timestamp() }}) as address_updated_at,
address1 as address_line_1,
address2 as address_line_2,
city,
province,
zip,
country_code,
company,
phone

FROM
{{ source('source_recharge', 'addresses') }}

)

select *
from tmp

{% endif %}
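
All three branches above select the same columns and differ only in their comments, so the model amounts to a column rename plus two timestamp casts. The rename can be sketched in Python for illustration; the helper `to_fivetran_address` is hypothetical, any row passed to it is made-up sample data, and the timestamp casts are not reproduced.

```python
from typing import Any, Dict

# Airbyte `addresses` column -> column name produced by address.sql.
ADDRESS_COLUMN_MAP: Dict[str, str] = {
    "id": "address_id",
    "customer_id": "customer_id",
    "first_name": "first_name",
    "last_name": "last_name",
    "created_at": "address_created_at",
    "updated_at": "address_updated_at",
    "address1": "address_line_1",
    "address2": "address_line_2",
    "city": "city",
    "province": "province",
    "zip": "zip",
    "country_code": "country_code",
    "company": "company",
    "phone": "phone",
}


def to_fivetran_address(row: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the mapped columns and rename them.

    Columns absent from `row` come back as None; unmapped columns are dropped.
    """
    return {target: row.get(source) for source, target in ADDRESS_COLUMN_MAP.items()}
```

Because the mapping is the same for every warehouse, a single unbranched `select` would express the same thing, as long as `dbt.type_timestamp()` resolves on each target as the inline comments expect.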
38 changes: 38 additions & 0 deletions connectors/source_recharge/models/fivetran_converter/address.yml
@@ -0,0 +1,38 @@
version: 2

models:
  - name: address
    schema: "{{ var('airbyte_schema', target.schema) }}"
    database: "{{ var('airbyte_database', target.database) }}"
    description: "Address table aligned with the Fivetran dbt model."
    config:
      +enabled: "{{ var('using_fivetran_model', False) }}"
    columns:
      - name: address_id
        description: "address unique identifier"
      - name: customer_id
        description: "customer unique identifier"
      - name: first_name
        description: The customer's first name.
      - name: last_name
        description: The customer's last name.
      - name: address_created_at
        description: The date and time the customer address was recorded.
      - name: address_updated_at
        description: The date and time of when the customer's address record was last updated.
      - name: address_line_1
        description: The first line of the customer's address.
      - name: address_line_2
        description: Any additional address information associated with the customer.
      - name: city
        description: The city associated with the customer.
      - name: province
        description: The province or state name associated with the customer.
      - name: zip
        description: The zip or post code associated with the customer.
      - name: country_code
        description: The country code associated with the address.
      - name: company
        description: The company name associated with the customer.
      - name: phone
        description: The phone number associated with the customer.
@@ -0,0 +1,52 @@
{% if target.type == "snowflake" %}

with tmp as
(
select
discount_id,
id as address_id,
NULL::integer as index -- Snowflake requires explicit type casting

FROM
{{ source('source_recharge', 'addresses') }}

)

select *
from tmp

{% elif target.type == "bigquery" %}

with tmp as
(
select
discount_id,
id as address_id,
NULL as index -- BigQuery syntax for a NULL value, implicitly typed

FROM
{{ source('source_recharge', 'addresses') }}

)

select *
from tmp

{% elif target.type == "postgres" %}

with tmp as
(
select
discount_id,
id as address_id,
NULL::integer as index -- PostgreSQL requires explicit type casting for NULL

FROM
{{ source('source_recharge', 'addresses') }}

)

select *
from tmp

{% endif %}
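
The branches above differ only in how the placeholder `index` column is typed. A small Python sketch of that choice follows, using a hypothetical helper `null_index_expression`. It also makes one consequence visible: a `target.type` other than the three listed matches no branch, so the template would render no SQL at all for it.

```python
# Targets handled by the template above, mapped to the expression each branch emits.
NULL_INDEX_BY_TARGET = {
    "snowflake": "NULL::integer as index",
    "bigquery": "NULL as index",
    "postgres": "NULL::integer as index",
}


def null_index_expression(target_type: str) -> str:
    """Return the `index` column expression for a supported target type.

    Raises ValueError for any other target, mirroring the fact that the
    template has no `else` branch and would render an empty model.
    """
    try:
        return NULL_INDEX_BY_TARGET[target_type]
    except KeyError:
        raise ValueError(f"no branch for target type: {target_type!r}") from None
```

Failing loudly for an unsupported target is a design choice of this sketch only; the model as written would silently produce an empty query instead.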
@@ -0,0 +1,16 @@
version: 2

models:
  - name: address_discounts
    schema: "{{ var('airbyte_schema', target.schema) }}"
    database: "{{ var('airbyte_database', target.database) }}"
    description: "Address discounts aligned with the Fivetran dbt model."
    config:
      +enabled: "{{ var('using_fivetran_model', False) }}"
    columns:
      - name: address_id
        description: "Address unique identifier"
      - name: index
        description: A unique numeric row produced for every concurrent address_id.
      - name: discount_id
        description: "Discount unique identifier"