feat(batch-exports): Add Redshift to BatchExport destinations (#18059)
* feat(batch-exports): Add backfill model and service support
* feat(batch-export-backfills): Account for potential restarts while backfilling
* test(batch-exports-backfills): Add Workflow test
* chore(batch-exports-backfill): Bump migration
* feat(batch-exports): Abstract insert activity execution
* feat(batch-exports): Add RedshiftBatchExportWorkflow
* feat(batch-exports): Add Redshift to BatchExport destinations
* feat(batch-exports): Support properties_data_type Redshift plugin parameter
* refactor(batch-exports): Insert rows instead of using COPY
* test: Add unit test for insert_into_redshift_activity
* fix: Address typing issue
* test: Add workflow test
* feat: Frontend support for Redshift batch exports
* docs: Add tests README.md
* fix: Use correct fixture name in test
* fix: Set default properties data type
* fix(batch-exports): Update test
* fix: Add activity to list of supported activities
1 parent 0b90a98 · commit 2adc2f9 · 15 changed files with 1,091 additions and 34 deletions.
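The commit message notes a refactor to insert rows directly instead of using Redshift's `COPY` command. As a rough illustration of that approach (not the commit's actual code), batched inserts over a Postgres-compatible connection can look like the sketch below; the table name, column names, and connection details are placeholders.

```python
# Illustrative sketch only, not the code from this commit: batched INSERTs over a
# Postgres-compatible connection (Redshift speaks the Postgres wire protocol).
# Table name, column names, and connection details are placeholders.
import psycopg2
from psycopg2.extras import execute_values

rows = [
    ("018b-uuid-1", "$pageview", "{}"),
    ("018b-uuid-2", "$identify", "{}"),
]

with psycopg2.connect(
    host="my-redshift-host", user="my-user", password="my-password", dbname="dev"
) as connection:
    with connection.cursor() as cursor:
        # execute_values expands the single %s placeholder into one multi-row VALUES list
        execute_values(
            cursor,
            "INSERT INTO events (uuid, event, properties) VALUES %s",
            rows,
        )
```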
posthog/migrations/0357_add_redshift_batch_export_destination.py (28 additions, 0 deletions)
```python
# Generated by Django 3.2.19 on 2023-10-18 11:40

from django.db import migrations, models


class Migration(migrations.Migration):
    dependencies = [
        ("posthog", "0356_add_replay_cost_control"),
    ]

    operations = [
        migrations.AlterField(
            model_name="batchexportdestination",
            name="type",
            field=models.CharField(
                choices=[
                    ("S3", "S3"),
                    ("Snowflake", "Snowflake"),
                    ("Postgres", "Postgres"),
                    ("Redshift", "Redshift"),
                    ("BigQuery", "Bigquery"),
                    ("NoOp", "Noop"),
                ],
                help_text="A choice of supported BatchExportDestination types.",
                max_length=64,
            ),
        ),
    ]
```
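As an aside on what this migration changes: only the set of accepted `type` choices grows, so after migrating, Django field validation accepts `"Redshift"`. A minimal, hypothetical sketch follows; the model import path is an assumption, and the snippet is not part of the commit.

```python
# Hypothetical sketch, not part of the commit: after this migration, "Redshift" is an
# accepted choice for BatchExportDestination.type. The import path is an assumption.
from posthog.models import BatchExportDestination

destination = BatchExportDestination(type="Redshift")
# Validate only the "type" field; all other fields are excluded so unrelated
# required fields do not interfere with the check.
other_fields = [f.name for f in destination._meta.fields if f.name != "type"]
destination.clean_fields(exclude=other_fields)  # raises ValidationError for unknown types
```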
README.md for batch export tests (new file, 38 additions)
# Testing batch exports

This module contains unit tests covering activities, workflows, and helper functions that power batch exports. Tests are divided by destination, and some destinations require setup steps to enable tests.

## Testing BigQuery batch exports

BigQuery batch exports can be tested against a real BigQuery instance, but doing so requires additional setup. For this reason, these tests are skipped unless an environment variable pointing to a BigQuery credentials file (`GOOGLE_APPLICATION_CREDENTIALS=/path/to/my/project-credentials.json`) is set.

> :warning: Since BigQuery batch export tests require additional setup, they are skipped by default and will not be run by automated CI pipelines. Please ensure these tests pass when making changes that affect BigQuery batch exports.

To enable testing for BigQuery batch exports, we require:

1. A BigQuery project and dataset.
2. A BigQuery ServiceAccount with access to said project and dataset. See the [BigQuery batch export documentation](https://posthog.com/docs/cdp/batch-exports/bigquery#setting-up-bigquery-access) for detailed steps to set up a ServiceAccount.

Then, a [key](https://cloud.google.com/iam/docs/keys-create-delete#creating) can be created for the BigQuery ServiceAccount and saved to a local file. For PostHog employees, this file should already be available in the PostHog password manager.

Tests for BigQuery batch exports can then be run from the root of the `posthog` repo:

```bash
DEBUG=1 GOOGLE_APPLICATION_CREDENTIALS=/path/to/my/project-credentials.json pytest posthog/temporal/tests/batch_exports/test_bigquery_batch_export_workflow.py
```
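The skip-unless-configured behavior described above is commonly expressed as a module-level pytest marker. The sketch below is illustrative only; the exact marker name and mechanism used in the repository may differ.

```python
# Illustrative sketch only; the actual marker used in the PostHog repository may differ.
import os

import pytest

SKIP_IF_MISSING_GOOGLE_APPLICATION_CREDENTIALS = pytest.mark.skipif(
    "GOOGLE_APPLICATION_CREDENTIALS" not in os.environ,
    reason="Google credentials not set in environment",
)

# Applying the marker at module level skips every test in the file when the
# credentials file is not configured.
pytestmark = [SKIP_IF_MISSING_GOOGLE_APPLICATION_CREDENTIALS]
```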
## Testing Redshift batch exports

Redshift batch exports can be tested against a real Redshift (or Redshift Serverless) instance, with additional setup steps required. Due to this requirement, these tests are skipped unless Redshift credentials are specified in the environment.

> :warning: Since Redshift batch export tests require additional setup, they are skipped by default and will not be run by automated CI pipelines. Please ensure these tests pass when making changes that affect Redshift batch exports.

To enable testing for Redshift batch exports, we require:

1. A Redshift (or Redshift Serverless) instance.
2. Network access to this instance (via a VPN connection or jumphost; making a Redshift instance publicly available has serious security implications).
3. User credentials (the user requires `CREATEDB` permissions for testing, but **not** superuser access).

For PostHog employees, check the password manager, as a set of development credentials should already be available. With these credentials, and after connecting to the appropriate VPN, we can run the tests from the root of the `posthog` repo with:

```bash
DEBUG=1 REDSHIFT_HOST=workgroup.111222333.region.redshift-serverless.amazonaws.com REDSHIFT_USER=test_user REDSHIFT_PASSWORD=test_password pytest posthog/temporal/tests/batch_exports/test_redshift_batch_export_workflow.py
```
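Before running the suite, it can help to confirm connectivity with the same environment variables the command above expects. The following is an illustrative sketch, not part of the repository; the port and database name defaults (`5439`, `dev`) are assumptions to adjust for your instance.

```python
# Illustrative connectivity check, not part of the repository. Port and database
# name are assumptions ("5439" and "dev" are common Redshift defaults).
import os

import psycopg2

connection = psycopg2.connect(
    host=os.environ["REDSHIFT_HOST"],
    user=os.environ["REDSHIFT_USER"],
    password=os.environ["REDSHIFT_PASSWORD"],
    port=int(os.environ.get("REDSHIFT_PORT", "5439")),
    dbname=os.environ.get("REDSHIFT_DB", "dev"),
)
with connection.cursor() as cursor:
    cursor.execute("SELECT 1")
    print(cursor.fetchone())  # (1,) confirms the credentials and network path work
connection.close()
```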