Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: PostgreSQL batch export with SPMC abstractions #27348

Merged
merged 3 commits into from
Jan 9, 2025

Conversation

tomasfarias
Copy link
Contributor

@tomasfarias tomasfarias commented Jan 8, 2025

Problem

Final PR to add SPMC abstractions (well, except for HTTP batch export, but those are created more on-demand). This one targets PostgreSQL.

Changes

  • Define PostgreSQLConsumer that inherits from spmc.Consumer.
  • Run consumer with run_consumer.
  • Add a new BATCH_EXPORT_POSTGRES_RECORD_BATCH_QUEUE_MAX_SIZE_BYTES setting.

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

How did you test this code?

Tests are passing.

@tomasfarias tomasfarias requested a review from rossgray January 8, 2025 10:18
Copy link
Contributor

@rossgray rossgray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚢

if batch_export.destination.type in ("BigQuery", "Redshift")
else SYNC_BATCH_EXPORTS_TASK_QUEUE
)
task_queue = SYNC_BATCH_EXPORTS_TASK_QUEUE if batch_export.destination.type == "HTTP" else BATCH_EXPORTS_TASK_QUEUE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will the task queue only be updated if a user manually syncs or updates the batch export?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. But we can run a script to update everything ourselves once this PR is merged.

@tomasfarias tomasfarias force-pushed the feat/spmc-for-postgresql-batch-export branch from 4bcb9e3 to a4bc9c1 Compare January 8, 2025 14:20
@tomasfarias tomasfarias force-pushed the feat/spmc-for-postgresql-batch-export branch from d759b49 to c624b6d Compare January 8, 2025 16:32
@tomasfarias tomasfarias merged commit c4dcb0b into master Jan 9, 2025
92 checks passed
@tomasfarias tomasfarias deleted the feat/spmc-for-postgresql-batch-export branch January 9, 2025 10:16
Copy link

sentry-io bot commented Jan 9, 2025

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ ForeignKeyViolation: insert or update on table "events" violates foreign key constraint "foreing key" posthog.temporal.batch_exports.postgres_batch_e... View Issue
  • ‼️ RecordBatchTaskError: The record batch consumer encountered an error during execution posthog.temporal.batch_exports.spmc in raise_on... View Issue

Did you find this useful? React with a 👍 or 👎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants