Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: drop kafka_skip_broken_messages from kafka_events_json #18970

Closed

Conversation

bretthoerner
Copy link
Contributor

@bretthoerner bretthoerner commented Nov 29, 2023

Problem

We shouldn't be sending any messages that don't deserialize properly, and so we'd like to go back to kafka_skip_broken_messages = 0 in prod.

Changes

Remake virtual Kafka table (and MV that's based on it) without the kafka_skip_broken_messages setting.

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

How did you test this code?

This is modeled after a previous migration and should be reversible. Since it could pause the pipeline we should probably deploy early on a Monday and monitor it (with a revert PR ready).

@bretthoerner
Copy link
Contributor Author

bretthoerner commented Nov 29, 2023

@fuziontech I'm not sure if I missed a tool to create these, I created the migration manually. I assume our migration script checks for the migration by name.

Speaking of which, there are two 0049 migrations here. I hope that's OK/known?: https://github.com/PostHog/posthog/tree/master/posthog/clickhouse/migrations

@bretthoerner bretthoerner requested a review from a team November 29, 2023 20:15
Copy link
Contributor

@tiina303 tiina303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@bretthoerner
Copy link
Contributor Author

Ping @fuziontech - looks like I need your approval.

Copy link
Member

@fuziontech fuziontech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! sorry for lag
🚢 it!

@fuziontech
Copy link
Member

@fuziontech I'm not sure if I missed a tool to create these, I created the migration manually. I assume our migration script checks for the migration by name.

No tool to make these, just manually. Checks for i+1 where i is the current migration id.

Speaking of which, there are two 0049 migrations here. I hope that's OK/known?: https://github.com/PostHog/posthog/tree/master/posthog/clickhouse/migrations

oh, interesting. I would probably increment this to 0051 and bump the last migration to 0050

Copy link
Member

@fuziontech fuziontech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀

@fuziontech
Copy link
Member

Actually. Give me a second to put a PR to correct this. One of those migrations needs to be removed. One sec.

@fuziontech
Copy link
Member

Once this lands, let's rebase this one and 🚢 it
#19110

@bretthoerner bretthoerner force-pushed the brett/drop-events-kafka_skip_broken_messages branch from de1741a to 7cb5d1e Compare December 6, 2023 00:26
@bretthoerner
Copy link
Contributor Author

Rebased 🤞

@bretthoerner
Copy link
Contributor Author

Looks good, just needs your stamp @fuziontech

Copy link
Member

@fuziontech fuziontech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just kept rolling for all green tests.
Ship it!

@bretthoerner
Copy link
Contributor Author

Actually going to close this in favor of the upcoming _error virtual columns.

This migration will allow bad messages to kill ingestion, where as the _error columns can be forwarded and stored in a table for us to analyze (without killing ingestion).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants