Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: blobby ingestion warnings #20963

Merged
merged 19 commits into from
Mar 19, 2024
Merged

Conversation

pauldambra
Copy link
Member

@pauldambra pauldambra commented Mar 16, 2024

As an incident follow-up we want to write ingestion warnings from blob ingestion for the version of the web sdk that is writing the data. But we don't record that info in blobby or write ingestion warnings from blobby.

Let's start writing ingestion warnings from blobby...

Let's start with data we already drop - timestamp skew. Right now there's no way for a customer to see that this is happening.

Currently we wait for up to a month of skew which is way too much. Let's allow 7 days of skew before we start dropping data. (even if this isn't handling lag correctly, kafka would be dropping data based on age before this code does)

Screenshot 2024-03-16 at 07 58 34

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

2 snapshot changes in total. 0 added, 2 modified, 0 deleted:

  • chromium: 0 added, 2 modified, 0 deleted (diff for shard 1)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@pauldambra pauldambra marked this pull request as ready for review March 16, 2024 08:19
Copy link
Contributor

github-actions bot commented Mar 16, 2024

Size Change: +2.52 kB (0%)

Total Size: 824 kB

Filename Size Change
frontend/dist/toolbar.js 824 kB +2.52 kB (0%)

compressed-size-action

@pauldambra pauldambra requested a review from a team March 17, 2024 14:59
@marandaneto
Copy link
Member

Nice improvement, this is important for Mobile since date and time is often not up to date.

Copy link
Member

@marandaneto marandaneto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments but otherwise LGTM

Co-authored-by: Manoel Aranda Neto <[email protected]>
Copy link
Contributor

@xvello xvello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on not passing the full db anymore, and happy that we're extending ingestion warnings!

There's a scalability issue with ingestion warnings though, as the app is currently loading the full dataset instead of paginating it when you unfurl a section. This causes the app to become unresponsive if there's too many warnings.

For the overflow warning we debounce per distinct_id once an hour per pod and it is not enough on some accounts.

We'll need to fix this by better CH deduplication and pagination on the read API, but in the meantime, we might want to move the token-bucket debouncing within captureIngestionWarning itself?

  • add a new optional debounce_key param, that'll be the session_id on your case
  • repurpose OverflowWarningLimiter into a wider IngestionWarningLimiter, still once per hour as currently setup
  • captureIngestionWarning consumes the limiter with ${type}:${teamId}:${debounce_key} if key is present, and skip the kafka produce if no token is available?

@pauldambra pauldambra requested a review from xvello March 19, 2024 08:24
Copy link
Contributor

@xvello xvello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a debounce key to the person state warnings then merge! Looks great

@@ -323,15 +323,15 @@ export class PersonState {
return undefined
}
if (isDistinctIdIllegal(mergeIntoDistinctId)) {
await captureIngestionWarning(this.db, teamId, 'cannot_merge_with_illegal_distinct_id', {
await captureIngestionWarning(this.db.kafkaProducer, teamId, 'cannot_merge_with_illegal_distinct_id', {
illegalDistinctId: mergeIntoDistinctId,
otherDistinctId: otherPersonDistinctId,
eventUuid: this.event.uuid,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best not to rate-limit these, as they contain useful info and merges are supposed to be rare.
I support not making the rate-limiting optional to avoid deploying a bomb, but let's add mergeIntoDistinctId as the debounce key for these the three warnings on this file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed a solution to let us override the debounce key construction so that the three uses in the person state are debounced together.... is that what you meant?

(happy to change it if I'm being silly :))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being unclear: I do not wish to debounce these warnings, and send all of them to CH. They should be extremely infrequent unless there's a subtle instrumentation bug, and in that case it's better to have as much info as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so keeping the safe by default... i've made it opt in to always sending rather than implicit on absent debounce key for e.g.

so

  1. always debounce on team id and type
  2. unless you provide a debounce key in which case it is team id, type, and key
  3. unless you say alwaysSend in which case it always sends

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 2)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 2)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

Copy link
Contributor

@xvello xvello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! Thanks and sorry for the back-and-forth

@pauldambra pauldambra merged commit 906737b into master Mar 19, 2024
131 checks passed
@pauldambra pauldambra deleted the feat/blobby-ingestion-warning branch March 19, 2024 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants