feat(blobby): overflow detection updates the redis set with keys that trigger #21154

Merged: 5 commits, Mar 27, 2024

@xvello (Contributor) commented Mar 26, 2024

Problem

Follow-up to #21046 to implement consumer-side overflow detection. See https://github.com/PostHog/product-internal/pull/573 for spec.

Changes

  • OverflowDetection is promoted to OverflowManager, which is now also responsible for updating Redis when detection triggers
  • Added a CAPTURE_CONFIG_REDIS_HOST envvar pointing to the Redis instance that capture reads from (the common Django Redis for now). We'll use the same envvar when implementing the same detection for analytics. Port and credentials are not configurable, as that was only needed for the public chart
  • Entries that expired more than one hour ago are removed by plugin-server when it adds new ones. This doesn't need to be fast, as capture filters out expired entries when retrieving; we just want to make sure that the zset size does not explode
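
For readers who have not opened the diff, the behaviour described in the list above can be sketched roughly as follows. This is not the code from this PR: the token-bucket detection, the class name `OverflowManagerSketch`, the `SortedSetClient` interface, and every constructor parameter are assumptions made for illustration. Only the Redis commands (ZADD, ZREMRANGEBYSCORE) and the one-hour pruning window come from the description.

```typescript
// Minimal subset of a Redis client, modelled on the ZADD and ZREMRANGEBYSCORE
// commands. Hypothetical interface, not the client used by plugin-server.
interface SortedSetClient {
    zadd(key: string, score: number, member: string): Promise<number>
    zremrangebyscore(key: string, min: number, max: number): Promise<number>
}

// Assumed detection mechanism: a token bucket with `capacity` units of burst,
// refilled at `refillPerSecond`. The real detection logic may differ.
class TokenBucket {
    private tokens: number
    private lastRefillSec: number

    constructor(private readonly capacity: number, private readonly refillPerSecond: number, nowSec: number) {
        this.tokens = capacity
        this.lastRefillSec = nowSec
    }

    // Returns false when the bucket cannot cover `cost`.
    consume(cost: number, nowSec: number): boolean {
        const elapsed = Math.max(0, nowSec - this.lastRefillSec)
        this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSecond)
        this.lastRefillSec = Math.max(this.lastRefillSec, nowSec)
        if (this.tokens < cost) {
            return false
        }
        this.tokens -= cost
        return true
    }
}

class OverflowManagerSketch {
    private buckets = new Map<string, TokenBucket>()
    private triggered = new Set<string>()

    constructor(
        private readonly redis: SortedSetClient,
        private readonly redisKey: string,
        private readonly burstCapacity: number,
        private readonly replenishRate: number,
        private readonly cooldownSeconds: number
    ) {}

    // `nowMs` is assumed to be a millisecond timestamp.
    async observe(key: string, quantity: number, nowMs: number): Promise<void> {
        const nowSec = nowMs / 1000
        let bucket = this.buckets.get(key)
        if (!bucket) {
            bucket = new TokenBucket(this.burstCapacity, this.replenishRate, nowSec)
            this.buckets.set(key, bucket)
        }
        if (bucket.consume(quantity, nowSec) || this.triggered.has(key)) {
            return // within budget, or already flagged: no Redis round trip
        }
        this.triggered.add(key)
        // The score is the expiry time, so a reader can filter on score >= now.
        await this.redis.zadd(this.redisKey, nowSec + this.cooldownSeconds, key)
        // Prune entries that expired more than one hour ago to bound the zset size.
        await this.redis.zremrangebyscore(this.redisKey, 0, nowSec - 3600)
    }
}
```

In this sketch the score stored with each member is its expiry time, which is what would let a reader ignore expired entries with a score-range query, and let the writer prune lazily instead of on a schedule.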

Does this work well for both Cloud and self-hosted?

Neither documented nor supported on hobby

How did you test this code?

  • Added integration tests on expected behaviours

@xvello xvello requested review from pauldambra and a team March 26, 2024 10:43
@pauldambra (Member) left a comment

Recording ingestion runs locally with and without the overflow detection active, so let's 🚢 🚢 🚢 🚢 🚢 🚢

Comment on lines 257 to +259
  const key = `${team_id}-${session_id}`
+ // TODO: use this for session key too if it's safe to do so
+ const overflowKey = `${team_id}:${session_id}`
@pauldambra (Member) commented:

Does the TODO mean we should use the same format for key and overflowKey?

I think it's safe to do that.

@xvello (Contributor, Author) commented:

Thanks! I'll do it in a separate no-op PR, as making this PR pass tests might be complicated already, and I don't want to add noise

- await this.sessions[key]?.add(event)
+ await Promise.allSettled([
+     this.sessions[key]?.add(event),
+     this.overflowDetection?.observe(overflowKey, event.metadata.rawSize, event.metadata.timestamp),
@pauldambra (Member) commented:

You could void this promise instead of awaiting it, since it's probably OK for some calls to fail. But it should be fast enough that this is bike-shedding, and we'll see the impact if it isn't, so ignore this as nit-picking 🤣

@xvello (Contributor, Author) commented:

The only async part is the Redis ZADD when a key triggers for the first time; the detection itself does not yield, so I don't think voiding it would help throughput 🤔
I'd say let's roll out and re-evaluate if there's an impact. Overflow will be disabled at merge time and re-enabled when rolling out the new CAPTURE_CONFIG_REDIS_HOST.
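
To make the trade-off in this thread concrete, here is a small sketch contrasting the two options: awaiting both calls with `Promise.allSettled`, or voiding the overflow call. The functions `addToSession` and `observeOverflow` are hypothetical stand-ins for `this.sessions[key]?.add(event)` and `this.overflowDetection?.observe(...)`, and the simulated failure is only there to show the error handling.

```typescript
// Hypothetical stand-in for this.sessions[key]?.add(event).
async function addToSession(log: string[]): Promise<void> {
    log.push('session:add')
}

// Hypothetical stand-in for overflowDetection.observe(...); `fail` simulates
// a Redis error on the first trigger.
async function observeOverflow(log: string[], fail: boolean): Promise<void> {
    if (fail) {
        throw new Error('redis unavailable')
    }
    log.push('overflow:observe')
}

// Option taken in the PR: wait for both. allSettled never rejects, so a
// failure in overflow detection cannot abort session ingestion.
async function consumeAwaited(log: string[], fail: boolean): Promise<PromiseSettledResult<void>[]> {
    return Promise.allSettled([addToSession(log), observeOverflow(log, fail)])
}

// Alternative raised in review: fire and forget. The catch handler avoids an
// unhandled rejection; the caller no longer waits for the Redis round trip.
async function consumeVoided(log: string[], fail: boolean): Promise<void> {
    void observeOverflow(log, fail).catch(() => undefined)
    await addToSession(log)
}
```

Either way a failing overflow call leaves the session write intact. The difference is only whether the consumer waits for the occasional Redis call, which matches the point made above that the detection itself does not yield.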

@xvello xvello merged commit 4edcf1b into master Mar 27, 2024
130 checks passed
@xvello xvello deleted the xvello/replay-overflow-detection-2 branch March 27, 2024 13:47
2 participants