Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add app_metrics2 table #22859

Merged
merged 6 commits into from
Jul 16, 2024
Merged

chore: add app_metrics2 table #22859

merged 6 commits into from
Jul 16, 2024

Conversation

bretthoerner
Copy link
Contributor

Problem

Apps shouldn't be confined to things like integer plugin_config_ids, and app_metrics is sorted on that column. So we need a new table.

Changes

Add app_metrics2, which is sorted more like [log_entries], but keeps the app_metrics metric aggregations.

Also swap successes_after_retry which was unused for skipped which is a metric we wanted.

Note that we may want to add more metrics columns, like latency ones, but those should be easy to add later.

Also note, this PR doesn't change existing app_metrics queries. We'll have to roll this out and then migrate queries and insertion over for existing plugins.

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

Yes

How did you test this code?

Existing

@bretthoerner bretthoerner marked this pull request as ready for review June 10, 2024 20:07
@bretthoerner bretthoerner requested a review from fuziontech as a code owner June 10, 2024 20:07
@bretthoerner bretthoerner requested a review from a team June 10, 2024 20:07
posthog/models/app_metrics2/sql.py Outdated Show resolved Hide resolved
posthog/models/app_metrics2/sql.py Outdated Show resolved Hide resolved
@bretthoerner bretthoerner force-pushed the brett/app-metrics2 branch 4 times, most recently from d6a782a to 86b5398 Compare June 11, 2024 17:33
@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

Copy link
Member

@fuziontech fuziontech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Nice job!

posthog/models/app_metrics2/sql.py Show resolved Hide resolved
posthog/models/app_metrics2/sql.py Outdated Show resolved Hide resolved
@bretthoerner
Copy link
Contributor Author

@benjackwhite No rush from our end, but I think you were doing some kind of experiment to see if this schema would work for Hog apps? Just wanted to let you know this is ready whenever, or we can tweak the schema depending.

Comment on lines 31 to 34
successes SimpleAggregateFunction(sum, Int64),
skipped SimpleAggregateFunction(sum, Int64),
failures SimpleAggregateFunction(sum, Int64),
error_type String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if there is any strong reason why not to make this more generic?

From the work I'm doing so far there is a bunch of metrics that would be really useful to store here
success | failure | requestSuccess | requestFailure | timeout | filtered | skipped_due_to_disabled etc.

If we aggregate on the metric name it should still condense nicely I would imagine?

Suggested change
successes SimpleAggregateFunction(sum, Int64),
skipped SimpleAggregateFunction(sum, Int64),
failures SimpleAggregateFunction(sum, Int64),
error_type String
metric_name LowCardinality(String),
count SimpleAggregateFunction(sum, Int64)

Copy link
Contributor Author

@bretthoerner bretthoerner Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I can't think of a strong reason.

My best guess at the historical reasons for why it was this way in app_metrics (1) is that this table was probably created to support this page: https://us.posthog.com/project/2/pipeline/destinations/457/metrics

Your metric_name / count would be able to support the same page with minor query changes, with the relatively minor tradeoff that we'd be going a little "schema-less." It's hard to know if an arbitrary string is a success or a failure or what. But maybe the future of that page is we just break out each unique type + count and we're done? That gets a little weird for showing the total success / failure at the top, and only showing failures broken out at the bottom -- but there's nothing saying we have to do it the exact way we are today.

Maybe @tiina303 has opinions on whether it would cover existing use cases (maybe I'm missing something other than the page linked above)?

Copy link
Contributor

@tiina303 tiina303 Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a fine idea. The only worry I have is about us always needing to update the UI and alerting - specifically if some sort of a new failure category is added, e.g. spark graph for example only has success and failure, we can't display 7 lines there it would be too noisy.

A potential way to get the best of both worlds is to have top level success / failure - which can be used by alerting and sparkgraph and then subcategories, which can be used in metrics page to break down by (or in alerts to ignore 'skip' category for example or have different thresholds there).

something like this:

successes SimpleAggregateFunction(sum, Int64),
failures SimpleAggregateFunction(sum, Int64),
category LowCardinality(String),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 👍 to @benjackwhite's suggestion here.

metric_name LowCardinality(String),
count SimpleAggregateFunction(sum, Int64)

Is ideal and can represent almost everything that you want now and in the future ✔️

Copy link
Contributor Author

@bretthoerner bretthoerner Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed up one like @tiina303 suggested. Rather than a single count column, we force you to at least pick one of successes, skipped or failures -- which gives us some way to aggregate / understand counts without having to know what every metric_name "means."

    metric_name LowCardinality(String),
    successes SimpleAggregateFunction(sum, Int64),
    skipped SimpleAggregateFunction(sum, Int64),
    failures SimpleAggregateFunction(sum, Int64),

Does that seem reasonable @benjackwhite ? It seems like your success | failure | requestSuccess | requestFailure | timeout | filtered | skipped_due_to_disabled would all fit into there.

And I dropped error_type since a high level error would go in metric_name -- non-metric logs / debug info should go to the log table, and not in there, as discussed above. (That was an artifact from app_metrics (1) that we should remove.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked with Ben, went with metric_kind and count so this can be totally flexible.

i.e. kind could be error and name could be timeout or sql_error or whatever

@bretthoerner bretthoerner requested a review from a team as a code owner July 15, 2024 17:09
@bretthoerner bretthoerner merged commit f787067 into master Jul 16, 2024
84 checks passed
@bretthoerner bretthoerner deleted the brett/app-metrics2 branch July 16, 2024 17:21
silentninja pushed a commit to silentninja/posthog that referenced this pull request Aug 8, 2024
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants