
feat(events tracking): add abstract class and logging implementation #80117

Merged
victoria-yining-huang merged 4 commits into master from vic/add_logging_module on Nov 20, 2024

Conversation

victoria-yining-huang (Contributor) commented Oct 31, 2024

[design doc](https://www.notion.so/sentry/Conversion-rate-of-ingest-transactions-to-save-trx-1298b10e4b5d801ab517c8e2218d13d5)

We need to track the completion of each stage to 1) compute event conversion rates and 2) gain debugging visibility into where events are being dropped.

Usage will be heavily sampled so as not to blow up traffic.

This PR only adds the REDIS_PUT stage; subsequent PRs will add the other stages listed in the EventStageStatus class.

**!!!!!IMPORTANT!!!!!!**
hash-based sampling
Here's a [blog post](https://www.rsyslog.com/doc/tutorials/hash_sampling.html) explaining hash-based sampling, which provides "all or nothing" logging for the sampled events across the entire pipeline. That's the idea I want to implement.

The hashing algorithm used must be consistent and uniformly distributed for all-or-nothing sampling to work.
I cannot find references stating that MD5 is consistent and evenly distributed other than various [Stack Overflow pages](https://crypto.stackexchange.com/questions/14967/distribution-for-a-subset-of-md5); the official sources are too academic and long for me to follow.
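
A minimal sketch of this hash-based sampling idea, assuming MD5 over the event ID. The names (should_track, track_sampled_event, TransactionStageStatus) mirror ones that appear elsewhere in this PR, but the enum values and the sample-rate plumbing below are illustrative, not the exact merged code:

```python
import hashlib
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class TransactionStageStatus(Enum):
    # Only the stage added in this PR; later PRs add the remaining stages.
    REDIS_PUT = "redis_put"


def should_track(event_id: str, sample_rate: float) -> bool:
    # Hash-based sampling: MD5(event_id) is deterministic, so every stage of
    # the pipeline makes the same keep/drop decision for a given event,
    # which gives "all or nothing" logging across the pipeline.
    if sample_rate <= 0:
        return False
    digest = hashlib.md5(event_id.encode("utf-8")).hexdigest()
    # Map the first 8 hex digits onto [0, 1) and compare against the rate.
    return int(digest[:8], 16) / 0x100000000 < sample_rate


def track_sampled_event(event_id: str, event_type: str, status: TransactionStageStatus) -> None:
    # The real code would read the sample rate from an option; it is hardcoded
    # here only to keep the sketch self-contained.
    if should_track(event_id, sample_rate=0.01):
        logger.info(
            "event_tracker.stage",
            extra={"event_id": event_id, "event_type": event_type, "status": status.value},
        )
```

A nice property of the threshold form is that raising the sample rate only adds events to the sampled set, so an event that was already being logged keeps being logged at every stage.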


for reviewers:
please review with an eye toward how this can be generalized to other pipelines as well, such as errors

github-actions bot added the Scope: Backend label (automatically applied to PRs that change backend components) on Oct 31, 2024
src/sentry/utils/event_tracker.py (outdated review comments, resolved)
codecov bot commented Nov 1, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/sentry/utils/event_tracker.py | 92.00% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #80117      +/-   ##
==========================================
- Coverage   78.48%   78.48%   -0.01%     
==========================================
  Files        7210     7207       -3     
  Lines      319532   319607      +75     
  Branches    43963    43989      +26     
==========================================
+ Hits       250797   250841      +44     
- Misses      62348    62371      +23     
- Partials     6387     6395       +8     

@@ -202,6 +203,10 @@ def process_event(
    else:
        with metrics.timer("ingest_consumer._store_event"):
            cache_key = processing_store.store(data)
            track_sampled_event(
                data["event_id"], data.get("type"), TransactionStageStatus.REDIS_PUT
            )
Member
I think you want the pipeline name, not data.get("type"), here. They are not always the same.

Contributor Author

I'm using data.get("type") to make it generalized, so when this is eventually extended for errors, it will work too. Do you have any concerns about this?

Contributor Author

hardcoded it to only take transactions for now

Member

Yes, you cannot use data.get("type") in errors - there are many different types going through the errors pipeline. The pipeline name should be the generalized thing.
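
As a hypothetical sketch of what this suggestion amounts to at the call site (the Pipeline enum below is illustrative and not part of this PR):

```python
from enum import Enum


class Pipeline(str, Enum):
    # Illustrative only: the errors pipeline carries many event types,
    # so the pipeline name, not data.get("type"), is the stable identifier.
    TRANSACTIONS = "transactions"
    ERRORS = "errors"


# Instead of inferring the pipeline from the payload:
#   track_sampled_event(data["event_id"], data.get("type"), TransactionStageStatus.REDIS_PUT)
# the call would name it explicitly:
#   track_sampled_event(data["event_id"], Pipeline.TRANSACTIONS, TransactionStageStatus.REDIS_PUT)
```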

src/sentry/options/defaults.py (outdated review comments, resolved)
src/sentry/utils/event_tracker.py (review comments, resolved)
victoria-yining-huang (Contributor Author) commented Nov 16, 2024

> The reason originally was that they were planned to be stored in Redis; smaller enums mean a higher sampling rate for the same memory usage. As we are not storing them in Redis, there is no reason for an int enum. Please turn them into strings.

@fpacifici wouldn't an int in logging still be cheaper than a string in Google logs? Or is that negligible?
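
For context, a hedged sketch of the two variants being debated here; neither is necessarily the exact definition that ended up being merged:

```python
from enum import Enum, IntEnum


class TransactionStageStatusInt(IntEnum):
    # Compact to store or log, but opaque when reading the logs.
    REDIS_PUT = 1


class TransactionStageStatusStr(str, Enum):
    # A few bytes larger per log line, but self-describing; the cheaper int
    # form mainly mattered while the values were planned to live in Redis.
    REDIS_PUT = "redis_put"
```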

add updated list of enums

add sampling

add redis put

add sampling logic

add extra

remove class

add .value

change enum value to int

use IntEnum

add test first pass

add hash sampling

update status enum

docstring

comment

add wip

wip

tests pass

change to should_track

add TransactionStageStatus

return if rate 0

add unit test

remove old test

add comments

add TODO

add event type

use options automator

update comment

option to 0

only use options

override in tests another way

victoria-yining-huang merged commit e5c6492 into master Nov 20, 2024
50 checks passed
victoria-yining-huang deleted the vic/add_logging_module branch November 20, 2024 20:50
Comment on lines +32 to +33
if __name__ == "__main__":
    unittest.main()
Member

why was this needed?

harshithadurai pushed a commit that referenced this pull request Nov 25, 2024
evanh pushed a commit that referenced this pull request Nov 25, 2024
Labels
Scope: Backend (automatically applied to PRs that change backend components), Scope: Frontend (automatically applied to PRs that change frontend components)
7 participants