switch to procedural approach for delivery receipts #1486

Merged: 17 commits, Dec 23, 2024

Conversation

terrazoon
Contributor

@terrazoon terrazoon commented Dec 13, 2024

Description

The previous implementation of checking delivery receipts did not scale. It called CloudWatch filter_log_events() once for each phone number, which worked fine at low volumes. But at high volumes AWS started throttling us and we ended up in retry hell.

This new approach uses CloudWatch Logs Insights to grab all delivery receipts for all services in 10-minute time chunks. It runs every 8 minutes, so consecutive chunks intentionally overlap by 2 minutes to avoid losing any records due to edge cases or timing.
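To illustrate the timing, here is a minimal sketch of how a 10-minute lookback on an 8-minute schedule produces overlapping windows, and why receipts seen twice then need de-duplicating. The function names and the `message_id` key are hypothetical and not taken from this PR.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=10)    # how far back each run looks
INTERVAL = timedelta(minutes=8)   # how often the task runs


def window_for_run(run_time):
    """Return the (start, end) time range a run at run_time would query."""
    return run_time - WINDOW, run_time


def dedupe_receipts(receipts, seen_ids):
    """Drop receipts whose (hypothetical) message_id was already processed."""
    fresh = []
    for receipt in receipts:
        if receipt["message_id"] not in seen_ids:
            seen_ids.add(receipt["message_id"])
            fresh.append(receipt)
    return fresh


first_run = datetime(2024, 12, 13, 12, 0, tzinfo=timezone.utc)
second_run = first_run + INTERVAL

first_start, first_end = window_for_run(first_run)
second_start, second_end = window_for_run(second_run)

# The second window starts before the first one ends, so the two overlap.
overlap = first_end - second_start
print(overlap)  # 0:02:00
```

Because any receipt logged in the overlap is fetched by two consecutive runs, whatever consumes the results has to tolerate duplicates, either by filtering as above or by making the database update idempotent.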

Once we fetch a batch of receipts, we update the DB in batches so we don't just move the throttling from AWS to our own DB.
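A rough sketch of the batching idea, assuming the receipts arrive as a plain list. `apply_batch` is a hypothetical stand-in for whatever bulk update the DAO layer performs, and the batch size of 1000 is an arbitrary placeholder rather than a value from this PR.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def update_in_batches(receipts, apply_batch, batch_size=1000):
    """Call apply_batch once per chunk; return the number of batches sent."""
    batches = 0
    for batch in chunked(receipts, batch_size):
        apply_batch(batch)
        batches += 1
    return batches


# Usage with a stand-in for the real database call.
sent = []
count = update_in_batches(list(range(2500)), sent.append, batch_size=1000)
print(count, [len(b) for b in sent])  # 3 [1000, 1000, 500]
```

Keeping the batch size as a parameter makes it easy to experiment with larger batches, or a single batch, if the volume per run turns out to be only a few thousand records.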

Security Considerations

N/A

@terrazoon terrazoon marked this pull request as draft December 13, 2024 22:41
@terrazoon terrazoon marked this pull request as ready for review December 18, 2024 17:13
@terrazoon terrazoon changed the title switch to log insights for delivery receipts switch to procedural approach for delivery receipts Dec 18, 2024
@terrazoon terrazoon self-assigned this Dec 18, 2024
@terrazoon terrazoon linked an issue Dec 18, 2024 that may be closed by this pull request
@terrazoon terrazoon requested a review from a team December 19, 2024 15:57
Contributor

@ccostino ccostino left a comment

Thanks, @terrazoon!

I've gone through this and overall I think this will be a big improvement, even sticking with filter_log_events for now. 🎉

I flagged a few things that I want us to dig a bit deeper into, most notably the DB update statements, because I want to make sure those don't become bottlenecks and that we take advantage of larger batch sizes. We may not even have to chunk them like that at all if we're only talking a few thousand records at a time.

One last thing - I see a couple of test updates and lots of older tests removed; do we need any other new tests for the new code? I'm not sure I saw them.

Thanks again for diving into this, @terrazoon!

app/celery/provider_tasks.py (resolved)
app/celery/scheduled_tasks.py (resolved)
app/celery/scheduled_tasks.py (outdated, resolved)
app/clients/cloudwatch/aws_cloudwatch.py (outdated, resolved)
app/clients/cloudwatch/aws_cloudwatch.py (resolved)
app/dao/notifications_dao.py (resolved)
Contributor

@ccostino ccostino left a comment

Thanks, @terrazoon, let's see how this goes!

@ccostino ccostino merged commit d5e1008 into main Dec 23, 2024
7 checks passed
@ccostino ccostino deleted the notify-api-1465 branch December 23, 2024 20:56
Development

Successfully merging this pull request may close these issues.

Refactor delivery receipts