Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(s3-batch-exports): Swap to asyncio s3 client #17673

Merged
merged 2 commits into from
Sep 29, 2023

Conversation

tomasfarias
Copy link
Contributor

@tomasfarias tomasfarias commented Sep 28, 2023

Problem

We are still not heartbeatting even after #17642 and #17670. I believe this is because the heartbeat is spawning an async task, but since our code is not doing any await-ing, we never yield the main thread for it to run the async task.

By swapping to aioboto3 we now await on each upload part which should allow us to heartbeat while the part is uploaded.

Changes

aioboto3 is mostly compatible with the boto3 interface with 1 big difference: client must be initialized as a context manager. Besides that, I mostly added async/await in every call.

In order for aioboto3 to be compatible with our existing boto3 dependency, I've had to bump it slightly to 1.26.76 from 1.26.66. We are taking steps to split up the temporal bits of code to avoid conflicting with the django monolith. In the meantime, I hope this bump won't cause issues. Reviewing the changelog reveals no significant changes between versions.

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Next steps

Add heartbeating to every destination
Use asyncio (aiofiles) for BatchExportTemporaryFile.

How did you test this code?

Updated unit tests to use async code, which is just more of the same: Use context manager, and add async/await everywhere.

Temporal activities are not heartbeating. I believe this is because the heartbeat
is spawning an async task, but since our code is not doing any await, we never
yield the main thread for it to run the async task.

By swapping to aioboto3 we now await on each upload part which should allow us to
heartbeat while the part is uploaded.
@tomasfarias tomasfarias force-pushed the fix/attempt-heartbeating-via-async-function branch from 16e5434 to df377a0 Compare September 28, 2023 21:41
@tomasfarias tomasfarias changed the title fix: Attempt heartbeating via async function refactor(s3-batch-exports): Swap to asyncio s3 client Sep 28, 2023
@tomasfarias tomasfarias merged commit 5e192a0 into master Sep 29, 2023
@tomasfarias tomasfarias deleted the fix/attempt-heartbeating-via-async-function branch September 29, 2023 08:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant