Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(batch-exports): Couple of backfill bugfixes #25854

Merged
merged 5 commits into from
Oct 28, 2024

Conversation

tomasfarias
Copy link
Contributor

@tomasfarias tomasfarias commented Oct 28, 2024

Problem

A couple of bugs that need squashing:

  • We are not parsing heartbeat details correctly, seeing "Expected 4 but got 37", which means we are likely unpacking a full string.
  • We are not able to cancel backfills due to the Z suffix used by temporal instead of +00:00.

Changes

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

How did you test this code?

@tomasfarias tomasfarias requested a review from a team October 28, 2024 15:09
Comment on lines 295 to 296
if isinstance(end_at, str) and end_at.endswith("+00:00"):
end_at = end_at.replace("+00:00", "Z")
Copy link
Contributor Author

@tomasfarias tomasfarias Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python seems to prefer "+00:00" but Temporal schedules everything with "Z", it's quite annoying.

@@ -173,7 +173,7 @@ async def backfill_schedule(inputs: BackfillScheduleInputs) -> None:
if details:
# If we receive details from a previous run, it means we were restarted for some reason.
# Let's not double-backfill and instead wait for any outstanding runs.
last_activity_details = HeartbeatDetails(*details[0])
last_activity_details = HeartbeatDetails(*details)
Copy link
Contributor Author

@tomasfarias tomasfarias Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

details is already the tuple recorded by temporal. By doing details[0] we access the first item of the tuple (workflow_id), which is unpacked into a bunch of strings. This is, clearly, not what was intended. I think it's a remnant of the previous implementation that did things more manually instead of relying on Heartbeater.

@tomasfarias tomasfarias changed the title fix(batch-exports): Use Z suffix for UTC iso formatted timestamps fix(batch-exports): Couple of backfill bugfixes Oct 28, 2024
Comment on lines 292 to 296
end_at = self.end_at and self.end_at.isoformat()
start_at = self.start_at and self.start_at.isoformat()

if isinstance(end_at, str) and end_at.endswith("+00:00"):
end_at = end_at.replace("+00:00", "Z")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine, but just wondering if there's there a reason you can't just do

end_at = self.end_at and self.end_at.strftime("%Y-%m-%dT%H:%M:%SZ")

Copy link
Contributor Author

@tomasfarias tomasfarias Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually reverted this and copied the same formatting we use here: https://github.com/PostHog/posthog/blob/master/posthog/batch_exports/service.py#L475

Makes sense to have the service that starts the backfill and model be consistent. Although we could eventually update both. It's certainly annoying to remember to update both, so maybe some abstraction could be added.

Copy link
Contributor

@rossgray rossgray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just wondering if you could simplify the date formatting slightly

@tomasfarias tomasfarias merged commit a6abe61 into master Oct 28, 2024
89 checks passed
@tomasfarias tomasfarias deleted the fix/batch-export-backfills-problems branch October 28, 2024 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants