Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAPT-1962: Sync batch database operations with BigQuery #3378

Merged
merged 2 commits into from
Nov 8, 2024

Conversation

slorek
Copy link
Contributor

@slorek slorek commented Nov 7, 2024

https://dfedigital.atlassian.net/browse/CAPT-1962

We use dfe-analytics to sync database operations with the data warehouse held in Google BigQuery.

The library uses model callbacks to enable this, so in some instances where we use insert_all/delete_all/update_all these operations are being missed.

This change should rectify all of these sync issues.

In some cases I felt it was better, on balance, to trigger the model callbacks and sacrifice a little performance. This is because syncing large tables such as claims or tasks can take a very significant amount of time in the background jobs.

In other instances it is less expensive to simply force a full import of the table, which I have done by invoking the supplied rake task rather than re-implementing it which risks becoming out of sync or incompatible with changes to the upstream library. Unfortunately there doesn't appear to be a convenient way to do this other than invoking the rake task.

Copy link
Contributor

@rjlynch rjlynch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I think the code changes look fine / safe

Not sure why those two tests are failing on CI though, I ran them on my machine and they passed 🙃

@@ -32,6 +32,8 @@ def run

target_data_model.insert_all(record_hashes) unless record_hashes.empty?
end

Rake::Task["dfe:analytics:import_entity"].invoke(target_data_model.table_name) if DfE::Analytics.enabled?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be tempted to wrap calling this rake task with our own model (DfEAnalytics.sync(SomeClass)) - we could then try stubbing the call to that class in the tests and potentially that would work on CI 🤷

@kenfodder
Copy link
Contributor

@slorek @rjlynch Spent about 45 mins on this - can't pin it down, although I sense something funky with the with_dfe_analytics_enabled context and the before block ordering, but a gut feeling where to dig deeper.

@slorek slorek force-pushed the dfe-analytics-syncing branch from 2277fc5 to 40e6a98 Compare November 8, 2024 10:44
@slorek slorek force-pushed the dfe-analytics-syncing branch from 40e6a98 to 493a8d4 Compare November 8, 2024 10:44
@slorek slorek merged commit 41a48f4 into master Nov 8, 2024
14 checks passed
@slorek slorek deleted the dfe-analytics-syncing branch November 8, 2024 10:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants