Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[nexus] nicer OMDB status for instance_updater #6542

Merged
merged 6 commits into from
Sep 11, 2024
Merged

Conversation

hawkw
Copy link
Member

@hawkw hawkw commented Sep 7, 2024

Depends on #6541.

Similarly to #6541, this commit refactors the instance_updater
background task's OMDB status implementation to use a shared Rust struct
in nexus-types to represent the status JSON object. I also changed the
status message to include a full list of errors that occurred, instead
of just the most recent one and a count, and I made some tweaks to the
output alignment in OMDB.

While I was working on #6503, I became quite taken with the way @jmpesp
had done the OMDB status for his region-replacement background tasks ---
the approach of defining a struct in `nexus-types` representing the
status JSON that both the background task implementation and OMDB could
depend on seems much nicer than constructing an untyped JSON message and
then deserializing it to a type that only exists in OMDB, because this
way, OMDB and the background task implementation are kept in sync
automatically. So, I thought it would be nice to also rework the
`abandoned_vmm_reaper` task to use this style.

This commit does that. While I was making this change, I also changed
the status JSON to include a list of all errors that occurred during the
task's activation, instead of just the last one, which will probably
make it easier to debug any issues that come up. I also added some code
for aligning the numeric columns in the OMDB output.
Depends on #6541.

Similarly to #6541, this commit refactors the `instance_updater`
background task's OMDB status implementation to use a shared Rust struct
in `nexus-types` to represent the status JSON object. I also changed the
status message to include a full list of errors that occurred, instead
of just the most recent one and a count, and I made some tweaks to the
output alignment in OMDB.
@hawkw hawkw requested review from jmpesp and smklein September 7, 2024 17:46
Base automatically changed from eliza/omdb-nice to main September 9, 2024 17:12
instances with terminated active migrations: 0
update sagas started: 0
update sagas completed successfully: 0
task explicitly disabled by config!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little surprised to see this here - the Default policy should have disabled be false right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not disabled by default, but it is disabled explicitly in the test config file. Since this task is intended as a backstop to ensure update sagas are always run eventually, but it's not actually the primary mechanism that triggers update sagas, we disable it when running Nexus' integration tests, to make sure the other mechanisms for starting update sagas work in the happy path. See:

# The purpose of the `instance-updater` background task is to ensure that update
# sagas are always *eventually* started for instances whose database state has
# changed, even if the update saga was not started by the Nexus replica handling
# an update from sled-agent. This is to ensure that updates are performed even
# in cases where a Nexus crashes or otherwise disappears between when the
# updated VMM and migration state is written to CRDB and when the resulting
# update saga actually starts executing. However, we would prefer update sagas
# to be executed in a timely manner, so for integration tests, we don't want to
# *rely* on the instance-updater background task for running these sagas.
#
# Therefore, disable the background task during tests.
instance_updater.disable = true

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@hawkw hawkw force-pushed the eliza/updater-omdb-nice branch from 2316680 to 245f167 Compare September 11, 2024 20:56
@hawkw hawkw merged commit 437f841 into main Sep 11, 2024
16 checks passed
@hawkw hawkw deleted the eliza/updater-omdb-nice branch September 11, 2024 23:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants