Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Begin using typed instance and Propolis UUIDs in sled agent and Nexus #5896

Merged
merged 3 commits into from
Jun 17, 2024

Conversation

gjcolombo
Copy link
Contributor

One goal of the ongoing Great Instance State Rework of 2024 is to clarify that, in broad strokes, Nexus manages instances and sled agents manage VMMs. Nexus decides whether and when an instance is connected to a VMM and how to convert VMM states to instance states; sled agents manage individual Propolis processes and send their state changes to Nexus so that it has the relevant VMM states available.

Today, sled agent's instance manager tracks VMMs in a map with type BTreeMap<Uuid, (Uuid, Instance)>, where the keys are instance IDs and the values contain Propolis IDs. In our proposed brave new world, sled agent can use Propolis IDs as keys and may not need instance IDs at all--i.e., its map type should be BTreeMap<PropolisUuid, Vmm>, where the Vmm may or may not contain an InstanceUuid.

To support this future change, change sled agent's instance map to a BTreeMap<InstanceUuid, (PropolisUuid, Instance)> and work through all the compiler-error fallout. This causes the difference between instance and Propolis UUIDs to filter all the way up through Nexus's instance management routines and sagas. This should make it much easier to refactor sled agent's APIs to take Propolis IDs, since the compiler can then help catch instance and Propolis ID transpositions.

Nexus's instance management code often deals with instance IDs, Propolis IDs, and sled IDs together, so also start using SledUuid in some of these paths.

This change draws an arbitrary line at using typed UUIDs in database model types and in the generated resource objects in the authz layer; calls into/out of these layers use GenericUuid and from_untyped_uuid/into_untyped_uuid as needed.(This is not because these changes are necessarily impossible, merely because the line had to go somewhere.)

Tests: cargo nextest; while this change is large, it should by and large be a rote refactoring exercise.

@gjcolombo gjcolombo requested a review from hawkw June 13, 2024 19:06
@@ -57,6 +57,7 @@ impl_typed_uuid_kind! {
LoopbackAddress => "loopback_address",
OmicronZone => "service",
PhysicalDisk => "physical_disk",
Propolis => "propolis",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some places, we refer to the UUID of a Propolis process as a "Propolis ID", elsewhere, we call it a "VMM ID". Perhaps we should commit to being more consistent with naming and always refer to them as one or the other (at least, when it doesn't require a database migration...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general rule I've tried to apply is to use "Propolis ID" to refer to these UUIDs, except when working directly with DB queries where we're working with the primary key to the VMM table.

(When I created the VMM table in #4194, propolis_id was a much more common field in slog messages that contained a Propolis/VMM ID, so when it came time to choose an option it won. IMO the structured logging field name should carry a lot of weight--it should be easy to filter for messages concerning a single VMM without having to write propolis_id = "foo" || vmm_id = "foo" to get all the messages.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that makes sense. I may have to rename some vmm_id fields in logs I've added...

As part of the ongoing Instance Lifecycle Reimagination Exercise of
2024, we'd like to refactor sled agent's instance manager so that sled
agent thinks purely in terms of the VMMs it's managing and not in terms
of the instances that those VMMs incarnate. To make this less
challenging, add a Propolis UUID type, then rework a bunch of code to
distinguish properly between instance IDs and Propolis IDs, both in sled
agent and in Nexus.

In general the change stops at the database model boundary; that is, I
didn't try to make any of the DB model types use typed UUIDs (though I'm
curious to see if one could). Beyond that there are admittedly few rules
to describe what changed--I started with a couple of types in sled agent
and let the refactoring katamari roll from there.
@gjcolombo gjcolombo force-pushed the gjcolombo/instance-propolis-uuids branch from 37de26e to bac4652 Compare June 17, 2024 17:26
@gjcolombo gjcolombo merged commit 50e353f into main Jun 17, 2024
20 checks passed
@gjcolombo gjcolombo deleted the gjcolombo/instance-propolis-uuids branch June 17, 2024 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants