Propolis zone cleanup could happen outside of InstanceRunner
#5237
Labels: Sled Agent (Related to the Per-Sled Configuration and Management), virtualization (Propolis Integration & VM Management)
Every instance managed by a sled agent has a "runner" task and a "monitor" task. All requests to do anything with the instance (change its state, forcibly remove it from the sled agent, etc.) need to execute on the runner task, which handles these requests sequentially. This includes the processing of state change messages from Propolis, which are sent from the monitor task to the runner task and processed here:
omicron/sled-agent/src/instance.rs, lines 374 to 389 in 8697f39
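To make the shape of that loop concrete, here is a minimal sketch of the request/monitor plumbing, assuming tokio channels; the type and method names are illustrative stand-ins, not the actual sled-agent definitions.

```rust
// Illustrative sketch only: a runner task that handles API requests and
// Propolis observations on one sequential loop. Names are hypothetical.
use tokio::sync::{mpsc, oneshot};

enum Request {
    PutState { target: String, tx: oneshot::Sender<Result<(), String>> },
    Terminate { tx: oneshot::Sender<()> },
}

// A state change reported by the monitor task (e.g. "Running", "Destroyed").
struct ObservedState(String);

async fn runner_task(
    mut requests: mpsc::Receiver<Request>,
    mut monitor: mpsc::Receiver<ObservedState>,
) {
    loop {
        tokio::select! {
            // Requests from the sled agent API queue here and are handled
            // one at a time.
            Some(req) = requests.recv() => match req {
                Request::PutState { target, tx } => {
                    // ...ask Propolis to move toward `target`...
                    let _ = (target, tx.send(Ok(())));
                }
                Request::Terminate { tx } => {
                    // ...tear down the zone...
                    let _ = tx.send(());
                    break;
                }
            },
            // State changes from the monitor task are handled on the same
            // loop, so a slow observation blocks every queued request.
            Some(ObservedState(state)) = monitor.recv() => {
                // ...observe the new state, possibly terminating...
                let _ = state;
            }
            else => break,
        }
    }
}
```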
If Propolis indicated that it shut down, InstanceRunner::observe_state will call InstanceRunner::terminate before returning and allowing new state to be published to Nexus:

omicron/sled-agent/src/instance.rs, lines 600 to 615 in 8697f39
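In rough pseudocode, the ordering described above looks like the following sketch; the types and method bodies here are placeholders, not the real code in instance.rs.

```rust
// Rough sketch of the ordering described above; stand-in types only.
enum Reaction { Continue, Terminate }

struct ObservedPropolisState { vm_is_gone: bool }

struct Runner;

impl Runner {
    async fn observe_state(&mut self, observed: &ObservedPropolisState) -> Reaction {
        let reaction = if observed.vm_is_gone {
            // The runner is blocked here: bundle collection, zone teardown,
            // and deregistration all complete before this returns.
            self.terminate().await;
            Reaction::Terminate
        } else {
            Reaction::Continue
        };
        // Only after that does the new VMM state go out to Nexus.
        self.publish_state_to_nexus().await;
        reaction
    }

    async fn terminate(&mut self) { /* see the teardown sketch below */ }
    async fn publish_state_to_nexus(&mut self) { /* notify Nexus */ }
}
```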
Terminating an instance this way collects a zone bundle from the runner task before removing the instance from the sled agent's InstanceManager and tearing down the zone:

omicron/sled-agent/src/instance.rs, lines 759 to 793 in 8697f39
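The order of operations in that teardown, as described later in this issue (zone bundle collection -> zone teardown -> deregister instance), can be sketched roughly like this; the function names are hypothetical, and the point is only that it all runs on the runner task.

```rust
// Hypothetical sketch of the teardown sequence; not the actual omicron code.
struct Runner;

impl Runner {
    async fn terminate(&mut self) {
        // 1. Collect the zone bundle: copy and compress log files and
        //    command output. Potentially expensive.
        self.collect_zone_bundle().await;

        // 2. Halt and remove the Propolis zone.
        self.tear_down_zone().await;

        // 3. Remove the instance from the InstanceManager's table.
        self.deregister_instance().await;
    }

    async fn collect_zone_bundle(&mut self) {}
    async fn tear_down_zone(&mut self) {}
    async fn deregister_instance(&mut self) {}
}
```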
Creating a zone bundle may be an expensive operation, since it (potentially) has to copy and compress many different log files and command outputs. This causes a couple of problems:
- InstanceRunner is completely blocked while all this work is going on. This means that sled agent API calls targeting an instance in this state are highly likely to time out waiting for the runner to respond.
- If the caller is Nexus and the request is to change the instance's state, this can cause instances to be marked as Failed (due to the error conversion rules described in #3238, "sled agent and Nexus frequently flatten errors into 500 Internal Server Error") even though they would correctly go to Stopped if left alone.

The reason I made these operations happen in this order (zone bundle collection -> zone teardown -> deregister instance -> publish to Nexus) was to try to mitigate #3325. I suspect, though, that that issue is not as much of a problem now that every newly-started instance gets a fresh Propolis ID (not the case prior to #4194), such that every incarnation of an instance on a sled will get a distinct zone name. If that's so, then it should be possible to mitigate these problems by changing what happens on VMM shutdown: the runner can remove the instance from the table, publish the new VMM state to Nexus, and then hand the defunct zone off to some other task to be cleaned up.[^1]
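A minimal sketch of that proposed shutdown path follows, assuming a tokio runtime and a simple map standing in for the instance table; every name here is hypothetical rather than an actual sled-agent API.

```rust
// Hypothetical sketch of the proposed ordering: deregister, publish, then
// clean up the defunct zone on a separate task.
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Mutex;

struct DefunctZone { zone_name: String }

impl DefunctZone {
    async fn collect_bundle_and_destroy(self) {
        // The expensive work (zone bundle, zone halt/uninstall) now happens
        // here, off the runner task.
        let _ = self.zone_name;
    }
}

async fn on_vmm_shutdown(
    instances: Arc<Mutex<HashMap<String, ()>>>, // stand-in for the manager's table
    propolis_id: String,
    zone: DefunctZone,
) {
    // 1. Remove the instance from the table right away, so new callers get
    //    a clean "instance gone" answer instead of timing out.
    instances.lock().await.remove(&propolis_id);

    // 2. Publish the final VMM state to Nexus immediately.
    // publish_state_to_nexus(&propolis_id).await;

    // 3. Hand the defunct zone to a background task. Because each Propolis
    //    incarnation has a fresh ID (and thus a distinct zone name), a new
    //    incarnation can start without waiting for this cleanup.
    tokio::spawn(zone.collect_bundle_and_destroy());
}
```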
Footnotes

[^1]: Cleaning up the defunct zone outside of the InstanceRunner task allows that task to return immediately and produce a "runner task closed" error message for anyone who happened to have requested something of the runner while it was in the "observe state change" arm of its request handler. If zone cleanup happens on the runner, Nexus's instance state will still update right away, but calls that land in this window are much more likely to time out than to get a clean "instance gone" error.