PoC: BIOS version & settings management #138
Signed-off-by: Artem Bortnikov <[email protected]>
func (s *ServerHTTP) registerRoutes() {
	s.mux.HandleFunc("/scan", s.scanHandler)
	s.mux.HandleFunc("/settings-apply", s.settingsApplyHandler)
	s.mux.HandleFunc("/version-update", s.versionUpdateHandler)
idea: would be great to have an endpoint to get all current active tasks from the sync.Map
@stefanhipfel thanks for the idea.
Totally agree. We might also need to implement queues for requests, rate limiting, client-side retries and tons of other stuff. Apart from that, I'd also like to generate the HTTP API from a spec instead of hardcoding endpoints.
But for now, I think the main goal is to agree on whether we proceed further with this design or not.
// if referred server is not in Available state - stop reconciliation
if server.Status.State != metalv1alpha1.ServerStateAvailable {
	return ctrl.Result{RequeueAfter: r.RequeueInterval}, nil
}
do we need to stop the server reconciliation in the meantime?
e.g.: do not allow serverClaim
No, we do not need to stop the server's reconciliation; it does not matter what state the server is in. We do need to stop serverBIOS reconciliation, though, because both a BIOS version update and a settings update lead to a server reboot. So for now we decided to work only with servers that are in the "Available" state.
If the server is in the "Available" state, we will then need to set it to "Maintenance". But the server maintenance topic is still open, so I decided not to mention it at all for now.
ok but during a serverBios update, someone could still claim the server. Any available server can be claimed
> ok but during a serverBios update, someone could still claim the server. Any available server can be claimed
@stefanhipfel that's a good point. However, I do not really like the idea of the serverBIOS controller changing the server's state to exclude it from reconciliation. For a number of reasons, the server's "owner" might want to postpone the update, especially when it comes to a version upgrade.
I could add a "Maintenance" state for the server and update the controller so that it checks whether the server's state is "Maintenance" instead of "Available".
@aobort yes, I also think that the serverBIOS should not change a server's state.
Alternative idea: the server reconciler checks the serverBIOS state.
overall I think the POC looks ok
type ServerBIOSStatus struct {
	// BIOS contains a bios version and settings.
	// +optional
	BIOS BIOSSettings `json:"bios,omitempty"`
we could add the current state of the bios task
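One possible shape for that, as a rough sketch only: the `BIOSTaskState` type, its values, the `TaskState` field and the `taskRunning` helper are hypothetical names, and the fields of `BIOSSettings` are assumed because the PoC's definition is not part of this excerpt.

```go
package main

import "fmt"

// BIOSSettings is a stand-in for the PoC type of the same name;
// its fields are assumed for this sketch.
type BIOSSettings struct {
	Version  string            `json:"version,omitempty"`
	Settings map[string]string `json:"settings,omitempty"`
}

// BIOSTaskState is a hypothetical enum describing the BIOS task's progress.
type BIOSTaskState string

const (
	BIOSTaskStateIdle       BIOSTaskState = "Idle"
	BIOSTaskStateInProgress BIOSTaskState = "InProgress"
	BIOSTaskStateCompleted  BIOSTaskState = "Completed"
	BIOSTaskStateFailed     BIOSTaskState = "Failed"
)

// ServerBIOSStatus mirrors the struct from the diff, extended with a
// hypothetical TaskState field.
type ServerBIOSStatus struct {
	// BIOS contains a bios version and settings.
	// +optional
	BIOS BIOSSettings `json:"bios,omitempty"`

	// TaskState reflects the state of the most recent BIOS task (assumed field).
	// +optional
	TaskState BIOSTaskState `json:"taskState,omitempty"`
}

// taskRunning reports whether a BIOS task is currently in progress.
func (s ServerBIOSStatus) taskRunning() bool {
	return s.TaskState == BIOSTaskStateInProgress
}

func main() {
	st := ServerBIOSStatus{
		BIOS:      BIOSSettings{Version: "1.0.0"},
		TaskState: BIOSTaskStateInProgress,
	}
	fmt.Println("task running:", st.taskRunning())
}
```

A field like this could also give other controllers a way to see that a BIOS task is in flight, which relates to the claim concern raised earlier in the review.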
@aobort overall I think the POC looks good. Next step would be to gather some improvements and test it with real examples!
Proposed Changes
This PR contains a PoC implementation for managing a server's BIOS version and settings:
This approach fully separates the concrete job implementation from the reconciliation flow.
To discuss
BootOrder
Do we need to move the boot order field from the server CR to the serverBIOS CR? Semantically, yes, as it is one of the BIOS settings.
BIOS settings in .status
From my perspective, it is reasonable to reflect in .status.bios.settings only those settings which are set in .spec.bios.settings. This will make comparing .spec and .status much easier.
Storing BIOS settings
Do we need a custom type for a BIOS setting:
to avoid attempting to apply settings which are not supported in the specified BIOS version? I.e.:
upgrading from version 1.0.0 to 2.0.0 makes the BIOS setting responsible for legacy boot deprecated/unsupported, so it is explicitly marked as unsupported and is not considered when settings are applied.
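To make the question more concrete, here is a minimal sketch of how such a marker could be used when applying settings. It is not the type proposed in this PR: `BIOSSetting`, its `Unsupported` field, `applicableSettings` and the setting names `LegacyBoot` and `SecureBoot` are all hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// BIOSSetting is a hypothetical custom type: a value plus a marker telling
// whether the setting is unsupported in the target BIOS version.
type BIOSSetting struct {
	Value       string
	Unsupported bool
}

// applicableSettings returns only the settings that should be applied,
// skipping everything explicitly marked as unsupported.
func applicableSettings(settings map[string]BIOSSetting) map[string]string {
	out := make(map[string]string, len(settings))
	for name, s := range settings {
		if s.Unsupported {
			continue
		}
		out[name] = s.Value
	}
	return out
}

func main() {
	// After an upgrade, the legacy boot setting is marked as unsupported.
	settings := map[string]BIOSSetting{
		"LegacyBoot": {Value: "Enabled", Unsupported: true},
		"SecureBoot": {Value: "Enabled"},
	}

	toApply := applicableSettings(settings)

	// Sort names so the printed order is stable.
	names := make([]string, 0, len(toApply))
	for name := range toApply {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s=%s\n", name, toApply[name])
	}
}
```

Whether the marker lives in the spec, in the status, or is derived from the BIOS version is exactly the open design question; the sketch only shows the filtering step.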