-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update API for 1.31 without partitionable model #36
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnbelamaric The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
// SharedCapacity defines the set of shared capacity consumable by | ||
// devices in this ResourceSlice. | ||
// CommonData represents information that is common across various devices, | ||
// to reduce redundancy in the devices list. | ||
// | ||
// Must not have more than 128 entries. | ||
// | ||
// +listType=atomic | ||
// +optional | ||
SharedCapacity []SharedCapacity `json:"sharedCapacity,omitempty"` | ||
CommonData []CommonDeviceData `json:"commonData,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't include this. If we are ripping out the model that supports partitioning, I think we should should just go back to an explicit list of attributes per device.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need an extension point to insert the things we are debating in #20. That's the primary reason that is here. Adding the common things was just a way to make it have some meaning and not be a stub. With this, an old scheduler an see that there is an entry in this list, but when it unmarshals it, all the fields the scheduler is aware of are nil, so it knows "ooo, I should ignore this slice, I don't know what to do with it".
If we don't have this, then we need some other way for an older scheduler to know that "it doesn't understand this object".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative would be to have a discriminator field, like SliceContent: "SimpleDevices"
or something. Then we would add new values to that so old schedulers know if they should handle the slice or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My current thinking is that we would not have any extension points directly in the device at this point. Instead every device will have the ability to point to one (and exactly one) DeviceShape/DeviceTemplate. That is what will have all of these extra bells and whistles. The device itself is still just a named list of attributes (and now capacities).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that works for me. The thing is, we still need a way for old schedulers to know "this slice is not something I understand". So a "SliceType" discriminator perhaps? Or we have to have a field like I have done here, that has a one-of.
// +optional | ||
SharedCapacity []SharedCapacity `json:"sharedCapacity,omitempty"` | ||
// +required | ||
SliceContent string `json:"sliceContent"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be an enum instead, @pohly ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the scheduler will decode this object, read this field, and then decide if it knows how to work with it or not? an enum seems appropriate then, since this should only be allowed to have a specific set of values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this?
If a current-rev scheduler only knows how to work with "devices", it uses the slice.devices
field to find those.
If a future API adds "gizmos" or some variant of "devices" (remember, I wanted to keep the current field named as "namedDevices"...), then it adds a slice.gizmos
field. A current-rev scheduler will see an empty slice and ignore it because it contains no devices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is the intent to add additional fields to the Device
struct? In that case such a SliceContent
discriminator would be useful because an unknown value tells old clients that there is something in the slice.devices
that they don't know about, but should know.
I just wouldn't call it SliceContent
, but rather DeviceType
: it's a description of the slice.devices
.
If we add slice.gizmos
(separate field), then they can have a slice.gizmoType
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We likely would add another field in addition to "Devices" to qualify or add onto the devices.
If we want to avoid the discriminator, I would avoid the term "Devices". But I don't think "NamedDevices" works because "named" isn't a differentiating feature. Maybe "SimpleDevices" or something?
If we keep the discriminatory, I do not think "DeviceType" is right - because it wouldn't make sense for "gizmos", for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having said that, I am okay with adding the discriminator. It may be redundant when adding a second slice with a different type, but for the case where the same type gets extended (https://github.com/kubernetes-sigs/wg-device-management/pull/36/files/3a7aa4f9789c7979426ea3c65497545136428f55#r1646662809) it is useful.
My question about validation after a downgrade applies also here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the bullet on "enumerated types" here, we should be OK as long as we document that schedulers that see unknown slice types should ignore those slices. I think older clients doing a write would be subject to the API validation of the API they are using, meaning they would not be able to use the new enum. I think if there is a field associated with the new enum value, we should clear it (the client doesn't know about the new enum, but we do). I think that's consistent with the logic described here, but we should discuss with api reviewers.
One question: Should (can?) we ensure all slices of a pool are of the same slice type? If not, old schedulers would have to ignore the whole pool, at least for "all", if ANY slice is of an unknown type...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would seem to me that adding an enum value should require a bump in the API version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we have reached v1, we cannot bump it any further?
I'm still not exactly clear on how to implement validation, but there must have been precedences, so I guess it is okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A discriminator is required when you want the absence of config to imply a default, so that version skew isn't perceived as the default case.
For cases that are truly one-of (exactly one field, empty lists don't count), then a discriminator is not needed, because any actor can perceive the lack of a value, and choose not to act. A discriminator can make for better error messages, but that's about all.
Whether we extend Device
to encompass partitions (assuming we can do that safely) or we add PartitionableDevice
here, it shouldn't matter -- both are viable. Likewise, whether we want devices/gizmos/widgets/partitionables to co-exist in a slice or to be mutually exclusive, I think both are viable. There may be other things to consider to make those decisions, but I don't think this aspect matters.
Ok, I simplifed it and now it's basically just a discriminator with a single valid value in 1.31, but more will come. |
Based on kubernetes-sigs/wg-device-management#30. Instead of a discriminator (kubernetes-sigs/wg-device-management#36 (comment)), a versioning scheme gets added.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
This attempts to update the API to:
This is rough, not exactly what would go into k/k. But I think it has what we need. It allows us to do common attrs and common capacity (common, not shared). It looks little like some of Kevin's earlier prototypes and a sort of simplified version of #31 and #35.
cc @pohly @klueska