[wip] #2411
base: master
Conversation
Handle host-network pods as the default network. Don't return per-pod errors on startup. Remove nadController from UDNHostIsolationManager since we no longer use it to find a pod's UDN based on the NADs that exist in the namespace. Signed-off-by: Nadia Pinaeva <[email protected]>
Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
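A minimal sketch of the host-network handling described above, using a hypothetical helper; the actual UDNHostIsolationManager code differs:

```go
package udnhost

import (
	corev1 "k8s.io/api/core/v1"
)

// isDefaultNetworkPod is a hypothetical helper: host-network pods never
// attach to a UDN, so they are handled as default-network pods regardless
// of whether the namespace has a primary UDN.
func isDefaultNetworkPod(pod *corev1.Pod, namespaceHasPrimaryUDN bool) bool {
	if pod.Spec.HostNetwork {
		return true
	}
	return !namespaceHasPrimaryUDN
}
```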
udn host isolation: fix initialSync.
…face Signed-off-by: Martin Kennelly <[email protected]>
This code isnt being used anymore. We dont expect users to upgrade directly from code which contained the legacy LRPs, therefore its safe to remove. Signed-off-by: Martin Kennelly <[email protected]>
Signed-off-by: Martin Kennelly <[email protected]>
In the unlikely scenario where the service doesn't exist and there was an issue getting the current active network, the code should not use the service object for the returned error. Signed-off-by: Patryk Diak <[email protected]>
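A sketch of the error-handling pattern the fix implies; names are illustrative, not the PR's actual code:

```go
package services

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	listers "k8s.io/client-go/listers/core/v1"
)

// activeNetworkService illustrates the fix: if the Get fails, build the
// error from namespace/name, never from the (possibly nil) service object.
func activeNetworkService(lister listers.ServiceLister, namespace, name string) (*corev1.Service, error) {
	svc, err := lister.Services(namespace).Get(name)
	if err != nil {
		return nil, fmt.Errorf("failed to get service %s/%s while resolving its active network: %w", namespace, name, err)
	}
	return svc, nil
}
```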
Previously, if the services controller failed to start it would not be retried. Signed-off-by: Patryk Diak <[email protected]>
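A sketch of one way to retry a controller start until it succeeds, using the upstream wait helpers; the PR's actual retry mechanism may differ:

```go
package services

import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/klog/v2"
)

// startWithRetry keeps retrying start until it succeeds or ctx is cancelled,
// instead of giving up after the first failure.
func startWithRetry(ctx context.Context, start func() error) error {
	return wait.PollUntilContextCancel(ctx, 5*time.Second, true, func(ctx context.Context) (bool, error) {
		if err := start(); err != nil {
			klog.Errorf("Failed to start services controller, will retry: %v", err)
			return false, nil
		}
		return true, nil
	})
}
```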
L2 UDN: EgressIP hosted by primary interface (`breth0`)
If EncapIP is configured, it means it is different from the node's primary address. Do not update EncapIP when the node's primary address changes. Signed-off-by: Yun Zhou <[email protected]>
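In sketch form (field names illustrative), the rule reduces to preferring an explicitly configured encap IP over the node's primary address:

```go
package node

// effectiveEncapIP: an explicitly configured encap IP intentionally differs
// from the node's primary address, so node address changes must not
// overwrite it. Only an unset encap IP follows the primary address.
func effectiveEncapIP(configuredEncapIP, nodePrimaryIP string) string {
	if configuredEncapIP != "" {
		return configuredEncapIP
	}
	return nodePrimaryIP
}
```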
Assign network ID from the network manager running in cluster manager. The network ID is included in NetInfo and annotated on the NAD along with the network name. Network managers running in zone & node controllers will read the network ID from the annotation to set it on NetInfo. On startup, the network manager running in cluster manager will read the network IDs annotated on the nodes to cover the upgrade scenario. Network IDs will still be annotated on the nodes because this PR does not transition all the code to use the network ID from NetInfo instead of the node annotation. That will have to be done progressively. This has several benefits, among them:
- NetworkID is available sooner overall since we don't have to wait for all the nodes to be annotated
- No need to unmarshal the node annotation to get the network IDs, they are available in NetInfo
- No need to unmarshal the NAD to get the network name, it can be accessed directly from the annotation.
If a network is replaced with a different one with the same name, the network ID is reused; since the respective network controller will not start until the previous one is stopped and cleaned up, this shouldn't be a problem. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
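A sketch of how a zone/node network manager could read the annotated ID; the annotation key here is an assumption, not necessarily the one the PR uses:

```go
package networkmanager

import (
	"fmt"
	"strconv"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// networkIDAnnotation is an assumed key; the PR annotates the network ID on
// the NAD alongside the network name.
const networkIDAnnotation = "k8s.ovn.org/network-id"

// networkIDFromNAD reads the ID that the cluster manager annotated on the
// NAD, so zone/node managers don't have to wait for node annotations.
func networkIDFromNAD(nad metav1.Object) (int, error) {
	v, ok := nad.GetAnnotations()[networkIDAnnotation]
	if !ok {
		return 0, fmt.Errorf("NAD %s/%s has no network ID annotation", nad.GetNamespace(), nad.GetName())
	}
	return strconv.Atoi(v)
}
```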
Instead of considering as managed those VRFs that follow the mp<id>-udn-vrf naming template, use the table number: VRFs associated with a table within our reserved block of table numbers are managed by us. The block is currently anything higher than RoutingTableIDStart (1000). This allows managing VRFs with any name, which is desirable if the name is going to be exposed through BGP. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
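In sketch form, the new managed-VRF test reduces to a table-ID range check rather than a name match:

```go
package vrfmanager

// RoutingTableIDStart mirrors the reserved block described above: any VRF
// whose routing table ID falls in our reserved range is considered managed,
// regardless of its name.
const RoutingTableIDStart = 1000

// isManagedVRF replaces the old mp<id>-udn-vrf name-template check.
func isManagedVRF(tableID uint32) bool {
	return tableID >= RoutingTableIDStart
}
```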
Anticipating that these VRF names are going to be exposed through BGP, we should use friendlier names for our VRFs. The most natural name to use is the network name. Thus, giving a cluster UDN a name below 15 characters that matches an already existing VRF not managed by ovn-k will fail. This is considered an admin problem and not an ovn-k problem for now. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
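A sketch of the naming constraint: Linux limits interface names to 15 characters (IFNAMSIZ minus the terminator), which is presumably why the commit calls out names below 15 characters. The helper name is illustrative:

```go
package vrfmanager

// maxInterfaceNameLen is the Linux interface name limit (IFNAMSIZ - 1).
const maxInterfaceNameLen = 15

// vrfNameForNetwork uses the network name as the VRF name when it fits;
// longer names would need a fallback scheme not shown here.
func vrfNameForNetwork(networkName string) (string, bool) {
	if len(networkName) > maxInterfaceNameLen {
		return "", false
	}
	return networkName, true
}
```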
Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
Was causing deadlocks in unit tests Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
…heir subcontrollers Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
Assuming that there are three types of controllers (network agnostic, network aware, and network specific), we were already notifying network-specific controllers of network changes. But network-aware controllers, controllers for which we have a single instance capable of managing multiple networks, had no code path to be informed of network changes. This commit adds a code path for that and makes the RouteAdvertisements controller aware of network changes. Changed ClusterManager to be the controller manager for cluster manager instead of secondaryNetworkClusterManager. It just makes more sense that way since ClusterManager is the top-level manager. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
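A sketch of what a network-aware controller's notification hook could look like; the interface name is illustrative, not the PR's actual API:

```go
package controllermanager

// NetworkAwareController captures the missing code path described above:
// a single controller instance that manages multiple networks and must be
// told when a network is added, updated, or removed. Network-specific
// controllers, by contrast, are simply started/stopped per network.
type NetworkAwareController interface {
	// Reconcile is invoked by the controller manager on network changes,
	// e.g. so the RouteAdvertisements controller can re-evaluate state.
	Reconcile(networkName string) error
}
```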
Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
…twork exist test Signed-off-by: Or Mergi <[email protected]>
CUDN cleanup is inconsistent, as we see some flaky tests due to CUDN "already exists" errors, implying objects are not actually deleted. Wait for the CUDN object to be gone when deleted. Signed-off-by: Or Mergi <[email protected]>
CUDN is a cluster-scoped object; when tests run in parallel, random names avoid conflicts with other tests. Use a random metadata.name for CUDN objects. The "isolates overlapping CIDRs" tests create objects based on the 'red' and 'blue' variables, including CUDN objects. Change the tests' CUDN creation to use random names and update the given 'networkAttachmentConfigParams' with the newly generated name. Update the 'red' & 'blue' variables with the generated name, carried by 'networkAttachmentConfigParams' (netConfig.name). The pod2Egress tests assert on the CUDN object name given by 'userDefinedNetworkName'. In practice the tests' netConfigParam.name is userDefinedNetworkName. Change the assertion to check the given netConfigParam. Signed-off-by: Or Mergi <[email protected]>
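A sketch of both stability fixes under assumed helper signatures: random cluster-scoped names, and polling until the deleted CUDN is actually gone:

```go
package e2e

import (
	"context"
	"fmt"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/util/rand"
	"k8s.io/apimachinery/pkg/util/wait"
)

// randomCUDNName avoids cross-test name conflicts on the cluster-scoped CUDN.
func randomCUDNName() string {
	return fmt.Sprintf("test-cudn-%s", rand.String(5))
}

// waitForCUDNDeletion polls until fetching the CUDN returns NotFound; get is
// any function that fetches the CUDN by name (hypothetical signature).
func waitForCUDNDeletion(ctx context.Context, name string, get func(context.Context, string) error) error {
	return wait.PollUntilContextTimeout(ctx, time.Second, time.Minute, true,
		func(ctx context.Context) (bool, error) {
			if err := get(ctx, name); apierrors.IsNotFound(err) {
				return true, nil
			}
			return false, nil
		})
}
```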
Signed-off-by: nithyar <[email protected]>
Signed-off-by: nithyar <[email protected]>
e2e, CUDN: Improve stability
Reconcile RouteAdvertisements in cluster manager
Add missing enum validation for RouteAdvertisements
Compare annotations directly if possible. For network-specific map entries, only compare the raw JSON entries without parsing the map in full. Co-authored-by: Tim Rozet <[email protected]> Signed-off-by: Patryk Diak <[email protected]>
Instead of always parsing all node/join subnets, parse the raw JSON map and only compute the results for the affected network. Signed-off-by: Patryk Diak <[email protected]>
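A sketch of the raw-JSON comparison, assuming the annotation value is a JSON map keyed by network name:

```go
package annotations

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// rawEntryChanged decodes the per-network map only to raw JSON messages and
// byte-compares the single affected network's entry, instead of fully
// unmarshalling every network's subnets.
func rawEntryChanged(oldAnnotation, newAnnotation, network string) (bool, error) {
	var oldMap, newMap map[string]json.RawMessage
	if err := json.Unmarshal([]byte(oldAnnotation), &oldMap); err != nil {
		return false, fmt.Errorf("failed to parse old annotation: %w", err)
	}
	if err := json.Unmarshal([]byte(newAnnotation), &newMap); err != nil {
		return false, fmt.Errorf("failed to parse new annotation: %w", err)
	}
	return !bytes.Equal(oldMap[network], newMap[network]), nil
}
```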
Increases async performance of the informer cache by allowing it to always queue events instead of blocking while performing ADD/UPDATE/DELETE operations. Signed-off-by: Tim Rozet <[email protected]>
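A minimal sketch of the idea: a buffered channel lets the informer callback enqueue without waiting on the worker that processes the event:

```go
package informer

// eventQueue decouples the informer callback from event processing: the
// callback only appends to the buffer, and a separate worker drains it to
// perform the ADD/UPDATE/DELETE handling.
type eventQueue struct {
	events chan interface{}
}

func newEventQueue(size int) *eventQueue {
	return &eventQueue{events: make(chan interface{}, size)}
}

// enqueue does not block the informer callback while the buffer has room.
func (q *eventQueue) enqueue(obj interface{}) {
	q.events <- obj
}
```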
Add a pool of event handlers instead of a single (federated) event handler per informer. Ensure a controller always gets registered with the same event handler. Set the pool size to 201 (200 for secondary controllers and one, index 0, reserved for the default network). Always use the pool entry with index 0 for the default network controller. Signed-off-by: Dumitru Ceara <[email protected]>
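A sketch of a stable handler-pool mapping consistent with the numbers above; the actual assignment logic in the PR may differ:

```go
package informer

// Pool layout from the commit message: 201 handlers, index 0 reserved for
// the default network, the remaining 200 shared by secondary controllers.
const (
	handlerPoolSize     = 201
	defaultNetworkIndex = 0
)

// handlerIndex maps a network to a fixed pool slot so a controller always
// registers with the same event handler (networkID 0 = default network).
func handlerIndex(networkID int) int {
	if networkID == 0 {
		return defaultNetworkIndex
	}
	return 1 + (networkID-1)%(handlerPoolSize-1)
}
```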
Keep the initial add/sync queue small enough, though. This is needed to avoid contention during the initial processing when handlers are added. Signed-off-by: Dumitru Ceara <[email protected]>
k8s.ovn.org/user-defined-network is now required to be labeled on a namespace at namespace creation time in order to use a primary UDN. The following conditions hold (see the sketch below):
1. If the namespace is missing the label and a pod is created, it attaches to the default network.
2. If the namespace is missing the label and a primary UDN or CUDN is created that matches that namespace, the UDN/CUDN will report error status and the NAD will not be generated.
3. If the namespace is missing the label and a primary UDN/CUDN exists, a pod in the namespace will be created and attached to the default network.
4. If the namespace has the label and a primary UDN/CUDN does not exist, pod creation in the namespace will fail until the UDN/CUDN is created.
Also includes some fixes to unit tests that were brought to light by this PR. For example, the layer 2 multi-network tests were adding invalid annotations for node-subnets, etc. Signed-off-by: Tim Rozet <[email protected]>
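A sketch of the namespace gate behind conditions 1 and 3 above, assuming a simple label-presence check (the real controller logic is more involved):

```go
package udn

import corev1 "k8s.io/api/core/v1"

// primaryUDNLabel is the label from the commit message; it must be present
// on the namespace at creation time for its pods to use a primary UDN.
const primaryUDNLabel = "k8s.ovn.org/user-defined-network"

// namespaceOptedIntoUDN sketches the gate: without the label, pods attach
// to the default network even if a matching primary UDN/CUDN exists.
func namespaceOptedIntoUDN(ns *corev1.Namespace) bool {
	_, ok := ns.Labels[primaryUDNLabel]
	return ok
}
```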
Signed-off-by: Patryk Diak <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Was using IPv6 on an IPv4 cluster. Signed-off-by: Tim Rozet <[email protected]>
The unit tests run with race detection enabled and on constrained environments (e.g., default GitHub runners) and run out of resources when using such large event queues. This change doesn't affect e2e tests in any way. Those will use default event queue sizes (1K) in order to test what gets deployed on actual clusters. Signed-off-by: Dumitru Ceara <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
…w table. Signed-off-by: Dumitru Ceara <[email protected]>
Secondary network controllers should ignore resources that do not belong to the current network. Signed-off-by: Patryk Diak <[email protected]>
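A sketch of the per-network filter; the label key is an assumption for illustration, not the project's actual network-ownership check:

```go
package controller

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// networkLabel is a hypothetical key marking which network owns an object.
const networkLabel = "k8s.ovn.org/network"

// belongsToNetwork lets a secondary network controller skip any resource
// that is not owned by its own network.
func belongsToNetwork(obj metav1.Object, network string) bool {
	return obj.GetLabels()[networkLabel] == network
}
```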
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: kyrtapz. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/test images
@kyrtapz: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
When multinetwork policies support is disabled, WatchNamespaces should run to completion for primary UDNs. This was not happening because a primary UDN is also a secondary network. Signed-off-by: Patryk Diak <[email protected]>
Signed-off-by: Patryk Diak <[email protected]>
Signed-off-by: Dumitru Ceara <[email protected]>
Otherwise there's a data race because subnets can be allocated while we read the current usage/count. Signed-off-by: Dumitru Ceara <[email protected]>
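A sketch of the locking fix: the same mutex guards both allocation and the usage read, so counting cannot race with a concurrent allocation. The types are illustrative, not the actual subnet allocator:

```go
package subnetallocator

import "sync"

// allocator is a toy stand-in for the subnet allocator's bookkeeping.
type allocator struct {
	mu    sync.Mutex
	used  int
	total int
}

// allocate takes the lock before mutating the usage count.
func (a *allocator) allocate() bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.used >= a.total {
		return false
	}
	a.used++
	return true
}

// usage takes the same lock, so the read can't race with allocate.
func (a *allocator) usage() (used, total int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.used, a.total
}
```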
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
No description provided.