[WIP] Test udn node scale #2364

Open: trozet wants to merge 20 commits into master

@trozet (Contributor) commented Nov 22, 2024

📑 Description

Fixes #

Additional Information for reviewers

✅ Checks

  • My code requires changes to the documentation
  • if so, I have updated the documentation as required
  • My code requires tests
  • if so, I have added and/or updated the tests as required
  • All the tests have passed in the CI

How to verify it

Routes via mp0 were being deleted on every ovnkube-node restart:
[root@ovn-worker ~]# ip monitor route
Deleted 192.72.3.0/24 dev ovn-k8s-mp0 proto kernel scope link src 192.72.3.2
Deleted broadcast 192.72.3.255 dev ovn-k8s-mp0 table local proto kernel scope link src 192.72.3.2
Deleted local 192.72.3.2 dev ovn-k8s-mp0 table local proto kernel scope host src 192.72.3.2
local 192.72.3.2 dev ovn-k8s-mp0 table local proto kernel scope host src 192.72.3.2
broadcast 192.72.3.255 dev ovn-k8s-mp0 table local proto kernel scope link src 192.72.3.2

This causes a traffic outage during upgrades, as well as other unwanted
side effects when pod-destined traffic is instead routed via the host's
default gateway route. This is especially disruptive in local gateway mode.

This patch removes the teardown and makes the synchronization of
addresses and routes more robust, so that changes to the MTU or to the
mp0 addresses are handled safely.

Signed-off-by: Tim Rozet <[email protected]>
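
The "sync instead of teardown" idea in the commit message above can be illustrated with a minimal Go sketch. It is not the code from this PR: the `route` type, the `diffRoutes` helper, and the `10.96.0.0/16` entry are hypothetical and exist only for illustration. The point is that routes already present and still wanted are never touched, so a restart does not delete and re-add them.

```go
package main

import "fmt"

// route is a simplified, hypothetical stand-in for a kernel route entry.
type route struct {
	Dst string // destination CIDR, e.g. "192.72.3.0/24"
	Dev string // output interface, e.g. "ovn-k8s-mp0"
	Src string // preferred source address
}

// diffRoutes returns the routes that must be added and the stale routes that
// must be deleted so that existing converges on desired. Routes present in
// both sets are left alone, which avoids the delete/re-add cycle on restart.
func diffRoutes(existing, desired []route) (toAdd, toDelete []route) {
	want := make(map[route]bool, len(desired))
	for _, r := range desired {
		want[r] = true
	}
	have := make(map[route]bool, len(existing))
	for _, r := range existing {
		have[r] = true
		if !want[r] {
			toDelete = append(toDelete, r)
		}
	}
	for _, r := range desired {
		if !have[r] {
			toAdd = append(toAdd, r)
		}
	}
	return toAdd, toDelete
}

func main() {
	existing := []route{{"192.72.3.0/24", "ovn-k8s-mp0", "192.72.3.2"}}
	desired := []route{
		{"192.72.3.0/24", "ovn-k8s-mp0", "192.72.3.2"}, // already installed, untouched
		{"10.96.0.0/16", "ovn-k8s-mp0", "192.72.3.2"},  // illustrative extra route
	}
	toAdd, toDelete := diffRoutes(existing, desired)
	fmt.Println("add:", len(toAdd), "delete:", len(toDelete))
}
```

In a real node process the two returned slices would be handed to whatever netlink wrapper the project uses; that part is deliberately left out here.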
Fixes unexpected mp0 route removal during startup
Next, we need to:
 - add the mgmt port MAC address
 - update L2 secondary
 - update the node tracker

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
- Adds locking to protect node/UDNNode syncmap updates
- Adds UDNNode client support to the node controller for updating the MAC address
- Runs make codegen

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
We still need a workqueue and retry for nodetracker, and we need to
abstract it into a singleton. Re-adding all nodes at startup is wasted
time every time we create a new UDN.

Signed-off-by: Tim Rozet <[email protected]>
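
The singleton part of the note above could look roughly like the following Go sketch, which uses `sync.Once` so the expensive node listing runs once no matter how many UDNs ask for the tracker. This is only a sketch under assumptions: `nodeTracker`, `sharedNodeTracker`, `loadNodes`, and the node names other than `ovn-worker` are hypothetical, and the workqueue and retry pieces are not covered.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeTracker is a hypothetical shared tracker holding the known nodes.
type nodeTracker struct {
	nodes []string
}

var (
	trackerOnce sync.Once
	tracker     *nodeTracker
	loadCount   int // how many times the expensive initial load ran
)

// loadNodes stands in for the costly initial listing of every node.
func loadNodes() []string {
	loadCount++
	return []string{"ovn-control-plane", "ovn-worker", "ovn-worker2"}
}

// sharedNodeTracker returns the process-wide tracker, building it on first use only.
func sharedNodeTracker() *nodeTracker {
	trackerOnce.Do(func() {
		tracker = &nodeTracker{nodes: loadNodes()}
	})
	return tracker
}

func main() {
	// Each new UDN reuses the same tracker instead of re-adding all nodes.
	for _, udn := range []string{"udn-a", "udn-b", "udn-c"} {
		t := sharedNodeTracker()
		fmt.Printf("%s sees %d nodes\n", udn, len(t.nodes))
	}
	fmt.Println("initial loads:", loadCount)
}
```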
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 22, 2024
@openshift-ci openshift-ci bot requested review from jcaamano and tssurya November 22, 2024 19:34
openshift-ci bot (Contributor) commented Nov 22, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 22, 2024
@mohit-sheth (Member) commented:

/test

openshift-ci bot (Contributor) commented Nov 22, 2024

@mohit-sheth: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

/test 4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade
/test 4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade
/test 4.18-upgrade-from-stable-4.17-images
/test e2e-aws-ovn
/test e2e-aws-ovn-hypershift
/test e2e-aws-ovn-local-gateway
/test e2e-aws-ovn-local-to-shared-gateway-mode-migration
/test e2e-aws-ovn-serial
/test e2e-aws-ovn-shared-to-local-gateway-mode-migration
/test e2e-aws-ovn-upgrade
/test e2e-aws-ovn-upgrade-local-gateway
/test e2e-aws-ovn-windows
/test e2e-azure-ovn-upgrade
/test e2e-gcp-ovn
/test e2e-gcp-ovn-techpreview
/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-ipv6
/test gofmt
/test images
/test lint
/test unit

The following commands are available to trigger optional jobs:

/test e2e-agent-compact-ipv4
/test e2e-aws-ovn-clusternetwork-cidr-expansion
/test e2e-aws-ovn-fdp-qe
/test e2e-aws-ovn-hypershift-conformance-techpreview
/test e2e-aws-ovn-kubevirt
/test e2e-aws-ovn-single-node-techpreview
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-virt-techpreview
/test e2e-azure-ovn
/test e2e-azure-ovn-techpreview
/test e2e-metal-ipi-ovn-dualstack-local-gateway
/test e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview
/test e2e-metal-ipi-ovn-dualstack-techpreview
/test e2e-metal-ipi-ovn-ipv4
/test e2e-metal-ipi-ovn-ipv6-techpreview
/test e2e-metal-ipi-ovn-techpreview
/test e2e-openstack-ovn
/test e2e-ovn-hybrid-step-registry
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-techpreview
/test e2e-vsphere-windows
/test okd-scos-e2e-aws-ovn
/test okd-scos-images
/test openshift-e2e-gcp-ovn-techpreview-upgrade
/test qe-perfscale-aws-ovn-medium-cluster-density
/test qe-perfscale-aws-ovn-medium-node-density-cni
/test qe-perfscale-aws-ovn-small-cluster-density
/test qe-perfscale-aws-ovn-small-node-density-cni
/test qe-perfscale-aws-ovn-small-udn-density-l2
/test qe-perfscale-aws-ovn-small-udn-density-l3
/test security

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-ovn-kubernetes-master-4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade
pull-ci-openshift-ovn-kubernetes-master-4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade
pull-ci-openshift-ovn-kubernetes-master-4.18-upgrade-from-stable-4.17-images
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-hypershift
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-hypershift-conformance-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-kubevirt
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-local-gateway
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-local-to-shared-gateway-mode-migration
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-serial
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-shared-to-local-gateway-mode-migration
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-single-node-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-upgrade
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-upgrade-local-gateway
pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-windows
pull-ci-openshift-ovn-kubernetes-master-e2e-azure-ovn
pull-ci-openshift-ovn-kubernetes-master-e2e-azure-ovn-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-azure-ovn-upgrade
pull-ci-openshift-ovn-kubernetes-master-e2e-gcp-ovn
pull-ci-openshift-ovn-kubernetes-master-e2e-gcp-ovn-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-metal-ipi-ovn-dualstack
pull-ci-openshift-ovn-kubernetes-master-e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-metal-ipi-ovn-dualstack-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-metal-ipi-ovn-ipv6
pull-ci-openshift-ovn-kubernetes-master-e2e-metal-ipi-ovn-ipv6-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-metal-ipi-ovn-techpreview
pull-ci-openshift-ovn-kubernetes-master-e2e-openstack-ovn
pull-ci-openshift-ovn-kubernetes-master-e2e-ovn-hybrid-step-registry
pull-ci-openshift-ovn-kubernetes-master-e2e-vsphere-ovn
pull-ci-openshift-ovn-kubernetes-master-e2e-vsphere-ovn-techpreview
pull-ci-openshift-ovn-kubernetes-master-gofmt
pull-ci-openshift-ovn-kubernetes-master-images
pull-ci-openshift-ovn-kubernetes-master-lint
pull-ci-openshift-ovn-kubernetes-master-okd-scos-e2e-aws-ovn
pull-ci-openshift-ovn-kubernetes-master-openshift-e2e-gcp-ovn-techpreview-upgrade
pull-ci-openshift-ovn-kubernetes-master-security
pull-ci-openshift-ovn-kubernetes-master-unit

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mohit-sheth (Member) commented:

/test qe-perfscale-aws-ovn-small-udn-density-l3

Signed-off-by: Tim Rozet <[email protected]>
Signed-off-by: Tim Rozet <[email protected]>
Give the UDN Node informer 15 threads
Give the Network Manager and NAD controller higher thread counts

Signed-off-by: Tim Rozet <[email protected]>
openshift-ci bot (Contributor) commented Nov 26, 2024

@trozet: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/qe-perfscale-aws-ovn-small-udn-density-l3 | ab94c12 | link | false | /test qe-perfscale-aws-ovn-small-udn-density-l3 |
| ci/prow/e2e-aws-ovn | 1225dce | link | true | /test e2e-aws-ovn |
| ci/prow/e2e-gcp-ovn-techpreview | 1225dce | link | true | /test e2e-gcp-ovn-techpreview |
| ci/prow/e2e-aws-ovn-windows | 1225dce | link | true | /test e2e-aws-ovn-windows |
| ci/prow/e2e-metal-ipi-ovn-dualstack | 1225dce | link | true | /test e2e-metal-ipi-ovn-dualstack |
| ci/prow/e2e-aws-ovn-upgrade | 1225dce | link | true | /test e2e-aws-ovn-upgrade |
| ci/prow/e2e-aws-ovn-hypershift | 1225dce | link | true | /test e2e-aws-ovn-hypershift |
| ci/prow/e2e-gcp-ovn | 1225dce | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-metal-ipi-ovn-ipv6-techpreview | 1225dce | link | false | /test e2e-metal-ipi-ovn-ipv6-techpreview |
| ci/prow/e2e-aws-ovn-techpreview | 1225dce | link | false | /test e2e-aws-ovn-techpreview |
| ci/prow/openshift-e2e-gcp-ovn-techpreview-upgrade | 1225dce | link | false | /test openshift-e2e-gcp-ovn-techpreview-upgrade |
| ci/prow/e2e-metal-ipi-ovn-techpreview | 1225dce | link | false | /test e2e-metal-ipi-ovn-techpreview |
| ci/prow/4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade | 1225dce | link | true | /test 4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-azure-ovn-upgrade | 1225dce | link | true | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-vsphere-ovn-techpreview | 1225dce | link | false | /test e2e-vsphere-ovn-techpreview |
| ci/prow/e2e-ovn-hybrid-step-registry | 1225dce | link | false | /test e2e-ovn-hybrid-step-registry |
| ci/prow/e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview | 1225dce | link | false | /test e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 1225dce | link | true | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/e2e-aws-ovn-kubevirt | 1225dce | link | false | /test e2e-aws-ovn-kubevirt |
| ci/prow/e2e-aws-ovn-single-node-techpreview | 1225dce | link | false | /test e2e-aws-ovn-single-node-techpreview |
| ci/prow/okd-scos-e2e-aws-ovn | 1225dce | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/e2e-azure-ovn | 1225dce | link | false | /test e2e-azure-ovn |
| ci/prow/security | 1225dce | link | false | /test security |
| ci/prow/e2e-openstack-ovn | 1225dce | link | false | /test e2e-openstack-ovn |
| ci/prow/4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade | 1225dce | link | true | /test 4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade |
| ci/prow/e2e-metal-ipi-ovn-dualstack-techpreview | 1225dce | link | false | /test e2e-metal-ipi-ovn-dualstack-techpreview |
| ci/prow/e2e-aws-ovn-local-gateway | 1225dce | link | true | /test e2e-aws-ovn-local-gateway |
| ci/prow/e2e-azure-ovn-techpreview | 1225dce | link | false | /test e2e-azure-ovn-techpreview |
| ci/prow/e2e-aws-ovn-serial | 1225dce | link | true | /test e2e-aws-ovn-serial |
| ci/prow/e2e-vsphere-ovn | 1225dce | link | false | /test e2e-vsphere-ovn |
| ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration | 1225dce | link | true | /test e2e-aws-ovn-shared-to-local-gateway-mode-migration |
| ci/prow/e2e-aws-ovn-hypershift-conformance-techpreview | 1225dce | link | false | /test e2e-aws-ovn-hypershift-conformance-techpreview |
| ci/prow/e2e-aws-ovn-upgrade-local-gateway | 1225dce | link | true | /test e2e-aws-ovn-upgrade-local-gateway |
| ci/prow/e2e-aws-ovn-local-to-shared-gateway-mode-migration | 1225dce | link | true | /test e2e-aws-ovn-local-to-shared-gateway-mode-migration |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.