BGP w/OVN: nexthop value not synchronised across cluster nodes #14531

Open
jsimpso opened this issue Nov 26, 2024 · 38 comments · Fixed by canonical/lxd-ui#1036
Assignees
Labels
Bug Confirmed to be a bug
Milestone

Comments

@jsimpso

jsimpso commented Nov 26, 2024

Required information

  • Distribution: snap
  • Distribution version: 5.21.2-084c8c8
  • The output of "snap list --all lxd core20 core22 core24 snapd":
    jsimpso@mc-001:~$ snap list --all lxd core20 core22 core24 snapd
    Name    Version         Rev    Tracking       Publisher   Notes
    core22  20241001        1663   latest/stable  canonical✓  base
    core24  20240920        609    latest/stable  canonical✓  base
    lxd     5.21.2-084c8c8  31214  5.21/stable    canonical✓  in-cohort
    snapd   2.63            21759  latest/stable  canonical✓  snapd
    
  • The output of "lxc info" or if that fails:
     config:
       cluster.https_address: 10.0.25.10:8443
       core.bgp_address: 10.0.25.10
       core.bgp_asn: "64512"
       core.bgp_routerid: 10.0.25.10
       core.https_address: '[::]:8443'
       network.ovn.northbound_connection: ssl:10.0.25.10:6641,ssl:10.0.25.11:6641,ssl:10.0.25.12:6641
     api_extensions:
     - storage_zfs_remove_snapshots
     - container_host_shutdown_timeout
     - container_stop_priority
     - container_syscall_filtering
     - auth_pki
     - container_last_used_at
     - etag
     - patch
     - usb_devices
     - https_allowed_credentials
     - image_compression_algorithm
     - directory_manipulation
     - container_cpu_time
     - storage_zfs_use_refquota
     - storage_lvm_mount_options
     - network
     - profile_usedby
     - container_push
     - container_exec_recording
     - certificate_update
     - container_exec_signal_handling
     - gpu_devices
     - container_image_properties
     - migration_progress
     - id_map
     - network_firewall_filtering
     - network_routes
     - storage
     - file_delete
     - file_append
     - network_dhcp_expiry
     - storage_lvm_vg_rename
     - storage_lvm_thinpool_rename
     - network_vlan
     - image_create_aliases
     - container_stateless_copy
     - container_only_migration
     - storage_zfs_clone_copy
     - unix_device_rename
     - storage_lvm_use_thinpool
     - storage_rsync_bwlimit
     - network_vxlan_interface
     - storage_btrfs_mount_options
     - entity_description
     - image_force_refresh
     - storage_lvm_lv_resizing
     - id_map_base
     - file_symlinks
     - container_push_target
     - network_vlan_physical
     - storage_images_delete
     - container_edit_metadata
     - container_snapshot_stateful_migration
     - storage_driver_ceph
     - storage_ceph_user_name
     - resource_limits
     - storage_volatile_initial_source
     - storage_ceph_force_osd_reuse
     - storage_block_filesystem_btrfs
     - resources
     - kernel_limits
     - storage_api_volume_rename
     - network_sriov
     - console
     - restrict_devlxd
     - migration_pre_copy
     - infiniband
     - maas_network
     - devlxd_events
     - proxy
     - network_dhcp_gateway
     - file_get_symlink
     - network_leases
     - unix_device_hotplug
     - storage_api_local_volume_handling
     - operation_description
     - clustering
     - event_lifecycle
     - storage_api_remote_volume_handling
     - nvidia_runtime
     - container_mount_propagation
     - container_backup
     - devlxd_images
     - container_local_cross_pool_handling
     - proxy_unix
     - proxy_udp
     - clustering_join
     - proxy_tcp_udp_multi_port_handling
     - network_state
     - proxy_unix_dac_properties
     - container_protection_delete
     - unix_priv_drop
     - pprof_http
     - proxy_haproxy_protocol
     - network_hwaddr
     - proxy_nat
     - network_nat_order
     - container_full
     - backup_compression
     - nvidia_runtime_config
     - storage_api_volume_snapshots
     - storage_unmapped
     - projects
     - network_vxlan_ttl
     - container_incremental_copy
     - usb_optional_vendorid
     - snapshot_scheduling
     - snapshot_schedule_aliases
     - container_copy_project
     - clustering_server_address
     - clustering_image_replication
     - container_protection_shift
     - snapshot_expiry
     - container_backup_override_pool
     - snapshot_expiry_creation
     - network_leases_location
     - resources_cpu_socket
     - resources_gpu
     - resources_numa
     - kernel_features
     - id_map_current
     - event_location
     - storage_api_remote_volume_snapshots
     - network_nat_address
     - container_nic_routes
     - cluster_internal_copy
     - seccomp_notify
     - lxc_features
     - container_nic_ipvlan
     - network_vlan_sriov
     - storage_cephfs
     - container_nic_ipfilter
     - resources_v2
     - container_exec_user_group_cwd
     - container_syscall_intercept
     - container_disk_shift
     - storage_shifted
     - resources_infiniband
     - daemon_storage
     - instances
     - image_types
     - resources_disk_sata
     - clustering_roles
     - images_expiry
     - resources_network_firmware
     - backup_compression_algorithm
     - ceph_data_pool_name
     - container_syscall_intercept_mount
     - compression_squashfs
     - container_raw_mount
     - container_nic_routed
     - container_syscall_intercept_mount_fuse
     - container_disk_ceph
     - virtual-machines
     - image_profiles
     - clustering_architecture
     - resources_disk_id
     - storage_lvm_stripes
     - vm_boot_priority
     - unix_hotplug_devices
     - api_filtering
     - instance_nic_network
     - clustering_sizing
     - firewall_driver
     - projects_limits
     - container_syscall_intercept_hugetlbfs
     - limits_hugepages
     - container_nic_routed_gateway
     - projects_restrictions
     - custom_volume_snapshot_expiry
     - volume_snapshot_scheduling
     - trust_ca_certificates
     - snapshot_disk_usage
     - clustering_edit_roles
     - container_nic_routed_host_address
     - container_nic_ipvlan_gateway
     - resources_usb_pci
     - resources_cpu_threads_numa
     - resources_cpu_core_die
     - api_os
     - container_nic_routed_host_table
     - container_nic_ipvlan_host_table
     - container_nic_ipvlan_mode
     - resources_system
     - images_push_relay
     - network_dns_search
     - container_nic_routed_limits
     - instance_nic_bridged_vlan
     - network_state_bond_bridge
     - usedby_consistency
     - custom_block_volumes
     - clustering_failure_domains
     - resources_gpu_mdev
     - console_vga_type
     - projects_limits_disk
     - network_type_macvlan
     - network_type_sriov
     - container_syscall_intercept_bpf_devices
     - network_type_ovn
     - projects_networks
     - projects_networks_restricted_uplinks
     - custom_volume_backup
     - backup_override_name
     - storage_rsync_compression
     - network_type_physical
     - network_ovn_external_subnets
     - network_ovn_nat
     - network_ovn_external_routes_remove
     - tpm_device_type
     - storage_zfs_clone_copy_rebase
     - gpu_mdev
     - resources_pci_iommu
     - resources_network_usb
     - resources_disk_address
     - network_physical_ovn_ingress_mode
     - network_ovn_dhcp
     - network_physical_routes_anycast
     - projects_limits_instances
     - network_state_vlan
     - instance_nic_bridged_port_isolation
     - instance_bulk_state_change
     - network_gvrp
     - instance_pool_move
     - gpu_sriov
     - pci_device_type
     - storage_volume_state
     - network_acl
     - migration_stateful
     - disk_state_quota
     - storage_ceph_features
     - projects_compression
     - projects_images_remote_cache_expiry
     - certificate_project
     - network_ovn_acl
     - projects_images_auto_update
     - projects_restricted_cluster_target
     - images_default_architecture
     - network_ovn_acl_defaults
     - gpu_mig
     - project_usage
     - network_bridge_acl
     - warnings
     - projects_restricted_backups_and_snapshots
     - clustering_join_token
     - clustering_description
     - server_trusted_proxy
     - clustering_update_cert
     - storage_api_project
     - server_instance_driver_operational
     - server_supported_storage_drivers
     - event_lifecycle_requestor_address
     - resources_gpu_usb
     - clustering_evacuation
     - network_ovn_nat_address
     - network_bgp
     - network_forward
     - custom_volume_refresh
     - network_counters_errors_dropped
     - metrics
     - image_source_project
     - clustering_config
     - network_peer
     - linux_sysctl
     - network_dns
     - ovn_nic_acceleration
     - certificate_self_renewal
     - instance_project_move
     - storage_volume_project_move
     - cloud_init
     - network_dns_nat
     - database_leader
     - instance_all_projects
     - clustering_groups
     - ceph_rbd_du
     - instance_get_full
     - qemu_metrics
     - gpu_mig_uuid
     - event_project
     - clustering_evacuation_live
     - instance_allow_inconsistent_copy
     - network_state_ovn
     - storage_volume_api_filtering
     - image_restrictions
     - storage_zfs_export
     - network_dns_records
     - storage_zfs_reserve_space
     - network_acl_log
     - storage_zfs_blocksize
     - metrics_cpu_seconds
     - instance_snapshot_never
     - certificate_token
     - instance_nic_routed_neighbor_probe
     - event_hub
     - agent_nic_config
     - projects_restricted_intercept
     - metrics_authentication
     - images_target_project
     - cluster_migration_inconsistent_copy
     - cluster_ovn_chassis
     - container_syscall_intercept_sched_setscheduler
     - storage_lvm_thinpool_metadata_size
     - storage_volume_state_total
     - instance_file_head
     - instances_nic_host_name
     - image_copy_profile
     - container_syscall_intercept_sysinfo
     - clustering_evacuation_mode
     - resources_pci_vpd
     - qemu_raw_conf
     - storage_cephfs_fscache
     - network_load_balancer
     - vsock_api
     - instance_ready_state
     - network_bgp_holdtime
     - storage_volumes_all_projects
     - metrics_memory_oom_total
     - storage_buckets
     - storage_buckets_create_credentials
     - metrics_cpu_effective_total
     - projects_networks_restricted_access
     - storage_buckets_local
     - loki
     - acme
     - internal_metrics
     - cluster_join_token_expiry
     - remote_token_expiry
     - init_preseed
     - storage_volumes_created_at
     - cpu_hotplug
     - projects_networks_zones
     - network_txqueuelen
     - cluster_member_state
     - instances_placement_scriptlet
     - storage_pool_source_wipe
     - zfs_block_mode
     - instance_generation_id
     - disk_io_cache
     - amd_sev
     - storage_pool_loop_resize
     - migration_vm_live
     - ovn_nic_nesting
     - oidc
     - network_ovn_l3only
     - ovn_nic_acceleration_vdpa
     - cluster_healing
     - instances_state_total
     - auth_user
     - security_csm
     - instances_rebuild
     - numa_cpu_placement
     - custom_volume_iso
     - network_allocations
     - storage_api_remote_volume_snapshot_copy
     - zfs_delegate
     - operations_get_query_all_projects
     - metadata_configuration
     - syslog_socket
     - event_lifecycle_name_and_project
     - instances_nic_limits_priority
     - disk_initial_volume_configuration
     - operation_wait
     - cluster_internal_custom_volume_copy
     - disk_io_bus
     - storage_cephfs_create_missing
     - instance_move_config
     - ovn_ssl_config
     - init_preseed_storage_volumes
     - metrics_instances_count
     - server_instance_type_info
     - resources_disk_mounted
     - server_version_lts
     - oidc_groups_claim
     - loki_config_instance
     - storage_volatile_uuid
     - import_instance_devices
     - instances_uefi_vars
     - instances_migration_stateful
     - container_syscall_filtering_allow_deny_syntax
     - access_management
     - vm_disk_io_limits
     - storage_volumes_all
     - instances_files_modify_permissions
     - image_restriction_nesting
     - container_syscall_intercept_finit_module
     - device_usb_serial
     - network_allocate_external_ips
     - explicit_trust_token
     api_status: stable
     api_version: "1.0"
     auth: trusted
     public: false
     auth_methods:
     - tls
     auth_user_name: jsimpso
     auth_user_method: unix
     environment:
       addresses:
       - 10.0.25.10:8443
       architectures:
       - x86_64
       - i686
       certificate: |
         -----BEGIN CERTIFICATE-----
         MIIB5DCCAWqgAwIBAgIRAK/vWMdSjS44j1Bhp+adHwIwCgYIKoZIzj0EAwMwJDEM
         MAoGA1UEChMDTFhEMRQwEgYDVQQDDAtyb290QG1jLTAwMTAeFw0yNDExMjQwOTQ3
         MTBaFw0zNDExMjIwOTQ3MTBaMCQxDDAKBgNVBAoTA0xYRDEUMBIGA1UEAwwLcm9v
         dEBtYy0wMDEwdjAQBgcqhkjOPQIBBgUrgQQAIgNiAARWbRkG9Ht7v0f7Z2i49Zdw
         EWXcSzMswnPCyUeQOVulI97yekk1KDNfcJbgEejn4TfKAQT1Qrw5lSCl+LXhQdmB
         Vlwam8t8OxNAIDz2mGUMuYANDcFJIbBZjq/+tNWM8TujYDBeMA4GA1UdDwEB/wQE
         AwIFoDATBgNVHSUEDDAKBggrBgEFBQcDATAMBgNVHRMBAf8EAjAAMCkGA1UdEQQi
         MCCCBm1jLTAwMYcEfwAAAYcQAAAAAAAAAAAAAAAAAAAAATAKBggqhkjOPQQDAwNo
         ADBlAjAvQOfW1vSIejG80/0U14OW+5fMZIHdwKnZOBkFXrRh1z5pRhddzOWE8QGj
         jh1VluYCMQCcqxZh1u5SKbcp47h2OZK0CiDUQS/ML+y3gVY+k37L1PvlrLAG25wJ
         vALYZ81YfKQ=
         -----END CERTIFICATE-----
       certificate_fingerprint: 37ec2b324056d390fc3562e98362abee62523bb6d4f99484621c844b90dbd22a
       driver: lxc | qemu
       driver_version: 6.0.0 | 8.2.1
       instance_types:
       - container
       - virtual-machine
       firewall: nftables
       kernel: Linux
       kernel_architecture: x86_64
       kernel_features:
         idmapped_mounts: "true"
         netnsid_getifaddrs: "true"
         seccomp_listener: "true"
         seccomp_listener_continue: "true"
         uevent_injection: "true"
         unpriv_fscaps: "true"
       kernel_version: 6.8.0-49-generic
       lxc_features:
         cgroup2: "true"
         core_scheduling: "true"
         devpts_fd: "true"
         idmapped_mounts_v2: "true"
         mount_injection_file: "true"
         network_gateway_device_route: "true"
         network_ipvlan: "true"
         network_l2proxy: "true"
         network_phys_macvlan_mtu: "true"
         network_veth_router: "true"
         pidfd: "true"
         seccomp_allow_deny_syntax: "true"
         seccomp_notify: "true"
         seccomp_proxy_send_notify_fd: "true"
       os_name: Ubuntu
       os_version: "24.04"
       project: default
       server: lxd
       server_clustered: true
       server_event_mode: full-mesh
       server_name: mc-001
       server_pid: 1763
       server_version: 5.21.2
       server_lts: true
       storage: ""
       storage_version: ""
       storage_supported_drivers:
       - name: ceph
         version: 17.2.7
         remote: true
       - name: cephfs
         version: 17.2.7
         remote: true
       - name: cephobject
         version: 17.2.7
         remote: true
       - name: dir
         version: "1"
         remote: false
       - name: lvm
         version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.48.0
         remote: false
       - name: powerflex
         version: 1.16 (nvme-cli)
         remote: true
       - name: zfs
         version: 2.2.2-0ubuntu9
         remote: false
       - name: btrfs
         version: 5.16.2
         remote: false

Issue description

  • This is a MicroCloud deployed across 3 physical nodes

  • I've configured BGP between LXD and the physical router in this lab network

  • The router is successfully receiving routes for the three networks I've created so far. However, one of those networks is showing different values for the next hop:

    jsimpso@rubidoux:~$ show bgp ipv4
    BGP table version is 49, local router ID is 10.0.25.1, vrf id 0
    Default local pref 100, local AS 64513
    Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                   i internal, r RIB-failure, S Stale, R Removed
    Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
    Origin codes:  i - IGP, e - EGP, ? - incomplete
    RPKI validation codes: V valid, I invalid, N Not found
    
        Network          Next Hop            Metric LocPrf Weight Path
     *  172.16.0.0/28    10.0.10.4                              0 64512 i
     *                   10.0.10.4                              0 64512 i
     *>                  10.0.10.4                              0 64512 i
     *  172.16.0.128/28  10.0.10.3                              0 64512 i
     *                   10.0.10.3                              0 64512 i
     *>                  10.0.10.3                              0 64512 i
     *= 172.16.1.0/25    10.0.25.11                             0 64512 i
     *=                  10.0.10.5                              0 64512 i
     *>                  10.0.25.10                             0 64512 i
    
    Displayed  3 routes and 9 total paths
    
  • Running lxc query /internal/testing/bgp on each of the three nodes confirms that only one of them has the nexthop value set to the OVN router; the other two list "0.0.0.0":

  • mc-001:

    {
            "peers": [
                    {
                            "address": "10.0.25.1",
                            "asn": 64513,
                            "count": 1,
                            "holdtime": 0,
                            "password": ""
                    }
            ],
            "prefixes": [
                    {
                            "nexthop": "10.0.10.3",
                            "owner": "network_3",
                            "prefix": "172.16.0.128/28"
                    },
                    {
                            "nexthop": "10.0.10.4",
                            "owner": "network_4",
                            "prefix": "172.16.0.0/28"
                    },
                    {
                            "nexthop": "0.0.0.0",
                            "owner": "network_5",
                            "prefix": "172.16.1.0/25"
                    }
            ],
            "server": {
                    "address": "10.0.25.10",
                    "asn": 64512,
                    "router_id": "10.0.25.10",
                    "running": true
            }
    }
  • mc-002:

    {
            "peers": [
                    {
                            "address": "10.0.25.1",
                            "asn": 64513,
                            "count": 1,
                            "holdtime": 0,
                            "password": ""
                    }
            ],
            "prefixes": [
                    {
                            "nexthop": "10.0.10.3",
                            "owner": "network_3",
                            "prefix": "172.16.0.128/28"
                    },
                    {
                            "nexthop": "10.0.10.4",
                            "owner": "network_4",
                            "prefix": "172.16.0.0/28"
                    },
                    {
                            "nexthop": "0.0.0.0",
                            "owner": "network_5",
                            "prefix": "172.16.1.0/25"
                    }
            ],
            "server": {
                    "address": "10.0.25.11",
                    "asn": 64512,
                    "router_id": "10.0.25.11",
                    "running": true
            }
    }
  • mc-003:

    {
            "peers": [
                    {
                            "address": "10.0.25.1",
                            "asn": 64513,
                            "count": 1,
                            "holdtime": 0,
                            "password": ""
                    }
            ],
            "prefixes": [
                    {
                            "nexthop": "10.0.10.3",
                            "owner": "network_3",
                            "prefix": "172.16.0.128/28"
                    },
                    {
                            "nexthop": "10.0.10.4",
                            "owner": "network_4",
                            "prefix": "172.16.0.0/28"
                    },
                    {
                            "nexthop": "10.0.10.5",
                            "owner": "network_5",
                            "prefix": "172.16.1.0/25"
                    }
            ],
            "server": {
                    "address": "10.0.25.12",
                    "asn": 64512,
                    "router_id": "10.0.25.12",
                    "running": true
            }
    }

Steps to reproduce

I seem to be able to trigger this behaviour by changing the configuration of the network. This issue occurred after I changed the MTU for the network from 1442 to 1500 for some unrelated troubleshooting. Changing the configuration back to its original state doesn't appear to have any positive effect.

I've just reproduced this by replicating that same change (editing the network's YAML configuration via the LXD UI) on network_4 from the above output, and now see the same situation there:

  • Router

    BGP table version is 53, local router ID is 10.0.25.1, vrf id 0
    Default local pref 100, local AS 64513
    Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                   i internal, r RIB-failure, S Stale, R Removed
    Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
    Origin codes:  i - IGP, e - EGP, ? - incomplete
    RPKI validation codes: V valid, I invalid, N Not found
    
        Network          Next Hop            Metric LocPrf Weight Path
     *= 172.16.0.0/28    10.0.25.11                             0 64512 i
     *=                  10.0.25.10                             0 64512 i
     *>                  10.0.10.4                              0 64512 i
     *  172.16.0.128/28  10.0.10.3                              0 64512 i
     *                   10.0.10.3                              0 64512 i
     *>                  10.0.10.3                              0 64512 i
     *= 172.16.1.0/25    10.0.25.11                             0 64512 i
     *=                  10.0.10.5                              0 64512 i
     *>                  10.0.25.10                             0 64512 i
    
    Displayed  3 routes and 9 total paths
    
  • mc-001:

    {
            "peers": [
                    {
                            "address": "10.0.25.1",
                            "asn": 64513,
                            "count": 1,
                            "holdtime": 0,
                            "password": ""
                    }
            ],
            "prefixes": [
                    {
                            "nexthop": "0.0.0.0",
                            "owner": "network_5",
                            "prefix": "172.16.1.0/25"
                    },
                    {
                            "nexthop": "10.0.10.3",
                            "owner": "network_3",
                            "prefix": "172.16.0.128/28"
                    },
                    {
                            "nexthop": "0.0.0.0",
                            "owner": "network_4",
                            "prefix": "172.16.0.0/28"
                    }
            ],
            "server": {
                    "address": "10.0.25.10",
                    "asn": 64512,
                    "router_id": "10.0.25.10",
                    "running": true
            }
    }
    
  • mc-002:

      {
              "peers": [
                      {
                              "address": "10.0.25.1",
                              "asn": 64513,
                              "count": 1,
                              "holdtime": 0,
                              "password": ""
                      }
              ],
              "prefixes": [
                      {
                              "nexthop": "10.0.10.3",
                              "owner": "network_3",
                              "prefix": "172.16.0.128/28"
                      },
                      {
                              "nexthop": "0.0.0.0",
                              "owner": "network_4",
                              "prefix": "172.16.0.0/28"
                      },
                      {
                              "nexthop": "0.0.0.0",
                              "owner": "network_5",
                              "prefix": "172.16.1.0/25"
                      }
              ],
              "server": {
                      "address": "10.0.25.11",
                      "asn": 64512,
                      "router_id": "10.0.25.11",
                      "running": true
              }
      }
  • mc-003:

    {
            "peers": [
                    {
                            "address": "10.0.25.1",
                            "asn": 64513,
                            "count": 1,
                            "holdtime": 0,
                            "password": ""
                    }
            ],
            "prefixes": [
                    {
                            "nexthop": "10.0.10.3",
                            "owner": "network_3",
                            "prefix": "172.16.0.128/28"
                    },
                    {
                            "nexthop": "10.0.10.4",
                            "owner": "network_4",
                            "prefix": "172.16.0.0/28"
                    },
                    {
                            "nexthop": "10.0.10.5",
                            "owner": "network_5",
                            "prefix": "172.16.1.0/25"
                    }
            ],
            "server": {
                    "address": "10.0.25.12",
                    "asn": 64512,
                    "router_id": "10.0.25.12",
                    "running": true
            }
    }
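
The router-side symptom can be checked the same way. This is a rough Python sketch that assumes the table layout shown in the router output above: path lines start with a status field beginning with "*", and continuation lines omit the prefix. The function names and the ROUTER_TABLE constant are hypothetical. It groups next hops per prefix and reports prefixes that are advertised with more than one next hop.

```python
from collections import defaultdict

# Route lines copied from the second router table above.
ROUTER_TABLE = """\
    Network          Next Hop            Metric LocPrf Weight Path
 *= 172.16.0.0/28    10.0.25.11                             0 64512 i
 *=                  10.0.25.10                             0 64512 i
 *>                  10.0.10.4                              0 64512 i
 *  172.16.0.128/28  10.0.10.3                              0 64512 i
 *                   10.0.10.3                              0 64512 i
 *>                  10.0.10.3                              0 64512 i
 *= 172.16.1.0/25    10.0.25.11                             0 64512 i
 *=                  10.0.10.5                              0 64512 i
 *>                  10.0.25.10                             0 64512 i
"""


def next_hops_by_prefix(table_text):
    """Map each prefix to the set of next hops listed for it."""
    hops = defaultdict(set)
    current = None
    for line in table_text.splitlines():
        fields = line.split()
        # Only path lines start with a status field such as "*", "*>" or "*=".
        if len(fields) < 2 or not fields[0].startswith("*"):
            continue
        if "/" in fields[1]:
            # First path of a prefix: prefix column followed by next hop.
            if len(fields) < 3:
                continue
            current = fields[1]
            hop = fields[2]
        else:
            # Continuation line: the prefix column is blank.
            hop = fields[1]
        if current is not None:
            hops[current].add(hop)
    return dict(hops)


def inconsistent_prefixes(table_text):
    """Return {prefix: sorted next hops} for prefixes with more than one next hop."""
    return {
        prefix: sorted(hops)
        for prefix, hops in next_hops_by_prefix(table_text).items()
        if len(hops) > 1
    }


if __name__ == "__main__":
    for prefix, hops in inconsistent_prefixes(ROUTER_TABLE).items():
        print(prefix, hops)
```

In the table above, the extra next hops for the affected prefixes are 10.0.25.10 and 10.0.25.11, which are the BGP addresses of the two members reporting "0.0.0.0" for those networks.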
    
@tomponline tomponline self-assigned this Nov 26, 2024
@tomponline tomponline modified the milestones: lxd-6.2, lxd-6.3 Nov 26, 2024
@jsimpso
Author

jsimpso commented Nov 28, 2024

In case it helps, it looks like restarting the LXD snap causes those values to re-sync:

jsimpso@mc-002:~$ lxc query /internal/testing/bgp
{
	"peers": [
		{
			"address": "10.0.25.1",
			"asn": 64513,
			"count": 1,
			"holdtime": 0,
			"password": ""
		}
	],
	"prefixes": [
		{
			"nexthop": "10.0.10.3",
			"owner": "network_3",
			"prefix": "172.16.0.128/28"
		},
		{
			"nexthop": "0.0.0.0",
			"owner": "network_4",
			"prefix": "172.16.0.0/28"
		},
		{
			"nexthop": "0.0.0.0",
			"owner": "network_5",
			"prefix": "172.16.1.0/25"
		}
	],
	"server": {
		"address": "10.0.25.11",
		"asn": 64512,
		"router_id": "10.0.25.11",
		"running": true
	}
}
jsimpso@mc-002:~$ sudo snap restart lxd
2024-11-28T11:07:48+08:00 INFO Waiting for "snap.lxd.daemon.service" to stop.
Restarted.
jsimpso@mc-002:~$ lxc query /internal/testing/bgp
{
	"peers": [
		{
			"address": "10.0.25.1",
			"asn": 64513,
			"count": 1,
			"holdtime": 0,
			"password": ""
		}
	],
	"prefixes": [
		{
			"nexthop": "10.0.10.4",
			"owner": "network_4",
			"prefix": "172.16.0.0/28"
		},
		{
			"nexthop": "10.0.10.5",
			"owner": "network_5",
			"prefix": "172.16.1.0/25"
		},
		{
			"nexthop": "10.0.10.3",
			"owner": "network_3",
			"prefix": "172.16.0.128/28"
		}
	],
	"server": {
		"address": "10.0.25.11",
		"asn": 64512,
		"router_id": "10.0.25.11",
		"running": true
	}
}

@tomponline
Member

Thanks @jsimpso this is high on my list to work on.

@tomponline
Member

We should also consider here whether we can relax the requirement that all LXD cluster members be online when modifying OVN load balancers and forwards, as currently the only reason for it is to update the local per-member BGP exporters.

@tomponline
Member

I seem to be able to trigger this behaviour by changing the configuration of the network. This issue occurred after I changed the MTU for the network from 1442 to 1500 for some unrelated troubleshooting. Changing the configuration back to its original state doesn't appear to have any positive effect.

When you did this, were the networks that didn't have their MTU changed also affected (WRT their BGP nexthop), or was it only the network being edited that was affected?

@tomponline
Member

Please can you show lxc network show <network> for the networks affected too?

@tomponline
Member

An initial attempt to reproduce this issue hasn't been successful:

snap list
Name        Version                 Rev    Tracking       Publisher   Notes
core22      20241119                1722   latest/stable  canonical✓  base
core24      20240920                609    latest/stable  canonical✓  base
lxd         5.21.2-084c8c8          31214  5.21/stable    canonical✓  -
microceph   19.2.0+snap9aeaeb2970   1228   squid/stable   canonical✓  -
microcloud  2.1.0-3e8b183           1144   2/stable       canonical✓  -
microovn    24.03.2+snapa2c59c105b  667    24.03/stable   canonical✓  -
snapd       2.66.1                  23258  latest/stable  canonical✓  snapd

I observed that the nexthop address did not go to 0.0.0.0 on any of the cluster members when updating the OVN network's bridge.mtu.

@tomponline
Member

editing the network's YAML configuration via the LXD UI

Do you see the same issue if editing the network's config using lxc network edit <network>, and also which host in your cluster is your UI browser pointing at? And how does this compare to the active chassis for the affected networks in lxc network info <network>?

@tomponline tomponline added the Incomplete Waiting on more information from reporter label Dec 9, 2024
@jsimpso
Author

jsimpso commented Dec 10, 2024

When you did this, were the networks that didn't have their MTU changed also affected (WRT their BGP nexthop), or was it only the network being edited that was affected?

Only the network that had a configuration change was affected.

Please can you show lxc network show <network> for the networks affected too?

Before:

jsimpso@mc-001:~$ lxc network show dev-test
name: dev-test
description: ""
type: ovn
managed: true
status: Created
config:
  bridge.mtu: "1442"
  ipv4.address: 172.16.0.1/28
  ipv4.nat: "false"
  ipv6.address: none
  network: UPLINK
  volatile.network.ipv4.address: 10.0.10.4
used_by:
- /1.0/profiles/default?project=dev-test
locations:
- mc-001
- mc-002
- mc-003

BGP status (lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28")'):

mc-001
{
  "nexthop": "10.0.10.4",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

mc-002
{
  "nexthop": "10.0.10.4",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

mc-003
{
  "nexthop": "10.0.10.4",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

After changing bridge.mtu through the UI:

name: dev-test
description: ""
type: ovn
managed: true
status: Created
config:
  bridge.mtu: "1500"
  ipv4.address: 172.16.0.1/28
  ipv4.nat: "false"
  ipv6.address: none
  network: UPLINK
  volatile.network.ipv4.address: 10.0.10.4
used_by:
- /1.0/profiles/default?project=dev-test
locations:
- mc-001
- mc-002
- mc-003

BGP status (lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28")'):

mc-001
{
  "nexthop": "0.0.0.0",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

mc-002
{
  "nexthop": "0.0.0.0",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

mc-003
{
  "nexthop": "10.0.10.4",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

@jsimpso
Author

jsimpso commented Dec 10, 2024

Do you see the same issue if editing the network's config using lxc network edit <network>,

Just tested making the same change through the CLI and no, that doesn't seem to cause the same issue. When the network is changed through the CLI each member shows the expected nexthop value.

FWIW, it also doesn't seem to matter which field is modified when the change is made through the UI. I've just reproduced it by changing only the network's description.

and also which host in your cluster is your UI browser pointing at? And how does this compare to the active chassis for the affected networks in lxc network info <network>?

The DNS record I'm using to access the UI resolves to the IPs of all three cluster members:

jsimpso@nautilus:~$ host microcloud.jsimpso.xyz
microcloud.jsimpso.xyz has address 10.0.25.10
microcloud.jsimpso.xyz has address 10.0.25.12
microcloud.jsimpso.xyz has address 10.0.25.11

I'll test overriding it to resolve to a single host and see if the behaviour is different.

@jsimpso
Author

jsimpso commented Dec 10, 2024

The active chassis for this network is mc-001:

jsimpso@mc-003:~$ lxc network info dev-test --project dev-test
Name: dev-test
MAC address: 00:16:3e:1f:44:31
MTU: 1442
State: up
Type: broadcast

IP addresses:
  inet  172.16.0.1/28 (link)

Network usage:
  Bytes received: 0B
  Bytes sent: 0B
  Packets received: 0
  Packets sent: 0

OVN:
  Chassis: mc-001
  • Hostname for the UI has a local override to match the active chassis:
jsimpso@nautilus:~$ host microcloud.jsimpso.xyz
microcloud.jsimpso.xyz has address 10.0.25.10
  • Confirmed that I have a valid nexthop from each host
jsimpso@mc-001:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"

jsimpso@mc-002:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"

jsimpso@mc-003:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"
  • Modified the network description with the UI
  • The other two cluster members (mc-002 and mc-003) both lose their nexthop value
jsimpso@mc-001:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"

jsimpso@mc-002:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"0.0.0.0"

jsimpso@mc-003:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"0.0.0.0"

@jsimpso
Author

jsimpso commented Dec 10, 2024

Confirmed this matches on all three members

snap list
Name        Version                 Rev    Tracking       Publisher   Notes
core22      20241119                1722   latest/stable  canonical✓  base
core24      20240920                609    latest/stable  canonical✓  base
lxd         5.21.2-084c8c8          31214  5.21/stable    canonical✓  in-cohort
microceph   19.2.0+snap9aeaeb2970   1228   squid/stable   canonical✓  in-cohort
microcloud  2.1.0-3e8b183           1144   2/stable       canonical✓  in-cohort
microovn    24.03.2+snapa2c59c105b  667    24.03/stable   canonical✓  in-cohort
snapd       2.66.1                  23258  latest/stable  canonical✓  snapd

@tomponline
Member

tomponline commented Dec 10, 2024

@jsimpso would you be able to run lxc monitor --pretty on the host that the UI is connecting to, and then perform the change in the UI to cause the breakage and then send the monitor logs here so we can see what the UI is sending?

cc @edlerd I won't transfer this to the lxd-ui repo just yet, as I'm not 100% sure it's a UI bug.

@tomponline added the "Bug" label (confirmed to be a bug) and removed the "Incomplete" label (waiting on more information from reporter) on Dec 10, 2024
@edlerd
Contributor

edlerd commented Dec 10, 2024

cc @edlerd I won't transfer this to the lxd-ui repo just yet, as I'm not 100% sure it's a UI bug.

From the UI code, it seems to be a PUT to /1.0/networks/:name?project=:project when updating a network MTU. The discussion above indicates that this update messes with the BGP settings. Surprisingly, there are no bgp config settings for OVN networks available. But then again, I didn't reproduce the exact conditions above and might be missing something.

@edlerd
Contributor

edlerd commented Dec 10, 2024

BGP status (lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28")'):

mc-001
{
  "nexthop": "0.0.0.0",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

mc-002
{
  "nexthop": "0.0.0.0",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

mc-003
{
  "nexthop": "10.0.10.4",
  "owner": "network_4",
  "prefix": "172.16.0.0/28"
}

It is interesting that the nexthop is correct for one of the three members but 0.0.0.0 for the other two. Which host is the one that serves the UI and its API to you?

@jsimpso
Author

jsimpso commented Dec 10, 2024

@jsimpso would you be able to run lxc monitor --pretty on the host that the UI is connecting to, and then perform the change in the UI to cause the breakage and then send the monitor logs here so we can see what the UI is sending?

I tried to cut this down to what's relevant without pulling out anything useful, but I can grab a more verbose sample if needed:

DEBUG  [2024-12-10T20:47:07+08:00] Sending heartbeat request                     address="10.0.25.12:8443"
DEBUG  [2024-12-10T20:47:07+08:00] Successful heartbeat                          remote="10.0.25.12:8443"
DEBUG  [2024-12-10T20:47:08+08:00] Handling API request                          ip="10.0.37.136:44140" method=PUT protocol=tls url="/1.0/networks/dev-test?project=dev-test" username=8c9e3581c03ed1a3f5246242ca007832a675a6ff2a0247f03a596b89c5da4ee3
DEBUG  [2024-12-10T20:47:08+08:00] Matched trusted cert                          fingerprint=8c9e3581c03ed1a3f5246242ca007832a675a6ff2a0247f03a596b89c5da4ee3 subject="O=LXD UI 192.168.25.10 (Browser Generated),ST=Some-State,C=AU"
DEBUG  [2024-12-10T20:47:08+08:00] Update                                        clientType=normal driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] test}" project=dev-test
DEBUG  [2024-12-10T20:47:08+08:00] Notify node 10.0.25.12:8443 of state changes
DEBUG  [2024-12-10T20:47:08+08:00] Notify node 10.0.25.10:8443 of state changes
DEBUG  [2024-12-10T20:47:08+08:00] Connecting to a remote LXD over HTTPS         url="https://10.0.25.12:8443"
DEBUG  [2024-12-10T20:47:08+08:00] Connecting to a remote LXD over HTTPS         url="https://10.0.25.10:8443"
DEBUG  [2024-12-10T20:47:08+08:00] Sending request to LXD                        etag= method=PUT url="https://10.0.25.12:8443/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-10T20:47:08+08:00] Sending request to LXD                        etag= method=PUT url="https://10.0.25.10:8443/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-10T20:47:08+08:00] Setting up network                            driver=ovn network=dev-test project=dev-test
DEBUG  [2024-12-10T20:47:08+08:00] Stable MAC generated                          driver=ovn hwAddr="00:16:3e:1f:44:31" network=dev-test project=dev-test seed=37ec2b324056d390fc3562e98362abee62523bb6d4f99484621c844b90dbd22a.0.4
DEBUG  [2024-12-10T20:47:10+08:00] Sending heartbeat request                     address="10.0.25.10:8443"
DEBUG  [2024-12-10T20:47:10+08:00] Successful heartbeat                          remote="10.0.25.10:8443"
DEBUG  [2024-12-10T20:47:10+08:00] Rebalancing member roles in heartbeat         local="10.0.25.11:8443"
DEBUG  [2024-12-10T20:47:10+08:00] Completed heartbeat round                     duration=4.684436172s local="10.0.25.11:8443"
DEBUG  [2024-12-10T20:47:10+08:00] Handling API request                          ip="10.0.37.136:44140" method=GET protocol=tls url="/1.0/networks/dev-test?project=dev-test" username=8c9e3581c03ed1a3f5246242ca007832a675a6ff2a0247f03a596b89c5da4ee3
DEBUG  [2024-12-10T20:47:10+08:00] Matched trusted cert                          fingerprint=8c9e3581c03ed1a3f5246242ca007832a675a6ff2a0247f03a596b89c5da4ee3 subject="O=LXD UI 192.168.25.10 (Browser Generated),ST=Some-State,C=AU"
DEBUG  [2024-12-10T20:47:11+08:00] Handling API request                          ip="10.0.25.24:34338" method=GET protocol=tls url=/1.0/metrics username=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52

The target this time was 10.0.25.11 (mc-002), which is now the only cluster member showing the nexthop value:

jsimpso@mc-001:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"0.0.0.0"

jsimpso@mc-002:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"

jsimpso@mc-003:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"0.0.0.0"

@jsimpso
Author

jsimpso commented Dec 10, 2024

It is interesting that the nexthop is correct for one of the three members but 0.0.0.0 for the other two. Which host is the one that serves the UI and its API to you?

@edlerd it appears that whichever host I target to serve the UI is the one that has the correct nexthop value after the change.

I'll run lxc monitor on one of the others and see if we can capture the PUT update coming from the triggering member.

@jsimpso
Author

jsimpso commented Dec 10, 2024

Here's the output from another node (all I'm doing to trigger this now is flipping the network description between test and testing):

DEBUG  [2024-12-10T21:01:19+08:00] Event listener server handler started         id=e5a692a3-4561-42ba-a3c1-c079843f4659 local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2024-12-10T21:01:19+08:00] Activated RBD volume                          dev=/dev/rbd1 driver=ceph pool=remote volName=virtual-machine_dev-test_u1
DEBUG  [2024-12-10T21:01:19+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:19+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
DEBUG  [2024-12-10T21:01:19+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:19+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
DEBUG  [2024-12-10T21:01:20+08:00] Matched trusted cert                          fingerprint=676ee46ad7a0347fc2bc6e65a2aac7f57c487cfccfc199c8f5e67bce92439d51 subject="CN=root@mc-002,O=LXD"
DEBUG  [2024-12-10T21:01:20+08:00] Replace current raft nodes                    raftMembers="[{{2 10.0.25.11:8443 voter} mc-002} {{3 10.0.25.12:8443 voter} mc-003} {{1 10.0.25.10:8443 voter} mc-001}]"
INFO   [2024-12-10T21:01:20+08:00] Cluster member state has changed              local="10.0.25.12:8443"
DEBUG  [2024-12-10T21:01:20+08:00] Refreshing identity cache
DEBUG  [2024-12-10T21:01:20+08:00] Matched trusted cert                          fingerprint=676ee46ad7a0347fc2bc6e65a2aac7f57c487cfccfc199c8f5e67bce92439d51 subject="CN=root@mc-002,O=LXD"
DEBUG  [2024-12-10T21:01:20+08:00] Handling API request                          fingerprint=676ee46ad7a0347fc2bc6e65a2aac7f57c487cfccfc199c8f5e67bce92439d51 ip="10.0.25.11:41626" method=PUT protocol=cluster url="/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-10T21:01:20+08:00] Refreshing forkdns servers
DEBUG  [2024-12-10T21:01:20+08:00] Connecting to a remote LXD over HTTPS         url="https://10.0.25.12:8443"
DEBUG  [2024-12-10T21:01:20+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:20+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
DEBUG  [2024-12-10T21:01:20+08:00] Matched trusted cert                          fingerprint=8bb6622386a7d67712e30e694c64318b65c7e84ffd778473c53f4cf04e75d17d subject="CN=root@mc-003,O=LXD"
DEBUG  [2024-12-10T21:01:20+08:00] Handling API request                          fingerprint=8bb6622386a7d67712e30e694c64318b65c7e84ffd778473c53f4cf04e75d17d ip="10.0.25.12:40960" method=GET protocol=cluster url="/1.0/events?all-projects=true"
DEBUG  [2024-12-10T21:01:20+08:00] Event listener server handler started         id=51948623-eb5a-49cc-a735-e84432172492 local="10.0.25.12:8443" remote="10.0.25.12:40960"
DEBUG  [2024-12-10T21:01:20+08:00] Connected to the websocket: wss://10.0.25.12:8443/1.0/events?all-projects=true
INFO   [2024-12-10T21:01:20+08:00] Added member event listener client            local="[::]:8443" remote="10.0.25.12:8443"
DEBUG  [2024-12-10T21:01:20+08:00] Update                                        clientType=notifier driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] testing}" project=dev-test
DEBUG  [2024-12-10T21:01:20+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:20+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
DEBUG  [2024-12-10T21:01:21+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:21+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
DEBUG  [2024-12-10T21:01:21+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:21+08:00] Mounted RBD volume                            dev=/dev/rbd1 driver=ceph options=discard path=/var/snap/lxd/common/lxd/storage-pools/remote/virtual-machines/dev-test_u1 pool=remote volName=dev-test_u1
DEBUG  [2024-12-10T21:01:21+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
DEBUG  [2024-12-10T21:01:21+08:00] MountInstance finished                        driver=ceph instance=u1 pool=remote project=dev-test
DEBUG  [2024-12-10T21:01:21+08:00] Skipping lxd-agent install as unchanged       installPath=/var/snap/lxd/common/lxd/virtual-machines/dev-test_u1/config/lxd-agent instance=u1 instanceType=virtual-machine project=dev-test srcPath=/snap/lxd/31214/bin/lxd-agent
DEBUG  [2024-12-10T21:01:22+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:22+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
DEBUG  [2024-12-10T21:01:22+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/1.0 username=root
DEBUG  [2024-12-10T21:01:22+08:00] Handling API request                          ip=@ method=GET protocol=unix url=/internal/ready username=root
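
Most of the monitor output above is unrelated to the network update. The following is a minimal sketch for picking out only the API requests that target one network, assuming the key=value layout seen in the lxc monitor --pretty lines in this thread. The names parse_fields and network_requests are hypothetical helpers, and the sample lines are copied from the logs above.

```python
import re

# One key=value field; the value is either double-quoted or a run of
# non-space characters (assumption based on the log lines in this thread).
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line):
    """Return the key=value fields of one monitor line as a dict."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]  # strip the surrounding quotes
        fields[key] = value
    return fields


def network_requests(lines, network):
    """Yield (method, protocol, url) for API requests on /1.0/networks/<network>."""
    wanted_path = "/1.0/networks/" + network
    for line in lines:
        if "Handling API request" not in line:
            continue
        fields = parse_fields(line)
        url = fields.get("url", "")
        if url.split("?")[0] == wanted_path:
            yield fields.get("method"), fields.get("protocol"), url


sample = [
    'DEBUG  [2024-12-10T20:47:08+08:00] Handling API request                          ip="10.0.37.136:44140" method=PUT protocol=tls url="/1.0/networks/dev-test?project=dev-test" username=8c9e3581c03ed1a3f5246242ca007832a675a6ff2a0247f03a596b89c5da4ee3',
    'DEBUG  [2024-12-10T20:47:11+08:00] Handling API request                          ip="10.0.25.24:34338" method=GET protocol=tls url=/1.0/metrics username=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52',
    'DEBUG  [2024-12-10T21:01:20+08:00] Handling API request                          fingerprint=676ee46ad7a0347fc2bc6e65a2aac7f57c487cfccfc199c8f5e67bce92439d51 ip="10.0.25.11:41626" method=PUT protocol=cluster url="/1.0/networks/dev-test?project=dev-test"',
]

for request in network_requests(sample, "dev-test"):
    print(request)
```

In this sample the first match is the client request (protocol=tls) and the second is the notification received from another member (protocol=cluster).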

@tomponline
Member

Thanks for this, @jsimpso.

@tomponline
Member

@jsimpso now please could you get the same lxc monitor --pretty output when using lxc network edit <network> on the same host that the UI was talking to, so we can compare the logs and hopefully see the difference.

@tomponline
Member

We can see the request from the UI here

DEBUG  [2024-12-10T20:47:08+08:00] Update                                        clientType=normal driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] test}" project=dev-test

and the notification to the other member here:

DEBUG  [2024-12-10T21:01:20+08:00] Update                                        clientType=notifier driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] testing}" project=dev-test
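
One way to compare these Update lines against the stored network config is to parse the newNetwork field and list which stored keys it leaves out. The following is a minimal sketch, assuming the map[key:value ...] formatting shown in the log lines, with no spaces inside values. The helper names are hypothetical, and the stored config is copied from the lxc network show output earlier in this thread. Whether any omitted key is related to the nexthop change is not established here.

```python
import re


def parse_new_network(line):
    """Extract the config map from the newNetwork field of an Update log line.

    Assumes the `map[key:value key:value]` layout seen in this thread, where
    keys contain no colon and values contain no spaces.
    """
    match = re.search(r'newNetwork="\{map\[(.*?)\]', line)
    if match is None:
        return {}
    config = {}
    for pair in match.group(1).split():
        key, _, value = pair.partition(":")  # split on the first colon only
        config[key] = value
    return config


def omitted_keys(stored_config, line):
    """Return stored config keys that are absent from the logged update."""
    return sorted(set(stored_config) - set(parse_new_network(line)))


# Config as reported by `lxc network show dev-test` in this thread.
stored = {
    "bridge.mtu": "1442",
    "ipv4.address": "172.16.0.1/28",
    "ipv4.nat": "false",
    "ipv6.address": "none",
    "network": "UPLINK",
    "volatile.network.ipv4.address": "10.0.10.4",
}

update_line = 'DEBUG  [2024-12-10T20:47:08+08:00] Update                                        clientType=normal driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] test}" project=dev-test'

print(omitted_keys(stored, update_line))  # ['volatile.network.ipv4.address']
```

Running the same comparison on an Update line captured during a CLI edit would show whether the two clients log different key sets.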

@tomponline
Member

Were they the same requests, @jsimpso? I see a difference between "test" and "testing" at the end of the two requests.

@jsimpso
Author

jsimpso commented Dec 18, 2024

Were they the same requests, @jsimpso? I see a difference between "test" and "testing" at the end of the two requests.

Those were indeed two different requests. I was switching the value back and forth between "test" and "testing" to trigger the issue.

I'll re-run that and capture the same time period on all three nodes when changing via the UI, and then repeat for changing via the CLI.

@jsimpso
Author

jsimpso commented Dec 18, 2024

Changing network configuration via web UI

Pre-test

  • UI interactions have been limited to host mc-001 (10.0.25.10).
  • Current state of the target network:
    jsimpso@mc-001:~$ lxc network show dev-test --project dev-test
    name: dev-test
    description: testing
    type: ovn
    managed: true
    status: Created
    config:
      bridge.mtu: "1442"
      ipv4.address: 172.16.0.1/28
      ipv4.nat: "false"
      ipv6.address: none
      network: UPLINK
      volatile.network.ipv4.address: 10.0.10.4
    used_by:
    - /1.0/profiles/default?project=dev-test
    locations:
    - mc-001
    - mc-002
    - mc-003
  • Confirmed each node agrees on the next hop address:
     jsimpso@mc-001:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
     "10.0.10.4"
    
     jsimpso@mc-002:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
     "10.0.10.4"
     
     jsimpso@mc-003:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
     "10.0.10.4"

Making the change

  • UI opened at https://10.0.25.10:8443/ui/project/dev-test/network/dev-test/configuration/yaml-configuration
  • Hit "Edit Network"
  • Edited value for description from testing to changed-via-ui
  • Simultaneously on each MicroCloud node, run lxc monitor --pretty
  • Press Save changes in the UI
  • Wait for the success message to pop up
  • Wait another 1-2 seconds to ensure output noise is settled
  • Cancel lxc monitor

Result

Checking the nexthop values after the change shows that the other two nodes (mc-002 and mc-003) report 0.0.0.0 instead of the expected 10.0.10.4:

jsimpso@mc-001:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"

jsimpso@mc-002:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"0.0.0.0"

jsimpso@mc-003:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"0.0.0.0"

Output

mc-001

jsimpso@mc-001:~$ lxc monitor --pretty
DEBUG  [2024-12-18T10:22:43+08:00] Event listener server handler started         id=6b51a3ea-f15b-4c25-aed6-1abc74a90d6b local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2024-12-18T10:22:44+08:00] Handling API request                          ip="10.0.37.136:48824" method=PUT protocol=tls url="/1.0/networks/dev-test?project=dev-test" username=5273a69d19633c2b9aa19daf7e4dc2ee4384c05c7640fa5fe8c7d901e32095b2
DEBUG  [2024-12-18T10:22:44+08:00] Matched trusted cert                          fingerprint=5273a69d19633c2b9aa19daf7e4dc2ee4384c05c7640fa5fe8c7d901e32095b2 subject="CN=jsimpso@nautilus,O=LXD"
DEBUG  [2024-12-18T10:22:44+08:00] Update                                        clientType=normal driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] changed-via-ui}" project=dev-test
DEBUG  [2024-12-18T10:22:44+08:00] Notify node 10.0.25.11:8443 of state changes
DEBUG  [2024-12-18T10:22:44+08:00] Notify node 10.0.25.12:8443 of state changes
DEBUG  [2024-12-18T10:22:44+08:00] Connecting to a remote LXD over HTTPS         url="https://10.0.25.12:8443"
DEBUG  [2024-12-18T10:22:44+08:00] Connecting to a remote LXD over HTTPS         url="https://10.0.25.11:8443"
DEBUG  [2024-12-18T10:22:44+08:00] Sending request to LXD                        etag= method=PUT url="https://10.0.25.11:8443/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-18T10:22:44+08:00] Sending request to LXD                        etag= method=PUT url="https://10.0.25.12:8443/1.0/networks/dev-test?project=dev-test"
INFO   [2024-12-18T10:22:44+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:22:44+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
DEBUG  [2024-12-18T10:22:44+08:00] Setting up network                            driver=ovn network=dev-test project=dev-test
DEBUG  [2024-12-18T10:22:44+08:00] Stable MAC generated                          driver=ovn hwAddr="00:16:3e:1f:44:31" network=dev-test project=dev-test seed=37ec2b324056d390fc3562e98362abee62523bb6d4f99484621c844b90dbd22a.0.4
DEBUG  [2024-12-18T10:22:46+08:00] Matched trusted cert                          fingerprint=676ee46ad7a0347fc2bc6e65a2aac7f57c487cfccfc199c8f5e67bce92439d51 subject="CN=root@mc-002,O=LXD"
DEBUG  [2024-12-18T10:22:46+08:00] Replace current raft nodes                    raftMembers="[{{1 10.0.25.10:8443 voter} mc-001} {{2 10.0.25.11:8443 voter} mc-002} {{3 10.0.25.12:8443 voter} mc-003}]"
INFO   [2024-12-18T10:22:47+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: tls/5273a69d19633c2b9aa19daf7e4dc2ee4384c05c7640fa5fe8c7d901e32095b2 (10.0.37.136:48824)
INFO   [2024-12-18T10:22:47+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: tls/5273a69d19633c2b9aa19daf7e4dc2ee4384c05c7640fa5fe8c7d901e32095b2 (10.0.37.136:48824)
DEBUG  [2024-12-18T10:22:47+08:00] Matched trusted cert                          fingerprint=5273a69d19633c2b9aa19daf7e4dc2ee4384c05c7640fa5fe8c7d901e32095b2 subject="CN=jsimpso@nautilus,O=LXD"
DEBUG  [2024-12-18T10:22:47+08:00] Handling API request                          ip="10.0.37.136:48824" method=GET protocol=tls url="/1.0/networks/dev-test?project=dev-test" username=5273a69d19633c2b9aa19daf7e4dc2ee4384c05c7640fa5fe8c7d901e32095b2
DEBUG  [2024-12-18T10:22:47+08:00] Matched trusted cert                          fingerprint=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52 subject="CN=metrics.local"
DEBUG  [2024-12-18T10:22:47+08:00] Handling API request                          ip="10.0.25.24:45208" method=GET protocol=tls url=/1.0/metrics username=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52
DEBUG  [2024-12-18T10:22:47+08:00] Connecting to a VM agent over a VM socket
DEBUG  [2024-12-18T10:22:47+08:00] Sending request to LXD                        etag= method=GET url="https://custom.socket/1.0"
DEBUG  [2024-12-18T10:22:47+08:00] Got response struct from LXD
DEBUG  [2024-12-18T10:22:47+08:00] Sending request to LXD                        etag= method=GET url="https://custom.socket/1.0/metrics"
DEBUG  [2024-12-18T10:22:47+08:00]
        {
                "config": null,
                "api_extensions": [
                        "storage_zfs_remove_snapshots",
                        "container_host_shutdown_timeout",
                        "container_stop_priority",
                        "container_syscall_filtering",
                        "auth_pki",
                        "container_last_used_at",
                        "etag",
                        "patch",
                        "usb_devices",
                        "https_allowed_credentials",
                        "image_compression_algorithm",
                        "directory_manipulation",
                        "container_cpu_time",
                        "storage_zfs_use_refquota",
                        "storage_lvm_mount_options",
                        "network",
                        "profile_usedby",
                        "container_push",
                        "container_exec_recording",
                        "certificate_update",
                        "container_exec_signal_handling",
                        "gpu_devices",
                        "container_image_properties",
                        "migration_progress",
                        "id_map",
                        "network_firewall_filtering",
                        "network_routes",
                        "storage",
                        "file_delete",
                        "file_append",
                        "network_dhcp_expiry",
                        "storage_lvm_vg_rename",
                        "storage_lvm_thinpool_rename",
                        "network_vlan",
                        "image_create_aliases",
                        "container_stateless_copy",
                        "container_only_migration",
                        "storage_zfs_clone_copy",
                        "unix_device_rename",
                        "storage_lvm_use_thinpool",
                        "storage_rsync_bwlimit",
                        "network_vxlan_interface",
                        "storage_btrfs_mount_options",
                        "entity_description",
                        "image_force_refresh",
                        "storage_lvm_lv_resizing",
                        "id_map_base",
                        "file_symlinks",
                        "container_push_target",
                        "network_vlan_physical",
                        "storage_images_delete",
                        "container_edit_metadata",
                        "container_snapshot_stateful_migration",
                        "storage_driver_ceph",
                        "storage_ceph_user_name",
                        "resource_limits",
                        "storage_volatile_initial_source",
                        "storage_ceph_force_osd_reuse",
                        "storage_block_filesystem_btrfs",
                        "resources",
                        "kernel_limits",
                        "storage_api_volume_rename",
                        "network_sriov",
                        "console",
                        "restrict_devlxd",
                        "migration_pre_copy",
                        "infiniband",
                        "maas_network",
                        "devlxd_events",
                        "proxy",
                        "network_dhcp_gateway",
                        "file_get_symlink",
                        "network_leases",
                        "unix_device_hotplug",
                        "storage_api_local_volume_handling",
                        "operation_description",
                        "clustering",
                        "event_lifecycle",
                        "storage_api_remote_volume_handling",
                        "nvidia_runtime",
                        "container_mount_propagation",
                        "container_backup",
                        "devlxd_images",
                        "container_local_cross_pool_handling",
                        "proxy_unix",
                        "proxy_udp",
                        "clustering_join",
                        "proxy_tcp_udp_multi_port_handling",
                        "network_state",
                        "proxy_unix_dac_properties",
                        "container_protection_delete",
                        "unix_priv_drop",
                        "pprof_http",
                        "proxy_haproxy_protocol",
                        "network_hwaddr",
                        "proxy_nat",
                        "network_nat_order",
                        "container_full",
                        "backup_compression",
                        "nvidia_runtime_config",
                        "storage_api_volume_snapshots",
                        "storage_unmapped",
                        "projects",
                        "network_vxlan_ttl",
                        "container_incremental_copy",
                        "usb_optional_vendorid",
                        "snapshot_scheduling",
                        "snapshot_schedule_aliases",
                        "container_copy_project",
                        "clustering_server_address",
                        "clustering_image_replication",
                        "container_protection_shift",
                        "snapshot_expiry",
                        "container_backup_override_pool",
                        "snapshot_expiry_creation",
                        "network_leases_location",
                        "resources_cpu_socket",
                        "resources_gpu",
                        "resources_numa",
                        "kernel_features",
                        "id_map_current",
                        "event_location",
                        "storage_api_remote_volume_snapshots",
                        "network_nat_address",
                        "container_nic_routes",
                        "cluster_internal_copy",
                        "seccomp_notify",
                        "lxc_features",
                        "container_nic_ipvlan",
                        "network_vlan_sriov",
                        "storage_cephfs",
                        "container_nic_ipfilter",
                        "resources_v2",
                        "container_exec_user_group_cwd",
                        "container_syscall_intercept",
                        "container_disk_shift",
                        "storage_shifted",
                        "resources_infiniband",
                        "daemon_storage",
                        "instances",
                        "image_types",
                        "resources_disk_sata",
                        "clustering_roles",
                        "images_expiry",
                        "resources_network_firmware",
                        "backup_compression_algorithm",
                        "ceph_data_pool_name",
                        "container_syscall_intercept_mount",
                        "compression_squashfs",
                        "container_raw_mount",
                        "container_nic_routed",
                        "container_syscall_intercept_mount_fuse",
                        "container_disk_ceph",
                        "virtual-machines",
                        "image_profiles",
                        "clustering_architecture",
                        "resources_disk_id",
                        "storage_lvm_stripes",
                        "vm_boot_priority",
                        "unix_hotplug_devices",
                        "api_filtering",
                        "instance_nic_network",
                        "clustering_sizing",
                        "firewall_driver",
                        "projects_limits",
                        "container_syscall_intercept_hugetlbfs",
                        "limits_hugepages",
                        "container_nic_routed_gateway",
                        "projects_restrictions",
                        "custom_volume_snapshot_expiry",
                        "volume_snapshot_scheduling",
                        "trust_ca_certificates",
                        "snapshot_disk_usage",
                        "clustering_edit_roles",
                        "container_nic_routed_host_address",
                        "container_nic_ipvlan_gateway",
                        "resources_usb_pci",
                        "resources_cpu_threads_numa",
                        "resources_cpu_core_die",
                        "api_os",
                        "container_nic_routed_host_table",
                        "container_nic_ipvlan_host_table",
                        "container_nic_ipvlan_mode",
                        "resources_system",
                        "images_push_relay",
                        "network_dns_search",
                        "container_nic_routed_limits",
                        "instance_nic_bridged_vlan",
                        "network_state_bond_bridge",
                        "usedby_consistency",
                        "custom_block_volumes",
                        "clustering_failure_domains",
                        "resources_gpu_mdev",
                        "console_vga_type",
                        "projects_limits_disk",
                        "network_type_macvlan",
                        "network_type_sriov",
                        "container_syscall_intercept_bpf_devices",
                        "network_type_ovn",
                        "projects_networks",
                        "projects_networks_restricted_uplinks",
                        "custom_volume_backup",
                        "backup_override_name",
                        "storage_rsync_compression",
                        "network_type_physical",
                        "network_ovn_external_subnets",
                        "network_ovn_nat",
                        "network_ovn_external_routes_remove",
                        "tpm_device_type",
                        "storage_zfs_clone_copy_rebase",
                        "gpu_mdev",
                        "resources_pci_iommu",
                        "resources_network_usb",
                        "resources_disk_address",
                        "network_physical_ovn_ingress_mode",
                        "network_ovn_dhcp",
                        "network_physical_routes_anycast",
                        "projects_limits_instances",
                        "network_state_vlan",
                        "instance_nic_bridged_port_isolation",
                        "instance_bulk_state_change",
                        "network_gvrp",
                        "instance_pool_move",
                        "gpu_sriov",
                        "pci_device_type",
                        "storage_volume_state",
                        "network_acl",
                        "migration_stateful",
                        "disk_state_quota",
                        "storage_ceph_features",
                        "projects_compression",
                        "projects_images_remote_cache_expiry",
                        "certificate_project",
                        "network_ovn_acl",
                        "projects_images_auto_update",
                        "projects_restricted_cluster_target",
                        "images_default_architecture",
                        "network_ovn_acl_defaults",
                        "gpu_mig",
                        "project_usage",
                        "network_bridge_acl",
                        "warnings",
                        "projects_restricted_backups_and_snapshots",
                        "clustering_join_token",
                        "clustering_description",
                        "server_trusted_proxy",
                        "clustering_update_cert",
                        "storage_api_project",
                        "server_instance_driver_operational",
                        "server_supported_storage_drivers",
                        "event_lifecycle_requestor_address",
                        "resources_gpu_usb",
                        "clustering_evacuation",
                        "network_ovn_nat_address",
                        "network_bgp",
                        "network_forward",
                        "custom_volume_refresh",
                        "network_counters_errors_dropped",
                        "metrics",
                        "image_source_project",
                        "clustering_config",
                        "network_peer",
                        "linux_sysctl",
                        "network_dns",
                        "ovn_nic_acceleration",
                        "certificate_self_renewal",
                        "instance_project_move",
                        "storage_volume_project_move",
                        "cloud_init",
                        "network_dns_nat",
                        "database_leader",
                        "instance_all_projects",
                        "clustering_groups",
                        "ceph_rbd_du",
                        "instance_get_full",
                        "qemu_metrics",
                        "gpu_mig_uuid",
                        "event_project",
                        "clustering_evacuation_live",
                        "instance_allow_inconsistent_copy",
                        "network_state_ovn",
                        "storage_volume_api_filtering",
                        "image_restrictions",
                        "storage_zfs_export",
                        "network_dns_records",
                        "storage_zfs_reserve_space",
                        "network_acl_log",
                        "storage_zfs_blocksize",
                        "metrics_cpu_seconds",
                        "instance_snapshot_never",
                        "certificate_token",
                        "instance_nic_routed_neighbor_probe",
                        "event_hub",
                        "agent_nic_config",
                        "projects_restricted_intercept",
                        "metrics_authentication",
                        "images_target_project",
                        "cluster_migration_inconsistent_copy",
                        "cluster_ovn_chassis",
                        "container_syscall_intercept_sched_setscheduler",
                        "storage_lvm_thinpool_metadata_size",
                        "storage_volume_state_total",
                        "instance_file_head",
                        "instances_nic_host_name",
                        "image_copy_profile",
                        "container_syscall_intercept_sysinfo",
                        "clustering_evacuation_mode",
                        "resources_pci_vpd",
                        "qemu_raw_conf",
                        "storage_cephfs_fscache",
                        "network_load_balancer",
                        "vsock_api",
                        "instance_ready_state",
                        "network_bgp_holdtime",
                        "storage_volumes_all_projects",
                        "metrics_memory_oom_total",
                        "storage_buckets",
                        "storage_buckets_create_credentials",
                        "metrics_cpu_effective_total",
                        "projects_networks_restricted_access",
                        "storage_buckets_local",
                        "loki",
                        "acme",
                        "internal_metrics",
                        "cluster_join_token_expiry",
                        "remote_token_expiry",
                        "init_preseed",
                        "storage_volumes_created_at",
                        "cpu_hotplug",
                        "projects_networks_zones",
                        "network_txqueuelen",
                        "cluster_member_state",
                        "instances_placement_scriptlet",
                        "storage_pool_source_wipe",
                        "zfs_block_mode",
                        "instance_generation_id",
                        "disk_io_cache",
                        "amd_sev",
                        "storage_pool_loop_resize",
                        "migration_vm_live",
                        "ovn_nic_nesting",
                        "oidc",
                        "network_ovn_l3only",
                        "ovn_nic_acceleration_vdpa",
                        "cluster_healing",
                        "instances_state_total",
                        "auth_user",
                        "security_csm",
                        "instances_rebuild",
                        "numa_cpu_placement",
                        "custom_volume_iso",
                        "network_allocations",
                        "storage_api_remote_volume_snapshot_copy",
                        "zfs_delegate",
                        "operations_get_query_all_projects",
                        "metadata_configuration",
                        "syslog_socket",
                        "event_lifecycle_name_and_project",
                        "instances_nic_limits_priority",
                        "disk_initial_volume_configuration",
                        "operation_wait",
                        "cluster_internal_custom_volume_copy",
                        "disk_io_bus",
                        "storage_cephfs_create_missing",
                        "instance_move_config",
                        "ovn_ssl_config",
                        "init_preseed_storage_volumes",
                        "metrics_instances_count",
                        "server_instance_type_info",
                        "resources_disk_mounted",
                        "server_version_lts",
                        "oidc_groups_claim",
                        "loki_config_instance",
                        "storage_volatile_uuid",
                        "import_instance_devices",
                        "instances_uefi_vars",
                        "instances_migration_stateful",
                        "container_syscall_filtering_allow_deny_syntax",
                        "access_management",
                        "vm_disk_io_limits",
                        "storage_volumes_all",
                        "instances_files_modify_permissions",
                        "image_restriction_nesting",
                        "container_syscall_intercept_finit_module",
                        "device_usb_serial",
                        "network_allocate_external_ips",
                        "explicit_trust_token"
                ],
                "api_status": "stable",
                "api_version": "1.0",
                "auth": "trusted",
                "public": false,
                "auth_methods": [
                        "tls"
                ],
                "auth_user_name": "",
                "auth_user_method": "",
                "environment": {
                        "addresses": null,
                        "architectures": null,
                        "certificate": "",
                        "certificate_fingerprint": "",
                        "driver": "",
                        "driver_version": "",
                        "instance_types": null,
                        "firewall": "",
                        "kernel": "Linux",
                        "kernel_architecture": "x86_64",
                        "kernel_features": null,
                        "kernel_version": "5.15.0-1069-kvm",
                        "lxc_features": null,
                        "os_name": "",
                        "os_version": "",
                        "project": "",
                        "server": "lxd-agent",
                        "server_clustered": false,
                        "server_event_mode": "",
                        "server_name": "juju-32f1c2-0",
                        "server_pid": 369,
                        "server_version": "5.21.2",
                        "server_lts": false,
                        "storage": "",
                        "storage_version": "",
                        "storage_supported_drivers": null
                }
        }
^C

mc-002

jsimpso@mc-002:~$ lxc monitor --pretty
DEBUG  [2024-12-18T10:22:43+08:00] Event listener server handler started         id=ada1f916-366b-4d4d-9424-9f29313c56ec local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2024-12-18T10:22:44+08:00] Handling API request                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf ip="10.0.25.10:38612" method=PUT protocol=cluster url="/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-18T10:22:44+08:00] Matched trusted cert                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf subject="CN=root@mc-001,O=LXD"
DEBUG  [2024-12-18T10:22:44+08:00] Update                                        clientType=notifier driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] changed-via-ui}" project=dev-test
INFO   [2024-12-18T10:22:44+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:22:44+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:22:44+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
DEBUG  [2024-12-18T10:22:46+08:00] Starting heartbeat round                      local="10.0.25.11:8443" mode=normal
DEBUG  [2024-12-18T10:22:46+08:00] Heartbeat updating local raft members         members="[{{1 10.0.25.10:8443 voter} mc-001} {{2 10.0.25.11:8443 voter} mc-002} {{3 10.0.25.12:8443 voter} mc-003}]"
DEBUG  [2024-12-18T10:22:46+08:00] Sending heartbeat request                     address="10.0.25.10:8443"
DEBUG  [2024-12-18T10:22:46+08:00] Successful heartbeat                          remote="10.0.25.10:8443"
INFO   [2024-12-18T10:22:47+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: tls/5273a69d19633c2b9aa19daf7e4dc2ee4384c05c7640fa5fe8c7d901e32095b2 (10.0.37.136:48824)
DEBUG  [2024-12-18T10:22:50+08:00] Sending heartbeat request                     address="10.0.25.12:8443"
DEBUG  [2024-12-18T10:22:50+08:00] Successful heartbeat                          remote="10.0.25.12:8443"
DEBUG  [2024-12-18T10:22:50+08:00] Rebalancing member roles in heartbeat         local="10.0.25.11:8443"
DEBUG  [2024-12-18T10:22:50+08:00] Completed heartbeat round                     duration=4.302598901s local="10.0.25.11:8443"
^C
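
The monitor output is noisy: heartbeats, metrics scrapes and the VM agent's full `/1.0` response bury the few entries that concern the network update. Below is a minimal Python sketch for pulling out only the entries that mention one network. It assumes the `lxc monitor --pretty` line layout seen in these logs (`LEVEL  [timestamp] message key=value ...`). `filter_network_events` is a hypothetical helper name, not part of LXD, and the level names other than `DEBUG` and `INFO` are guesses.

```python
import re

# Start of a pretty-printed monitor entry, e.g.
# "INFO   [2024-12-18T10:22:44+08:00] Action: network-updated, ..."
# Assumption: levels other than DEBUG/INFO are spelled WARN*/ERROR.
ENTRY_RE = re.compile(r"^(DEBUG|INFO|WARN\w*|ERROR)\s+\[([^\]]+)\]\s*(.*)$")


def filter_network_events(lines, network):
    """Yield (level, timestamp, rest) for entries that mention `network`.

    Lines that do not start a log entry (such as the multi-line JSON dump)
    are skipped. Matching is a plain substring test, so a network named
    "dev-test" also matches "dev-test2".
    """
    needles = (f"/1.0/networks/{network}", f"network={network}")
    for line in lines:
        m = ENTRY_RE.match(line.rstrip("\n"))
        if not m:
            continue
        level, ts, rest = m.groups()
        if any(n in rest for n in needles):
            yield level, ts, rest


# Two lines copied from the mc-002 output above.
SAMPLE = [
    "INFO   [2024-12-18T10:22:44+08:00] Action: network-updated, "
    "Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()",
    "DEBUG  [2024-12-18T10:22:46+08:00] Sending heartbeat request"
    '                     address="10.0.25.10:8443"',
]

for level, ts, rest in filter_network_events(SAMPLE, "dev-test"):
    print(level, ts, rest)
```

To use it on live output, the same function could be fed `sys.stdin` with `lxc monitor --pretty` piped in. Running it on each cluster member makes it easier to compare which nodes saw the `PUT` as a `notifier` and which emitted `network-updated`.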

mc-003

jsimpso@mc-003:~$ lxc monitor --pretty
DEBUG  [2024-12-18T10:22:43+08:00] Event listener server handler started         id=973b0266-9a62-41ba-bf51-35738a0909b7 local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2024-12-18T10:22:44+08:00] Handling API request                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf ip="10.0.25.10:39326" method=PUT protocol=cluster url="/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-18T10:22:44+08:00] Matched trusted cert                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf subject="CN=root@mc-001,O=LXD"
DEBUG  [2024-12-18T10:22:44+08:00] Update                                        clientType=notifier driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] changed-via-ui}" project=dev-test
DEBUG  [2024-12-18T10:22:48+08:00] Handling API request                          ip="10.0.25.24:41736" method=GET protocol=tls url=/1.0/metrics username=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52
DEBUG  [2024-12-18T10:22:48+08:00] Matched trusted cert                          fingerprint=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52 subject="CN=metrics.local"
DEBUG  [2024-12-18T10:22:48+08:00] Sending request to LXD                        etag= method=GET url="https://custom.socket/1.0"
DEBUG  [2024-12-18T10:22:48+08:00] Connecting to a VM agent over a VM socket
DEBUG  [2024-12-18T10:22:48+08:00] Got response struct from LXD
DEBUG  [2024-12-18T10:22:48+08:00] Sending request to LXD                        etag= method=GET url="https://custom.socket/1.0/metrics"
DEBUG  [2024-12-18T10:22:48+08:00]
        {
                "config": null,
                "api_extensions": [
                        "storage_zfs_remove_snapshots",
                        "container_host_shutdown_timeout",
                        "container_stop_priority",
                        "container_syscall_filtering",
                        "auth_pki",
                        "container_last_used_at",
                        "etag",
                        "patch",
                        "usb_devices",
                        "https_allowed_credentials",
                        "image_compression_algorithm",
                        "directory_manipulation",
                        "container_cpu_time",
                        "storage_zfs_use_refquota",
                        "storage_lvm_mount_options",
                        "network",
                        "profile_usedby",
                        "container_push",
                        "container_exec_recording",
                        "certificate_update",
                        "container_exec_signal_handling",
                        "gpu_devices",
                        "container_image_properties",
                        "migration_progress",
                        "id_map",
                        "network_firewall_filtering",
                        "network_routes",
                        "storage",
                        "file_delete",
                        "file_append",
                        "network_dhcp_expiry",
                        "storage_lvm_vg_rename",
                        "storage_lvm_thinpool_rename",
                        "network_vlan",
                        "image_create_aliases",
                        "container_stateless_copy",
                        "container_only_migration",
                        "storage_zfs_clone_copy",
                        "unix_device_rename",
                        "storage_lvm_use_thinpool",
                        "storage_rsync_bwlimit",
                        "network_vxlan_interface",
                        "storage_btrfs_mount_options",
                        "entity_description",
                        "image_force_refresh",
                        "storage_lvm_lv_resizing",
                        "id_map_base",
                        "file_symlinks",
                        "container_push_target",
                        "network_vlan_physical",
                        "storage_images_delete",
                        "container_edit_metadata",
                        "container_snapshot_stateful_migration",
                        "storage_driver_ceph",
                        "storage_ceph_user_name",
                        "resource_limits",
                        "storage_volatile_initial_source",
                        "storage_ceph_force_osd_reuse",
                        "storage_block_filesystem_btrfs",
                        "resources",
                        "kernel_limits",
                        "storage_api_volume_rename",
                        "network_sriov",
                        "console",
                        "restrict_devlxd",
                        "migration_pre_copy",
                        "infiniband",
                        "maas_network",
                        "devlxd_events",
                        "proxy",
                        "network_dhcp_gateway",
                        "file_get_symlink",
                        "network_leases",
                        "unix_device_hotplug",
                        "storage_api_local_volume_handling",
                        "operation_description",
                        "clustering",
                        "event_lifecycle",
                        "storage_api_remote_volume_handling",
                        "nvidia_runtime",
                        "container_mount_propagation",
                        "container_backup",
                        "devlxd_images",
                        "container_local_cross_pool_handling",
                        "proxy_unix",
                        "proxy_udp",
                        "clustering_join",
                        "proxy_tcp_udp_multi_port_handling",
                        "network_state",
                        "proxy_unix_dac_properties",
                        "container_protection_delete",
                        "unix_priv_drop",
                        "pprof_http",
                        "proxy_haproxy_protocol",
                        "network_hwaddr",
                        "proxy_nat",
                        "network_nat_order",
                        "container_full",
                        "backup_compression",
                        "nvidia_runtime_config",
                        "storage_api_volume_snapshots",
                        "storage_unmapped",
                        "projects",
                        "network_vxlan_ttl",
                        "container_incremental_copy",
                        "usb_optional_vendorid",
                        "snapshot_scheduling",
                        "snapshot_schedule_aliases",
                        "container_copy_project",
                        "clustering_server_address",
                        "clustering_image_replication",
                        "container_protection_shift",
                        "snapshot_expiry",
                        "container_backup_override_pool",
                        "snapshot_expiry_creation",
                        "network_leases_location",
                        "resources_cpu_socket",
                        "resources_gpu",
                        "resources_numa",
                        "kernel_features",
                        "id_map_current",
                        "event_location",
                        "storage_api_remote_volume_snapshots",
                        "network_nat_address",
                        "container_nic_routes",
                        "cluster_internal_copy",
                        "seccomp_notify",
                        "lxc_features",
                        "container_nic_ipvlan",
                        "network_vlan_sriov",
                        "storage_cephfs",
                        "container_nic_ipfilter",
                        "resources_v2",
                        "container_exec_user_group_cwd",
                        "container_syscall_intercept",
                        "container_disk_shift",
                        "storage_shifted",
                        "resources_infiniband",
                        "daemon_storage",
                        "instances",
                        "image_types",
                        "resources_disk_sata",
                        "clustering_roles",
                        "images_expiry",
                        "resources_network_firmware",
                        "backup_compression_algorithm",
                        "ceph_data_pool_name",
                        "container_syscall_intercept_mount",
                        "compression_squashfs",
                        "container_raw_mount",
                        "container_nic_routed",
                        "container_syscall_intercept_mount_fuse",
                        "container_disk_ceph",
                        "virtual-machines",
                        "image_profiles",
                        "clustering_architecture",
                        "resources_disk_id",
                        "storage_lvm_stripes",
                        "vm_boot_priority",
                        "unix_hotplug_devices",
                        "api_filtering",
                        "instance_nic_network",
                        "clustering_sizing",
                        "firewall_driver",
                        "projects_limits",
                        "container_syscall_intercept_hugetlbfs",
                        "limits_hugepages",
                        "container_nic_routed_gateway",
                        "projects_restrictions",
                        "custom_volume_snapshot_expiry",
                        "volume_snapshot_scheduling",
                        "trust_ca_certificates",
                        "snapshot_disk_usage",
                        "clustering_edit_roles",
                        "container_nic_routed_host_address",
                        "container_nic_ipvlan_gateway",
                        "resources_usb_pci",
                        "resources_cpu_threads_numa",
                        "resources_cpu_core_die",
                        "api_os",
                        "container_nic_routed_host_table",
                        "container_nic_ipvlan_host_table",
                        "container_nic_ipvlan_mode",
                        "resources_system",
                        "images_push_relay",
                        "network_dns_search",
                        "container_nic_routed_limits",
                        "instance_nic_bridged_vlan",
                        "network_state_bond_bridge",
                        "usedby_consistency",
                        "custom_block_volumes",
                        "clustering_failure_domains",
                        "resources_gpu_mdev",
                        "console_vga_type",
                        "projects_limits_disk",
                        "network_type_macvlan",
                        "network_type_sriov",
                        "container_syscall_intercept_bpf_devices",
                        "network_type_ovn",
                        "projects_networks",
                        "projects_networks_restricted_uplinks",
                        "custom_volume_backup",
                        "backup_override_name",
                        "storage_rsync_compression",
                        "network_type_physical",
                        "network_ovn_external_subnets",
                        "network_ovn_nat",
                        "network_ovn_external_routes_remove",
                        "tpm_device_type",
                        "storage_zfs_clone_copy_rebase",
                        "gpu_mdev",
                        "resources_pci_iommu",
                        "resources_network_usb",
                        "resources_disk_address",
                        "network_physical_ovn_ingress_mode",
                        "network_ovn_dhcp",
                        "network_physical_routes_anycast",
                        "projects_limits_instances",
                        "network_state_vlan",
                        "instance_nic_bridged_port_isolation",
                        "instance_bulk_state_change",
                        "network_gvrp",
                        "instance_pool_move",
                        "gpu_sriov",
                        "pci_device_type",
                        "storage_volume_state",
                        "network_acl",
                        "migration_stateful",
                        "disk_state_quota",
                        "storage_ceph_features",
                        "projects_compression",
                        "projects_images_remote_cache_expiry",
                        "certificate_project",
                        "network_ovn_acl",
                        "projects_images_auto_update",
                        "projects_restricted_cluster_target",
                        "images_default_architecture",
                        "network_ovn_acl_defaults",
                        "gpu_mig",
                        "project_usage",
                        "network_bridge_acl",
                        "warnings",
                        "projects_restricted_backups_and_snapshots",
                        "clustering_join_token",
                        "clustering_description",
                        "server_trusted_proxy",
                        "clustering_update_cert",
                        "storage_api_project",
                        "server_instance_driver_operational",
                        "server_supported_storage_drivers",
                        "event_lifecycle_requestor_address",
                        "resources_gpu_usb",
                        "clustering_evacuation",
                        "network_ovn_nat_address",
                        "network_bgp",
                        "network_forward",
                        "custom_volume_refresh",
                        "network_counters_errors_dropped",
                        "metrics",
                        "image_source_project",
                        "clustering_config",
                        "network_peer",
                        "linux_sysctl",
                        "network_dns",
                        "ovn_nic_acceleration",
                        "certificate_self_renewal",
                        "instance_project_move",
                        "storage_volume_project_move",
                        "cloud_init",
                        "network_dns_nat",
                        "database_leader",
                        "instance_all_projects",
                        "clustering_groups",
                        "ceph_rbd_du",
                        "instance_get_full",
                        "qemu_metrics",
                        "gpu_mig_uuid",
                        "event_project",
                        "clustering_evacuation_live",
                        "instance_allow_inconsistent_copy",
                        "network_state_ovn",
                        "storage_volume_api_filtering",
                        "image_restrictions",
                        "storage_zfs_export",
                        "network_dns_records",
                        "storage_zfs_reserve_space",
                        "network_acl_log",
                        "storage_zfs_blocksize",
                        "metrics_cpu_seconds",
                        "instance_snapshot_never",
                        "certificate_token",
                        "instance_nic_routed_neighbor_probe",
                        "event_hub",
                        "agent_nic_config",
                        "projects_restricted_intercept",
                        "metrics_authentication",
                        "images_target_project",
                        "cluster_migration_inconsistent_copy",
                        "cluster_ovn_chassis",
                        "container_syscall_intercept_sched_setscheduler",
                        "storage_lvm_thinpool_metadata_size",
                        "storage_volume_state_total",
                        "instance_file_head",
                        "instances_nic_host_name",
                        "image_copy_profile",
                        "container_syscall_intercept_sysinfo",
                        "clustering_evacuation_mode",
                        "resources_pci_vpd",
                        "qemu_raw_conf",
                        "storage_cephfs_fscache",
                        "network_load_balancer",
                        "vsock_api",
                        "instance_ready_state",
                        "network_bgp_holdtime",
                        "storage_volumes_all_projects",
                        "metrics_memory_oom_total",
                        "storage_buckets",
                        "storage_buckets_create_credentials",
                        "metrics_cpu_effective_total",
                        "projects_networks_restricted_access",
                        "storage_buckets_local",
                        "loki",
                        "acme",
                        "internal_metrics",
                        "cluster_join_token_expiry",
                        "remote_token_expiry",
                        "init_preseed",
                        "storage_volumes_created_at",
                        "cpu_hotplug",
                        "projects_networks_zones",
                        "network_txqueuelen",
                        "cluster_member_state",
                        "instances_placement_scriptlet",
                        "storage_pool_source_wipe",
                        "zfs_block_mode",
                        "instance_generation_id",
                        "disk_io_cache",
                        "amd_sev",
                        "storage_pool_loop_resize",
                        "migration_vm_live",
                        "ovn_nic_nesting",
                        "oidc",
                        "network_ovn_l3only",
                        "ovn_nic_acceleration_vdpa",
                        "cluster_healing",
                        "instances_state_total",
                        "auth_user",
                        "security_csm",
                        "instances_rebuild",
                        "numa_cpu_placement",
                        "custom_volume_iso",
                        "network_allocations",
                        "storage_api_remote_volume_snapshot_copy",
                        "zfs_delegate",
                        "operations_get_query_all_projects",
                        "metadata_configuration",
                        "syslog_socket",
                        "event_lifecycle_name_and_project",
                        "instances_nic_limits_priority",
                        "disk_initial_volume_configuration",
                        "operation_wait",
                        "cluster_internal_custom_volume_copy",
                        "disk_io_bus",
                        "storage_cephfs_create_missing",
                        "instance_move_config",
                        "ovn_ssl_config",
                        "init_preseed_storage_volumes",
                        "metrics_instances_count",
                        "server_instance_type_info",
                        "resources_disk_mounted",
                        "server_version_lts",
                        "oidc_groups_claim",
                        "loki_config_instance",
                        "storage_volatile_uuid",
                        "import_instance_devices",
                        "instances_uefi_vars",
                        "instances_migration_stateful",
                        "container_syscall_filtering_allow_deny_syntax",
                        "access_management",
                        "vm_disk_io_limits",
                        "storage_volumes_all",
                        "instances_files_modify_permissions",
                        "image_restriction_nesting",
                        "container_syscall_intercept_finit_module",
                        "device_usb_serial",
                        "network_allocate_external_ips",
                        "explicit_trust_token"
                ],
                "api_status": "stable",
                "api_version": "1.0",
                "auth": "trusted",
                "public": false,
                "auth_methods": [
                        "tls"
                ],
                "auth_user_name": "",
                "auth_user_method": "",
                "environment": {
                        "addresses": null,
                        "architectures": null,
                        "certificate": "",
                        "certificate_fingerprint": "",
                        "driver": "",
                        "driver_version": "",
                        "instance_types": null,
                        "firewall": "",
                        "kernel": "Linux",
                        "kernel_architecture": "x86_64",
                        "kernel_features": null,
                        "kernel_version": "6.8.0-49-generic",
                        "lxc_features": null,
                        "os_name": "",
                        "os_version": "",
                        "project": "",
                        "server": "lxd-agent",
                        "server_clustered": false,
                        "server_event_mode": "",
                        "server_name": "u1",
                        "server_pid": 453,
                        "server_version": "5.21.2",
                        "server_lts": false,
                        "storage": "",
                        "storage_version": "",
                        "storage_supported_drivers": null
                }
        }
DEBUG  [2024-12-18T10:22:50+08:00] Matched trusted cert                          fingerprint=676ee46ad7a0347fc2bc6e65a2aac7f57c487cfccfc199c8f5e67bce92439d51 subject="CN=root@mc-002,O=LXD"
DEBUG  [2024-12-18T10:22:50+08:00] Replace current raft nodes                    raftMembers="[{{3 10.0.25.12:8443 voter} mc-003} {{1 10.0.25.10:8443 voter} mc-001} {{2 10.0.25.11:8443 voter} mc-002}]"
^C

@jsimpso
Author

jsimpso commented Dec 18, 2024

Changing network configuration via CLI

Pre-test

  • CLI interactions have been limited to host mc-001 (10.0.25.10).
  • Current state of the target network:
    jsimpso@mc-001:~$ lxc network show dev-test --project dev-test
    name: dev-test
    description: changed-via-ui
    type: ovn
    managed: true
    status: Created
    config:
      bridge.mtu: "1442"
      ipv4.address: 172.16.0.1/28
      ipv4.nat: "false"
      ipv6.address: none
      network: UPLINK
      volatile.network.ipv4.address: 10.0.10.4
    used_by:
    - /1.0/profiles/default?project=dev-test
    locations:
    - mc-001
    - mc-002
    - mc-003
  • Confirmed each node agrees on the next hop address (LXD snap has been restarted since previous test):
     jsimpso@mc-001:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
     "10.0.10.4"
    
     jsimpso@mc-002:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
     "10.0.10.4"
     
     jsimpso@mc-003:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
     "10.0.10.4"

Making the change

  • In a second terminal connected to mc-001, run lxc network edit --project dev-test dev-test
  • Change the value of description from changed-via-ui to changed-via-cli
  • Simultaneously on each MicroCloud node, run lxc monitor --pretty
  • Save the network changes
  • Wait a few seconds for the monitor output to settle
  • Cancel lxc monitor

Result

Checking the nexthop values after the change shows that all nodes are still in agreement:

jsimpso@mc-001:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"

jsimpso@mc-002:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"

jsimpso@mc-003:~$ lxc query /internal/testing/bgp | jq '.prefixes.[] | select(.prefix=="172.16.0.0/28") | .nexthop'
"10.0.10.4"
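
For anyone repeating this check, the per-node comparison can be sketched in Python. This is a minimal sketch, not part of LXD: it assumes the JSON returned by /internal/testing/bgp has a top-level `prefixes` list whose entries carry `prefix` and `nexthop` fields (which is all the jq filter above relies on), and the function names and sample data are hypothetical.

```python
import json


def nexthops_for_prefix(bgp_states, prefix):
    """Map each node name to the nexthop it reports for `prefix` (None if absent).

    `bgp_states` maps a node name to the decoded JSON from that node.
    Only the `prefixes[].prefix` and `prefixes[].nexthop` fields are assumed.
    """
    result = {}
    for node, state in bgp_states.items():
        matches = [
            entry.get("nexthop")
            for entry in state.get("prefixes", [])
            if entry.get("prefix") == prefix
        ]
        result[node] = matches[0] if matches else None
    return result


def nodes_agree(nexthops):
    """True when every node reports the same, non-empty nexthop."""
    values = set(nexthops.values())
    return len(values) == 1 and None not in values


# Sample data mirroring the values shown above (one JSON document per node).
raw = '{"prefixes": [{"prefix": "172.16.0.0/28", "nexthop": "10.0.10.4"}]}'
sample = {name: json.loads(raw) for name in ("mc-001", "mc-002", "mc-003")}

hops = nexthops_for_prefix(sample, "172.16.0.0/28")
print(hops, nodes_agree(hops))
```

In practice the per-node JSON would come from running the lxc query command on each cluster member; the sample above only stands in for that output.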

Output

mc-001

jsimpso@mc-001:~$ lxc monitor --pretty
DEBUG  [2024-12-18T10:40:33+08:00] Event listener server handler started         id=7275977c-610c-4112-8427-f8a0e157da1e local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2024-12-18T10:40:34+08:00] Handling API request                          ip=@ method=PUT protocol=unix url="/1.0/networks/dev-test?project=dev-test" username=jsimpso
DEBUG  [2024-12-18T10:40:34+08:00] Update                                        clientType=normal driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK volatile.network.ipv4.address:10.0.10.4] changed-via-cli}" project=dev-test
DEBUG  [2024-12-18T10:40:34+08:00] Connecting to a remote LXD over HTTPS         url="https://10.0.25.12:8443"
DEBUG  [2024-12-18T10:40:34+08:00] Notify node 10.0.25.12:8443 of state changes
DEBUG  [2024-12-18T10:40:34+08:00] Connecting to a remote LXD over HTTPS         url="https://10.0.25.11:8443"
DEBUG  [2024-12-18T10:40:34+08:00] Notify node 10.0.25.11:8443 of state changes
DEBUG  [2024-12-18T10:40:34+08:00] Sending request to LXD                        etag= method=PUT url="https://10.0.25.12:8443/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-18T10:40:34+08:00] Sending request to LXD                        etag= method=PUT url="https://10.0.25.11:8443/1.0/networks/dev-test?project=dev-test"
INFO   [2024-12-18T10:40:34+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:40:35+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:40:35+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: unix/jsimpso (@)
INFO   [2024-12-18T10:40:35+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: unix/jsimpso (@)
DEBUG  [2024-12-18T10:40:37+08:00] Matched trusted cert                          fingerprint=676ee46ad7a0347fc2bc6e65a2aac7f57c487cfccfc199c8f5e67bce92439d51 subject="CN=root@mc-002,O=LXD"
DEBUG  [2024-12-18T10:40:37+08:00] Replace current raft nodes                    raftMembers="[{{3 10.0.25.12:8443 voter} mc-003} {{1 10.0.25.10:8443 voter} mc-001} {{2 10.0.25.11:8443 voter} mc-002}]"
^C

mc-002

jsimpso@mc-002:~$ lxc monitor --pretty
DEBUG  [2024-12-18T10:40:33+08:00] Event listener server handler started         id=a3449fbd-29a7-4d96-a0a2-fe10ca05c5ad local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2024-12-18T10:40:34+08:00] Handling API request                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf ip="10.0.25.10:57312" method=PUT protocol=cluster url="/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-18T10:40:34+08:00] Matched trusted cert                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf subject="CN=root@mc-001,O=LXD"
DEBUG  [2024-12-18T10:40:34+08:00] Update                                        clientType=notifier driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK volatile.network.ipv4.address:10.0.10.4] changed-via-cli}" project=dev-test
INFO   [2024-12-18T10:40:34+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:40:34+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:40:35+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: / ()
INFO   [2024-12-18T10:40:35+08:00] Action: network-updated, Source: /1.0/networks/dev-test?project=dev-test, Requestor: unix/jsimpso (@)
DEBUG  [2024-12-18T10:40:36+08:00] Heartbeat updating local raft members         members="[{{1 10.0.25.10:8443 voter} mc-001} {{2 10.0.25.11:8443 voter} mc-002} {{3 10.0.25.12:8443 voter} mc-003}]"
DEBUG  [2024-12-18T10:40:36+08:00] Starting heartbeat round                      local="10.0.25.11:8443" mode=normal
DEBUG  [2024-12-18T10:40:37+08:00] Sending heartbeat request                     address="10.0.25.10:8443"
DEBUG  [2024-12-18T10:40:37+08:00] Successful heartbeat                          remote="10.0.25.10:8443"
^C

mc-003

jsimpso@mc-003:~$ lxc monitor --pretty
DEBUG  [2024-12-18T10:40:33+08:00] Event listener server handler started         id=a9403c3a-4dcb-42ff-bc0c-4af06b099a33 local=/var/snap/lxd/common/lxd/unix.socket remote=@
DEBUG  [2024-12-18T10:40:33+08:00] Handling API request                          ip="10.0.25.24:34834" method=GET protocol=tls url=/1.0/metrics username=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52
DEBUG  [2024-12-18T10:40:33+08:00] Matched trusted cert                          fingerprint=efa460eca20a32479bbb6b3d415d6e9340196ea717e93213a784686c82430c52 subject="CN=metrics.local"
DEBUG  [2024-12-18T10:40:33+08:00] Connecting to a VM agent over a VM socket
DEBUG  [2024-12-18T10:40:33+08:00] Sending request to LXD                        etag= method=GET url="https://custom.socket/1.0"
DEBUG  [2024-12-18T10:40:33+08:00] Got response struct from LXD
DEBUG  [2024-12-18T10:40:33+08:00]
        {
                "config": null,
                "api_extensions": [
                        "storage_zfs_remove_snapshots",
                        "container_host_shutdown_timeout",
                        "container_stop_priority",
                        "container_syscall_filtering",
                        "auth_pki",
                        "container_last_used_at",
                        "etag",
                        "patch",
                        "usb_devices",
                        "https_allowed_credentials",
                        "image_compression_algorithm",
                        "directory_manipulation",
                        "container_cpu_time",
                        "storage_zfs_use_refquota",
                        "storage_lvm_mount_options",
                        "network",
                        "profile_usedby",
                        "container_push",
                        "container_exec_recording",
                        "certificate_update",
                        "container_exec_signal_handling",
                        "gpu_devices",
                        "container_image_properties",
                        "migration_progress",
                        "id_map",
                        "network_firewall_filtering",
                        "network_routes",
                        "storage",
                        "file_delete",
                        "file_append",
                        "network_dhcp_expiry",
                        "storage_lvm_vg_rename",
                        "storage_lvm_thinpool_rename",
                        "network_vlan",
                        "image_create_aliases",
                        "container_stateless_copy",
                        "container_only_migration",
                        "storage_zfs_clone_copy",
                        "unix_device_rename",
                        "storage_lvm_use_thinpool",
                        "storage_rsync_bwlimit",
                        "network_vxlan_interface",
                        "storage_btrfs_mount_options",
                        "entity_description",
                        "image_force_refresh",
                        "storage_lvm_lv_resizing",
                        "id_map_base",
                        "file_symlinks",
                        "container_push_target",
                        "network_vlan_physical",
                        "storage_images_delete",
                        "container_edit_metadata",
                        "container_snapshot_stateful_migration",
                        "storage_driver_ceph",
                        "storage_ceph_user_name",
                        "resource_limits",
                        "storage_volatile_initial_source",
                        "storage_ceph_force_osd_reuse",
                        "storage_block_filesystem_btrfs",
                        "resources",
                        "kernel_limits",
                        "storage_api_volume_rename",
                        "network_sriov",
                        "console",
                        "restrict_devlxd",
                        "migration_pre_copy",
                        "infiniband",
                        "maas_network",
                        "devlxd_events",
                        "proxy",
                        "network_dhcp_gateway",
                        "file_get_symlink",
                        "network_leases",
                        "unix_device_hotplug",
                        "storage_api_local_volume_handling",
                        "operation_description",
                        "clustering",
                        "event_lifecycle",
                        "storage_api_remote_volume_handling",
                        "nvidia_runtime",
                        "container_mount_propagation",
                        "container_backup",
                        "devlxd_images",
                        "container_local_cross_pool_handling",
                        "proxy_unix",
                        "proxy_udp",
                        "clustering_join",
                        "proxy_tcp_udp_multi_port_handling",
                        "network_state",
                        "proxy_unix_dac_properties",
                        "container_protection_delete",
                        "unix_priv_drop",
                        "pprof_http",
                        "proxy_haproxy_protocol",
                        "network_hwaddr",
                        "proxy_nat",
                        "network_nat_order",
                        "container_full",
                        "backup_compression",
                        "nvidia_runtime_config",
                        "storage_api_volume_snapshots",
                        "storage_unmapped",
                        "projects",
                        "network_vxlan_ttl",
                        "container_incremental_copy",
                        "usb_optional_vendorid",
                        "snapshot_scheduling",
                        "snapshot_schedule_aliases",
                        "container_copy_project",
                        "clustering_server_address",
                        "clustering_image_replication",
                        "container_protection_shift",
                        "snapshot_expiry",
                        "container_backup_override_pool",
                        "snapshot_expiry_creation",
                        "network_leases_location",
                        "resources_cpu_socket",
                        "resources_gpu",
                        "resources_numa",
                        "kernel_features",
                        "id_map_current",
                        "event_location",
                        "storage_api_remote_volume_snapshots",
                        "network_nat_address",
                        "container_nic_routes",
                        "cluster_internal_copy",
                        "seccomp_notify",
                        "lxc_features",
                        "container_nic_ipvlan",
                        "network_vlan_sriov",
                        "storage_cephfs",
                        "container_nic_ipfilter",
                        "resources_v2",
                        "container_exec_user_group_cwd",
                        "container_syscall_intercept",
                        "container_disk_shift",
                        "storage_shifted",
                        "resources_infiniband",
                        "daemon_storage",
                        "instances",
                        "image_types",
                        "resources_disk_sata",
                        "clustering_roles",
                        "images_expiry",
                        "resources_network_firmware",
                        "backup_compression_algorithm",
                        "ceph_data_pool_name",
                        "container_syscall_intercept_mount",
                        "compression_squashfs",
                        "container_raw_mount",
                        "container_nic_routed",
                        "container_syscall_intercept_mount_fuse",
                        "container_disk_ceph",
                        "virtual-machines",
                        "image_profiles",
                        "clustering_architecture",
                        "resources_disk_id",
                        "storage_lvm_stripes",
                        "vm_boot_priority",
                        "unix_hotplug_devices",
                        "api_filtering",
                        "instance_nic_network",
                        "clustering_sizing",
                        "firewall_driver",
                        "projects_limits",
                        "container_syscall_intercept_hugetlbfs",
                        "limits_hugepages",
                        "container_nic_routed_gateway",
                        "projects_restrictions",
                        "custom_volume_snapshot_expiry",
                        "volume_snapshot_scheduling",
                        "trust_ca_certificates",
                        "snapshot_disk_usage",
                        "clustering_edit_roles",
                        "container_nic_routed_host_address",
                        "container_nic_ipvlan_gateway",
                        "resources_usb_pci",
                        "resources_cpu_threads_numa",
                        "resources_cpu_core_die",
                        "api_os",
                        "container_nic_routed_host_table",
                        "container_nic_ipvlan_host_table",
                        "container_nic_ipvlan_mode",
                        "resources_system",
                        "images_push_relay",
                        "network_dns_search",
                        "container_nic_routed_limits",
                        "instance_nic_bridged_vlan",
                        "network_state_bond_bridge",
                        "usedby_consistency",
                        "custom_block_volumes",
                        "clustering_failure_domains",
                        "resources_gpu_mdev",
                        "console_vga_type",
                        "projects_limits_disk",
                        "network_type_macvlan",
                        "network_type_sriov",
                        "container_syscall_intercept_bpf_devices",
                        "network_type_ovn",
                        "projects_networks",
                        "projects_networks_restricted_uplinks",
                        "custom_volume_backup",
                        "backup_override_name",
                        "storage_rsync_compression",
                        "network_type_physical",
                        "network_ovn_external_subnets",
                        "network_ovn_nat",
                        "network_ovn_external_routes_remove",
                        "tpm_device_type",
                        "storage_zfs_clone_copy_rebase",
                        "gpu_mdev",
                        "resources_pci_iommu",
                        "resources_network_usb",
                        "resources_disk_address",
                        "network_physical_ovn_ingress_mode",
                        "network_ovn_dhcp",
                        "network_physical_routes_anycast",
                        "projects_limits_instances",
                        "network_state_vlan",
                        "instance_nic_bridged_port_isolation",
                        "instance_bulk_state_change",
                        "network_gvrp",
                        "instance_pool_move",
                        "gpu_sriov",
                        "pci_device_type",
                        "storage_volume_state",
                        "network_acl",
                        "migration_stateful",
                        "disk_state_quota",
                        "storage_ceph_features",
                        "projects_compression",
                        "projects_images_remote_cache_expiry",
                        "certificate_project",
                        "network_ovn_acl",
                        "projects_images_auto_update",
                        "projects_restricted_cluster_target",
                        "images_default_architecture",
                        "network_ovn_acl_defaults",
                        "gpu_mig",
                        "project_usage",
                        "network_bridge_acl",
                        "warnings",
                        "projects_restricted_backups_and_snapshots",
                        "clustering_join_token",
                        "clustering_description",
                        "server_trusted_proxy",
                        "clustering_update_cert",
                        "storage_api_project",
                        "server_instance_driver_operational",
                        "server_supported_storage_drivers",
                        "event_lifecycle_requestor_address",
                        "resources_gpu_usb",
                        "clustering_evacuation",
                        "network_ovn_nat_address",
                        "network_bgp",
                        "network_forward",
                        "custom_volume_refresh",
                        "network_counters_errors_dropped",
                        "metrics",
                        "image_source_project",
                        "clustering_config",
                        "network_peer",
                        "linux_sysctl",
                        "network_dns",
                        "ovn_nic_acceleration",
                        "certificate_self_renewal",
                        "instance_project_move",
                        "storage_volume_project_move",
                        "cloud_init",
                        "network_dns_nat",
                        "database_leader",
                        "instance_all_projects",
                        "clustering_groups",
                        "ceph_rbd_du",
                        "instance_get_full",
                        "qemu_metrics",
                        "gpu_mig_uuid",
                        "event_project",
                        "clustering_evacuation_live",
                        "instance_allow_inconsistent_copy",
                        "network_state_ovn",
                        "storage_volume_api_filtering",
                        "image_restrictions",
                        "storage_zfs_export",
                        "network_dns_records",
                        "storage_zfs_reserve_space",
                        "network_acl_log",
                        "storage_zfs_blocksize",
                        "metrics_cpu_seconds",
                        "instance_snapshot_never",
                        "certificate_token",
                        "instance_nic_routed_neighbor_probe",
                        "event_hub",
                        "agent_nic_config",
                        "projects_restricted_intercept",
                        "metrics_authentication",
                        "images_target_project",
                        "cluster_migration_inconsistent_copy",
                        "cluster_ovn_chassis",
                        "container_syscall_intercept_sched_setscheduler",
                        "storage_lvm_thinpool_metadata_size",
                        "storage_volume_state_total",
                        "instance_file_head",
                        "instances_nic_host_name",
                        "image_copy_profile",
                        "container_syscall_intercept_sysinfo",
                        "clustering_evacuation_mode",
                        "resources_pci_vpd",
                        "qemu_raw_conf",
                        "storage_cephfs_fscache",
                        "network_load_balancer",
                        "vsock_api",
                        "instance_ready_state",
                        "network_bgp_holdtime",
                        "storage_volumes_all_projects",
                        "metrics_memory_oom_total",
                        "storage_buckets",
                        "storage_buckets_create_credentials",
                        "metrics_cpu_effective_total",
                        "projects_networks_restricted_access",
                        "storage_buckets_local",
                        "loki",
                        "acme",
                        "internal_metrics",
                        "cluster_join_token_expiry",
                        "remote_token_expiry",
                        "init_preseed",
                        "storage_volumes_created_at",
                        "cpu_hotplug",
                        "projects_networks_zones",
                        "network_txqueuelen",
                        "cluster_member_state",
                        "instances_placement_scriptlet",
                        "storage_pool_source_wipe",
                        "zfs_block_mode",
                        "instance_generation_id",
                        "disk_io_cache",
                        "amd_sev",
                        "storage_pool_loop_resize",
                        "migration_vm_live",
                        "ovn_nic_nesting",
                        "oidc",
                        "network_ovn_l3only",
                        "ovn_nic_acceleration_vdpa",
                        "cluster_healing",
                        "instances_state_total",
                        "auth_user",
                        "security_csm",
                        "instances_rebuild",
                        "numa_cpu_placement",
                        "custom_volume_iso",
                        "network_allocations",
                        "storage_api_remote_volume_snapshot_copy",
                        "zfs_delegate",
                        "operations_get_query_all_projects",
                        "metadata_configuration",
                        "syslog_socket",
                        "event_lifecycle_name_and_project",
                        "instances_nic_limits_priority",
                        "disk_initial_volume_configuration",
                        "operation_wait",
                        "cluster_internal_custom_volume_copy",
                        "disk_io_bus",
                        "storage_cephfs_create_missing",
                        "instance_move_config",
                        "ovn_ssl_config",
                        "init_preseed_storage_volumes",
                        "metrics_instances_count",
                        "server_instance_type_info",
                        "resources_disk_mounted",
                        "server_version_lts",
                        "oidc_groups_claim",
                        "loki_config_instance",
                        "storage_volatile_uuid",
                        "import_instance_devices",
                        "instances_uefi_vars",
                        "instances_migration_stateful",
                        "container_syscall_filtering_allow_deny_syntax",
                        "access_management",
                        "vm_disk_io_limits",
                        "storage_volumes_all",
                        "instances_files_modify_permissions",
                        "image_restriction_nesting",
                        "container_syscall_intercept_finit_module",
                        "device_usb_serial",
                        "network_allocate_external_ips",
                        "explicit_trust_token"
                ],
                "api_status": "stable",
                "api_version": "1.0",
                "auth": "trusted",
                "public": false,
                "auth_methods": [
                        "tls"
                ],
                "auth_user_name": "",
                "auth_user_method": "",
                "environment": {
                        "addresses": null,
                        "architectures": null,
                        "certificate": "",
                        "certificate_fingerprint": "",
                        "driver": "",
                        "driver_version": "",
                        "instance_types": null,
                        "firewall": "",
                        "kernel": "Linux",
                        "kernel_architecture": "x86_64",
                        "kernel_features": null,
                        "kernel_version": "6.8.0-49-generic",
                        "lxc_features": null,
                        "os_name": "",
                        "os_version": "",
                        "project": "",
                        "server": "lxd-agent",
                        "server_clustered": false,
                        "server_event_mode": "",
                        "server_name": "u1",
                        "server_pid": 482,
                        "server_version": "5.21.2",
                        "server_lts": false,
                        "storage": "",
                        "storage_version": "",
                        "storage_supported_drivers": null
                }
        }
DEBUG  [2024-12-18T10:40:33+08:00] Sending request to LXD                        etag= method=GET url="https://custom.socket/1.0/metrics"
DEBUG  [2024-12-18T10:40:34+08:00] Matched trusted cert                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf subject="CN=root@mc-001,O=LXD"
DEBUG  [2024-12-18T10:40:34+08:00] Handling API request                          fingerprint=485db7985f483d2fe7ecc0acc076f2363e40dce2844da4ece5db02f15a57a6cf ip="10.0.25.10:33296" method=PUT protocol=cluster url="/1.0/networks/dev-test?project=dev-test"
DEBUG  [2024-12-18T10:40:35+08:00] Update                                        clientType=notifier driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK volatile.network.ipv4.address:10.0.10.4] changed-via-cli}" project=dev-test
^C

@jsimpso
Author

jsimpso commented Dec 18, 2024

So it looks like volatile.network.ipv4.address isn't being included in the PUT request when the network is updated via the UI.

The PUT log line for the change made via the UI is: clientType=normal driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK] changed-via-ui}" project=dev-test

newNetwork.map:
  - bridge.mtu: 1442
  - ipv4.address: 172.16.0.1/28
  - ipv4.nat: false
  - ipv6.address: none
  - network: UPLINK 

Vs the CLI: clientType=normal driver=ovn network=dev-test newNetwork="{map[bridge.mtu:1442 ipv4.address:172.16.0.1/28 ipv4.nat:false ipv6.address:none network:UPLINK volatile.network.ipv4.address:10.0.10.4] changed-via-cli}" project=dev-test

newNetwork.map:
  - bridge.mtu: 1442
  - ipv4.address: 172.16.0.1/28
  - ipv4.nat: false
  - ipv6.address: none
  - network: UPLINK
  - volatile.network.ipv4.address: 10.0.10.4
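To make the difference between the two requests explicit, here is a minimal sketch that diffs the two config maps copied from the log lines above. The helper name `missingKeys` is hypothetical and only serves this illustration.

```typescript
// Return the keys present in `reference` but absent from `candidate`.
// `missingKeys` is a hypothetical helper, not part of LXD or the LXD UI.
function missingKeys(
  reference: Record<string, string>,
  candidate: Record<string, string>,
): string[] {
  return Object.keys(reference).filter(
    (key) => !Object.prototype.hasOwnProperty.call(candidate, key),
  );
}

// Config map logged for the change made via the CLI.
const configViaCli: Record<string, string> = {
  "bridge.mtu": "1442",
  "ipv4.address": "172.16.0.1/28",
  "ipv4.nat": "false",
  "ipv6.address": "none",
  "network": "UPLINK",
  "volatile.network.ipv4.address": "10.0.10.4",
};

// Config map logged for the change made via the UI.
const configViaUi: Record<string, string> = {
  "bridge.mtu": "1442",
  "ipv4.address": "172.16.0.1/28",
  "ipv4.nat": "false",
  "ipv6.address": "none",
  "network": "UPLINK",
};

const droppedByUi = missingKeys(configViaCli, configViaUi);
console.log(droppedByUi); // [ 'volatile.network.ipv4.address' ]
```

The only key missing from the UI request is the volatile one.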

@tomponline
Member

@edlerd is this something you can look into?

@edlerd
Contributor

edlerd commented Dec 18, 2024

> @edlerd is this something you can look into?

The UI preserves any fields of a network that it doesn't present to the user in a web form. I don't fully understand the cause of the problem, as the volatile.* keys should be preserved. I'll take a closer look; the cluster case might be special in this regard.

@edlerd
Contributor

edlerd commented Dec 18, 2024

The UI explicitly ignores config.volatile.* keys when preserving keys for network updates in NetworkForm.tsx. This should be the root cause of this bug.

@tomponline Are there circumstances where the volatile keys should be ignored for networks? If not, I will remove this volatile filter from the UI and that should fix the behaviour reported above.
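As a rough illustration of the intended behaviour (this is not the actual NetworkForm.tsx code), the sketch below builds the config for a PUT by carrying over every key the form does not manage, volatile.* included. The function name `buildPutConfig` and its parameters are assumptions made for the example.

```typescript
// Hypothetical helper, not the real LXD UI implementation.
// `existing` is the config returned by the last GET, `formValues` holds the
// values entered in the form, and `formKeys` lists the keys the form manages.
function buildPutConfig(
  existing: Record<string, string>,
  formValues: Record<string, string>,
  formKeys: string[],
): Record<string, string> {
  const result: Record<string, string> = {};
  // Carry over every key the form does not manage, volatile.* included.
  Object.keys(existing).forEach((key) => {
    if (formKeys.indexOf(key) === -1) {
      result[key] = existing[key];
    }
  });
  // Keys managed by the form come from the form; an empty value means unset.
  formKeys.forEach((key) => {
    const value = formValues[key];
    if (value !== undefined && value !== "") {
      result[key] = value;
    }
  });
  return result;
}

const existingNetworkConfig: Record<string, string> = {
  "ipv4.address": "172.16.0.1/28",
  "ipv4.nat": "false",
  "volatile.network.ipv4.address": "10.0.10.4",
};

const putConfig = buildPutConfig(
  existingNetworkConfig,
  { "ipv4.address": "172.16.0.1/28", "ipv4.nat": "true" },
  ["ipv4.address", "ipv4.nat"],
);
console.log(putConfig);
```

With a filter on volatile.* in the first loop, the resulting payload would lose volatile.network.ipv4.address, which matches the UI request logged above.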

@edlerd
Contributor

edlerd commented Dec 18, 2024

I see we are not doing any filtering for volatile keys on other entities, such as instances, profiles, storage, or the like.

@tomponline
Member

> @tomponline Are there circumstances where the volatile keys should be ignored for networks? If not, I will remove this volatile filter from the UI and that should fix the behaviour reported above.

When submitting a PUT request for an entity, the configuration submitted will replace the existing configuration.
This means that any keys not submitted will be removed.

This is why it's important to ensure that all config that is read via GET is submitted back during the PUT request.

The other option is to use the PATCH method and only submit the changed keys.
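The difference between the two methods can be sketched with a simplified model. This is an illustration of the semantics described in this comment, not LXD's server code, and the function name `applyUpdate` is an assumption.

```typescript
// Simplified model of PUT versus PATCH on a config map.
// Unsetting keys is deliberately not modelled here.
function applyUpdate(
  method: "PUT" | "PATCH",
  current: Record<string, string>,
  submitted: Record<string, string>,
): Record<string, string> {
  const next: Record<string, string> = {};
  if (method === "PATCH") {
    // PATCH starts from the stored config, so keys that are not submitted survive.
    Object.keys(current).forEach((key) => {
      next[key] = current[key];
    });
  }
  // In both cases the submitted keys end up in the result.
  // For PUT they are the whole result: anything not submitted is gone.
  Object.keys(submitted).forEach((key) => {
    next[key] = submitted[key];
  });
  return next;
}

const storedConfig: Record<string, string> = {
  "ipv4.nat": "false",
  "volatile.network.ipv4.address": "10.0.10.4",
};
const submittedChange: Record<string, string> = { "ipv4.nat": "true" };

const afterPut = applyUpdate("PUT", storedConfig, submittedChange);
const afterPatch = applyUpdate("PATCH", storedConfig, submittedChange);

console.log("volatile.network.ipv4.address" in afterPut); // false
console.log("volatile.network.ipv4.address" in afterPatch); // true
```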

@tomponline
Member

@edlerd do you know if there is anywhere else in the UI where other keys are being stripped on return?

@edlerd
Contributor

edlerd commented Dec 18, 2024

> @edlerd do you know if there is anywhere else in the UI where other keys are being stripped on return?

We do preserve keys on the top level of the object and nested under config.* for instances, profiles, storage pools and volumes, networks and projects.

@edlerd
Contributor

edlerd commented Dec 18, 2024

> When submitting a PUT request for an entity, the configuration submitted will replace the existing configuration. This means that any keys not submitted will be removed.
>
> This is why it's important to ensure that all config that is read via GET is submitted back during the PUT request.
>
> The other option is to use the PATCH method and only submit the changed keys.

The problem with PATCH is that if a user unsets a field, there is no way to express it in the payload.

@tomponline
Member

> The problem with PATCH is that if a user unsets a field, there is no way to express it in the payload.

Using an empty string as the value for the key is the way to express this:

# Setting user.foo to an empty string removes the user.foo key only
lxc query /1.0/networks/lxdbr0 -X PATCH --data '{
                "config": {
                        "user.foo": ""
                }
        }'

However, the drawback is that it also clears any non-config fields in the struct, such as description, so it's equivalent to:

lxc query /1.0/networks/lxdbr0 -X PATCH --data '{
                "config": {
                        "user.foo": ""
                },
                "description": ""
        }'
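On the client side, a PATCH body following these rules could be derived from the config before and after an edit. The sketch below is only an illustration: `NetworkPatchBody` and `buildPatchBody` are hypothetical names, and the current description is always sent along because, as noted above, leaving it out would clear it.

```typescript
// Hypothetical shape of a PATCH body; only the two fields discussed above.
interface NetworkPatchBody {
  config: Record<string, string>;
  description: string;
}

// Hypothetical helper: compute a PATCH body from the config before and after an edit.
function buildPatchBody(
  before: Record<string, string>,
  after: Record<string, string>,
  description: string,
): NetworkPatchBody {
  const config: Record<string, string> = {};
  // Changed or added keys are sent with their new value.
  Object.keys(after).forEach((key) => {
    if (before[key] !== after[key]) {
      config[key] = after[key];
    }
  });
  // Keys the user removed are sent as "" so that they are unset.
  Object.keys(before).forEach((key) => {
    if (!Object.prototype.hasOwnProperty.call(after, key)) {
      config[key] = "";
    }
  });
  // Always resend the description so the PATCH does not clear it.
  return { config, description };
}

const patchBody = buildPatchBody(
  { "user.foo": "bar", "ipv4.nat": "false" },
  { "ipv4.nat": "true" },
  "dev network",
);
console.log(JSON.stringify(patchBody));
// {"config":{"ipv4.nat":"true","user.foo":""},"description":"dev network"}
```

Unchanged keys are left out of the payload entirely, which is what keeps keys such as volatile.* untouched with this approach.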

@tomponline
Member

Although the immediate issue will be fixed by canonical/lxd-ui#1036 (thanks @edlerd!), I will keep this issue open: removing the volatile key should consistently update the nexthop address, so we need to look into why it's not happening.

@tomponline
Member

For OVN networks, if there is no volatile IP or bgp.ipv4.nexthop address, then the entire prefix should be withdrawn from being advertised by BGP (as there's no valid nexthop to use).
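The expected behaviour can be sketched as a small decision function. This is not LXD's implementation; the name `resolveIpv4Nexthop` is hypothetical, and the sketch assumes that bgp.ipv4.nexthop, when set, takes precedence over the volatile uplink address.

```typescript
// Returns the IPv4 nexthop to advertise, or null when the prefix should be withdrawn.
// Assumption: an explicit bgp.ipv4.nexthop wins over volatile.network.ipv4.address.
function resolveIpv4Nexthop(config: Record<string, string>): string | null {
  const override = config["bgp.ipv4.nexthop"];
  if (override) {
    return override;
  }
  const volatileAddress = config["volatile.network.ipv4.address"];
  if (volatileAddress) {
    return volatileAddress;
  }
  // No valid nexthop: the prefix should not be advertised at all.
  return null;
}

console.log(resolveIpv4Nexthop({ "volatile.network.ipv4.address": "10.0.10.4" })); // 10.0.10.4
console.log(resolveIpv4Nexthop({ "ipv4.address": "172.16.0.1/28" })); // null
```

Under this model, the UI update that dropped the volatile key would lead to the second case, so every cluster member should withdraw the prefix rather than keep or change a nexthop independently.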

@tomponline
Member

@edlerd once you have a fix for this in the LXD UI, please can you create a release tag? We can then include it in an interim release of LXD 6.2, and also in the forthcoming LTS release of LXD 5.21.3, which is going to include the UI version from LXD 6.2.

edlerd added a commit to canonical/lxd-ui that referenced this issue Dec 18, 2024
## Done

- Ensure volatile keys are preserved when saving a network

Fixes canonical/lxd#14531

## QA

1. Run the LXD-UI:
    - On the demo server via the link posted by @webteam-app below. This is only available for PRs created by collaborators of the repo. Ask @mas-who or @edlerd for access.
    - With a local copy of this branch, [build and run as described in the docs](../CONTRIBUTING.md#setting-up-for-development).
2. Perform the following QA steps:
    - Edit a network that has settings under config.volatile.*
    - Modify the network in the UI and ensure the save call contains all config.volatile keys in the payload

@edlerd
Contributor

edlerd commented Dec 18, 2024

> @edlerd once you have a fix for this in the LXD UI, please can you create a release tag? We can then include it in an interim release of LXD 6.2, and also in the forthcoming LTS release of LXD 5.21.3, which is going to include the UI version from LXD 6.2.

Crafted a 0.15 tag and updated it in canonical/lxd-pkg-snap#671 for latest candidate and canonical/lxd-pkg-snap#672 for 5.21-candidate.
