Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Broadcom Trident3][202211 branch] crm/test_crm.py test failure #1784

Open
ChungKuoYen opened this issue Mar 30, 2023 · 0 comments
Open

[Broadcom Trident3][202211 branch] crm/test_crm.py test failure #1784

ChungKuoYen opened this issue Mar 30, 2023 · 0 comments

Comments

@ChungKuoYen
Copy link

Description
Test failed with crm/test_crm.py::test_crm_fdb_entry.

E           Failed: Counter 'crm_stats_fdb_entry_used' was not incremented

It takes about 17 seconds to get the response after syncd calling SAI to do FDB flush.
And then test failed.
After recieved SAI response do the following updating redis database in 0.005320 sec:
Maybe there is some problem in broadcom SAI/sdk...

Mar 16 23:48:00.579298 str-ix7-100 NOTICE syncd#syncd: :- processFdbFlush: fdb flush succeeded, updating redis database
Mar 16 23:48:00.579635 str-ix7-100 NOTICE syncd#syncd: :- processFlushEvent: received a flush port fdb event, portVid = oid:0x0, bvId = oid:0x0
Mar 16 23:48:00.579803 str-ix7-100 NOTICE syncd#syncd: :- processFlushEvent: pattern ASIC_STATE:SAI_OBJECT_TYPE_FDB_ENTRY:*, portStr 
Mar 16 23:48:00.586423 str-ix7-100 NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: processing consolidated fdb flush event of type: SAI_FDB_ENTRY_TYPE_DYNAMIC
Mar 16 23:48:00.586423 str-ix7-100 NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: fdb flush took 0.005320 sec

Steps to reproduce the issue:

  1. Run crm/test_crm.py::test_crm_fdb_entry.

Describe the results you received:

  1. Test results:
    1.1 FAILED crm/test_crm.py::test_crm_fdb_entry:
common/helpers/parallel.py:198: Failed
=================================== FAILURES ===================================
_____________________ test_crm_fdb_entry[str-ix7-100-None] _____________________

duthosts = [<MultiAsicSonicHost str-ix7-100>]
enum_rand_one_per_hwsku_frontend_hostname = 'str-ix7-100'
enum_frontend_asic_index = None
tbinfo = {'auto_recover': 'True', 'comment': 'Tests Quanta IX7', 'conf-name': 'vms-ix1b-t0', 'duts': ['str-ix7-100'], ...}

    @pytest.mark.usefixtures('disable_fdb_aging')
    def test_crm_fdb_entry(duthosts, enum_rand_one_per_hwsku_frontend_hostname, enum_frontend_asic_index, tbinfo):
        duthost = duthosts[enum_rand_one_per_hwsku_frontend_hostname]
        asichost = duthost.asic_instance(enum_frontend_asic_index)
    
        get_fdb_stats = "redis-cli --raw -n 2 HMGET CRM:STATS crm_stats_fdb_entry_used crm_stats_fdb_entry_available"
        topology = tbinfo["topo"]["properties"]["topology"]
        cfg_facts = duthost.config_facts(host=duthost.hostname, source="persistent")['ansible_facts']
        port_dict = dict(zip(cfg_facts['port_index_map'].values(), cfg_facts['port_index_map'].keys()))
        # Use for test 1st in list hosts interface port to add into dummy VLAN
        host_port_id = [id for id in topology["host_interfaces"]][0]
        iface = port_dict[host_port_id]
        vlan_id = 2
        cmd_add_vlan_member = "config vlan member add {vid} {iface}"
        cmd_add_vlan = "config vlan add {}".format(vlan_id)
    
        # Configure test restore commands
        RESTORE_CMDS["crm_threshold_name"] = "fdb_entry"
        # Remove FDB entry
        RESTORE_CMDS["test_crm_fdb_entry"].append("fdbclear")
        # Restart arp_update
        # RESTORE_CMDS["test_crm_fdb_entry"].append("docker exec -i swss supervisorctl start arp_update")
        # Remove VLAN member required for FDB entry
        RESTORE_CMDS["test_crm_fdb_entry"].append("config vlan member del {} {}".format(vlan_id, iface))
        # Remove VLAN required for FDB entry
        RESTORE_CMDS["test_crm_fdb_entry"].append("config vlan del {}".format(vlan_id))
    
        # Add VLAN required for FDB entry
        duthost.command(cmd_add_vlan)
        # Add VLAN member required for FDB entry
        duthost.command(cmd_add_vlan_member.format(vid=vlan_id, iface=iface))
    
        # Stop arp_update
        cmd = "docker exec -i swss supervisorctl stop arp_update"
        duthost.command(cmd)
    
        # Remove FDB entry
        cmd = "fdbclear"
        duthost.command(cmd)
        time.sleep(5)
        # Get "crm_stats_fdb_entry" used and available counter value
        crm_stats_fdb_entry_used, crm_stats_fdb_entry_available = get_crm_stats(get_fdb_stats, duthost)

        # Generate FDB json file with one entry and apply it on DUT
        apply_fdb_config(duthost, "test_crm_fdb_entry", vlan_id, iface, 1)
    
        # Get new "crm_stats_fdb_entry" used and available counter value
        new_crm_stats_fdb_entry_used, new_crm_stats_fdb_entry_available = get_crm_stats(get_fdb_stats, duthost)
    
        # Verify "crm_stats_fdb_entry_used" counter was incremented
        # For Cisco-8000 devices, hardware FDB counter is statistical-based with +/- 1 entry tolerance.
        # Hence, the used counter can increase by more than 1.
        if is_cisco_device(duthost):
            pytest_assert(new_crm_stats_fdb_entry_used - crm_stats_fdb_entry_used >= 1, \
                "Counter 'crm_stats_fdb_entry_used' was not incremented")
        else:
            pytest_assert(new_crm_stats_fdb_entry_used - crm_stats_fdb_entry_used == 1, \
>               "Counter 'crm_stats_fdb_entry_used' was not incremented")
E           Failed: Counter 'crm_stats_fdb_entry_used' was not incremented

asichost   = <SonicAsic 0>
cfg_facts  = {'ACL_TABLE': {'DATAACL': {'policy_desc': u'DATAACL', 'ports': [u'PortChannel101', u'PortChannel102', u'PortChannel103...te_limit_interval': u'600', 'state': u'enabled'}, ...}, 'BGP_DEVICE_GLOBAL': {'STATE': {'tsa_enabled': u'false'}}, ...}
cmd        = 'fdbclear'
cmd_add_vlan = 'config vlan add 2'
cmd_add_vlan_member = 'config vlan member add {vid} {iface}'
crm_stats_fdb_entry_available = 65509
crm_stats_fdb_entry_used = 26
duthost    = <MultiAsicSonicHost str-ix7-100>
duthosts   = [<MultiAsicSonicHost str-ix7-100>]
enum_frontend_asic_index = None
enum_rand_one_per_hwsku_frontend_hostname = 'str-ix7-100'
get_fdb_stats = 'redis-cli --raw -n 2 HMGET CRM:STATS crm_stats_fdb_entry_used crm_stats_fdb_entry_available'
host_port_id = 0
id         = 27
iface      = 'Ethernet0'
new_crm_stats_fdb_entry_available = 65509
new_crm_stats_fdb_entry_used = 26
port_dict  = {0: 'Ethernet0', 1: 'Ethernet4', 2: 'Ethernet8', 3: 'Ethernet12', ...}
tbinfo     = {'auto_recover': 'True', 'comment': 'Tests Quanta IX7', 'conf-name': 'vms-ix1b-t0', 'duts': ['str-ix7-100'], ...}
topology   = {'DUT': {'vlan_configs': {'default_vlan_config': 'one_vlan_a', 'four_vlan_a': {'Vlan1000': {'id': 1000, 'intfs': [1, 2...lans': [31], 'vm_offset': 3}}, 'disabled_host_interfaces': [0, 25, 26, 27], 'host_interfaces': [0, 1, 2, 3, 4, 5, ...]}
vlan_id    = 2

crm/test_crm.py:980: Failed

Describe the results you expected:

  1. Test passed.

Additional information you deem important:

Output of show version:

admin@str-ix7-100:~$ show version

SONiC Software Version: SONiC.QUANTA_202211.0-4d998f6dc
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Build commit: 4d998f6dc
Build date: Fri Mar 24 08:33:06 UTC 2023
Built by: frank@sonic

Platform: x86_64-quanta_ix7_rglbmc-r0
HwSKU: Quanta-IX7-32X
ASIC: broadcom
ASIC Count: 1
Serial Number: QTFCUW737001F
Model Number: 19105
Hardware Revision: N/A
Uptime: 16:37:54 up 14:47,  1 user,  load average: 0.75, 0.54, 0.51
Date: Sat 25 Mar 2023 16:37:54

Docker images:
REPOSITORY                    TAG                         IMAGE ID       SIZE
docker-platform-monitor       QUANTA_202211.0-4d998f6dc   61b11c2616c9   629MB
docker-platform-monitor       latest                      61b11c2616c9   629MB
docker-orchagent              QUANTA_202211.0-4d998f6dc   16382119d90f   537MB
docker-orchagent              latest                      16382119d90f   537MB
docker-fpm-frr                QUANTA_202211.0-4d998f6dc   ed71513521f6   548MB
docker-fpm-frr                latest                      ed71513521f6   548MB
docker-teamd                  QUANTA_202211.0-4d998f6dc   873c83005ba5   518MB
docker-teamd                  latest                      873c83005ba5   518MB
docker-macsec                 latest                      9a0bf69765fb   520MB
docker-sonic-mgmt-framework   QUANTA_202211.0-4d998f6dc   9d58173bdb19   608MB
docker-sonic-mgmt-framework   latest                      9d58173bdb19   608MB
docker-nat                    QUANTA_202211.0-4d998f6dc   953541c4a792   477MB
docker-nat                    latest                      953541c4a792   477MB
docker-sflow                  QUANTA_202211.0-4d998f6dc   75e1c694ec30   475MB
docker-sflow                  latest                      75e1c694ec30   475MB
docker-dhcp-relay             latest                      39a4c0edf47e   511MB
docker-eventd                 QUANTA_202211.0-4d998f6dc   70315d710c79   501MB
docker-eventd                 latest                      70315d710c79   501MB
docker-syncd-brcm             QUANTA_202211.0-4d998f6dc   3cfb49a0f86a   835MB
docker-syncd-brcm             latest                      3cfb49a0f86a   835MB
docker-gbsyncd-broncos        QUANTA_202211.0-4d998f6dc   705b8c06286f   540MB
docker-gbsyncd-broncos        latest                      705b8c06286f   540MB
docker-gbsyncd-credo          QUANTA_202211.0-4d998f6dc   17421bb619bf   511MB
docker-gbsyncd-credo          latest                      17421bb619bf   511MB
docker-snmp                   QUANTA_202211.0-4d998f6dc   500399454b3b   538MB
docker-snmp                   latest                      500399454b3b   538MB
docker-sonic-p4rt             QUANTA_202211.0-4d998f6dc   936665c6e433   1.06GB
docker-sonic-p4rt             latest                      936665c6e433   1.06GB
docker-sonic-telemetry        QUANTA_202211.0-4d998f6dc   7b2481830819   791MB
docker-sonic-telemetry        latest                      7b2481830819   791MB
docker-database               QUANTA_202211.0-4d998f6dc   7a65f743ab69   493MB
docker-database               latest                      7a65f743ab69   493MB
docker-router-advertiser      QUANTA_202211.0-4d998f6dc   8285974a40c4   493MB
docker-router-advertiser      latest                      8285974a40c4   493MB
docker-mux                    QUANTA_202211.0-4d998f6dc   5f742fc627b1   541MB
docker-mux                    latest                      5f742fc627b1   541MB
docker-lldp                   QUANTA_202211.0-4d998f6dc   46282f9483fa   535MB
docker-lldp                   latest                      46282f9483fa   535MB    ```

Attach debug file:

Mar 16 23:47:43.578162 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 363 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:44.538059 str-ix7-100 INFO python[2200050]: ansible-command Invoked with _uses_shell=True _raw_params=config vlan member del 2 Ethernet0 warn=True stdin_add_newline=True strip_empty_ends=True argv=None chdir=None executable=None creates=None removes=None stdin=None
Mar 16 23:47:44.578316 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 1364 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:45.530338 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet0 admin:0 oper:0 addr:a8:1e:84:c3:6f:30 ifindex:781 master:770
Mar 16 23:47:45.530953 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet0(ok:down) to state db
Mar 16 23:47:45.544786 str-ix7-100 INFO kernel: [278633.457490] device Ethernet0 left promiscuous mode
Mar 16 23:47:45.544819 str-ix7-100 INFO kernel: [278633.457630] Bridge: port 26(Ethernet0) entered disabled state
Mar 16 23:47:45.545502 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet0 admin:0 oper:0 addr:a8:1e:84:c3:6f:30 ifindex:781 master:770
Mar 16 23:47:45.549671 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet0(ok:down) to state db
Mar 16 23:47:45.549671 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet0 admin:0 oper:0 addr:a8:1e:84:c3:6f:30 ifindex:781 master:770
Mar 16 23:47:45.550547 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet0(ok:down) to state db
Mar 16 23:47:45.550547 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: nlmsg type:17 key:Ethernet0 admin:0 oper:0 addr:a8:1e:84:c3:6f:30 ifindex:781 master:770
Mar 16 23:47:45.562370 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet0 admin:0 oper:0 addr:a8:1e:84:c3:6f:30 ifindex:781 master:0
Mar 16 23:47:45.563192 str-ix7-100 NOTICE swss#portsyncd: :- onMsg: Publish Ethernet0(ok:down) to state db
Mar 16 23:47:45.578518 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 2364 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:46.578939 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 3364 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:47.008174 str-ix7-100 INFO python[2200083]: ansible-command Invoked with _uses_shell=True _raw_params=config vlan del 2 warn=True stdin_add_newline=True strip_empty_ends=True argv=None chdir=None executable=None creates=None removes=None stdin=None
Mar 16 23:47:47.579342 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 4365 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:48.579755 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 5365 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:49.455426 str-ix7-100 INFO python[2200113]: ansible-command Invoked with _uses_shell=True _raw_params=rm /tmp/fdb.json warn=True stdin_add_newline=True strip_empty_ends=True argv=None chdir=None executable=None creates=None removes=None stdin=None
Mar 16 23:47:49.455675 str-ix7-100 INFO python[2200113]: ansible-command [WARNING] Consider using the file module with state=absent rather than running 'rm'.  If you need to use command because file is insufficient you can add 'warn: false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message.
Mar 16 23:47:49.580154 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 6365 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:50.580268 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 7366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:50.718334 str-ix7-100 INFO python[2200134]: ansible-command Invoked with _uses_shell=True _raw_params=docker exec -i swss rm /fdb.json warn=True stdin_add_newline=True strip_empty_ends=True argv=None chdir=None executable=None creates=None removes=None stdin=None
Mar 16 23:47:51.580390 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 8366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:52.580510 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 9366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:53.580601 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 10366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:53.883054 str-ix7-100 ALERT dhcp_relay#dhcpmon[41]: dhcpmon detected disparity in DHCP Relay behavior. Duration: 86364 (sec) for vlan: 'Agg-Vlan1000'
Mar 16 23:47:53.883352 str-ix7-100 NOTICE dhcp_relay#dhcpmon[41]: [    Agg-Vlan1000-Snapshot rx/tx] Discover:     26589/        0, Offer:         0/        0, Request:         0/        0, ACK:         0/        0
Mar 16 23:47:53.883535 str-ix7-100 NOTICE dhcp_relay#dhcpmon[41]: [    Agg-Vlan1000- Current rx/tx] Discover:     26594/        0, Offer:         0/        0, Request:         0/        0, ACK:         0/        0
Mar 16 23:47:54.580740 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 11366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:55.580856 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 12366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:56.580990 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 13366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:57.581114 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 14366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:58.581227 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 15366 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:47:59.581347 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 16367 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:48:00.579298 str-ix7-100 NOTICE syncd#syncd: :- processFdbFlush: fdb flush succeeded, updating redis database
Mar 16 23:48:00.579635 str-ix7-100 NOTICE syncd#syncd: :- processFlushEvent: received a flush port fdb event, portVid = oid:0x0, bvId = oid:0x0
Mar 16 23:48:00.579803 str-ix7-100 NOTICE syncd#syncd: :- processFlushEvent: pattern ASIC_STATE:SAI_OBJECT_TYPE_FDB_ENTRY:*, portStr 
Mar 16 23:48:00.586423 str-ix7-100 NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: processing consolidated fdb flush event of type: SAI_FDB_ENTRY_TYPE_DYNAMIC
Mar 16 23:48:00.586423 str-ix7-100 NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: fdb flush took 0.005320 sec
Mar 16 23:48:00.586723 str-ix7-100 NOTICE syncd#syncd: :- threadFunction: time span 17372 ms for 'flush:SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000'
Mar 16 23:48:00.588461 str-ix7-100 WARNING swss#orchagent: :- meta_sai_on_fdb_event_single: object key SAI_OBJECT_TYPE_FDB_ENTRY:{"bvid":"oid:0x260000000005f4","mac":"EC:0D:9A:6F:A2:04","switch_id":"oid:0x21000000000000"} doesn't exist but received AGED event

fdbclear flow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant