forked from sonic-net/sonic-utilities
[Mellanox] [202311] Add CMIS Host Management Files to 'show techsupport' Dumps #5
Open: tshalvi wants to merge 71 commits into master from 202311_adding_cmis_host_mgmt_files_to_show_techsupport
* [sflow][db_migrator] Egress Sflow support
Why I did it
Fix issue sonic-net/sonic-buildimage#15047: after deleting a VLAN member and a VLAN, the counters for the VLAN / VLAN member are still seen.
How I did it
Delete the related counter entries in state_db when deleting a VLAN or a VLAN member.
How to verify it
All UTs passed. Manually tested.
Signed-off-by: Yaqiang Zhu <[email protected]>
**HLD:** sonic-net/SONiC#1501

#### What I did
* Implemented CLI for Generic Hash feature

#### How I did it
* Integrated Generic Hash interface into `config` and `show` CLI root

#### How to verify it
* Run Generic Hash CLI UTs

#### Previous command output (if the output of a command-line utility has changed)
```
root@sonic:/home/admin# show switch-hash global
ECMP HASH          LAG HASH
-----------------  -----------------
DST_MAC            DST_MAC
SRC_MAC            SRC_MAC
ETHERTYPE          ETHERTYPE
IP_PROTOCOL        IP_PROTOCOL
DST_IP             DST_IP
SRC_IP             SRC_IP
L4_DST_PORT        L4_DST_PORT
L4_SRC_PORT        L4_SRC_PORT
INNER_DST_MAC      INNER_DST_MAC
INNER_SRC_MAC      INNER_SRC_MAC
INNER_ETHERTYPE    INNER_ETHERTYPE
INNER_IP_PROTOCOL  INNER_IP_PROTOCOL
INNER_DST_IP       INNER_DST_IP
INNER_SRC_IP       INNER_SRC_IP
INNER_L4_DST_PORT  INNER_L4_DST_PORT
INNER_L4_SRC_PORT  INNER_L4_SRC_PORT
```

#### New command output (if the output of a command-line utility has changed)
```
root@sonic:/home/admin# show switch-hash global
+--------+-------------------------------------+
| Hash   | Configuration                       |
+========+=====================================+
| ECMP   | +-------------------+-------------+ |
|        | | Hash Field        | Algorithm   | |
|        | |-------------------+-------------| |
|        | | DST_MAC           | CRC         | |
|        | | SRC_MAC           |             | |
|        | | ETHERTYPE         |             | |
|        | | IP_PROTOCOL       |             | |
|        | | DST_IP            |             | |
|        | | SRC_IP            |             | |
|        | | L4_DST_PORT       |             | |
|        | | L4_SRC_PORT       |             | |
|        | | INNER_DST_MAC     |             | |
|        | | INNER_SRC_MAC     |             | |
|        | | INNER_ETHERTYPE   |             | |
|        | | INNER_IP_PROTOCOL |             | |
|        | | INNER_DST_IP      |             | |
|        | | INNER_SRC_IP      |             | |
|        | | INNER_L4_DST_PORT |             | |
|        | | INNER_L4_SRC_PORT |             | |
|        | +-------------------+-------------+ |
+--------+-------------------------------------+
| LAG    | +-------------------+-------------+ |
|        | | Hash Field        | Algorithm   | |
|        | |-------------------+-------------| |
|        | | DST_MAC           | CRC         | |
|        | | SRC_MAC           |             | |
|        | | ETHERTYPE         |             | |
|        | | IP_PROTOCOL       |             | |
|        | | DST_IP            |             | |
|        | | SRC_IP            |             | |
|        | | L4_DST_PORT       |             | |
|        | | L4_SRC_PORT       |             | |
|        | | INNER_DST_MAC     |             | |
|        | | INNER_SRC_MAC     |             | |
|        | | INNER_ETHERTYPE   |             | |
|        | | INNER_IP_PROTOCOL |             | |
|        | | INNER_DST_IP      |             | |
|        | | INNER_SRC_IP      |             | |
|        | | INNER_L4_DST_PORT |             | |
|        | | INNER_L4_SRC_PORT |             | |
|        | +-------------------+-------------+ |
+--------+-------------------------------------+
```
… branch (sonic-net#3082) Signed-off-by: Ying Xie <[email protected]>
Depends on PR sonic-net/sonic-buildimage#17458 What I did Add CLIs to enable/disable containercfgd to optimize warm/fast boot path How I did it Add CLIs to enable/disable containercfgd How to verify it unit test manual test
…et#3124) * Collect module EEPROM data in dump
sonic-net#3008) (sonic-net#3073) * Support reading/writing module EEPROM data by page and offset (sonic-net#3008) * Support reading/writing module EEPROM data by page and offset
* Revert "[config/show] Add command to control pending FIB suppression (sonic-net#2495)" This reverts commit 9126e7f. * Revert "Revert "Revert frr route check (sonic-net#2761)" (sonic-net#2762)" This reverts commit b4f4e63.
What I did Need to support golden config in db migrator. How I did it If there's golden config json, read from golden config instead of minigraph. And db migrator will use golden config data to generate new configuration. How to verify it Run unit test.
What I did
db_migrator failed to initialize SonicDBConfig; this change fixes that.
How I did it
If SonicDBConfig is already initialized, do not invoke initialize() again.
How to verify it
Ran the unit tests and verified on a DUT.
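The guard described above can be sketched as follows. This is a minimal illustration only: `FakeDBConfig`, its `is_init()` method, and `ensure_initialized()` are hypothetical stand-ins, not the real SonicDBConfig API, and the assumption that a second `initialize()` call fails is made here just to show why the guard matters.

```python
class FakeDBConfig:
    """Hypothetical stand-in for a process-wide DB config object."""

    _initialized = False
    init_calls = 0

    @classmethod
    def is_init(cls):
        return cls._initialized

    @classmethod
    def initialize(cls):
        # Assumption for this sketch: initializing twice is an error.
        if cls._initialized:
            raise RuntimeError("config already initialized")
        cls._initialized = True
        cls.init_calls += 1


def ensure_initialized(cfg=FakeDBConfig):
    """Initialize the config only if nobody has done so already."""
    if not cfg.is_init():
        cfg.initialize()


# Calling the guard twice is safe; initialize() runs exactly once.
ensure_initialized()
ensure_initialized()
print(FakeDBConfig.init_calls)
```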
…tus (sonic-net#3069) For each BGP status, if the `admin_status` field is not present, then whether the BGP session is admin up or admin down depends on the default BGP status (in the `default_bgp_status` field coming from `init_cfg.json`), which is specified during image build. If the default BGP status is up, then `admin_status` will be created only when the BGP session is brought down; similarly, if the default BGP status is down, then `admin_status` will be created when the BGP session is brought up. Because of that, modify the script to use the default BGP status as the initial value. Signed-off-by: Saikrishna Arcot <[email protected]>
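As a rough sketch of the fallback described above: the field names `admin_status` and `default_bgp_status` come from the commit message, while the helper name and the sample neighbor entries are made up for illustration and are not taken from the actual script.

```python
def effective_admin_status(neighbor_cfg, default_bgp_status):
    """Return the admin state of one BGP neighbor entry (hypothetical helper).

    An explicit admin_status wins; otherwise the image-wide default applies.
    """
    return neighbor_cfg.get("admin_status", default_bgp_status)


# Sample data, invented for this example.
neighbors = {
    "10.0.0.1": {"name": "peer-a"},                          # no admin_status
    "10.0.0.3": {"name": "peer-b", "admin_status": "down"},  # explicitly down
}

statuses = {
    ip: effective_admin_status(cfg, "up") for ip, cfg in neighbors.items()
}
print(statuses)
```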
Fix sonic-net/sonic-buildimage#17322 Remove the route migration operation from db_migrator. The route migration operation takes a lot of time as indicated in the below issue. This is not necessary since the hardcoded assert in the fpmsyncd on new fields is removed in sonic-net/sonic-swss#2981
…no external neighbors are configured on chassis LC (sonic-net#3099) Support show ip bgp summary to display without error when no external neighbors are configured on chassis LC
…atforms (sonic-net#3115) Disabling key validation feature in grub file as it's not yet supported for Cisco platforms
What I did
Check whether the platform the image is being installed on is a Cisco platform, and return success if it is. This way, signature verification is not performed, since this feature is not yet supported on our platforms.
How I did it
Modified the sonic-installer grub.py code.
Add the core files to the tarball while they are being processed. This ensures that only one core file at a time consumes flash space in both the tarpath and the tarball.
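The idea can be illustrated with a small Python sketch. It is not the actual dump script; the function name, directory layout, and file names are assumptions chosen for the example. Each file is staged, appended to the archive, and removed before the next one is handled.

```python
import os
import shutil
import tarfile
import tempfile


def add_cores_one_at_a_time(core_files, staging_dir, tar_path):
    """Stage, archive, and delete each core file before touching the next."""
    for src in core_files:
        staged = os.path.join(staging_dir, os.path.basename(src))
        shutil.copy(src, staged)  # temporary copy in the staging area
        # Mode "a" appends to an uncompressed tar (created if missing).
        with tarfile.open(tar_path, "a") as tar:
            tar.add(staged, arcname=os.path.basename(staged))
        os.remove(staged)  # free the staged copy before the next file


with tempfile.TemporaryDirectory() as tmp:
    core_dir = os.path.join(tmp, "core")
    staging = os.path.join(tmp, "staging")
    os.makedirs(core_dir)
    os.makedirs(staging)

    cores = []
    for name in ("a.core", "b.core"):
        path = os.path.join(core_dir, name)
        with open(path, "w") as f:
            f.write("dummy")
        cores.append(path)

    tar_path = os.path.join(tmp, "dump.tar")
    add_cores_one_at_a_time(cores, staging, tar_path)

    with tarfile.open(tar_path) as tar:
        archived = sorted(tar.getnames())
    leftover = os.listdir(staging)

print(archived, leftover)
```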
…client.eth0.pid does not exist" (sonic-net#3149) * Fix load_mgmt_config not exit when dhclient.eth0.pid not exists Signed-off-by: Mai Bui <[email protected]> * add UT Signed-off-by: Mai Bui <[email protected]> --------- Signed-off-by: Mai Bui <[email protected]>
…decimal (sonic-net#3153) (sonic-net#3160) * Fix the sfputil treats page number as decimal instead of hexadecimal (sonic-net#22) Signed-off-by: Kebo Liu <[email protected]> Co-authored-by: Kebo Liu <[email protected]>
…SIC (sonic-net#3158) This PR sonic-net#3099 fixes the case where on chassis Linecard there are no BGP neighbors. However, if the Linecard has neighbors on one ASIC but not on other, the command show bgp summary displayed no neighbors. This PR fixes this. How I did it Add check in bgp_util to create empty peer list only once Add UT to cover this case
…KUs if the buffer configuration is empty (sonic-net#3114)

### What I did
Do not touch the buffer model on generic SKUs if the buffer configuration is empty.

#### How I did it
Set the buffer model to traditional on generic SKUs in the Mellanox db migrator only if the buffer configuration is not default and not empty.

#### How to verify it
Manual and mock tests.

### Details
#### Buffer configuration contains two parts:
1. the buffer model in `DEVICE_METADATA|localhost`, which comes from `init_cfg.json` and can be updated by the Mellanox buffer migrator
2. the buffer pools, profiles, PGs, and queues, which are rendered from the buffer templates in `config qos reload`

There was logic to update the buffer model in the Mellanox buffer migrator: if the buffer configuration is not default, the buffer model is set to traditional. However, if a device is installed from ONIE, the buffer configuration is also empty. As a result, the traditional buffer manager starts after the device is installed from ONIE, and the buffer manager must be restarted to switch to the dynamic model. This can be done only by `config reload`.

This didn't matter before, because `config qos reload` had to be executed to complete the buffer configuration, which required `config save` and `config reload` in any case due to issue sonic-net/sonic-buildimage#9088. Now that the issue has been fixed and `config reload` is no longer required to complete `config qos reload`, we should avoid setting the buffer model to traditional in this case; otherwise `config reload` is still required to switch the buffer model.

Verified the following scenarios:
1. non-default configuration, generic SKU, upgrade from 202305 by warm/cold boot: expected traditional model
2. default configuration, generic SKU, upgrade from 201911/202305 by warm/cold boot: expected dynamic model
3. install from ONIE: expected dynamic model
4. MSFT SKU upgrade from 201911 by cold boot / from 202012 by warm boot: expected traditional model
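The decision rule for generic SKUs can be summarized in a short sketch. The predicate name and the dictionary shapes below are hypothetical; the real migrator inspects CONFIG_DB tables and is not reproduced here.

```python
def should_set_traditional(buffer_config, default_config):
    """Hypothetical predicate for generic SKUs.

    Switch to the traditional buffer model only when a buffer configuration
    exists and differs from the default one.
    """
    if not buffer_config:
        # Empty configuration, e.g. a fresh install from ONIE: leave the model alone.
        return False
    return buffer_config != default_config


# Toy configurations, invented for the example.
default_cfg = {"BUFFER_POOL": {"ingress_lossless_pool": {"size": "1000"}}}
custom_cfg = {"BUFFER_POOL": {"ingress_lossless_pool": {"size": "2000"}}}

print(should_set_traditional(custom_cfg, default_cfg))         # non-default config
print(should_set_traditional(dict(default_cfg), default_cfg))  # default config
print(should_set_traditional({}, default_cfg))                 # install from ONIE
```

The three calls correspond to the first three verified scenarios listed above (traditional, dynamic, dynamic).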
…c-net#3174) Signed-off-by: Mihir Patel <[email protected]>
…le (sonic-net#3177) * Retrieve firmware version fields from TRANSCEIVER_FIRMWARE_INFO table Signed-off-by: Mihir Patel <[email protected]> * Fixed test failures * Removed update_firmware_info_to_state_db function * Revert "Removed update_firmware_info_to_state_db function" This reverts commit 68f52a2. --------- Signed-off-by: Mihir Patel <[email protected]>
…ateTask thread (sonic-net#3187) * CLI to skip polling for periodic information for a port in DomInfoUpdateTask thread Signed-off-by: Mihir Patel <[email protected]> * Fixed unit-test failure * Modified dom_status to dom_polling * Modified comment for failing the command --------- Signed-off-by: Mihir Patel <[email protected]>
…ATE_DB is empty (sonic-net#3199) * Add skip_action_validation option to acl-loader
…ic-net#3148) (sonic-net#3224) * [show] Update show run all to cover all asic config in masic * per comment Co-authored-by: jingwenxie <[email protected]>
The port2alias CLI broke on multi-asic platforms after sonic-net/sonic-buildimage#10960, which removed the initialization of the global DB config from portconfig.py (library side) and expects the application to do it. The application side (port2alias) was not updated accordingly.
How I did it
Add a load_db_config call to port2alias for initialization.
#### What I did
Add alerting for YANG validation when running load_minigraph with override. This alerts early if the golden config is invalid, which would break the GCU feature.
#### How I did it
Add alerting when `is_yang_config_validation_enabled` is not set during load_minigraph with override.
#### How to verify it
Unit test
…sonic-net#3240) * [fast/warm-reboot] Retain TRANSCEIVER_INFO/STATUS tables on reboot Signed-off-by: Stepan Blyschak <[email protected]> * Remove TRANSCEIVER_STATUS --------- Signed-off-by: Stepan Blyschak <[email protected]>
sonic-net#3272) - What I did Add support for a new platform x86_64-nvidia_sn5400-r0 - How to verify it Manual and unit test
…l reboot (sonic-net#3292) * [chassis][midplane] Add notification to Supervisor when LC is graceful reboot * Address review comment by adding log message when failed to create wentry in CHASSIS_STATE_DB Signed-off-by: mlok <[email protected]>
…-net#3236) ### What I did Update sonic-utilities to support new SKU Mellanox-SN5600-O128 1. Add the SKU to the generic configuration updater 2. Simplify the logic of the buffer migrator to support the new SKU ### How to verify it Manual and unit tests
Migrate AAA table in db_migrator
#### Why I did it
Per-command AAA needs to be enabled in the warm-upgrade case.
#### How I did it
Add db_migrator code to migrate the AAA table.
#### How to verify it
Pass all test cases. Add a new test case.
#### Which release branch to backport (provide reason below if selected)
N/A
#### Description for the changelog
Migrate AAA table in db_migrator
…#3296) Migrate AAA table per-command authorization in db_migrator
#### Why I did it
Per-command AAA needs to be enabled in the warm-upgrade case.
#### How I did it
Add code to migrate per-command authorization.
#### How to verify it
Pass all test cases. Add a new test case.
#### Which release branch to backport (provide reason below if selected)
N/A
#### Description for the changelog
Migrate AAA table per-command authorization in db_migrator
…#3315) * [202311] Show running config when bgp is down
…onic-net#3305) - What I did Added code to remove leftover symlinks and directories created by featured. Featured creates a symlink to /dev/null when unit is masked and an auto restart configuration is left under corresponding service.d/ directory. - How I did it Added necessary changes and UT to cover it. - How to verify it Uninstall an extension and verify no leftovers from featured. Signed-off-by: Stepan Blyschak <[email protected]>
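A minimal sketch of this kind of cleanup is shown below. It assumes the masked unit is a symlink to /dev/null and that the leftover drop-in directory is named `<unit>.d`; the function name, the unit name, and the `auto_restart.conf` file name are invented for the example and are not taken from the PR.

```python
import os
import shutil
import tempfile


def remove_feature_leftovers(systemd_dir, unit):
    """Remove a mask symlink and a leftover drop-in directory for one unit."""
    unit_path = os.path.join(systemd_dir, unit)
    # A masked unit is represented by a symlink pointing at /dev/null.
    if os.path.islink(unit_path) and os.readlink(unit_path) == "/dev/null":
        os.unlink(unit_path)
    # Drop-in configuration lives in "<unit>.d/".
    dropin_dir = unit_path + ".d"
    if os.path.isdir(dropin_dir):
        shutil.rmtree(dropin_dir)


with tempfile.TemporaryDirectory() as systemd_dir:
    unit = "example.service"
    os.symlink("/dev/null", os.path.join(systemd_dir, unit))
    os.makedirs(os.path.join(systemd_dir, unit + ".d"))
    open(os.path.join(systemd_dir, unit + ".d", "auto_restart.conf"), "w").close()

    remove_feature_leftovers(systemd_dir, unit)
    remaining = os.listdir(systemd_dir)

print(remaining)
```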
…en urllib3 and requests packages (sonic-net#3328) (sonic-net#3337) * [build] Fix base OS compilation issue caused by incompatibility between urllib3 and requests packages * [pipeline] Pin request package to v2.31.0
* Backup STATE_DB PORT_TABLE during warm-reboot Signed-off-by: Mihir Patel <[email protected]> * Backing up selected fields from STATE_DB PORT_TABLE|Ethernet* and deleting unwanted fields during warm-reboot --------- Signed-off-by: Mihir Patel <[email protected]>
- What I did
Change the target path for the SDK sniffer from "/var/log/mellanox/sniffer/" to "/var/log/sdk_dbg".
- How I did it
Change the default for SDK_SNIFFER_TARGET_PATH.
- How to verify it
Run the SDK sniffer and make sure the sniffer output file is kept in the new location.
…V256 (sonic-net#3312) - What I did Update sonic-utilities to support new SKU Mellanox-SN5600-V256 Add the SKU to the generic configuration updater - How I did it - How to verify it Manual and unit tests
**What I did?**
1. Bugfix for console CLI (introduced by "[consutil] replace shell=True" sonic-net#2725; `*` cannot be treated as a wildcard correctly).
```
admin@sonic:~$ show line
ls: cannot access '/dev/C0-*': No such file or directory
```
2. Enhance UT to avoid the regression mentioned in 1.
3. Fix incorrect statement in UT.
4. Fix critical Flake8 error.

**How to verify it**
1. Verified on Nokia-7215 MC0 device.
2. Verified by UT

Sign-Off By: Zhijian Li <[email protected]>
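For background on the wildcard problem in item 1: when a command is run without a shell, `*` is passed to the program literally, which matches the `ls` error shown. One common way to handle this is to expand the pattern in Python. The sketch below illustrates that general approach with temporary files; it is not the fix applied in this PR, and the helper name is hypothetical.

```python
import glob
import os
import tempfile


def list_console_devices(prefix):
    """Expand '<prefix>*' in Python instead of relying on a shell."""
    return sorted(glob.glob(prefix + "*"))


with tempfile.TemporaryDirectory() as tmp:
    # Create fake device nodes named like the /dev/C0-* pattern above.
    for n in (1, 2):
        open(os.path.join(tmp, "C0-{}".format(n)), "w").close()
    open(os.path.join(tmp, "ttyUSB0"), "w").close()  # should not match

    devices = list_console_devices(os.path.join(tmp, "C0-"))
    names = [os.path.basename(d) for d in devices]

print(names)
```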
…nic-net#3370) Signed-off-by: Mihir Patel <[email protected]>
The previous commit (hash a3cf5c), which aimed to address the issue where sfputil incorrectly interpreted page numbers as decimal instead of hexadecimal, inadvertently converted from hexadecimal to decimal twice. For instance, inputting 11 resulted in conversion to 17 and then further to 23. To rectify this, the second conversion is removed. A related UT has also been added. Signed-off-by: Yuanzhe, Liu <[email protected]>
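The arithmetic in the example (11 to 17 to 23) can be reproduced with a few lines of Python. The function names are hypothetical and do not correspond to the sfputil source; they only show how a second hexadecimal parse of an already converted value goes wrong.

```python
def parse_page_buggy(text):
    """Apply the hex conversion twice, as in the faulty behavior described."""
    page = int(text, 16)       # "11" parsed as hex gives 17
    return int(str(page), 16)  # "17" parsed as hex again gives 23


def parse_page_fixed(text):
    """Parse the page number as hexadecimal exactly once."""
    return int(text, 16)


print(parse_page_buggy("11"), parse_page_fixed("11"))
```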
… (sonic-net#3372) * Improve load_minigraph to wait for eth0 restart before exit
- What I did
Back up the DB after syncd and swss are stopped. I observed an issue with fast-reboot where, in rare circumstances, a queued FDB event might be written to ASIC_DB by a thread inside syncd after a call to FLUSHDB ASIC_DB was made. That left ASIC_DB with only one record, for that FDB entry, and caused syncd to crash at start:
Mar 15 13:28:42.765108 sonic NOTICE syncd#SAI: :- Syncd: syncd started
Mar 15 13:28:42.765268 sonic NOTICE syncd#SAI: :- onSyncdStart: performing hard reinit since COLD start was performed
Mar 15 13:28:42.765451 sonic NOTICE syncd#SAI: :- readAsicState: loaded 1 switches
Mar 15 13:28:42.765465 sonic NOTICE syncd#SAI: :- readAsicState: switch VID: oid:0x21000000000000
Mar 15 13:28:42.765465 sonic NOTICE syncd#SAI: :- readAsicState: read asic state took 0.000205 sec
Mar 15 13:28:42.766364 sonic NOTICE syncd#SAI: :- onSyncdStart: on syncd start took 0.001097 sec
Mar 15 13:28:42.766376 sonic ERR syncd#SAI: :- run: Runtime error during syncd init: map::at
Mar 15 13:28:42.766376 sonic NOTICE syncd#SAI: :- sendShutdownRequest: sending switch_shutdown_request notification to OA for switch: oid:0x0
Mar 15 13:28:42.766518 sonic NOTICE syncd#SAI: :- sendShutdownRequestAfterException: notification send successfully
- How I did it
Back up the DB after syncd/swss have stopped.
- How to verify it
Run fast-reboot.
Signed-off-by: Stepan Blyschak <[email protected]>
* [pbh]: Fix show PBH counters when cache is partial. Signed-off-by: Nazarii Hnydyn <[email protected]>
* [DPB]Fix return code in case of failure * Updating UT
What I did
Show techsupport is designed to collect logs and core files since a given date. I found that some core files are missing when the given date is relative, for example "5 minutes ago". Microsoft ADO: 28737486
How I did it
Create the reference file at the start of the script, and don't update it in find_files.
How to verify it
Run end to end test: show_techsupport/test_auto_techsupport.py
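Why a relative date needs to be resolved only once can be shown with a small sketch. It uses plain timestamps rather than the reference file the script relies on, and the file names and times are invented for illustration.

```python
from datetime import datetime, timedelta


def select_since(files, cutoff):
    """Return the names of files modified at or after the cutoff."""
    return [name for name, mtime in files if mtime >= cutoff]


start = datetime(2024, 1, 1, 12, 0, 0)  # moment the collection starts
window = timedelta(minutes=5)           # "5 minutes ago"

files = [
    ("core.a", start - timedelta(minutes=4)),   # inside the requested window
    ("core.b", start - timedelta(minutes=10)),  # too old
]

# Resolved once at startup: the cutoff stays at 11:55 for the whole run.
fixed_cutoff = start - window
fixed = select_since(files, fixed_cutoff)

# Re-resolved 3 minutes into the run: the cutoff drifts to 11:58 and
# core.a (11:56) is silently dropped.
moving_cutoff = (start + timedelta(minutes=3)) - window
moving = select_since(files, moving_cutoff)

print(fixed, moving)
```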
…cs (sonic-net#3448)
What I did
Due to a conflict while cherry-picking PR#3369 to branch 202311, this pull request was re-created to merge it manually.
Add a debug group and a loopback sub-command under the sfputil command for debugging and module diagnostic purposes.
How I did it
Implement the loopback command by directly calling the set_loopback_mode() API.
How to verify it
Tested on Cisco8111 with a Credo C1 cable.
Turn off loopback mode:
sfputil debug loopback Ethernet88 none
Turn on host input loopback:
sfputil debug loopback Ethernet88 host-side-input
MSFT ADO: 26677525
Signed-off-by: xinyu <[email protected]>
)
#### What I did
If something goes wrong while reading the EEPROM when executing the `decode-syseeprom` script, the script raises an exception and logs the error. However, `log` was not defined in `decode-syseeprom`, which raised the following error:
```
Traceback (most recent call last):
  File "/usr/local/bin/decode-syseeprom", line 264, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/bin/decode-syseeprom", line 246, in main
    print_serial(use_db)
  File "/usr/local/bin/decode-syseeprom", line 171, in print_serial
    eeprom = instantiate_eeprom_object()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/decode-syseeprom", line 36, in instantiate_eeprom_object
    log.log_error('Failed to obtain EEPROM object due to {}'.format(repr(e)))
    ^^^
NameError: name 'log' is not defined
```
This PR adds the definition of `log` to avoid this error.
#### How I did it
Add the definition of `log`.
#### How to verify it
```
admin@vlab-01:~$ sudo decode-syseeprom -s
Failed to read system EEPROM info
```
…4 (202311) (sonic-net#3438) Signed-off-by: Andriy Yurkiv <[email protected]>
…d to 'show interfaces autoneg status'
…nd used save_cmd() instead of running the command directly and manually storing the output in a file
tshalvi force-pushed the 202311_adding_cmis_host_mgmt_files_to_show_techsupport branch from 4893f69 to 6a27b47 (August 28, 2024 08:19)
What I did
For Mellanox platforms, I added the following CMIS host management-related files to the 'show techsupport' dumps (if they exist): sai.profile, pmon_daemon_control.json, media_settings.json, optics_si_settings.json, and autoneg.status.
How I did it
I copied the relevant files from the SKU/platform folder and ran the 'show interfaces autoneg status' command to store the auto-negotiation status for all ports.
How to verify it
Run 'show techsupport' and verify that autoneg.status is located in the 'dumps' directory and that the other files are present in the cmis-host-mgmt path within the generated dump.
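The "copy if they exist" behavior described under "How I did it" could look roughly like the sketch below. The file names and the cmis-host-mgmt directory name come from this description; the function name, the search-directory list, and the temporary layout are assumptions, and the PR's actual implementation is not shown here.

```python
import os
import shutil
import tempfile

# File names listed in the PR description (autoneg.status is produced by a
# command, so it is not part of the copy step).
CMIS_FILES = [
    "sai.profile",
    "pmon_daemon_control.json",
    "media_settings.json",
    "optics_si_settings.json",
]


def collect_cmis_files(search_dirs, dump_dir):
    """Copy each known file from the first directory that has it; skip the rest."""
    dest = os.path.join(dump_dir, "cmis-host-mgmt")
    os.makedirs(dest, exist_ok=True)
    copied = []
    for name in CMIS_FILES:
        for directory in search_dirs:
            src = os.path.join(directory, name)
            if os.path.isfile(src):
                shutil.copy(src, os.path.join(dest, name))
                copied.append(name)
                break  # do not look in later directories for this file
    return copied


with tempfile.TemporaryDirectory() as tmp:
    sku_dir = os.path.join(tmp, "sku")
    platform_dir = os.path.join(tmp, "platform")
    dump_dir = os.path.join(tmp, "dump")
    for d in (sku_dir, platform_dir, dump_dir):
        os.makedirs(d)

    # Only two of the four files exist in this toy layout.
    open(os.path.join(sku_dir, "sai.profile"), "w").close()
    open(os.path.join(platform_dir, "media_settings.json"), "w").close()

    copied = collect_cmis_files([sku_dir, platform_dir], dump_dir)
    present = sorted(os.listdir(os.path.join(dump_dir, "cmis-host-mgmt")))

print(copied, present)
```

Missing files are skipped without error, which matches the "(if they exist)" wording in the description.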