[bgpd] Crash in libyang during boot up #20760

Open
stepanblyschak opened this issue Nov 11, 2024 · 1 comment

stepanblyschak commented Nov 11, 2024

Description

The BGP docker crashed during the test (bgpd terminated by SIGSEGV) and could not be recovered for the rest of the test run.

Steps to reproduce the issue:

  1. The issue was reproduced while running the sonic-mgmt test platform_tests/test_advanced_reboot.py::test_warm_reboot_sad.
  2. bgpd exited with SIGSEGV and the BGP docker did not come back up during the test:
2024 Nov  9 03:21:40.847546 arc-switch1004 INFO bgp#supervisord 2024-11-09 01:21:40,845 WARN exited: bgpd (terminated by SIGSEGV (core dumped); not expected)

Describe the results you received:

Logs:

2024 Nov  9 03:21:37.123288 sonic DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpfcjtdazx']'.
2024 Nov  9 03:21:37.194414 sonic INFO sonic-ztp[4005]: ZTP is administratively disabled.
2024 Nov  9 03:21:37.445364 sonic CRIT bgp#BGP[60]: Received signal 11 at 1731115297 (si_addr 0x4, PC 0x7fdc3c18748c); aborting...
2024 Nov  9 03:21:37.449344 sonic CRIT bgp#BGP[60]: zlog_signal+0xf5                   7fdc3c61f345     7ffcf5aed3b0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.449344 sonic CRIT bgp#BGP[60]: PBKDF2_SHA256+0x4b1                7fdc3c64cf81     7ffcf5aed4f0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.449344 sonic CRIT bgp#BGP[60]: __sigaction+0x40                   7fdc3c2e2050     7ffcf5aed640 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fdc3c2a6000)
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]:     ---- signal ----
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]: ly_err_print+0xe1c                 7fdc3c18748c     7ffcf5aedae0 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x3af0       7fdc3c21c5f0     7ffcf5aedb50 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x672d       7fdc3c21f22d     7ffcf5aedbb0 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x15a46      7fdc3c22e546     7ffcf5aedc20 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x1364b      7fdc3c22c14b     7ffcf5aedd00 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x12b62      7fdc3c22b662     7ffcf5aede90 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x17664      7fdc3c230164     7ffcf5aee020 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyxp_get_expr+0x1a6                7fdc3c2307e6     7ffcf5aee090 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyxp_get_expr+0x2a97               7fdc3c2330d7     7ffcf5aee1a0 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyxp_get_expr+0x30d7               7fdc3c233717     7ffcf5aee240 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyd_validate_all+0x42              7fdc3c2338e2     7ffcf5aee370 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov  9 03:21:37.469685 sonic CRIT bgp#BGP[60]: nb_candidate_commit_prepare+0x4e     7fdc3c62ea8e     7ffcf5aee3a0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.469685 sonic CRIT bgp#BGP[60]: nb_candidate_commit+0x47           7fdc3c62ed97     7ffcf5aee400 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.469685 sonic CRIT bgp#BGP[60]: nb_terminate+0x29f8                7fdc3c631c68     7ffcf5aee450 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.483873 sonic CRIT bgp#BGP[60]: nb_cli_pending_commit_check+0x28     7fdc3c631da8     7ffcf5af04b0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.483873 sonic CRIT bgp#BGP[60]: cmd_exit+0x28d                     7fdc3c5f169d     7ffcf5af04d0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.483873 sonic CRIT bgp#BGP[60]: cmd_execute_command+0xd7           7fdc3c5f19f7     7ffcf5af0540 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.491285 sonic CRIT bgp#BGP[60]: cmd_execute+0xd0                   7fdc3c5f1c10     7ffcf5af0590 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.491285 sonic CRIT bgp#BGP[60]: vty_set_include+0x197              7fdc3c664127     7ffcf5af05f0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.491285 sonic CRIT bgp#BGP[60]: vty_set_include+0x964              7fdc3c6648f4     7ffcf5af26a0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.491285 sonic CRIT bgp#BGP[60]: vty_close+0x1f08                   7fdc3c667b48     7ffcf5af26e0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.491285 sonic CRIT bgp#BGP[60]: thread_call+0x7d                   7fdc3c65ee2d     7ffcf5af2930 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.517219 sonic CRIT bgp#BGP[60]: frr_run+0xe8                       7fdc3c617368     7ffcf5af29d0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov  9 03:21:37.533015 sonic CRIT bgp#BGP[60]: main+0x36b                         55e2769d238b     7ffcf5af2be0 /usr/lib/frr/bgpd (mapped at 0x55e2768e8000)
2024 Nov  9 03:21:37.533015 sonic CRIT bgp#BGP[60]: __libc_init_first+0x8a             7fdc3c2cd24a     7ffcf5af2c40 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fdc3c2a6000)
2024 Nov  9 03:21:37.533015 sonic CRIT bgp#BGP[60]: __libc_start_main+0x85             7fdc3c2cd305     7ffcf5af2ce0 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fdc3c2a6000)
2024 Nov  9 03:21:37.533015 sonic CRIT bgp#BGP[60]: _start+0x21                        55e2769d4091     7ffcf5af2d30 /usr/lib/frr/bgpd (mapped at 0x55e2768e8000)
2024 Nov  9 03:21:37.533015 sonic CRIT bgp#BGP[60]: in thread vtysh_read scheduled from ../lib/vty.c:2740 vty_event()
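
Each frame in the trace above carries a program counter and the address at which its library is mapped. A minimal sketch of turning that pair into an offset inside the library, assuming the "mapped at" value is the load base of the shared object (`module_offset` is a hypothetical helper, not part of FRR or libyang):

```c
#include <assert.h>
#include <stdint.h>

/* Offset of a program counter inside its module, given the module's
 * load base (assumed to be the "mapped at" value from the trace). */
static uint64_t module_offset(uint64_t pc, uint64_t mapped_base)
{
    assert(pc >= mapped_base); /* the PC must lie inside the mapping */
    return pc - mapped_base;
}
```

For the faulting frame, `module_offset(0x7fdc3c18748cULL, 0x7fdc3c177000ULL)` gives 0x1048c. An offset like this can typically be looked up with a symbolizer against a libyang.so.2 build that has matching debug symbols.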

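The gdb backtrace from the core dump shows that the crash originates in the libyang hash table implementation. The symbol names in the syslog trace are only approximate; the faulting PC 0x7fdc3c18748c resolves to lyht_insert_with_resize_cb in frame #5: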

#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x00007fdc3c330e9f in __pthread_kill_internal (signo=11, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007fdc3c2e1fb2 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#3  0x00007fdc3c64cfbc in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#4  <signal handler called>
#5  0x00007fdc3c18748c in lyht_insert_with_resize_cb (ht=0x55e2784fb6d0, val_p=0x7ffcf5aedb5c, hash=3760792813, resize_val_equal=resize_val_equal@entry=0x0, match_p=0x0) at ./src/hash_table.c:697
#6  0x00007fdc3c187b5a in lyht_insert (ht=<optimized out>, val_p=<optimized out>, hash=<optimized out>, match_p=<optimized out>) at ./src/hash_table.c:746
#7  0x00007fdc3c21c5f0 in set_insert_node_hash (set=0x55e2784fc970, node=0x55e2784f5160, type=<optimized out>) at ./src/xpath.c:647
#8  0x00007fdc3c21f22d in moveto_node (set=set@entry=0x55e2784fc970, moveto_mod=0x55e277e02270, ncname=ncname@entry=0x55e277dead90 "entry", options=options@entry=2) at ./src/xpath.c:5603
#9  0x00007fdc3c22e546 in eval_name_test_with_predicate (options=2, set=<optimized out>, all_desc=<optimized out>, attr_axis=<optimized out>, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:7350
#10 eval_relative_location_path (exp=0x55e277e3a990, tok_idx=0x7ffcf5aee046, all_desc=<optimized out>, set=<optimized out>, options=2) at ./src/xpath.c:7522
#11 0x00007fdc3c22ae8d in eval_path_expr (options=21986, set=<optimized out>, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:8072
#12 0x00007fdc3c22c14b in eval_function_call (options=2, set=0x7ffcf5aee0d0, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:7772
#13 eval_path_expr (options=2, set=0x7ffcf5aee0d0, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:8002
#14 eval_expr_select (exp=exp@entry=0x55e277e3a990, tok_idx=tok_idx@entry=0x7ffcf5aee046, etype=etype@entry=LYXP_EXPR_OR, set=set@entry=0x7ffcf5aee0d0, options=options@entry=2) at ./src/xpath.c:8666
#15 0x00007fdc3c22b662 in eval_or_expr (options=2, set=0x7ffcf5aee0d0, repeat=<optimized out>, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:8558
#16 eval_expr_select (exp=exp@entry=0x55e277e3a990, tok_idx=tok_idx@entry=0x7ffcf5aee046, etype=etype@entry=LYXP_EXPR_NONE, set=set@entry=0x7ffcf5aee0d0, options=options@entry=2) at ./src/xpath.c:8642
#17 0x00007fdc3c230164 in lyxp_eval (ctx=0x55e277db3a00, exp=0x55e277e3a990, cur_mod=0x55e277e0e390, format=format@entry=LY_VALUE_SCHEMA_RESOLVED, prefix_data=<optimized out>, ctx_node=0x55e2784fa6c0, tree=0x55e277e2d6b0, 
    vars=<optimized out>, set=<optimized out>, options=<optimized out>) at ./src/xpath.c:8758
#18 0x00007fdc3c2307e6 in lyd_validate_node_when (tree=0x55e277e2c610, node=node@entry=0x55e2784fbd10, schema=<optimized out>, disabled=disabled@entry=0x7ffcf5aee1f0) at ./src/validation.c:153
#19 0x00007fdc3c2330d7 in lyd_validate_unres_when (diff=0x0, node_types=0x7ffcf5aee2e0, node_when=<optimized out>, mod=0x55e277e02270, tree=0x7ffcf5aee2d0) at ./src/validation.c:206
#20 lyd_validate_unres (tree=0x7ffcf5aee2d0, mod=0x55e277e02270, node_when=<optimized out>, node_exts=0x7ffcf5aee310, node_types=0x7ffcf5aee2e0, meta_types=0x7ffcf5aee2f0, diff=0x0) at ./src/validation.c:322
#21 0x00007fdc3c233717 in lyd_validate (tree=0x55e277dff110, module=module@entry=0x0, ctx=0x55e277db3a00, val_opts=1, validate_subtree=validate_subtree@entry=1 '\001', node_when_p=0x7ffcf5aee300, node_when_p@entry=0x0, 
    node_exts_p=0x7ffcf5aee310, node_types_p=0x7ffcf5aee2e0, meta_types_p=0x7ffcf5aee2f0, diff=0x0) at ./src/validation.c:1577
#22 0x00007fdc3c2338e2 in lyd_validate_all (tree=<optimized out>, ctx=<optimized out>, val_opts=<optimized out>, diff=<optimized out>) at ./src/validation.c:1604
#23 0x00007fdc3c62ea8e in nb_candidate_commit_prepare () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#24 0x00007fdc3c62ed97 in nb_candidate_commit () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#25 0x00007fdc3c631c68 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#26 0x00007fdc3c631da8 in nb_cli_pending_commit_check () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#27 0x00007fdc3c5f169d in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#28 0x00007fdc3c5f19f7 in cmd_execute_command () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#29 0x00007fdc3c5f1c10 in cmd_execute () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#30 0x00007fdc3c664127 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#31 0x00007fdc3c6648f4 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#32 0x00007fdc3c667b48 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#33 0x00007fdc3c65ee2d in thread_call () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#34 0x00007fdc3c617368 in frr_run () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#35 0x000055e2769d238b in main ()

The crash happens on a load through %rcx, which is 0 (NULL):

   0x00007fdc3c187484 <+788>:   test   %ebx,%ebx
   0x00007fdc3c187486 <+790>:   jg     0x7fdc3c187609 <lyht_insert_with_resize_cb+1177>
=> 0x00007fdc3c18748c <+796>:   mov    0x4(%rcx),%ebx
   0x00007fdc3c18748f <+799>:   jmp    0x7fdc3c187236 <lyht_insert_with_resize_cb+198>

...

(gdb) p $rcx
$42 = 0

This corresponds to the following code in libyang, which is consistent with rec being NULL when rec->hits is read:

LY_ERR
lyht_insert_with_resize_cb(struct hash_table *ht, void *val_p, uint32_t hash, lyht_value_equal_cb resize_val_equal,
        void **match_p)

...

    /* insert it into the returned record */
    assert(rec->hits < 1);
    if (rec->hits < 0) {    /* <========= crashes on this line */
        --ht->invalid;
    }
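
The faulting instruction reads 0x4(%rcx) with %rcx equal to 0, and the signal handler reports si_addr 0x4. A minimal sketch of why that points at a read of rec->hits through a NULL rec, assuming the record stores a 32-bit hash followed by a 32-bit hits counter (`rec_sketch` and `hits_address` are hypothetical names for illustration, not libyang API, and the field layout is an assumption):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a hash table record; the field order is an
 * assumption made for this illustration. */
struct rec_sketch {
    uint32_t hash;        /* hash of the stored value */
    int32_t hits;         /* counter read by the crashing line */
    unsigned char val[1]; /* start of the stored value */
};

/* Address touched by a read of rec->hits for a given record address. */
static uintptr_t hits_address(uintptr_t rec_addr)
{
    return rec_addr + offsetof(struct rec_sketch, hits);
}
```

Under that assumed layout, `hits_address(0)` is 0x4, which matches both the si_addr in the log and the 0x4(%rcx) operand in the disassembly.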

Describe the results you expected:

Output of show version:

SONiC Software Version: SONiC.202405_RC.45-28a64576c_Internal
SONiC OS Version: 12
Distribution: Debian 12.7
Kernel: 6.1.0-22-2-amd64
Build commit: 28a64576c
Build date: Thu Nov  7 06:41:16 UTC 2024

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

Core dump: bgpd.1731115297.60.core.gz


prgeor commented Nov 20, 2024

@StormLiangMS can you check with @qiluo-msft whether this is an issue with libyang? As per @stepanblyschak, this cannot be reproduced easily. Please check the core dump.

prgeor added the MSFT label Nov 20, 2024
prgeor added the Triaged (this issue has been triaged) label Nov 20, 2024