Question: how to avoid hang during library instrumentation #177

Open
jhux2 opened this issue Oct 14, 2022 · 10 comments

Comments

@jhux2

jhux2 commented Oct 14, 2022

When I attempt to instrument a particular library in the Trilinos project, the process doesn't finish, even after running overnight.

This is with omnitrace release 1.7 on crusher. The library in question is libteuchosnumerics.so.13 and the command is

omnitrace -v -1 --print-instrumented functions -o /ccs/home/jjhu/crusher/libs-instrumented/libteuchosnumerics.so.13

The documentation presents a few options -- are there any that you'd recommend that I try?

I'm using Trilinos develop a76c1c4a9, and my module environment is

Currently Loaded Modules:
  1) libfabric/1.15.0.0                      9) cray-dsmml/0.2.2         17) rocm/5.2.0                        25) metis/5.1.0
  2) craype-network-ofi                     10) cray-mpich/8.1.16        18) cmake/3.22.1                      26) yaml-cpp/0.7.0
  3) perftools-base/22.05.0                 11) cray-libsci/21.08.1.2    19) ninja/1.10.2                      27) zlib/1.2.11
  4) xpmem/2.4.4-2.3_2.12__gff0e1d9.shasta  12) PrgEnv-cray/8.3.3        20) cray-hdf5-parallel/1.12.1.1       28) superlu/5.3.0
  5) cray-pmi/6.1.2                         13) xalt/1.3.0               21) cray-netcdf-hdf5parallel/4.8.1.1  29) omnitrace/1.7.0
  6) cce/14.0.0                             14) DefApps/default          22) parallel-netcdf/1.12.2
  7) tmux/3.2a                              15) craype-accel-amd-gfx90a  23) boost/1.78.0
  8) craype/2.7.15                          16) craype-x86-trento        24) parmetis/4.0.3
@jrmadsen
Collaborator

omnitrace -v -1 --print-instrumented functions -o /ccs/home/jjhu/crusher/libs-instrumented/libteuchosnumerics.so.13

This doesn’t look right. You need to provide the library you want to instrument after a double-hyphen:

omnitrace -v -1 --print-instrumented functions -o /ccs/home/jjhu/crusher/libs-instrumented/libteuchosnumerics.so.13 -- /path/to/original/lib

I'm surprised it hung instead of throwing an error that you didn't provide an exe/lib.
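
As a minimal sketch (this is not omnitrace's own argument handling), a wrapper script could check that a target follows the double-hyphen before launching the rewrite. The helper names `split_at_double_dash` and `has_target` are hypothetical and exist only for illustration:

```python
def split_at_double_dash(argv):
    """Split an argument list at the first "--".

    Returns (options, target). target is an empty list when "--" is
    missing or when nothing follows it.
    """
    if "--" not in argv:
        return list(argv), []
    i = argv.index("--")
    return list(argv[:i]), list(argv[i + 1:])


def has_target(argv):
    """True if at least one argument follows the first "--"."""
    _, target = split_at_double_dash(argv)
    return len(target) > 0
```

With the first command as pasted in this thread, `has_target` would return `False`; with `-- /path/to/original/lib` appended it would return `True`.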

Binary rewrites don't take much time (a couple of seconds to a minute), so you shouldn't need to run them in a job.

Also, you may want to try the omnitrace-sample exe if you just want sampling. It was part of the 1.7 release, but I'm still writing the documentation. It follows the same syntax:

omnitrace-sample <options> -- <command-to-run>

@jhux2
Author

jhux2 commented Oct 14, 2022

@jrmadsen Cut and paste error on my side. My command indeed looks like what you said it should:

omnitrace -v -1 --print-instrumented functions -o /ccs/home/jjhu/crusher/libs-instrumented/libteuchosnumerics.so.13 -- ./numerics/src/libteuchosnumerics.so.13

@jrmadsen
Collaborator

Ah, I wondered if that might be the case. Can you try running it interactively and, when you think it has hung, hit Ctrl-C? I put a new feature in 1.7 that should print out some log messages about what it was doing when it fails or gets interrupted.
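
If doing this by hand is inconvenient, the same thing can be scripted. The following is a rough sketch, assuming a POSIX system: it launches a command, waits a fixed time, and then sends SIGINT (the signal Ctrl-C delivers) so that whatever the tool prints on interruption is captured. The function name `run_with_interrupt` and the timeout values are assumptions for illustration, not part of omnitrace:

```python
import signal
import subprocess


def run_with_interrupt(cmd, timeout_s, grace_s=10.0):
    """Run cmd; if it is still running after timeout_s seconds, send SIGINT.

    Returns (timed_out, output), where output is the combined
    stdout/stderr text of the process.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return False, out
    except subprocess.TimeoutExpired:
        # Same signal a terminal sends on Ctrl-C.
        proc.send_signal(signal.SIGINT)
        try:
            # Give the process time to print its interrupt report.
            out, _ = proc.communicate(timeout=grace_s)
        except subprocess.TimeoutExpired:
            # The process ignored SIGINT; stop it forcibly.
            proc.kill()
            out, _ = proc.communicate()
        return True, out
```

Passing the rewrite command from this thread as a list of arguments, with `timeout_s` set to a few minutes, would return the captured text for pasting into a comment.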

@jhux2
Author

jhux2 commented Oct 14, 2022

Oh yeah, I actually have those handy :). Here you go. (I'm not sure how long it ran before I broke it, but I can let it run for however long you think if it would produce better logging information.)

log
### ERROR ### [omnitrace][PID=31056][TID=0] signal=2 (SIGINT) interrupt program. code: 128
Backtrace:
[PID=31056][TID=0][0/25]> _ZN3tim7signals26termination_signal_messageEiP9siginfo_tRSo +0x2d2
[PID=31056][TID=0][1/25]> _ZN3tim7signals26termination_signal_handlerEiP9siginfo_tPv +0x131
[PID=31056][TID=0][2/25]> __restore_rt
[PID=31056][TID=0][3/25]> _ZNK7Dyninst8PatchAPI10PatchBlock5startEv +0xe
[PID=31056][TID=0][4/25]> _ZNSt8_Rb_treeIPN7Dyninst8PatchAPI10PatchBlockES3_St9_IdentityIS3_ENS1_13PatchFunction7compareESaIS3_EE24_M_get_insert_unique_posERKS3_ +0x37
[PID=31056][TID=0][5/25]> _ZNSt8_Rb_treeIPN7Dyninst8PatchAPI10PatchBlockES3_St9_IdentityIS3_ENS1_13PatchFunction7compareESaIS3_EE16_M_insert_uniqueIRKS3_EESt4pairISt17_Rb_tree_iteratorIS3_EbEOT_ +0x11
[PID=31056][TID=0][6/25]> _ZN7Dyninst8PatchAPI13PatchFunction10exitBlocksEv +0x51
[PID=31056][TID=0][7/25]> _ZN7Dyninst8PatchAPI8PatchMgr6verifyERNS0_8LocationE +0x1d3
[PID=31056][TID=0][8/25]> _ZN7Dyninst8PatchAPI8PatchMgr9findPointENS0_8LocationENS0_5Point4TypeEb +0xd5
[PID=31056][TID=0][9/25]> _ZN13func_instance13funcExitPointEP14block_instanceb +0xcc
[PID=31056][TID=0][10/25]> _ZN7Dyninst10Relocation12Instrumenter23funcExitInstrumentationEPNS0_10RelocBlockEPNS0_10RelocGraphE +0x29
[PID=31056][TID=0][11/25]> _ZN7Dyninst10Relocation12Instrumenter7processEPNS0_10RelocBlockEPNS0_10RelocGraphE +0xab
[PID=31056][TID=0][12/25]> _ZN7Dyninst10Relocation11Transformer12processGraphEPNS0_10RelocGraphE +0x40
[PID=31056][TID=0][13/25]> _ZN7Dyninst10Relocation9CodeMover9transformERNS0_11TransformerE +0x26
[PID=31056][TID=0][14/25]> _ZN12AddressSpace9transformEN5boost10shared_ptrIN7Dyninst10Relocation9CodeMoverEEE +0x238
[PID=31056][TID=0][15/25]> _ZN12AddressSpace11relocateIntESt23_Rb_tree_const_iteratorIP13func_instanceES3_m +0x12d
[PID=31056][TID=0][16/25]> _ZN12AddressSpace8relocateEv +0x29f
[PID=31056][TID=0][17/25]> _ZN7Dyninst8PatchAPI15DynInstrumenter3runEv +0xdf
[PID=31056][TID=0][18/25]> _ZN7Dyninst8PatchAPI7Patcher3runEv +0xbe
[PID=31056][TID=0][19/25]> _ZN7Dyninst8PatchAPI7Command6commitEv +0xf
[PID=31056][TID=0][20/25]> _ZN12AddressSpace5patchEPS_ +0x2c
[PID=31056][TID=0][21/25]> _ZN17BPatch_binaryEdit9writeFileEPKc +0x8c
[PID=31056][TID=0][22/25]> main +0xf68b
[PID=31056][TID=0][23/25]> __libc_start_main +0xef
[PID=31056][TID=0][24/25]> _start +0x2a

Backtrace (demangled):
[PID=31056][TID=0][0/25]> tim::signals::termination_signal_message(int, siginfo_t*, std::ostream&) +0x3ff
[PID=31056][TID=0][1/25]> tim::signals::termination_signal_handler(int, siginfo_t*, void*) +0x131
[PID=31056][TID=0][2/25]> __restore_rt
[PID=31056][TID=0][3/25]> Dyninst::PatchAPI::PatchBlock::start() const +0xe
[PID=31056][TID=0][4/25]> std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_get_insert_unique_pos(Dyninst::PatchAPI::PatchBlock* const&) +0x37
[PID=31056][TID=0][5/25]> std::pair<std::_Rb_tree_iterator<Dyninst::PatchAPI::PatchBlock*>, bool> std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_insert_unique<Dyninst::PatchAPI::PatchBlock* const&>(Dyninst::PatchAPI::PatchBlock* const&) +0x11
[PID=31056][TID=0][6/25]> Dyninst::PatchAPI::PatchFunction::exitBlocks() +0x51
[PID=31056][TID=0][7/25]> Dyninst::PatchAPI::PatchMgr::verify(Dyninst::PatchAPI::Location&) +0x1d3
[PID=31056][TID=0][8/25]> Dyninst::PatchAPI::PatchMgr::findPoint(Dyninst::PatchAPI::Location, Dyninst::PatchAPI::Point::Type, bool) +0xd5
[PID=31056][TID=0][9/25]> func_instance::funcExitPoint(block_instance*, bool) +0xcc
[PID=31056][TID=0][10/25]> Dyninst::Relocation::Instrumenter::funcExitInstrumentation(Dyninst::Relocation::RelocBlock*, Dyninst::Relocation::RelocGraph*) +0x29
[PID=31056][TID=0][11/25]> Dyninst::Relocation::Instrumenter::process(Dyninst::Relocation::RelocBlock*, Dyninst::Relocation::RelocGraph*) +0xab
[PID=31056][TID=0][12/25]> Dyninst::Relocation::Transformer::processGraph(Dyninst::Relocation::RelocGraph*) +0x40
[PID=31056][TID=0][13/25]> Dyninst::Relocation::CodeMover::transform(Dyninst::Relocation::Transformer&) +0x26
[PID=31056][TID=0][14/25]> AddressSpace::transform(boost::shared_ptr<Dyninst::Relocation::CodeMover>) +0x238
[PID=31056][TID=0][15/25]> AddressSpace::relocateInt(std::_Rb_tree_const_iterator<func_instance*>, std::_Rb_tree_const_iterator<func_instance*>, unsigned long) +0x12d
[PID=31056][TID=0][16/25]> AddressSpace::relocate() +0x29f
[PID=31056][TID=0][17/25]> Dyninst::PatchAPI::DynInstrumenter::run() +0xdf
[PID=31056][TID=0][18/25]> Dyninst::PatchAPI::Patcher::run() +0xbe
[PID=31056][TID=0][19/25]> Dyninst::PatchAPI::Command::commit() +0xf
[PID=31056][TID=0][20/25]> AddressSpace::patch(AddressSpace*) +0x2c
[PID=31056][TID=0][21/25]> BPatch_binaryEdit::writeFile(char const*) +0x8c
[PID=31056][TID=0][22/25]> main +0xf68b
[PID=31056][TID=0][23/25]> __libc_start_main +0xef
[PID=31056][TID=0][24/25]> _start +0x2a

Backtrace (demangled):
[PID=31056][TID=0][0/25]> omnitrace() [0x47263f]
[PID=31056][TID=0][1/25]> omnitrace() [0x483ef1]
[PID=31056][TID=0][2/25]> /lib64/libpthread.so.0(+0x168c0) [0x7fffed5cd8c0]
[PID=31056][TID=0][3/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchBlock::start() const+0xe) [0x7fffecd2e06e]
[PID=31056][TID=0][4/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_get_insert_unique_pos(Dyninst::PatchAPI::PatchBlock* const&)+0x37) [0x7fffecd3b417]
[PID=31056][TID=0][5/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(std::pair<std::_Rb_tree_iterator<Dyninst::PatchAPI::PatchBlock*>, bool> std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_insert_unique<Dyninst::PatchAPI::PatchBlock* const&>(Dyninst::PatchAPI::PatchBlock* const&)+0x11) [0x7fffecd3b4b1]
[PID=31056][TID=0][6/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchFunction::exitBlocks()+0x51) [0x7fffecd36121]
[PID=31056][TID=0][7/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchMgr::verify(Dyninst::PatchAPI::Location&)+0x1d3) [0x7fffecd3d973]
[PID=31056][TID=0][8/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchMgr::findPoint(Dyninst::PatchAPI::Location, Dyninst::PatchAPI::Point::Type, bool)+0xd5) [0x7fffecd3daa5]
[PID=31056][TID=0][9/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x1188fc) [0x7fffed0728fc]
[PID=31056][TID=0][10/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x1679a9) [0x7fffed0c19a9]
[PID=31056][TID=0][11/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x167e8b) [0x7fffed0c1e8b]
[PID=31056][TID=0][12/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x164f10) [0x7fffed0bef10]
[PID=31056][TID=0][13/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x14fb16) [0x7fffed0a9b16]
[PID=31056][TID=0][14/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xe3388) [0x7fffed03d388]
[PID=31056][TID=0][15/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xe371d) [0x7fffed03d71d]
[PID=31056][TID=0][16/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xe42ef) [0x7fffed03e2ef]
[PID=31056][TID=0][17/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x17ecdf) [0x7fffed0d8cdf]
[PID=31056][TID=0][18/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::Patcher::run()+0xbe) [0x7fffecd44b1e]
[PID=31056][TID=0][19/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::Command::commit()+0xf) [0x7fffecd4433f]
[PID=31056][TID=0][20/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xdd77c) [0x7fffed03777c]
[PID=31056][TID=0][21/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(BPatch_binaryEdit::writeFile(char const*)+0x8c) [0x7fffed00db6c]
[PID=31056][TID=0][22/25]> omnitrace() [0x420b7b]
[PID=31056][TID=0][23/25]> /lib64/libc.so.6(__libc_start_main+0xef) [0x7fffe8ed02bd]
[PID=31056][TID=0][24/25]> omnitrace() [0x427e7a]
[omnitrace] omnitrace exited with signal 2 ::  Signal:     SIGINT (signal number:   2)                        interrupt program
[33069/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ssymv.f ...
[33070/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ssyr2k.f ...
[33071/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ssyrk.f ...
[33072/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/stbsv.f ...
[33073/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strmm.f ...
[33074/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strmv.f ...
[33075/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strsm.f ...
[33076/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strsv.f ...
[33077/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zgemm.f ...
[33078/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zgemv.f ...
[33079/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zhemv.f ...
[33080/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    2 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zherk.f ...
[33081/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrmm.f ...
[33082/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrmv.f ...
[33083/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrsm.f ...
[33084/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrsv.f ...
[33085/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    2 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/src/crayblas_gemm.c ...
[33086/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]   58 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/src/loop.c ...
[33087/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]  432 instrumented funcs in libteuchosnumerics.so.13.5 ...
[33088/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1992][main]  ...

[omnitrace][exe] Potentially important log entries

[omnitrace]
[omnitrace] These were the last 20 log entries from omnitrace. You can control the number of log entries via the '--log <N>' option or OMNITRACE_LOG_COUNT env variable.
Command terminated by signal 2
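
The numbered log entries at the end of this output can be summarized with a short script. This is a sketch that assumes the entry format seen above (`[index/total][file:line][function]  N instrumented funcs in <module> ...`); the name `parse_instrumented` is hypothetical:

```python
import re

# Matches e.g.
# [33087/33089][/home/.../omnitrace.cpp:1937][main]  432 instrumented funcs in libfoo.so ...
ENTRY_RE = re.compile(
    r"^\[(\d+)/\d+\]\[[^\]]*\]\[[^\]]*\]\s+(\d+) instrumented funcs in (.+?)\s+\.\.\.\s*$"
)


def parse_instrumented(log_text):
    """Return a list of (entry_index, func_count, module) tuples."""
    entries = []
    for line in log_text.splitlines():
        m = ENTRY_RE.match(line.strip())
        if m:
            entries.append((int(m.group(1)), int(m.group(2)), m.group(3)))
    return entries
```

Lines that do not carry a function count, such as the final `...` entry, are skipped.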

@jhux2
Author

jhux2 commented Oct 15, 2022

Here's a log after I let omnitrace run for about 6 minutes:

omnitrace -v -1 --print-instrumented functions -o /ccs/home/jjhu/crusher/libs-instrumented/libteuchosnumerics.so.13 -- ./numerics/src/libteuchosnumerics.so.13^C
### ERROR ### [omnitrace][PID=112020][TID=0] signal=2 (SIGINT) interrupt program. code: 128
Backtrace:
[PID=112020][TID=0][0/25]> _ZN3tim7signals26termination_signal_messageEiP9siginfo_tRSo +0x2d2
[PID=112020][TID=0][1/25]> _ZN3tim7signals26termination_signal_handlerEiP9siginfo_tPv +0x131
[PID=112020][TID=0][2/25]> __restore_rt
[PID=112020][TID=0][3/25]> _ZNK7Dyninst8PatchAPI10PatchBlock5startEv +0x1d
[PID=112020][TID=0][4/25]> _ZNSt8_Rb_treeIPN7Dyninst8PatchAPI10PatchBlockES3_St9_IdentityIS3_ENS1_13PatchFunction7compareESaIS3_EE24_M_get_insert_unique_posERKS3_ +0x43
[PID=112020][TID=0][5/25]> _ZNSt8_Rb_treeIPN7Dyninst8PatchAPI10PatchBlockES3_St9_IdentityIS3_ENS1_13PatchFunction7compareESaIS3_EE16_M_insert_uniqueIRKS3_EESt4pairISt17_Rb_tree_iteratorIS3_EbEOT_ +0x11
[PID=112020][TID=0][6/25]> _ZN7Dyninst8PatchAPI13PatchFunction10exitBlocksEv +0x51
[PID=112020][TID=0][7/25]> _ZN7Dyninst8PatchAPI8PatchMgr6verifyERNS0_8LocationE +0x1d3
[PID=112020][TID=0][8/25]> _ZN7Dyninst8PatchAPI8PatchMgr9findPointENS0_8LocationENS0_5Point4TypeEb +0xd5
[PID=112020][TID=0][9/25]> _ZN13func_instance13funcExitPointEP14block_instanceb +0xcc
[PID=112020][TID=0][10/25]> _ZN7Dyninst10Relocation12Instrumenter23funcExitInstrumentationEPNS0_10RelocBlockEPNS0_10RelocGraphE +0x29
[PID=112020][TID=0][11/25]> _ZN7Dyninst10Relocation12Instrumenter7processEPNS0_10RelocBlockEPNS0_10RelocGraphE +0xab
[PID=112020][TID=0][12/25]> _ZN7Dyninst10Relocation11Transformer12processGraphEPNS0_10RelocGraphE +0x40
[PID=112020][TID=0][13/25]> _ZN7Dyninst10Relocation9CodeMover9transformERNS0_11TransformerE +0x26
[PID=112020][TID=0][14/25]> _ZN12AddressSpace9transformEN5boost10shared_ptrIN7Dyninst10Relocation9CodeMoverEEE +0x238
[PID=112020][TID=0][15/25]> _ZN12AddressSpace11relocateIntESt23_Rb_tree_const_iteratorIP13func_instanceES3_m +0x12d
[PID=112020][TID=0][16/25]> _ZN12AddressSpace8relocateEv +0x29f
[PID=112020][TID=0][17/25]> _ZN7Dyninst8PatchAPI15DynInstrumenter3runEv +0xdf
[PID=112020][TID=0][18/25]> _ZN7Dyninst8PatchAPI7Patcher3runEv +0xbe
[PID=112020][TID=0][19/25]> _ZN7Dyninst8PatchAPI7Command6commitEv +0xf
[PID=112020][TID=0][20/25]> _ZN12AddressSpace5patchEPS_ +0x2c
[PID=112020][TID=0][21/25]> _ZN17BPatch_binaryEdit9writeFileEPKc +0x8c
[PID=112020][TID=0][22/25]> main +0xf68b
[PID=112020][TID=0][23/25]> __libc_start_main +0xef
[PID=112020][TID=0][24/25]> _start +0x2a

Backtrace (demangled):
[PID=112020][TID=0][0/25]> tim::signals::termination_signal_message(int, siginfo_t*, std::ostream&) +0x3ff
[PID=112020][TID=0][1/25]> tim::signals::termination_signal_handler(int, siginfo_t*, void*) +0x131
[PID=112020][TID=0][2/25]> __restore_rt
[PID=112020][TID=0][3/25]> Dyninst::PatchAPI::PatchBlock::start() const +0x1d
[PID=112020][TID=0][4/25]> std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_get_insert_unique_pos(Dyninst::PatchAPI::PatchBlock* const&) +0x43
[PID=112020][TID=0][5/25]> std::pair<std::_Rb_tree_iterator<Dyninst::PatchAPI::PatchBlock*>, bool> std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_insert_unique<Dyninst::PatchAPI::PatchBlock* const&>(Dyninst::PatchAPI::PatchBlock* const&) +0x11
[PID=112020][TID=0][6/25]> Dyninst::PatchAPI::PatchFunction::exitBlocks() +0x51
[PID=112020][TID=0][7/25]> Dyninst::PatchAPI::PatchMgr::verify(Dyninst::PatchAPI::Location&) +0x1d3
[PID=112020][TID=0][8/25]> Dyninst::PatchAPI::PatchMgr::findPoint(Dyninst::PatchAPI::Location, Dyninst::PatchAPI::Point::Type, bool) +0xd5
[PID=112020][TID=0][9/25]> func_instance::funcExitPoint(block_instance*, bool) +0xcc
[PID=112020][TID=0][10/25]> Dyninst::Relocation::Instrumenter::funcExitInstrumentation(Dyninst::Relocation::RelocBlock*, Dyninst::Relocation::RelocGraph*) +0x29
[PID=112020][TID=0][11/25]> Dyninst::Relocation::Instrumenter::process(Dyninst::Relocation::RelocBlock*, Dyninst::Relocation::RelocGraph*) +0xab
[PID=112020][TID=0][12/25]> Dyninst::Relocation::Transformer::processGraph(Dyninst::Relocation::RelocGraph*) +0x40
[PID=112020][TID=0][13/25]> Dyninst::Relocation::CodeMover::transform(Dyninst::Relocation::Transformer&) +0x26
[PID=112020][TID=0][14/25]> AddressSpace::transform(boost::shared_ptr<Dyninst::Relocation::CodeMover>) +0x238
[PID=112020][TID=0][15/25]> AddressSpace::relocateInt(std::_Rb_tree_const_iterator<func_instance*>, std::_Rb_tree_const_iterator<func_instance*>, unsigned long) +0x12d
[PID=112020][TID=0][16/25]> AddressSpace::relocate() +0x29f
[PID=112020][TID=0][17/25]> Dyninst::PatchAPI::DynInstrumenter::run() +0xdf
[PID=112020][TID=0][18/25]> Dyninst::PatchAPI::Patcher::run() +0xbe
[PID=112020][TID=0][19/25]> Dyninst::PatchAPI::Command::commit() +0xf
[PID=112020][TID=0][20/25]> AddressSpace::patch(AddressSpace*) +0x2c
[PID=112020][TID=0][21/25]> BPatch_binaryEdit::writeFile(char const*) +0x8c
[PID=112020][TID=0][22/25]> main +0xf68b
[PID=112020][TID=0][23/25]> __libc_start_main +0xef
[PID=112020][TID=0][24/25]> _start +0x2a

Backtrace (demangled):
[PID=112020][TID=0][0/25]> omnitrace() [0x47263f]
[PID=112020][TID=0][1/25]> omnitrace() [0x483ef1]
[PID=112020][TID=0][2/25]> /lib64/libpthread.so.0(+0x168c0) [0x7fffed5cd8c0]
[PID=112020][TID=0][3/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchBlock::start() const+0x1d) [0x7fffecd2e07d]
[PID=112020][TID=0][4/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_get_insert_unique_pos(Dyninst::PatchAPI::PatchBlock* const&)+0x43) [0x7fffecd3b423]
[PID=112020][TID=0][5/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(std::pair<std::_Rb_tree_iterator<Dyninst::PatchAPI::PatchBlock*>, bool> std::_Rb_tree<Dyninst::PatchAPI::PatchBlock*, Dyninst::PatchAPI::PatchBlock*, std::_Identity<Dyninst::PatchAPI::PatchBlock*>, Dyninst::PatchAPI::PatchFunction::compare, std::allocator<Dyninst::PatchAPI::PatchBlock*> >::_M_insert_unique<Dyninst::PatchAPI::PatchBlock* const&>(Dyninst::PatchAPI::PatchBlock* const&)+0x11) [0x7fffecd3b4b1]
[PID=112020][TID=0][6/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchFunction::exitBlocks()+0x51) [0x7fffecd36121]
[PID=112020][TID=0][7/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchMgr::verify(Dyninst::PatchAPI::Location&)+0x1d3) [0x7fffecd3d973]
[PID=112020][TID=0][8/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::PatchMgr::findPoint(Dyninst::PatchAPI::Location, Dyninst::PatchAPI::Point::Type, bool)+0xd5) [0x7fffecd3daa5]
[PID=112020][TID=0][9/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x1188fc) [0x7fffed0728fc]
[PID=112020][TID=0][10/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x1679a9) [0x7fffed0c19a9]
[PID=112020][TID=0][11/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x167e8b) [0x7fffed0c1e8b]
[PID=112020][TID=0][12/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x164f10) [0x7fffed0bef10]
[PID=112020][TID=0][13/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x14fb16) [0x7fffed0a9b16]
[PID=112020][TID=0][14/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xe3388) [0x7fffed03d388]
[PID=112020][TID=0][15/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xe371d) [0x7fffed03d71d]
[PID=112020][TID=0][16/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xe42ef) [0x7fffed03e2ef]
[PID=112020][TID=0][17/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x17ecdf) [0x7fffed0d8cdf]
[PID=112020][TID=0][18/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::Patcher::run()+0xbe) [0x7fffecd44b1e]
[PID=112020][TID=0][19/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libpatchAPI.so.11.0(Dyninst::PatchAPI::Command::commit()+0xf) [0x7fffecd4433f]
[PID=112020][TID=0][20/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xdd77c) [0x7fffed03777c]
[PID=112020][TID=0][21/25]> /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(BPatch_binaryEdit::writeFile(char const*)+0x8c) [0x7fffed00db6c]
[PID=112020][TID=0][22/25]> omnitrace() [0x420b7b]
[PID=112020][TID=0][23/25]> /lib64/libc.so.6(__libc_start_main+0xef) [0x7fffe8ed02bd]
[PID=112020][TID=0][24/25]> omnitrace() [0x427e7a]
[omnitrace] omnitrace exited with signal 2 ::  Signal:     SIGINT (signal number:   2)                        interrupt program
[33069/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ssymv.f ...
[33070/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ssyr2k.f ...
[33071/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ssyrk.f ...
[33072/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/stbsv.f ...
[33073/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strmm.f ...
[33074/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strmv.f ...
[33075/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strsm.f ...
[33076/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/strsv.f ...
[33077/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zgemm.f ...
[33078/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zgemv.f ...
[33079/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zhemv.f ...
[33080/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    2 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/zherk.f ...
[33081/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrmm.f ...
[33082/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrmv.f ...
[33083/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    3 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrsm.f ...
[33084/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    4 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/basis/netlib/ztrsv.f ...
[33085/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]    2 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/src/crayblas_gemm.c ...
[33086/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]   58 instrumented funcs in /lus/cls01075/home/users/mhadi/build/libsci/bframe/crayblas/src/loop.c ...
[33087/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1937][main]  432 instrumented funcs in libteuchosnumerics.so.13.5 ...
[33088/33089][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1992][main]  ...

[omnitrace][exe] Potentially important log entries

[omnitrace]
[omnitrace] These were the last 20 log entries from omnitrace. You can control the number of log entries via the '--log <N>' option or OMNITRACE_LOG_COUNT env variable.
omnitrace :: : Interrupt (Signal sent by the kernel 0 0)
Command terminated by signal 2
 ... 333.33 seconds

@jrmadsen
Collaborator

Hi @jhux2, thanks for providing the second log. Yeah, it looks like Dyninst is hanging in the exact same spot both times. Let me look into this and get back to you in a couple of hours.

@jrmadsen
Collaborator

jrmadsen commented Oct 19, 2022

Hi @jhux2, if you install the (newly released) v1.7.1 and interrupt the job as before, there should be line info in the backtrace. Knowing the line number it is hanging on will (hopefully) significantly help me track down the reason for the hang.
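
For anyone reproducing this, interrupting the job by hand can be scripted. The following is only a sketch, not part of omnitrace: `run_with_interrupt` is a hypothetical helper that starts a command, waits a fixed time, and sends SIGINT if the command is still running, so that whatever the tool prints on interrupt (here, the backtrace) ends up in the captured output. The timeout values are arbitrary assumptions.

```python
import signal
import subprocess
import sys


def run_with_interrupt(cmd, timeout_s, grace_s=30):
    """Run cmd; if it is still running after timeout_s seconds, send SIGINT.

    Returns (returncode, combined stdout/stderr text, interrupted flag).
    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return proc.returncode, out, False
    except subprocess.TimeoutExpired:
        # Same effect as pressing Ctrl-C in the terminal running the command.
        proc.send_signal(signal.SIGINT)
        try:
            out, _ = proc.communicate(timeout=grace_s)
        except subprocess.TimeoutExpired:
            # The process ignored SIGINT; fall back to a hard kill.
            proc.kill()
            out, _ = proc.communicate()
        return proc.returncode, out, True


if __name__ == "__main__":
    # Stand-in for a hanging job: a child process that sleeps for a minute.
    rc, out, interrupted = run_with_interrupt(
        [sys.executable, "-c", "import time; time.sleep(60)"], timeout_s=1
    )
    print(f"interrupted={interrupted} returncode={rc}")
```

In this thread, `cmd` would be the same omnitrace command line used above, split into a list of arguments, with a `timeout_s` long enough for the run to reach the point where it stalls.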

@jhux2
Author

jhux2 commented Oct 19, 2022

@jrmadsen Here's a backtrace from v1.7.1.

omnitrace -v -1 --print-instrumented functions -o /ccs/home/jjhu/crusher/libs-instrumented/libteuchosnumerics.so.13 -- ./numerics/src/libteuchosnumerics.so.13^C
### ERROR ### [omnitrace][PID=75872][TID=0] signal=2 (SIGINT) interrupt program. code: 128
Backtrace:
[PID=75872][TID=0][0/12] __restore_rt
[PID=75872][TID=0][1/12] _ZN7Dyninst8ParseAPI6Parser11getGapRangeEPNS0_10CodeRegionEmRmS4_ +0xe2
[PID=75872][TID=0][2/12] _ZN7Dyninst8ParseAPI6Parser25probabilistic_gap_parsingEPNS0_10CodeRegionE +0x104
[PID=75872][TID=0][3/12] _ZN5image12analyzeImageEv +0xd4
[PID=75872][TID=0][4/12] _ZN5image15analyzeIfNeededEv +0x70
[PID=75872][TID=0][5/12] _ZN8pdmodule12getFunctionsERSt6vectorIP10parse_funcSaIS2_EE +0x27
[PID=75872][TID=0][6/12] _ZN13mapped_module15getAllFunctionsEv +0x2e
[PID=75872][TID=0][7/12] _ZN13BPatch_module13getProceduresERSt6vectorIP15BPatch_functionSaIS2_EEb +0x56
[PID=75872][TID=0][8/12] _ZN12BPatch_image13getProceduresEb +0x71
[PID=75872][TID=0][9/12] main +0x9a7d
[PID=75872][TID=0][10/12] __libc_start_main +0xef
[PID=75872][TID=0][11/12] _start +0x2a

Backtrace (demangled):
[PID=75872][TID=0][0/12] /lib64/libpthread.so.0(+0x168c0) [0x7fb4bfce28c0]
[PID=75872][TID=0][1/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libparseAPI.so.11.0(+0xa35a2) [0x7fb4becde5a2]
[PID=75872][TID=0][2/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libparseAPI.so.11.0(+0xa38c4) [0x7fb4becde8c4]
[PID=75872][TID=0][3/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xf1434) [0x7fb4bf768434]
[PID=75872][TID=0][4/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xf1500) [0x7fb4bf768500]
[PID=75872][TID=0][5/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0xf28b7) [0x7fb4bf7698b7]
[PID=75872][TID=0][6/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(+0x10a5ce) [0x7fb4bf7815ce]
[PID=75872][TID=0][7/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(BPatch_module::getProcedures(std::vector<BPatch_function*, std::allocator<BPatch_function*>>&, bool)+0x56) [0x7fb4bf7061c6]
[PID=75872][TID=0][8/12] /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/bin/../lib/omnitrace/libdyninstAPI.so.11.0(BPatch_image::getProcedures(bool)+0x71) [0x7fb4bf6e7a91]
[PID=75872][TID=0][9/12] omnitrace() [0x41dddd]
[PID=75872][TID=0][10/12] /lib64/libc.so.6(__libc_start_main+0xef) [0x7fb4bb3df2bd]
[PID=75872][TID=0][11/12] omnitrace() [0x433e1a]

Backtrace (demangled):
[PID=75872][TID=0][0/12] __restore_rt
[PID=75872][TID=0][1/12] Dyninst::ParseAPI::Parser::getGapRange(Dyninst::ParseAPI::CodeRegion*, unsigned long, unsigned long&, unsigned long&) +0xe2
[PID=75872][TID=0][2/12] Dyninst::ParseAPI::Parser::probabilistic_gap_parsing(Dyninst::ParseAPI::CodeRegion*) +0x104
[PID=75872][TID=0][3/12] image::analyzeImage() +0xd4
[PID=75872][TID=0][4/12] image::analyzeIfNeeded() +0x70
[PID=75872][TID=0][5/12] pdmodule::getFunctions(std::vector<parse_func*, std::allocator<parse_func*>>&) +0x27
[PID=75872][TID=0][6/12] mapped_module::getAllFunctions() +0x2e
[PID=75872][TID=0][7/12] BPatch_module::getProcedures(std::vector<BPatch_function*, std::allocator<BPatch_function*>>&, bool) +0x56
[PID=75872][TID=0][8/12] BPatch_image::getProcedures(bool) +0x71
[PID=75872][TID=0][9/12] main +0x9a7d
[PID=75872][TID=0][10/12] __libc_start_main +0xef
[PID=75872][TID=0][11/12] _start +0x2a

Backtrace (lineinfo):
[PID=75872][TID=0][0/9]
    [??:?] Dyninst::ParseAPI::Parser::probabilistic_gap_parsing(Dyninst::ParseAPI::CodeRegion*)
[PID=75872][TID=0][1/9]
    [??:?] image::analyzeImage()
[PID=75872][TID=0][2/9]
    [??:?] image::analyzeIfNeeded()
[PID=75872][TID=0][3/9]
    [??:?] pdmodule::getFunctions(std::vector<parse_func*, std::allocator<parse_func*>>&)
[PID=75872][TID=0][4/9]
    [??:?] mapped_module::getAllFunctions()
[PID=75872][TID=0][5/9]
    [/gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/lib/omnitrace/libdyninstAPI.so.11.0.1:?] BPatch_module::getProcedures(std::vector<BPatch_function*, std::allocator<BPatch_function*>>&, bool)
[PID=75872][TID=0][6/9]
    [/gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/lib/omnitrace/libdyninstAPI.so.11.0.1:?] BPatch_image::getProcedures(bool)
[PID=75872][TID=0][7/9]
    [??:1242] main
    [/usr/include/c++/7/bits/stl_set.h:157] std::set<BPatch_module*, std::less<BPatch_module*>, std::allocator<BPatch_module*>>::set()
    [/usr/include/c++/7/bits/stl_tree.h:913] std::_Rb_tree<BPatch_module*, BPatch_module*, std::_Identity<BPatch_module*>, std::less<BPatch_module*>, std::allocator<BPatch_module*>>::_Rb_tree()
    [/usr/include/c++/7/bits/stl_tree.h:688] std::_Rb_tree<BPatch_module*, BPatch_module*, std::_Identity<BPatch_module*>, std::less<BPatch_module*>, std::allocator<BPatch_module*>>::_Rb_tree_impl<std::less<BPatch_module*>, true>::_Rb_tree_impl()
    [/usr/include/c++/7/bits/stl_tree.h:176] std::_Rb_tree_header::_Rb_tree_header()
    [/usr/include/c++/7/bits/stl_tree.h:209] std::_Rb_tree_header::_M_reset()
[PID=75872][TID=0][8/9]
    [/lib64/libc-2.31.so:?] __libc_start_main

[omnitrace] omnitrace exited with signal 2 ::  Signal:     SIGINT (signal number:   2)                        interrupt program
[35/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:2488][file_exists] querying whether file '/gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/lib/libomnitrace-rt.so' exists... ...
[36/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:2451][get_absolute_lib_filepath] Resolved 'libomnitrace-rt.so' to '/gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/lib/libomnitrace-rt.so.11.0.1'... ...
[37/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:2488][file_exists] querying whether file '/gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/lib/libomnitrace-rt.so.11.0.1' exists... ...
[38/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:2583][find_dyn_api_rt] DYNINST_API_RT: /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/lib/libomnitrace-rt.so.11.0.1 ...
[39/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> TypeChecking         =   on ...
[40/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> SaveFPR              =   on ...
[41/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> DelayedParsing       =   on ...
[42/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> DebugParsing         =   on ...
[43/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> InstrStackFrames     =  off ...
[44/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> TrampRecursive       =  off ...
[45/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> MergeTramp           =   on ...
[46/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1143][operator()] [dyninst-option]> BaseTrampDeletion    =  off ...
[47/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1184][main] instrumentation target: /gpfs/alpine/cfd116/scratch/jjhu/crusher/builds/performance-new/packages/teuchos/numerics/src/libteuchosnumerics.so.13.5 ...
[48/55][/home/omnitrace/source/bin/omnitrace/omnitrace.hpp:198][omnitrace_get_address_space] Opening '/gpfs/alpine/cfd116/scratch/jjhu/crusher/builds/performance-new/packages/teuchos/numerics/src/libteuchosnumerics.so.13.5' for binary rewrite...
[49/55][/home/omnitrace/source/bin/omnitrace/details.cpp:440][error_func_real] Dyninst error function called with level 3 :: ID# = 0 :: Parsing object file: /gpfs/alpine/cfd116/scratch/jjhu/crusher/builds/performance-new/packages/teuchos/numerics/src/libteuchosnumerics.so.13.5
[50/55][/home/omnitrace/source/bin/omnitrace/details.cpp:440][error_func_real] Dyninst error function called with level 3 :: ID# = 0 :: ready
[51/55][/home/omnitrace/source/bin/omnitrace/details.cpp:440][error_func_real] Dyninst error function called with level 3 :: ID# = 0 :: Parsing object file: /gpfs/alpine/csc465/proj-shared/jjhu/omnitrace/omnitrace-1.7.1-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3/lib/libomnitrace-rt.so.11.0.1
[52/55][/home/omnitrace/source/bin/omnitrace/details.cpp:440][error_func_real] Dyninst error function called with level 3 :: ID# = 0 :: ready
[53/55][/home/omnitrace/source/bin/omnitrace/omnitrace.hpp:206][omnitrace_get_address_space] Done ...
[54/55][/home/omnitrace/source/bin/omnitrace/omnitrace.cpp:1238][main] Getting the address space image, modules, and procedures... ...
[omnitrace]
[omnitrace] These were the last 20 log entries from omnitrace. You can control the number of log entries via the '--log <N>' option or OMNITRACE_LOG_COUNT env variable.
Command terminated by signal 2
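
In the demangled backtrace above, the innermost frame after the signal handler is `Dyninst::ParseAPI::Parser::getGapRange`, called from `probabilistic_gap_parsing`. When comparing several interrupted runs, that frame can be pulled out of a saved log automatically. The sketch below assumes the `[PID=…][TID=…][i/n] symbol +0xoffset` line layout seen in this log; `parse_frames` and `first_frame_in` are hypothetical helper names, not part of omnitrace.

```python
import re

# Matches lines such as:
#   [PID=75872][TID=0][1/12] Dyninst::ParseAPI::Parser::getGapRange(...) +0xe2
# The trailing "+0x..." offset is optional (frame 0 in the log has none).
FRAME_RE = re.compile(
    r"^\[PID=(\d+)\]\[TID=(\d+)\]\[(\d+)/(\d+)\]\s+(.*?)\s*(?:\+(0x[0-9a-fA-F]+))?$"
)


def parse_frames(text):
    """Return a list of (frame index, symbol, offset or None) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(5):
            frames.append((int(match.group(3)), match.group(5), match.group(6)))
    return frames


def first_frame_in(frames, prefix):
    """Return the first frame whose symbol starts with prefix, or None."""
    for frame in frames:
        if frame[1].startswith(prefix):
            return frame
    return None


if __name__ == "__main__":
    sample = "\n".join([
        "[PID=75872][TID=0][0/12] __restore_rt",
        "[PID=75872][TID=0][1/12] Dyninst::ParseAPI::Parser::getGapRange("
        "Dyninst::ParseAPI::CodeRegion*, unsigned long, unsigned long&, "
        "unsigned long&) +0xe2",
        "[PID=75872][TID=0][2/12] Dyninst::ParseAPI::Parser::"
        "probabilistic_gap_parsing(Dyninst::ParseAPI::CodeRegion*) +0x104",
    ])
    print(first_frame_in(parse_frames(sample), "Dyninst::ParseAPI::"))
```

If two interrupted runs report the same first `Dyninst::ParseAPI::` frame and offset, that supports the observation that the hang is in the same spot both times.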

@jrmadsen
Collaborator

There are performance and other issues noted in these, which help explain what is going on here.

@jhux2
Author

jhux2 commented Dec 19, 2022

@jrmadsen Thanks for the information.
