Cilk backend segfaults: Some remaining problems with shared library handling #5

Open
rrnewton opened this issue Feb 13, 2014 · 0 comments
@rrnewton

In JITRuntime.hs we use withDL, but I'm seeing some suspicious segfaults even when the whole test suite claims to PASS. Valgrind reports:

$ DEBUG=2 valgrind ./dist/build/test-accelerate-cpu-cilk/test-accelerate-cpu-cilk -t p18f
==8168== Memcheck, a memory error detector
==8168== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==8168== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==8168== Command: ./dist/build/test-accelerate-cpu-cilk/test-accelerate-cpu-cilk -t p18f
==8168== 
  [Note: passing through options to test-framework]: -t p18f
 [!] Testing backend: Cilk
[main] First checking that all requested tests can be found within 'allProgs'...
[main] Yep, all 70 tests are there.
run test 39 oneDim_p18f::
 ! Responding to env Var: DEBUG=2
 [dbg] engaging optional typecheck pass, AST size 7
COMPILETIME_phase0: 0.116065s
COMPILETIME_phase1: 0.253395s
 [dbg] engaging optional typecheck pass, AST size 9
 [dbg] engaging optional typecheck pass, AST size 9
 [dbg] engaging optional typecheck pass, AST size 11
 [dbg] engaging optional typecheck pass, AST size 11
 [dbg] engaging optional typecheck pass, AST size 11
 [dbg] engaging optional typecheck pass, AST size 12
 [dbg] engaging optional typecheck pass, AST size 13
 [dbg] engaging optional typecheck pass, AST size 13
!! Victory, inlineCheap: inlining reference to aLt0
!! Victory, inlineCheap: inlining reference to tmp_1
 [dbg] engaging optional typecheck pass, AST size 15
 [dbg] engaging optional typecheck pass, AST size 15
!! Victory: deadArrays, start an unraveling by eliminating: aLt0
!! Victory: deadArrays, start an unraveling by eliminating: tmp_1
 [dbg] engaging optional typecheck pass, AST size 9
[DBG OneDimensionalize]Computing 1D size for ndims 1 input exp EVr tmp_0_shape
 [dbg] engaging optional typecheck pass, AST size 15
 [dbg] engaging optional typecheck pass, AST size 15
COMPILETIME_phase2: 0.140196s
COMPILETIME_phase3: 0.002773s
COMPILETIME_emit: 0.421638s
[JIT] Invoking C compiler on: .genC_CilkParallel_oneDimp18f.c
[JIT] Found ICC (/l/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/bin/intel64/icc) Using it.
[JIT]   Compiling with: /l/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/bin/intel64/icc-fast -ww13397 -vec-report2 -g -lcilkrts -std=c99 -shared -fPIC .genC_CilkParallel_oneDimp18f.c -o .genC_CilkParallel_oneDimp18f.so
COMPILETIME_C: 0.302799s
.genC_CilkParallel_oneDimp18f.c(141): (col. 5) remark: loop was not vectorized: existence of vector dependence
.genC_CilkParallel_oneDimp18f.c(28): (col. 5) remark: LOOP WAS VECTORIZED
.genC_CilkParallel_oneDimp18f.c(14): (col. 1) remark: FUNCTION WAS VECTORIZED
.genC_CilkParallel_oneDimp18f.c(14): (col. 1) remark: FUNCTION WAS VECTORIZED
.genC_CilkParallel_oneDimp18f.c(14): (col. 1) remark: FUNCTION WAS VECTORIZED
.genC_CilkParallel_oneDimp18f.c(14): (col. 1) remark: FUNCTION WAS VECTORIZED
[JIT]: Working directory: /nfs/nfs3/home/rrnewton/working_copies/accelerate/accelerate-backend-kit/icc-opencl

 [dbg] Top lvl scalar binding: gensym_0 = 7
 [dbg] Top lvl scalar binding: tmp_0_shape = 7
 [dbg] Top lvl scalar binding: tmp_0_size = 7
==8168== Thread 3:
==8168== Invalid read of size 8
==8168==    at 0x7A0CC88: __intel_sse2_strlen (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7A217A0: tbb::internal::dynamic_link(char const*, tbb::internal::dynamic_link_descriptor const*, unsigned long, unsigned long, void**) (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7A21503: tbb::internal::rml::tbb_factory::open() (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7A21CE4: __cilkrts_rml_startup (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7A22158: __cilkrts_start_workers (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7A25F9D: __cilkrts_init_internal (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7A1550C: __cilkrts_bind_thread (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7A11213: void cilk_for_root<unsigned long, void (*)(void*, unsigned long, unsigned long)>(void (*)(void*, unsigned long, unsigned long), void*, unsigned long, int) (in /nfs/nfs5-insecure/home/insecure-ro/software/rhel6_x86_64/intel/composer_xe_2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64/libcilkrts.so.5)
==8168==    by 0x7802350: MainProg (.genC_CilkParallel_oneDimp18f.c:27)
==8168==    by 0x4140E3: ??? (in /nfs/nfs3/home/rrnewton/working_copies/accelerate/accelerate-backend-kit/icc-opencl/dist/build/test-accelerate-cpu-cilk/test-accelerate-cpu-cilk)
==8168==  Address 0x760c5a8 is 104 bytes inside a block of size 107 alloc'd
==8168==    at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==8168==    by 0x35A1A05BCD: open_path (in /lib64/ld-2.12.so)
==8168==    by 0x35A1A08464: _dl_map_object (in /lib64/ld-2.12.so)
==8168==    by 0x35A1A0C2D1: openaux (in /lib64/ld-2.12.so)
==8168==    by 0x35A1A0E1B5: _dl_catch_error (in /lib64/ld-2.12.so)
==8168==    by 0x35A1A0C9B4: _dl_map_object_deps (in /lib64/ld-2.12.so)
==8168==    by 0x35A1A12AA0: dl_open_worker (in /lib64/ld-2.12.so)
==8168==    by 0x35A1A0E1B5: _dl_catch_error (in /lib64/ld-2.12.so)
==8168==    by 0x35A1A124F9: _dl_open (in /lib64/ld-2.12.so)
==8168==    by 0x35A2200F65: dlopen_doit (in /lib64/libdl-2.12.so)
==8168==    by 0x35A1A0E1B5: _dl_catch_error (in /lib64/ld-2.12.so)
==8168==    by 0x35A220129B: _dlerror_run (in /lib64/libdl-2.12.so)
==8168== 
SELFTIMED: 0.110766s
[JIT] Finished executing dynamically loaded Acc computation!
[JIT] Fetched result ptr: tmp_0 = 0x0000000007648fc0 and size 7
 [dbg] !! Copying back 56 bytes (array len 7) into haskell heap!
 [dbg] !! Copying to 0x000000000511c320 from 0x0000000007648fc0
[JIT] Destroying args record: 0x0000000007611820
[JIT] Destroying results record: 0x0000000007611860
  : [OK]

         Test Cases  Total      
 Passed  1           1          
 Failed  0           0          
 Total   1           1          
==8168== Jump to the invalid address stated on the next line
==8168==    at 0x7A19DE0: ???
==8168==    by 0x35A2607A68: start_thread (in /lib64/libpthread-2.12.so)
==8168==    by 0x66016FF: ???
==8168==  Address 0x7a19de0 is not stack'd, malloc'd or (recently) free'd
==8168== 
==8168== 
==8168== Process terminating with default action of signal 11 (SIGSEGV)
==8168==  Access not within mapped region at address 0x7A19DE0
==8168==    at 0x7A19DE0: ???
==8168==    by 0x35A2607A68: start_thread (in /lib64/libpthread-2.12.so)
==8168==    by 0x66016FF: ???
==8168==  If you believe this happened as a result of a stack
==8168==  overflow in your program's main thread (unlikely but
==8168==  possible), you can try to increase the size of the
==8168==  main thread stack using the --main-stacksize= flag.
==8168==  The main thread stack size used in this run was 10485760.
==8168== 
==8168== HEAP SUMMARY:
==8168==     in use at exit: 227,288 bytes in 85 blocks
==8168==   total heap usage: 4,782 allocs, 4,697 frees, 6,578,989 bytes allocated
==8168== 
==8168== LEAK SUMMARY:
==8168==    definitely lost: 392 bytes in 4 blocks
==8168==    indirectly lost: 33 bytes in 3 blocks
==8168==      possibly lost: 576 bytes in 2 blocks
==8168==    still reachable: 226,287 bytes in 76 blocks
==8168==         suppressed: 0 bytes in 0 blocks
==8168== Rerun with --leak-check=full to see details of leaked memory
==8168== 
==8168== For counts of detected and suppressed errors, rerun with: -v
==8168== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 8 from 8)
Killed
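
One possible reading of the trace above, not confirmed: the final jump to 0x7A19DE0 lands inside the address range where libcilkrts.so.5 was mapped earlier in the run (0x7A0CC88 through 0x7A25F9D), and valgrind reports that address as no longer mapped. That would be consistent with withDL calling dlclose on the generated .so, which then unloads libcilkrts while a Cilk worker thread is still alive. If that is the cause, one workaround to try is opening the library without ever closing it. Below is a minimal sketch using System.Posix.DynamicLinker from the unix package. The helper name withDLKeepLoaded is hypothetical and is not something JITRuntime.hs currently defines.

```haskell
module Main (main) where

import System.Environment (getArgs)
import System.Posix.DynamicLinker (DL, RTLDFlags (RTLD_NOW), dlopen)

-- | Hypothetical stand-in for 'withDL' with the same argument shape.
-- It opens the library and runs the action, but never calls 'dlclose',
-- so the library (and anything it pulled in) stays mapped until the
-- process exits.
withDLKeepLoaded :: FilePath -> [RTLDFlags] -> (DL -> IO a) -> IO a
withDLKeepLoaded path flags act = dlopen path flags >>= act

main :: IO ()
main = do
  args <- getArgs
  case args of
    [soPath] ->
      withDLKeepLoaded soPath [RTLD_NOW] $ \_dl ->
        putStrLn ("loaded and left mapped: " ++ soPath)
    _ -> putStrLn "usage: keep-loaded <path-to-.so>"
```

The trade-off is that every JIT-compiled .so stays resident for the life of the process, so this is mainly useful for testing the hypothesis: if the SIGSEGV at exit disappears, the dlclose ordering is the likely culprit.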
@rrnewton rrnewton added the bug label Feb 13, 2014
@rrnewton rrnewton added this to the 1.0 milestone Feb 24, 2014