loom_shell - Artefacts on simpeStich Live #395

Closed
arpu opened this issue Nov 17, 2020 · 26 comments
@arpu

arpu commented Nov 17, 2020

Hey, with this config:

  setGlobalAttribute(0,1); // # 0 -- Profiler::0:OFF 1:ON Default:OFF

  setGlobalAttribute(7,1); // # 7 -- simple/quality stitch
  // Turn Off/ON ExpoComp
  setGlobalAttribute(1,0); // # 1 -- ExpoComp::0:OFF 1:ON Default:ON

  // Turn Off/ON SeamFind
  setGlobalAttribute(2,0); // # 2 -- SeamFind::0:OFF 1:ON Default:ON

  // Turn Off/ON Multiband & Num Bands
  setGlobalAttribute(5,0); // # 5 -- Multiband::0:OFF 1:ON Default:ON

  setGlobalAttribute(6,4); // # 6 -- Multiband Bands

I get strange artefacts:

Screenshot from 2020-11-17 17-04-02
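
For readers driving Loom from C++ rather than from a script, the numeric attribute indices in the config above can be given names. The following is only a sketch: the enumerator names, `AttrSetting`, `toScriptLine`, and `reportedConfig` are hypothetical and not part of loom_shell or MIVisionX. The index meanings are copied from the comments in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Attribute indices as documented in the config comments in this thread.
// The enumerator names are hypothetical; loom_shell only sees the numbers.
enum LoomAttr : int {
    kProfiler       = 0,  // 0:OFF 1:ON, default OFF
    kExpoComp       = 1,  // 0:OFF 1:ON, default ON
    kSeamFind       = 2,  // 0:OFF 1:ON, default ON
    kMultiband      = 5,  // 0:OFF 1:ON, default ON
    kMultibandBands = 6,  // number of multiband bands
    kStitchMode     = 7   // 0:quality stitch 1:simple stitch, default quality
};

struct AttrSetting {
    LoomAttr attr;
    int value;
};

// Render one setting as a script line in the same form as the config above.
std::string toScriptLine(const AttrSetting& s) {
    assert(s.value >= 0);  // assumption: every value used in this thread is non-negative
    return "setGlobalAttribute(" + std::to_string(static_cast<int>(s.attr)) + "," +
           std::to_string(s.value) + ");";
}

// The settings from the report: profiler on, simple stitch on,
// ExpoComp, SeamFind and Multiband off, band count left at 4.
std::vector<AttrSetting> reportedConfig() {
    return {
        {kProfiler, 1}, {kStitchMode, 1}, {kExpoComp, 0},
        {kSeamFind, 0}, {kMultiband, 0},  {kMultibandBands, 4},
    };
}
```

Calling `toScriptLine` on each element of `reportedConfig()` reproduces the six `setGlobalAttribute` lines of the report, which makes it easier to toggle one feature at a time while narrowing down the artefacts.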

rocminfo:

ROCk module is loaded
Able to open /dev/kfd read-write
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3900                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            4                                  
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    16307136(0xf8d3c0) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    16307136(0xf8d3c0) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
    N/A                      
*******                  
Agent 2                  
*******                  
  Name:                    gfx803                             
  Uuid:                    GPU-XX                             
  Marketing Name:          Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X]
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          4096(0x1000)                       
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
  Chip ID:                 26607(0x67ef)                      
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   1210                               
  BDFID:                   256                                
  Internal Node ID:        1                                  
  Compute Unit:            14                                 
  SIMDs per CU:            4                                  
  Shader Engines:          2                                  
  Shader Arrs. per Eng.:   1                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      FALSE                              
  Wavefront Size:          64(0x40)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        40(0x28)                           
  Max Work-item Per CU:    2560(0xa00)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    4194304(0x400000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx803          
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***  
@kiritigowda kiritigowda added the bug Something isn't working label Nov 17, 2020
@kiritigowda
Collaborator

@arpu can you send me your loom_shell script and test case images?

@arpu
Author

arpu commented Nov 17, 2020

I use a C++ Qt GUI for this, but I will try to make a simple loom_shell sample.

@kiritigowda
Collaborator

@arpu you can look at this sample for reference

@arpu
Author

arpu commented Nov 17, 2020

OK, here we go: this uses the sample, but with simple stitch enabled.

loomStitch-sample1.txt

LoomOutputStitch.zip

(zip includes the output image)

@kiritigowda
Collaborator

@arpu Running the sample on Windows [AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx 2.30 GHz], I had no issues, so this seems to be isolated to Linux. See the output image below.

I also noticed that you are using OpenCL 1.2:

OK: OpenVX using GPU device#0 (gfx803) [OpenCL 1.2 ] [SvmCaps 0 1]

compared to:

OK: OpenVX using GPU device#0 (gfx902) [OpenCL 2.0 AMD-APP (3075.12)] [SvmCaps 0 0]

Loom uses features from OpenCL 2.0, and that might be the problem. I will investigate; in the meantime, try using OpenCL 2.0.
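
When comparing logs from several machines, the device line can be checked mechanically. This is a rough sketch that assumes the log line keeps the exact shape quoted above; `DeviceLine`, `parseDeviceLine`, and `reportsAtLeast` are hypothetical helper names, not part of loom_shell.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical holder for the fields of an
// "OK: OpenVX using GPU device#N (name) [OpenCL X.Y ...]" log line.
struct DeviceLine {
    std::string name;   // e.g. "gfx803"
    int majorVer = 0;   // OpenCL major version printed in the log
    int minorVer = 0;   // OpenCL minor version printed in the log
};

// Extract the device name and OpenCL version from one log line.
// Returns false if the line does not have the expected shape.
bool parseDeviceLine(const std::string& line, DeviceLine& out) {
    const std::size_t pos = line.find("GPU device#");
    if (pos == std::string::npos) return false;
    int index = 0;
    char name[64] = {0};
    int majorVer = 0;
    int minorVer = 0;
    const int fields = std::sscanf(line.c_str() + pos,
                                   "GPU device#%d (%63[^)]) [OpenCL %d.%d",
                                   &index, name, &majorVer, &minorVer);
    if (fields != 4) return false;
    out.name = name;
    out.majorVer = majorVer;
    out.minorVer = minorVer;
    return true;
}

// True if the logged OpenCL version is at least majorVer.minorVer.
bool reportsAtLeast(const DeviceLine& d, int majorVer, int minorVer) {
    assert(majorVer >= 1 && minorVer >= 0);
    return d.majorVer > majorVer ||
           (d.majorVer == majorVer && d.minorVer >= minorVer);
}
```

For the two lines quoted above, `reportsAtLeast(d, 2, 0)` is false for gfx803 and true for gfx902.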


loom_shell.exe 0.9.9 [loomsl 0.9.9]
... processing commands from loomStitch-sample1.txt
..ls_context context[1] created
..lsCreateContext: created context context[0]
..lsSetOutputConfig: successful for context[0]
..lsSetCameraConfig: successful for context[0]
OK: enabled graph scheduling in separate threads
OK: enabled graph scheduling in separate threads
OK: OpenVX using GPU device#0 (gfx902) [OpenCL 2.0 AMD-APP (3075.12)] [SvmCaps 0 0]
..lsInitialize: successful for context[0] (3552.072 ms)
..cl_mem mem[2] created
..cl_context opencl_context[1] created
..lsGetOpenCLContext: get OpenCL context opencl_context[0] from context[0]
OK: loaded cam00.bmp
OK: loaded cam01.bmp
OK: loaded cam02.bmp
OK: loaded cam03.bmp
..lsSetCameraBuffer: set OpenCL buffer mem[0] for context[0]
..lsSetOutputBuffer: set OpenCL buffer mem[1] for context[0]
OK: run: executed for 100 frames
OK: run: Time:   3.933 ms (min);   6.963 ms (avg);   8.065 ms (max);   6.222 ms (1st-frame) of 100 frames
OK: created LoomOutputStitch.bmp
> stitch graph profile
 COUNT,tmp(ms),avg(ms),min(ms),max(ms),DEV,KERNEL
   100,  7.740,  6.931,  3.919,  8.038,CPU,GRAPH
   100,  7.728,  6.919,  3.914,  8.018,GPU,com.amd.openvx.Remap_U24_U24_Bilinear
OK: OpenCL buffer usage: 147456544, 3/3
..lsReleaseContext: released context context[0]
... exit from loomStitch-sample1.txt

output

@arpu
Author

arpu commented Nov 19, 2020

Hey, thanks for looking at this.

Do you think the OpenCL kernel build should add -cl-std=CL2.0?

The clinfo output is:

Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.0 AMD-APP (3204.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx803
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.2 
  Driver Version                                  3204.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Type                                     GPU
  Device Board Name (AMD)                         Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X]

@kiritigowda
Collaborator

Cool, let me know if using OpenCL 2.0 fixes all these issues. You can send a PR to add an explicit 2.0 flag.

Also, as a side note, using features like SeamFind and multiband blend makes the image better. See the samples below:

  • No SeamFind

no-seamfind

  • With SeamFind

seamfind

@arpu
Author

arpu commented Nov 19, 2020

OK, any idea how I can set the 2.0 flag?

@kiritigowda
Collaborator

kiritigowda commented Nov 19, 2020

@arpu try this on your system

sudo apt autoremove ocl-icd-*

Then reboot and recompile MIVisionX. After the reboot, also check the output of the command below. This should solve your issue.

/opt/rocm/opencl/bin/clinfo

Let me know if this solves the issues.

@arpu
Author

arpu commented Nov 19, 2020

Hmm, I have some new info from testing with runcl from the utilities:

./runcl -v -gpu
OK: DEVICE # 0 [gfx803]
OK: Using GPU device#0 [gfx803]

OpenCL Device Information:
  DEVICE_NAME              : gfx803
  MAX_CLOCK_FREQUENCY      : 1210 MHz
  DEVICE_VENDOR            : Advanced Micro Devices, Inc.
  DRIVER_VERSION           : 3204.0 (HSA1.1,LC)
  DEVICE_VERSION           : OpenCL 1.2 
  MAX_COMPUTE_UNITS        : 14
  MAX_WORK_ITEM_DIMENSIONS : 3
  MAX_WORK_ITEM_SIZES      : 0 -1361356288 32704
  MAX_WORK_GROUP_SIZE      : 0
  ADDRESS_BITS             : 64 bits
  MEM_BASE_ADDR_ALIGN      : 1024 bits
  MAX_MEM_ALLOC_SIZE       : 3481 MB
  GLOBAL_MEM_SIZE          : 4096 MB
  GLOBAL_MEM_CACHE_TYPE    : READ WRITE
  GLOBAL_MEM_CACHELINE_SIZE: 64 bytes
  GLOBAL_MEM_CACHE_SIZE    : 16 KB
  LOCAL_MEM_TYPE           : LOCAL
  LOCAL_MEM_SIZE           : 64 KB

@arpu
Author

arpu commented Nov 19, 2020

And here is the info from ROCm's opencl/bin/clinfo:

/opt/rocm-3.9.1 ❯ opencl/bin/clinfo
Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.0 AMD-APP (3204.0)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback 


  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 1
  Device Type:					 CL_DEVICE_TYPE_GPU
  Vendor ID:					 1002h
  Board name:					 Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X]
  Device Topology:				 PCI[ B#1, D#0, F#0 ]
  Max compute units:				 14
  Max work items dimensions:			 3
    Max work items[0]:				 1024
    Max work items[1]:				 1024
    Max work items[2]:				 1024
  Max work group size:				 256
  Preferred vector width char:			 4
  Preferred vector width short:			 2
  Preferred vector width int:			 1
  Preferred vector width long:			 1
  Preferred vector width float:			 1
  Preferred vector width double:		 1
  Native vector width char:			 4
  Native vector width short:			 2
  Native vector width int:			 1
  Native vector width long:			 1
  Native vector width float:			 1
  Native vector width double:			 1
  Max clock frequency:				 1210Mhz
  Address bits:					 64
  Max memory allocation:			 3650722200
  Image support:				 Yes
  Max number of images read arguments:		 128
  Max number of images write arguments:		 8
  Max image 2D width:				 16384
  Max image 2D height:				 16384
  Max image 3D width:				 2048
  Max image 3D height:				 2048
  Max image 3D depth:				 2048
  Max samplers within kernel:			 26607
  Max size of kernel argument:			 1024
  Alignment (bits) of base address:		 1024
  Minimum alignment (bytes) for any datatype:	 128
  Single precision floating point capability
    Denorms:					 No
    Quiet NaNs:					 Yes
    Round to nearest even:			 Yes
    Round to zero:				 Yes
    Round to +ve and infinity:			 Yes
    IEEE754-2008 fused multiply-add:		 Yes
  Cache type:					 Read/Write
  Cache line size:				 64
  Cache size:					 16384
  Global memory size:				 4294967296
  Constant buffer size:				 3650722200
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 65536
  Max pipe arguments:				 16
  Max pipe active reservations:			 16
  Max pipe packet size:				 3650722200
  Max global variable size:			 3650722200
  Max global variable preferred total size:	 4294967296
  Max read/write image args:			 64
  Max on device events:				 1024
  Queue on device max size:			 8388608
  Max on device queues:				 1
  Queue on device preferred size:		 262144
  SVM capabilities:				 
    Coarse grain buffer:			 Yes
    Fine grain buffer:				 Yes
    Fine grain system:				 No
    Atomics:					 No
  Preferred platform atomic alignment:		 0
  Preferred global atomic alignment:		 0
  Preferred local atomic alignment:		 0
  Kernel Preferred work group size multiple:	 64
  Error correction support:			 0
  Unified memory for Host and Device:		 0
  Profiling timer resolution:			 1
  Device endianess:				 Little
  Available:					 Yes
  Compiler available:				 Yes
  Execution capabilities:				 
    Execute OpenCL kernels:			 Yes
    Execute native function:			 No
  Queue on Host properties:				 
    Out-of-Order:				 No
    Profiling :					 Yes
  Queue on Device properties:				 
    Out-of-Order:				 Yes
    Profiling :					 Yes
  Platform ID:					 0x7f22ce436cf0
  Name:						 gfx803
  Vendor:					 Advanced Micro Devices, Inc.
  Device OpenCL C version:			 OpenCL C 2.0 
  Driver version:				 3204.0 (HSA1.1,LC)
  Profile:					 FULL_PROFILE
  Version:					 OpenCL 1.2 
  Extensions:					 cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 

@arpu
Author

arpu commented Nov 19, 2020

OK, from the docs:

OpenCL 2.0 support

ROCm 2.0 introduces full support for kernels written in the OpenCL 2.0 C language on certain devices and systems. Applications can detect this support by calling the “clGetDeviceInfo” query function with the “param_name” argument set to CL_DEVICE_OPENCL_C_VERSION.

https://rocmdocs.amd.com/en/latest/Current_Release_Notes/Current-Release-Notes.html

It looks like ago does not check CL_DEVICE_OPENCL_C_VERSION; it uses CL_DEVICE_VERSION instead.
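
The difference between the two queries can be sketched as follows. CL_DEVICE_VERSION returns a string of the form "OpenCL <major>.<minor> ..." while CL_DEVICE_OPENCL_C_VERSION returns "OpenCL C <major>.<minor> ...". In a real application both strings would come from clGetDeviceInfo; that call is left out here so the example stays self-contained. `ClVersion`, `parseClVersion`, `languageNewerThanDevice`, and `clStdOption` are hypothetical names, not ago or MIVisionX functions, and whether passing -cl-std=CL2.0 is sufficient on a given driver is not established in this thread.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical parsed OpenCL version; majorVer == 0 means "not parsed yet".
struct ClVersion {
    int majorVer = 0;
    int minorVer = 0;
};

// Parse either "OpenCL C <major>.<minor> ..." (CL_DEVICE_OPENCL_C_VERSION)
// or "OpenCL <major>.<minor> ..." (CL_DEVICE_VERSION).
bool parseClVersion(const std::string& text, ClVersion& out) {
    int a = 0;
    int b = 0;
    if (std::sscanf(text.c_str(), "OpenCL C %d.%d", &a, &b) != 2 &&
        std::sscanf(text.c_str(), "OpenCL %d.%d", &a, &b) != 2) {
        return false;
    }
    out.majorVer = a;
    out.minorVer = b;
    return true;
}

// True when the device reports a newer OpenCL C language version than its
// device version, as gfx803 does in the clinfo output in this thread.
bool languageNewerThanDevice(const ClVersion& device, const ClVersion& language) {
    assert(device.majorVer >= 1 && language.majorVer >= 1);  // both must be parsed
    return language.majorVer > device.majorVer ||
           (language.majorVer == device.majorVer &&
            language.minorVer > device.minorVer);
}

// Build option to request OpenCL C 2.0, or "" to keep the compiler default.
std::string clStdOption(const ClVersion& language) {
    return (language.majorVer == 2 && language.minorVer == 0) ? "-cl-std=CL2.0" : "";
}
```

With the strings reported for gfx803 ("OpenCL 1.2 " and "OpenCL C 2.0 "), `languageNewerThanDevice` is true and `clStdOption` yields `-cl-std=CL2.0`, which could then be passed as the options argument of clBuildProgram.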

@kiritigowda
Collaborator

@arpu I will look into this.

As a temporary workaround, you could boot your machine into Windows and try out the flow.

@kiritigowda
Collaborator

@paveltc stitch mode 1 (simple stitch) needs to be added to the tests. Change sample-1 to set:

setGlobalAttribute(7,1);        # 7 -- simple/quality stitch::0:quality stitch 1:simple stitch Default:quality stitch

@arpu
Author

arpu commented Nov 19, 2020

@kiritigowda sorry, I have no Windows installed.

I tested with simple stitch and get the artefacts.

@kiritigowda
Collaborator

@arpu I will try to resolve this soon; I have a reproducible test case. Thanks for bringing this to our attention.

@kiritigowda kiritigowda changed the title [loom_shell] Artefacts on simpeStich Live loom_shell - Artefacts on simpeStich Live Nov 19, 2020
@arpu
Author

arpu commented Nov 25, 2020

@kiritigowda is there anything I could provide to help?

@arpu
Author

arpu commented Nov 29, 2020

OK, it looks like my GPU does not support OpenCL 2.0:
ROCm/ROCm-OpenCL-Runtime#127

Anyway, OpenCL 1.2 should be good enough for this.

@arpu
Author

arpu commented Dec 9, 2020

@kiritigowda is there anything I can play with? From what I understand, OpenCL 1.2 should be good enough for Loom.

@kiritigowda
Collaborator

@arpu I don't think OpenCL 1.2 will solve this issue. Some features used in the code require OpenCL 2.0+. Let me verify this bug on Linux and look at possible fallbacks. I will let you know once there is a resolution. Thanks!

@arpu
Author

arpu commented Jan 12, 2021

@kiritigowda any news on this?

@kiritigowda
Collaborator

@arpu this issue has been fixed by the top-of-tree (TOT) changes. Can you verify, so that we can close this issue?

  • Sample-1 with a simple stitch
loom_shell loomStitch-sample1.txt 
loom_shell 0.9.9 [loomsl 0.9.9]
... processing commands from loomStitch-sample1.txt
..ls_context context[1] created
..lsCreateContext: created context context[0]
..lsSetOutputConfig: successful for context[0]
..lsSetCameraConfig: successful for context[0]
OK: OpenVX using GPU device#0 (gfx906) [OpenCL 2.0 ] [SvmCaps 0 0]
..lsInitialize: successful for context[0] (1517.891 ms)
..cl_mem mem[2] created
..cl_context opencl_context[1] created
..lsGetOpenCLContext: get OpenCL context opencl_context[0] from context[0]
OK: loaded cam00.bmp
OK: loaded cam01.bmp
OK: loaded cam02.bmp
OK: loaded cam03.bmp
..lsSetCameraBuffer: set OpenCL buffer mem[0] for context[0]
..lsSetOutputBuffer: set OpenCL buffer mem[1] for context[0]
OK: run: executed for 100 frames
OK: run: Time:   1.058 ms (min);   1.077 ms (avg);   1.088 ms (max);   1.076 ms (1st-frame) of 100 frames
OK: created LoomOutputStitch.bmp
> stitch graph profile
 COUNT,tmp(ms),avg(ms),min(ms),max(ms),DEV,KERNEL
   101,  0.565,  0.569,  0.543,  0.785,CPU,GRAPH
   101,  0.564,  0.568,  0.542,  0.782,GPU,com.amd.openvx.Remap_U24_U24_Bilinear
..lsReleaseContext: released context context[0]
... exit from loomStitch-sample1.txt

Screen Shot 2021-03-24 at 10 57 25 AM

@arpu
Author

arpu commented Mar 24, 2021

Yeah, I saw the commit and want to test this,
but building on my PC now fails with the latest master:

❯ make
[ 50%] Linking CXX executable runcl
/usr/bin/ld: CMakeFiles/runcl.dir/runcl.cpp.o: in function `initialize(int, int, char const*)':
runcl.cpp:(.text+0x1a9): undefined reference to `clGetPlatformIDs'
/usr/bin/ld: runcl.cpp:(.text+0x24c): undefined reference to `clCreateContextFromType'
/usr/bin/ld: runcl.cpp:(.text+0x2bc): undefined reference to `clGetContextInfo'
/usr/bin/ld: runcl.cpp:(.text+0x392): undefined reference to `clGetContextInfo'
/usr/bin/ld: runcl.cpp:(.text+0x3d2): undefined reference to `clReleaseContext'
/usr/bin/ld: runcl.cpp:(.text+0x456): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x523): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x5fc): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x628): undefined reference to `clCreateContext'
/usr/bin/ld: runcl.cpp:(.text+0x6db): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x724): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x76b): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x7b7): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x803): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: CMakeFiles/runcl.dir/runcl.cpp.o:runcl.cpp:(.text+0x84c): more undefined references to `clGetDeviceInfo' follow
/usr/bin/ld: CMakeFiles/runcl.dir/runcl.cpp.o: in function `shutdown()':
runcl.cpp:(.text+0xcd8): undefined reference to `clReleaseCommandQueue'
/usr/bin/ld: runcl.cpp:(.text+0xcf3): undefined reference to `clReleaseContext'
/usr/bin/ld: CMakeFiles/runcl.dir/runcl.cpp.o: in function `main':
runcl.cpp:(.text+0x56be): undefined reference to `clCreateProgramWithBinary'
/usr/bin/ld: runcl.cpp:(.text+0x572f): undefined reference to `clCreateProgramWithSource'
/usr/bin/ld: runcl.cpp:(.text+0x57ac): undefined reference to `clBuildProgram'
/usr/bin/ld: runcl.cpp:(.text+0x5801): undefined reference to `clGetProgramBuildInfo'
/usr/bin/ld: runcl.cpp:(.text+0x593b): undefined reference to `clGetProgramInfo'
/usr/bin/ld: runcl.cpp:(.text+0x59f3): undefined reference to `clGetProgramInfo'
/usr/bin/ld: runcl.cpp:(.text+0x5b32): undefined reference to `clGetProgramInfo'
/usr/bin/ld: runcl.cpp:(.text+0x5d56): undefined reference to `clCreateKernel'
/usr/bin/ld: runcl.cpp:(.text+0x5db0): undefined reference to `clReleaseProgram'
/usr/bin/ld: runcl.cpp:(.text+0x5e04): undefined reference to `clGetKernelWorkGroupInfo'
/usr/bin/ld: runcl.cpp:(.text+0x5e7e): undefined reference to `clGetKernelWorkGroupInfo'
/usr/bin/ld: runcl.cpp:(.text+0x5eee): undefined reference to `clGetKernelWorkGroupInfo'
/usr/bin/ld: runcl.cpp:(.text+0x5f53): undefined reference to `clGetKernelWorkGroupInfo'
/usr/bin/ld: runcl.cpp:(.text+0x5fad): undefined reference to `clCreateCommandQueueWithProperties'
/usr/bin/ld: runcl.cpp:(.text+0x63b2): undefined reference to `clCreateImage'
/usr/bin/ld: runcl.cpp:(.text+0x66cf): undefined reference to `clEnqueueWriteImage'
/usr/bin/ld: runcl.cpp:(.text+0x68c1): undefined reference to `clCreateBuffer'
/usr/bin/ld: runcl.cpp:(.text+0x6978): undefined reference to `clEnqueueWriteBuffer'
/usr/bin/ld: runcl.cpp:(.text+0x6a18): undefined reference to `clSetKernelArg'
/usr/bin/ld: runcl.cpp:(.text+0x6aea): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x6b5c): undefined reference to `clGetKernelWorkGroupInfo'
/usr/bin/ld: runcl.cpp:(.text+0x6bce): undefined reference to `clGetKernelWorkGroupInfo'
/usr/bin/ld: runcl.cpp:(.text+0x6c2f): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x6c7e): undefined reference to `clGetDeviceInfo'
/usr/bin/ld: runcl.cpp:(.text+0x6cd4): undefined reference to `clGetKernelWorkGroupInfo'
/usr/bin/ld: runcl.cpp:(.text+0x6d5e): undefined reference to `clFinish'
/usr/bin/ld: runcl.cpp:(.text+0x6dbf): undefined reference to `clEnqueueNDRangeKernel'
/usr/bin/ld: runcl.cpp:(.text+0x6e12): undefined reference to `clFinish'
/usr/bin/ld: runcl.cpp:(.text+0x6f9e): undefined reference to `clFinish'
/usr/bin/ld: runcl.cpp:(.text+0x7018): undefined reference to `clEnqueueNDRangeKernel'
/usr/bin/ld: runcl.cpp:(.text+0x7077): undefined reference to `clFinish'
/usr/bin/ld: runcl.cpp:(.text+0x73e5): undefined reference to `clEnqueueReadImage'
/usr/bin/ld: runcl.cpp:(.text+0x75f9): undefined reference to `clEnqueueReadBuffer'
collect2: error: ld returned 1 exit status
make[2]: *** [runcl/CMakeFiles/runcl.dir/build.make:103: runcl/runcl] Error 1
make[1]: *** [CMakeFiles/Makefile2:182: runcl/CMakeFiles/runcl.dir/all] Error 2
make: *** [Makefile:149: all] Error 2

Not sure what the problem is.

@kiritigowda
Collaborator

@arpu looks like a linking error. Can you do a clean build following the instructions here - https://github.com/GPUOpen-ProfessionalCompute-Libraries/MIVisionX/wiki/Linux#linux-build-instructions

@arpu
Author

arpu commented Mar 24, 2021

Hmm, I got it working by adding the OpenCL .so to the C++ link command line:

❯ /usr/bin/c++ -DENABLE_OPENCL=1 -I/opt/rocm-4.0.1/opencl/include -I/opt/rocm-4.0.1/opencl/include/Headers -std=c++11 -rdynamic CMakeFiles/loom_shell.dir/loom_shell_util.cpp.o CMakeFiles/loom_shell.dir/loom_shell.cpp.o CMakeFiles/loom_shell.dir/help.cpp.o -o loom_shell  -lvx_loomsl -lopenvx -lpthread -lOpenCL
/usr/bin/ld: cannot find -lOpenCL
collect2: error: ld returned 1 exit status
❯ /usr/bin/c++ -DENABLE_OPENCL=1 -I/opt/rocm-4.0.1/opencl/include -I/opt/rocm-4.0.1/opencl/include/Headers -std=c++11 -rdynamic CMakeFiles/loom_shell.dir/loom_shell_util.cpp.o CMakeFiles/loom_shell.dir/loom_shell.cpp.o CMakeFiles/loom_shell.dir/help.cpp.o -o loom_shell  -lvx_loomsl -lopenvx -lpthread -lOpenCl
/usr/bin/ld: cannot find -lOpenCl
collect2: error: ld returned 1 exit status
❯ /usr/bin/c++ -DENABLE_OPENCL=1 -I/opt/rocm-4.0.1/opencl/include -I/opt/rocm-4.0.1/opencl/include/Headers -std=c++11 -rdynamic CMakeFiles/loom_shell.dir/loom_shell_util.cpp.o CMakeFiles/loom_shell.dir/loom_shell.cpp.o CMakeFiles/loom_shell.dir/help.cpp.o -o loom_shell  -lvx_loomsl -lopenvx -lpthread  /opt/rocm-4.0.1/opencl/lib/libamdocl64.so
❯ ll
total 636K
-rw-r--r--. 1 arpu arpu  15K Mar 24 21:04 CMakeCache.txt
drwxr-xr-x. 6 arpu arpu 4.0K Mar 24 21:04 CMakeFiles
-rw-r--r--. 1 arpu arpu 2.0K Mar 24 16:45 CMakeLists.txt
-rw-r--r--. 1 arpu arpu 8.9K Mar 24 21:04 Makefile
-rw-r--r--. 1 arpu arpu  12K Mar 24 16:45 README.md
-rw-r--r--. 1 arpu arpu 2.5K Mar 24 21:04 cmake_install.cmake
-rw-r--r--. 1 arpu arpu  18K Mar 24 16:45 help.cpp
-rwxr-xr-x. 1 arpu arpu 399K Mar 24 21:09 loom_shell
-rw-r--r--. 1 arpu arpu  85K Mar 24 16:45 loom_shell.cpp
-rw-r--r--. 1 arpu arpu 3.8K Mar 24 16:45 loom_shell.h
-rw-r--r--. 1 arpu arpu  919 Mar 24 16:45 loom_shell.sln
-rw-r--r--. 1 arpu arpu 5.1K Mar 24 16:45 loom_shell.vcxproj
-rw-r--r--. 1 arpu arpu 1.4K Mar 24 16:45 loom_shell.vcxproj.filters
-rw-r--r--. 1 arpu arpu  45K Mar 24 16:45 loom_shell_util.cpp
-rw-r--r--. 1 arpu arpu 4.5K Mar 24 16:45 loom_shell_util.h
❯ ./loom_shell
loom_shell 0.9.9 [loomsl 0.9.9]

@arpu
Author

arpu commented Mar 24, 2021

After this, the stitching looks perfect now! Thanks a lot. I will close this and file a new bug report for the linking problem.

@arpu arpu closed this as completed Mar 24, 2021