BLAS multikernel benchmark example doesn't work on U250 platform #193

Open
fiannone opened this issue Jan 3, 2024 · 1 comment

fiannone commented Jan 3, 2024

I'd like to report that the Vitis Library GEMM multi-kernel example fails with both 4 and 2 kernels when targeting the Alveo U250 board, compiled with Vitis 2022.2 and 2023.1 (platform xilinx_u250_gen3x16_xdma_4_1_202210_1). The same example works with the hw_emu target, whereas the hw target fails with the following error:

[21:18:39] Run vpl: FINISHED. Run Status: impl ERROR

===>The following messages were generated while Compiling (synthesis checkpoint) kernel/IP: ulp_m01_regslice_3 Log file: /afs/enea.it/fra/user/iannone/.Xilinx/Vitis/2022.2/Vitis_Libraries/BLAS_bench/L3/benchmarks/gemm/memKernel/_x_temp.hw.xilinx_u250_gen3x16_xdma_4_1_202210_1/link/vivado/vpl/prj/prj.runs/ulp_m01_regslice_3_synth_1/runme.log :
ERROR: [VPL 17-356] Failed to install all user apps.

===>The following messages were generated while processing /afs/enea.it/fra/user/iannone/.Xilinx/Vitis/2022.2/Vitis_Libraries/BLAS_bench/L3/benchmarks/gemm/memKernel/_x_temp.hw.xilinx_u250_gen3x16_xdma_4_1_202210_1/link/vivado/vpl/prj/prj.runs/impl_1 :
ERROR: [VPL 30-487] The packing of instances into a set of CLBs defined by a pblock constraint could not be obeyed. There are a total of 25680 CLBs in the pblock, of which 20686 CLBs are available, however, the unplaced instances require 23494 CLBs. The unavailable CLBs are either taken by placed instances or are blocked due to exclude placement constraints. Please analyze your design to determine if the pblock can be resized or the number of LUTs, FFs, and/or control sets can be reduced.

Number of control sets and instances constrained to the pBlock
Control sets: 2885
Luts: 232127 (combined) 265463 (total), available capacity: 205440
Flip flops: 365122, available capacity: 410880
NOTE: each CLB can only accommodate up to 4 unique control sets so FFs cannot be packed to fully fill every CLB

To attempt placement at higher effort levels at the expense of runtime, please use the following tcl command, setting the value of limit to 2000 or more.
set_param place.sliceLegEffortLimit limit
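
As a rough sanity check on the figures in the log above, the following Python sketch computes how far the design overshoots the pblock. The numbers are copied from the VPL 30-487 message; the variable names and the `utilization` helper are only illustrative and are not part of any Xilinx tool.

```python
# Figures copied verbatim from the VPL 30-487 message above.
clbs_available = 20686   # CLBs still free in the pblock
clbs_required = 23494    # CLBs needed by the unplaced instances
luts_combined = 232127   # LUT count with LUT combining
luts_total = 265463      # LUT count without combining
lut_capacity = 205440    # "available capacity" reported for LUTs
ffs_used = 365122        # flip-flops constrained to the pblock
ff_capacity = 410880     # "available capacity" reported for flip-flops


def utilization(used, capacity):
    """Return demand as a percentage of capacity (hypothetical helper)."""
    return 100.0 * used / capacity


report = {
    "CLBs": utilization(clbs_required, clbs_available),
    "LUTs (combined)": utilization(luts_combined, lut_capacity),
    "LUTs (total)": utilization(luts_total, lut_capacity),
    "Flip-flops": utilization(ffs_used, ff_capacity),
}

for name, pct in report.items():
    flag = "OVER" if pct > 100.0 else "ok"
    print(f"{name}: {pct:.1f}% of capacity ({flag})")
```

Any ratio above 100% means the demand exceeds what the pblock offers. If that is the case for the LUTs even in their combined form, a higher placement effort limit alone may not be enough; this is only an inference from the log figures, not something the tools confirm.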

My feeling is that this is very serious, since an example developed by Xilinx developers doesn't work. It would be better to remove it from the GitHub repository to avoid wasting time. Furthermore, I'd like to use the Tcl command to set a new limit for place.sliceLegEffortLimit, but I don't know how to open a tclsh console through the makefile parameters of the GEMM Bias Vitis library example.

Finally, I'd like to inform you that my colleague Paolo and I are involved in the EuroHPC TEXTAROSSA project, and we are going to deliver a report in which we'll inform the EuroHPC developer community about the poor support from AMD Xilinx for an example developed by AMD that doesn't work.


afzalxo commented Mar 20, 2024

I would like to agree with the gentleman above.
