I'm using CUDA 12.1 and `make -j` fails with errors. The full build process is as follows:
(cuda12_1) root@I19359398490090128f:/hy-tmp# cd fastllm-master/
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# mkdir build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# cd build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# cmake .. -DUSE_CUDA=ON
-- The CXX compiler identification is GNU 9.4.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- USE_CUDA: ON
-- USE_TFACC: OFF
-- For legacy CUDA GPUs: OFF
-- PYTHON_API: OFF
-- BUILD_CLI: OFF
-- USE_SENTENCEPIECE: OFF
-- USE_IVCOREX: OFF
-- CMAKE_CXX_FLAGS: -pthread --std=c++17 -O2 -march=native
-- The CUDA compiler identification is NVIDIA 12.1.105
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /hy-tmp/fastllm-master/build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# make -j
Scanning dependencies of target fastllm
Scanning dependencies of target fastllm_tools
[ 1%] Building CXX object CMakeFiles/fastllm.dir/src/fastllm.cpp.o
[ 3%] Building CXX object CMakeFiles/fastllm.dir/src/models/moss.cpp.o
[ 5%] Building CXX object CMakeFiles/fastllm.dir/src/models/llama.cpp.o
[ 6%] Building CXX object CMakeFiles/fastllm.dir/src/model.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/executor.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/chatglm.cpp.o
[ 12%] Building CXX object CMakeFiles/fastllm.dir/src/device.cpp.o
[ 13%] Building CXX object CMakeFiles/fastllm_tools.dir/src/fastllm.cpp.o
[ 15%] Building CXX object CMakeFiles/fastllm.dir/src/models/chatglm.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm_tools.dir/src/model.cpp.o
[ 20%] Building CXX object CMakeFiles/fastllm.dir/src/executor.cpp.o
[ 22%] Building CXX object CMakeFiles/fastllm.dir/src/models/glm.cpp.o
[ 24%] Building CXX object CMakeFiles/fastllm_tools.dir/src/device.cpp.o
[ 25%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/llama.cpp.o
[ 27%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevice.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moss.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 32%] Building CXX object CMakeFiles/fastllm.dir/src/models/deepseekv2.cpp.o
[ 34%] Building CXX object CMakeFiles/fastllm.dir/src/models/basellm.cpp.o
[ 36%] Building CXX object CMakeFiles/fastllm.dir/src/template.cpp.o
[ 37%] Building CXX object CMakeFiles/fastllm.dir/src/models/minicpm.cpp.o
[ 39%] Building CXX object CMakeFiles/fastllm_tools.dir/src/template.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm.dir/src/models/bert.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevice.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/basellm.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/glm.cpp.o
[ 48%] Building CXX object CMakeFiles/fastllm.dir/src/models/qwen.cpp.o
[ 50%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/qwen.cpp.o
[ 51%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm.dir/src/models/internlm2.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/minicpm.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/internlm2.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/bert.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm.dir/src/models/moe.cpp.o
[ 62%] Building CXX object CMakeFiles/fastllm_tools.dir/third_party/json11/json11.cpp.o
[ 63%] Building CUDA object CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/tools/src/pytools.cpp.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moe.cpp.o
[ 68%] Building CUDA object CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/third_party/json11/json11.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/deepseekv2.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevice.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevice.cpp.o
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
b[idx] = __hmul(a[idx], v);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
const uint8_t *baseB = B + p * m;
^
Remark: The warnings can be suppressed with "-diag-suppress "
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
float4 regA;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
union_char4 regB;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
const size_t ST128_FP16_COUNT = 8;
^
12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm_tools.dir/build.make:336: CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "hexp" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "hexp" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
b[idx] = __hmul(a[idx], v);
b[idx] = __hmul(a[idx], v);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
const uint8_t *baseB = B + p * m;
^
Remark: The warnings can be suppressed with "-diag-suppress "
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
float4 regA;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
union_char4 regB;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func<THREAD_PER_BLOCK>(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1<THREAD_PER_BLOCK>(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
const size_t ST128_FP16_COUNT = 8;
^
12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm.dir/build.make:336: CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:279: CMakeFiles/fastllm.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:90: CMakeFiles/fastllm_tools.dir/all] Error 2
make: *** [Makefile:84: all] Error 2