Hello! First, I'd like to say thank you for putting together a well-documented project that is straightforward to reproduce. I have run the basic example (YOLOv3 inference using solo.sh) several times without issue.
I was trying to run some other networks on the NVDLA using other software (the nvdla_runtime binary from NVIDIA's nvdla/sw repo), and ran into some issues that I'm still trying to debug. After getting those errors, I wanted to check if the NVDLA was still in a known-good state, so I ran the darknet-nvdla/solo.sh workload, and got this result:
```
# ./solo.sh
[ 9.508000] random: crng init done
learning_rate: Using default '0.001000'
momentum: Using default '0.900000'
decay: Using default '0.000100'
policy: Using default 'constant'
max_batches: Using default '0'
layer filters size input output
0 offset: Using default '0.000000'
shifter: Using default '0'
post_offset: Using default '0.000000'
post_scale: Using default '1.000000'
outputs 692224 num_out 5537792
1 odla tensor 0 416 x 416 x 4 -> 52 x 52 x 256
odla tensor 1 416 x 416 x 4 -> 26 x 26 x 512
odla tensor 2 416 x 416 x 4 -> 13 x 13 x 255
odla tensor 3 416 x 416 x 4 -> 13 x 13 x 256
2 input layer 1 tensor 3
make_split_layer input layer index 1 tensor 3
split tensor 3 13 x 13 x 256 -> 13 x 13 x 256
3 out layer 5 tensor 0
4 input layer 1 tensor 2
make_split_layer input layer index 1 tensor 2
split tensor 2 13 x 13 x 255 -> 13 x 13 x 255
5 post_offset: Using default '0.000000'
outputs 43095 num_out 43264
6 yolo
7 input layer 1 tensor 1
make_split_layer input layer index 1 tensor 1
split tensor 1 26 x 26 x 512 -> 26 x 26 x 512
8 odla tensor 0 26 x 26 x 512 -> 26 x 26 x 255
odla tensor 1 26 x 26 x 512 -> 26 x 26 x 128
9 input layer 8 tensor 0
make_split_layer input layer index 8 tensor 0
split tensor 0 26 x 26 x 255 -> 26 x 26 x 255
10 post_offset: Using default '0.000000'
outputs 172380 num_out 173056
11 yolo
12 input layer 8 tensor 1
make_split_layer input layer index 8 tensor 1
split tensor 1 26 x 26 x 128 -> 26 x 26 x 128
13 out layer 2 tensor 0
14 input layer 1 tensor 0
make_split_layer input layer index 1 tensor 0
split tensor 0 52 x 52 x 256 -> 52 x 52 x 256
15 odla tensor 0 52 x 52 x 256 -> 52 x 52 x 255
16 input layer 15 tensor 0
make_split_layer input layer index 15 tensor 0
split tensor 0 52 x 52 x 255 -> 52 x 52 x 255
17 post_offset: Using default '0.000000'
outputs 689520 num_out 692224
18 yolo
Loading weights from yolov3-odla.cfg...Done!
#### input image size c=4 h=416 w=416
[ 10.316000] Task execution failed
NvDlaSubmit: Error IOCTL failed (Cannot allocate memory)
(DLA_RUNTIME) Error 0x0003000f: (propagating from Runtime.cpp, function submitInternal(), line 669)
NVDLA time: 0.000231 seconds
[ 10.320000] Task execution failed
NvDlaSubmit: Error IOCTL failed (Cannot allocate memory)
(DLA_RUNTIME) Error 0x0003000f: (propagating from Runtime.cpp, function submitInternal(), line 669)
NVDLA time: 0.000097 seconds
[ 10.328000] Task execution failed
NvDlaSubmit: Error IOCTL failed (Cannot allocate memory)
(DLA_RUNTIME) Error 0x0003000f: (propagating from Runtime.cpp, function submitInternal(), line 669)
NVDLA time: 0.000097 seconds
data/person.jpg: Predicted in 0.058764 seconds.
# echo $?
0
```
You can see that there are several errors from the NVDLA driver, but they aren't caught by the darknet-nvdla software. As a result, an unrealistically fast inference time is reported, and we don't get the detections for the horse, dog, and person. I wasn't sure if I should file this issue at the darknet-nvdla repo, but it looks like issues are disabled there.
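For what it's worth, here is a minimal sketch of how the failure could be surfaced to the shell, so that `echo $?` (and therefore solo.sh or any automated harness) would see a non-zero status instead of 0. The `run_odla_network()` helper and its 0-on-success convention are assumptions for illustration only, not the fork's actual structure:

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for whatever darknet-nvdla calls to run the NVDLA
// portion of the network; assumed to return 0 on success and non-zero if any
// DLA task submission fails.
static int run_odla_network() {
    // ... drive the NVDLA runtime here and return its status instead of ignoring it ...
    return 0;
}

int main() {
    if (run_odla_network() != 0) {
        std::fprintf(stderr, "NVDLA inference failed; not reporting timings or detections\n");
        return EXIT_FAILURE;  // solo.sh / `echo $?` would then see a non-zero status
    }
    std::printf("NVDLA inference succeeded\n");
    return EXIT_SUCCESS;
}
```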
To be clear, I didn't make any modifications to the FireSim NVDLA hardware; I only added files to the workload overlay that's built into the Linux image.
I'm not entirely sure if this is the source of the error, but it looks like the functions in src/odla_layer_impl.cpp (added in this fork) don't do any error checking.
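As a rough illustration of the kind of check I mean, the layer forward path could verify the submit status before timing the run and reading back outputs. The `submit_dla_task()` wrapper below, its name, and its integer-status convention are all assumptions about the fork's internals, not its real code:

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical wrapper around the UMD submit path (the call that logs
// "NvDlaSubmit: Error IOCTL failed" above). Its name and return convention
// are assumptions, not taken from src/odla_layer_impl.cpp.
static int submit_dla_task() {
    // ... call into the NVDLA runtime and return its error code ...
    return 0;
}

// Sketch of a layer forward pass that checks the submit status instead of
// unconditionally timing the run and reading back outputs.
void forward_odla_layer_checked() {
    int err = submit_dla_task();
    if (err != 0) {
        std::fprintf(stderr, "odla layer: DLA task submission failed (err=%d)\n", err);
        std::exit(EXIT_FAILURE);  // fail loudly rather than report bogus timings/detections
    }
    // ... only measure time and copy outputs back after a successful submit ...
}
```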