NVDLA driver error undetected in solo.sh #17

Open
tymcauley opened this issue Feb 17, 2020 · 1 comment

tymcauley commented Feb 17, 2020

Hello! First, I'd like to say thank you for putting together a well-documented project that is straightforward to reproduce. I have run the basic example (YOLOv3 inference using solo.sh) several times without issue.

I was trying to run some other networks on the NVDLA using other software (the nvdla_runtime binary from NVIDIA's nvdla/sw repo), and ran into some issues that I'm still trying to debug. After getting those errors, I wanted to check if the NVDLA was still in a known-good state, so I ran the darknet-nvdla/solo.sh workload, and got this result:

# ./solo.sh
[    9.508000] random: crng init done
learning_rate: Using default '0.001000'
momentum: Using default '0.900000'
decay: Using default '0.000100'
policy: Using default 'constant'
max_batches: Using default '0'
layer     filters    size              input                output
    0 offset: Using default '0.000000'
shifter: Using default '0'
post_offset: Using default '0.000000'
post_scale: Using default '1.000000'
outputs 692224 num_out 5537792
    1 odla          tensor 0  416 x 416 x   4   ->    52 x  52 x 256
odla          tensor 1  416 x 416 x   4   ->    26 x  26 x 512
odla          tensor 2  416 x 416 x   4   ->    13 x  13 x 255
odla          tensor 3  416 x 416 x   4   ->    13 x  13 x 256
    2 input layer 1 tensor 3
make_split_layer input layer index 1 tensor 3
split          tensor 3   13 x  13 x 256   ->    13 x  13 x 256
    3 out layer 5 tensor 0
    4 input layer 1 tensor 2
make_split_layer input layer index 1 tensor 2
split          tensor 2   13 x  13 x 255   ->    13 x  13 x 255
    5 post_offset: Using default '0.000000'
outputs 43095 num_out 43264
    6 yolo
    7 input layer 1 tensor 1
make_split_layer input layer index 1 tensor 1
split          tensor 1   26 x  26 x 512   ->    26 x  26 x 512
    8 odla          tensor 0   26 x  26 x 512   ->    26 x  26 x 255
odla          tensor 1   26 x  26 x 512   ->    26 x  26 x 128
    9 input layer 8 tensor 0
make_split_layer input layer index 8 tensor 0
split          tensor 0   26 x  26 x 255   ->    26 x  26 x 255
   10 post_offset: Using default '0.000000'
outputs 172380 num_out 173056
   11 yolo
   12 input layer 8 tensor 1
make_split_layer input layer index 8 tensor 1
split          tensor 1   26 x  26 x 128   ->    26 x  26 x 128
   13 out layer 2 tensor 0
   14 input layer 1 tensor 0
make_split_layer input layer index 1 tensor 0
split          tensor 0   52 x  52 x 256   ->    52 x  52 x 256
   15 odla          tensor 0   52 x  52 x 256   ->    52 x  52 x 255
   16 input layer 15 tensor 0
make_split_layer input layer index 15 tensor 0
split          tensor 0   52 x  52 x 255   ->    52 x  52 x 255
   17 post_offset: Using default '0.000000'
outputs 689520 num_out 692224
   18 yolo
Loading weights from yolov3-odla.cfg...Done!
#### input image size c=4 h=416 w=416
[   10.316000] Task execution failed
NvDlaSubmit: Error IOCTL failed (Cannot allocate memory)
(DLA_RUNTIME) Error 0x0003000f: (propagating from Runtime.cpp, function submitInternal(), line 669)
NVDLA time: 0.000231 seconds
[   10.320000] Task execution failed
NvDlaSubmit: Error IOCTL failed (Cannot allocate memory)
(DLA_RUNTIME) Error 0x0003000f: (propagating from Runtime.cpp, function submitInternal(), line 669)
NVDLA time: 0.000097 seconds
[   10.328000] Task execution failed
NvDlaSubmit: Error IOCTL failed (Cannot allocate memory)
(DLA_RUNTIME) Error 0x0003000f: (propagating from Runtime.cpp, function submitInternal(), line 669)
NVDLA time: 0.000097 seconds
data/person.jpg: Predicted in 0.058764 seconds.
# echo $?
0

You can see that the NVDLA driver reports several errors, but the darknet-nvdla software doesn't catch them. As a result, an unrealistically fast inference time is reported, solo.sh still exits with status 0, and we don't get the detections for the horse, dog and person. I wasn't sure if I should file this issue at the darknet-nvdla repo, but it looks like issues are disabled there.
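Until the errors are caught at the source, one stopgap would be to scan the captured console output for the error strings shown above and derive an exit status from that. The following is only a rough sketch under that assumption: the function names `count_nvdla_error_lines` and `exit_status_for_log` are hypothetical and not part of darknet-nvdla, and the marker strings are simply copied from the log in this report, so other failure modes may print something different.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>

// Error markers copied verbatim from the console output in this report.
// Assumption: other NVDLA failures may use different messages.
const std::array<const char*, 3> kNvdlaErrorMarkers = {
    "Task execution failed",
    "NvDlaSubmit: Error",
    "(DLA_RUNTIME) Error",
};

// Hypothetical helper: counts the lines of `log` that contain at least
// one of the known error markers. Each line is counted at most once.
std::size_t count_nvdla_error_lines(const std::string& log) {
    std::istringstream in(log);
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line)) {
        for (const char* marker : kNvdlaErrorMarkers) {
            if (line.find(marker) != std::string::npos) {
                ++count;
                break;  // move on to the next line
            }
        }
    }
    return count;
}

// Hypothetical helper: maps the scan result to a process exit status,
// 0 if no error marker was found and 1 otherwise.
int exit_status_for_log(const std::string& log) {
    return count_nvdla_error_lines(log) == 0 ? 0 : 1;
}
```

A wrapper around solo.sh could feed the captured output to `exit_status_for_log` and return the result, so that a run like the one above no longer reports success. This only detects the failure after the fact; it does not replace error checking in the code that submits the work.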

To be clear, I didn't make any modifications to the FireSim NVDLA hardware, I only added files to the workload overlay that's built into the Linux image.

I'm not entirely sure if this is the source of the error, but it looks like the functions in src/odla_layer_impl.cpp (added in this fork) don't do any error checking.
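To illustrate the kind of check I have in mind, here is a minimal sketch. It is not the actual code from src/odla_layer_impl.cpp: `SubmitFn` is a hypothetical stand-in for whatever call submits a task to the NVDLA runtime, and I'm assuming that call can report success or failure as a `bool`. The names `run_checked` and `run_all` are likewise made up for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for one NVDLA task submission.
// Assumption: the real call can report failure as a bool (false = failed).
using SubmitFn = std::function<bool()>;

// Runs one submission and reports a failure to the caller
// instead of silently continuing.
bool run_checked(const SubmitFn& submit, const std::string& what) {
    if (!submit()) {
        std::cerr << "error: " << what << " failed; aborting inference\n";
        return false;
    }
    return true;
}

// Runs all submissions in order and stops at the first failure.
// Returns EXIT_SUCCESS only if every submission succeeded, so the
// value can be used directly as the process exit status.
int run_all(const std::vector<SubmitFn>& submits) {
    for (std::size_t i = 0; i < submits.size(); ++i) {
        if (!run_checked(submits[i], "NVDLA task " + std::to_string(i))) {
            return EXIT_FAILURE;
        }
    }
    return EXIT_SUCCESS;
}
```

With a check along these lines, the first failed submission would stop the run and produce a non-zero exit status, rather than printing a prediction time for an inference that never executed.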

tymcauley changed the title from "NVDLA error undetected in solo.sh" to "NVDLA driver error undetected in solo.sh" on Feb 17, 2020
@ku-researcher

@tymcauley
I'm sorry to bother you.
I encountered the same error and am having trouble debugging it.
Were you able to fix it?
