🐛 Bug
Currently, we save all our images to an in-memory queue. Each additional image appears to consume about 100MB of memory. LUCID says the image buffer size is 20MB, but in our observations memory usage is growing much faster than that.
Even if the images only take 20MB each, it would only take a few hundred images to exhaust the Jetson's memory and crash it. Assuming 4GB of the 8GB is free (since other things are running alongside obcpp on the host), 200 images of 20MB each would fill that memory. That is not outside the realm of possibility during a flight.
To Reproduce
Steps to reproduce the behavior:
1. SSH into the Jetson
2. Check out the feat/lucid-camera branch
3. Set the image-taking delay to 0s in the camera_lucid.cpp test so it quickly takes a bunch of images
4. Compile the camera_lucid test and run it
5. Watch memory consumption skyrocket
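To put numbers on step 5, one way to watch the test's resident memory is to poll VmRSS from /proc (a sketch; the `pgrep` pattern is an assumption and should match the actual test binary name):

```shell
# Sample the test process's resident set size (VmRSS) once per second.
# The "camera_lucid" pattern is an assumption; adjust to the real binary name.
pid=$(pgrep -f camera_lucid | head -n1)
while kill -0 "$pid" 2>/dev/null; do
  grep VmRSS "/proc/$pid/status"
  sleep 1
done
```

If each capture really costs ~100MB, VmRSS should climb by roughly that much per image.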
Potential Solutions
These aren't mutually exclusive, so we can do a combination of them.
Option 1: Save images to disk instead of memory
Instead of saving the images to a queue, write them to a file (PNG or JPEG; both should be smaller than our in-memory representation).
Pros:
- We have a lot more disk space than memory
- We can keep taking images without worrying about running out of memory

Cons:
- Disk I/O is slow, and copying data to disk wastes CPU cycles
- We have to load images back into memory when running the pipeline
- We might still run out of disk space. The current SD card is only 64GB, and much of that is already taken up by Docker images, so we're running low. It's not unreasonable to run out of space after leaving the camera running for a while. We could mitigate this by putting the 512GB SD card back in.
Option 2: Run the pipeline sooner and free image data
Instead of waiting until all the image taking is done, we can take a few images and run the pipeline sooner rather than later, so we can free the image buffers once the pipeline (or at least its saliency stage) completes.
The implementation of this depends on how long the pipeline takes to run (which is something we have to benchmark).
If the pipeline is quick, we can take a picture, synchronously run the pipeline, free the image, and move on to taking the next one. This only works if the pipeline takes a second or two to run.
If the pipeline takes a while, we don't want to block image taking, since we could miss targets while flying over the search area. So we should run two threads simultaneously: one taking images and another running the pipeline. This works as a producer-consumer model where the pipeline consumes the images the camera produces. The one thing to be careful of here is making sure the image producer doesn't use too much memory, since it will produce images faster than the pipeline can consume them. Also keep in mind the additional memory used by the pipeline itself (PyTorch models and other state).
Option 3: Optimize the existing code
There might be a way to optimize how we store data in the code. Maybe we can use lossless compression instead of storing huge WxHxC arrays. It's also possible the images are being copied and stored multiple times somewhere (which would explain 100MB per image instead of 20MB). To rule out memory leaks, we can check with Valgrind, which reportedly works on the Jetson inside Docker.