Lucid Camera Client uses too much memory #173

Open
atar13 opened this issue May 3, 2024 · 0 comments
Labels
bug Something isn't working

atar13 commented May 3, 2024

🐛 Bug

Currently, we save all our images to a queue. Each additional image appears to add roughly 100MB of memory usage. LUCID says the image buffer size is 20MB, but in our observations memory usage grows much faster than that.

Even if each image only takes 20MB, it would only take a few hundred images to use up all of the Jetson's memory and crash it. Assuming 4GB of the 8GB is free (since other things run alongside obcpp on the host), 200 images at 20MB each would fill that memory. That is not outside the realm of possibility during a flight.

To Reproduce

Steps to reproduce the behavior:

  1. SSH into the Jetson
  2. Check out the feat/lucid-camera branch
  3. Set the image-taking delay to 0s in the camera_lucid.cpp test to quickly take a bunch of images
  4. Compile the camera_lucid test and run it
  5. Watch memory consumption skyrocket

Potential Solutions

These aren't mutually exclusive, so we can do a combination of them.

Option 1: Save images to disk instead of memory

Instead of saving the images to a queue, write them to a file (PNG or JPEG; both should be smaller than our in-memory representation). See the sketch after the pros/cons list below.

Pros:

  • We have a lot more disk space than memory
  • We can keep taking images without worrying about running out of memory

Cons:

  • Disk I/O is slow, and copying data to disk wastes CPU cycles
  • We have to load each image back into memory when running the pipeline
  • We might still run out of disk space. Right now the SD card is only 64 GB, and a lot of that space is already taken up by Docker images, so we're running low. It's not unreasonable to run out of space after leaving the camera running for a while. We could mitigate this by putting the 512GB SD card back in.
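
A minimal sketch of this option, assuming the camera already hands us a cv::Mat (the actual LUCID/Arena SDK buffer handling in obcpp may differ); the helper name and directory layout here are made up for illustration:

```cpp
// Sketch of Option 1 (hypothetical helper, not from obcpp): write each frame
// to disk and only keep the path in memory instead of the pixel data.
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>

#include <filesystem>
#include <string>

std::filesystem::path saveFrameToDisk(const cv::Mat& frame,
                                      const std::filesystem::path& dir,
                                      std::size_t frameIndex) {
    std::filesystem::create_directories(dir);
    const auto out = dir / (std::to_string(frameIndex) + ".jpg");
    // JPEG (or PNG for lossless) is much smaller than the raw WxHxC buffer.
    cv::imwrite(out.string(), frame);
    return out;
}
```

The queue would then hold std::filesystem::path entries (a few bytes each), and the pipeline loads frames back with cv::imread when it needs them.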

Option 2: Run the pipeline sooner and free image data

Instead of waiting until all the image taking is done, we can take a few images and run the pipeline sooner rather than later, so we can free the image buffers once the pipeline (or at least the saliency stage) completes.

The implementation of this depends on how long the pipeline takes to run (which is something we have to benchmark).

  • If the pipeline is quick, we can take a picture, synchronously run the pipeline, free the image, and move on to taking the next one. This only works if the pipeline takes a second or two to run.
  • If the pipeline takes a while, we don't want to block image taking, since we could miss images of targets while flying over the search area. So we should have two threads running simultaneously: one taking images and another running the pipeline. This works as a producer-consumer model where the pipeline consumes the images that the camera produces. The thing to be careful of here is making sure the image producer doesn't use too much memory, since it will produce images faster than the pipeline can consume them; a bounded queue (sketched below) handles this. Also keep in mind the additional memory used by the pipeline itself (PyTorch models and other stuff).
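
A minimal sketch of the bounded producer-consumer queue for the two-thread case, assuming frames are cv::Mat; class and member names are illustrative, not from obcpp. Bounding the queue is what keeps the camera thread from outrunning the pipeline and filling memory:

```cpp
// Sketch of a bounded producer-consumer queue (Option 2).
#include <condition_variable>
#include <deque>
#include <mutex>

#include <opencv2/core.hpp>

class BoundedFrameQueue {
public:
    explicit BoundedFrameQueue(std::size_t capacity) : capacity_(capacity) {}

    // Camera thread: blocks when the queue is full, so memory stays bounded.
    void push(cv::Mat frame) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return frames_.size() < capacity_; });
        frames_.push_back(std::move(frame));
        notEmpty_.notify_one();
    }

    // Pipeline thread: blocks until a frame is available.
    cv::Mat pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !frames_.empty(); });
        cv::Mat frame = std::move(frames_.front());
        frames_.pop_front();
        notFull_.notify_one();
        return frame;
    }

private:
    std::size_t capacity_;
    std::deque<cv::Mat> frames_;
    std::mutex mutex_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
};
```

A blocking push never drops frames but can stall the camera when the pipeline falls behind; if we would rather keep shooting, the push could instead evict the oldest frame when the queue is full.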

Option 3: Optimize the existing code

There might be a way to optimize how we're storing data in the code. Maybe we can use lossless compression instead of storing huge WxHxC arrays. It's also possible that each image is being copied or stored multiple times somewhere (which would explain 100MB per image instead of 20MB). To make sure there are no memory leaks, we can check with Valgrind, which reportedly works on the Jetson inside Docker.
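
A minimal sketch of the compression idea, assuming frames are cv::Mat; actual savings depend on image content, and this trades CPU time for memory on every encode/decode:

```cpp
// Sketch of Option 3: keep PNG-encoded bytes in the queue instead of the raw
// WxHxC array, and decode only when the pipeline needs the frame.
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>

#include <vector>

std::vector<uchar> compressFrame(const cv::Mat& frame) {
    std::vector<uchar> buf;
    cv::imencode(".png", frame, buf);  // lossless; ".jpg" is smaller but lossy
    return buf;
}

cv::Mat decompressFrame(const std::vector<uchar>& buf) {
    return cv::imdecode(buf, cv::IMREAD_UNCHANGED);
}
```

For the leak check, running the camera_lucid test binary under valgrind --leak-check=full is the usual starting point.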
