
Memory leak with large pointclouds #140

Closed

AlexDayCRL opened this issue Mar 26, 2024 · 3 comments

Comments

@AlexDayCRL
Contributor

Hi,

I'm trying to test Zenoh's ability to handle large pointclouds, and I seem to be hitting a memory leak with both synthetic data and livestreamed data from one of our sensors. If the bandwidth exceeds roughly 16 MB/s, the memory used by the publishing node slowly creeps up until the node is killed by the OOM Killer. This is on Ubuntu Noble with ROS Rolling built from source and rmw_zenoh on the yadu/events branch rebased onto rolling. I have verified that memory usage is stable over the same time period when using rmw_cyclonedds_cpp.

Here is the code I'm using for the synthetic data: https://github.com/AlexDayCRL/ros2-message-benchmark/blob/master/src/talker.cpp

This is the command I'm using to run the node: ./talker 100000. Bumping the argument up to ./talker 1000000 makes the OOM Killer kill the node sooner.
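For reference, a minimal sketch of a comparable synthetic publisher is shown below. This is not the linked talker.cpp; the node and topic names, the 10 Hz rate, and the 16-byte point layout are assumptions, not the exact benchmark code.

```cpp
// Minimal sketch of a synthetic PointCloud2 publisher (assumed layout/rate,
// not the exact benchmark code). The first CLI argument sets the point count.
#include <cstdlib>

#include "rclcpp/rclcpp.hpp"
#include "sensor_msgs/msg/point_cloud2.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  const size_t num_points =
    argc > 1 ? std::strtoull(argv[1], nullptr, 10) : 100000;

  auto node = rclcpp::Node::make_shared("talker");
  auto pub = node->create_publisher<sensor_msgs::msg::PointCloud2>("points", 10);

  // One message reused every cycle; 16 bytes per point (x, y, z, intensity).
  sensor_msgs::msg::PointCloud2 msg;
  msg.height = 1;
  msg.width = num_points;
  msg.point_step = 16;
  msg.row_step = msg.point_step * msg.width;
  msg.data.resize(msg.row_step, 0);

  // At 10 Hz and 1'000'000 points this is ~160 MB/s of payload per topic.
  rclcpp::WallRate rate(10.0);
  while (rclcpp::ok()) {
    msg.header.stamp = node->get_clock()->now();
    pub->publish(msg);
    rate.sleep();
  }

  rclcpp::shutdown();
  return 0;
}
```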

@AlexDayCRL
Contributor Author

It seems like the RMW is keeping a copy of every message that gets published. Running ./talker 500 produces a cumulative bandwidth of about 80 KB/s * 3, or roughly 240 KB/s. Logging the program's total memory once per second with while true; do cat /proc/$pid/smaps | grep -i pss | awk '{Total+=$2} END {print systime() " " Total/1024/1024" GB"}' >> $pid.mem_usage && sleep 1; done shows the usage growing with a slope of about 220 KB/s, which closely tracks the publish bandwidth.

@AlexDayCRL
Contributor Author

After instrumenting the zenoh rmw_publish function, it seems that the free_msg_bytes scope-exit lambda is not actually deallocating the memory.
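For anyone unfamiliar with the pattern, a scope-exit guard of this kind can silently leak if, for example, the pointer it checks is captured by value before the buffer is allocated: the guard then sees the original null pointer and never frees anything. The sketch below is only a hypothetical illustration with stand-in names, not the rmw_zenoh source; the actual root cause is whatever #142 fixes.

```cpp
// Hypothetical illustration of a scope-exit cleanup lambda that leaks:
// capturing the guarded pointer by value snapshots its pre-allocation
// (null) value, so the cleanup becomes a no-op.
#include <cstdio>
#include <cstdlib>
#include <utility>

// Minimal stand-in for a scope-exit helper such as rcpputils::make_scope_exit.
template<typename F>
struct ScopeExit
{
  F fn;
  ~ScopeExit() {fn();}
};
template<typename F>
ScopeExit<F> make_scope_exit(F && fn) {return ScopeExit<F>{std::forward<F>(fn)};}

int main()
{
  char * msg_bytes = nullptr;

  // BUG: [msg_bytes] captures the current (null) value, so the guard
  // never frees the buffer allocated below.
  auto free_msg_bytes = make_scope_exit(
    [msg_bytes]() {
      if (msg_bytes) {
        std::free(msg_bytes);
      }
    });

  msg_bytes = static_cast<char *>(std::malloc(16 * 1024 * 1024));
  std::printf("serialized into %p; guard will not free it\n",
    static_cast<void *>(msg_bytes));

  // Capturing by reference ([&msg_bytes]) would let the guard see the
  // final pointer value and release the allocation at scope exit.
  return 0;
}
```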

@Yadunund
Member

Hi @AlexDayCRL,

Thanks for investigating this problem and identifying the fix. I've merged #142.
