Memory Leak while getting data from aerospike #520

Open
smartist1401 opened this issue Sep 25, 2023 · 15 comments

@smartist1401

smartist1401 commented Sep 25, 2023

tracemalloc shows that calling get() on the Aerospike client leaks memory.
How I use the client:
I connect to the Aerospike cluster by passing a list of hosts.
Then, in a loop (multiple iterations per second), multiple threads simultaneously read records from multiple sets.

Screenshot 2023-09-25 093254

Screenshot 2023-09-25 093756
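
For readers without the attachment, the usage pattern described above (several threads reading in a tight loop while tracemalloc is active) can be approximated as below. This is only a sketch: `traced_growth`, `leaky_fetch`, and `clean_fetch` are hypothetical names, and the fetch functions are stand-ins that do not talk to Aerospike. Note also that tracemalloc only sees memory requested through Python's allocators, so memory allocated directly by a C library may not show up.

```python
import gc
import tracemalloc
from concurrent.futures import ThreadPoolExecutor


def traced_growth(fetch, keys, iterations=50, workers=4):
    """Return the net bytes of traced allocations retained by the read loop."""
    tracemalloc.start()
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Warm-up pass so one-time setup is not counted as growth.
            list(pool.map(fetch, keys))
            gc.collect()
            before, _peak = tracemalloc.get_traced_memory()
            for _ in range(iterations):
                list(pool.map(fetch, keys))
            gc.collect()
            after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


# Hypothetical stand-ins for a client read; neither touches Aerospike.
_retained = []


def leaky_fetch(key):
    _retained.append(bytes(1024))  # simulates memory that is never released
    return {"key": key}


def clean_fetch(key):
    return {"key": key}


if __name__ == "__main__":
    print("leaky:", traced_growth(leaky_fetch, range(8)))
    print("clean:", traced_growth(clean_fetch, range(8)))
```

With a real client, `fetch` would wrap the actual read call; a growth figure that keeps rising with `iterations` is the signal to look at.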

@juliannguyen4
Collaborator

Hi @smartist1401 I'll investigate this issue today and follow up with you soon.

@juliannguyen4
Collaborator

Would you mind sharing the script so I can debug it?

@smartist1401
Author

smartist1401 commented Sep 27, 2023

test_aerospike_memoryleak.zip

Here is a sample Python script that shows the memory leak at line 44.
Capture
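
A per-line attribution like the one reported above is typically obtained by comparing two tracemalloc snapshots. The sketch below shows one way to do that; `top_growth` and `workload` are hypothetical names, and the workload is a stand-in rather than the attached script. tracemalloc attributes an allocation to the Python line that was executing at the time, so growth inside an extension call is reported at the calling line.

```python
import tracemalloc


def top_growth(workload, limit=3):
    """Run workload() and return (location, size_diff) for the fastest-growing lines."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() sorts by the absolute size difference, largest first.
    stats = after.compare_to(before, "lineno")
    report = []
    for stat in stats[:limit]:
        frame = stat.traceback[0]
        report.append((f"{frame.filename}:{frame.lineno}", stat.size_diff))
    return report


kept = []


def workload():
    # Stand-in for the read loop: retains about 2 MB on a single line.
    for _ in range(2000):
        kept.append(bytes(1024))


if __name__ == "__main__":
    for location, diff in top_growth(workload):
        print(location, diff)
```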

@smartist1401
Author

smartist1401 commented Sep 30, 2023

Hi @juliannguyen4
Did you find anything?
Could you please look into this soon? I am using your Aerospike client in a serious project.

@juliannguyen4
Collaborator

juliannguyen4 commented Oct 2, 2023

Hey @smartist1401, I have found a few leaks from get() caused by several different sources. I'll try to fix them and will give you a status update by tomorrow at 5 PM PST.

@juliannguyen4
Collaborator

My notes about the get() leaks:

There are two sources of memory leaks in get(): raise_exception() and record_to_pyobject().
I investigated the key_to_pyobject() leaks coming from record_to_pyobject() and haven’t found the cause yet. The leaks come from the namespace, set, and digest of the key tuple.

gdb -args python3 -m pytest new_tests/
# Using Python client 13.0.0
b src/main/conversions.c:1840
cond 2 py_namespace->ob_refcnt > 1 || py_set->ob_refcnt > 1 || py_digest->ob_refcnt > 1
# Test succeeds without breaking

I also added a breakpoint to check the reference counts of the record tuple’s objects as well as the reference count of the record tuple itself. None of them exceeded 1 when being returned by record_to_pyobject() and AerospikeClient_GetInvoke() respectively.

Using sys.getrefcount() doesn’t show any memory leaks either:

>>> import aerospike
>>> config = {"hosts": [("127.0.0.1", 3000)]}
>>> client = aerospike.client(config).connect()
>>> key = ("test", "demo", 1)
>>> client.put(key, {"a": 1})
0
>>> rec = client.get(key)
>>> import sys
>>> sys.getrefcount(rec)
2
>>> sys.getrefcount(rec[2])
2
>>> sys.getrefcount(rec[1])
2
>>> sys.getrefcount(rec[0])
2
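
As a side note on reading these numbers: a count of 2 is the usual baseline on most CPython versions, since one reference is the variable and one is the temporary argument to sys.getrefcount() itself. Because the absolute value can differ between interpreter versions, comparing counts before and after an operation is more robust. The sketch below uses a hypothetical helper, `refcount_delta`, and a plain dict instead of a real record. It can only detect extra references to the object it is handed, so objects that leak without being reachable from the returned record would not show up this way.

```python
import sys


def refcount_delta(obj, operation):
    """Return how many references to obj remain added after operation(obj)."""
    before = sys.getrefcount(obj)
    result = operation(obj)  # keep the result alive while counting
    after = sys.getrefcount(obj)
    del result
    return after - before


stash = []
record = {"a": 1}  # stand-in object built at runtime, not a real record

if __name__ == "__main__":
    print(refcount_delta(record, len))           # no reference retained
    print(refcount_delta(record, stash.append))  # stash now holds a reference
```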

@juliannguyen4
Collaborator

juliannguyen4 commented Oct 2, 2023

@smartist1401 When you ran that script and saw the memory leaks reported by tracemalloc, did the records that were being queried exist on the server?

@juliannguyen4
Collaborator

Hey @smartist1401, I worked on fixing the memory leak today but wasn't able to fully solve it. I'll let you know once I have found a solution.

@smartist1401
Author

smartist1401 commented Oct 4, 2023

> @smartist1401 When you ran that script and saw the memory leaks reported by tracemalloc, did the records that were being queried exist on the server?

Hi @juliannguyen4
Thanks for the investigation.
I get a record by key in a try block and return the record if it exists;
in the except block (record-not-found exception) I return an empty dict.

Some records exist and some do not.
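
The pattern described above might look roughly like the following. This is a sketch under assumptions, not the attached script: `get_bins_or_empty`, `StubClient`, and `MissingRecord` are hypothetical names used so the example runs without a server, and it assumes "the record" means the bins dict of the (key, meta, bins) tuple. With the real client, the exception class to pass would be `aerospike.exception.RecordNotFound`.

```python
class MissingRecord(Exception):
    """Hypothetical stand-in for the client's record-not-found exception."""


class StubClient:
    """Hypothetical in-memory client that mimics get() returning (key, meta, bins)."""

    def __init__(self, records):
        self._records = records

    def get(self, key):
        if key not in self._records:
            raise MissingRecord(key)
        return key, {"gen": 1, "ttl": 0}, self._records[key]


def get_bins_or_empty(client, key, not_found_exc):
    """Return the record's bins, or an empty dict if the record does not exist."""
    try:
        _key, _meta, bins = client.get(key)
    except not_found_exc:
        return {}
    return bins


if __name__ == "__main__":
    client = StubClient({("test", "demo", 1): {"a": 1}})
    print(get_bins_or_empty(client, ("test", "demo", 1), MissingRecord))
    print(get_bins_or_empty(client, ("test", "demo", 2), MissingRecord))
```

Since some keys hit the except branch on every iteration, this pattern exercises the exception path of get() as often as the success path.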

@juliannguyen4
Collaborator

juliannguyen4 commented Oct 5, 2023

Got it. As I mentioned in my comment above, there's a memory leak from raise_exception(), which should be called when calling get() on a record that does not exist. I'll try to fix this memory leak and provide you with a build with the fix.

@smartist1401
Author

Hi @juliannguyen4
I'm waiting ... :)

@juliannguyen4
Collaborator

I'm still getting to the bottom of it.

@smartist1401
Author

Hi @juliannguyen4
Version 14.0.0 was released recently. Has the memory leak been fixed in this version?

@smartist1401
Author

smartist1401 commented Jun 8, 2024

Hi @juliannguyen4
Could you do something to resolve the memory leak issue? It is causing problems in our system, and I would be grateful if you could fix it as soon as possible.
I have also tested version 15.0.0, but the memory leak is still there.

@juliannguyen4
Collaborator

I'm looking into it again now. I ran your script using Python client 11.0.1 and the memory leaks are still there. So this doesn't seem related to my memory leak fixes in 11.1.0.
