[Bug]: memory leak due to exception #3161

Open
SarielMa opened this issue Nov 19, 2024 · 3 comments
Labels
2025-review bug Something isn't working

Comments

@SarielMa

What happened?

When the following exception is triggered, the memory is not released after the query ends. After many such queries, memory is exhausted.

if (result.size() != k) throw std::runtime_error( "Cannot return the results in a contigious 2D array. Probably ef or M is too small");

Versions

Chroma v0.5.13

Relevant log output

No response

@SarielMa SarielMa added the bug Something isn't working label Nov 19, 2024
@tazarov
Contributor

tazarov commented Nov 20, 2024

@SarielMa, thanks for reporting this. I need to dig a bit more to determine whether the leak (if there is one) happens in the pybind11 layer or in the Python code.

@itaismith
Contributor

Hey @SarielMa! Can you share more information on your setup? Are you running this on a Windows machine by chance?

@tazarov
Contributor

tazarov commented Nov 26, 2024

Hey @SarielMa, have you observed how long it takes for the memory to be consumed, e.g. in time or number of queries?

I can demonstrate that the hnsw lib does have a memory leak when an exception is thrown. Here's the call stack sequence:

Here's some Python code that reproduces the exact error you encountered. Once it finds an embedding batch that triggers the error, it repeats the failing query in a loop and prints some memory stats. Note that it runs for a very long time (adjust the x loop):

import gc
import uuid
import tracemalloc

import chromadb
import numpy as np
import psutil

tracemalloc.start()
np.random.seed(42)
process = psutil.Process()

data = np.random.uniform(-1, 1, (1000, 500, 384))

client = chromadb.PersistentClient("contiguous2d")
# client = chromadb.HttpClient()
collection = client.get_or_create_collection("test_collection")
for i in range(data.shape[0]):
    try:
        print("Iteration: ", str(i))
        gc.collect()
        ids = [str(uuid.uuid4()) for _ in range(data[i].shape[0])]
        collection.add(ids=ids, embeddings=data[i])
        # Query with every embedding in the batch (500 query rows)
        collection.query(query_embeddings=data[i], n_results=10)
        collection.delete(ids=ids)
        gc.collect()
    except Exception as e:
        print(e)
        snapshot1 = tracemalloc.take_snapshot()
        memory_info = process.memory_info()

        # Print the memory usage
        print(f"RSS: {memory_info.rss / (1024 ** 2):.2f} MB")
        print(f"VMS: {memory_info.vms / (1024 ** 2):.2f} MB")
        for x in range(100000):
            try:
                # print("leak check ",x)
                collection.query(query_embeddings=data[i], n_results=500)
            except Exception:
                # The failure is expected here; keep repeating the failing query
                continue
        memory_info = process.memory_info()

        # Print the memory usage
        print(f"RSS: {memory_info.rss / (1024 ** 2):.2f} MB")
        print(f"VMS: {memory_info.vms / (1024 ** 2):.2f} MB")
        snapshot2 = tracemalloc.take_snapshot()
        stats = snapshot2.compare_to(snapshot1, 'lineno')
        for stat in stats[:20]:  # Top 20 changes
            print(stat)
        raise e

RSS grows over time.
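
The script prints two kinds of numbers: process RSS and tracemalloc statistics. A rough way to read them together is sketched below. It relies on the fact that tracemalloc only sees allocations made through Python's allocator, so memory allocated directly in C++ does not show up in its snapshots. The helper name `native_growth_bytes` and its parameters are hypothetical and used only for illustration.

```python
def native_growth_bytes(rss_before, rss_after, traced_before, traced_after):
    """Estimate memory growth that Python-level tracing cannot account for.

    All arguments are byte counts. rss_* would come from
    process.memory_info().rss, traced_* from
    tracemalloc.get_traced_memory()[0] (current traced size).
    """
    rss_delta = rss_after - rss_before
    traced_delta = traced_after - traced_before
    # Growth seen in RSS but not in tracemalloc points at native code.
    # Clamp at zero because RSS is noisy and can lag behind allocations.
    return max(rss_delta - traced_delta, 0)


# Made-up numbers: RSS grew by 300 bytes, traced Python memory by 50.
print(native_growth_bytes(100, 400, 10, 60))
```

A large value that keeps increasing with the number of failing queries would suggest the leak sits in the native layer rather than in Python objects. This is only a heuristic, since RSS is also affected by allocator caching and page-level granularity.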

I've also written some sample C++ code that reproduces the conditions: https://gist.github.com/tazarov/71fe6f2e8d5947dde998e83ee9a57d0a

Overall, yes, there seems to be a leak when the exception is raised, but it is quite small (120 bytes with defaults) in 99.99% of cases. Two factors affect the size of the leak:

  • a very large n_results (k in the cpp binding)
  • many queries in query_embeddings or query_texts (rows in the cpp binding)

The most common case is a single query embedding with the default n_results=10. That leaks 10 * 8 = 80 bytes for labeltype (size_t, on 64-bit) plus 10 * 4 = 40 bytes for dist_t (float), for a total of 120 bytes per exception per query.
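
The arithmetic above can be generalized to other values of rows and k. The sketch below assumes the leaked amount scales as rows * k for both buffers, which is how the two factors listed above are read here. The function name and constants are hypothetical and not part of Chroma or hnswlib.

```python
LABEL_BYTES = 8  # labeltype is size_t: 8 bytes on a 64-bit platform
DIST_BYTES = 4   # dist_t is float: 4 bytes


def leaked_bytes_per_exception(rows=1, k=10):
    """Estimate bytes leaked by one failing query.

    rows: number of query embeddings in the call
    k:    n_results requested
    Assumes one label and one distance slot per (row, result) pair.
    """
    return rows * k * (LABEL_BYTES + DIST_BYTES)


# Default case from the comment: 1 query embedding, n_results=10
print(leaked_bytes_per_exception())          # 120
# Leak-check loop in the repro script: 500 query embeddings, n_results=500
print(leaked_bytes_per_exception(500, 500))  # 3000000
```

Under that assumption, each failing query in the repro script's inner loop would account for about 3 MB, which is why the growth is easy to observe there while staying negligible with defaults.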
