Milvus Lite on Unix, lock file and refuse to connect #195

Open
christy opened this issue Jul 25, 2024 · 10 comments

christy commented Jul 25, 2024

Trying to use Milvus Lite on Unix, a user repeatedly gets a lock file and an error message saying they cannot connect to Milvus Lite.

Code used:
https://milvus.io/docs/quickstart.md

Error message:
E20240722 17:51:48.802269 71268 collection_meta.cpp:94] [SERVER][CreateTable][milvus] Create table failed, errs: database is locked
E20240722 17:51:48.802600 71268 server.cpp:73] [SERVER][main][milvus] Init milvus failed
Failed to create new connection using: 6aa54c5d05b945488b9f5fc0bc5500c1

File /anaconda/envs/test/lib/python3.11/site-packages/pymilvus/client/grpc_handler.py:150, in GrpcHandler._wait_for_channel_ready(self, timeout)
148 self._setup_identifier_interceptor(self._user, timeout=timeout)
149 except grpc.FutureTimeoutError as e:
--> 150 raise MilvusException(
151 code=Status.CONNECT_FAILED,
152 message=f"Fail connecting to server on {self._address}, illegal connection params or server unavailable",
153 ) from e
154 except Exception as e:
155 raise e from e

MilvusException: <MilvusException: (code=2, message=Fail connecting to server on unix:/tmp/tmp4da_2ec6_test.db.sock, illegal connection params or server unavailable)>
:cry:

@christy christy changed the title Milvus Lite on Linux, lock file and refuse to connect Milvus Lite on Unix, lock file and refuse to connect Jul 25, 2024

christy commented Jul 25, 2024

(image attached)

@junjiejiangjjj
Collaborator

This is an error message from sqlite3. There may be a limitation or problem with the underlying file system that is causing the database to be locked. Try running it in a different directory.
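For reference, "database is locked" is the message SQLite reports when one connection tries to start a write while another connection still holds the write lock. The sketch below reproduces that message using only Python's standard `sqlite3` module. It is not Milvus Lite's own code, and the function name and temporary file are illustrative assumptions.

```python
import os
import sqlite3
import tempfile


def demo_database_is_locked() -> str:
    """Return the error text raised when a second writer meets a held write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # throwaway file, assumed for the demo

        # isolation_level=None puts the connection in autocommit mode, so
        # transactions start only where we say BEGIN explicitly.
        holder = sqlite3.connect(path, isolation_level=None)
        holder.execute("CREATE TABLE t (x INTEGER)")
        holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep holding it

        # timeout=0 makes the second connection fail at once instead of waiting.
        other = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            other.execute("BEGIN IMMEDIATE")  # second writer on the same file
            return "no error"
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            other.close()
            holder.execute("ROLLBACK")
            holder.close()


print(demo_database_is_locked())
```

If the same message appears with only one writer, the file system (for example some network or shared mounts) may not implement file locking in the way SQLite expects, which is why trying a different directory is a reasonable first check.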

@arielfaur

I'm having the same issue on macOS; the lock is not removed.

@alibabadoufu

I am having the same issue!

@leonardozilli

Same problem here on Fedora 40.

@xiaofan-luan

@junjiejiangjjj is there anything we can do here?

@junjiejiangjjj
Collaborator

I have run the script thousands of times on both Mac and Ubuntu, and I have not been able to reproduce this issue.

leonardozilli commented Nov 21, 2024

More context: I'm working with Jupyter notebooks in VS Code.
Platform: Fedora 40
Python version: 3.12.6
pymilvus version: 2.4.9

Populate the database

from pymilvus import MilvusClient, connections, utility, FieldSchema, CollectionSchema, DataType, Collection
from pymilvus.model.hybrid import BGEM3EmbeddingFunction


MILVUS_URL = "../vec_db/definitions_vectors.db"

client = MilvusClient(
    uri=MILVUS_URL
)

connections.connect(uri=MILVUS_URL)

fields = [
    FieldSchema(name="id", dtype=DataType.INT64,
                is_primary=True, auto_id=True, max_length=100),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=5000),
    FieldSchema(name="dense_vector", dtype=DataType.FLOAT_VECTOR,
                dim=1024),
]

schema = CollectionSchema(fields, "Definitions embeddings")

COLLECTION_NAME = "Definitions"
if utility.has_collection(COLLECTION_NAME):
    Collection(COLLECTION_NAME).drop()
collection = Collection(COLLECTION_NAME, schema, consistency_level="Strong")

dense_index = {"index_type": "AUTOINDEX", "metric_type": "IP"}
collection.create_index("dense_vector", dense_index)
collection.load()

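# Note: doc_list and defs_embeddings are not defined in this snippet; they are
# assumed to come from earlier notebook cells (texts and their BGE-M3 embeddings).
# Insert in batches of 50 rows.
for i in range(0, len(doc_list), 50):
    batched_entities = [
        doc_list[i : i + 50],
        defs_embeddings["dense"][i : i + 50],
    ]
    collection.insert(batched_entities)
print("Number of entities inserted:", collection.num_entities)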

The code above creates the definitions_vectors.db and definitions_vectors.db.lock files in the vec_db folder. I am able to connect to the db with no problems from the same Jupyter notebook. However, if I open a new Jupyter notebook while the first one is still running and try to connect to the database like so:

from langchain_milvus import Milvus
from pymilvus.model.hybrid import BGEM3EmbeddingFunction
from langchain.embeddings.base import Embeddings

class BGEMilvusEmbeddings(Embeddings):
    def __init__(self):
        self.model = BGEM3EmbeddingFunction(
            model_name='BAAI/bge-m3',
            device='cpu',
            use_fp16=False  # set to False if device='cpu'
        )

    def embed_documents(self, texts):
        embeddings = self.model.encode_documents(texts)
        return [i.tolist() for i in embeddings["dense"]]

    def embed_query(self, text):
        embedding = self.model.encode_queries([text])
        return embedding["dense"][0].tolist()

MILVUS_URI = "../vec_db/definitions_vectors.db"
MILVUS_COLLECTION_NAME = 'Definitions'

vector_store = Milvus(
    embedding_function=BGEMilvusEmbeddings(),
    connection_args={"uri": MILVUS_URI},
    collection_name=MILVUS_COLLECTION_NAME,
    vector_field="dense_vector",
    text_field="text",
)

I get the following error:

Failed to create new connection using: acd5d70fea4c46beb2a1a5bab35edf75

---------------------------------------------------------------------------
MilvusException: <MilvusException: (code=2, message=Fail connecting to server on unix:/tmp/tmp5nlcf3pw_definitions_vectors.db.sock, illegal connection params or server unavailable)>

@junjiejiangjjj
Collaborator

Hi @leonardozilli, the Milvus Lite db file doesn't support being opened by multiple processes at the same time; otherwise the data could become inconsistent.
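One way to enforce the one-process rule from application code is an advisory lock on a sidecar file. The sketch below is an illustration under stated assumptions, not part of pymilvus or Milvus Lite: `single_owner`, `AlreadyInUse`, and the `.guard` suffix are hypothetical names, `fcntl.flock` is Unix-only, and the guard only coordinates processes that all use this same helper.

```python
import errno
import fcntl
import os
import tempfile
from contextlib import contextmanager


class AlreadyInUse(RuntimeError):
    """Raised when another holder already has the guard lock (hypothetical name)."""


@contextmanager
def single_owner(db_path: str):
    """Hold an exclusive advisory lock on '<db_path>.guard' while the block runs."""
    guard_path = db_path + ".guard"  # hypothetical sidecar file, not created by Milvus Lite
    fd = os.open(guard_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # LOCK_NB: fail immediately instead of blocking if the lock is taken.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                raise AlreadyInUse(f"{db_path} is already open elsewhere") from exc
            raise
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock if it was taken


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = os.path.join(tmp, "demo.db")
        with single_owner(db):
            try:
                # A second open of the guard file behaves like a second process.
                with single_owner(db):
                    print("second owner acquired the lock")
            except AlreadyInUse as exc:
                print("refused:", exc)
```

With something like this, each notebook could create its client inside `with single_owner(MILVUS_URI):`, so a second notebook fails fast with a clear message rather than the generic connection failure shown above.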

@leonardozilli

Okay that makes sense, I'll make sure to have only one process working on it. Thanks
