Getting issue due to multiprocessing used in the launch function during indexing of the collections #373

Open
Rajeswari2000 opened this issue Oct 22, 2024 · 1 comment

Comments

@Rajeswari2000

My Python file:

with Run().context(RunConfig(nranks=1)):
    config = ColBERTConfig(doc_maxlen=doc_maxlen, nbits=nbits)
    indexer = Indexer(checkpoint=checkpoint, config=config)
    indexer.index(name=index_name, collection=collection, overwrite=True)

Code in colbert/indexer.py:

def __launch(self, collection):
    launcher = Launcher(encode)
    if self.config.nranks == 1 and self.config.avoid_fork_if_possible:
        shared_queues = []
        shared_lists = []
        launcher.launch_without_fork(self.config, collection, shared_lists, shared_queues, self.verbose)

        return
    manager = mp.Manager()
    shared_lists = [manager.list() for _ in range(self.config.nranks)]
    shared_queues = [manager.Queue(maxsize=1) for _ in range(self.config.nranks)]

    # Encodes collection into index using the CollectionIndexer class
    launcher.launch(self.config, collection, shared_lists, shared_queues, self.verbose)
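A note on the quoted condition: as written, `nranks=1` alone does not skip `mp.Manager()`; the no-fork path is taken only when `avoid_fork_if_possible` is also truthy. The sketch below mirrors just that branch with a hypothetical stand-in config class (`FakeConfig` and `takes_no_fork_path` are illustrative names, not part of ColBERT). Whether the flag defaults to `False` in a given installed version is an assumption worth checking against your own copy of the library.

```python
from dataclasses import dataclass


@dataclass
class FakeConfig:
    """Hypothetical stand-in for the config object read by __launch."""
    nranks: int = 1
    avoid_fork_if_possible: bool = False  # assumed default, verify locally


def takes_no_fork_path(config):
    """Mirror the condition at the top of the quoted __launch method."""
    return config.nranks == 1 and bool(config.avoid_fork_if_possible)


if __name__ == "__main__":
    for cfg in (FakeConfig(1, False), FakeConfig(1, True), FakeConfig(2, True)):
        path = "launch_without_fork" if takes_no_fork_path(cfg) else "mp.Manager() + launch"
        print(cfg, "->", path)
```

If the installed config exposes `avoid_fork_if_possible`, setting it together with `nranks=1` would, by this reading, avoid the `mp.Manager()` call entirely.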

Here, on reaching the line manager = mp.Manager(), I get the following error:

Traceback (most recent call last):
  File "/code/util/new_colbert_v2.py", line 90, in <module>
    colber_v2()
  File "/code/util/new_colbert_v2.py", line 78, in colber_v2
    indexer.index(name=index_name, collection=collection, overwrite=True)
  File "/.venv/lib/python3.10/site-packages/colbert/indexer.py", line 80, in index
    self.__launch(collection)
  File "/.venv/lib/python3.10/site-packages/colbert/indexer.py", line 93, in __launch
    manager = mp.Manager()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/context.py", line 57, in Manager
    m.start()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/managers.py", line 566, in start
    self._address = reader.recv()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/connection.py", line 388, in _recv
    raise EOFError
EOFError
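One possible cause, offered as an assumption rather than a diagnosis: the paths in the traceback point to macOS, where the default start method is spawn. Under spawn, the manager's child process re-imports the main script, and if that script starts multiprocessing work at module level, the child can die during start-up, which the parent sees as an EOFError from `mp.Manager()`. A minimal standard-library sketch of the usual guard follows (`make_shared_state` and `main` are illustrative names, not ColBERT functions):

```python
import multiprocessing as mp


def make_shared_state(nranks):
    """Create a manager plus per-rank shared lists and queues."""
    ctx = mp.get_context("spawn")  # same start method macOS uses by default
    manager = ctx.Manager()
    shared_lists = [manager.list() for _ in range(nranks)]
    shared_queues = [manager.Queue(maxsize=1) for _ in range(nranks)]
    return manager, shared_lists, shared_queues


def main():
    manager, shared_lists, shared_queues = make_shared_state(nranks=1)
    try:
        shared_lists[0].append("ok")
        return list(shared_lists[0])
    finally:
        manager.shutdown()  # stop the manager's server process


# Without this guard, the spawned child would re-run the module-level
# call while it is still bootstrapping and exit before replying.
if __name__ == "__main__":
    print(main())
```

Applied to the script in this report, that would mean placing the module-level `colber_v2()` call under an `if __name__ == "__main__":` guard.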

@sandkumaCode

I am getting a similar issue:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/demo/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/home/demo/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/home/demo/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/demo/.pyenv/versions/3.10.12/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/home/demo/.pyenv/versions/3.10.12/lib/python3.10/runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/demo/.pyenv/versions/3.10.12/lib/python3.10/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/demo/.pyenv/versions/3.10.12/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/demo/latest_repo/gpu-libs/gpu-libs/colbert/web_colbert_index_multiple.py", line 1, in <module>
    from colbert.infra.run import Run
ModuleNotFoundError: No module named 'colbert.infra'

pip show colbert-ir
Name: colbert-ir
Version: 0.2.14
Summary: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Home-page: https://github.com/stanford-futuredata/ColBERT
Author: Omar Khattab
Author-email: [email protected]
License: UNKNOWN
Location: /home/demo/latest_repo/gpu-libs/.venv/lib/python3.10/site-packages
Requires: bitarray, datasets, flask, git-python, ninja, python-dotenv, scipy, spacy, tqdm, transformers, ujson
Required-by:

pip show colbert-ai
Name: colbert-ai
Version: 0.2.19
Summary: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Home-page: https://github.com/stanford-futuredata/ColBERT
Author: Omar Khattab
Author-email: [email protected]
License:
Location: /home/demo/latest_repo/gpu-libs/.venv/lib/python3.10/site-packages
Requires: bitarray, datasets, flask, git-python, ninja, python-dotenv, scipy, tqdm, transformers, ujson
Required-by: RAGatouille
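
A diagnostic sketch for the `No module named 'colbert.infra'` error, using only the standard library. It reports which location the name `colbert` resolves to, which may help show whether a local directory named `colbert` (the failing script itself lives in one) or one of the two installed distributions is being picked up. Whether either of those is the actual cause here is an assumption; `describe_module` is an illustrative helper, not part of ColBERT.

```python
import importlib.util


def describe_module(name):
    """Report whether a module can be located and where it resolves from."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when a parent package of a dotted name cannot be imported.
        spec = None
    if spec is None:
        return {"name": name, "found": False, "origin": None, "locations": []}
    return {
        "name": name,
        "found": True,
        "origin": spec.origin,
        "locations": list(spec.submodule_search_locations or []),
    }


if __name__ == "__main__":
    for name in ("colbert", "colbert.infra"):
        print(describe_module(name))
```

Running this from the same working directory and virtual environment as the failing script makes the comparison meaningful, because the spawned child process resolves imports from that same starting point.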
