Running the tasks with BAAI/bge-visualized-base-base/m3 produces errors like the one below:
ERROR:mteb.evaluation.MTEB:Error while evaluating InfoSeekIT2TRetrieval: The size of tensor a (516) must match the size of tensor b (512) at non-singleton dimension 1
Traceback (most recent call last):
File "/data/niklas/mieb/mteb/scripts/run_mieb.py", line 82, in <module>
results = evaluation.run(model, output_folder="/data/niklas/mieb/results-mieb-final", batch_size=1)
File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 464, in run
raise e
File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 425, in run
results, tick, tock = self._run_eval(
File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 300, in _run_eval
results = task.evaluate(
File "/data/niklas/mieb/mteb/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 269, in evaluate
scores[hf_subset] = self._evaluate_subset(
File "/data/niklas/mieb/mteb/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 278, in _evaluate_subset
results = retriever(corpus, queries)
File "/data/niklas/mieb/mteb/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 290, in __call__
return self.retriever.search(
File "/data/niklas/mieb/mteb/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 173, in search
sub_corpus_embeddings = self.model.get_text_embeddings(
File "/data/niklas/mieb/mteb/mteb/models/vista_models.py", line 130, in get_text_embeddings
batch_embeddings = self.encode(texts=batch_texts)
File "/data/niklas/mieb/mteb/mteb/models/vista_models.py", line 121, in encode
return self.encode_text(texts.to(self.device))
File "/data/niklas/mieb/mteb/mteb/models/vista_models.py", line 65, in encode_text
embedding_output = self.bge_embeddings(
File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 217, in forward
embeddings += position_embeddings
RuntimeError: The size of tensor a (516) must match the size of tensor b (512) at non-singleton dimension 1
and the one below:
/opt/conda/conda-bld/pytorch_1724789122112/work/aten/src/ATen/native/cuda/Indexing.cu:1284: indexSelectLargeIndex: block: [845,0,0], thread: [126,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1724789122112/work/aten/src/ATen/native/cuda/Indexing.cu:1284: indexSelectLargeIndex: block: [845,0,0], thread: [127,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
31%|████████████████████▍ | 6110/20000 [01:38<03:43, 62.08it/s]
ERROR:mteb.evaluation.MTEB:Error while evaluating InfoSeekIT2ITRetrieval: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Traceback (most recent call last):
File "/data/niklas/mieb/mteb/scripts/run_mieb.py", line 81, in <module>
results = evaluation.run(model, output_folder="/data/niklas/mieb/results-mieb-final", batch_size=1)
File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 464, in run
raise e
File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 425, in run
results, tick, tock = self._run_eval(
File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 300, in _run_eval
results = task.evaluate(
File "/data/niklas/mieb/mteb/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 269, in evaluate
scores[hf_subset] = self._evaluate_subset(
File "/data/niklas/mieb/mteb/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 278, in _evaluate_subset
results = retriever(corpus, queries)
File "/data/niklas/mieb/mteb/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 290, in __call__
return self.retriever.search(
File "/data/niklas/mieb/mteb/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 194, in search
sub_corpus_embeddings = self.model.get_fused_embeddings(
File "/data/niklas/mieb/mteb/mteb/models/vista_models.py", line 169, in get_fused_embeddings
all_embeddings.append(batch_embeddings.cpu())
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
Also getting this for OVENIT2ITRetrieval; maybe a problem with our bge implementation.
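The `srcIndex < srcSelectDimSize` assertion is what an out-of-range index into a lookup table (such as an embedding matrix) looks like on CUDA. It may share a root cause with the 516 vs. 512 mismatch above, though the truncated trace does not confirm that. A minimal sketch of a CPU-side check that could be run on a batch before the lookup, assuming a table size of 512 as in the first error; `find_out_of_range` is a hypothetical helper, not part of mteb:

```python
from typing import List, Tuple


def find_out_of_range(
    index_rows: List[List[int]], table_size: int
) -> List[Tuple[int, int, int]]:
    """Return (row, column, value) for every index outside [0, table_size)."""
    bad = []
    for r, row in enumerate(index_rows):
        for c, idx in enumerate(row):
            if idx < 0 or idx >= table_size:
                bad.append((r, c, idx))
    return bad


# Assumption: position ids 0..515 for a 516-token sequence, checked against
# a 512-row table (both sizes taken from the first error message).
position_ids = [list(range(516))]
offenders = find_out_of_range(position_ids, table_size=512)
print(len(offenders), offenders[:1])
```

Running with `CUDA_LAUNCH_BLOCKING=1`, as the log suggests, or on CPU should also make the failing call show up at the right place in the traceback.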
Muennighoff changed the title from "InfoSeekIT2ITRetrieval & InfoSeekIT2TRetrieval fail with BAAI/bge-visualized" to "[mieb] InfoSeekIT2ITRetrieval & InfoSeekIT2TRetrieval fail with BAAI/bge-visualized" on Nov 4, 2024
Strangely, I'm not able to reproduce the error on the two datasets. Could this be due to the transformers and tokenizers versions? It looks related to inputs not being truncated to the max length.
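If missing truncation is the cause, one fix would be to cap each sequence at the model's position limit before it reaches the embedding layer. A minimal sketch on plain token-id lists, assuming a limit of 512 (from the error message) and a BERT-style layout where the final id is a separator token worth keeping; `truncate_ids` is a hypothetical helper, not the mteb or transformers API:

```python
from typing import List


def truncate_ids(token_ids: List[int], max_length: int = 512) -> List[int]:
    """Cap a token-id sequence at max_length, keeping the final (separator) id."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    if len(token_ids) <= max_length:
        return list(token_ids)
    # Keep the first max_length - 1 ids, then re-append the last id.
    return list(token_ids[: max_length - 1]) + [token_ids[-1]]


# Assumption: 101 and 102 stand in for BERT-style [CLS] and [SEP] ids;
# the 514 ids in between are dummy values giving 516 tokens in total.
ids = [101] + list(range(1000, 1514)) + [102]
capped = truncate_ids(ids, max_length=512)
print(len(ids), len(capped))
```

With a Hugging Face tokenizer, passing `truncation=True` and `max_length=512` in the tokenizer call should have the same effect without a custom helper.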
On my end, I am able to run: