Evaluating nomic on BornholmBitextMining fails #1421

Open
Muennighoff opened this issue Nov 9, 2024 · 0 comments

2024-11-09 21:39:45.007505: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-09 21:39:45.021610: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-09 21:39:45.025738: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
INFO:mteb.cli:Running with parameters: Namespace(model='nomic-ai/nomic-embed-text-v1.5', task_types=None, categories=None, tasks=['BornholmBitextMining'], languages=None, benchmarks=None, device=None, output_folder='/data/niklas/results/results', verbosity=2, co2_tracker=True, eval_splits=None, model_revision=None, batch_size=4, overwrite=False, save_predictions=False, func=<function run at 0x7f8c842505e0>)
WARNING:transformers_modules.nomic-ai.nomic-bert-2048.c1b1fd7a715b8eb2e232d34593154ac782c98ac9.modeling_hf_nomic_bert:<All keys matched successfully>
INFO:mteb.evaluation.MTEB:

## Evaluating 1 tasks:
─────────────────────────────── Selected tasks  ────────────────────────────────
BitextMining
    - BornholmBitextMining, s2s


INFO:mteb.evaluation.MTEB:

********************** Evaluating BornholmBitextMining **********************
INFO:mteb.evaluation.MTEB:Loading dataset for BornholmBitextMining
No config specified, defaulting to the single config: bornholmsk_parallel/BornholmskParallel
INFO:datasets.builder:No config specified, defaulting to the single config: bornholmsk_parallel/BornholmskParallel
Loading Dataset Infos from /data/huggingface/modules/datasets_modules/datasets/strombergnlp--bornholmsk_parallel/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815
INFO:datasets.info:Loading Dataset Infos from /data/huggingface/modules/datasets_modules/datasets/strombergnlp--bornholmsk_parallel/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815
Overwrite dataset info from restored data version if exists.
INFO:datasets.builder:Overwrite dataset info from restored data version if exists.
Loading Dataset info from /data/huggingface/datasets/strombergnlp___bornholmsk_parallel/BornholmskParallel/1.0.0/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815
INFO:datasets.info:Loading Dataset info from /data/huggingface/datasets/strombergnlp___bornholmsk_parallel/BornholmskParallel/1.0.0/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815
Found cached dataset bornholmsk_parallel (/data/huggingface/datasets/strombergnlp___bornholmsk_parallel/BornholmskParallel/1.0.0/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815)
INFO:datasets.builder:Found cached dataset bornholmsk_parallel (/data/huggingface/datasets/strombergnlp___bornholmsk_parallel/BornholmskParallel/1.0.0/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815)
Loading Dataset info from /data/huggingface/datasets/strombergnlp___bornholmsk_parallel/BornholmskParallel/1.0.0/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815
INFO:datasets.info:Loading Dataset info from /data/huggingface/datasets/strombergnlp___bornholmsk_parallel/BornholmskParallel/1.0.0/a93ddacca6042553271bf4c1c0e035df3fccf848c6820417766752d676567815
INFO:mteb.abstasks.AbsTaskBitextMining:
Task: BornholmBitextMining, split: test, subset: default. Running...

Encoding 2x500 sentences:   0%|          | 0/2 [00:00<?, ?it/s]INFO:mteb.models.wrapper:No combination of task name and prompt type was found in model prompts.

Encoding 2x500 sentences:  50%|█████     | 1/2 [00:01<00:01,  1.27s/it]INFO:mteb.models.wrapper:No combination of task name and prompt type was found in model prompts.

Encoding 2x500 sentences: 100%|██████████| 2/2 [00:02<00:00,  1.10s/it]
Encoding 2x500 sentences: 100%|██████████| 2/2 [00:02<00:00,  1.12s/it]

Matching sentences:   0%|          | 0/1 [00:00<?, ?it/s]INFO:mteb.evaluation.evaluators.BitextMiningEvaluator:Finding nearest neighbors...

Matching sentences:   0%|          | 0/1 [00:00<?, ?it/s]
ERROR:mteb.evaluation.MTEB:Error while evaluating BornholmBitextMining: expected np.ndarray (got Tensor)
Traceback (most recent call last):
  File "/env/lib/conda/gritkto/bin/mteb", line 8, in <module>
    sys.exit(main())
  File "/data/niklas/mteb/mteb/cli.py", line 387, in main
    args.func(args)
  File "/data/niklas/mteb/mteb/cli.py", line 145, in run
    eval.run(
  File "/data/niklas/mteb/mteb/evaluation/MTEB.py", line 477, in run
    raise e
  File "/data/niklas/mteb/mteb/evaluation/MTEB.py", line 425, in run
    results, tick, tock = self._run_eval(
  File "/data/niklas/mteb/mteb/evaluation/MTEB.py", line 301, in _run_eval
    results = task.evaluate(
  File "/data/niklas/mteb/mteb/abstasks/AbsTaskBitextMining.py", line 82, in evaluate
    scores[hf_subet] = self._evaluate_subset(
  File "/data/niklas/mteb/mteb/abstasks/AbsTaskBitextMining.py", line 115, in _evaluate_subset
    metrics = evaluator(model, encode_kwargs=encode_kwargs)
  File "/data/niklas/mteb/mteb/evaluation/evaluators/BitextMiningEvaluator.py", line 42, in __call__
    scores = self.compute_metrics(model, encode_kwargs=encode_kwargs)
  File "/data/niklas/mteb/mteb/evaluation/evaluators/BitextMiningEvaluator.py", line 64, in compute_metrics
    scores[f"{key1}-{key2}"] = self._compute_metrics(
  File "/data/niklas/mteb/mteb/evaluation/evaluators/BitextMiningEvaluator.py", line 82, in _compute_metrics
    nearest_neighbors = self._similarity_search(embeddings1, embeddings2, top_k=1)
  File "/data/niklas/mteb/mteb/evaluation/evaluators/BitextMiningEvaluator.py", line 128, in _similarity_search
    query_embeddings = torch.from_numpy(query_embeddings)
TypeError: expected np.ndarray (got Tensor)
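
The traceback suggests that the model wrapper returned `torch.Tensor` embeddings, while `_similarity_search` calls `torch.from_numpy`, which only accepts `np.ndarray`. A minimal sketch of a type guard is below. It assumes the fix belongs in the evaluator rather than in the model wrapper, and `as_tensor` is a hypothetical helper name, not an existing mteb function.

```python
import numpy as np
import torch


def as_tensor(embeddings):
    """Hypothetical helper: return a torch.Tensor for tensor, ndarray, or list input."""
    if isinstance(embeddings, torch.Tensor):
        # Already a tensor (what the model appears to return in this run).
        return embeddings
    # Lists and np.ndarray go through NumPy first.
    return torch.from_numpy(np.asarray(embeddings))


# Minimal reproduction of the failure at BitextMiningEvaluator.py line 128.
try:
    torch.from_numpy(torch.zeros(2, 3))
except TypeError as err:
    print(err)

# Both input types now yield a tensor.
query_embeddings = as_tensor(torch.zeros(2, 3))
corpus_embeddings = as_tensor(np.zeros((2, 3), dtype=np.float32))
```

An alternative would be to have the model wrapper always return NumPy arrays from `encode`, which keeps the evaluator unchanged. Which side should own the conversion is a design question for the maintainers.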
