TypeError running cell extraction with downloaded model #15

creisle · 2021-09-14T20:07:42Z

I am probably doing something incorrect here, but I am not sure what. I got everything to run up to and including the combine script, src/baseline/retriever/combine_retrieval.py. I then downloaded the models linked in the README and tried to run the table cell extraction step, and ran into this issue (see the trace below).

/projects/creisle_prj/creisle_scratch/FEVEROUS/venv/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at  /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
[INFO] 2021-09-14 12:58:40,249 - LogHelper - Log Helper set up
[INFO] 2021-09-14 12:58:41,154 - __main__ - Start extracting cells from Tables...

  0%|                                                  | 0/7890 [00:00<?, ?it/s]
100%|███████████████████████████████████| 7890/7890 [00:00<00:00, 259469.96it/s]
Ignored unknown kwargs option trim_offsets
Traceback (most recent call last):
  File "src/baseline/retriever/predict_cells_from_table.py", line 322, in <module>
    main()
  File "src/baseline/retriever/predict_cells_from_table.py", line 317, in main
    extract_cells_from_tables(annotations, args)
  File "src/baseline/retriever/predict_cells_from_table.py", line 260, in extract_cells_from_tables
    predictions =  (model_output.predictions > 0.25).astype(int)
TypeError: '>' not supported between instances of 'NoneType' and 'float'

Have you seen this before? Any idea what I might have done wrong?

System specs
OS: centos07
python: 3.7
conda or pip: pip


Raldir · 2021-09-17T11:33:31Z

Sorry for the delay! The processing step looks suspicious to me: it normally takes a bit of time (about 40 seconds for me) for the data to process. Could you check the input, e.g. by checking the first entries of text_test in l.250?
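
A minimal sketch of that kind of check, assuming `text_test` is a plain Python list of strings at that point. The helper name `preview_entries` is hypothetical and not part of the repository, and the sample strings in the demo are made up for illustration.

```python
def preview_entries(entries, name="text_test", n=3):
    """Return a short report on a list of model inputs.

    Hypothetical helper, not part of the FEVEROUS code base.
    """
    lines = ["{}: {} entries".format(name, len(entries))]
    if not entries:
        lines.append("{} is empty, so the model would be asked to predict on nothing".format(name))
        return lines
    for i, entry in enumerate(entries[:n]):
        # repr() makes empty strings and stray whitespace visible
        lines.append("[{}] {!r}".format(i, entry))
    return lines


if __name__ == "__main__":
    # Made-up sample data, only to show the two cases
    print("\n".join(preview_entries([])))
    print("\n".join(preview_entries(["some claim and cell A", "some claim and cell B"])))
```

Returning the report as a list of strings rather than printing directly makes it easy to pass the same lines to `logger.info` instead.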

creisle · 2021-09-20T22:14:38Z

I ran it again and printed out the first entry of text_test. It is odd: it looks like it's trying to run a batch of zero. Something must be wrong with one of the inputs, but I haven't seen any upstream errors so far.

Since the GPL-licensed package `unidecode` is not installed, using Python's `unicodedata` package which yields worse results.
[INFO] 2021-09-20 13:35:23,636 - LogHelper - Log Helper set up
[INFO] 2021-09-20 13:35:24,767 - __main__ - Start extracting cells from Tables...

  0%|          | 0/7890 [00:00<?, ?it/s]
100%|██████████| 7890/7890 [00:00<00:00, 302936.25it/s]
***** Running Prediction *****
  Num examples = 0
  Batch size = 16
Encoding(num_tokens=2, attributes=[ids, type_ids, tokens, offsets, attention_mask, special_tokens_mask, overflowing])
Traceback (most recent call last):
  File "src/baseline/retriever/predict_cells_from_table.py", line 323, in <module>
    main()
  File "src/baseline/retriever/predict_cells_from_table.py", line 318, in main
    extract_cells_from_tables(annotations, args)
  File "src/baseline/retriever/predict_cells_from_table.py", line 261, in extract_cells_from_tables
    predictions =  (model_output.predictions > 0.25).astype(int)
TypeError: '>' not supported between instances of 'NoneType' and 'float'
[Mon Sep 20 13:36:23 2021]
Error in rule extract_table_cells:
    jobid: 0
    output: data/dev.combined.not_precomputed.p5.s5.t3.cells.jsonl
    shell:
        source /projects/creisle_prj/creisle_scratch/FEVEROUS/venv3.7-unbuntu/bin/activate; python src/baseline/retriever/predict_cells_from_table.py --input_path data/dev.combined.not_precomputed.p5.s5.t3.jsonl --max_sent 5 --wiki_path data/feverous_wikiv1.db --model_path models/feverous_cell_extractor
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
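
The `Num examples = 0` line suggests the model never received any input, which would be consistent with `model_output.predictions` being `None` when it reaches the threshold comparison. Below is a minimal sketch of a guard that fails with a clearer message. It assumes the scores can be treated as nested lists: the script calls `.astype(int)`, which points to a NumPy array, and plain lists are swapped in here only to keep the example dependency-free. The name `binarize_predictions` is hypothetical.

```python
def binarize_predictions(predictions, threshold=0.25):
    """Turn per-label scores into 0/1 labels, failing loudly on missing output.

    Hypothetical helper; plain lists stand in for the array used in the script.
    """
    if predictions is None:
        raise ValueError(
            "Model returned no predictions; check that the input list is not empty."
        )
    # A score strictly above the threshold becomes 1, everything else 0
    return [[int(score > threshold) for score in row] for row in predictions]


if __name__ == "__main__":
    print(binarize_predictions([[0.1, 0.9], [0.3, 0.2]]))
    try:
        binarize_predictions(None)
    except ValueError as err:
        print("caught:", err)
```

This does not fix the empty input itself; it only turns the opaque `TypeError` into a message that points at the likely cause.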

Raldir · 2021-09-24T04:35:34Z

Yes, there is something wrong with the input. With the latest commit, I have added some additional logging that might help to find out what's going on. Could you pull the current version and try again?

creisle · 2021-09-27T23:16:07Z

I gave this a try and ran into some namespace/path errors with the latest version (#18). I am trying this again with the path fixes I put in the latest PR.

creisle · 2021-10-05T19:31:15Z

Sorry for the delay. This is the error message I get with the changes in main:

[INFO] 2021-10-05 12:25:46,778 - __main__ - Start extracting cells from Tables...

  0%|          | 0/7890 [00:00<?, ?it/s]
100%|██████████| 7890/7890 [00:00<00:00, 289959.33it/s]
Traceback (most recent call last):
  File "src/feverous/baseline/retriever/predict_cells_from_table.py", line 325, in <module>
    main()
  File "src/feverous/baseline/retriever/predict_cells_from_table.py", line 320, in main
    extract_cells_from_tables(annotations, args)
  File "src/feverous/baseline/retriever/predict_cells_from_table.py", line 245, in extract_cells_from_tables
    logger.info('Sample entry: {}'.format(all_input[0]))

Should I try from the start? Does the new error help?

Raldir · 2021-10-06T14:36:25Z

Is that already the end of the error message? It does not tell us much beyond the fact that the entries are not processed correctly. I would recommend a line-by-line check of l.237 to l.242 to see whether anno contains information. Maybe call anno.get_claim() and anno.get_evidence() to sanity-check the annotation processor.
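
A rough sketch of such a sanity check, assuming each annotation object exposes the `get_claim()` and `get_evidence()` methods mentioned above. The `StubAnnotation` class and the `find_suspicious_annotations` helper are hypothetical stand-ins so the example runs on its own; the real annotation objects come from the repository's annotation processor and may behave differently.

```python
class StubAnnotation:
    """Hypothetical stand-in exposing the two methods named in the comment above."""

    def __init__(self, claim, evidence):
        self._claim = claim
        self._evidence = evidence

    def get_claim(self):
        return self._claim

    def get_evidence(self):
        return self._evidence


def find_suspicious_annotations(annotations, limit=5):
    """Check the first `limit` annotations and return (index, reason) pairs for empty fields."""
    problems = []
    for i, anno in enumerate(annotations[:limit]):
        if not anno.get_claim():
            problems.append((i, "empty claim"))
        if not anno.get_evidence():
            problems.append((i, "empty evidence"))
    return problems


if __name__ == "__main__":
    # Made-up annotations: one healthy, one with nothing in it
    sample = [StubAnnotation("a claim", ["some evidence"]), StubAnnotation("", [])]
    for index, reason in find_suspicious_annotations(sample):
        print("annotation {}: {}".format(index, reason))
```

If every annotation comes back with an empty claim or empty evidence, the problem is likely upstream of the cell extraction step, for example in the combined input file.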
