
Error when converting llama3.2-1B tokenizer.model to tokenizer.bin #5913

Open
a8nova opened this issue Oct 5, 2024 · 2 comments
Comments

a8nova commented Oct 5, 2024

🐛 Describe the bug

When following the instructions in https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/docs/delegates/xnnpack_README.md, converting the Llama 3.2-1B tokenizer.model to tokenizer.bin fails with the error below:

(et_xnnpack) /content/HI/executorch# python -m extension.llm.tokenizer.tokenizer -t /root/.llama/checkpoints/Llama3.2-1B/tokenizer.model -o tokenizer.bin
Traceback (most recent call last):
  File "/usr/local/envs/et_xnnpack/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/envs/et_xnnpack/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/content/HI/executorch/extension/llm/tokenizer/tokenizer.py", line 140, in <module>
    t = Tokenizer(args.tokenizer_model)
  File "/content/HI/executorch/extension/llm/tokenizer/tokenizer.py", line 26, in __init__
    self.sp_model = SentencePieceProcessor(model_file=model_path)
  File "/usr/local/envs/et_xnnpack/lib/python3.10/site-packages/sentencepiece/__init__.py", line 468, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/usr/local/envs/et_xnnpack/lib/python3.10/site-packages/sentencepiece/__init__.py", line 961, in Load
    return self.LoadFromFile(model_file)
  File "/usr/local/envs/et_xnnpack/lib/python3.10/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: could not parse ModelProto from /root/.llama/checkpoints/Llama3.2-1B/tokenizer.model
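The RuntimeError comes from SentencePiece failing to parse the file as a ModelProto. Llama 3 models ship a tiktoken-style BPE ranks file (plain text, one base64-encoded token plus an integer rank per line) rather than a SentencePiece protobuf, which would explain the parse failure. A minimal sketch to check which format a tokenizer.model file is (the `looks_like_tiktoken` helper is hypothetical, not part of ExecuTorch):

```python
import base64

def looks_like_tiktoken(path):
    """Heuristic: a tiktoken ranks file starts with '<base64 token> <rank>' lines,
    while a SentencePiece model is a binary protobuf. Returns True for the former."""
    try:
        with open(path, "rb") as f:
            fields = f.readline().split()
        # Expect exactly two whitespace-separated fields: base64 token, integer rank.
        return (
            len(fields) == 2
            and fields[1].isdigit()
            and base64.b64decode(fields[0], validate=True) is not None
        )
    except Exception:
        # Binary protobuf bytes typically fail the split/decode checks above.
        return False
```

If this returns True for your tokenizer.model, the SentencePiece-based converter is simply the wrong tool for that file.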
a8nova changed the title from "Error when converting tokenizer.model to bin" to "Error when converting llama3.2-1B tokenizer.model to tokenizer.bin" on Oct 5, 2024
HSANGLEE commented Oct 8, 2024

Dear @a8nova

As far as I know, starting with Llama 3 you can use the tokenizer.model file directly; converting it to tokenizer.bin is no longer needed.

a8nova commented Oct 9, 2024

Thank you, @HSANGLEE. I am currently blocked by #5840.
