[Training] Protobuf limit reached when generating training artifacts for an ONNX model larger than 2GB. #18874
Labels
training
Describe the issue
Unable to generate training artifacts for a model larger than 2GB. The ONNX model is exported from a Hugging Face repo, with the weights stored in an external file.
Model exported: OPT-1.3B
Error when passing onnx.ModelProto
Traceback (most recent call last):
  File "onnxruntime_artifacts.py", line 58, in <module>
    artifacts.generate_artifacts(
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnxruntime/training/artifacts.py", line 137, in generate_artifacts
    _ = training_block(*[output.name for output in model.graph.output])
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnxruntime/training/onnxblock/onnxblock.py", line 188, in __call__
    output = self.build(*args, **kwargs)
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnxruntime/training/artifacts.py", line 107, in build
    return self._loss(*inputs_to_loss)
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnxruntime/training/onnxblock/blocks.py", line 48, in __call__
    output = self.build(*args, **kwargs)
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnxruntime/training/onnxblock/loss/loss.py", line 48, in build
    target_name = blocks.InputLike(loss_input_name)(target_name)
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnxruntime/training/onnxblock/blocks.py", line 50, in __call__
    onnx.checker.check_model(self.base, True)
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnx/checker.py", line 145, in check_model
    raise ValueError(
ValueError: This protobuf of onnx model is too large (>2GB). Call check_model with model path instead.
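For context, here is a rough back-of-the-envelope estimate of why the in-memory model trips the size check. This is only a sketch: the parameter count is the nominal 1.3 billion implied by the model name, 4 bytes per parameter assumes fp32 weights (the export dtype is not stated above), and `PROTOBUF_LIMIT_BYTES` is an approximation of the 2GB cap, not a value read from the onnx source. The helper names are hypothetical.

```python
# Rough size estimate. The parameter count and dtype width are assumptions,
# and the limit below is an approximation of the 2GB protobuf cap.
PROTOBUF_LIMIT_BYTES = 2 * 1024**3


def estimated_weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Estimate the raw size of the weights, ignoring graph metadata."""
    return num_params * bytes_per_param


def exceeds_protobuf_limit(num_params: int, bytes_per_param: int = 4) -> bool:
    """Return True if the weights alone would not fit in one protobuf message."""
    return estimated_weight_bytes(num_params, bytes_per_param) > PROTOBUF_LIMIT_BYTES


opt_1_3b_params = 1_300_000_000  # nominal count implied by the name "OPT-1.3B"

print(estimated_weight_bytes(opt_1_3b_params, 4))  # 5200000000
print(exceeds_protobuf_limit(opt_1_3b_params))     # True
```

Under these assumptions the weights alone come to roughly 5.2GB, well past the limit, so any check that serializes the whole ModelProto with the weights loaded would be expected to fail.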
Error when passing str (onnx file path)
Traceback (most recent call last):
  File "onnxruntime_artifacts.py", line 58, in <module>
    artifacts.generate_artifacts(
  File "/workspace/envs/training_env/lib/python3.8/site-packages/onnxruntime/training/artifacts.py", line 137, in generate_artifacts
    _ = training_block(*[output.name for output in model.graph.output])
AttributeError: 'str' object has no attribute 'graph'
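The second traceback suggests that this version of generate_artifacts reads model.graph directly, so a plain path string cannot be passed in place of a ModelProto. The sketch below shows the kind of caller-side guard that makes the mismatch explicit. `resolve_model` and its `loader` argument are hypothetical names for illustration, not part of onnxruntime. In practice the loader would be something like onnx.load, and loading the full model that way would presumably still run into the >2GB check from the first traceback, so this illustrates the type mismatch and is not a workaround.

```python
import os
from types import SimpleNamespace
from typing import Any, Callable


def resolve_model(model_or_path: Any, loader: Callable[[str], Any]) -> Any:
    """Return an object exposing .graph, loading it first if a path was given.

    Hypothetical helper: `loader` stands in for a real loading function.
    """
    if isinstance(model_or_path, (str, os.PathLike)):
        # A path has no .graph attribute, so it must be loaded first.
        return loader(os.fspath(model_or_path))
    if not hasattr(model_or_path, "graph"):
        raise TypeError(
            f"expected a model with a .graph attribute or a path, "
            f"got {type(model_or_path).__name__}"
        )
    return model_or_path


def fake_loader(path: str) -> SimpleNamespace:
    """Stand-in loader so the sketch runs without onnx installed."""
    return SimpleNamespace(graph=f"graph loaded from {path}")


model = resolve_model("model.onnx", fake_loader)
print(model.graph)  # graph loaded from model.onnx
```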
To reproduce
artifacts.generate_artifacts(
    onnx_model,
    optimizer=artifacts.OptimType.AdamW,
    loss=artifacts.LossType.MSELoss,
    requires_grad=requires_grad,
    frozen_params=frozen_params,
    artifact_directory=output_dir,
    additional_output_names=["logits"],
)
Urgency
High
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
onnx-1.15.0 onnxruntime-1.16.3 onnxruntime-training-1.16.3
PyTorch Version
pytorch-2.0.1
Execution Provider
Default CPU
Execution Provider Library Version
No response