[Training] Onnxruntime OnDevice training : onnxruntime::training::api::Module::Module [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Shape(19) node with name '/bert/Shape_1' #19351
Comments
This issue should be fixed by #18300, which will be part of the next release, coming soon. The ONNX model for the loss is created with an opset that onnxruntime does not support yet. In the new release, the loss is created with the same opset as the model. |
Does this mean that the current version does not support BERT models for on-device training? If the answer is no, can you please share some references related to this? |
It does. It should work if you install a less recent version of onnx. |
This is returning 20, and the onnx version that I am using is '1.15.0'. Which version of onnx, and which opset version, should I use? |
This table should give you this information: https://onnxruntime.ai/docs/reference/compatibility.html#onnx-opset-support. |
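To compare against that table, it helps to know two numbers: the highest opset the installed onnx package defines, and the opset the exported model actually declares. The following is a minimal sketch with hypothetical helper names; it assumes `onnx.defs.onnx_opset_version()` is the kind of check meant in the comment above, which the thread does not confirm.

```python
def installed_max_opset():
    """Highest default-domain opset known to the installed onnx package, or None."""
    try:
        # Imported lazily so this sketch also runs where onnx is not installed.
        from onnx import defs
    except ImportError:
        return None
    return defs.onnx_opset_version()


def declared_opsets(model):
    """Map each operator-set domain imported by a ModelProto to its version."""
    # An empty domain string denotes the default ONNX operator set ("ai.onnx").
    return {(entry.domain or "ai.onnx"): entry.version for entry in model.opset_import}
```

For example, `declared_opsets(onnx.load("artifacts/bertmodel.onnx"))` would show what the model itself imports, which is the number to look up in the compatibility table.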
I tried changing the version, but the above solution did not work. Can you please advise which node/layer I should remove to resolve the above error? These are the nodes/layers that I am using for training: 'bert.pooler.dense.weight |
@Leaner23 could you try using a newer version of
import onnx
from onnxruntime.training import artifacts

onnx_model = onnx.load("artifacts/bertmodel.onnx")
requires_grad = [
    "bert.pooler.dense.weight",
    "bert.pooler.dense.bias",
    "linear.weight",
    "linear.bias",
    "linear2.weight",
    "linear2.bias",
    "linear3.weight",
    "linear3.bias",
]
# Every initializer that is not explicitly trainable is frozen.
frozen_params = [
    param.name
    for param in onnx_model.graph.initializer
    if param.name not in requires_grad
]
artifacts.generate_artifacts(
    onnx_model,
    requires_grad=requires_grad,
    frozen_params=frozen_params,
    loss=artifacts.LossType.BCEWithLogitsLoss,
    optimizer=artifacts.OptimType.AdamW,
    artifact_directory="artifacts",
)

import onnxruntime.training.api as orttraining

checkpoint = orttraining.CheckpointState.load_checkpoint("artifacts/checkpoint")
model = orttraining.Module("artifacts/training_model.onnx", checkpoint, "artifacts/eval_model.onnx")
optimize = orttraining.Optimizer("artifacts/optimizer_model.onnx", model)

To install
python -m pip install cerberus flatbuffers h5py "numpy>=1.16.6" onnx packaging protobuf sympy "setuptools>=41.4.0"
pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT/pypi/simple/ onnxruntime-training-cpu |
I am closing this as I cannot reproduce this error on my end. Please re-open in case this issue still persists. |
Describe the issue
I am building a sentiment analysis model using BERT for on-device training. I get the error message below while loading the generated artifacts of the model:
RuntimeError Traceback (most recent call last)
Input In [5], in <cell line: 9>()
5 from onnxruntime.capi import _pybind_state as C
7 checkpoint_state = orttraining.CheckpointState.load_checkpoint(
8 r'artifacts\checkpoint')
----> 9 model = orttraining.Module(
10 r"artifacts\training_model.onnx",
11 checkpoint_state,
12 r"artifacts\eval_model.onnx",
13 )
14 optimizer = orttraining.Optimizer(
15 r"artifacts\optimizer.onnx", model
16 )
File ~\AppData\Roaming\Python\Python39\site-packages\onnxruntime\training\api\module.py:54, in Module.__init__(self, train_model_uri, state, eval_model_uri, device)
47 device_id = 0 if len(options) < 2 else int(options[1])
49 self._device = C.OrtDevice(
50 get_ort_device_type(self._device_type, device_id),
51 C.OrtDevice.default_memory(),
52 device_id,
53 )
---> 54 self._model = C.Module(
55 os.fspath(train_model_uri),
56 state._state,
57 os.fspath(eval_model_uri) if eval_model_uri is not None else None,
58 self._device,
59 )
60 self._state = state
RuntimeError: C:\a_work\1\s\orttraining\orttraining\training_api\module.cc:175 onnxruntime::training::api::Module::Module [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Shape(19) node with name '/bert/Shape_1'
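The last line of the traceback carries the three facts needed to diagnose the failure: the operator type, the number in parentheses (read here as the operator's opset version, which is an assumption about the message format), and the node name. A small sketch that pulls them out of the message text; `parse_missing_kernel` is a hypothetical helper, not an onnxruntime API.

```python
import re

# Pattern for the NOT_IMPLEMENTED message shown in the traceback above.
_NO_KERNEL = re.compile(
    r"Could not find an implementation for (?P<op>\w+)\((?P<opset>\d+)\) "
    r"node with name '(?P<node>[^']*)'"
)


def parse_missing_kernel(message):
    """Return (op_type, opset, node_name), or None if the message does not match."""
    match = _NO_KERNEL.search(message)
    if match is None:
        return None
    return match["op"], int(match["opset"]), match["node"]
```

Applied to the message above, this yields the `Shape` operator at version 19 on node `/bert/Shape_1`, which is the combination to check against the installed onnxruntime build.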
To reproduce
Artifacts generation code:-
requires_grad = '''bert.pooler.dense.weight
bert.pooler.dense.bias
linear.weight
linear.bias
linear2.weight
linear2.bias
linear3.weight
linear3.bias'''.split()
frozen_params = [param.name
                 for param in onnx_model.graph.initializer
                 if param.name not in requires_grad]
artifacts.generate_artifacts(
    onnx_model,
    requires_grad=requires_grad,
    frozen_params=frozen_params,
    loss=artifacts.LossType.BCEWithLogitsLoss,
    optimizer=artifacts.OptimType.AdamW,
    artifact_directory="artifacts")
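The list comprehension above does not check that every name in `requires_grad` actually matches an initializer in the model, so a typo or a renamed layer would go unnoticed at this step. A minimal sketch of a stricter split follows; `split_parameters` is a hypothetical helper and works on plain lists of names, for example `[p.name for p in onnx_model.graph.initializer]`.

```python
def split_parameters(initializer_names, trainable_names):
    """Split initializer names into (trainable, frozen), rejecting unknown names."""
    names = list(initializer_names)
    known = set(names)
    # Fail early if a requested trainable parameter is not in the model.
    unknown = [name for name in trainable_names if name not in known]
    if unknown:
        raise ValueError(f"not found among model initializers: {unknown}")
    trainable = set(trainable_names)
    # Preserve the model's initializer order for the frozen list.
    frozen = [name for name in names if name not in trainable]
    return list(trainable_names), frozen
```

The two returned lists can then be passed as `requires_grad` and `frozen_params`.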
bertmodel.zip
import onnx
from onnxruntime.training import artifacts
import onnxruntime.training.api as orttraining
from onnxruntime import InferenceSession
from onnxruntime.capi import _pybind_state as C
checkpoint_state = orttraining.CheckpointState.load_checkpoint(
    r'artifacts\checkpoint')
model = orttraining.Module(
    r"artifacts\training_model.onnx",
    checkpoint_state,
    r"artifacts\eval_model.onnx",
)
optimizer = orttraining.Optimizer(
    r"artifacts\optimizer.onnx", model
)
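Before loading, it can be worth confirming that the files the script refers to were actually produced, since the name of the optimizer model in particular may differ between versions. A minimal sketch, assuming the file names used in the loading code above; `missing_artifacts` is a hypothetical helper.

```python
from pathlib import Path

# File names taken from the loading code above; adjust them to match the
# contents of the generated artifact directory.
DEFAULT_ARTIFACTS = (
    "checkpoint",
    "training_model.onnx",
    "eval_model.onnx",
    "optimizer.onnx",
)


def missing_artifacts(artifact_dir, names=DEFAULT_ARTIFACTS):
    """Return the expected artifact names that do not exist in artifact_dir."""
    root = Path(artifact_dir)
    return [name for name in names if not (root / name).exists()]
```

Building paths as `Path("artifacts") / "training_model.onnx"` rather than `r"artifacts\training_model.onnx"` also keeps the script portable beyond Windows; the traceback shows `Module` passing its path arguments through `os.fspath`, so path objects should be accepted there.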
artifacts_training.zip
Urgency
very urgent
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.15
PyTorch Version
2.2.0+cpu
Execution Provider
Default CPU
Execution Provider Library Version
No response