[Training] [Windows] #19965

Positronx · 2024-03-18T15:07:17Z

Describe the issue

OS : Windows 10
Is there a way to generate training artifacts in C++ without having to use the Python utilities? I took a look at the source code and I think it is possible. I'm just having a hard time including the header files needed to generate the artifacts.

To reproduce

#include "orttraining/training_api/checkpoint.h"

I can't even compile an empty program that includes the header above, even though I linked the .lib files and included the required headers. The error says that the file 'onnx/onnx..pb.h' can't be opened.

Urgency

No response

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.0

PyTorch Version

2.2.0

Execution Provider

Default CPU

Execution Provider Library Version

No response


baijumeswani · 2024-03-18T19:03:41Z

> Is there a way to generate training artifacts in C++, without having to use python utilities? I took a look at the source code and I think that it is possible. I'm just having a hard time to link the necessary header files related to generating the artifacts.

Generating the training artifacts is currently not supported through C++ and requires using our Python utilities.

May I ask why you would like to generate the training artifacts from C++?

Positronx · 2024-03-18T20:03:32Z

I have a graphical framework written in C that loads functions compiled into DLL files. I want to be able to generate the artifacts directly from the graphical framework, without needing to call a third-party function from Python. Unfortunately, the framework can't communicate with Python, and C++ is the best I can do since it is the closest thing to C.
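To illustrate the constraint, here is a minimal sketch of the DLL boundary such a C host would need. It only shows the C-linkage wrapper pattern and is not an ONNX Runtime API: the names `generate_artifacts`, `GenerateArtifactsImpl`, `ARTIFACT_API` and the error codes are all hypothetical, and the implementation body is a placeholder for whatever C++ code ends up doing the real work.

```cpp
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

// Export with C linkage so a C host can resolve the symbol by its plain name.
#if defined(_WIN32)
#define ARTIFACT_API extern "C" __declspec(dllexport)
#else
#define ARTIFACT_API extern "C"
#endif

namespace {

// Hypothetical placeholder for the C++ code that would build the artifacts.
// It only validates its arguments here; the real work is not shown.
void GenerateArtifactsImpl(const std::string& model_path,
                           const std::string& out_dir) {
  if (model_path.empty()) throw std::invalid_argument("model_path is empty");
  if (out_dir.empty()) throw std::invalid_argument("out_dir is empty");
}

// Copy a message into a caller-owned buffer, always null-terminating it.
void CopyMessage(const char* msg, char* buf, std::size_t size) {
  if (buf == nullptr || size == 0) return;
  std::strncpy(buf, msg, size - 1);
  buf[size - 1] = '\0';
}

}  // namespace

// Returns 0 on success and a non-zero code on failure. C++ exceptions must not
// cross the C boundary, so they are converted to an error code and a message.
ARTIFACT_API int generate_artifacts(const char* model_path,
                                    const char* out_dir,
                                    char* err_buf,
                                    std::size_t err_buf_size) {
  if (model_path == nullptr || out_dir == nullptr) {
    CopyMessage("null argument", err_buf, err_buf_size);
    return 1;
  }
  try {
    GenerateArtifactsImpl(model_path, out_dir);
    return 0;
  } catch (const std::exception& e) {
    CopyMessage(e.what(), err_buf, err_buf_size);
    return 2;
  } catch (...) {
    CopyMessage("unknown error", err_buf, err_buf_size);
    return 3;
  }
}
```

The host would declare `int generate_artifacts(const char*, const char*, char*, size_t);` on the C side and look the symbol up after loading the DLL.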

eric-vision-e · 2024-03-19T16:28:23Z

Hi @Positronx

I'm looking for the same thing. Have you found a way to generate artifacts with C++?

Thanks

Positronx · 2024-03-21T10:39:28Z

Hello @eric-vision-e
I managed to generate a checkpoint file using the SaveCheckpoint function declared in orttraining/training_api/checkpoint.h. However, I'm struggling to use the OrtModuleGraphBuilder class defined in orttraining/core/framework/ortmodule_graph_builder.h to generate the gradient graph (which I assume corresponds to training_model.onnx).

eric-vision-e · 2024-03-21T11:10:57Z

Hi @Positronx,

Thanks for your answer. As far as I understand, though, the checkpoint only holds weight values. For example, suppose I have a classification model already deployed with 2 classes and for some reason I want to add another class. The only way is to export the model again in Python and then retrain it in C++ with new data (old classes + the new one), because in this case my model architecture has changed.

Am I correct? Have you understood how to handle this scenario using only C++?

Thanks

Positronx · 2024-03-21T17:05:58Z

Hi @eric-vision-e

The checkpoint is only for weight values (and other state such as optimizer momentum, but that doesn't concern me yet). If I understand your problem correctly, you want to change the training_model architecture without having to resort to Python. What I'm looking for aligns with that. As far as I can tell, this will require (partially) rewriting the Python libraries that call onnxruntime_pybind11_state. Unfortunately, the training model isn't as straightforward as the checkpoint file.

github-actions · 2024-04-21T15:00:51Z

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

