Request for Hidden States Access in Phi-3 with ONNX Runtime #20969
Comments
@ajliouat, you can use the ONNX API to modify the graph and add the hidden states as a graph output. @kunal-vaishnavi, is it possible to add an option to the model builder to output the hidden states?
You can generate an ONNX model that outputs the hidden states using ONNX Runtime GenAI's model builder with
If you already have the PyTorch model saved on disk:
If you do not have the PyTorch model saved on disk:
@kunal-vaishnavi, the option
Hello @kunal-vaishnavi, I followed the above steps but it doesn't work:
Expected Behaviour:
The option doesn't currently exist, but we can add it.
If you open the ONNX model saved to disk, you will have This can be fixed in ONNX Runtime GenAI so that both the
### Description

This PR adds support for outputting the last hidden state in addition to the logits in ONNX models. Users can run their models with ONNX Runtime GenAI and use the generator's `GetOutput` API to obtain the hidden states.

C/C++:

```cpp
std::unique_ptr<OgaTensor> embeddings = generator->GetOutput("hidden_states");
```

C#:

```csharp
using var embeddings = generator.GetOutput("hidden_states");
```

Java:

```java
Tensor embeddings = generator.getOutput("hidden_states");
```

Python:

```python
embeddings = generator.get_output("hidden_states")
```

### Motivation and Context

In SLMs and LLMs, the last hidden state represents a model's embeddings for a particular input before the language modeling head is applied. Generating embeddings for a model is a popular task. These embeddings can be used for many scenarios such as text classification, sequence labeling, information retrieval using [retrieval-augmented generation (RAG)](https://en.wikipedia.org/wiki/Retrieval-augmented_generation), and more.

This PR helps the following issues:

- microsoft/onnxruntime#20969
- #442
- #474
- #713
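As an illustration of the retrieval scenario mentioned above, the sketch below reduces per-token hidden states to a single embedding and compares two embeddings. It is an assumption-laden example, not part of the PR: mean pooling is only one common choice (last-token pooling is another), the helpers `mean_pool` and `cosine_similarity` are hypothetical, and the input is assumed to be a plain `seq_len x hidden_size` nested list obtained by converting the tensor returned by `get_output("hidden_states")`.

```python
import math
from typing import List, Sequence


def mean_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token vectors (seq_len x hidden_size) into one embedding."""
    if not hidden_states:
        raise ValueError("hidden_states must contain at least one token")
    num_tokens = len(hidden_states)
    # zip(*rows) walks the matrix column by column, one hidden dimension at a time.
    return [sum(column) / num_tokens for column in zip(*hidden_states)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

A retrieval pipeline would typically pool the hidden states of each document and of the query in the same way, then rank documents by their similarity to the query.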
The hidden states are now accessible via ONNX Runtime GenAI. You can create the ONNX model to output the hidden states using ONNX Runtime GenAI's model builder with
Describe the documentation issue
Hi,
I am using the Phi-3 LLM with ONNX Runtime and noticed that the API lacks a method to access the hidden states. Could you tell me how to access them, or whether an update will add this functionality?
Thank you for your help.
Best,
Abdeljalil
Page / URL
No response