Update API docs
natke committed May 20, 2024
1 parent e34a040 commit 876d8c4
Showing 3 changed files with 116 additions and 43 deletions.
82 changes: 55 additions & 27 deletions docs/genai/api/c.md
@@ -23,7 +23,7 @@ _Note: this API is in preview and is subject to change._

### Create model

Creates a model from the given configuration directory and device type.
Creates a model from the given directory. The directory should contain a file called `genai_config.json`, which corresponds to the [configuration specification](../reference/config.md).

#### Parameters
* Input: config_path. The path to the model configuration directory. The path is expected to be encoded in UTF-8.
@@ -224,6 +224,23 @@ Set a search option where the option is a bool.
OGA_EXPORT OgaResult* OGA_API_CALL OgaGeneratorParamsSetSearchBool(OgaGeneratorParams* generator_params, const char* name, bool value);
```
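
A minimal usage sketch, assuming a `generator_params` object has already been created; `"do_sample"` is used here only as an illustrative option name, and `OgaDestroyResult` is assumed from the rest of the C API.

```c
// Sketch: turn sampling on for an existing OgaGeneratorParams instance.
// A non-NULL OgaResult indicates failure; its message can be read as described
// under "Get error message" below, and the result must then be released.
OgaResult* result = OgaGeneratorParamsSetSearchBool(generator_params, "do_sample", true);
if (result != NULL) {
  OgaDestroyResult(result);  // assumed cleanup call for OgaResult objects
}
```
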
### Try graph capture with max batch size

Graph capture fixes the dynamic elements of the computation graph to constant values. It can provide more efficient execution in some environments. To execute in graph capture mode, the maximum batch size needs to be known ahead of time. This function can fail if there is not enough memory to allocate the specified maximum batch size.

#### Parameters

* generator_params: The generator params object to set the parameter on
* max_batch_size: The maximum batch size to allocate

#### Returns

`OgaResult` containing the error message if graph capture mode could not be configured with the specified batch size

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaGeneratorParamsTryGraphCaptureWithMaxBatchSize(OgaGeneratorParams* generator_params, int32_t max_batch_size);
```
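
As an illustration only, a hedged sketch of requesting graph capture before the generator is created; the error-handling helpers are assumed from the rest of the C API.

```c
// Sketch: ask for graph capture with room for batches of up to 16 sequences.
// If the request fails (for example, not enough memory), the call returns a
// non-NULL OgaResult and generation can continue without graph capture.
OgaResult* result = OgaGeneratorParamsTryGraphCaptureWithMaxBatchSize(generator_params, 16);
if (result != NULL) {
  OgaDestroyResult(result);  // assumed cleanup call; see "Get error message"
}
```
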

### Set inputs

Sets the input ids for the generator params. The input ids are used to seed the generation.
@@ -255,12 +272,30 @@ Sets the input id sequences for the generator params. The input id sequences are
#### Returns
OgaResult containing the error message if the setting of the input id sequences failed.
```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaGeneratorParamsSetInputSequences(OgaGeneratorParams* generator_params, const OgaSequences* sequences);
```
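
A hedged sketch of seeding generation from a prompt; `OgaCreateTokenizer`, `OgaCreateSequences` and `OgaTokenizerEncode` are assumed from the tokenizer API, which is not part of this change, and error checks on those calls are omitted for brevity.

```c
// Sketch: tokenize a prompt and attach it to the generator params.
// model and generator_params are assumed to have been created earlier.
OgaTokenizer* tokenizer = NULL;
OgaSequences* sequences = NULL;
OgaCreateTokenizer(model, &tokenizer);   // assumed tokenizer API
OgaCreateSequences(&sequences);          // assumed sequences API
OgaTokenizerEncode(tokenizer, "Hello, world", sequences);

OgaResult* result = OgaGeneratorParamsSetInputSequences(generator_params, sequences);
if (result != NULL) {
  OgaDestroyResult(result);  // setting the input sequences failed
}
```
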

### Set model input

Set an additional model input, aside from the input_ids. For example, additional inputs for LoRA adapters.

#### Parameters

* generator_params: The generator params to set the input on
* name: the name of the parameter to set
* tensor: the value of the parameter

#### Returns

OgaResult containing the error message if the setting of the input failed.

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaGeneratorParamsSetWhisperInputFeatures(OgaGeneratorParams*, OgaTensor* tensor);
```
## Generator API
@@ -330,7 +365,7 @@ OGA_EXPORT OgaResult* OGA_API_CALL OgaGenerator_ComputeLogits(OgaGenerator* gene
### Generate next token
Generates the next token based on the computed logits using the greedy search.
Generates the next token based on the computed logits using the configured generation parameters.
#### Parameters
@@ -341,32 +376,13 @@ Generates the next token based on the computed logits using the greedy search.
OgaResult containing the error message if the generation of the next token failed.
```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaGenerator_GenerateNextToken_Top(OgaGenerator* generator);
```

### Generate next token with Top K sampling

#### Parameters

#### Returns

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaGenerator_GenerateNextToken_TopK(OgaGenerator* generator, int k, float t);
OGA_EXPORT OgaResult* OGA_API_CALL OgaGenerator_GenerateNextToken(OgaGenerator* generator);
```

### Generate next token with Top P sampling
#### Parameters
#### Returns
```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaGenerator_GenerateNextToken_TopP(OgaGenerator* generator, float p, float t);
```

### Get number of tokens

Returns the number of tokens in the sequence at the given index.

#### Parameters

@@ -378,12 +394,12 @@ OGA_EXPORT OgaResult* OGA_API_CALL OgaGenerator_GenerateNextToken_TopP(OgaGenera
The number of tokens in the sequence at the given index.

```c
OGA_EXPORT size_t OGA_API_CALL OgaGenerator_GetSequenceLength(const OgaGenerator* generator, size_t index);
OGA_EXPORT size_t OGA_API_CALL OgaGenerator_GetSequenceCount(const OgaGenerator* generator, size_t index);
```
### Get sequence
Returns a pointer to the sequence data at the given index. The number of tokens in the sequence is given by OgaGenerator_GetSequenceLength.
Returns a pointer to the sequence data at the given index. The number of tokens in the sequence is given by `OgaGenerator_GetSequenceCount`.
#### Parameters
@@ -395,7 +411,7 @@ Returns a pointer to the sequence data at the given index. The number of token
A pointer to the token sequence
```c
OGA_EXPORT const int32_t* OGA_API_CALL OgaGenerator_GetSequence(const OgaGenerator* generator, size_t index);
OGA_EXPORT const int32_t* OGA_API_CALL OgaGenerator_GetSequenceData(const OgaGenerator* generator, size_t index);
```
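
Putting the calls in this section together, a hedged end-to-end sketch of the decode loop; `OgaCreateGenerator`, `OgaGenerator_IsDone` and `OgaDestroyGenerator` are assumed from the parts of the generator API not shown in this change, and error handling is omitted for brevity.

```c
// Sketch: run generation to completion, then read back the first sequence.
// model and generator_params are assumed to be fully configured.
OgaGenerator* generator = NULL;
OgaCreateGenerator(model, generator_params, &generator);  // assumed creation call

while (!OgaGenerator_IsDone(generator)) {                  // assumed completion check
  OgaGenerator_ComputeLogits(generator);                   // run one model iteration
  OgaGenerator_GenerateNextToken(generator);               // pick the next token
}

size_t length = OgaGenerator_GetSequenceCount(generator, 0);
const int32_t* tokens = OgaGenerator_GetSequenceData(generator, 0);
// tokens now points at length generated token ids for batch entry 0; decode or
// copy them before the generator is destroyed.

OgaDestroyGenerator(generator);                            // assumed cleanup call
```
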

## Enums and structs
@@ -419,6 +435,18 @@ typedef struct OgaBuffer OgaBuffer;

## Utility functions

### Set the GPU device ID

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaSetCurrentGpuDeviceId(int device_id);
```

### Get the GPU device ID

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaGetCurrentGpuDeviceId(int* device_id);
```
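
A small sketch of the device id helpers; both return an `OgaResult*` that is non-NULL on failure, and `OgaDestroyResult` is assumed from the rest of the C API.

```c
// Sketch: pin execution to GPU 0, then read the setting back.
int device_id = -1;
OgaResult* result = OgaSetCurrentGpuDeviceId(0);
if (result != NULL) {
  OgaDestroyResult(result);  // e.g. no GPU is available; handle as appropriate
}
result = OgaGetCurrentGpuDeviceId(&device_id);
if (result != NULL) {
  OgaDestroyResult(result);
} else {
  // device_id now holds the GPU currently used for execution.
}
```
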

### Get error message

#### Parameters
18 changes: 16 additions & 2 deletions docs/genai/api/csharp.md
@@ -98,6 +98,12 @@ public void SetSearchOption(string searchOption, double value)
public void SetSearchOption(string searchOption, bool value)
```

### Try graph capture with max batch size

```csharp
public void TryGraphCaptureWithMaxBatchSize(int maxBatchSize)
```

### Set input ids method

```csharp
@@ -110,8 +116,11 @@ public void SetInputIDs(ReadOnlySpan<int> inputIDs, ulong sequenceLength, ulong
public void SetInputSequences(Sequences sequences)
```

### Set model inputs


```csharp
public void SetModelInput(string name, Tensor value)
```


## Generator class
@@ -137,9 +146,14 @@ public void ComputeLogits()
### Generate next token method

```csharp
public void GenerateNextTokenTop()
public void GenerateNextToken()
```

### Get sequence

```csharp
public ReadOnlySpan<int> GetSequence(ulong index)
```

## Sequences class

59 changes: 45 additions & 14 deletions docs/genai/api/python.md
@@ -30,7 +30,7 @@ import onnxruntime_genai

## Model class

### Load the model
### Load a model

Loads the ONNX model(s) and configuration from a folder on disk.

@@ -59,22 +59,14 @@ onnxruntime_genai.Model.generate(params: GeneratorParams) -> numpy.ndarray[int,

`numpy.ndarray[int, int]`: a two dimensional numpy array with dimensions equal to the size of the batch passed in and the maximum length of the sequence of tokens.

### Device type

## GeneratorParams class

### Create GeneratorParams object
Return the device type that the model has been configured to run on.

```python
onnxruntime_genai.GeneratorParams(model: onnxruntime_genai.Model) -> onnxruntime_genai.GeneratorParams
onnxruntime_genai.Model.device_type
```

#### Parameters

- `model`: (required) The model that was loaded by onnxruntime_genai.Model()

#### Returns

`onnxruntime_genai.GeneratorParams`: The GeneratorParams object

## Tokenizer class

@@ -193,18 +185,49 @@ onnxruntime_genai.TokenizerStream.decode(token: int32) -> str
onnxruntime_genai.GeneratorParams(model: Model) -> GeneratorParams
```

### Input_ids member
### Pad token id member

```python
onnxruntime_genai.GeneratorParams.input_ids = numpy.ndarray[numpy.int32, numpy.int32]
onnxruntime_genai.GeneratorParams.pad_token_id
```

### EOS token id member

```python
onnxruntime_genai.GeneratorParams.eos_token_id
```

### vocab size member

```python
onnxruntime_genai.GeneratorParams.vocab_size
```

### input_ids member

```python
onnxruntime_genai.GeneratorParams.input_ids: numpy.ndarray[numpy.int32, numpy.int32]
```

### Set model input

```python
onnxruntime_genai.GeneratorParams.set_model_input(name: str, value: [])
```


### Set search options method

```python
onnxruntime_genai.GeneratorParams.set_search_options(options: dict[str, Any])
```

### Try graph capture with max batch size

```python
onnxruntime_genai.GeneratorParams.try_graph_capture_with_max_batch_size(max_batch_size: int)
```

## Generator class

### Create a Generator
@@ -242,6 +265,14 @@ Runs the model through one iteration.
onnxruntime_genai.Generator.compute_logits()
```

### Get output

Returns the output logits of the model.

```python
onnxruntime_genai.Generator.get_output()
```

### Generate next token

Using the current set of logits and the specified generator parameters, calculates the next batch of tokens, using Top P sampling.
