microsoft · fionabos · Jul 22, 2024 · Jul 30, 2024 · Jul 30, 2024 · Aug 2, 2024
diff --git a/docs/genai/api/java.md b/docs/genai/api/java.md
@@ -206,6 +206,97 @@ The GeneratorParams instance.
 GeneratorParams params = generator.createGeneratorParams("What's 6 times 7?");
 ```
 
+## MultiModalProcessor Class
+The MultiModalProcessor class converts text and images into a NamedTensors object that can be passed to a Generator class instance.
+
+### processImages Method
+
+Processes text and image (optional) into a NamedTensors object.
+
+```java
+public NamedTensors processImages(String prompt, Images images) throws GenAIException
+```
+
+#### Parameters
+
+- `prompt`: text input formatted according to model specifications.
+- `images`: optional image input. Pass `null` if no image input.
+
+#### Throws
+
+`GenAIException`- if the call to the GenAI native API fails.
+
+#### Returns
+
+A NamedTensors object.
+
+#### Example
+Example using the [Phi-3 Vision](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct-onnx-cpu/tree/main/cpu-int4-rtn-block-32-acc-level-4) model.
+```java
+Images images = null;
+if (inputImage != null) {
+  images = inputImage.getImages();
+}
+
+NamedTensors inputTensors = multiModalProcessor.processImages(promptQuestion_formatted, images);
+```
+
+### Decode Method
+
+Decodes a sequence of token ids into text.
+
+```java
+public String decode(int[] sequence) throws GenAIException
+```
+
+#### Parameters
+
+- `sequence`: collection of token ids to decode to text.
+
+#### Throws
+
+`GenAIException`- if the call to the GenAI native API fails.
+
+#### Returns
+
+The text representation of the sequence.
+
+#### Example
+
+```java
+String result = multiModalProcessor.decode(output_ids);
+```
+
+### createStream Method
+
+Creates a TokenizerStream object for streaming tokenization. This is used with the Generator class to provide each token as it is generated.
+
+```java
+public TokenizerStream createStream() throws GenAIException
+```
+
+#### Throws
+
+`GenAIException`- if the call to the GenAI native API fails.
+
+#### Returns
+
+The new TokenizerStream instance.
+
+## Images Class
+The Images class loads images from files to be used in the MultiModalProcessor class.
+### Constructor
+
+```java
+public Images(String imagesPath) throws GenAIException
+```
+
+#### Parameters
+- `imagesPath`: path to the input image.
+
+#### Throws
+`GenAIException`- if the call to the GenAI native API fails.
+
 ## Tokenizer class
 
 ### Encode Method
@@ -244,7 +335,7 @@ public String decode(int[] sequence) throws GenAIException
 
 #### Parameters
 
-- `sequence`: collection of token ids to decode to text
+- `sequence`: collection of token ids to decode to text.
 
 #### Throws
 
@@ -344,7 +435,7 @@ public String decode(int token) throws GenAIException
 
 #### Throws
 
-`GenAIException`
+`GenAIException`- if the call to the GenAI native API fails.
 
 ## Tensor Class
 
@@ -362,7 +453,7 @@ public Tensor(ByteBuffer data, long[] shape, ElementType elementType) throws Gen
 
 #### Throws
 
-`GenAIException`
+`GenAIException`- if the call to the GenAI native API fails.
 
 #### Example
 
@@ -377,6 +468,18 @@ floatBuffer.put(new float[] {1.0f, 2.0f, 3.0f, 4.0f});
 Tensor tensor = new Tensor(data, shape, Tensor.ElementType.float32);
 ```
 
+## NamedTensors Class
+The NamedTensors class holds input from the MultiModalProcessor class.
+### Constructor
+
+```java
+public NamedTensors(long handle)
+```
+
+#### Parameters
+
+- `handle`: handle of NamedTensors.
+
 ## GeneratorParams class
 
 The `GeneratorParams` class represents the parameters used for generating sequences with a model. Set the prompt using setInput, and any other search options using setSearchOption.
@@ -387,15 +490,15 @@ The `GeneratorParams` class represents the parameters used for generating sequen
 GeneratorParams params = new GeneratorParams(model);
 ```
 
-### setSearchOption Method
+### setSearchOption Method (double)
 
 ```java
 public void setSearchOption(String optionName, double value) throws GenAIException
 ```
 
 #### Throws
 
-`GenAIException`
+`GenAIException`- if the call to the GenAI native API fails.
 
 #### Example
 
@@ -405,15 +508,15 @@ Set search option to limit the model generation length.
 generatorParams.setSearchOption("max_length", 10);
 ```
 
-### setSearchOption Method
+### setSearchOption Method (boolean)
 
 ```java
 public void setSearchOption(String optionName, boolean value) throws GenAIException
 ```
 
 #### Throws
 
-`GenAIException`
+`GenAIException`- if the call to the GenAI native API fails.
 
 #### Example
 
@@ -440,18 +543,17 @@ public void setInput(Sequences sequences) throws GenAIException
 generatorParams.setInput(encodedPrompt);
 ```
 
-### setInput Method
+### setInput Method (Token IDs)
 
 Sets the prompt/s token ids for model execution. The `tokenIds` are the encoded parameters.
 
 ```java
-public void setInput(int[] tokenIds, int sequenceLength, int batchSize)
- throws GenAIException
+public void setInput(int[] tokenIds, int sequenceLength, int batchSize) throws GenAIException
 ```
 
 #### Parameters
 
-- `tokenIds`: the token ids of the encoded prompt/s
+- `tokenIds`: the token ids of the encoded prompt/s.
 - `sequenceLength`: the length of each sequence.
 - `batchSize`: size of the batch. 
 
@@ -467,21 +569,64 @@ NOTE: all sequences in the batch must be the same length.
 generatorParams.setInput(tokenIds, sequenceLength, batchSize);
 ```
 
+### setInput Method (Tensor)
+Add a Tensor as a model input.
+
+```java
+public void setInput(String name, Tensor tensor) throws GenAIException
+```
+#### Parameters
+
+- `name`: name of the model input the tensor will provide.
+- `tensor`: tensor to add.
+
+#### Throws
+
+`GenAIException`- if the call to the GenAI native API fails. 
+
+#### Example
+
+```java
+generatorParams.setInput(name, tensor);
+```
+
+### setInput Method (NamedTensors)
+Add a NamedTensors as a model input.
+
+```java
+public void setInput(NamedTensors namedTensors) throws GenAIException
+```
+#### Parameters
+
+- `namedTensors`: NamedTensors to add.
+
+#### Throws
+
+`GenAIException`- if the call to the GenAI native API fails. 
+
+#### Example
+
+```java
+NamedTensors inputTensors = multiModalProcessor.processImages(promptQuestion_formatted, images);
+generatorParams.setInput(inputTensors);
+```
+
 ## Generator class
 
 The Generator class generates output using a model and generator parameters.
-The expected usage is to loop until isDone returns false. Within the loop, call computeLogits followed by generateNextToken.
 
-The newly generated token can be retrieved with getLastTokenInSequence and decoded with TokenizerStream.Decode.
+The expected usage is to loop until isDone() returns true. Within the loop, call computeLogits() followed by generateNextToken().
+
+The newly generated token can be retrieved with getLastTokenInSequence() and decoded with TokenizerStream.decode().
 
-After the generation process is done, GetSequence can be used to retrieve the complete generated sequence if needed.
+After the generation process is done, getSequence() can be used to retrieve the complete generated sequence if needed.
 
 ### Create a Generator
 
 Constructs a Generator object with the given model and generator parameters.
 
 ```java
-Generator(Model model, GeneratorParams generatorParams)
+public Generator(Model model, GeneratorParams generatorParams) throws GenAIException
 ```
 
 #### Parameters

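The generation loop described for the Generator class above can be sketched without the native library. This is a minimal illustration, not the library's implementation: `TokenSource` and `FixedTokenSource` are hypothetical stand-ins that only mirror the method names mentioned in the docs (`isDone`, `computeLogits`, `generateNextToken`, `getLastTokenInSequence`), and the `sequenceIndex` parameter is an assumption. The point is the loop condition: keep going while `isDone()` is false.

```java
import java.util.ArrayList;
import java.util.List;

public class GenerationLoopSketch {
    // Hypothetical stand-in that mirrors the Generator method names from the docs.
    interface TokenSource {
        boolean isDone();
        void computeLogits();
        void generateNextToken();
        int getLastTokenInSequence(long sequenceIndex); // parameter is an assumption
    }

    // Fake source that hands out a fixed list of token ids, then reports done.
    static class FixedTokenSource implements TokenSource {
        private final int[] tokens;
        private int produced = 0;

        FixedTokenSource(int... tokens) {
            this.tokens = tokens;
        }

        public boolean isDone() {
            return produced >= tokens.length;
        }

        public void computeLogits() {
            // No-op in this stub; the real call runs the model.
        }

        public void generateNextToken() {
            produced++;
        }

        public int getLastTokenInSequence(long sequenceIndex) {
            return tokens[produced - 1];
        }
    }

    // Loop while isDone() is false, collecting each newly generated token.
    static List<Integer> collectTokens(TokenSource source) {
        List<Integer> out = new ArrayList<>();
        while (!source.isDone()) {
            source.computeLogits();
            source.generateNextToken();
            out.add(source.getLastTokenInSequence(0));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectTokens(new FixedTokenSource(11, 22, 33)));
    }
}
```

With the real API, each collected token would typically be passed to a TokenizerStream to obtain text as it is generated.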
diff --git a/docs/genai/tutorials/phi3-android.md b/docs/genai/tutorials/phi3-android.md
@@ -0,0 +1,118 @@
+---
+title: Phi-3 for Android
+description: Develop an Android generative AI application with ONNX Runtime
+has_children: false
+parent: Tutorials
+grand_parent: Generate API (Preview)
+nav_order: 1
+---
+
+# Build an Android generative AI application
+This is a [Phi-3](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) Android example application using [ONNX Runtime mobile](https://onnxruntime.ai/docs/tutorials/mobile/) and [ONNX Runtime Generate() API](https://github.com/microsoft/onnxruntime-genai) with support for efficiently running generative AI models. This tutorial will walk you through how to build and run the Phi-3 app on your own mobile device so you can get started incorporating Phi-3 into your own mobile developments.  
+
+## Model Capabilities
+[Phi-3 Mini-4k-Instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) is a small language model for language understanding, math, code, long context, and logical reasoning. It shows robust, state-of-the-art performance among models with fewer than 13 billion parameters.
+
+## Important Features
+
+### Java API
+This app uses the GenAIException, Generator, GeneratorParams, Model, and TokenizerStream classes from the [generate() Java API](https://github.com/microsoft/onnxruntime-genai/tree/main/src/java/src/main/java/ai/onnxruntime/genai) ([documentation](https://onnxruntime.ai/docs/genai/api/java.html)). The [generate() C API](https://onnxruntime.ai/docs/genai/api/c.html), [generate() C# API](https://onnxruntime.ai/docs/genai/api/csharp.html), and [generate() Python API](https://onnxruntime.ai/docs/genai/api/python.html) are also available.
+
+### Model Download
+This app downloads the [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) model from Hugging Face. To use a different model, such as the [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/tree/main), change the path links to refer to your chosen model. If using a model with imaging capabilities, use the [MultiModalProcessor class]() in place of the [Tokenizer class]() and update the prompt template accordingly.
+```java
+final String baseUrl = "https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx/resolve/main/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/";
+List<String> files = Arrays.asList(
+    "added_tokens.json",
+    "config.json",
+    "configuration_phi3.py",
+    "genai_config.json",
+    "phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx",
+    "phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx.data",
+    "special_tokens_map.json",
+    "tokenizer.json",
+    "tokenizer.model",
+    "tokenizer_config.json");
+```
+The model files only need to be downloaded once. When you rebuild and rerun the app, the download step is skipped because the files already exist.
+```java
+if (urlFilePairs.isEmpty()) {
+    // Display a message using Toast
+    Toast.makeText(this, "All files already exist. Skipping download.", Toast.LENGTH_SHORT).show();
+    Log.d(TAG, "All files already exist. Skipping download.");
+    model = new Model(getFilesDir().getPath());
+    tokenizer = model.createTokenizer();
+    return;
+}
+```
+### Status while model is downloading
+Downloading the packages for the app on your mobile device takes ~10-15 minutes depending on which device you are using. The progress bar indicates what percent of the downloads are completed. 
+```java
+public void onProgress(long lastBytesRead, long bytesRead, long bytesTotal) {
+    long lastPctDone = 100 * lastBytesRead / bytesTotal;
+    long pctDone = 100 * bytesRead / bytesTotal;
+    if (pctDone > lastPctDone) {
+        Log.d(TAG, "Downloading files: " + pctDone + "%");
+        runOnUiThread(() -> {
+            progressText.setText("Downloading: " + pctDone + "%");
+        });
+    }
+}
+```
+Because the app is initialized when downloads start, the 'send' button for prompts is disabled until downloads are complete to prevent crashing.
+```java
+if (model == null) {
+    // If the model has not finished loading, show a toast and ignore the prompt.
+    Toast.makeText(MainActivity.this, "Model not loaded yet, please wait...", Toast.LENGTH_SHORT).show();
+    return;
+}
+```
+
+### Prompt Template
+On its own, this model's answers can be very long. To format the AI assistant's answers, you can adjust the prompt template. 
+```java
+String promptQuestion = userMsgEdt.getText().toString();
+String promptQuestion_formatted = "<|system|>You are a helpful AI assistant. Answer in two paragraphs or less<|end|><|user|>"+promptQuestion+"<|end|>\n<|assistant|>";
+Log.i("GenAI: prompt question", promptQuestion_formatted);
+```
+You can also set search options such as `max_length` or `length_penalty` to limit the response.
+```java
+generatorParams.setSearchOption("length_penalty", 1000);
+generatorParams.setSearchOption("max_length", 500);
+```
+NOTE: Setting `max_length` cuts off the assistant's answer once it reaches the maximum number of tokens. It does not make the model shape a complete response within that limit.
+
+## Run the App
+
+### Download Android Studio
+You will be using [Android Studio](https://developer.android.com/studio) to run the app.
+
+### Download the App
+Clone the [ONNX Runtime Inference Examples](https://github.com/microsoft/onnxruntime-inference-examples/tree/c29d8edd6d010a2649d69f84f54539f1062d776d) repository.
+
+### Enable Developer Mode on Mobile
+On your Android Mobile device, go to "Settings > About Phone > Software information" and tap the "Build Number" tile repeatedly until you see the message “You are now in developer mode”. In "Developer Options", turn on Wireless or USB debugging.
+
+### Open Project in Android Studio
+Open the Phi-3 mobile app in Android Studio (onnxruntime-inference-examples/mobile/examples/phi-3/android/app).
+
+### Connect Device
+To run the app on a device, follow the instructions from the Running Devices tab on the right side panel. You can connect through Wi-Fi or USB.
+![Running Devices Instructions](../../../images/phi3_MobileTutorial_RunDevice.png)
+#### Pair over Wi-Fi
+![WiFi Instructions](../../../images/phi3_MobileTutorial_WiFi.png)
+
+### Manage Devices
+You can manage/change devices and device model through the Device Manager tab on the right side panel.
+![Device Manager](../../../images/phi3_MobileTutorial_DeviceManager.png)
+
+### Downloading the App
+Once your device is connected, run the app by using the play button on the top panel. Downloading all packages will take ~10-15 minutes. If you submit a prompt before downloads are complete, you will encounter an error message. Once completed, the logcat (the cat tab on the bottom left panel) will display an "All downloads complete" message.
+
+![Error Message](../../../images/phi3_MobileTutorial_Error.png)
+
+### Ask questions
+Now that the app is downloaded, you can start asking questions!
+![Example Prompt 1](../../../images/phi3_MobileTutorial_ex1.png)
+![Example Prompt 2](../../../images/phi3_MobileTutorial_ex2.png)
+![Example Prompt 3](../../../images/phi3_MobileTutorial_ex3.png)
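As a sketch of the Prompt Template step in the tutorial above, the string concatenation can be pulled into a small helper. `formatPrompt` is a hypothetical name, not part of the example app, and the tag layout (`<|system|>`, `<|user|>`, `<|assistant|>`, each turn closed by `<|end|>`, with newlines between turns) is assumed from the Phi-3 chat format; adjust it for other models.

```java
public class PromptTemplateSketch {
    // Hypothetical helper: wraps a system message and a user message in
    // Phi-3 style chat tags (assumed format) and leaves the assistant turn open.
    static String formatPrompt(String systemMessage, String userMessage) {
        return "<|system|>\n" + systemMessage + "<|end|>\n"
                + "<|user|>\n" + userMessage + "<|end|>\n"
                + "<|assistant|>\n";
    }

    public static void main(String[] args) {
        String prompt = formatPrompt(
                "You are a helpful AI assistant. Answer in two paragraphs or less.",
                "What's 6 times 7?");
        System.out.println(prompt);
    }
}
```

Keeping the template in one place makes it easier to change the system message or switch tag formats when the model changes.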
diff --git a/images/phi3_MobileTutorial_DeviceManager.png b/images/phi3_MobileTutorial_DeviceManager.png
diff --git a/images/phi3_MobileTutorial_Error.png b/images/phi3_MobileTutorial_Error.png
diff --git a/images/phi3_MobileTutorial_RunDevice.png b/images/phi3_MobileTutorial_RunDevice.png
diff --git a/images/phi3_MobileTutorial_WiFi.png b/images/phi3_MobileTutorial_WiFi.png
diff --git a/images/phi3_MobileTutorial_ex1.png b/images/phi3_MobileTutorial_ex1.png
diff --git a/images/phi3_MobileTutorial_ex2.png b/images/phi3_MobileTutorial_ex2.png
diff --git a/images/phi3_MobileTutorial_ex3.png b/images/phi3_MobileTutorial_ex3.png