Fixed many (not all) accessibility issues. (#22002)
Likely will need to change HLJS theme for the rest.

Should be good to instantly approve/merge, but feel free to review as
necessary. Not currently deploying a preview as another more significant
PR is being reviewed, but I can do it if requested :)
MaanavD authored Sep 10, 2024
1 parent 99ddaed commit e892a56
Showing 12 changed files with 77 additions and 105 deletions.
36 changes: 10 additions & 26 deletions docs/tutorials/on-device-training/android-app.md
@@ -7,15 +7,15 @@ nav_order: 1
 ---
 
 # On-Device Training: Building an Android Application
-
+{: .no_toc }
 In this tutorial, we will explore how to build an Android application that incorporates ONNX Runtime's On-Device Training solution. On-device training refers to the process of training a machine learning model directly on an edge device without relying on cloud services or external servers.
 
 Here is what the application will look like at the end of this tutorial:
 
-<img src="../../../images/on-device-training-application-prediction-tom.jpg" width="30%" height="30%">
+<img src="../../../images/on-device-training-application-prediction-tom.jpg" alt="an image classification app with Tom Cruise in the middle." width="30%" height="30%">
 
 ## Introduction
-
+{: .no_toc }
 We will guide you through the steps to create an Android app that can train a simple image classification model using on-device training techniques. This tutorial showcases the `transfer learning` technique where knowledge gained from training a model on one task is leveraged to improve the performance of a model on a different but related task. Instead of starting the learning process from scratch, transfer learning allows us to transfer the knowledge or features learned by a pre-trained model to a new task.
 
 For this tutorial, we will leverage the `MobileNetV2` model which has been trained on large-scale image datasets such as ImageNet (which has 1,000 classes). We will use this model for classifying custom data into one of four classes. The initial layers of MobileNetV2 serve as a feature extractor, capturing generic visual features applicable to various tasks, and only the final classifier layer will be trained for the task at hand.
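For readers skimming this diff: the offline phase of the tutorial above produces the training artifacts the Android app consumes. A minimal sketch of that step, assuming the `onnxruntime-training` Python package and `torchvision` are installed — the file names, class count, and initializer-name filter are illustrative, not taken from this commit:

```python
import onnx
import torch
import torchvision
from onnxruntime.training import artifacts

# Export an ImageNet-pretrained MobileNetV2 with a fresh 4-class head to ONNX.
model = torchvision.models.mobilenet_v2(
    weights=torchvision.models.MobileNet_V2_Weights.IMAGENET1K_V2
)
model.classifier[1] = torch.nn.Linear(model.last_channel, 4)
torch.onnx.export(
    model,
    torch.randn(1, 3, 224, 224),
    "mobilenetv2.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Mark only the classifier parameters as trainable; freeze the feature extractor.
onnx_model = onnx.load("mobilenetv2.onnx")
all_params = [init.name for init in onnx_model.graph.initializer]
requires_grad = [name for name in all_params if name.startswith("classifier")]
frozen_params = [name for name in all_params if name not in requires_grad]

# Generate the training, eval, and optimizer graphs plus the initial checkpoint.
artifacts.generate_artifacts(
    onnx_model,
    requires_grad=requires_grad,
    frozen_params=frozen_params,
    loss=artifacts.LossType.CrossEntropyLoss,
    optimizer=artifacts.OptimType.AdamW,
    artifact_directory="training_artifacts",
)
```

The app then bundles the contents of the artifact directory as assets.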
@@ -24,26 +24,10 @@ In this tutorial, we will use data to learn to:
 - Classify animals into one of four categories using a pre-packed animals dataset.
 - Classify celebrities into one of four categories using a custom celebrities dataset.
 
-## Contents
-
-- [Introduction](#introduction)
-- [Prerequisites](#prerequisites)
-- [Offline Phase - Building the training artifacts](#offline-phase---building-the-training-artifacts)
-  - [Export the model to ONNX](#op1)
-  - [Define the trainable and non trainable parameters](#op2)
-  - [Generate the training artifacts](#op3)
-- [Training Phase - Android application development](#training-phase---android-application-development)
-  - [Setting up the project in Android Studio](#tp1)
-  - [Adding the ONNX Runtime dependency](#tp2)
-  - [Packaging the Prebuilt Training Artifacts and Dataset](#tp3)
-  - [Interfacing with ONNX Runtime - C++ Code](#tp4)
-  - [Image Preprocessing](#tp5)
-  - [Application Frontend](#tp6)
-- [Training Phase - Running the application on a device](#training-phase---running-the-application-on-a-device)
-  - [Running the application on a device](#tp7)
-  - [Training with a pre-loaded dataset - Animals](#tp8)
-  - [Training with a custom dataset - Celebrities](#tp9)
-- [Conclusion](#conclusion)
+## Table of Contents
+* TOC placeholder
+{:toc}
+
 
 ## Prerequisites
 
@@ -791,7 +775,7 @@ To follow this tutorial, you should have a basic understanding of Android app development
 
 b. Launching the application on the device should look like this:
 
-<img src="../../../images/on-device-training-application-landing-page.jpg" width="30%" height="30%">
+<img src="../../../images/on-device-training-application-landing-page.jpg" alt="Barebones ORT Personalize app" width="30%" height="30%">
 
 2. <a name="tp8"></a>Training with a pre-loaded dataset - Animals
 
@@ -805,7 +789,7 @@
 
 e. Use any animal image from your library for inferencing now.
 
-<img src="../../../images/on-device-training-application-prediction-cow.jpg" width="30%" height="30%">
+<img src="../../../images/on-device-training-application-prediction-cow.jpg" alt="ORT Personalize app with an image of a cow" width="30%" height="30%">
 
 As can be seen from the image above, the model correctly predicted `Cow`.
 
@@ -825,7 +809,7 @@
 
 g. That's it! Hopefully the application classified the image correctly.
 
-<img src="../../../images/on-device-training-application-prediction-tom.jpg" width="30%" height="30%">
+<img src="../../../images/on-device-training-application-prediction-tom.jpg" alt="an image classification app with Tom Cruise in the middle." width="30%" height="30%">
 
 
 ## Conclusion
33 changes: 10 additions & 23 deletions docs/tutorials/on-device-training/ios-app.md
@@ -7,7 +7,7 @@ nav_order: 2
 ---
 
 # Building an iOS Application
-
+{: .no_toc }
 In this tutorial, we will explore how to build an iOS application that incorporates ONNX Runtime's On-Device Training solution. On-device training refers to the process of training a machine learning model directly on an edge device without relying on cloud services or external servers.
 
 In this tutorial, we will build a simple speaker identification app that learns to identify a speaker's voice. We will take a look at how to train a model on-device, export the trained model, and use the trained model to perform inference.
@@ -18,6 +18,7 @@ Here is what the application will look like:
 <img src="../../../images/iOS_speaker_identification_app.png" alt="application demo, with buttons for voice, train, and infer." width="30%" height="30%">
 
 ## Introduction
+{: .no_toc }
 We will guide you through the process of building an iOS application that can train a simple audio classification model using on-device training techniques. The tutorial showcases the `transfer learning` technique where knowledge gained from training a model on one task is leveraged to improve the performance of a model on a different but related task. Instead of starting the learning process from scratch, transfer learning allows us to transfer the knowledge or features learned by a pre-trained model to a new task.
 
 In this tutorial, we will leverage the [`wav2vec`](https://huggingface.co/superb/wav2vec2-base-superb-sid) model which has been trained on large-scale celebrity speech data such as `VoxCeleb1`. We will use the pre-trained model to extract features from the audio data and train a binary classifier to identify the speaker. The initial layers of the model serve as a feature extractor, capturing the important features of the audio data. Only the last layer of the model is trained to perform the classification task.
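The Swift training loop this tutorial builds follows the same pattern as ONNX Runtime's Python training API. A rough illustration against artifacts like those generated above — the paths, feature shape, and epoch count are placeholders, not values from this commit:

```python
import numpy as np
from onnxruntime.training.api import CheckpointState, Module, Optimizer

# Load the checkpoint and the training/eval/optimizer graphs produced offline.
state = CheckpointState.load_checkpoint("training_artifacts/checkpoint")
module = Module(
    "training_artifacts/training_model.onnx",
    state,
    "training_artifacts/eval_model.onnx",
)
optimizer = Optimizer("training_artifacts/optimizer_model.onnx", module)

module.train()
for epoch in range(5):
    # Placeholder batch: four feature vectors and their integer class labels.
    features = np.random.randn(4, 384).astype(np.float32)
    labels = np.array([0, 1, 0, 1], dtype=np.int64)
    loss = module(features, labels)   # forward + backward pass
    optimizer.step()                  # apply the gradients
    module.lazy_reset_grad()          # clear gradients for the next step

# Bake the trained weights into an inference-only model.
module.export_model_for_inferencing("inference.onnx", ["output"])
```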
@@ -29,23 +30,9 @@ In the tutorial, we will:
 - Use the exported model to perform inference
 
 
-## Contents
-- [Building an iOS Application](#building-an-ios-application)
-  - [Introduction](#introduction)
-  - [Contents](#contents)
-  - [Prerequisites](#prerequisites)
-  - [Generating the training artifacts](#generating-the-training-artifacts)
-  - [Building the iOS application](#building-the-ios-application)
-    - [Xcode Setup](#xcode-setup)
-    - [Application Overview](#application-overview)
-    - [Training the model](#training-the-model)
-    - [Inference with the trained model](#inference-with-the-trained-model)
-    - [Recording Audio](#recording-audio)
-    - [Train View](#train-view)
-    - [Infer View](#infer-view)
-    - [ContentView](#contentview)
-  - [Running the iOS application](#running-the-ios-application)
-  - [Conclusion](#conclusion)
+## Table of Contents
+* TOC placeholder
+{:toc}
 
 
 ## Prerequisites
@@ -947,27 +934,27 @@ Now, we are ready to run the application. You can run the application on the simulator or on a device.
 
 a. Now, when you run the application, you should see the following screen:
 
-<img src="../../../images/iOS_speaker_identification_app.png" width="30%" height="30%">
+<img src="../../../images/iOS_speaker_identification_app.png" alt="My Voice application with Train and Infer buttons" width="30%" height="30%">
 
 
 b. Next, click on the `Train` button to navigate to the `TrainView`. The `TrainView` will prompt you to record your voice. You will need to record your voice `kNumRecordings` times.
 
-<img src="../../../images/iOS_speaker_identification_training_screen.jpg" width="30%" height="30%">
+<img src="../../../images/iOS_speaker_identification_training_screen.jpg" alt="My Voice application with words to record" width="30%" height="30%">
 
 
 c. Once all the recordings are complete, the application will train the model on the given data. You will see the progress bar indicating the progress of the training.
 
-<img src="../../../images/iOS_speaker_identification_training_progress_screen.jpg" width="30%" height="30%">
+<img src="../../../images/iOS_speaker_identification_training_progress_screen.jpg" alt="Loading bar while the app is training" width="30%" height="30%">
 
 
 d. Once the training is complete, you will see the following screen:
 
-<img src="../../../images/iOS_speaker_identification_training_complete_screen.jpg" width="30%" height="30%">
+<img src="../../../images/iOS_speaker_identification_training_complete_screen.jpg" alt="The app informs you training finished successfully!" width="30%" height="30%">
 
 
 e. Now, click on the `Infer` button to navigate to the `InferView`. The `InferView` will prompt you to record your voice. Once the recording is complete, it will perform inference with the trained model and display the result of the inference.
 
-<img src="../../../images/iOS_speaker_identification_infer_screen.jpg" width="30%" height="30%">
+<img src="../../../images/iOS_speaker_identification_infer_screen.jpg" alt="My Voice application allows you to record and infer whether it's you or not." width="30%" height="30%">
 
 
 That's it! Hopefully, it identified your voice correctly.
32 changes: 16 additions & 16 deletions src/routes/blogs/pytorch-on-the-edge/+page.svelte
@@ -179,9 +179,9 @@ fun run(audioTensor: OnnxTensor): Result {
 <div class="container mx-auto px-4 md:px-8 lg:px-48 pt-8">
 <h1 class="text-5xl pb-2">Run PyTorch models on the edge</h1>
 <p class="text-neutral">
-By: <a href="https://www.linkedin.com/in/natkershaw/" class="text-blue-700">Natalie Kershaw</a>
+By: <a href="https://www.linkedin.com/in/natkershaw/" class="dark:text-blue-300 text-blue-800 underline">Natalie Kershaw</a>
 and
-<a href="https://www.linkedin.com/in/prasanthpulavarthi/" class="text-blue-700"
+<a href="https://www.linkedin.com/in/prasanthpulavarthi/" class="dark:text-blue-300 text-blue-800 underline"
 >Prasanth Pulavarthi</a
 >
 </p>
@@ -217,12 +217,12 @@ fun run(audioTensor: OnnxTensor): Result {
 anywhere that is outside of the cloud, ranging from large, well-resourced personal computers
 to small footprint devices such as mobile phones. This has been a challenging task to
 accomplish in the past, but new advances in model optimization and software like
-<a href="https://onnxruntime.ai/pytorch" class="text-blue-700">ONNX Runtime</a>
+<a href="https://onnxruntime.ai/pytorch" class="dark:text-blue-300 text-blue-800 underline">ONNX Runtime</a>
 make it more feasible - even for new generative AI and large language models like Stable Diffusion,
 Whisper, and Llama2.
 </p>
 
-<h2 class="text-blue-700 text-3xl mb-4">Considerations for PyTorch models on the edge</h2>
+<h2 class="dark:text-blue-300 text-blue-800 underline text-3xl mb-4">Considerations for PyTorch models on the edge</h2>
 
 <p class="mb-4">
 There are several factors to keep in mind when thinking about running a PyTorch model on the
@@ -292,7 +292,7 @@ fun run(audioTensor: OnnxTensor): Result {
 </li>
 </ul>
 
-<h2 class="text-blue-700 text-3xl mb-4">Tools for PyTorch models on the edge</h2>
+<h2 class="dark:text-blue-300 text-blue-800 underline text-3xl mb-4">Tools for PyTorch models on the edge</h2>
 
 <p class="mb-4">
 We mentioned ONNX Runtime several times above. ONNX Runtime is a compact, standards-based
@@ -305,7 +305,7 @@
 format that doesn't require the PyTorch framework and its gigabytes of dependencies. PyTorch
 has thought about this and includes an API that enables exactly this - <a
 href="https://pytorch.org/docs/stable/onnx.html"
-class="text-blue-700">torch.onnx</a
+class="dark:text-blue-300 text-blue-800 underline">torch.onnx</a
 >. <a href="https://onnx.ai/">ONNX</a> is an open standard that defines the operators that make
 up models. The PyTorch ONNX APIs take the Pythonic PyTorch code and turn it into a functional
 graph that captures the operators that are needed to run the model without Python. As with everything
@@ -318,7 +318,7 @@
 The popular Hugging Face library also has APIs that build on top of this torch.onnx
 functionality to export models to the ONNX format. Over <a
 href="https://huggingface.co/blog/ort-accelerating-hf-models"
-class="text-blue-700">130,000 models</a
+class="dark:text-blue-300 text-blue-800 underline">130,000 models</a
 > are supported, making it very likely that the model you care about is one of them.
 </p>
 
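The two export paths the blog describes in these hunks — torch.onnx for arbitrary PyTorch modules, and Hugging Face's Optimum wrapper for hub models — look roughly like this sketch (the model names are illustrative):

```python
import torch

# torch.onnx: export any nn.Module to a self-contained ONNX graph.
class TinyClassifier(torch.nn.Module):
    def forward(self, x):
        return torch.softmax(x @ torch.ones(8, 2), dim=-1)

torch.onnx.export(TinyClassifier(), torch.randn(1, 8), "tiny.onnx")

# Optimum: exports a Hugging Face hub model to ONNX and runs it with
# ONNX Runtime behind the usual transformers-style API.
from optimum.onnxruntime import ORTModelForSequenceClassification

ort_model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english", export=True
)
```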
@@ -328,7 +328,7 @@
 and web browsers) via various languages (from C# to JavaScript to Swift).
 </p>
 
-<h2 class="text-blue-700 text-3xl mb-4">Examples of PyTorch models on the edge</h2>
+<h2 class="dark:text-blue-300 text-blue-800 underline text-3xl mb-4">Examples of PyTorch models on the edge</h2>
 
 <h3 class=" text-2xl mb-2">Stable Diffusion on Windows</h3>
 
@@ -345,15 +345,15 @@
 <p class="mb-4">
 You don't have to export the fifth model, ClipTokenizer, as it is available in <a
 href="https://onnxruntime.ai/docs/extensions"
-class="text-blue-700">ONNX Runtime extensions</a
+class="dark:text-blue-300 text-blue-800 underline">ONNX Runtime extensions</a
 >, a library for pre and post processing PyTorch models.
 </p>
 
 <p class="mb-4">
 To run this pipeline of models as a .NET application, we build the pipeline code in C#. This
 code can be run on CPU, GPU, or NPU, if they are available on your machine, using ONNX
 Runtime's device-specific hardware accelerators. This is configured with the <code
-class="bg-gray-200 p-1 rounded">ExecutionProviderTarget</code
+class="bg-gray-200 dark:bg-gray-700 p-1 rounded">ExecutionProviderTarget</code
 > below.
 </p>
 <Highlight language={csharp} code={dotnetcode} />
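For readers more familiar with Python than C#: the `ExecutionProviderTarget` setting mentioned above corresponds to ONNX Runtime's providers list, which names accelerators in fallback order. A sketch — which providers are available depends on the installed ORT build:

```python
import onnxruntime as ort

# Request accelerators in priority order; ORT falls back down the list.
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # the providers actually in use
```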
@@ -366,15 +366,15 @@
 <p class="mb-4">
 You can build the application and run it on Windows with the detailed steps shown in this <a
 href="https://onnxruntime.ai/docs/tutorials/csharp/stable-diffusion-csharp.html"
-class="text-blue-700">tutorial</a
+class="dark:text-blue-300 text-blue-800 underline">tutorial</a
 >.
 </p>
 
 <h3 class=" text-2xl mb-2">Text generation in the browser</h3>
 
 <p class="mb-4">
 Running a PyTorch model locally in the browser is not only possible but super simple with
-the <a href="https://huggingface.co/docs/transformers.js/index" class="text-blue-700"
+the <a href="https://huggingface.co/docs/transformers.js/index" class="dark:text-blue-300 text-blue-800 underline"
 >transformers.js</a
 > library. Transformers.js uses ONNX Runtime Web as its backend. Many models are already converted
 to ONNX and served by the transformers.js CDN, making inference in the browser a matter of writing
@@ -407,7 +407,7 @@
 All components of the Whisper Tiny model (audio decoder, encoder, decoder, and text sequence
 generation) can be composed and exported to a single ONNX model using the <a
 href="https://github.com/microsoft/Olive/tree/main/examples/whisper"
-class="text-blue-700">Olive framework</a
+class="dark:text-blue-300 text-blue-800 underline">Olive framework</a
 >. To run this model as part of a mobile application, you can use ONNX Runtime Mobile, which
 supports Android, iOS, react-native, and MAUI/Xamarin.
 </p>
@@ -420,7 +420,7 @@
 <p class="mb-4">
 The relevant snippet of an example <a
 href="https://github.com/microsoft/onnxruntime-inference-examples/tree/main/mobile/examples/speech_recognition"
-class="text-blue-700">Android mobile app</a
+class="dark:text-blue-300 text-blue-800 underline">Android mobile app</a
 > that performs speech transcription on short samples of audio is shown below:
 </p>
 <Highlight language={kotlin} code={mobilecode} />
@@ -476,11 +476,11 @@
 <p class="mb-4">
 You can read the full <a
 href="https://onnxruntime.ai/docs/tutorials/on-device-training/ios-app.html"
-class="text-blue-700">Speaker Verification tutorial</a
+class="dark:text-blue-300 text-blue-800 underline">Speaker Verification tutorial</a
 >, and
 <a
 href="https://github.com/microsoft/onnxruntime-training-examples/tree/master/on_device_training/mobile/ios"
-class="text-blue-700">build and run the application from source</a
+class="dark:text-blue-300 text-blue-800 underline">build and run the application from source</a
 >.
 </p>
 
6 changes: 3 additions & 3 deletions src/routes/components/footer.svelte
@@ -9,7 +9,7 @@
 <footer class="footer p-10 mt-10 text-base-content z-40 border-top border-t">
 <div>
 <p>ONNX Runtime<br />Copyright © Microsoft. All rights reserved.</p>
-<span class="footer-title">Follow us at:</span>
+<span class="dark:text-blue-200 footer-title">Follow us at:</span>
 <div class="grid grid-flow-col gap-4">
 <a aria-label="youtube" href="https://www.youtube.com/onnxruntime" target="_blank"
 ><div class="w-8 h-8 pt-0.5 hover:text-primary"><FaYoutube /></div></a
@@ -24,12 +24,12 @@
 </div>
 <div />
 <div>
-<span class="footer-title text-bold ">Get Started</span>
+<span class="dark:text-blue-200 footer-title text-bold">Get Started</span>
 <a href={pathvar + '/getting-started'} class="link link-hover">Install</a>
 <a href={pathvar + '/pytorch'} class="link link-hover">PyTorch</a>
 </div>
 <div>
-<span class="footer-title">Resources</span>
+<span class="dark:text-blue-200 footer-title">Resources</span>
 <a href={pathvar + '/blogs'} class="link link-hover">Blogs</a>
 <a rel="external" href={pathvar + '/docs/tutorials'} class="link link-hover">Tutorials</a>
 <a rel="external" href={pathvar + '/docs/api/'} class="link link-hover">APIs</a>
4 changes: 2 additions & 2 deletions src/routes/events/+page.svelte
@@ -20,8 +20,7 @@
 }
 ],
 image: converttoort,
-imagealt:
-'Slide detailing how to convert from various frameworks to ONNX, then deploy anywhere using ORT'
+imagealt: 'Slide detailing how to convert from various frameworks to ONNX, then deploy anywhere using ORT'
 }
 ];
@@ -74,6 +73,7 @@
 date={event.date}
 linkarr={event.linkarr}
 image={event.image}
+imagealt={event.imagealt}
 />
 {/each}
 </div>
4 changes: 2 additions & 2 deletions src/routes/events/event-post.svelte
@@ -33,7 +33,7 @@
 <div class="card-body col-span-3 md:col-span-2">
 <h2 class="card-title">{title}</h2>
 <p>{description}</p>
-<p class="text-blue-700 text-right">
+<p class="text-blue-800 text-right">
 {date}
 </p>
 <div class="card-actions">
@@ -43,7 +43,7 @@
 </div>
 </div>
 <div class="card-image col-span-1 m-auto hidden md:flex">
-<img class="" src={image} alt={imagealt} />
+<img src={image} alt={imagealt} />
 </div>
 </div>
 </a>
(The remaining 6 changed files are not shown.)
