diff --git a/docs/build/ios.md b/docs/build/ios.md
index 7734db756cb2e..bad20d142eb34 100644
--- a/docs/build/ios.md
+++ b/docs/build/ios.md
@@ -51,14 +51,14 @@ Run one of the following build scripts from the ONNX Runtime repository root:
```bash
./build.sh --config <Release|Debug|RelWithDebInfo|MinSizeRel> --use_xcode \
- --ios --ios_sysroot iphonesimulator --osx_arch x86_64 --apple_deploy_target <minimal iOS version>
+ --ios --apple_sysroot iphonesimulator --osx_arch x86_64 --apple_deploy_target <minimal iOS version>
```
### Cross compile for iOS device
```bash
./build.sh --config <Release|Debug|RelWithDebInfo|MinSizeRel> --use_xcode \
- --ios --ios_sysroot iphoneos --osx_arch arm64 --apple_deploy_target <minimal iOS version>
+ --ios --apple_sysroot iphoneos --osx_arch arm64 --apple_deploy_target <minimal iOS version>
```
### CoreML Execution Provider
diff --git a/docs/ecosystem/index.md b/docs/ecosystem/index.md
index 12c8335f64be7..1a0f95d77f4e0 100644
--- a/docs/ecosystem/index.md
+++ b/docs/ecosystem/index.md
@@ -18,8 +18,6 @@ ONNX Runtime functions as part of an ecosystem of tools and platforms to deliver
## Azure Machine Learning Services
* [Azure Container Instance: BERT](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/python/tools/transformers/notebooks/Inference_Bert_with_OnnxRuntime_on_AzureML.ipynb){:target="_blank"}
-* [Azure Container Instance: Facial Expression Recognition](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-inference-facial-expression-recognition-deploy.ipynb){:target="_blank"}
-* [Azure Container Instance: MNIST](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-inference-mnist-deploy.ipynb){:target="_blank"}
* [Azure Container Instance: Image classification (Resnet)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-modelzoo-aml-deploy-resnet50.ipynb){:target="_blank"}
* [Azure Kubernetes Services: FER+](https://github.com/microsoft/onnxruntime/blob/main/docs/python/notebooks/onnx-inference-byoc-gpu-cpu-aks.ipynb){:target="_blank"}
* [Azure IoT Edge (Intel UP2 device with OpenVINO)](https://github.com/Azure-Samples/onnxruntime-iot-edge/blob/master/AzureML-OpenVINO/README.md){:target="_blank"}
diff --git a/docs/execution-providers/TensorRT-ExecutionProvider.md b/docs/execution-providers/TensorRT-ExecutionProvider.md
index 7a9b371b60eff..93d3938a0c731 100644
--- a/docs/execution-providers/TensorRT-ExecutionProvider.md
+++ b/docs/execution-providers/TensorRT-ExecutionProvider.md
@@ -90,6 +90,7 @@ There are two ways to configure TensorRT settings, either by **TensorRT Executio
| trt_context_memory_sharing_enable | ORT_TENSORRT_CONTEXT_MEMORY_SHARING_ENABLE | bool |
| trt_layer_norm_fp32_fallback | ORT_TENSORRT_LAYER_NORM_FP32_FALLBACK | bool |
| trt_timing_cache_enable | ORT_TENSORRT_TIMING_CACHE_ENABLE | bool |
+| trt_timing_cache_path | ORT_TENSORRT_TIMING_CACHE_PATH | string |
| trt_force_timing_cache | ORT_TENSORRT_FORCE_TIMING_CACHE_ENABLE | bool |
| trt_detailed_build_log | ORT_TENSORRT_DETAILED_BUILD_LOG_ENABLE | bool |
| trt_build_heuristics_enable | ORT_TENSORRT_BUILD_HEURISTICS_ENABLE | bool |
@@ -179,6 +180,9 @@ TensorRT configurations can be set by execution provider options. It's useful wh
* `trt_timing_cache_enable`: Enable TensorRT timing cache.
  * Check [Timing cache](#timing-cache) for details.
+* `trt_timing_cache_path`: Specify path for TensorRT timing cache if `trt_timing_cache_enable` is `True`.
+  * Not specifying a `trt_timing_cache_path` will result in the working directory being used.
+
* `trt_force_timing_cache`: Force the TensorRT timing cache to be used even if device profile does not match.
  * A perfect match is only the exact same GPU model as the one that produced the timing cache.
diff --git a/docs/reference/operators/ContribOperators.md b/docs/reference/operators/ContribOperators.md
index 77eabcc75a7f9..060ade9aa2d88 100644
--- a/docs/reference/operators/ContribOperators.md
+++ b/docs/reference/operators/ContribOperators.md
@@ -22,6 +22,8 @@ The contrib operator schemas are documented in the ONNX Runtime repository.
| Release | Documentation |
|---------|---------------|
| Main | [https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md](https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md) |
+| 1.17 | [https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/docs/ContribOperators.md](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/docs/ContribOperators.md)|
+| 1.16 | [https://github.com/microsoft/onnxruntime/blob/rel-1.16.0/docs/ContribOperators.md](https://github.com/microsoft/onnxruntime/blob/rel-1.16.0/docs/ContribOperators.md)|
| 1.15 | [https://github.com/microsoft/onnxruntime/blob/rel-1.15.0/docs/ContribOperators.md](https://github.com/microsoft/onnxruntime/blob/rel-1.15.0/docs/ContribOperators.md)|
| 1.14 | [https://github.com/microsoft/onnxruntime/blob/rel-1.14.0/docs/ContribOperators.md](https://github.com/microsoft/onnxruntime/blob/rel-1.14.0/docs/ContribOperators.md)|
| 1.13 | [https://github.com/microsoft/onnxruntime/blob/rel-1.13.1/docs/ContribOperators.md](https://github.com/microsoft/onnxruntime/blob/rel-1.13.1/docs/ContribOperators.md)|
diff --git a/docs/reference/operators/OperatorKernels.md b/docs/reference/operators/OperatorKernels.md
index caa7df1143abd..9c52acafec9c0 100644
--- a/docs/reference/operators/OperatorKernels.md
+++ b/docs/reference/operators/OperatorKernels.md
@@ -10,6 +10,7 @@ The operator kernels supported by the CPU Execution Provider, CUDA Execution Pro
| Release | Documentation |
|---------|---------------|
| Current main | [https://github.com/microsoft/onnxruntime/blob/main/docs/OperatorKernels.md](https://github.com/microsoft/onnxruntime/blob/main/docs/OperatorKernels.md) |
+| 1.17 | [https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/docs/OperatorKernels.md](https://github.com/microsoft/onnxruntime/blob/rel-1.17.0/docs/OperatorKernels.md) |
| 1.16 | [https://github.com/microsoft/onnxruntime/blob/rel-1.16.0/docs/OperatorKernels.md](https://github.com/microsoft/onnxruntime/blob/rel-1.16.0/docs/OperatorKernels.md)|
| 1.15 | [https://github.com/microsoft/onnxruntime/blob/rel-1.15.0/docs/OperatorKernels.md](https://github.com/microsoft/onnxruntime/blob/rel-1.15.0/docs/OperatorKernels.md)|
| 1.14 | [https://github.com/microsoft/onnxruntime/blob/rel-1.14.0/docs/OperatorKernels.md](https://github.com/microsoft/onnxruntime/blob/rel-1.14.0/docs/OperatorKernels.md)|
diff --git a/docs/reference/operators/reduced-operator-config-file.md b/docs/reference/operators/reduced-operator-config-file.md
index cd7793e4c3c91..5bdf9d98a0e08 100644
--- a/docs/reference/operators/reduced-operator-config-file.md
+++ b/docs/reference/operators/reduced-operator-config-file.md
@@ -62,6 +62,8 @@ Additionally, the ONNX operator specs for [DNN](https://github.com/onnx/onnx/blo
## Type reduction format
+### Per-operator type information
+
If the types an operator implementation supports can be limited to a specific
set of types, this is specified in a JSON string immediately after the operator name in the configuration file. **It is highly recommended that you first generate the configuration file using ORT format models with type reduction enabled in order to see which operators support type reduction, and how the entry is defined for the individual operators.**
@@ -69,17 +71,42 @@ If the types an operator implementation supports can be limited to a specific se
The required types are generally listed per input and/or output of the operator. The type information is in a map, with 'inputs' and 'outputs' keys. The value for 'inputs' or 'outputs' is a map between the index number of the input/output and the required list of types.
For example, both the input and output types are relevant to ai.onnx:Cast. Type information for input 0 and output 0 could look like this:
- `{"inputs": {"0": ["float", "int32_t"]}, "outputs": {"0": ["float", "int64_t"]}}`
-which is added directly after the operator name in the configuration file.
-e.g.
- `ai.onnx;12;Add,Cast{"inputs": {"0": ["float", "int32_t"]}, "outputs": {"0": ["float", "int64_t"]}},Concat,Squeeze`
+```
+{"inputs": {"0": ["float", "int32_t"]}, "outputs": {"0": ["float", "int64_t"]}}
+```
+
+which is added directly after the operator name in the configuration file. E.g.:
+
+```
+ai.onnx;12;Add,Cast{"inputs": {"0": ["float", "int32_t"]}, "outputs": {"0": ["float", "int64_t"]}},Concat,Squeeze
+```
If, for example, the types of inputs 0 and 1 were important, the entry may look like this (e.g. ai.onnx:Gather):
- `{"inputs": {"0": ["float", "int32_t"], "1": ["int32_t"]}}`
+
+```
+{"inputs": {"0": ["float", "int32_t"], "1": ["int32_t"]}}
+```
Finally some operators do non-standard things and store their type information under a 'custom' key. ai.onnx.OneHot is an example of this, where the three input types are combined into a triple.
- `{"custom": [["float", "int64_t", "int64_t"], ["int64_t", "std::string", "int64_t"]]}`
+
+```
+{"custom": [["float", "int64_t", "int64_t"], ["int64_t", "std::string", "int64_t"]]}
+```
For these reasons, it is best to generate the configuration file first, and manually edit any entries if needed.
+
+### Globally allowed types
+
+It is also possible to limit the types supported by all operators to a specific set of types. These are referred to as *globally allowed types*. They may be specified in the configuration file on a separate line.
+
+The format for specifying globally allowed types for all operators is:
+
+```
+!globally_allowed_types;T0,T1,...
+```
+
+`Ti` should be a C++ scalar type supported by ONNX and ORT. At most one globally allowed types specification is allowed.
+
+Specifying per-operator type information and specifying globally allowed types are mutually exclusive - it is an error to specify both.
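To make the globally allowed types section above concrete: a configuration file that uses a globally allowed types line (and therefore carries no per-operator type information) might look like the minimal sketch below. The operator list is adapted from the document's own example and the two types are chosen purely for illustration:

```
ai.onnx;12;Add,Cast,Concat,Squeeze
!globally_allowed_types;float,int64_t
```

Because the `!globally_allowed_types` line is present, none of the operator entries may carry per-operator type information, per the mutual-exclusivity rule described above.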
diff --git a/src/images/blogs/webtraining_blog_thumbnail.png b/src/images/blogs/webtraining_blog_thumbnail.png
new file mode 100644
index 0000000000000..211de4d67e61e
Binary files /dev/null and b/src/images/blogs/webtraining_blog_thumbnail.png differ
diff --git a/src/routes/blogs/+page.svelte b/src/routes/blogs/+page.svelte
index 131f186127be9..4cb0af1af6857 100644
--- a/src/routes/blogs/+page.svelte
+++ b/src/routes/blogs/+page.svelte
@@ -10,6 +10,7 @@
import LlamaImage from '../../images/blogs/accelerating-llama-2/Figure1-LLaMA-2-7B-E2E-Throughput.png';
import SDXLTurboImage from '../../images/blogs/sdxl_blog_thumbnail.png';
import { createEventDispatcher } from 'svelte';
+ import WebTrainingImage from '../../images/blogs/webtraining_blog_thumbnail.png';
onMount(() => {
anime({
targets: '.border-primary',
@@ -37,6 +38,15 @@
dispatch('switchTab', tab);
}
let featuredblog = [
+ {
+ title: 'On-Device Training: Training a model in browser',
+ date: 'February 6th, 2024',
+ blurb:
+ 'Want to do ML training for your website in-browser? Learn more about what web training with ONNX Runtime has to offer in our blog below and experiment with your own applications through our easy-to-follow tutorials and demo.',
+ link: 'https://cloudblogs.microsoft.com/opensource/2024/02/06/on-device-training-training-a-model-in-browser',
+ image: WebTrainingImage,
+ imgalt: 'Components of the onnxruntime-web JS package'
+ },
{
title: 'Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive',
date: 'January 15th, 2024',
@@ -53,7 +63,9 @@
link: 'blogs/accelerating-llama-2',
image: LlamaImage,
imgalt: 'LLaMA-2 e2e throughput'
- },
+ }
+ ];
+ let blogs = [
{
title: 'Run PyTorch models on the edge',
date: 'October 12th, 2023',
@@ -63,9 +75,7 @@
image: 'https://onnxruntime.ai/_app/immutable/assets/pytorch-on-the-edge-with-ort.cdaa9c84.png',
imgalt: 'Run PyTorch models on the edge'
- }
- ];
- let blogs = [
+ },
{
title: 'Accelerating over 130,000 Hugging Face models with ONNX Runtime',
date: 'October 4th, 2023',