microsoft · wejoncy · Dec 24, 2024 · Nov 27, 2024
diff --git a/docs/execution-providers/CoreML-ExecutionProvider.md b/docs/execution-providers/CoreML-ExecutionProvider.md
@@ -41,6 +41,17 @@ The CoreML EP can be used via the C, C++, Objective-C, C# and Java APIs.
 
 The CoreML EP must be explicitly registered when creating the inference session. For example:
 
+```C++
+Ort::Env env = Ort::Env{ORT_LOGGING_LEVEL_ERROR, "Default"};
+Ort::SessionOptions so;
+std::unordered_map<std::string, std::string> provider_options;
+provider_options["ModelFormat"] = "MLProgram";
+so.AppendExecutionProvider("CoreML", provider_options);
+Ort::Session session(env, model_path, so);
+```
+
+
+**Deprecated**: the `OrtSessionOptionsAppendExecutionProvider_CoreML` API is deprecated as of ONNX Runtime 1.20.0. Please use `OrtSessionOptionsAppendExecutionProvider` instead.
 ```C++
 Ort::Env env = Ort::Env{ORT_LOGGING_LEVEL_ERROR, "Default"};
 Ort::SessionOptions so;
@@ -49,18 +60,114 @@ Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_CoreML(so, coreml_fla
 Ort::Session session(env, model_path, so);
 ```
 
-## Configuration Options
+## Configuration Options (NEW API)
 
 There are several run time options available for the CoreML EP.
 
 To use the CoreML EP run time options, create an unsigned integer representing the options, and set each individual option by using the bitwise OR operator.
 
+Provider options can be set by passing a map of string keys and values to the `AppendExecutionProvider` method.
+```c++
+std::unordered_map<std::string, std::string> provider_options;
+provider_options["ModelFormat"] = "MLProgram";
+provider_options["MLComputeUnits"] = "ALL";
+provider_options["RequireStaticInputShapes"] = "0";
+provider_options["EnableOnSubgraphs"] = "0";
+```
+
+Python inference example code to use the CoreML EP run time options:
+```python
+import onnxruntime as ort
+model_path = "model.onnx"
+providers = [
+    ('CoreMLExecutionProvider', {
+        "ModelFormat": "MLProgram", "MLComputeUnits": "ALL", 
+        "RequireStaticInputShapes": "0", "EnableOnSubgraphs": "0"
+    }),
+]
+
+session = ort.InferenceSession(model_path, providers=providers)
+# input_feed maps each model input name to a NumPy array
+outputs = session.run(None, input_feed)
+```
+
+### Available Options (NEW API)
+`ModelFormat` can be one of the following values (default: `NeuralNetwork`):
+- `MLProgram`: Create an MLProgram format model. Requires Core ML 5 or later (iOS 15+ or macOS 12+).
+- `NeuralNetwork`: Create a NeuralNetwork format model. Requires Core ML 3 or later (iOS 13+ or macOS 10.15+).
+
+`MLComputeUnits` can be one of the following values (default: `ALL`):
+- `CPUOnly`: Limit CoreML to running on CPU only.
+- `CPUAndNeuralEngine`: Enable CoreML EP for Apple devices with a compatible Apple Neural Engine (ANE).
+- `CPUAndGPU`: Enable CoreML EP for Apple devices with a compatible GPU.
+- `ALL`: Enable CoreML EP for all compatible Apple devices.
+
+`RequireStaticInputShapes` can be one of the following values (default: `0`):
+
+This option controls whether the CoreML EP only takes nodes whose inputs have static shapes.
+By default, the CoreML EP also accepts inputs with dynamic shapes; however, dynamic shapes may negatively impact performance.
+
+- `0`: Allow the CoreML EP to take nodes with inputs that have dynamic shapes.
+- `1`: Only allow the CoreML EP to take nodes with inputs that have static shapes.
+
+
+`EnableOnSubgraphs` can be one of the following values (default: `0`):
+
+This option controls whether the CoreML EP may run on a subgraph in the body of a control flow operator (i.e. a [Loop](https://github.com/onnx/onnx/blob/master/docs/Operators.md#loop), [Scan](https://github.com/onnx/onnx/blob/master/docs/Operators.md#scan) or [If](https://github.com/onnx/onnx/blob/master/docs/Operators.md#if) operator).
+- `0`: Do not allow the CoreML EP to run on a subgraph in the body of a control flow operator.
+- `1`: Allow the CoreML EP to run on a subgraph in the body of a control flow operator.
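
The four options above each accept a fixed set of string values. As a minimal sketch, a small helper can catch typos and non-string values before the options are handed to the session. The helper name `check_coreml_options` and the `ALLOWED_VALUES` table are hypothetical and not part of ONNX Runtime; the allowed values are copied from the lists above, and how ONNX Runtime itself reacts to an unexpected value is not covered here.

```python
# Allowed values as listed in this section; the helper itself is hypothetical
# and is not an ONNX Runtime API.
ALLOWED_VALUES = {
    "ModelFormat": {"MLProgram", "NeuralNetwork"},
    "MLComputeUnits": {"CPUOnly", "CPUAndNeuralEngine", "CPUAndGPU", "ALL"},
    "RequireStaticInputShapes": {"0", "1"},
    "EnableOnSubgraphs": {"0", "1"},
}


def check_coreml_options(options):
    """Return a list of problems found in a CoreML provider options dict."""
    problems = []
    for key, value in options.items():
        allowed = ALLOWED_VALUES.get(key)
        if allowed is None:
            continue  # option not covered by this sketch
        if value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
    return problems


good = {"ModelFormat": "MLProgram", "MLComputeUnits": "ALL"}
bad = {"ModelFormat": "mlprogram", "EnableOnSubgraphs": 1}  # wrong case, int instead of str

print(check_coreml_options(good))
print(check_coreml_options(bad))
```

Values are compared as exact strings because the provider options are passed as strings in both the C++ and Python examples.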
+
+`SpecializationStrategy`: This feature is available on macOS 15+ or iOS 18+. Specialization can affect the model loading time and the prediction latency. Use this option to tailor the specialization strategy for your model. See the [Apple documentation](https://developer.apple.com/documentation/coreml/mloptimizationhints-swift.struct/specializationstrategy-swift.property) for more information. Can be one of the following values (default: `Default`):
+- `Default`:
+- `FastPrediction`:
+
+`ProfileComputePlan`: Profile the Core ML MLComputePlan. This logs the hardware each operator is dispatched to and the estimated execution time. It is intended for developer usage but provides useful diagnostic information if performance is not as expected. Can be one of the following values (default: `0`):
+- `0`: Disable profile.
+- `1`: Enable profile.
+
+`AllowLowPrecisionAccumulationOnGPU`: Please refer to the [Apple documentation](https://developer.apple.com/documentation/coreml/mlmodelconfiguration/allowlowprecisionaccumulationongpu). Can be one of the following values (default: `0`):
+- `0`: Use the float32 data type to accumulate data.
+- `1`: Use low-precision data (float16) to accumulate data.
+
+`ModelCachePath`: The path to the directory where the Core ML model cache is stored. The CoreML EP compiles each captured subgraph to a CoreML format graph and saves it to disk.
+If caching is not enabled, the CoreML EP repeats this compilation and saving every time the same model is loaded, which can take a long time (even minutes) for a complicated model. When a cache path and a model hash (which differs between models) are provided, the CoreML format model can be reused. Caching is disabled by default.
+- `""`: Disable the cache (the empty string is the default).
+- `"/path/to/cache"`: Enable the cache. The cache directory is created if it does not exist.
+
+The model hash identifies a specific model: if the model content changes, the hash changes and the cached model is no longer valid. If the user does not provide a model hash, the CoreML EP calculates one from the model path. Note that this path-based hash is not reliable if the model path cannot be found, or if the same model path is used for different models. In that case the cache is reused even though the model has changed, which produces completely wrong results.
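
The difference between a path-based key and a content-based key can be illustrated with the standard library alone. This is only a sketch: `path_key` is a hypothetical stand-in (SHA-256 of the path string) and is not claimed to be how the CoreML EP derives its hash, which this document does not specify. It shows that overwriting a model at the same path leaves a path-derived key unchanged while a content-derived key changes.

```python
import hashlib
import os
import tempfile


def path_key(path):
    # Hypothetical stand-in for a key derived only from the model path.
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


def content_key(path):
    # Key derived from the bytes of the model file.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.onnx")

    # First "model" written to the path (placeholder bytes, not a real ONNX file).
    with open(model_path, "wb") as f:
        f.write(b"model version 1")
    path_v1, content_v1 = path_key(model_path), content_key(model_path)

    # A different "model" saved to the same path.
    with open(model_path, "wb") as f:
        f.write(b"model version 2")
    path_v2, content_v2 = path_key(model_path), content_key(model_path)

path_key_unchanged = path_v1 == path_v2
content_key_changed = content_v1 != content_v2
print(path_key_unchanged, content_key_changed)
```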
+
+Here is an example of how to store the model hash in the model's metadata:
+```python
+import onnx
+import hashlib
+
+def hash_file(file_path, algorithm='sha256', chunk_size=8192):
+    hash_func = hashlib.new(algorithm)
+    with open(file_path, 'rb') as file:
+        while chunk := file.read(chunk_size):
+            hash_func.update(chunk)
+    return hash_func.hexdigest()
+
+CACHE_KEY_NAME = "CACHE_KEY"
+model_path = "/a/b/c/model.onnx"
+m = onnx.load(model_path)
+
+cache_key = m.metadata_props.add()
+cache_key.key = CACHE_KEY_NAME
+cache_key.value = str(hash_file(model_path))
+
+for entry in m.metadata_props:
+    print(entry) # to verify the metadata
+onnx.save_model(m, model_path)
+```
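
Once the model carries a cache key in its metadata, caching is enabled by passing `ModelCachePath` along with the other provider options. The following is a minimal sketch: the helper `coreml_providers` and the cache directory name are hypothetical, and the session creation is left commented out because it requires an ONNX Runtime build with the CoreML EP.

```python
import os
import tempfile


def coreml_providers(cache_dir):
    """Build a providers list for InferenceSession with CoreML caching enabled."""
    return [
        ("CoreMLExecutionProvider", {
            "ModelFormat": "MLProgram",
            "ModelCachePath": cache_dir,  # an empty string would disable the cache
        }),
    ]


# Hypothetical cache location; any writable directory path can be used.
cache_dir = os.path.join(tempfile.gettempdir(), "coreml_model_cache")
providers = coreml_providers(cache_dir)
print(providers)

# With onnxruntime installed on a CoreML-capable platform:
# import onnxruntime as ort
# session = ort.InferenceSession("/a/b/c/model.onnx", providers=providers)
```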
+
+
+## Configuration Options (Deprecated API)
 ```
 uint32_t coreml_flags = 0;
 coreml_flags |= COREML_FLAG_ONLY_ENABLE_DEVICE_WITH_ANE;
 ```
 
-### Available Options
+### Available Options (Deprecated API)
 
 ##### COREML_FLAG_USE_CPU_ONLY
 
@@ -147,28 +254,47 @@ Operators that are supported by the CoreML Execution Provider when a MLProgram m
 |Operator|Note|
 |--------|------|
 |ai.onnx:Add||
+|ai.onnx:ArgMax||
 |ai.onnx:AveragePool|Only 2D Pool is supported currently. 3D and 5D support can be added if needed.|
+|ai.onnx:Cast||
 |ai.onnx:Clip||
 |ai.onnx:Concat||
 |ai.onnx:Conv|Only 1D/2D Conv is supported.<br/>Bias if provided must be constant.|
 |ai.onnx:ConvTranspose|Weight and bias must be constant.<br/>padding_type of SAME_UPPER/SAME_LOWER is not supported.<br/>kernel_shape must have default values.<br/>output_shape is not supported.<br/>output_padding must have default values.|
-|ai.onnx.DepthToSpace|If 'mode' is 'CRD' the input must have a fixed shape.|
+|ai.onnx:DepthToSpace|If 'mode' is 'CRD' the input must have a fixed shape.|
 |ai.onnx:Div||
+|ai.onnx:Erf||
 |ai.onnx:Gemm|Input B must be constant.|
+|ai.onnx:Gelu||
 |ai.onnx:GlobalAveragePool|Only 2D Pool is supported currently. 3D and 5D support can be added if needed.|
 |ai.onnx:GlobalMaxPool|Only 2D Pool is supported currently. 3D and 5D support can be added if needed.|
 |ai.onnx:GridSample|4D input.<br/>'mode' of 'linear' or 'zeros'.<br/>(mode==linear && padding_mode==reflection && align_corners==0) is not supported.|
-|ai.onnx.LeakyRelu||
+|ai.onnx:GroupNormalization||
+|ai.onnx:InstanceNormalization||
+|ai.onnx:LayerNormalization||
+|ai.onnx:LeakyRelu||
 |ai.onnx:MatMul|Only support for transA == 0, alpha == 1.0 and beta == 1.0 is currently implemented.|
 |ai.onnx:MaxPool|Only 2D Pool is supported currently. 3D and 5D support can be added if needed.|
+|ai.onnx:Max||
 |ai.onnx:Mul||
 |ai.onnx:Pow|Only supports cases when both inputs are fp32.|
+|ai.onnx:PRelu||
+|ai.onnx:Reciprocal|CoreML requires an `epsilon` (default 1e-4), which ONNX does not provide.|
+|ai.onnx:ReduceSum||
+|ai.onnx:ReduceMean||
+|ai.onnx:ReduceMax||
 |ai.onnx:Relu||
 |ai.onnx:Reshape||
 |ai.onnx:Resize|See [resize_op_builder.cc](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/coreml/builders/impl/resize_op_builder.cc) implementation. There are too many permutations to describe the valid combinations.|
-|ai.onnx.Slice|starts/ends/axes/steps must be constant initializers.|
-|ai.onnx.Split|If provided, `splits` must be constant.|
+|ai.onnx:Round||
+|ai.onnx:Shape||
+|ai.onnx:Slice|starts/ends/axes/steps must be constant initializers.|
+|ai.onnx:Split|If provided, `splits` must be constant.|
 |ai.onnx:Sub||
 |ai.onnx:Sigmoid||
+|ai.onnx:Softmax||
+|ai.onnx:Sqrt||
+|ai.onnx:Squeeze||
 |ai.onnx:Tanh||
 |ai.onnx:Transpose||
+|ai.onnx:Unsqueeze||