
[CoreML] support coreml model cache #23065

Open · wejoncy wants to merge 29 commits into main from jicwen/coreml_cache
Conversation

@wejoncy (Contributor) commented Dec 10, 2024:

Description

- Refactor compute plan profiling.
- Support caching the compiled CoreML model to speed up session initialization. Caching is only enabled via a user-provided cache path, and the user is responsible for managing the cache (see the usage sketch after the table below).

With the cache, session initialization time can be reduced by 50% or more:

| model            | before | after |
|------------------|--------|-------|
| yolo11.onnx      | 0.6s   | 0.1s  |
| yolo11-fp16.onnx | 1.8s   | 0.1s  |
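For context (this is not part of the PR's diff), a minimal usage sketch of how a user might opt in from Python, assuming the Python bindings forward the new ModelCachePath provider option by name; the model and cache paths are illustrative:

```python
import onnxruntime as ort

# Hypothetical usage sketch: enable the CoreML model cache by passing the
# ModelCachePath provider option added in this PR. The user owns this
# directory and is responsible for clearing it when the model changes.
session = ort.InferenceSession(
    "yolo11.onnx",
    providers=[("CoreMLExecutionProvider", {"ModelCachePath": "/tmp/coreml_cache"})],
)
```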

Motivation and Context

@wejoncy requested a review from skottmckay December 10, 2024 10:26
@wejoncy marked this pull request as ready for review December 10, 2024 10:26
@wejoncy linked an issue Dec 10, 2024 that may be closed by this pull request
@wejoncy force-pushed the jicwen/coreml_cache branch from 5bfc8eb to fc9db07 on December 10, 2024 11:10
@wejoncy force-pushed the jicwen/coreml_cache branch from d539da2 to 1d1c874 on December 10, 2024 12:42
@wejoncy force-pushed the jicwen/coreml_cache branch from 81c2b9e to 7b11848 on December 16, 2024 06:16
Comment on lines 115 to 119
```cpp
if (require_static_shape_) {
  model_cache_path_ += "/static_shape";
} else {
  model_cache_path_ += "/dynamic_shape";
}
```
@skottmckay (Contributor) commented Dec 13, 2024:

nit: Is this required? Would be good to keep this as simple as possible.

@wejoncy (Contributor, Author) replied:

Yes, it's required. Whether require_static_shape_ is set determines what the sub-graph looks like, and gen_metadef_name doesn't check the input/output shape info.

```cpp
// } else {
//   // save to ModelCachePath
// }
// we won't detect whether the cached model matches the onnx subgraph, so the user should carefully manage the cache for a new model.
```
@skottmckay (Contributor) commented:

We should document what we do and don't do here:

  • Prefer a cache key from the model metadata.
    • We could include example Python here (see the sketch after this list) and skip the actual implementation of the hashing to keep it simple.
  • Use the model path if available.

If the model changes, the user must do one of:

  • Set a different cache key in the model metadata.
  • Load the model from a different path.
  • Delete the old cache information.
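A minimal sketch of what that Python example might look like, assuming the metadata key name lands on COREML_CACHE_KEY as proposed further down; the sha256 hashing is illustrative, and its hex digest happens to be exactly 64 alphanumeric characters, which ties into the length limit discussed later in this thread:

```python
import hashlib
import onnx

model_path = "model.onnx"  # illustrative path

# Illustrative cache key: sha256 of the model bytes, hex-encoded (64 chars).
with open(model_path, "rb") as f:
    cache_key = hashlib.sha256(f.read()).hexdigest()

# Store the key in the model's metadata so the CoreML EP can pick it up.
m = onnx.load(model_path)
entry = m.metadata_props.add()
entry.key = "COREML_CACHE_KEY"  # assumed final key name, per the discussion below
entry.value = cache_key
onnx.save(m, model_path)
```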

@skottmckay (Contributor) added:

Or we can document fully in CoreML-ExecutionProvider.md and include a link to that here.

@wejoncy (Contributor, Author) replied:

Sure.

```cpp
//   // save to ModelCachePath
// }
// we won't detect whether the cached model matches the onnx subgraph, so the user should carefully manage the cache for a new model.
static const char* const kCoremlProviderOption_ModelCachePath = "ModelCachePath";
```
@skottmckay (Contributor) commented:

We should have a const for the model metadata key name here. I'd vote for COREML_CACHE_KEY given the usage is CoreML specific.

@wejoncy (Contributor, Author) replied:

Sure. Though the cache key might also be usable by other EPs.

Comment on lines 408 to 409
```cpp
ORT_ENFORCE(std::count(subgraph_name.begin(), subgraph_name.end(), '_') == 3,
            "Unexpected graph name format: ", subgraph_name);
```
@skottmckay (Contributor) commented:

Given the cache is an optional feature, it might be better to disable caching and log the error instead of throwing, as it could break an iOS app completely (e.g. if they don't have logic to explicitly turn off caching it will be broken).

@wejoncy (Contributor, Author) replied:

This can't throw as long as CoreML EP developers don't modify gen_metadef_name without updating this check (or they modify both together).

Users can leave coreml_options.ModelCachePath() unset to disable the cache.

But it should be fine to remove this check.

Comment on lines 69 to 70
```cpp
// model_hash is a 64-bit hash value of model_path
user_provide_key = std::to_string(model_hash);
```
@skottmckay (Contributor) commented Dec 16, 2024:

This is only true if the model path is available. If not, it hashes the graph input names and all the node output names.

@wejoncy (Contributor, Author) replied:

I have documented it in the CoreML EP docs, and I will add more comments here.

onnxruntime/core/providers/coreml/coreml_options.cc (outdated, resolved)
```cpp
  main_graph = main_graph->ParentGraph();
}
if (main_graph->GetModel().MetaData().count("CACHE_KEY") > 0) {
  user_provide_key = graph_viewer.GetGraph().GetModel().MetaData().at("CACHE_KEY");
```
@skottmckay (Contributor) commented:

We need to validate this with something like std::isalnum to guarantee it will be valid for use in the filesystem. I'd also suggest we enforce a maximum length to also try and avoid issues creating folders/files with this in the name.

@wejoncy (Contributor, Author) replied:

Yeah, we should do it. For now, the cache key can have at most 32 characters; otherwise we re-hash it.

@skottmckay (Contributor) replied:

64 chars + null is required to store a sha256 hash as hex, so maybe we could use that as the limit?

@wejoncy (Contributor, Author) replied Dec 17, 2024:

64 came to mind as my first thought too; would it be too long? Let me use 64 as the limit again.

@skottmckay (Contributor) replied:

It's a maximum, so the user isn't forced to use it all. Allowing a 256-bit hash seems reasonable. The max directory/file name length is 255 chars, so 64 is well under that. Max path length is somewhere around 1000 chars. The user also controls the cache path.

But we should be careful about how deep the directory names get for the cache files. It might make more sense to shorten things like whether static shapes were enabled, and for the model it's really only the subgraph id that matters.

i.e. including the user hash and the model hash in a directory name (userhash_COREML_modelhash_subgraphid) shouldn't actually matter when all files for the model are stored under a top-level directory name that uses the preferred hash, as a) there should never be files from any other models in there, and b) the subgraph id should be deterministic (unless they run with a different optimization level).

Take a look at how long the paths are in your tests and figure out a good balance between readable/safe and avoiding exceeding the max path length.

@wejoncy (Contributor, Author) replied Dec 18, 2024:

Sure. It's quite balanced now:

```
<user_provided_path>/cc6b2111b15dcdcf00ed6647c4430315e01616efde46ab4109f80fd1d3c46731/0_dynamic_mlprogram/model/
```
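For illustration, a hypothetical sketch of how that layout decomposes, inferred from the example path above (the helper name and arguments are invented, not the PR's actual code):

```python
import os

def coreml_cache_dir(model_cache_path: str, cache_key: str, subgraph_id: int,
                     static_shape: bool, model_format: str) -> str:
    # e.g. <user_provided_path>/<64-char key>/0_dynamic_mlprogram/model/
    shape = "static" if static_shape else "dynamic"
    return os.path.join(model_cache_path, cache_key,
                        f"{subgraph_id}_{shape}_{model_format}", "model")
```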

Comment on lines 67 to 75
```cpp
user_provided_key = graph_viewer.GetGraph().GetModel().MetaData().at(kCOREML_CACHE_KEY);
if (user_provided_key.size() > 64 ||
    std::any_of(user_provided_key.begin(), user_provided_key.end(),
                [](unsigned char c) { return !std::isalnum(c); })) {
  LOGS(logger, ERROR) << "[" << kCOREML_CACHE_KEY << ":" << user_provided_key << "] is not a valid cache key."
                      << " It should be alphanumeric and less than 64 characters.";
}
// invalid cache-key
if (user_provided_key.size() == 0) {
```
@skottmckay (Contributor) commented:

Can we do this once outside of the gen_metadef_name lambda?

When there's an error we're logging it, but nothing is setting user_provided_key to empty, so it's not clear how the 'invalid cache-key' if condition will be satisfied.

@wejoncy (Contributor, Author) replied:

Sure, sounds good.

onnxruntime/core/providers/coreml/coreml_options.h (outdated, resolved)
Successfully merging this pull request may close these issues.

CoreML - Writing CoreML Model on every inference session creation