[TensorRT EP] How can I disable generating cache when using trt execution provider #22822

Open
noahzn opened this issue Nov 13, 2024 · 11 comments
Labels
ep:TensorRT issues related to TensorRT execution provider

Comments

@noahzn

noahzn commented Nov 13, 2024

I have already generated some trt cache when inferring my ONNX model using TRT Execution Provider. Then, for the online testing of my model, I set so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL, but it seems that new caches are still generated. I only want to reuse the old cache without generating new ones. How can I do that? Thanks in advance!

import os

# Fallback providers, used as-is when TensorRT is not requested
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
trt_engine_cache_path = "weights/.trtcache_engines"
trt_timing_cache_path = "weights/.trtcache_timings"

# Create the 'weights' directory if it doesn't exist
os.makedirs(os.path.dirname(trt_engine_cache_path), exist_ok=True)

# `conf` is the application's own configuration object (defined elsewhere)
if conf.trt:
    # Put TensorRT first so it takes priority over CUDA and CPU
    providers = [
        (
            "TensorrtExecutionProvider",
            {
                "trt_max_workspace_size": 2 * 1024 * 1024 * 1024,
                "trt_fp16_enable": True,
                "trt_engine_cache_enable": True,
                "trt_timing_cache_enable": True,
                "trt_engine_cache_path": trt_engine_cache_path,
                "trt_timing_cache_path": trt_timing_cache_path,
            },
        )
    ] + providers
@github-actions github-actions bot added the ep:TensorRT issues related to TensorRT execution provider label Nov 13, 2024
@noahzn noahzn changed the title how can I disable generating cache when using trt execution provider [TensorRT EP] How can I disable generating cache when using trt execution provider Nov 13, 2024
@yf711
Contributor

yf711 commented Nov 13, 2024

Hi @noahzn, your old engine/profile might not be reused by TRT EP if the current inference parameters, cache name, environment variables, or hardware environment change.

Here's more info about engine reusability: https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#trt_engine_cache_enable

I wonder: if you replace your old engine/profile with the newly generated ones, is that new engine going to be reused, or does an even newer engine need to be generated?

@noahzn
Author

noahzn commented Nov 14, 2024

@yf711 Thanks for your reply!
My networks do keypoint detection and matching. I think the issue is that we cannot guarantee extracting the same number of keypoints from both images. I have warmed up the networks using about 10k pairs of images, but it still generates new engines for some pairs of images. I think the old generated engines are still used, because inference is indeed accelerated.
What can I do in this case? Will trt_profile_min_shapes and trt_profile_max_shapes help? I tried setting these for the input dimensions, but it's not enough. I get the following message:
Following input(s) has no associated shape profiles provided: /Reshape_3_output_0,/norm/Div_output_0,/Resize_output_0,/Unsqueeze_18_output_0,/NonZero_output_0. Maybe some intermediate layers also need to be given dimension ranges?

@chilo-ms
Contributor

chilo-ms commented Nov 21, 2024

@noahzn

It's not related to dimension ranges in the intermediate layers input.

The engine cache name,
e.g. TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_17097719564268968195_0_0_sm80.engine,
contains a hash value returned by a hash function that takes the following metadata as input:

  • model/graph
  • model's file name (Use the model's file name instead of the entire path to avoid cache regeneration if path changes)
  • input names of the graph
  • output name of each node
  • TRT version (determined at build time)
  • ORT version (determined at build time)
  • CUDA version (determined at build time)

Also, the cache name contains compute capability, e.g. sm80.

Does any of the metadata above change between the run that generated the cache and the run that is supposed to use the old cache?
If so, TRT EP won't use the old cache, and it will generate a new one instead.
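
To check whether the hash is what differs between runs, the cache file names themselves can be inspected. The sketch below is a minimal illustration, not part of ONNX Runtime: the regular expression is inferred only from the file names quoted in this thread (an assumption, not a documented naming contract), and `parse_cache_name` and `subgraphs_per_hash` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Pattern inferred from the file names quoted in this thread. It is an
# assumption for illustration, not a documented naming contract.
CACHE_NAME = re.compile(
    r"^TensorrtExecutionProvider_TRTKernel_graph_(?P<graph>.+)"
    r"_(?P<hash>\d+)_(?P<index>\d+)_\d+"
    r"(?:_(?P<precision>[a-z]+\d+))?"
    r"_sm(?P<sm>\d+)\.(?P<kind>engine|profile)$"
)


def parse_cache_name(file_name):
    """Split a cache file name into its parts, or return None if it does not match."""
    match = CACHE_NAME.match(file_name)
    return match.groupdict() if match else None


def subgraphs_per_hash(file_names):
    """Group subgraph indices by hash value; more than one key means more than one hash."""
    groups = defaultdict(set)
    for name in file_names:
        info = parse_cache_name(name)
        if info is not None:
            groups[info["hash"]].add(int(info["index"]))
    return {hash_value: sorted(indices) for hash_value, indices in groups.items()}
```

Passing `os.listdir(trt_engine_cache_path)` to `subgraphs_per_hash` would show how many distinct hash values exist in the cache directory. Several hashes for one model suggest that one of the hashed inputs listed above changed between runs.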

@noahzn
Author

noahzn commented Nov 22, 2024

@chilo-ms Thanks for your reply. I don't think the above metadata changes. The model's file name is never changed, and the input names of the graph are fixed in the ONNX model. Concerning the graph, since the numbers of keypoints are different, it may be changed. So now I set min. and max. shapes for some middle layers, and it seems to generate new caches less frequently than before.

For example, these are the cached files in the folder.

-rw-r--r-- 1 root root  165668 Nov 22 07:33 TensorrtExecutionProvider_TRTKernel_graph_main_graph_5143105182468268169_0_0_fp16_sm87.engine
-rw-r--r-- 1 root root 1894027 Nov 22 11:07 TensorrtExecutionProvider_TRTKernel_graph_main_graph_5143105182468268169_1_1_fp16_sm87.engine
-rw-r--r-- 1 root root      38 Nov 22 11:07 TensorrtExecutionProvider_TRTKernel_graph_main_graph_5143105182468268169_1_1_fp16_sm87.profile
-rw-r--r-- 1 root root  392336 Nov 22 11:07 TensorrtExecutionProvider_TRTKernel_graph_main_graph_5143105182468268169_2_2_fp16_sm87.engine
-rw-r--r-- 1 root root     129 Nov 22 11:07 TensorrtExecutionProvider_TRTKernel_graph_main_graph_5143105182468268169_2_2_fp16_sm87.profile
-rw-r--r-- 1 root root  387743 Nov 22 07:40 TensorrtExecutionProvider_TRTKernel_graph_main_graph_5143105182468268169_3_3_fp16_sm87.engine
-rw-r--r-- 1 root root     139 Nov 22 07:40 TensorrtExecutionProvider_TRTKernel_graph_main_graph_5143105182468268169_3_3_fp16_sm87.profile

@chilo-ms
Contributor

Concerning the graph, since the numbers of keypoints are different

I assume the shape of input/output tensor reflects the numbers of keypoints, right?
But shape is not part of the metadata that gets hashed.

I suspected it's the model's file name. Could you confirm you use the exact same path for the first run and test run?
Could you also paste the new engine file name here?
If you could share the model as well as the repro code, we can try to reproduce it on our side.

Setting trt_profile_min_shapes, trt_profile_max_shapes and trt_profile_opt_shapes to cover the minimum and maximum of the input image doesn't help with this issue, but it can prevent the TRT engine from being rebuilt across multiple inference runs with different input images.

Following input(s) has no associated shape profiles provided: /Reshape_3_output_0,/norm/Div_output_0,/Resize_output_0,/Unsqueeze_18_output_0,/NonZero_output_0. Maybe some intermediate layers also need to be given dimension ranges?

Yes, in your case the model is partitioned into multiple subgraphs that are run by TRT EP, while several other nodes are run by CUDA EP or CPU.
A tensor listed in that message can be an input of a subgraph run by TRT EP, in which case it requires shape info.
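
The profile options take strings of the form `name:AxBxC,name2:...`, so the ranges can be built programmatically to avoid typos in tensor names. The sketch below is a rough illustration under stated assumptions: the tensor names and trailing dimensions are taken from the profile strings posted later in this thread, the optimum of 300 keypoints is a made-up placeholder, and `shape_profile` and `keypoint_profile` are hypothetical helpers. Shapes for intermediate tensors such as `/NonZero_output_0` are not known from the thread and are left out.

```python
def shape_profile(shapes):
    """Format {"name": (d0, d1, ...)} as "name:d0xd1x...,name2:..."."""
    return ",".join(
        f"{name}:{'x'.join(str(dim) for dim in dims)}"
        for name, dims in shapes.items()
    )


# Tensor names and last dimensions as they appear in this thread.
LAST_DIM = {"kpts0": 2, "desc0": 64, "kpts1": 2, "desc1": 64}


def keypoint_profile(num_keypoints):
    """Build one profile string where every input has `num_keypoints` rows."""
    return shape_profile(
        {name: (1, num_keypoints, width) for name, width in LAST_DIM.items()}
    )


trt_profile_options = {
    "trt_profile_min_shapes": keypoint_profile(1),
    "trt_profile_opt_shapes": keypoint_profile(300),  # hypothetical optimum
    "trt_profile_max_shapes": keypoint_profile(600),
}
```

These three entries would be merged into the TensorRT provider options dictionary. Generating all three strings from one table keeps the tensor names consistent across min, opt and max.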

@noahzn
Author

noahzn commented Nov 28, 2024

@chilo-ms Thanks for your reply!

I assume the shape of input/output tensor reflects the numbers of keypoints, right?

The input images always have the same size (e.g., 512x512), but different numbers of keypoints might be extracted. So, intermediate tensors can have different shapes.

I suspected it's the model's file name. Could you confirm you use the exact same path for the first run and test run?

I didn't change this part in the code. The onnx model always has the same name.

Now I set min. and max. shapes, and it seems that TRT doesn't generate new caches as frequently as before.

trt_engine_cache_path = "weights/.trtcache_engines"
trt_timing_cache_path = "weights/.trtcache_timings"
providers = [
    (
        "TensorrtExecutionProvider",
        {
            "trt_max_workspace_size": 1 * 1024 * 1024 * 1024,
            "trt_fp16_enable": True,
            "trt_engine_cache_enable": True,
            # 'trt_dump_subgraphs': False,
            "trt_timing_cache_enable": True,
            "trt_detailed_build_log": True,
            "trt_engine_cache_path": trt_engine_cache_path,
            "trt_timing_cache_path": trt_timing_cache_path,
            "trt_builder_optimization_level": 1,
            "trt_profile_min_shapes": "kpts0:1x1x2,desc0:1x1x64,kpts1:1x1x2,desc1:1x1x64",
            "trt_profile_max_shapes": "kpts0:1x600x2,desc0:1x600x64,kpts1:1x600x2,desc1:1x600x64",
        },
    )
]

Could you also paste the new engine file name here?

-rw-r--r-- 1 root root 347821 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_0_0_fp16_sm87.engine
-rw-r--r-- 1 root root     88 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_0_0_fp16_sm87.profile
-rw-r--r-- 1 root root 304913 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_10_10_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_10_10_fp16_sm87.profile
-rw-r--r-- 1 root root 177307 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_11_11_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_11_11_fp16_sm87.profile
-rw-r--r-- 1 root root 381095 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_1_1_fp16_sm87.engine
-rw-r--r-- 1 root root    202 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_1_1_fp16_sm87.profile
-rw-r--r-- 1 root root 313296 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_12_12_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_12_12_fp16_sm87.profile
-rw-r--r-- 1 root root 179275 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_13_13_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_13_13_fp16_sm87.profile
-rw-r--r-- 1 root root 304162 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_14_14_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_14_14_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_15_15_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_15_15_fp16_sm87.profile
-rw-r--r-- 1 root root 364084 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_16_16_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_16_16_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_17_17_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_17_17_fp16_sm87.profile
-rw-r--r-- 1 root root 335865 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_18_18_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_18_18_fp16_sm87.profile
-rw-r--r-- 1 root root 178777 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_19_19_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_19_19_fp16_sm87.profile
-rw-r--r-- 1 root root 290488 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_20_20_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_20_20_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_21_21_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_21_21_fp16_sm87.profile
-rw-r--r-- 1 root root 304162 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_22_22_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_22_22_fp16_sm87.profile
-rw-r--r-- 1 root root 304841 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_2_2_fp16_sm87.engine
-rw-r--r-- 1 root root    306 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_2_2_fp16_sm87.profile
-rw-r--r-- 1 root root 177817 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_23_23_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_23_23_fp16_sm87.profile
-rw-r--r-- 1 root root 333286 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_24_24_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_24_24_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_25_25_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_25_25_fp16_sm87.profile
-rw-r--r-- 1 root root 304913 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_26_26_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_26_26_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_27_27_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_27_27_fp16_sm87.profile
-rw-r--r-- 1 root root 282212 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_28_28_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_28_28_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_29_29_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_29_29_fp16_sm87.profile
-rw-r--r-- 1 root root 304162 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_30_30_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_30_30_fp16_sm87.profile
-rw-r--r-- 1 root root 177817 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_31_31_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_31_31_fp16_sm87.profile
-rw-r--r-- 1 root root 333286 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_32_32_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_32_32_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_33_33_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_33_33_fp16_sm87.profile
-rw-r--r-- 1 root root 177307 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_3_3_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_3_3_fp16_sm87.profile
-rw-r--r-- 1 root root 304913 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_34_34_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_34_34_fp16_sm87.profile
-rw-r--r-- 1 root root 281047 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_35_35_fp16_sm87.engine
-rw-r--r-- 1 root root    492 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_35_35_fp16_sm87.profile
-rw-r--r-- 1 root root 234321 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_36_36_fp16_sm87.engine
-rw-r--r-- 1 root root    280 Nov 22 07:58 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_36_36_fp16_sm87.profile
-rw-r--r-- 1 root root 106366 Nov 26 13:06 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_37_37_fp16_sm87.engine
-rw-r--r-- 1 root root    112 Nov 26 13:06 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_37_37_fp16_sm87.profile
-rw-r--r-- 1 root root 290488 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_4_4_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_4_4_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_5_5_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_5_5_fp16_sm87.profile
-rw-r--r-- 1 root root 304162 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_6_6_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_6_6_fp16_sm87.profile
-rw-r--r-- 1 root root 177817 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_7_7_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_7_7_fp16_sm87.profile
-rw-r--r-- 1 root root 326273 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_8_8_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_8_8_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_9_9_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:56 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17030199236189371776_9_9_fp16_sm87.profile
-rw-r--r-- 1 root root 501315 Nov 22 07:42 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_0_0_fp16_sm87.engine
-rw-r--r-- 1 root root     88 Nov 22 07:42 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_0_0_fp16_sm87.profile
-rw-r--r-- 1 root root 304812 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_10_10_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_10_10_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_11_11_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_11_11_fp16_sm87.profile
-rw-r--r-- 1 root root 477366 Nov 22 07:43 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_1_1_fp16_sm87.engine
-rw-r--r-- 1 root root    202 Nov 22 07:43 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_1_1_fp16_sm87.profile
-rw-r--r-- 1 root root 290387 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_12_12_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_12_12_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_13_13_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_13_13_fp16_sm87.profile
-rw-r--r-- 1 root root 304019 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_14_14_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_14_14_fp16_sm87.profile
-rw-r--r-- 1 root root 177817 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_15_15_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_15_15_fp16_sm87.profile
-rw-r--r-- 1 root root 333150 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_16_16_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_16_16_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_17_17_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_17_17_fp16_sm87.profile
-rw-r--r-- 1 root root 304812 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_18_18_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_18_18_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_19_19_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_19_19_fp16_sm87.profile
-rw-r--r-- 1 root root 290387 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_20_20_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_20_20_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_21_21_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_21_21_fp16_sm87.profile
-rw-r--r-- 1 root root 304019 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_22_22_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_22_22_fp16_sm87.profile
-rw-r--r-- 1 root root 305032 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_2_2_fp16_sm87.engine
-rw-r--r-- 1 root root    306 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_2_2_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_23_23_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_23_23_fp16_sm87.profile
-rw-r--r-- 1 root root 362685 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_24_24_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_24_24_fp16_sm87.profile
-rw-r--r-- 1 root root 177307 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_25_25_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_25_25_fp16_sm87.profile
-rw-r--r-- 1 root root 304812 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_26_26_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_26_26_fp16_sm87.profile
-rw-r--r-- 1 root root 177307 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_27_27_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_27_27_fp16_sm87.profile
-rw-r--r-- 1 root root 290387 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_28_28_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_28_28_fp16_sm87.profile
-rw-r--r-- 1 root root 177817 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_29_29_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_29_29_fp16_sm87.profile
-rw-r--r-- 1 root root 304019 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_30_30_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_30_30_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_31_31_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_31_31_fp16_sm87.profile
-rw-r--r-- 1 root root 333150 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_32_32_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_32_32_fp16_sm87.profile
-rw-r--r-- 1 root root 177307 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_33_33_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_33_33_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_3_3_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_3_3_fp16_sm87.profile
-rw-r--r-- 1 root root 307522 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_34_34_fp16_sm87.engine
-rw-r--r-- 1 root root    350 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_34_34_fp16_sm87.profile
-rw-r--r-- 1 root root 282129 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_35_35_fp16_sm87.engine
-rw-r--r-- 1 root root    492 Nov 22 07:46 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_35_35_fp16_sm87.profile
-rw-r--r-- 1 root root 234218 Nov 22 07:47 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_36_36_fp16_sm87.engine
-rw-r--r-- 1 root root    280 Nov 22 07:47 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_36_36_fp16_sm87.profile
-rw-r--r-- 1 root root  53392 Nov 28 06:21 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_37_37_fp16_sm87.engine
-rw-r--r-- 1 root root    106 Nov 28 06:21 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_37_37_fp16_sm87.profile
-rw-r--r-- 1 root root 282111 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_4_4_fp16_sm87.engine
-rw-r--r-- 1 root root    628 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_4_4_fp16_sm87.profile
-rw-r--r-- 1 root root 177817 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_5_5_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_5_5_fp16_sm87.profile
-rw-r--r-- 1 root root 304019 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_6_6_fp16_sm87.engine
-rw-r--r-- 1 root root    474 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_6_6_fp16_sm87.profile
-rw-r--r-- 1 root root 178251 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_7_7_fp16_sm87.engine
-rw-r--r-- 1 root root    298 Nov 22 07:44 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_7_7_fp16_sm87.profile
-rw-r--r-- 1 root root 333150 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_8_8_fp16_sm87.engine
-rw-r--r-- 1 root root    610 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_8_8_fp16_sm87.profile
-rw-r--r-- 1 root root 177753 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_9_9_fp16_sm87.engine
-rw-r--r-- 1 root root    290 Nov 22 07:45 TensorrtExecutionProvider_TRTKernel_graph_main_graph_17196428275334503461_9_9_fp16_sm87.profile

As you can see, some new files were generated today. When it generates new files, are the old files still used? Only two new files were generated today, whereas more than 50 files were generated on Nov. 22. (I only pasted part of the listing.)

@chilo-ms
Contributor

chilo-ms commented Dec 3, 2024

I think there are two topics here:

  1. Why new engine cache files are being generated instead of the old ones being used.
  2. Why, even with min. max. opt. shapes provided, one engine file (and profile file) is still being updated.

For the 1st topic, I saw you mentioned:

I have already generated some trt cache when inferring my ONNX model using TRT Execution Provider. Then, for the online testing of my model, I set so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

What graph opt level did you set for the first/warm-up run?
The graph opt level should be the same for the first run and the test run. If it is not, the graphs consumed by TRT EP might differ, which produces different hash values and ends up generating different engine cache files.

For the 2nd topic, it's possible the range of min. max. opt. values are not large enough to cover all the inputs shapes or intermediate tensor shapes.
If the range is large enough to cover the shapes, every engine should be built only once, meaning you won't see engine cache files being updated.
Looking at your directory snapshot, there are 38 subgraphs run by TRT EP, and it's the last subgraph (TensorrtExecutionProvider_XXX_37_37_fp16_sm87.engine) that is being updated across multiple inference runs.

If you turn on verbose log,

ort.set_default_logger_severity(0)

once the engine files are created and you keep running multiple inference runs, you might see the following log for the last subgraph:
[TensorRT EP] Serialized TensorrtExecutionProvider_XXX_37_37_fp16_sm87.engine
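
Besides reading the verbose log, a rebuilt engine can also be spotted by comparing the cache directory before and after a batch of inference runs. This is a minimal standard-library sketch. `snapshot` and `changed_files` are hypothetical helper names, and it assumes that a rewritten engine file changes its modification time or size.

```python
import os


def snapshot(cache_dir):
    """Map each file name in cache_dir to its (mtime_ns, size)."""
    result = {}
    for entry in os.scandir(cache_dir):
        if entry.is_file():
            stat = entry.stat()
            result[entry.name] = (stat.st_mtime_ns, stat.st_size)
    return result


def changed_files(before, after):
    """Names that are new, or whose mtime or size differ, between two snapshots."""
    return sorted(
        name for name, meta in after.items() if before.get(name) != meta
    )
```

A typical use would be to call `snapshot` on the engine cache directory, run the inference loop, call `snapshot` again, and print `changed_files(before, after)`. An empty list means no engine or profile file was written during those runs.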

@chilo-ms
Contributor

chilo-ms commented Dec 3, 2024

When it generates new files, are the old files still used?

No, the old files won't be used.

BTW, the .engine file is the serialized engine and .profile is the input shape range used to build the engine.
They are being dumped to disk together and will be updated together as well if needed.

@noahzn
Author

noahzn commented Dec 10, 2024

@chilo-ms

What graph opt level did you set for the first/warm-up run?

During warm-up I use so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL. After that, I use so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL during real online inference. So, you suggest not changing graph_optimization_level?

For the 2nd topic, it's possible the range of min. max. opt. values are not large enough to cover all the inputs shapes or intermediate tensor shapes.

Yes, I think so. In our last two tests the caches were not updated.

@chilo-ms
Contributor

chilo-ms commented Dec 11, 2024

So, you suggest not changing graph_optimization_level?

Yes. The hash value should then be the same between the warm-up run and the test run, so you won't see new engine caches being created. Could you give it a try?
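
One way to keep the two runs aligned is to create both the warm-up session and the online session through the same helper, so the graph optimization level and provider options cannot drift apart. The sketch below assumes `ORT_ENABLE_ALL` (the level reported for warm-up in this thread) and reuses the cache paths from the original post. `trt_providers` and `make_session` are hypothetical helper names, and the option set is trimmed for brevity.

```python
# Single source of truth for the level used in BOTH warm-up and online runs.
GRAPH_OPT_LEVEL = "ORT_ENABLE_ALL"


def trt_providers(engine_cache_path, timing_cache_path):
    """Provider list shared by the warm-up run and the online run."""
    return [
        (
            "TensorrtExecutionProvider",
            {
                "trt_fp16_enable": True,
                "trt_engine_cache_enable": True,
                "trt_timing_cache_enable": True,
                "trt_engine_cache_path": engine_cache_path,
                "trt_timing_cache_path": timing_cache_path,
            },
        ),
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ]


def make_session(model_path, engine_cache_path, timing_cache_path):
    """Create an InferenceSession with the shared level and providers."""
    import onnxruntime as ort  # imported lazily; needs a TensorRT-enabled build

    options = ort.SessionOptions()
    options.graph_optimization_level = getattr(
        ort.GraphOptimizationLevel, GRAPH_OPT_LEVEL
    )
    return ort.InferenceSession(
        model_path,
        sess_options=options,
        providers=trt_providers(engine_cache_path, timing_cache_path),
    )
```

Calling `make_session` with the same model path and cache paths in both phases keeps the inputs that are under the caller's control identical. Whether that is enough to stop new cache files from appearing was still unconfirmed at the end of this thread.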

@noahzn
Author

noahzn commented Dec 17, 2024

Thank you! I will let you know. @chilo-ms
