Replies: 1 comment · 9 replies
-
Our main branch should be compatible with TRT 8.5.2. Can you try compiling it against TRT 8.5.2 and see if that works?
-
In case my previous post didn't provide enough context on the error, here is the extended traceback:
-
torch_tensorrt.compile uses torch.jit.script under the hood for the sake of correctness, so there aren't models that trace but produce erroneous results. You can provide a traced model to torch_tensorrt.compile if you already have a working TorchScript model.
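As a minimal sketch of handing a pre-traced TorchScript module to torch_tensorrt.compile: the model, input shape, and settings below are illustrative assumptions, not code from this thread, and the heavy imports are kept inside the function so the sketch stays importable without a TensorRT install.

```python
# Keyword arguments forwarded to torch_tensorrt.compile, kept in a plain
# dict so they can be inspected without importing torch.
compile_settings = {
    # Flip to True if the graph contains int64/float64 ops TensorRT rejects.
    "truncate_long_and_double": False,
}

def compile_traced(model, example_input):
    """Trace `model` ourselves, then hand the traced module to Torch-TensorRT."""
    import torch            # heavy imports kept local on purpose
    import torch_tensorrt   # requires a working TensorRT installation

    traced = torch.jit.trace(model.eval(), example_input)
    return torch_tensorrt.compile(
        traced,
        inputs=[torch_tensorrt.Input(example_input.shape)],
        **compile_settings,
    )
```

The point of tracing first is that you skip the torch.jit.script pass entirely; Torch-TensorRT accepts any working TorchScript module as input.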
-
Changing the code to:
produces the following error:
And adding the truncate_long_and_double argument like so:
produces no error, but it actually runs slower than the base TorchScript model.
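For readers following along, a hedged sketch of how the truncate_long_and_double flag mentioned above is typically passed; the function and shapes are placeholders, not the poster's actual code.

```python
compile_kwargs = {
    # Demote int64/float64 tensors to int32/float32 so TensorRT can handle
    # them. This is the flag that silences the dtype error, at the cost of
    # extra casts at engine boundaries (one plausible source of slowdown).
    "truncate_long_and_double": True,
}

def compile_with_truncation(scripted_model, input_shape):
    import torch_tensorrt  # requires TensorRT; import kept local on purpose

    return torch_tensorrt.compile(
        scripted_model,
        inputs=[torch_tensorrt.Input(input_shape)],
        **compile_kwargs,
    )
```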
-
If the model is not compiled fully, i.e. not all operations run in TensorRT, then performance might be lower, for instance if there are a lot of context switches from PyTorch to TRT. Settings like min block size, strategic additions of converters, or assigning certain ops or modules to run in PyTorch can help you tune the performance of the compiled model.
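A sketch of the tuning knobs this reply names, assuming the torch_tensorrt.compile keyword arguments min_block_size and torch_executed_ops; the values and the op name are illustrative, not recommendations from the thread.

```python
partition_settings = {
    # Require at least this many consecutive TRT-convertible ops before
    # carving out a TensorRT engine, to avoid many tiny engines and the
    # PyTorch<->TRT context switches they cause.
    "min_block_size": 5,
    # Pin specific ops to run in PyTorch (hypothetical op name).
    "torch_executed_ops": {"aten::embedding"},
}

def compile_tuned(scripted_model, input_shape):
    import torch_tensorrt  # requires TensorRT; import kept local on purpose

    return torch_tensorrt.compile(
        scripted_model,
        inputs=[torch_tensorrt.Input(input_shape)],
        **partition_settings,
    )
```

Raising min_block_size trades coverage for fewer engine boundaries; pinning troublesome ops to PyTorch can likewise remove fragmentation around unsupported operations.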
-
Is there something I'm doing wrong that prevents it from compiling fully?
-
Sorry if this has already been asked or answered elsewhere, but are there plans to implement support for TensorRT 8.5.2? I am using the Segformer model, which is not supported in TensorRT 8.5 but is supported by 8.5.2. If there are plans already, is there a timeline for when that would be implemented? Even if the timeline is fluid or merely an estimate, it would be helpful. Thank you!