
Qwen2-VL cannot be converted to a checkpoint on TensorRT-LLM #2658

Open · 2 of 4 tasks
xunuohope1107 opened this issue Jan 5, 2025 · 9 comments
Labels: bug (Something isn't working), Investigating, LLM API/Workflow, triaged (Issue has been triaged by maintainers)

Comments

@xunuohope1107

System Info

  • CPU: x86
  • GPU: 2xL40S
  • Memory: 256GB
  • System: Ubuntu 22.04
  • Docker Image: nvcr.io/nvidia/tritonserver:24.12-trtllm-python-py3
  • TensorRT-LLM version: 0.16.0

Who can help?

I have tested the examples under examples/multimodal. But when I try to convert Qwen2-VL-7B to a checkpoint via `python3 ../qwen/convert_checkpoint.py --model_dir Qwen2-VL-7B-Instruct --output_dir trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu --dtype float16`, I get the error `Unrecognized keys in rope_scaling for 'rope_type'='default': {'mrope_section'}`, so it seems Qwen2-VL is not supported. Is this due to the Docker image I used, or do I have to build TensorRT-LLM from source?
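
(For reference, a minimal way to confirm which TensorRT-LLM wheel the container actually ships, since whether Qwen2-VL conversion works depends on the version:)

```python
# Minimal version check inside the container; the 24.12 Triton image used here
# reports 0.16.0, which is the build hitting the rope_scaling error below.
import tensorrt_llm

print(tensorrt_llm.__version__)
```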

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. cd into examples/multimodal
  2. Run `python3 ../qwen/convert_checkpoint.py --model_dir Qwen2-VL-7B-Instruct --output_dir trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu --dtype float16`

Expected behavior

The checkpoint is written to trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu without any errors.

Actual behavior

Error log:

```
root@04292e29d243:/workspace/TensorRT-LLM/examples/multimodal# python3 ../qwen/convert_checkpoint.py --model_dir Qwen2-VL-7B-Instruct \
    --output_dir trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu \
    --dtype float16
2025-01-03 11:20:24.426668: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-01-03 11:20:24.441389: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1735903224.456763 2272 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1735903224.461320 2272 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-03 11:20:24.477010: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX512_FP16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
[TensorRT-LLM] TensorRT-LLM version: 0.16.0
0.16.0
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/functional.py", line 656, in from_string
    return RotaryScalingType[s]
           ~~~~~~~~~~~~~~~~~^^^
  File "/usr/lib/python3.12/enum.py", line 814, in __getitem__
    return cls._member_map_[name]
           ~~~~~~~~~~~~~~~~^^^^^^
KeyError: 'default'
```

Additional notes

I have tried Phi-3-vision and Qwen2-7B-Instruct as well; both of them work.

xunuohope1107 added the bug label on Jan 5, 2025
github-actions bot added the triaged and Investigating labels on Jan 6, 2025
@nv-guomingz
Collaborator

@sunnyqgg, would you please take a look at this issue?

@sunnyqgg
Collaborator

sunnyqgg commented Jan 6, 2025

Hi,
Please use the latest main code and run `pip install -r requirements-qwen2vl.txt` first.

Thanks.

@xunuohope1107
Author

I rebuilt the Docker image with the latest source code on the main branch. Checkpoint conversion now works for Qwen2-VL. However, run.py still does not seem to work for Qwen2-VL.

I have tried:

```
python run.py --hf_model_dir Qwen2-VL-7B-Instruct \
    --visual_engine_dir trt_engines/Qwen2-VL-7B-Instruct/vision_encoder \
    --llm_engine_dir trt_engines/Qwen2-VL-7B-Instruct/fp16/1-gpu/ \
    --image_path=merlion.png
```

But got:

```
root@00d9a1ccd86f:/workspace/TensorRT-LLM/examples/multimodal# python run.py --hf_model_dir Qwen2-VL-7B-Instruct \
    --visual_engine_dir trt_engines/Qwen2-VL-7B-Instruct/vision_encoder \
    --llm_engine_dir trt_engines/Qwen2-VL-7B-Instruct/fp16/1-gpu/ \
    --image_path=merlion.png
2025-01-10 08:19:36.099445: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-01-10 08:19:36.114432: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1736497176.130732 10771 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1736497176.135485 10771 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-10 08:19:36.152056: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX512_FP16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
[TensorRT-LLM] TensorRT-LLM version: 0.17.0.dev2024121700
[TensorRT-LLM][INFO] Engine version 0.17.0.dev2024121700 found in the config file, assuming engine(s) built by new builder API.
[01/10/2025-08:19:39] [TRT-LLM] [I] Loading engine from trt_engines/Qwen2-VL-7B-Instruct/vision_encoder/model.engine
[01/10/2025-08:19:39] [TRT-LLM] [I] Creating session from engine trt_engines/Qwen2-VL-7B-Instruct/vision_encoder/model.engine
[01/10/2025-08:19:39] [TRT] [I] Loaded engine size: 1303 MiB
[01/10/2025-08:19:40] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +498, now: CPU 0, GPU 1791 (MiB)
[01/10/2025-08:19:40] [TRT-LLM] [I] Running LLM with C++ runner
[TensorRT-LLM][INFO] Engine version 0.17.0.dev2024121700 found in the config file, assuming engine(s) built by new builder API.
[TensorRT-LLM][INFO] MPI size: 1, MPI local size: 1, rank: 0
[TensorRT-LLM][INFO] Engine version 0.17.0.dev2024121700 found in the config file, assuming engine(s) built by new builder API.
[TensorRT-LLM][INFO] Refreshed the MPI local session
[TensorRT-LLM][INFO] MPI size: 1, MPI local size: 1, rank: 0
[TensorRT-LLM][INFO] Rank 0 is using GPU 0
[TensorRT-LLM][INFO] TRTGptModel maxNumSequences: 4
[TensorRT-LLM][INFO] TRTGptModel maxBatchSize: 4
[TensorRT-LLM][INFO] TRTGptModel maxBeamWidth: 1
[TensorRT-LLM][INFO] TRTGptModel maxSequenceLen: 3072
[TensorRT-LLM][INFO] TRTGptModel maxDraftLen: 0
[TensorRT-LLM][INFO] TRTGptModel mMaxAttentionWindowSize: (3072) * 28
[TensorRT-LLM][INFO] TRTGptModel enableTrtOverlap: 0
[TensorRT-LLM][INFO] TRTGptModel normalizeLogProbs: 1
[TensorRT-LLM][INFO] TRTGptModel maxNumTokens: 8192
[TensorRT-LLM][INFO] TRTGptModel maxInputLen: 3071 = min(maxSequenceLen - 1, maxNumTokens) since context FMHA and usePackedInput are enabled
[TensorRT-LLM][INFO] TRTGptModel If model type is encoder, maxInputLen would be reset in trtEncoderModel to maxInputLen: min(maxSequenceLen, maxNumTokens).
[TensorRT-LLM][INFO] Capacity Scheduler Policy: GUARANTEED_NO_EVICT
[TensorRT-LLM][INFO] Context Chunking Scheduler Policy: None
[TensorRT-LLM][INFO] The logger passed into createInferRuntime differs from one already provided for an existing builder, runtime, or refitter. Uses of the global logger, returned by nvinfer1::getLogger(), will return the existing value.
[TensorRT-LLM][INFO] Loaded engine size: 14549 MiB
[TensorRT-LLM][INFO] Inspecting the engine to identify potential runtime issues...
[TensorRT-LLM][INFO] The profiling verbosity of the engine does not allow this analysis to proceed. Re-build the engine with 'detailed' profiling verbosity to get more diagnostics.
[TensorRT-LLM][INFO] [MemUsageChange] Allocated 1000.03 MiB for execution context memory.
[TensorRT-LLM][INFO] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 16332 (MiB)
[TensorRT-LLM][INFO] [MemUsageChange] Allocated 11.49 MB GPU memory for runtime buffers.
[TensorRT-LLM][INFO] [MemUsageChange] Allocated 9.72 MB GPU memory for decoder.
[TensorRT-LLM][INFO] Memory usage when calculating max tokens in paged kv cache: total: 44.52 GiB, available: 27.02 GiB
[TensorRT-LLM][INFO] Number of blocks in KV cache primary pool: 7116
[TensorRT-LLM][INFO] Number of blocks in KV cache secondary pool: 0, onboard blocks to primary memory before reuse: true
[TensorRT-LLM][INFO] Max KV cache pages per sequence: 48
[TensorRT-LLM][INFO] Number of tokens per block: 64.
[TensorRT-LLM][INFO] [MemUsageChange] Allocated 24.32 GiB for max tokens in paged KV cache (455424).
[TensorRT-LLM][INFO] Enable MPI KV cache transport.
[01/10/2025-08:19:51] [TRT-LLM] [I] Load engine takes: 10.98725938796997 sec
Traceback (most recent call last):
  File "/workspace/TensorRT-LLM/examples/multimodal/run.py", line 88, in <module>
    input_text, output_text = model.run(args.input_text, raw_image,
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/runtime/multimodal_model_runner.py", line 1989, in run
    input_text, pre_prompt, post_prompt, processed_image, decoder_input_ids, other_vision_inputs, other_decoder_inputs = self.setup_inputs(
                                                                                                                         ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/runtime/multimodal_model_runner.py", line 1728, in setup_inputs
    processor.apply_chat_template(msg,
    ^^^^^^^^^
NameError: name 'processor' is not defined. Did you mean: 'self.processor'?
[TensorRT-LLM][INFO] Refreshed the MPI local session.
```

Any suggestions here? Thanks!

@xunuohope1107
Author


The issue happens here:
```
[01/10/2025-08:19:51] [TRT-LLM] [I] Load engine takes: 10.98725938796997 sec
Traceback (most recent call last):
  File "/workspace/TensorRT-LLM/examples/multimodal/run.py", line 88, in <module>
    input_text, output_text = model.run(args.input_text, raw_image,
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/runtime/multimodal_model_runner.py", line 1989, in run
    input_text, pre_prompt, post_prompt, processed_image, decoder_input_ids, other_vision_inputs, other_decoder_inputs = self.setup_inputs(
                                                                                                                         ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/runtime/multimodal_model_runner.py", line 1728, in setup_inputs
    processor.apply_chat_template(msg,
    ^^^^^^^^^
NameError: name 'processor' is not defined. Did you mean: 'self.processor'?
```

@sunnyqgg
Collaborator

Hi @xunuohope1107,
Please add `processor = AutoProcessor.from_pretrained(self.args.hf_model_dir)` in tensorrt_llm/runtime/multimodal_model_runner.py.

Thanks.
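
A minimal sketch of where that line could go, just above the failing call in setup_inputs() (the surrounding names msg, messages, and self.args come from the snippets quoted in this thread; treat this as an illustration, not the exact file contents):

```python
# Hypothetical placement inside the qwen2_vl branch of setup_inputs() in
# tensorrt_llm/runtime/multimodal_model_runner.py: build the HF processor
# locally before it is used, per the suggestion above.
from transformers import AutoProcessor  # add if not already imported

processor = AutoProcessor.from_pretrained(self.args.hf_model_dir)
texts = [
    processor.apply_chat_template(msg, tokenize=False, add_generation_prompt=True)
    for msg in messages
]
```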

@xunuohope1107
Author

Yeah, I have checked tensorrt_llm/runtime/multimodal_model_runner.py, but it already has:

```python
if self.model_type == "qwen2_vl":
    hf_config = AutoConfig.from_pretrained(self.args.hf_model_dir)
    self.vision_start_token_id = hf_config.vision_start_token_id
    self.vision_end_token_id = hf_config.vision_end_token_id
    self.vision_token_id = hf_config.vision_token_id
    self.image_token_id = hf_config.image_token_id
    self.video_token_id = hf_config.video_token_id
    self.spatial_merge_size = hf_config.vision_config.spatial_merge_size
    self.max_position_embeddings = hf_config.max_position_embeddings
    self.hidden_size = hf_config.hidden_size
    self.num_attention_heads = hf_config.num_attention_heads
    self.rope_theta = hf_config.rope_theta
```

@xunuohope1107
Author


Do you mean modifying the code like this:

```python
elif 'qwen2_vl' in self.model_type:
    from qwen_vl_utils import process_vision_info
    from transformers.models.qwen2_vl.modeling_qwen2_vl import VisionRotaryEmbedding
    hf_config = AutoConfig.from_pretrained(self.args.hf_model_dir)
    if input_text is None:
        input_text = "Question: Describe this image. Answer:"
    messages = [[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": raw_image[idx],
            },
            {
                "type": "text",
                "text": input_text[idx],
            },
        ],
    }] for idx in range(self.args.batch_size)]

    texts = [
        hf_config.apply_chat_template(msg,
                                      tokenize=False,
                                      add_generation_prompt=True)
        for msg in messages
    ]
```

i.e. change `processor.apply_chat_template` to `hf_config.apply_chat_template`?
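
(For reference, a config object returned by AutoConfig.from_pretrained does not provide apply_chat_template; the chat template belongs to the processor/tokenizer. Given the NameError hint "Did you mean: 'self.processor'?", another minimal variant, assuming the runner already sets a processor attribute in its constructor, would be:)

```python
# Sketch only, assuming self.processor is already set on the runner
# (the NameError suggestion implies such an attribute exists); otherwise
# create a local processor as in the maintainer's suggestion above.
texts = [
    self.processor.apply_chat_template(msg, tokenize=False, add_generation_prompt=True)
    for msg in messages
]
```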

@lessmore991

> Hi, please use the latest main code and run `pip install -r requirements-qwen2vl.txt` first.
>
> Thanks.

Do I still need to install from source based on the code submitted in 21fac7? Or can I use the latest version of transformers directly?

@zhaocc1106

You can try editing the TensorRT-LLM source:
`vim /usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/qwen/config.py +146`
[screenshot of the suggested change to config.py]
