
trust_remote_code argument ignored in load_calib_dataset() #2537

Open
hiroshi-matsuda-rit opened this issue Dec 5, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@hiroshi-matsuda-rit

hiroshi-matsuda-rit commented Dec 5, 2024

System Info

  • CPU: Any
  • GPU: Any
  • TensorRT-LLM main branch (including <= v0.15.0)
  • NVIDIA driver: Any
  • OS: Any

Who can help?

The trust_remote_code argument is never used in tensorrt_llm.models.convert_utils.load_calib_dataset(). @Tracin
https://github.com/NVIDIA/TensorRT-LLM/blob/v0.15.0/tensorrt_llm/models/convert_utils.py#L284

Some quantization models require loading Hugging Face datasets that include custom code, and by default the HF datasets library runs a time-limited interactive prompt on the command line asking whether to trust the remote code.
This behavior is not suitable for non-interactive, automated quantization processes.
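One way to address this is to actually forward the argument to datasets.load_dataset(). A minimal sketch, assuming the rough shape of load_calib_dataset() in v0.15.0; parameter names other than trust_remote_code are illustrative, not the exact TensorRT-LLM signature:

```python
def load_calib_dataset(dataset_name_or_dir,
                       config_name=None,
                       split=None,
                       key=None,
                       trust_remote_code=True,
                       **kwargs):
    # Imported lazily so the sketch is self-contained.
    import datasets

    # The fix: pass trust_remote_code through instead of dropping it,
    # so non-interactive runs never hit the interactive prompt.
    dataset = datasets.load_dataset(dataset_name_or_dir,
                                    name=config_name,
                                    split=split,
                                    trust_remote_code=trust_remote_code,
                                    **kwargs)
    return dataset[key] if key is not None else dataset
```

Callers that run unattended would then pass trust_remote_code explicitly rather than relying on the prompt.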

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

MODEL_DIR=Llama-3.3-70B-Instruct
DTYPE=bfloat16 # or float16
TP=2
PP=1
MAX_SEQ_LEN=2048
SETTINGS=sq-0.5_tp${TP}_pp${PP}_$((MAX_SEQ_LEN / 1024))k
CKPT_DIR=./${MODEL_DIR}.${SETTINGS}.ckpt
ENGINE_DIR=./${MODEL_DIR}.${SETTINGS}
python3 TensorRT-LLM/examples/llama/convert_checkpoint.py \
  --model_dir ${MODEL_DIR} --dtype ${DTYPE} \
  --tp_size ${TP} --pp_size ${PP} \
  --smoothquant 0.5 --per_token --per_channel \
  --output_dir ${CKPT_DIR}

Expected behavior

The quantization process should succeed without trust_remote_code-related errors.

Actual behavior

The quantization process freezes at load_calib_dataset() in tensorrt_llm/models/convert_utils.py, waiting for user input on whether to enable trust_remote_code.

Additional notes

We should modify load_calib_dataset() in one of the following ways:

  1. Remove trust_remote_code from the arguments of load_calib_dataset()
  2. Change the default value of trust_remote_code to False and pass it to datasets.load_dataset()

In either case, trust_remote_code should be passed as an argument when calling load_calib_dataset() if the dataset specified by dataset_name_or_dir includes custom code.
My suggestion is to add a calib_trust_remote_code argument to the command-line options of quantization.py and other scripts in similar situations.

parser.add_argument(
    '--calib_dataset',
    type=str,
    default='cnn_dailymail',
    help="The huggingface dataset name or the local directory of the dataset for calibration."
)
parser.add_argument(
    '--calib_tp_size',
    type=int,
    default=1,
    help="Tensor parallel size for calibration; effective for NeMo checkpoint only."
)
parser.add_argument(
    '--calib_pp_size',
    type=int,
    default=1,
    help="Pipeline parallel size for calibration; effective for NeMo checkpoint only."
)

@hiroshi-matsuda-rit hiroshi-matsuda-rit added the bug Something isn't working label Dec 5, 2024
@hiroshi-matsuda-rit
Author

@Tracin I just described more detailed reproduction steps.
