⚠️ Add warning guidelines and update codebase to follow best practices #2350
base: main
Conversation
warnings.warn(
    "You passed a model_id to the BCOTrainer. This will automatically create an "
    "`AutoModelForCausalLM` or a `PeftModel` (if you passed a `peft_config`) for you."
)
Warnings should not indicate normal behavior.
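For instance, the message above describes normal, documented behavior. One option (a hypothetical sketch, not the change made in this diff) is to demote it to an info-level log, which stays out of the warning stream:

```python
import logging

logger = logging.getLogger(__name__)

# Normal, documented behavior: log it instead of warning about it.
logger.info(
    "You passed a model_id to the BCOTrainer; an `AutoModelForCausalLM` "
    "will be created automatically."
)
```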
warnings.warn(
    "You passed a ref model_id to the BCOTrainer. This will automatically create an "
    "`AutoModelForCausalLM`"
)
Warnings should not indicate normal behavior.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@@ -99,7 +99,8 @@
 if model_config.use_peft and model_config.lora_task_type != "SEQ_CLS":
     warnings.warn(
         "You are using a `task_type` that is different than `SEQ_CLS` for PEFT. This will lead to silent bugs"
-        " Make sure to pass --lora_task_type SEQ_CLS when using this script with PEFT."
+        " Make sure to pass --lora_task_type SEQ_CLS when using this script with PEFT.",
+        UserWarning,
Use the appropriate warning type.
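As a rough guide (hypothetical examples, not lines from this diff), the category should match the situation rather than relying on the implicit `UserWarning` default:

```python
import warnings

# A misconfiguration the user can fix right away: UserWarning
warnings.warn("Pass `--lora_task_type SEQ_CLS` when using PEFT.", UserWarning)

# An API slated for removal, aimed at end users (shown by default): FutureWarning
warnings.warn("`foo` is deprecated; use `bar` instead.", FutureWarning)

# A deprecation aimed at other developers (hidden by default): DeprecationWarning
warnings.warn("`_internal_helper` will be removed.", DeprecationWarning)
```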
@@ -296,7 +296,8 @@ def randn_tensor(
 warnings.warn(
     f"The passed generator was created on 'cpu' even though a tensor on {device} was expected."
     f" Tensors will be created on 'cpu' and then moved to {device}. Note that one can probably"
-    f" slighly speed up this function by passing a generator that was created on the {device} device."
+    f" slighly speed up this function by passing a generator that was created on the {device} device.",
+    UserWarning,
Use the appropriate warning type.
-if np.array(predictions[:, 0] == predictions[:, 1], dtype=float).sum() > 0:
+equal_predictions_count = np.array(predictions[:, 0] == predictions[:, 1], dtype=float).sum()
+if equal_predictions_count > 0:
     warnings.warn(
-        f"There are {np.array(predictions[:, 0] == predictions[:, 1]).sum()} out of {len(predictions[:, 0])} instances where the predictions for both options are equal. As a consequence the accuracy can be misleading."
+        f"There are {equal_predictions_count} out of {len(predictions[:, 0])} instances where the predictions for "
+        "both options are equal. As a consequence the accuracy can be misleading.",
+        UserWarning,
Warnings must be actionable.
Warnings should not indicate normal behavior.
Use the appropriate warning type.
This warning remains not actionable. Any idea to solve this?
warnings.warn("install rich to display text") | ||
return | ||
raise ImportError( | ||
"The `rich` library is required to display text with formatting. " | ||
"Install it using `pip install rich`." | ||
) |
Warnings should not indicate normal behavior.
warnings.warn("install rich to display tokens") | ||
return | ||
raise ImportError( | ||
"The `rich` library is required to display tokens with formatting. " | ||
"Install it using `pip install rich`." | ||
) |
Warnings should not indicate normal behavior.
warnings.warn("install rich to display colour legend") | ||
return | ||
raise ImportError( | ||
"The `rich` library is required to display colour legends with formatting. " | ||
"Install it using `pip install rich`." | ||
) |
Warnings should not indicate normal behavior.
"If you are aware that the pretrained model has no lora weights to it, ignore this message. " | ||
"Otherwise please check the if `pytorch_lora_weights.safetensors` exists in the model folder." | ||
"Trying to load LoRA weights but no LoRA weights found. Set `use_lora=False` or check that " | ||
"`pytorch_lora_weights.safetensors` exists in the model folder.", | ||
UserWarning, |
Warnings must be actionable.
Warnings should not indicate normal behavior.
if self.log_with not in ["wandb", "tensorboard"]:
    warnings.warn(
        "Accelerator tracking only supports image logging if `log_with` is set to 'wandb' or 'tensorboard'."
    )
Warnings should not indicate normal behavior.
if self.log_with == "wandb" and not is_torchvision_available(): | ||
warnings.warn("Wandb image logging requires torchvision to be installed") |
Warnings should not indicate normal behavior.
"You set `output_router_logits` to True in the model config, but `router_aux_loss_coef` is set to 0.0," | ||
" meaning the auxiliary loss will not be used." | ||
"You set `output_router_logits` to `True` in the model config, but `router_aux_loss_coef` is set to " | ||
"`0.0`, meaning the auxiliary loss will not be used. Either set `router_aux_loss_coef` to a value " | ||
"greater than `0.0`, or set `output_router_logits` to `False` if you don't want to use the auxiliary " | ||
"loss.", | ||
UserWarning, |
Warnings must be actionable.
Use the appropriate warning type.
@@ -705,7 +700,6 @@ def make_inputs_require_grad(module, input, output):
     self.running = RunningMoments(accelerator=self.accelerator)

-    if self.embedding_func is None:
-        warnings.warn("You did not pass `embedding_func` underlying distribution matching feature is deactivated.")
Warnings should not indicate normal behavior.
-if not os.path.isfile(running_file):
-    warnings.warn(f"Missing file {running_file}. Will use a new running delta value for BCO loss calculation")
-else:
+if os.path.isfile(running_file):
Warnings should not indicate normal behavior.
-if not os.path.isfile(running_file):
-    warnings.warn(f"Missing file {clf_file}. Will use a new UDM classifier for BCO loss calculation")
-else:
+if os.path.isfile(running_file):
Warnings should not indicate normal behavior.
if not self.use_dpo_data_collator:
    warnings.warn(
        "prediction_step is only implemented for DPODataCollatorWithPadding, and you passed a datacollator that is different than "
        "DPODataCollatorWithPadding - you might see unexpected behavior. Alternatively, you can implement your own prediction_step method if you are using a custom data collator"
    )
Warnings should not indicate normal behavior.
if not self.use_dpo_data_collator:
    warnings.warn(
        "prediction_step is only implemented for DPODataCollatorWithPadding, and you passed a datacollator that is different than "
        "DPODataCollatorWithPadding - you might see unexpected behavior. Alternatively, you can implement your own prediction_step method if you are using a custom data collator"
    )
Warnings should not indicate normal behavior.
if not self.use_dpo_data_collator:
    warnings.warn(
        "compute_loss is only implemented for DPODataCollatorWithPadding, and you passed a datacollator that is different than "
        "DPODataCollatorWithPadding - you might see unexpected behavior. Alternatively, you can implement your own prediction_step method if you are using a custom data collator"
    )
Warnings should not indicate normal behavior.
if self.cpo_alpha > 0:
    warnings.warn(
        "You are using CPO-SimPO method because you set a non-zero cpo_alpha. "
        "This will result in the CPO-SimPO method "
        "(https://github.com/fe1ixxu/CPO_SIMPO/tree/main). "
        "If you want to use a pure SimPO method, please set cpo_alpha to 0."
    )
Warnings should not indicate normal behavior.
"You set `output_router_logits` to True in the model config, but `router_aux_loss_coef` is set to 0.0," | ||
" meaning the auxiliary loss will not be used." | ||
"You set `output_router_logits` to `True` in the model config, but `router_aux_loss_coef` is set to " | ||
"`0.0`, meaning the auxiliary loss will not be used. Either set `router_aux_loss_coef` to a value " | ||
"greater than `0.0`, or set `output_router_logits` to `False` if you don't want to use the auxiliary " | ||
"loss.", | ||
UserWarning, |
Warnings must be actionable.
Use the appropriate warning type.
warnings.warn(
    "You passed a ref model_id to the KTOTrainer. This will automatically create an "
    "`AutoModelForCausalLM`"
)
Warnings should not indicate normal behavior.
"You set `output_router_logits` to True in the model config, but `router_aux_loss_coef` is set to 0.0," | ||
" meaning the auxiliary loss will not be used." | ||
"You set `output_router_logits` to `True` in the model config, but `router_aux_loss_coef` is set to " | ||
"`0.0`, meaning the auxiliary loss will not be used. Either set `router_aux_loss_coef` to a value " | ||
"greater than `0.0`, or set `output_router_logits` to `False` if you don't want to use the auxiliary " | ||
"loss.", | ||
UserWarning, |
Warnings must be actionable.
Use the appropriate warning type.
if not self.use_dpo_data_collator:
    warnings.warn(
        "compute_loss is only implemented for DPODataCollatorWithPadding, and you passed a datacollator that is different than "
        "DPODataCollatorWithPadding - you might see unexpected behavior. Alternatively, you can implement your own prediction_step method if you are using a custom data collator"
    )
Warnings should not indicate normal behavior.
if not self.use_dpo_data_collator:
    warnings.warn(
        "prediction_step is only implemented for DPODataCollatorWithPadding, and you passed a datacollator that is different than "
        "DPODataCollatorWithPadding - you might see unexpected behavior. Alternatively, you can implement your own prediction_step method if you are using a custom data collator"
    )
Warnings should not indicate normal behavior.
"Ignoring `judge` and using `reward_model`." | ||
"Ignoring `judge` and using `reward_model`.", | ||
UserWarning, |
Use the appropriate warning type.
warnings.warn(
    "You passed a model_id to the ORPOTrainer. This will automatically create an "
    "`AutoModelForCausalLM` or a `PeftModel` (if you passed a `peft_config`) for you."
)
Warnings should not indicate normal behavior.
"You set `output_router_logits` to True in the model config, but `router_aux_loss_coef` is set to 0.0," | ||
" meaning the auxiliary loss will not be used." | ||
"You set `output_router_logits` to `True` in the model config, but `router_aux_loss_coef` is set to " | ||
"`0.0`, meaning the auxiliary loss will not be used. Either set `router_aux_loss_coef` to a value " | ||
"greater than `0.0`, or set `output_router_logits` to `False` if you don't want to use the auxiliary " | ||
"loss.", | ||
UserWarning, |
Warnings must be actionable.
Use the appropriate warning type.
if not self.use_dpo_data_collator:
    warnings.warn(
        "compute_loss is only implemented for DPODataCollatorWithPadding, and you passed a datacollator that is different than "
        "DPODataCollatorWithPadding - you might see unexpected behavior. Alternatively, you can implement your own prediction_step method if you are using a custom data collator"
    )
Warnings should not indicate normal behavior.
if type(args) is TrainingArguments:
    warnings.warn(
        "Using `transformers.TrainingArguments` for `args` is deprecated and will be removed in a future version. Please use `RewardConfig` instead.",
        FutureWarning,
It has been deprecated for more than a year (#748)
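If the shim were dropped outright (a hypothetical sketch of that follow-up, not part of this diff), the check could fail fast instead of warning:

```python
from transformers import TrainingArguments

def _validate_args(args):  # hypothetical helper, for illustration only
    # The deprecation period has long passed, so raise instead of warning.
    if type(args) is TrainingArguments:
        raise TypeError(
            "`transformers.TrainingArguments` is no longer supported for `args`; "
            "pass a `RewardConfig` instead."
        )
```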
@@ -126,9 +126,7 @@ def __init__(
     formatting_func: Optional[Callable] = None,
 ):
     if args is None:
-        output_dir = "tmp_trainer"
-        warnings.warn(f"No `SFTConfig` passed, using `output_dir={output_dir}`.")
Warnings should not indicate normal behavior.
if not self.use_reward_data_collator:
    warnings.warn(
        "The current compute_loss is implemented for RewardDataCollatorWithPadding,"
        " if you are using a custom data collator make sure you know what you are doing or"
        " implement your own compute_loss method."
    )
Warnings should not indicate normal behavior.
@@ -189,7 +173,7 @@ def __init__(
     "A processing_class must be specified when using the default RewardDataCollatorWithPadding"
 )
 if max_length is None:
     max_length = 512 if type(args) is TrainingArguments or args.max_length is None else args.max_length
It has been deprecated for more than a year (#748).
"please update to the latest version of peft to use `gradient_checkpointing_kwargs`." | ||
"please update to the latest version of peft to use `gradient_checkpointing_kwargs`.", | ||
UserWarning, |
Use the appropriate warning type.
warnings.warn(
    "You passed a model_id to the SFTTrainer. This will automatically create an "
    "`AutoModelForCausalLM` or a `PeftModel` (if you passed a `peft_config`) for you."
)
Warnings should not indicate normal behavior.
warnings.warn(
    f"You didn't pass a `max_seq_length` argument to the SFTTrainer, this will default to {args.max_seq_length}"
)
Warnings should not indicate normal behavior.
"You passed a processing_class with `padding_side` not equal to `right` to the SFTTrainer. This might lead to some unexpected behaviour due to " | ||
"overflow issues when training a model in half-precision. You might consider adding `processing_class.padding_side = 'right'` to your code." | ||
"You passed a processing_class with `padding_side` not equal to `right` to the SFTTrainer. This might " | ||
"lead to some unexpected behaviour due to overflow issues when training a model in half-precision. " | ||
"You might consider adding `processing_class.padding_side = 'right'` to your code.", | ||
UserWarning, |
Warnings must be actionable.
Use the appropriate warning type.
This warning is still not actionable. Any idea?
warnings.warn(
    "You passed `packing=True` to the SFTTrainer/SFTConfig, and you are training your model with `max_steps` strategy. The dataset will be iterated until the `max_steps` are reached."
)
Warnings should not indicate normal behavior.
"You passed a dataset that is already processed (contains an `input_ids` field) together with a valid formatting function. Therefore `formatting_func` will be ignored." | ||
"You passed a dataset that is already processed (contains an `input_ids` field) together with a " | ||
"valid formatting function. Therefore `formatting_func` will be ignored. Either remove the " | ||
"`formatting_func` or pass a dataset that is not already processed.", | ||
UserWarning, |
Warnings must be actionable.
Use the appropriate warning type.
"You passed `remove_unused_columns=False` on a non-packed dataset. This might create some issues with the default collator and yield to errors. If you want to " | ||
f"inspect dataset other columns (in this case {extra_columns}), you can subclass `DataCollatorForLanguageModeling` in case you used the default collator and create your own data collator in order to inspect the unused dataset columns." | ||
"You passed `remove_unused_columns=False` on a non-packed dataset. This might create some issues with " | ||
"the default collator and yield to errors. If you want to inspect dataset other columns (in this " | ||
f"case {extra_columns}), you can subclass `DataCollatorForLanguageModeling` in case you used the " | ||
"default collator and create your own data collator in order to inspect the unused dataset columns.", | ||
UserWarning, |
Use the appropriate warning type.
"To avoid this, set the pad_token_id to a different value." | ||
"To avoid this, set the pad_token_id to a different value.", | ||
UserWarning, |
Warnings should not indicate normal behavior.
Use the appropriate warning type.
This warning can still occur in normal behavior. Any idea?
trl/trainer/utils.py
Outdated
f"Could not find response key `{self.response_template}` in the " | ||
f'following instance: {self.tokenizer.decode(batch["input_ids"][i])} ' | ||
f"This instance will be ignored in loss calculation. " | ||
f"Note, if this happens often, consider increasing the `max_seq_length`." | ||
f"Could not find response key `{self.response_template}` in the following instance: " | ||
f"{self.tokenizer.decode(batch["input_ids"][i])}. This instance will be ignored in loss " | ||
"calculation. Note, if this happens often, consider increasing the `max_seq_length`.", | ||
UserWarning, |
Warnings should not indicate normal behavior.
Use the appropriate warning type.
This warning can still occur in normal behavior. Any idea?
trl/trainer/utils.py
Outdated
f"Could not find response key `{self.response_template}` in the " | ||
f'following instance: {self.tokenizer.decode(batch["input_ids"][i])} ' | ||
f"This instance will be ignored in loss calculation. " | ||
f"Note, if this happens often, consider increasing the `max_seq_length`." | ||
f"Could not find response key `{self.response_template}` in the following instance: " | ||
f"{self.tokenizer.decode(batch["input_ids"][i])}. This instance will be ignored in loss " | ||
"calculation. Note, if this happens often, consider increasing the `max_seq_length`.", | ||
UserWarning, |
Warnings should not indicate normal behavior.
Use the appropriate warning type.
This warning can still occur in normal behavior. Any idea?
if tokenizer.eos_token_id is None:
    warnings.warn(
        "The passed tokenizer does not have an EOS token. We will use the passed eos_token_id instead which corresponds"
        f" to {eos_token_id}. If this is not the correct EOS token, make sure to pass the correct eos_token_id."
    )
Warnings should not indicate normal behavior.
trl/trainer/utils.py
Outdated
f"Could not find instruction key `{self.instruction_template}` in the " | ||
f'following instance: {self.tokenizer.decode(batch["input_ids"][i])} ' | ||
f"This instance will be ignored in loss calculation. " | ||
f"Note, if this happens often, consider increasing the `max_seq_length`." | ||
f"Could not find instruction key `{self.instruction_template}` in the following instance: " | ||
f"{self.tokenizer.decode(batch["input_ids"][i])}. This instance will be ignored in loss " | ||
"calculation. Note, if this happens often, consider increasing the `max_seq_length`.", | ||
UserWarning, |
Warnings should not indicate normal behavior.
Use the appropriate warning type.
This warning can still occur in normal behavior. Any idea?
What does this PR do?
Some warnings that were previously triggered during normal operation have been removed, reducing noise in our CI pipeline. This change brings us closer to our goal of zero warnings.
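One way to hold the line once the count reaches zero (a hypothetical setup, not something this PR adds) is to escalate warnings to errors in the test suite:

```python
import warnings

# Hypothetical test-suite hook: any warning raised during tests becomes an
# error, so new warnings fail CI instead of accumulating silently.
warnings.simplefilter("error")
```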
CI Warnings Count (Dev Dependencies):
A follow-up PR is planned to address the remaining legitimate warnings.
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.