Usage
```bash
python src/finetune.py
```
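The fields of the argument classes documented below are exposed as command-line flags of the same name (the script presumably parses them with transformers' `HfArgumentParser`). A minimal sketch of an invocation; `--do_train` and `--output_dir` are standard training arguments assumed here, not arguments documented on this page:

```bash
# Minimal sketch: each dataclass field below maps to a --flag of the same name.
# --do_train and --output_dir are standard transformers training flags (assumed).
python src/finetune.py \
    --do_train \
    --dataset alpaca_zh \
    --output_dir path_to_checkpoint
```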
class utils.config.ModelArguments
- `model_name_or_path` (str, optional): Path to pretrained model or model identifier from huggingface.co/models. Default: `CHATGLM_REPO_NAME`
- `config_name` (str, optional): Pretrained config name or path if not the same as `model_name`. Default: `None`
- `tokenizer_name` (str, optional): Pretrained tokenizer name or path if not the same as `model_name`. Default: `None`
- `cache_dir` (str, optional): Where to store the pretrained models downloaded from huggingface.co. Default: `None`
- `use_fast_tokenizer` (bool, optional): Whether to use a fast tokenizer (backed by the tokenizers library) or not. Default: `True`
- `model_revision` (str, optional): The specific model version to use (can be a branch name, tag name or commit id). Default: `CHATGLM_LASTEST_HASH`
- `use_auth_token` (str, optional): Will use the token generated when running `huggingface-cli login`. Default: `False`
- `resize_position_embeddings` (bool, optional): Whether to resize the position embeddings if `max_source_length` exceeds the model's position embedding size. Default: `False`
- `quantization_bit` (int, optional): The number of bits to quantize the model. Default: `None`
- `checkpoint_dir` (str, optional): Path to the directory containing the model checkpoints as well as the configurations. Default: `None`
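As an illustration of the model arguments above, the sketch below loads the default ChatGLM weights, quantizes them to 4 bits, and resumes from a previously saved checkpoint. The paths are placeholders, and `--do_train`/`--output_dir` are assumed standard training flags rather than arguments from this page:

```bash
# Illustrative sketch only: 4-bit quantization plus resuming from an earlier checkpoint.
# Paths are placeholders; --do_train and --output_dir are standard training flags (assumed).
python src/finetune.py \
    --do_train \
    --quantization_bit 4 \
    --checkpoint_dir path_to_previous_checkpoint \
    --output_dir path_to_new_checkpoint
```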
class utils.config.DataTrainingArguments
- `dataset` (str, optional): The name of the provided dataset(s) to use. Use commas to separate multiple datasets. Default: `alpaca_zh`
- `dataset_dir` (str, optional): The name of the folder containing datasets. Default: `data`
- `split` (str, optional): Which dataset split to use for training and evaluation. Default: `train`
- `overwrite_cache` (bool, optional): Overwrite the cached training and evaluation sets. Default: `False`
- `preprocessing_num_workers` (int, optional): The number of processes to use for the preprocessing. Default: `None`
- `max_source_length` (int, optional): The maximum total input sequence length after tokenization. Default: `512`
- `max_target_length` (int, optional): The maximum total output sequence length after tokenization. Default: `512`
- `pad_to_max_length` (bool, optional): Whether to pad all samples to the model maximum sentence length or not. Default: `False`
- `max_train_samples` (int, optional): For debugging purposes, truncate the number of training examples for each dataset. Default: `None`
- `max_eval_samples` (int, optional): For debugging purposes, truncate the number of evaluation examples for each dataset. Default: `None`
- `num_beams` (int, optional): Number of beams to use for evaluation. This argument will be passed to `model.generate`. Default: `None`
- `ignore_pad_token_for_loss` (bool, optional): Whether to ignore the tokens corresponding to padded labels in the loss computation or not. Default: `True`
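For a quick debugging run, the data arguments above can be combined to shorten sequences and cap the number of examples. A sketch, where `alpaca_zh` is the documented default dataset and the flags not documented on this page (`--do_train`, `--output_dir`) are assumptions:

```bash
# Sketch of a small debugging run: shorter sequences and only 100 training samples.
python src/finetune.py \
    --do_train \
    --dataset alpaca_zh \
    --max_source_length 256 \
    --max_target_length 256 \
    --max_train_samples 100 \
    --output_dir path_to_debug_checkpoint
```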
class utils.config.FinetuningArguments
- `finetuning_type`: Which fine-tuning method to use for training. Default: `lora`
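Putting the argument groups together, a LoRA fine-tuning run on the default dataset might look like the sketch below. Only `--dataset`, `--finetuning_type`, and `--quantization_bit` come from this page; the remaining flags are standard `Seq2SeqTrainingArguments` assumed here for completeness:

```bash
# End-to-end sketch: LoRA fine-tuning on the default alpaca_zh dataset.
# Only --dataset, --finetuning_type and --quantization_bit are documented on this page;
# the other flags are assumed standard Seq2SeqTrainingArguments.
CUDA_VISIBLE_DEVICES=0 python src/finetune.py \
    --do_train \
    --dataset alpaca_zh \
    --finetuning_type lora \
    --quantization_bit 4 \
    --per_device_train_batch_size 4 \
    --learning_rate 5e-5 \
    --num_train_epochs 3 \
    --output_dir output/chatglm_lora
```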