diff --git a/docs/genai/reference/config.md b/docs/genai/reference/config.md
index 95b143c84489e..19ee41fbf7bc4 100644
--- a/docs/genai/reference/config.md
+++ b/docs/genai/reference/config.md
@@ -79,73 +79,73 @@ A configuration file called genai_config.json is generated automatically if the
 
 #### General model config
 
-* _type_: The type of model. Can be phi, llama or gpt.
+* **_type_**: The type of model. Can be `phi`, `llama`, or `gpt`.
 
-* _vocab_size_: The size of the vocabulary that the model processes ie the number of tokens in the vocabulary.
+* **_vocab_size_**: The size of the vocabulary that the model processes, i.e. the number of tokens in the vocabulary.
 
-* _bos_token_id_: The id of the beginning of sequence token.
+* **_bos_token_id_**: The id of the beginning-of-sequence token.
 
-* _eos_token_id_: The id of the end of sequence token.
+* **_eos_token_id_**: The id of the end-of-sequence token.
 
-* _pad_token_: The id of the padding token.
+* **_pad_token_id_**: The id of the padding token.
 
-* _context_length_: The maxinum length of sequence that the model can process.
+* **_context_length_**: The maximum length of sequence that the model can process.
 
 #### Session options
 
 These are the options that are passed to ONNX Runtime, which runs the model on each token generation iteration.
 
-* _provider_options_: a priortized list of execution targets on which to run the model. If running on CPU, this option is not present. A list of execution provider specific configurations can be specified inside the provider item.
+* **_provider_options_**: A prioritized list of execution targets on which to run the model. If running on CPU, this option is not present. A list of execution provider-specific configurations can be specified inside the provider item.
 
-* _log_id_: a prefix to output when logging
+* **_log_id_**: A prefix to output when logging.
 
-Then For each model in the pipeline there is one section, named by the model.
+Then, for each model in the pipeline, there is one section, named by the model.
 
 #### Decoder model config
 
-* _filename_: The name of the model file.
+* **_filename_**: The name of the model file.
 
-* _inputs_: The names of each of the inputs. Sequences of model inputs can contain a wildcard representing the index in the sequence.
+* **_inputs_**: The names of each of the inputs. Sequences of model inputs can contain a wildcard representing the index in the sequence.
 
-* _outputs_: The names of each of the outputs.
+* **_outputs_**: The names of each of the outputs.
 
-* _num_attention_heads: The number of attention heads in the model.
+* **_num_attention_heads_**: The number of attention heads in the model.
 
-* _head_size_: The size of the attention heads.
+* **_head_size_**: The size of the attention heads.
 
-* _hidden_size_: The size of the hidden layers.
+* **_hidden_size_**: The size of the hidden layers.
 
-* _num_key_value_heads_: The number of key value heads.
+* **_num_key_value_heads_**: The number of key-value heads.
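+
+The following sketch shows how the model, session, and decoder options above might fit together in a generated genai_config.json. The values are illustrative only (loosely in the shape of a Phi-2-style model, assuming the CUDA execution provider), and `%d` is the wildcard standing for the index in a sequence of inputs or outputs:
+
+```json
+{
+  "model": {
+    "type": "phi",
+    "vocab_size": 51200,
+    "bos_token_id": 50256,
+    "eos_token_id": 50256,
+    "pad_token_id": 50256,
+    "context_length": 2048,
+    "decoder": {
+      "session_options": {
+        "log_id": "onnxruntime-genai",
+        "provider_options": [
+          { "cuda": {} }
+        ]
+      },
+      "filename": "model.onnx",
+      "inputs": {
+        "input_ids": "input_ids",
+        "attention_mask": "attention_mask",
+        "past_key_names": "past_key_values.%d.key",
+        "past_value_names": "past_key_values.%d.value"
+      },
+      "outputs": {
+        "logits": "logits",
+        "present_key_names": "present.%d.key",
+        "present_value_names": "present.%d.value"
+      },
+      "num_attention_heads": 32,
+      "head_size": 80,
+      "hidden_size": 2560,
+      "num_key_value_heads": 32
+    }
+  }
+}
+```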
 
-### Search section
+### Generation search section
 
-* _max_length_: The maximum length that the model will generate.
+* **_max_length_**: The maximum length that the model will generate.
 
-* _min_length_: The minimum length that the model will generate.
+* **_min_length_**: The minimum length that the model will generate.
 
-* _do_sample_:
+* **_do_sample_**: Whether to sample from the token distribution (using top_k and/or top_p) instead of always selecting the most probable token. Defaults to false.
 
-* _num_beams_: The number of beams to apply when generating the output sequence using beam search. If num_beams=1, then generation is performed using greedy search.
+* **_num_beams_**: The number of beams to apply when generating the output sequence using beam search. If num_beams=1, then generation is performed using greedy search.
 
-* _early_stopping_ : Whether to stop the beam search when at least num_beams sentences are finished per batch or not. Defaults to false.
+* **_early_stopping_**: Whether to stop the beam search when at least num_beams sequences are finished per batch. Defaults to false.
 
-* _num_sequences_: The number of sequences to generate. Returns the sequences with the highest scores in order.
+* **_num_sequences_**: The number of sequences to generate. Returns the sequences with the highest scores in order.
 
-* _temperature_: The temperature value scales the probability of each token so that probable tokens become more likely while less probable ones become less likely. This value can have a range 0 < `temperature` ≤ 1. When temperature is equal to `1`, it has no effect.
+* **_temperature_**: The temperature value scales the probability of each token so that probable tokens become more likely while less probable ones become less likely. This value can have a range 0 < `temperature` ≤ 1. When `temperature` is equal to `1`, it has no effect.
 
-* _top_k_: Only includes tokens that do fall within the list of the `K` most probable tokens.
+* **_top_k_**: Only includes tokens that fall within the list of the `K` most probable tokens.
 
-* _top_p_: Only includes the most probable tokens with probabilities that add up to `P` or higher. Defaults to `1`, which includes all of the tokens.
+* **_top_p_**: Only includes the most probable tokens with probabilities that add up to `P` or higher. Defaults to `1`, which includes all of the tokens.
 
-* _repetition_penalty_: Discounts the scores of previously generated tokens if set to a value greater than `1`. Defaults to `1`.
+* **_repetition_penalty_**: Discounts the scores of previously generated tokens if set to a value greater than `1`. Defaults to `1`.
 
-* _length_penalty_: Controls the length of the output generated. Value less than `1` encourages the generation to produce shorter sequences. Values greater than `1` encourages longer sequences. Defaults to `1`.
+* **_length_penalty_**: Controls the length of the output generated. Values less than `1` encourage the generation to produce shorter sequences; values greater than `1` encourage longer sequences. Defaults to `1`.
 
-* _diversity_penalty_:
+* **_diversity_penalty_**:
 
-* _no_repeat_ngram_size_:
+* **_no_repeat_ngram_size_**:
 
-* _past_present_share_buffer_: If set to true, the past and present buffer are shared for efficiency.
+* **_past_present_share_buffer_**: If set to true, the past and present buffers are shared for efficiency.
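+
+As a sketch, a search section configured for top-p sampling might look like the following (the values are illustrative, not recommendations):
+
+```json
+{
+  "search": {
+    "max_length": 256,
+    "min_length": 0,
+    "do_sample": true,
+    "top_k": 50,
+    "top_p": 0.9,
+    "temperature": 0.7,
+    "num_beams": 1,
+    "repetition_penalty": 1.0,
+    "length_penalty": 1.0,
+    "past_present_share_buffer": true
+  }
+}
+```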