Commit: Bolden config items

natke committed Mar 14, 2024
1 parent a74585f commit d8d4fb6
Showing 1 changed file with 30 additions and 30 deletions.
60 changes: 30 additions & 30 deletions docs/genai/reference/config.md
@@ -79,73 +79,73 @@ A configuration file called genai_config.json is generated automatically if the

#### General model config

- * _type_: The type of model. Can be phi, llama or gpt.
+ * **_type_**: The type of model. Can be phi, llama or gpt.

- * _vocab_size_: The size of the vocabulary that the model processes, i.e. the number of tokens in the vocabulary.
+ * **_vocab_size_**: The size of the vocabulary that the model processes, i.e. the number of tokens in the vocabulary.

- * _bos_token_id_: The id of the beginning-of-sequence token.
+ * **_bos_token_id_**: The id of the beginning-of-sequence token.

- * _eos_token_id_: The id of the end-of-sequence token.
+ * **_eos_token_id_**: The id of the end-of-sequence token.

- * _pad_token_: The id of the padding token.
+ * **_pad_token_**: The id of the padding token.

- * _context_length_: The maximum sequence length that the model can process.
+ * **_context_length_**: The maximum sequence length that the model can process.
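For illustration, the fields above might sit at the top of the model section of a generated `genai_config.json` as in the following sketch. All values are made up for a hypothetical phi-style model, not taken from any real generated file:

```json
{
  "model": {
    "type": "phi",
    "vocab_size": 32064,
    "bos_token_id": 1,
    "eos_token_id": 2,
    "pad_token": 0,
    "context_length": 4096
  }
}
```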


#### Session options

These are the options that are passed to ONNX Runtime, which runs the model on each token generation iteration.

- * _provider_options_: A prioritized list of execution targets on which to run the model. If running on CPU, this option is not present. A list of execution provider-specific configurations can be specified inside the provider item.
+ * **_provider_options_**: A prioritized list of execution targets on which to run the model. If running on CPU, this option is not present. A list of execution provider-specific configurations can be specified inside the provider item.

- * _log_id_: A prefix to output when logging.
+ * **_log_id_**: A prefix to output when logging.
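As a sketch, a session options block that targets CUDA first might look like the following. The `cuda` provider entry and its empty configuration object are assumptions based on typical generated configs, not a normative example:

```json
"session_options": {
  "log_id": "onnxruntime-genai",
  "provider_options": [
    {
      "cuda": {}
    }
  ]
}
```

On a CPU-only setup, the `provider_options` entry would simply be omitted, per the description above.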


Then, for each model in the pipeline, there is one section, named after the model.

#### Decoder model config

- * _filename_: The name of the model file.
+ * **_filename_**: The name of the model file.

- * _inputs_: The names of each of the inputs. Sequences of model inputs can contain a wildcard representing the index in the sequence.
+ * **_inputs_**: The names of each of the inputs. Sequences of model inputs can contain a wildcard representing the index in the sequence.

- * _outputs_: The names of each of the outputs.
+ * **_outputs_**: The names of each of the outputs.

- * _num_attention_heads: The number of attention heads in the model.
+ * **_num_attention_heads_**: The number of attention heads in the model.

- * _head_size_: The size of the attention heads.
+ * **_head_size_**: The size of the attention heads.

- * _hidden_size_: The size of the hidden layers.
+ * **_hidden_size_**: The size of the hidden layers.

- * _num_key_value_heads_: The number of key value heads.
+ * **_num_key_value_heads_**: The number of key value heads.
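A sketch of a decoder model section follows. The `%d` wildcard standing in for the layer index, the specific input/output names, and the numeric values are all assumptions modeled on commonly generated configs, for illustration only:

```json
"decoder": {
  "filename": "model.onnx",
  "inputs": {
    "input_ids": "input_ids",
    "attention_mask": "attention_mask",
    "past_key_names": "past_key_values.%d.key",
    "past_value_names": "past_key_values.%d.value"
  },
  "outputs": {
    "logits": "logits",
    "present_key_names": "present.%d.key",
    "present_value_names": "present.%d.value"
  },
  "num_attention_heads": 32,
  "head_size": 80,
  "hidden_size": 2560,
  "num_key_value_heads": 32
}
```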


- ### Search section
+ ### Generation search section

- * _max_length_: The maximum length that the model will generate.
+ * **_max_length_**: The maximum length that the model will generate.

- * _min_length_: The minimum length that the model will generate.
+ * **_min_length_**: The minimum length that the model will generate.

- * _do_sample_: Enables random sampling of the output tokens when set to true, used together with top_k and top_p. Defaults to false.
+ * **_do_sample_**: Enables random sampling of the output tokens when set to true, used together with top_k and top_p. Defaults to false.

- * _num_beams_: The number of beams to apply when generating the output sequence using beam search. If num_beams=1, then generation is performed using greedy search.
+ * **_num_beams_**: The number of beams to apply when generating the output sequence using beam search. If num_beams=1, then generation is performed using greedy search.

- * _early_stopping_: Whether to stop the beam search when at least num_beams sentences are finished per batch. Defaults to false.
+ * **_early_stopping_**: Whether to stop the beam search when at least num_beams sentences are finished per batch. Defaults to false.

- * _num_sequences_: The number of sequences to generate. Returns the sequences with the highest scores in order.
+ * **_num_sequences_**: The number of sequences to generate. Returns the sequences with the highest scores in order.

- * _temperature_: The temperature value scales the probability of each token so that probable tokens become more likely while less probable ones become less likely. This value can have a range 0 < `temperature` ≤ 1. When temperature is equal to `1`, it has no effect.
+ * **_temperature_**: The temperature value scales the probability of each token so that probable tokens become more likely while less probable ones become less likely. This value can have a range 0 < `temperature` ≤ 1. When temperature is equal to `1`, it has no effect.

- * _top_k_: Only includes tokens that fall within the list of the `K` most probable tokens.
+ * **_top_k_**: Only includes tokens that fall within the list of the `K` most probable tokens.

- * _top_p_: Only includes the most probable tokens with probabilities that add up to `P` or higher. Defaults to `1`, which includes all of the tokens.
+ * **_top_p_**: Only includes the most probable tokens with probabilities that add up to `P` or higher. Defaults to `1`, which includes all of the tokens.

- * _repetition_penalty_: Discounts the scores of previously generated tokens if set to a value greater than `1`. Defaults to `1`.
+ * **_repetition_penalty_**: Discounts the scores of previously generated tokens if set to a value greater than `1`. Defaults to `1`.

- * _length_penalty_: Controls the length of the output generated. Values less than `1` encourage the generation to produce shorter sequences. Values greater than `1` encourage longer sequences. Defaults to `1`.
+ * **_length_penalty_**: Controls the length of the output generated. Values less than `1` encourage the generation to produce shorter sequences. Values greater than `1` encourage longer sequences. Defaults to `1`.

- * _diversity_penalty_:
+ * **_diversity_penalty_**:

- * _no_repeat_ngram_size_:
+ * **_no_repeat_ngram_size_**:

- * _past_present_share_buffer_: If set to true, the past and present buffers are shared for efficiency.
+ * **_past_present_share_buffer_**: If set to true, the past and present buffers are shared for efficiency.
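Collecting the search options above, a sketch of a search section configured for greedy decoding (num_beams of 1, sampling off) might look like the following; every value is illustrative rather than a documented default:

```json
"search": {
  "max_length": 256,
  "min_length": 0,
  "do_sample": false,
  "num_beams": 1,
  "early_stopping": false,
  "num_sequences": 1,
  "temperature": 1.0,
  "top_k": 50,
  "top_p": 1.0,
  "repetition_penalty": 1.0,
  "length_penalty": 1.0,
  "past_present_share_buffer": true
}
```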
