Training Process
Switch | Description |
---|---|
--use_model | Specifies which RNN variant to use for training. Currently, the available models are GRU and LSTM; other variants such as deep LSTM and simple RNN will be added in the future. |
--load_data | Use this switch when the data for training has already been prepared. |
--data_train | Specifies the training data. For quality estimation, this consists of the source, target, and alignment files as distributed in the WMT shared task; for tagging, it is the source text only. |
--data_train_y | Specifies the labels. For the QE task, these are the word-level quality tags; for POS tagging, the word-level part-of-speech tags. |
--data_test | Same as --data_train, but for the test set. |
--data_test_y | Same as --data_train_y, but for the test set. |
--data_valid | Same as --data_train, but for the validation set. |
--data_valid_y | Same as --data_train_y, but for the validation set. |
--dictionaries | Specifies the word dictionaries. Each dictionary must be a JSON file mapping words to their integer IDs (see the sketch after this table). For the bilingual QE model, both source and target dictionaries must be provided. |
--label2indx | Specifies the system output labels. It is similar to --dictionaries, except that it maps the output tags. |
--use_char | Enables the model to use character-level word features. |
--character2index | Character-level dictionary, analogous to the word dictionaries. Required when the --use_char switch is used. |
--pretrain | Enable this switch if you have pre-trained word embeddings. |
--embeddings | Path to the pre-trained word embeddings. Currently, the system accepts only word2vec models in text format. |
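
This page does not ship a dictionary-building script, so the snippet below is only a minimal sketch of how the JSON files for --dictionaries and --label2indx could be produced: a plain JSON object mapping each word (or tag) to an integer ID, as described above. The file names, the frequency-based ordering, and the reserved ID for the unknown token are assumptions for illustration, not part of the documented interface.

```python
# Minimal sketch for building word and label dictionaries (assumed format:
# a JSON object mapping each token to an integer id, as described above).
import json
from collections import Counter

def build_dictionary(path, reserved=("<unk>",)):
    """Map every token in a whitespace-tokenised text file to an integer id."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())
    vocab = {tok: i for i, tok in enumerate(reserved)}  # reserved symbols first (assumption)
    for tok, _ in counts.most_common():                 # then tokens by frequency
        vocab[tok] = len(vocab)
    return vocab

if __name__ == "__main__":
    # Hypothetical file names: source/target text for --dictionaries,
    # word-level tag file for --label2indx.
    json.dump(build_dictionary("train.src"), open("vocab.src.json", "w"))
    json.dump(build_dictionary("train.tgt"), open("vocab.tgt.json", "w"))
    json.dump(build_dictionary("train.tags", reserved=()), open("label2index.json", "w"))
```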
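
Putting the switches together, a bilingual word-level QE training run might be launched as in the sketch below. The script name (train.py), the file names, and the way multiple files are passed to one switch (space-separated) are assumptions for illustration; only the switch names come from the table above.

```bash
python train.py \
    --use_model LSTM \
    --data_train train.src train.tgt train.align \
    --data_train_y train.tags \
    --data_valid dev.src dev.tgt dev.align \
    --data_valid_y dev.tags \
    --data_test test.src test.tgt test.align \
    --data_test_y test.tags \
    --dictionaries vocab.src.json vocab.tgt.json \
    --label2indx label2index.json \
    --pretrain \
    --embeddings embeddings.word2vec.txt
```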