Training Process
Switch | Description |
---|---|
--use_model | Specifies which RNN variant to use for training. Currently, the available models are GRU and LSTM; other variants such as deep LSTM and simple RNN will be added in the future. |
--load_data | Use this switch when the data for training has already been prepared. |
--data_train | Specifies the training data. For quality estimation, this consists of the source, target, and alignment files as distributed in the WMT shared task; for tagging, it is the source text only. |
--data_train_y | Specifies the labels. For the QE task, these are the word-level quality tags; for POS tagging, the word-level part-of-speech tags. |
--data_test | Same as --data_train, but for the test set. |
--data_test_y | Same as --data_train_y, but for the test set. |
--data_valid | Same as --data_train, but for the validation set. |
--data_valid_y | Same as --data_train_y, but for the validation set. |
--dictionaries | Specifies the word dictionaries. Each dictionary must be a JSON file mapping words to their integer IDs (see the sketch after this table). For the bilingual QE model, both source and target dictionaries must be provided. |
--label2indx | Specifies the system output labels. It is similar to --dictionaries, except that it maps the output tags. |
--use_char | Enables the model to use character-level word features. |
--character2index | Character-level dictionary, analogous to the word dictionaries. Required when the --use_char switch is used. |
--pretrain | Enable this switch if you have pre-trained word embeddings. |
--embeddings | Path to the pre-trained word embeddings. Currently, the system accepts only word2vec models in text format. |
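
This page does not ship a dictionary-building script, so the snippet below is only a minimal sketch of how the JSON files for --dictionaries and --label2indx could be produced: a plain JSON object mapping each word (or tag) to an integer ID, as described above. The file names, the frequency-based ordering, and the reserved ID for the unknown token are assumptions for illustration, not part of the documented interface.

```python
# Minimal sketch for building word and label dictionaries (assumed format:
# a JSON object mapping each token to an integer id, as described above).
import json
from collections import Counter

def build_dictionary(path, reserved=("<unk>",)):
    """Map every token in a whitespace-tokenised text file to an integer id."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())
    vocab = {tok: i for i, tok in enumerate(reserved)}  # reserved symbols first (assumption)
    for tok, _ in counts.most_common():                 # then tokens by frequency
        vocab[tok] = len(vocab)
    return vocab

if __name__ == "__main__":
    # Hypothetical file names: source/target text for --dictionaries,
    # word-level tag file for --label2indx.
    json.dump(build_dictionary("train.src"), open("vocab.src.json", "w"))
    json.dump(build_dictionary("train.tgt"), open("vocab.tgt.json", "w"))
    json.dump(build_dictionary("train.tags", reserved=()), open("label2index.json", "w"))
```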
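
Putting the switches together, a bilingual word-level QE training run might be launched as in the sketch below. The script name (train.py), the file names, and the way multiple files are passed to one switch (space-separated) are assumptions for illustration; only the switch names come from the table above.

```bash
python train.py \
    --use_model LSTM \
    --data_train train.src train.tgt train.align \
    --data_train_y train.tags \
    --data_valid dev.src dev.tgt dev.align \
    --data_valid_y dev.tags \
    --data_test test.src test.tgt test.align \
    --data_test_y test.tags \
    --dictionaries vocab.src.json vocab.tgt.json \
    --label2indx label2index.json \
    --pretrain \
    --embeddings embeddings.word2vec.txt
```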