Polish BERT example (#229)
gpengzhi authored Oct 12, 2019
1 parent 484b9b2 commit b07501a
Showing 2 changed files with 2 additions and 6 deletions.
4 changes: 2 additions & 2 deletions examples/bert/data/README.md
@@ -1,6 +1,6 @@
This gives the explanation on data preparation.

- When you run `data/download_glue_data.py` in the parent directory, by default, all datasets in GLEU will be stored here. For more information on GLUE, please refer to
+ When you run `data/download_glue_data.py` in the parent directory, by default, all datasets in the General Language Understanding Evaluation (GLUE) will be stored here. For more information on GLUE, please refer to
[gluebenchmark](https://gluebenchmark.com/tasks)

Here we show the data format of the SST-2 dataset.
@@ -26,4 +26,4 @@ index sentence
* The test data is in a different format: the first column is a unique index for each test example, and the second column is the space-separated string.


- In [`bert/utils/data_utils.py`](https://github.com/asyml/texar/blob/master/examples/bert/utils/data_utils.py), there are 5 types of `Data Processor` Implemented. You can run `python bert_classifier_main.py` and specify `--task` to run on different datasets.
+ In [`bert/utils/data_utils.py`](https://github.com/asyml/texar-pytorch/blob/master/examples/bert/utils/data_utils.py), there are 5 types of `Data Processor` implemented. You can run `python bert_classifier_main.py` and specify `--task` to run on different datasets.
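
The five data processors referred to above all follow the same read-TSV-and-split pattern. The sketch below illustrates that pattern for the SST-2 files described in this README; the names used here (`InputExample`, `SST2Processor`, `get_train_examples`, and so on) are illustrative assumptions, not the actual API of `data_utils.py`.

```python
# A minimal sketch of a BERT-style data processor, assuming the common
# read-TSV-and-split pattern; names are illustrative, not the actual
# classes in examples/bert/utils/data_utils.py.
import csv
import os
from typing import List, NamedTuple, Optional


class InputExample(NamedTuple):
    guid: str
    text_a: str
    text_b: Optional[str] = None
    label: Optional[str] = None


class SST2Processor:
    """Reads SST-2 style TSV files laid out as described in this README."""

    def get_labels(self) -> List[str]:
        return ["0", "1"]

    def get_train_examples(self, data_dir: str) -> List[InputExample]:
        return self._read_tsv(os.path.join(data_dir, "train.tsv"), is_test=False)

    def get_test_examples(self, data_dir: str) -> List[InputExample]:
        return self._read_tsv(os.path.join(data_dir, "test.tsv"), is_test=True)

    def _read_tsv(self, path: str, is_test: bool) -> List[InputExample]:
        examples = []
        with open(path, encoding="utf-8") as f:
            reader = csv.reader(f, delimiter="\t", quotechar=None)
            next(reader)  # skip the header row
            for i, row in enumerate(reader):
                if is_test:
                    # Test format: unique index, space-separated sentence.
                    examples.append(InputExample(guid=row[0], text_a=row[1]))
                else:
                    # Train/dev format: sentence, label.
                    examples.append(
                        InputExample(guid=str(i), text_a=row[0], label=row[1]))
        return examples
```

Roughly speaking, a processor of this kind is what the `--task` flag selects between when running `bert_classifier_main.py`.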
4 changes: 0 additions & 4 deletions examples/bert/prepare_data.py
@@ -44,10 +44,6 @@
    help="The output directory where the pickled files will be generated. "
    "By default it will be set to 'data/{task}'. E.g.: if "
    "task is 'MRPC', it will be set as 'data/MRPC'")
-parser.add_argument(
-    "--lower-case", type=bool, default=True,
-    help="Whether to lower case the input text. Should be True for uncased "
-    "models and False for cased models.")
parser.add_argument(
    "--config-data", default="config_data", help="The dataset config.")
args = parser.parse_args()
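
With the `--lower-case` flag dropped, the remaining option wiring is short. Below is a minimal self-contained sketch of how it might read after this change; the `--task` option and the default-resolution step for `--output-dir` are assumptions inferred from the help text above, not a verbatim excerpt from `prepare_data.py`.

```python
# Sketch only: prepare_data.py option wiring after dropping --lower-case.
# The --task option and the 'data/{task}' default resolution are assumptions
# based on the help strings above, not copied from the actual script.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--task", default="MRPC",
    help="The GLUE task to prepare data for, e.g. 'MRPC'.")
parser.add_argument(
    "--output-dir", default=None,
    help="The output directory where the pickled files will be generated. "
    "By default it will be set to 'data/{task}'.")
parser.add_argument(
    "--config-data", default="config_data", help="The dataset config.")
args = parser.parse_args()

# Resolve the documented default: 'data/{task}' when --output-dir is not given.
if args.output_dir is None:
    args.output_dir = f"data/{args.task}"

print(f"Pickled files will be written to {args.output_dir}")
```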
