HJQE is a benchmark dataset for word-level quality estimation (QE) of machine translation in which all examples are annotated by expert translators. Its goal is to measure translation errors as judged by humans.
HJQE contains corpora for two translation directions: English-German and English-Chinese. For each corpus, the source and MT sentences are the same as those in the WMT20 quality estimation task (https://www.statmt.org/wmt20/quality-estimation-task.html).
Please cite the following paper if you find the resources in this repository useful.
@article{yang2021HJQE,
title={Rethink about the Word-level Quality Estimation for Machine Translation from Human Judgement},
author={Yang, Zhen and Meng, Fandong and Yan, Yuanmeng and Zhou, Jie},
year={2022}
}