CMRC 2019 is a Chinese Machine Reading Comprehension dataset that was used in The Third Evaluation Workshop on Chinese Machine Reading Comprehension. Specifically, CMRC 2019 is a sentence cloze-style machine reading comprehension dataset designed to evaluate sentence-level inference ability.
Paper [Cui et al., COLING 2020] | BibTeX [Cui et al., COLING 2020]

Download a copy of the dataset (distributed under the CC BY-SA 4.0 license):
You may also be interested in a quick baseline system based on a pre-trained language model (such as BERT).
To preserve the integrity of test results, we do not release the test and challenge sets to the public. Instead, we require you to upload your model to CodaLab so that we can run it on the test and challenge sets for you. You can follow the instructions on CodaLab (the process is similar to a SQuAD or CMRC 2018 submission). See the Submission Tutorial.
Ask us questions at our GitHub repository or at cmrc2019 [at] 126 [dot] com.
CMRC 2019 includes fake candidates: the machine must distinguish them from the correct candidate sentences and fill the correct ones into the blanks in the passage. Will your system surpass human performance on this task?
| Rank | Model | QAC | PAC |
|---|---|---|---|
| | Human Performance, Joint Laboratory of HIT and iFLYTEK Research [Cui et al., COLING 2020] | 95.326 | 75.000 |
| 🥇 2019/10/19 | bert_scp_spm (ensemble), PINGAN-GammaLab | 90.054 | 57.600 |
| 🥈 2019/10/19 | mojito system (ensemble), SFTech | 85.990 | 41.800 |
| 🥉 2019/10/19 | CMRC 2019 MULTIPLE BERT (ensemble), Six Estates https://www.6estates.com | 82.590 | 32.200 |
| 4 2019/10/19 | DA-BERT (ensemble), Anonymous | 84.447 | 27.600 |
| 5 2019/10/19 | nkyzhangyi_cmrc_v2 (ensemble), CICC | 79.562 | 26.600 |
| 6 2019/10/19 | MRC-ZZ SYSTEM (single model), Harbin Institute of Technology & Hanyi Fonts | 78.780 | 26.600 |
| 7 2019/10/19 | MB-Reader (ensemble), ECUST | 76.319 | 15.600 |
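
The two leaderboard columns can be read as question-level accuracy (QAC, the share of individual blanks filled correctly) and passage-level accuracy (PAC, the share of passages in which every blank is filled correctly). The sketch below illustrates that reading only. It is not the official evaluation script, and the function name `qac_pac`, the passage IDs, and the dictionary layout (passage ID mapped to a list of chosen candidate indices, one per blank) are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def qac_pac(
    predictions: Dict[str, List[int]],
    references: Dict[str, List[int]],
) -> Tuple[float, float]:
    """Return (QAC, PAC) as percentages.

    Both arguments map a passage ID to a list of candidate indices,
    one per blank (a hypothetical layout chosen for this sketch).
    """
    if not references:
        raise ValueError("references must contain at least one passage")

    total_blanks = 0
    correct_blanks = 0
    correct_passages = 0

    for passage_id, gold in references.items():
        pred = predictions.get(passage_id, [])
        # A blank counts as correct only if a prediction exists and matches.
        hits = sum(
            1 for i, answer in enumerate(gold)
            if i < len(pred) and pred[i] == answer
        )
        total_blanks += len(gold)
        correct_blanks += hits
        # A passage counts only when all of its blanks are correct.
        if hits == len(gold):
            correct_passages += 1

    if total_blanks == 0:
        raise ValueError("references must contain at least one blank")

    qac = 100.0 * correct_blanks / total_blanks
    pac = 100.0 * correct_passages / len(references)
    return qac, pac


# Toy data: passage "p1" is fully correct, "p2" has one wrong blank.
references = {"p1": [0, 2, 1], "p2": [1, 0]}
predictions = {"p1": [0, 2, 1], "p2": [1, 3]}
print(qac_pac(predictions, references))
```

Because a single wrong blank disqualifies the whole passage, PAC is much stricter than QAC, which is consistent with the large gap between the two columns in the table above.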