Publications
Check my scholar profiles at the following links.
Featured Publications
Pre-Training with Whole Word Masking for Chinese BERT
- Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol.29. 2021.
- TLDR: This paper proposes a series of Chinese pre-trained language models with thorough evaluations.
- Selected as one of the ESI Highly Cited Papers in Engineering by Clarivate™.
- Selected as one of the Top-25 Downloaded Papers in IEEE Signal Processing Society (2021-2023).
PDF Bib IEEE Xplore Chinese-BERT-wwm MacBERT
Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
- Yiming Cui, Ziqing Yang, Xin Yao. arXiv pre-print: 2304.08177. 2023.
- TLDR: This paper proposes Chinese LLaMA and Alpaca models.
- The open-source projects ranked 1st among GitHub Trending repositories.
PDF Bib arXiv Chinese-LLaMA-Alpaca Chinese-LLaMA-Alpaca-2 Chinese-LLaMA-Alpaca-3
Attention-over-Attention Neural Networks for Reading Comprehension
- Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, Guoping Hu. ACL 2017. 2017.
- TLDR: This paper proposes a two-stream attention network (i.e., AoA) for machine reading comprehension.
- This paper has been selected as one of the Most Influential ACL 2017 Papers (Top 11) by Paper Digest.
PDF Bib Slides ACL Anthology
2024
Self-Evolving GPT: A Lifelong Autonomous Experiential Learner
- Jinglong Gao, Xiao Ding, Yiming Cui, Jianbai Zhao, Hepeng Wang, Ting Liu, Bing Qin. ACL 2024. 2024. Bib ACL Anthology GitHub
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral
2023
Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
Gradient-based Intra-attention Pruning on Pre-trained Language Models
- Ziqing Yang, Yiming Cui, Xin Yao, Shijin Wang. ACL 2023. Bib ACL Anthology GitHub
IDOL: Indicator-oriented Logic Pre-training for Logical Reasoning
- Zihang Xu, Ziqing Yang, Yiming Cui, Shijin Wang. Findings of ACL 2023. Bib ACL Anthology GitHub
MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model
2022
LERT: A Linguistically-motivated Pre-trained Language Model
Visualizing Attention Zones in Machine Reading Comprehension Models
- Yiming Cui, Wei-Nan Zhang, Ting Liu. STAR Protocols, Vol.3. Bib
Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models
- Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhigang Chen, Shijin Wang. iScience, Vol.25. Bib GitHub
ExpMRC: Explainability Evaluation for Machine Reading Comprehension
- Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang. Heliyon, Vol.8. Bib Leaderboard GitHub
Teaching Machines to Read, Answer and Explain
- Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol.30. Bib arXiv IEEE Xplore
PERT: Pre-training BERT with Permuted Language Model
Interactive Gated Decoder for Machine Reading Comprehension
- Yiming Cui, Wanxiang Che, Ziqing Yang, Ting Liu. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Vol.21. Bib ACM DL
A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation
- Wei-Nan Zhang, Yiming Cui, Kaiyan Zhang, Yifa Wang, Qingfu Zhu, Lingzhi Li, Ting Liu. ACM Transactions on Information Systems (TOIS). Bib ACM DL
TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models
- Ziqing Yang, Yiming Cui, Zhigang Chen. ACL 2022 (System Demonstration). Bib ACL Anthology GitHub
Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training
CINO: A Chinese Minority Pre-trained Language Model
- Ziqing Yang, Zihang Xu, Yiming Cui, Baoxin Wang, Min Lin, Dayong Wu, Zhigang Chen. COLING 2022. Bib ACL Anthology GitHub
HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity
- Zihang Xu, Ziqing Yang, Yiming Cui, Zhigang Chen. SemEval 2022. Bib ACL Anthology GitHub
HIT at SemEval-2022 Task 2: Pre-trained Language Model for Idioms Detection
- Zheng Chu, Ziqing Yang, Yiming Cui, Zhigang Chen, Ming Liu. SemEval 2022. Bib ACL Anthology
Augmented and challenging datasets with multi-step reasoning and multi-span questions for Chinese judicial reading comprehension
- Qingye Meng, Ziyue Wang, Hang Chen, Xianzhen Luo, Baoxin Wang, Zhipeng Chen, Yiming Cui, Dayong Wu, Zhigang Chen, Shijin Wang. AI Open, Vol.3. Bib ScienceDirect
2021
Pre-Training with Whole Word Masking for Chinese BERT
- Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol.29. Bib IEEE Xplore GitHub
- This paper has been selected as one of the ESI Highly Cited Papers (by Clarivate™) in Engineering.
Adversarial Training for Machine Reading Comprehension with Virtual Embeddings
- Ziqing Yang, Yiming Cui, Chenglei Si, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu. *SEM 2021. Bib ACL Anthology
Memory Augmented Sequential Paragraph Retrieval for Multi-hop Question Answering
- Nan Shao, Yiming Cui, Ting Liu, Shijin Wang, Guoping Hu. arXiv pre-print: 2102.03741. Bib arXiv
Natural Language Processing: A Pre-trained Model Approach (自然语言处理：基于预训练模型的方法)
- Wanxiang Che, Jiang Guo, Yiming Cui. Publishing House of Electronics Industry. JD (Online Store) PDF (Front Matter) Bib
Benchmarking Robustness of Machine Reading Comprehension Models
- Chenglei Si, Ziqing Yang, Yiming Cui, Wentao Ma, Ting Liu, Shijin Wang. ACL 2021 (Findings). Bib ACL Anthology GitHub
Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer
- Ziqing Yang, Wentao Ma, Yiming Cui, Jiani Ye, Wanxiang Che, Shijin Wang. MRQA 2021. Bib ACL Anthology
2020
Discriminative Sentence Modeling for Story Ending Prediction
- Yiming Cui, Wanxiang Che, Wei-Nan Zhang, Ting Liu, Shijin Wang, Guoping Hu. AAAI 2020. Bib AAAI Publisher
A Sentence Cloze Dataset for Chinese Machine Reading Comprehension
- Yiming Cui, Ting Liu, Ziqing Yang, Zhipeng Chen, Wentao Ma, Wanxiang Che, Shijin Wang, Guoping Hu. COLING 2020. Bib ACL Anthology Leaderboard GitHub
Revisiting Pre-Trained Models for Chinese Natural Language Processing
- Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu. EMNLP 2020 (Findings). Bib ACL Anthology GitHub
- This paper has been selected as one of the Most Influential EMNLP 2020 Papers (Top 11) by Paper Digest.
CharBERT: Character-aware Pre-trained Language Model
- Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu. COLING 2020. Bib ACL Anthology GitHub
Is Graph Structure Necessary for Multi-hop Question Answering?
- Nan Shao, Yiming Cui, Ting Liu, Shijin Wang, Guoping Hu. EMNLP 2020. Bib ACL Anthology
Conversational Word Embedding for Retrieval-based Dialog System
- Wentao Ma, Yiming Cui, Ting Liu, Dong Wang, Shijin Wang, Guoping Hu. ACL 2020. Bib Slides ACL Anthology GitHub
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
- Ziqing Yang, Yiming Cui, Zhipeng Chen, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu. ACL 2020 (System Demonstration). Bib Slides ACL Anthology GitHub
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting
- Sanyuan Chen, Yutai Hou, Yiming Cui, Wanxiang Che, Ting Liu, Xiangzhan Yu. EMNLP 2020. Bib ACL Anthology GitHub
2019
Cross-Lingual Machine Reading Comprehension
- Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu. EMNLP 2019. Bib Slides ACL Anthology GitHub
A Span-Extraction Dataset for Chinese Machine Reading Comprehension
- Yiming Cui, Ting Liu, Wanxiang Che, Li Xiao, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu. EMNLP 2019. Bib ACL Anthology Leaderboard GitHub
Contextual Recurrent Units for Cloze-style Reading Comprehension
- Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu. arXiv pre-print: 1911.05960. Bib arXiv
Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions
- Zhipeng Chen, Yiming Cui, Wentao Ma, Shijin Wang, Guoping Hu. AAAI 2019. Bib Slides AAAI Publisher
TripleNet: Triple Attention Network for Multi-Turn Response Selection in Retrieval-based Chatbots
- Wentao Ma, Yiming Cui, Nan Shao, Su He, Wei-Nan Zhang, Ting Liu, Shijin Wang, Guoping Hu. CoNLL 2019. Bib Slides ACL Anthology GitHub
Improving Machine Reading Comprehension via Adversarial Training
- Ziqing Yang, Yiming Cui, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu. arXiv pre-print: 1911.03614. Bib arXiv
Exploiting Persona Information for Diverse Generation of Conversational Responses
- Haoyu Song, Wei-Nan Zhang, Yiming Cui, Dong Wang, Ting Liu. IJCAI 2019. Bib IJCAI Publisher
CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension
- Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, Heng Wang, Zhiyuan Liu. CCL 2019. Bib arXiv GitHub
2018
Dataset for the First Evaluation on Chinese Machine Reading Comprehension
- Yiming Cui, Ting Liu, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu. LREC 2018. Bib ACL Anthology GitHub
Context-Sensitive Generation of Open-Domain Conversational Responses
- Wei-Nan Zhang, Yiming Cui, Yifa Wang, Qingfu Zhu, Lingzhi Li, Lianqiang Zhou, Ting Liu. COLING 2018. Bib ACL Anthology
HFL-RC System at SemEval-2018 Task 11: Hybrid Multi-Aspects Model for Commonsense Reading Comprehension
- Zhipeng Chen, Yiming Cui*, Wentao Ma, Shijin Wang, Ting Liu, Guoping Hu. arXiv pre-print: 1803.05655. Bib arXiv
A Car Manual Question Answering System based on Neural Network (基于神经网络的汽车说明书问答系统)
- Le Qi, Yu Zhang, Wentao Ma, Yiming Cui, Shijin Wang, Ting Liu. Journal of Shanxi University (Natural Science Edition). Bib
2017
Attention-over-Attention Neural Networks for Reading Comprehension
- Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, Guoping Hu. ACL 2017. ๐ Bib ๐ชง Slides ACL Anthology
- ๐ This paper has been selected as one of the Most Influential ACL 2017 Papers (Top 11) by Paper Digest.
The Brilliant Chinese Achievements in SQuAD Challenge (ๆฏๅฆ็ฆSQuADๆๆ่ต็ไธญๅฝไบฎไธฝๆฆๅ)
- Yiming Cui, Ting Liu, Shijin Wang, Zhipeng Chen, Wentao Ma, Guoping Hu. Communication of CCF. ๐ Bib
Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution
- Ting Liu, Yiming Cui, Qingyu Yin, Wei-Nan Zhang, Shijin Wang, Guoping Hu. ACL 2017. Bib Slides ACL Anthology
2016
Consensus Attention-based Neural Networks for Chinese Reading Comprehension
- Yiming Cui, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu. COLING 2016. Bib Slides ACL Anthology GitHub
LSTM Neural Reordering Feature for Statistical Machine Translation
- Yiming Cui, Shijin Wang, Jianfeng Li. NAACL 2016. Bib Slides ACL Anthology
2015 and earlier
Augmenting Phrase Table by Employing Lexicons for Pivot-based SMT
Context-extended Phrase Reordering Model for Pivot-based Statistical Machine Translation
- Xiaoning Zhu, Tiejun Zhao, Yiming Cui, Conghui Zhu. IALP 2015. Bib IEEE Xplore
The USTC Machine Translation System for IWSLT2014
- Shijin Wang, Yuguang Wang, Jianfeng Li, Yiming Cui, Lirong Dai. IWSLT 2014. Bib ACL Anthology
Phrase Table Combination Deficiency Analyses in Pivot-based SMT
The HIT-LTRC Machine Translation System for IWSLT 2012
- Xiaoning Zhu, Yiming Cui, Conghui Zhu, Tiejun Zhao, Hailong Cao. IWSLT 2012. Bib Slides ACL Anthology