Check my scholar profile at Google Scholar | Semantic Scholar | dblp | ACL Anthology

「Featured Publications」

Pre-Training with Whole Word Masking for Chinese BERT
  • Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang.
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol.29. 2021. [J]
  • TLDR: This paper proposes a series of Chinese pre-trained language models with thorough evaluations.
  • 🎉 Selected as one of the ESI Highly Cited Papers in Engineering by Clarivate™.
  • 🎉 Selected as one of the Top-25 Downloaded Papers in the IEEE Signal Processing Society (2021-2023).
  • 📄 PDF 🔎 Bib pub IEEE Xplore Chinese-BERT-wwm

Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
  • Yiming Cui*, Ziqing Yang*, Xin Yao.
  • arXiv pre-print: 2304.08177
  • TLDR: This paper proposes Chinese LLaMA and Alpaca models.
  • 🎉 The open-source projects ranked 1st among GitHub Trending repositories.
  • 📄 PDF 🔎 Bib arXiv Chinese-LLaMA-Alpaca Chinese-LLaMA-Alpaca-2

Attention-over-Attention Neural Networks for Reading Comprehension
  • Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, Guoping Hu.
  • In Proceedings of ACL 2017 [C]
  • TLDR: This paper proposes a two-stream attention network (i.e., AoA) for machine reading comprehension.
  • 🎉 This paper has been selected as one of the Most Influential ACL 2017 Papers (Top 11) by Paper Digest.
  • 📄 PDF 🔎 Bib 🪧 Slides pub ACL Anthology

HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity
  • Zihang Xu, Ziqing Yang, Yiming Cui, Zhigang Chen.
  • The 16th International Workshop on Semantic Evaluation (SemEval 2022) [W]
  • TLDR: This paper describes our winning system for SemEval-2022 Task 8.
  • 🎉 This paper received the "Best Paper Honorable Mention Award" at SemEval-2022.
  • 📄 PDF 🔎 Bib pub ACL Anthology SemEval2022-Task8-TonyX

「2024」

  • Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral

「2023」

  • Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
  • Gradient-based Intra-attention Pruning on Pre-trained Language Models
  • IDOL: Indicator-oriented Logic Pre-training for Logical Reasoning
  • MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model

「2022」

  • LERT: A Linguistically-motivated Pre-trained Language Model
  • Visualizing Attention Zones in Machine Reading Comprehension Models
    • Yiming Cui, Wei-Nan Zhang, Ting Liu. STAR Protocols, Vol.3 [J] 🔎 Bib pub
  • Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models
    • Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhigang Chen, Shijin Wang. iScience, Vol.25 [J] 🔎 Bib pub GitHub
  • ExpMRC: Explainability Evaluation for Machine Reading Comprehension
  • Teaching Machines to Read, Answer and Explain
    • Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol.30 [J] 🔎 Bib arXiv pub IEEE Xplore
  • PERT: Pre-training BERT with Permuted Language Model
  • Interactive Gated Decoder for Machine Reading Comprehension
    • Yiming Cui, Wanxiang Che, Ziqing Yang, Ting Liu, et al. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Vol.21 [J] 🔎 Bib pub ACM DL
  • A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation
    • Wei-Nan Zhang, Yiming Cui, Kaiyan Zhang, Yifa Wang, Qingfu Zhu, Lingzhi Li, Ting Liu. ACM Transactions on Information Systems (TOIS) [J] 🔎 Bib pub ACM DL
  • TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models
  • Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training
    • Ziqing Yang, Yiming Cui, Zhigang Chen, Shijin Wang. arXiv pre-print: 2202.13654 🔎 Bib arXiv
  • CINO: A Chinese Minority Pre-trained Language Model
  • HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity
  • HIT at SemEval-2022 Task 2: Pre-trained Language Model for Idioms Detection
    • Zheng Chu, Ziqing Yang, Yiming Cui, Zhigang Chen, Ming Liu. The 16th International Workshop on Semantic Evaluation (SemEval 2022) [W] 🔎 Bib pub ACL Anthology
    • 🏆 This paper describes the winning system of SemEval 2022 Task 2 (Subtask A, one-shot).
  • Augmented and challenging datasets with multi-step reasoning and multi-span questions for Chinese judicial reading comprehension
    • Qingye Meng, Ziyue Wang, Hang Chen, Xianzhen Luo, Baoxin Wang, Zhipeng Chen, Yiming Cui, Dayong Wu, Zhigang Chen, Shijin Wang. AI Open, Vol.3 [J] 🔎 Bib pub ScienceDirect

「2021」

  • Pre-Training with Whole Word Masking for Chinese BERT (Highly Influential)
    • Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), Vol.29 [J] 🔎 Bib pub IEEE Xplore GitHub
    • 🏆 This paper has been selected as one of the ESI Highly Cited Papers (by Clarivate™) in Engineering.
  • Adversarial Training for Machine Reading Comprehension with Virtual Embeddings
  • Memory Augmented Sequential Paragraph Retrieval for Multi-hop Question Answering
    • Nan Shao, Yiming Cui, Ting Liu, Shijin Wang, Guoping Hu. arXiv pre-print: 2102.03741. 🔎 Bib arXiv
  • 📚 Natural Language Processing: A Pre-trained Model Approach (自然语言处理:基于预训练模型的方法)
  • Benchmarking Robustness of Machine Reading Comprehension Models
  • Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer

「2020」

  • Discriminative Sentence Modeling for Story Ending Prediction
  • A Sentence Cloze Dataset for Chinese Machine Reading Comprehension
  • Revisiting Pre-Trained Models for Chinese Natural Language Processing
    • Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu. EMNLP 2020 (Findings) [C] 🔎 Bib pub ACL Anthology GitHub
    • 🎉 This paper has been selected as one of the Most Influential EMNLP 2020 Papers (Top 11) by Paper Digest.
  • CharBERT: Character-aware Pre-trained Language Model
  • Is Graph Structure Necessary for Multi-hop Question Answering?
  • Conversational Word Embedding for Retrieval-based Dialog System
  • TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
  • Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

「2019」

  • Cross-Lingual Machine Reading Comprehension
  • A Span-Extraction Dataset for Chinese Machine Reading Comprehension
  • Contextual Recurrent Units for Cloze-style Reading Comprehension
    • Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu. arXiv pre-print: 1911.05960 🔎 Bib arXiv
  • Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions
  • TripleNet: Triple Attention Network for Multi-Turn Response Selection in Retrieval-based Chatbots
  • Improving Machine Reading Comprehension via Adversarial Training
    • Ziqing Yang, Yiming Cui, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu. arXiv pre-print: 1911.03614 🔎 Bib arXiv
  • Exploiting Persona Information for Diverse Generation of Conversational Responses
  • CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension
    • Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, Heng Wang, Zhiyuan Liu. CCL 2019 [C] 🔎 Bib arXiv GitHub

「2018」

  • Dataset for the First Evaluation on Chinese Machine Reading Comprehension
  • Context-Sensitive Generation of Open-Domain Conversational Responses
  • HFL-RC System at SemEval-2018 Task 11: Hybrid Multi-Aspects Model for Commonsense Reading Comprehension
    • Zhipeng Chen, Yiming Cui*, Wentao Ma, Shijin Wang, Ting Liu, Guoping Hu. arXiv pre-print: 1803.05655 🔎 Bib arXiv
  • A Car Manual Question Answering System based on Neural Network (基于神经网络的汽车说明书问答系统)
    • Le Qi, Yu Zhang, Wentao Ma, Yiming Cui, Shijin Wang, Ting Liu. Journal of Shanxi University (Natural Science Edition) [J] 🔎 Bib

「2017」

  • Attention-over-Attention Neural Networks for Reading Comprehension (Highly Influential)
  • The Brilliant Chinese Achievements in SQuAD Challenge (ๆ–ฏๅฆ็ฆSQuADๆŒ‘ๆˆ˜่ต›็š„ไธญๅ›ฝไบฎไธฝๆฆœๅ•)
    • Yiming Cui, Ting Liu, Shijin Wang, Zhipeng Chen, Wentao Ma, Guoping Hu. Communication of CCF [M] ๐Ÿ”Ž Bib
  • Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution

「2016」

  • Consensus Attention-based Neural Networks for Chinese Reading Comprehension (Highly Influential)
  • LSTM Neural Reordering Feature for Statistical Machine Translation

「Before 2016」

  • Augmenting Phrase Table by Employing Lexicons for Pivot-based SMT
    • Yiming Cui, Conghui Zhu, Xiaoning Zhu, Tiejun Zhao. arXiv pre-print: 1512.00170 🔎 Bib arXiv
  • Context-extended Phrase Reordering Model for Pivot-based Statistical Machine Translation
  • The USTC Machine Translation System for IWSLT2014
  • Phrase Table Combination Deficiency Analyses in Pivot-based SMT
  • The HIT-LTRC Machine Translation System for IWSLT 2012

  • Notations: [J]ournal, [C]onference, [W]orkshop, [B]ook, [M]agazine