| Category | Name | Description | Paper |
| --- | --- | --- | --- |
| Large Language Models (LLMs) | Chinese-LLaMA-Alpaca-3 (new) | Chinese LLaMA-3 series LLMs. | - |
| | Chinese-Mixtral (new) | Chinese Mixtral MoE LLMs. | - |
| | Chinese-LLaMA-Alpaca-2 (new) | Chinese LLaMA-2 and Alpaca-2 LLMs. | arXiv 2023 |
| | Chinese-LLaMA-Alpaca (new) | Chinese LLaMA and Alpaca LLMs. | arXiv 2023 |
| | VisualCLA (new) | Visual Chinese LLaMA and Alpaca LLMs. | - |
| Pre-Trained Language Models | VLE (new) | A multimodal Vision-Language Encoder. | - |
| | LERT (new) | Pre-trained Chinese LERT. | arXiv 2022 |
| | MiniRBT (new) | Pre-trained small models: MiniRBT. | arXiv 2023 |
| | PERT (new) | Pre-trained Chinese and English PERT. | arXiv 2022 |
| | CINO (new) | Pre-trained language model for Chinese minority languages. | arXiv 2022 |
| | MacBERT | Pre-trained Chinese MacBERT. | IEEE/ACM TASLP |
| | CharBERT | Character-aware pre-trained language model. | COLING 2020 |
| | Chinese-ELECTRA | Pre-trained Chinese ELECTRA-large, ELECTRA-base, etc. | IEEE/ACM TASLP |
| | Chinese-XLNet | Pre-trained Chinese XLNet-mid, XLNet-base. | arXiv |
| | Chinese-BERT-wwm (hot) | Pre-trained Chinese BERT-wwm, RoBERTa-wwm-ext, RBT3, etc. | IEEE/ACM TASLP |
| Datasets | ExpMRC (new) | Explainability evaluation for MRC. | Heliyon |
| | AdvRACE (new) | A dataset for testing the robustness of MRC models. | Findings of ACL 2021 |
| | CMRC 2019 | A sentence cloze dataset for Chinese MRC. | COLING 2020 |
| | CJRC | A dataset for Chinese judicial MRC. | CCL 2019 |
| | CMRC 2018 (hot) | A span-extraction dataset for Chinese MRC. | EMNLP 2019 |
| | CMRC 2017 | A cloze-style Chinese reading comprehension dataset. | LREC 2018 |
| | PD & CFT | A Chinese cloze-style MRC dataset: PD&CFT. | COLING 2016 |
| Toolkit | TextPruner | An open-source model pruning toolkit for NLP. | ACL 2022 Demo |
| | TextBrewer | An open-source knowledge distillation toolkit for NLP. | ACL 2020 Demo |
| Models | RecAdam | An optimizer for fine-tuning PLMs with less forgetting. | EMNLP 2020 |
| | PR-Embedding | Conversational word embeddings for retrieval-based dialog systems. | ACL 2020 |
| | Dual BERT | A neural cross-lingual approach for MRC. | EMNLP 2019 |
| | TripleNet | A network for multi-turn retrieval-based dialog systems. | CoNLL 2019 |
| Miscs | ChatGPT-in-Academia | Policies of publishers and conferences on the use of LLMs. | - |
| | Chinese ACL2020 Blogs | Chinese version of the ACL 2020 PC blogs. | - |
| | LAMB Optimizer (TF) | LAMB optimizer for large-batch training (TensorFlow version). | - |
| | NLP Review Scorer | Get a score for the review of your NLP paper. | - |
| | Chinese MRC Datasets | Collections of Chinese reading comprehension datasets. | - |