Resources
Large Language Models (LLMs)
- Chinese-LLaMA-Alpaca-3. Chinese LLaMA-3 series LLMs.
- Chinese-Mixtral. Chinese Mixtral MoE LLMs.
- Chinese-LLaMA-Alpaca-2. Chinese LLaMA-2 and Alpaca-2 LLMs. arXiv 2023.
- Chinese-LLaMA-Alpaca. Chinese LLaMA and Alpaca LLMs. arXiv 2023.
Pre-Trained Language Models (PLMs)
- LERT. Pre-trained Chinese LERT. arXiv 2022.
- MiniRBT. Pre-trained small models: MiniRBT. arXiv 2023.
- PERT. Pre-trained Chinese and English PERT. arXiv 2022.
- CINO. Pre-trained language model for Chinese minority languages. arXiv 2022.
- MacBERT. Pre-trained Chinese MacBERT. IEEE/ACM TASLP.
- CharBERT. Character-aware pre-trained language model. COLING 2020.
- Chinese-ELECTRA. Pre-trained Chinese ELECTRA-large, ELECTRA-base, etc. IEEE/ACM TASLP.
- Chinese-XLNet. Pre-trained Chinese XLNet-mid, XLNet-base. arXiv.
- Chinese-BERT-wwm. Pre-trained Chinese BERT-wwm, RoBERTa-wwm-ext, RBT3, etc. IEEE/ACM TASLP.
Datasets
- AdvRACE. A dataset for testing robustness of MRC models. Findings of ACL 2021.
- CMRC 2019. A sentence cloze dataset for Chinese MRC. COLING 2020.
- CMRC 2018. A span-extraction dataset for Chinese MRC. EMNLP 2019.
- CMRC 2017. A cloze-style Chinese reading comprehension dataset. LREC 2018.
- PD & CFT. A Chinese cloze-style MRC dataset: PD&CFT. COLING 2016.
Toolkits
- TextPruner. An open-source model pruning toolkit for NLP. ACL 2022 Demo.
- TextBrewer. An open-source knowledge distillation toolkit for NLP. ACL 2020 Demo.
Models
- RecAdam. An optimizer for fine-tuning PLMs with less forgetting. EMNLP 2020.
- PR-Embedding. Conversational word embeddings for retrieval-based dialog systems. ACL 2020.
- Dual BERT. A neural cross-lingual approach for MRC. EMNLP 2019.
- TripleNet. A network for multi-turn retrieval-based dialog systems. CoNLL 2019.
Miscellaneous
- ChatGPT-in-Academia. Policies of publishers and conferences on the use of LLMs.
- Chinese ACL2020 Blogs. Chinese version of the ACL 2020 PC blogs.
- LAMB Optimizer (TF). LAMB optimizer for large-batch training (TensorFlow version).
- NLP Review Scorer. Get a score for the review of your NLP paper.
- Chinese MRC Datasets. A collection of Chinese reading comprehension datasets.