| Category | Name | Description | Paper | Download |
| --- | --- | --- | --- | --- |
| Large Language Models (LLMs) | Chinese-LLaMA-Alpaca | Chinese LLaMA and Alpaca LLMs, with tools for quick deployment on personal computers. | arXiv 2023 | Star |
| | Chinese-LLaMA-Alpaca-2 | Chinese LLaMA-2 and Alpaca-2 LLMs, with tools for quick deployment on personal computers. | arXiv 2023 | Star |
| | VisualCLA | Visual Chinese LLaMA and Alpaca LLMs. | - | Star |
| Pre-Trained Language Models | VLE | A multimodal Vision-Language Encoder. | - | Star |
| | LERT | Pre-trained Chinese LERT. | arXiv 2022 | Star |
| | MiniRBT | Pre-trained small models: MiniRBT. | arXiv 2023 | Star |
| | PERT | Pre-trained Chinese and English PERT. | arXiv 2022 | Star |
| | CINO | Pre-trained language model for Chinese minority languages. | arXiv 2022 | Star |
| | MacBERT | Pre-trained Chinese MacBERT. | IEEE/ACM TASLP | Star |
| | CharBERT | Character-aware pre-trained language model. | COLING 2020 | Star |
| | Chinese-ELECTRA | Pre-trained Chinese ELECTRA-large, ELECTRA-base, etc. | IEEE/ACM TASLP | Star |
| | Chinese-XLNet | Pre-trained Chinese XLNet-mid, XLNet-base. | arXiv | Star |
| | Chinese-BERT-wwm | Pre-trained Chinese BERT-wwm, RoBERTa-wwm-ext, RBT3, etc. | IEEE/ACM TASLP | Star |
| Datasets | ExpMRC | Explainability Evaluation for MRC. | Heliyon | Star |
| | AdvRACE | A dataset for testing robustness of MRC models. | Findings of ACL 2021 | Star |
| | CMRC 2019 | A sentence cloze dataset for Chinese MRC. | COLING 2020 | Star |
| | CJRC | A dataset for Chinese judicial MRC. | CCL 2019 | Star |
| | CMRC 2018 | A span-extraction dataset for Chinese MRC. | EMNLP 2019 | Star |
| | CMRC 2017 | A cloze-style Chinese reading comprehension dataset. | LREC 2018 | Star |
| | PD & CFT | A Chinese cloze-style MRC dataset: PD&CFT. | COLING 2016 | Star |
| Toolkit | TextPruner | An open-source model pruning toolkit for NLP. | ACL 2022 Demo | Star |
| | TextBrewer | An open-source knowledge distillation toolkit for NLP. | ACL 2020 Demo | Star |
| Models | RecAdam | An optimizer for fine-tuning PLMs with less forgetting. | EMNLP 2020 | Star |
| | PR-Embedding | Conversational word embeddings for retrieval-based dialog systems. | ACL 2020 | Star |
| | Dual BERT | A neural cross-lingual approach for MRC. | EMNLP 2019 | Star |
| | TripleNet | A network for multi-turn retrieval-based dialog systems. | CoNLL 2019 | Star |
| Miscellaneous | ChatGPT-in-Academia | Policies of publishers and conferences on the use of LLMs. | - | Star |
| | Chinese ACL2020 Blogs | Chinese version of ACL 2020 PC blogs. | - | Star |
| | LAMB Optimizer (TF) | LAMB optimizer for large batch training (TensorFlow version). | - | Star |
| | NLP Review Scorer | Get a score for the review of your NLP paper. | - | Star |
| | Chinese MRC Datasets | A collection of Chinese reading comprehension datasets. | - | Star |