This repository is curated and maintained by Prof. Yang Li’s Group at Northeast Electric Power University (NEEPU), China.
🌐 Visit our academic portal: https://CausalNLP.ai
📚 Focus: Causal NLP · Trustworthy AI · Privacy-Preserving Models
This repository compiles high-quality NLP resources for research and development, including toolkits, corpora, learning materials, and technology summaries.
We aim to:
This section introduces widely used NLP libraries and toolkits, covering both English and Chinese processing, implemented in various languages such as Python, Java, and C++.
本节整理了常用的自然语言处理工具包,涵盖英文与中文处理,支持多种编程语言如 Python、Java、C++ 等。
CoreNLP: Java-based NLP toolkit from Stanford, providing robust syntactic and semantic analysis tools.
由斯坦福开发的 Java 自然语言处理工具集,支持句法与语义分析。
NLTK: A classic Python NLP toolkit offering corpora, lexical resources, and text processing functions.
Python 编写的老牌 NLP 工具包,内含语料库、词汇资源与处理接口。
gensim | GitHub: Topic modeling and document similarity library in Python.
Python 编写的主题建模、文本相似度与向量空间工具库。
LTP | Official site: Language Technology Platform from HIT (Harbin Institute of Technology) for Chinese lexical and syntactic analysis, supporting Java & Python.
哈工大开发的中文语言技术平台,支持词法与句法分析。
jieba: The most popular Python Chinese word segmentation library, widely used across platforms.
主流 Python 中文分词工具,适配多平台环境。
jieba_fast: Optimized version of jieba using Cython for faster DAG and Viterbi computation.
用 Cython 优化 DAG 与 Viterbi 计算,大幅提升性能的 jieba 版本。
NLPIR: Chinese lexical analysis toolkit by CAS/BIT, supporting Java, with Python wrapper PyNLPIR.
中科院/北理工开发的中文 NLP 工具,Java 实现,支持 Python 接口。
HanLP: Industrial-grade Chinese NLP toolkit supporting Java & Python, with pretrained models.
商用级中文 NLP 工具包,支持多任务与预训练模型。
THULAC: Efficient Chinese lexical analyzer by Tsinghua, supporting C++, Java, and Python.
清华大学开发的高效中文词法分析工具,跨平台支持。
pkuseg: Multi-domain Chinese word segmentation toolkit from Peking University, covering domains such as news, e-commerce, and medicine.
北大开发的多领域中文分词库,适用于新闻、电商、医疗等领域。
FudanNLP: NLP toolkit with datasets and ML algorithms in Java, from Fudan University.
复旦大学发布的 NLP 工具与算法平台,提供数据集支持。
Apache OpenNLP: A Java toolkit for tokenization, POS tagging, NER, chunking, parsing, and coreference resolution.
Apache 出品的 Java 工具包,涵盖 NLP 主要任务。
SnowNLP: Chinese NLP toolkit supporting segmentation, POS tagging, sentiment analysis, pinyin conversion, Simplified/Traditional conversion, and keyword and summary extraction (TextRank).
支持分词、情感分析、拼音/简繁转换、关键词提取等中文处理功能。
Ansj Seg: Java-based Chinese segmentation library using n-gram and CRF models.
Java 实现的中文分词工具,支持多种算法模式。
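The jieba_fast entry above mentions DAG computation. As a rough illustration of that idea only, the sketch below builds a word DAG from a dictionary and picks the highest-probability path by dynamic programming. The dictionary `TOY_FREQ`, its frequencies, and the function names are hypothetical and are not taken from jieba; the HMM/Viterbi step for out-of-vocabulary words is omitted.

```python
import math

# Hypothetical toy dictionary (word -> frequency); not jieba's real data.
TOY_FREQ = {
    "南京": 50, "南京市": 30, "市长": 20, "长江": 40, "大桥": 25,
    "长江大桥": 15, "江": 5, "市": 8, "长": 6, "南": 3, "京": 3,
    "大": 7, "桥": 4,
}


def build_dag(sentence, freq):
    """Map each start index to the end indices of dictionary words starting there."""
    n = len(sentence)
    dag = {}
    for i in range(n):
        ends = [j for j in range(i, n) if sentence[i:j + 1] in freq]
        dag[i] = ends or [i]  # unknown character: fall back to a one-character word
    return dag


def segment(sentence, freq=TOY_FREQ):
    """Return the segmentation with the highest product of unigram probabilities."""
    n = len(sentence)
    log_total = math.log(sum(freq.values()))
    dag = build_dag(sentence, freq)
    # best[i] = (best log-probability of sentence[i:], end index of its first word)
    best = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        best[i] = max(
            (math.log(freq.get(sentence[i:j + 1], 1)) - log_total + best[j + 1][0], j)
            for j in dag[i]
        )
    # Walk the stored choices from left to right to recover the words.
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(sentence[i:j + 1])
        i = j + 1
    return words
```

With this toy dictionary, `segment("南京市长江大桥")` prefers the two long words over any path that splits off 市长, because each extra word multiplies in another probability below 1.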
This section lists lightweight tools useful for Chinese linguistic processing, including pronunciation conversion, component lookup, error correction, and visualization.
本节汇总了一些轻量级中文处理工具,如拼音转换、部首拆解、查询纠错及可视化工具等。
Chinese Cixing: Chinese character stroke decomposition, radical query, and pinyin conversion.
中文词语笔画拆解、偏旁部首查询与拼音转换。
Chai Zi: A decomposable Chinese character dictionary database with component-level data.
拆字字典数据库,可用于部件级别的字旁查询。
python-pinyin: Convert Chinese characters to Pinyin, useful for phonetic annotation, sorting, and search.
汉字转拼音,可用于注音、排序与检索。
Nstools - zhTools: Simplified-Traditional Chinese character conversion.
中文繁简体互转工具。
QueryCorrection: Query spelling correction using pinyin similarity and edit distance.
基于拼音相似度与编辑距离的查询纠错工具。
50 Essential Matplotlib Visualizations: Most valuable Matplotlib visualizations (in Chinese).
Matplotlib 最有价值的可视化图例整理。
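The QueryCorrection entry above combines pinyin similarity with edit distance. A minimal sketch of that combination, assuming candidates are already paired with their pinyin strings (the helper names and the candidate table are hypothetical, not that project's API), could look like this:

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_correction(query_pinyin, candidates):
    """Pick the candidate whose pinyin is closest to the query's pinyin.

    `candidates` is a hypothetical dict mapping a word to its pinyin string.
    """
    return min(candidates, key=lambda w: edit_distance(query_pinyin, candidates[w]))
```

For example, `best_correction("beijin", {"北京": "beijing", "南京": "nanjing"})` selects 北京, since its pinyin is one insertion away from the query.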
This section collects widely used Chinese NLP corpora, including general-purpose, domain-specific, dialogue, lexicons, and pretrained embeddings.
本节收录常用中文语料资源,涵盖通用语料、垂直领域、对话语料、词典与词向量等。
Chinese Wikipedia: Full-text dump of the Chinese-language Wikipedia.
NLP Corpus Collection: NLP and KG-related datasets, organized by tasks.
自然语言处理与知识图谱语料集,按任务分类整理。
People's Daily 1998 Annotated Corpus: Annotated Chinese corpus built from People's Daily text.
Sogou Labs Datasets: Sogou provides Chinese collocation, news, and web-scale corpora.
搜狗实验室资源:中文搭配库、新闻数据与互联网语料。
Chinese Dialogue Corpus: Multi-turn Chinese dialogue data from Chatterbot, Douban, PTT, Weibo, Xiaohuangji, TV scripts, forums, etc.
中文多轮对话语料,覆盖豆瓣、微博、小黄鸡等。
THUOCL: Domain-specific Chinese lexicons (e.g. IT, law, food, idioms).
清华开放中文词库,覆盖多个垂直领域。
Chinese Lexicon: Word lists including people names, organizations, etc.
各类中文词库,如人名、机构名等。
Chinese Dependency Treebank: Provided by NLP&CC 2013 technical evaluation.
中文依存句法树库。
WeChat Public Corpus: Crawled articles from public WeChat accounts.
微信公众号语料库。
Chinese Rumor Dataset: Fake news dataset from Sina Weibo.
新浪微博谣言数据集。
Tencent AI Embedding Corpus: Chinese word and phrase embeddings.
腾讯 AI 实验室中文词向量。
Word2Vec Slim: Slimmed version of Google News Word2Vec embeddings.
Google News 词向量简化版。
Chinese Word2Vec Models: Pretrained Chinese embeddings.
中文 Word2Vec 预训练模型。
Chinese Word Vectors: Multiple pretrained embeddings in Chinese.
中文词向量集合。
nlp_chinese_corpus: Includes Wikipedia, news, QA, and translation corpora.
包含百科、问答、翻译等多样语料。
Chinese Classical Poetry Corpus: A comprehensive Chinese poetry and literature database.
中文古典诗词数据库。
Chinese RC Dataset: Reading comprehension dataset in Chinese.
中文阅读理解数据集。
Chinese WOE Correction Corpus: Word ordering error detection corpus.
中文语序纠错语料库。
THUCNews: 740,000 categorized Chinese news articles from Sina RSS feeds (2005–2011).
THU 新闻文本分类数据集。
Company Name Corpus: Chinese company, organization, and brand name datasets.
公司、机构、品牌等名称词库。
Chinese Name Corpus: People name datasets (common, ancient, foreign).
中文人名、古人名、翻译人名等集合。
Abbreviation Dataset: Chinese abbreviation wordlists.
中文简称词典。
Chinese-Xinhua: Xinhua Dictionary data covering xiehouyu (two-part allegorical sayings), idioms, words, and individual characters.
中华新华字典,含歇后语、成语、词语与汉字释义。
Couplet Dataset: Chinese couplet data.
对联生成数据集。
5156Edu Tools: Chinese synonym/antonym/pinyin converters (in Chinese).
在线近义词、反义词与拼音转换工具。
EmotionLexicon: Fine-grained emotion and sentiment dictionaries, plus internet-slang and stopword lists.
情感词典、网络词汇与停用词典。
Chinese Dictionary: Synonym, antonym, and negation lists.
中文同义词、反义词、否定词等词表。
Synonyms: Chinese synonym toolkit.
中文近义词处理工具。
Chinese NLP Corpus: Semantic words, domain consensus, diachronic data, and evaluation corpora.
中文 NLP 通用/垂直/历时/评测语料集合。
CEC-Corpus: Chinese Emergency Corpus for sudden event detection.
中文突发事件语料库。
HardNLU (the "NLP Is Too Hard" series): Curated collection of Chinese NLP datasets and resources for complex tasks.
面向复杂任务的中文 NLP 数据集合集。
Corpus Construction 语料构建工具
Opencc Python — Python-based converter between Traditional and Simplified Chinese.
Python 简繁体中文转换工具。
Pinyin Python — Chinese-to-Pinyin converter in Python.
汉字转拼音的 Python 工具。
Python Simulated Login (Python模拟登陆) — Simulated login scripts for popular websites using Python.
使用 Python 模拟登录大型网站的工具集合。
Baidu Baike Spider — Python-based crawler for Baidu Baike entries.
使用 Python 抓取百度百科词条的爬虫工具。
Sina Weibo Spider — Java-based crawler for Sina Weibo content.
使用 Java 实现的新浪微博内容采集工具。
Sougou Words Collector — Tool for extracting and converting Sogou Input Method dictionaries.
搜狗输入法词库抓取与格式转换工具。
Baike Knowledge Schema — Crawler for extracting conceptual classification systems from Baidu and Hudong Baike.
面向百度百科与互动百科的概念分类体系抓取脚本。
Baike Info Extraction — Extracts structured infobox information from various Chinese encyclopedias.
对百科词条中的 infobox 结构化信息进行抽取与融合。
Baidu Index Spyder — Automatically collects historical Baidu search index data by keyword.
按关键词抓取百度搜索指数数据的自动化工具。
Ali Index Spyder — Crawler for Alibaba commodity indices including procurement and supply.
抓取阿里商品指数数据,包括淘宝与1688的采购和供应指数。
News Search Engine Crawler (新闻搜索引擎新闻爬取) — News crawler based on the Scrapy framework, supporting Baidu, Sogou, Sina, 360, and Xinhua.
支持多家新闻搜索引擎的 Scrapy 框架爬虫。
Distributed Crawler for General News Sites (通用新闻类网站分布式爬虫) — Distributed crawler for general news websites; extracts title, time, author, and content.
可提取新闻标题、时间、作者、正文等内容的通用分布式爬虫。
Deep Learning Frameworks 深度学习框架
Keras Official Docs, Chinese Docs, Examples
Keras 官方文档、中文文档与示例代码。
TensorFlow Official Docs, Chinese Docs, Tutorial in Chinese, Examples, Cookbook
TensorFlow 全套资源,包括官方文档、中文教程与代码示例。
PyTorch Official Docs, Chinese Docs, Examples, Resources, Practical Guide, Awesome PyTorch, PyTorch Tutorial
PyTorch 资源大全,包括中文手册、示例教程与高质量仓库列表。
Deploying PyTorch Models via Flask — REST API deployment example with Flask; applicable to other DL models.
使用 Flask 部署 PyTorch 模型教程,可推广至其他深度学习模型部署场景。
ML Resources 机器学习书籍与资料
Statistical Learning Methods — A rigorous, foundational Chinese ML textbook, by Hang Li.
《统计学习方法》,经典教材,公式推导与定理逻辑严谨。
Machine Learning (Zhou Zhihua) — Widely known as “The Watermelon Book,” a go-to for ML beginners.
周志华老师《机器学习》,入门必读教材。
Deep Learning (Chinese Translation) — Chinese version of the book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. GitHub source
深度学习领域权威教材中文版,免费开源。
Machine Learning Code Examples — Implementation of common ML algorithms.
常见机器学习算法的实现示例合集。
Deep Learning 500 Questions — Q&A-style guide to DL/ML/linear algebra/probability/CV topics.
问答形式讲解概率论、线性代数、机器学习、深度学习与计算机视觉等内容。
Neural Networks and Deep Learning — Covers CNNs, RNNs, and their applications in NLP and CV, by Xipeng Qiu.
邱锡鹏老师编写的《神经网络与深度学习》,系统介绍主流模型与应用。
Andrew Ng’s Notebooks: ML Notes, Deep Learning Notes
吴恩达机器学习与深度学习课程笔记合集。
Machine Learning Yearning — Practice-focused ML guidance by Andrew Ng.
强调实践经验的机器学习策略指导手册,作者:Andrew Ng。
NLP Resources 自然语言处理书籍与资料
The Beauty of Mathematics in Computer Science — A popular science book introducing NLP ideas, by Jun Wu.
吴军《数学之美》,NLP 入门经典,通俗易懂。
Statistical Natural Language Processing — Comprehensive NLP theory and methodology, by Chengqing Zong.
宗成庆《统计自然语言处理》第二版,覆盖基础与最新进展。
Neural Network Methods for NLP — Techniques in NLP with neural networks, by Yoav Goldberg.
Yoav Goldberg 所著神经网络 NLP 方法书籍。
Report on Chinese Information Development — 2016 overview by Chinese Information Processing Society.
中国中文信息学会发布的 NLP 发展概览。
Speech and Language Processing (3rd Edition) — Classic NLP textbook by Dan Jurafsky and James H. Martin.
NLP 权威教材,Jurafsky 与 Martin 合著。
Deep Learning for NLP (CityU Lecture) — Application of DL in NLP, by Xipeng Qiu.
邱锡鹏老师关于深度学习与 NLP 的讲义。
NLP Reading List (2019) — Introductory NLP book recommendations by Zhiyuan Liu.
刘知远老师推荐的 NLP 入门阅读清单(2019年版)。
nlp — An open-source NLP introduction book.
一本结构清晰的开源 NLP 入门书。
Blogs and Courses 博客和课程
NLP Progress
Repository to track progress in NLP, including datasets and state-of-the-art results for various tasks.
跟踪 NLP 最新研究进展和主流任务的 SOTA 性能与数据集的项目。
Everything You Need to Know About Text Processing in NLP and ML
A WeChat article introducing common text preprocessing methods in NLP and ML.
自然语言处理与机器学习中常见文本预处理方法汇总文章。
BERT Model Resources
From Word Embedding to BERT: A History of Pretraining in NLP by Junlin Zhang (张俊林)
A historical overview of pretrained models in NLP, from word embeddings to BERT.
NLP 预训练模型发展简史,从词向量到 BERT 的演进。
BERT GitHub (Google Research)
Official TensorFlow implementation and pretrained models by Google.
Google 官方发布的 BERT TensorFlow 实现及预训练模型。
Awesome BERT
A curated list of papers, applications, and resources related to BERT.
BERT 相关论文、应用与资源整合列表。
Awesome BERT NLP
Resources focused on BERT, attention mechanisms, Transformers, and transfer learning.
聚焦 BERT、注意力机制、Transformer 和迁移学习的精选资源。
The Illustrated BERT, ELMo, and co.
Visual and intuitive explanation of BERT and ELMo by Jay Alammar.
Jay Alammar 用可视化方式通俗解释 BERT 和 ELMo 模型。
BERT as a Service
Use BERT as an encoder service to generate fixed-length sentence embeddings.
将 BERT 模型作为服务使用,生成句子向量表示。
PyTorch Pretrained BERT
PyTorch implementation and pretrained BERT models by HuggingFace.
HuggingFace 提供的 PyTorch 版本 BERT 实现与预训练模型。
BERT Classification Tutorial
Practical tutorial on using BERT for text classification.
使用 BERT 进行文本分类的实战教程。
BERT Utils
Utilities for sentence embeddings, classification, and similarity computation with BERT.
提供 BERT 的句向量生成、文本分类、相似度计算工具代码。
BERT BiLSTM-CRF for NER
TensorFlow implementation for NER using BERT + BiLSTM-CRF.
使用 BERT 与 BiLSTM-CRF 实现命名实体识别(NER)。
BERT Chinese NER
BERT-based approach for Chinese named entity recognition.
基于 BERT 的中文命名实体识别模型。
BERT Innovations in NLP by Junlin Zhang (张俊林)
A comprehensive review of BERT innovations across NLP tasks.
全面综述 BERT 在各类 NLP 任务中的创新与应用。
Text Modeling and Analysis
Self Attention Mechanism: Simple TensorFlow implementation of “A Structured Self-attentive Sentence Embedding” (ICLR 2017)
Encoder Decoder: Four types of encoder-decoder models implemented using Python, Theano, Keras, and Seq2Seq
Seq2seq: Sequence-to-sequence learning with Keras
Keras Language Modeling: Language modeling code for question-answering tasks using Keras
CNN for Sentence Classification in Keras: Keras implementation of “Convolutional Neural Networks for Sentence Classification” (EMNLP 2014)
CNN for Classification: PyTorch implementation of CNN for sentence classification
Awesome NLP Sentiment Analysis: A curated list of datasets, papers, and open-source implementations for sentiment analysis
Text Similarity
Cilin and Hownet: Word similarity computation using the extended Chinese thesaurus (Cilin) and HowNet
Similarity Compute: Sentence similarity based on Cilin, HowNet, fingerprints, word embeddings, and VSM
Siamese Sentence Similarity: Siamese BiLSTM model for sentence similarity with training and test datasets
SentenceSim: Chinese short sentence similarity based on HowNet, one-hot, word2vec, Harbin SDP, LSTM, and fusion algorithms
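Several entries above rely on the vector space model (VSM) for sentence similarity. As an illustrative sketch only, assuming sentences are already tokenized and using raw term counts as vector weights (the function name is hypothetical and not taken from any listed project), cosine similarity can be computed as follows:

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists under a bag-of-words model."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

Real systems typically replace the raw counts with TF-IDF weights or dense embeddings; the cosine step itself stays the same.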
Text Disambiguation
Information Extraction
Open IE Papers: Collection of OpenIE and ORE papers and resources
Relation Extraction Summary: A summary of relation extraction/classification papers from 2013–2017
LM-LSTM-CRF: PyTorch implementation of “Empower Sequence Labeling with Task-Aware Neural Language Model” (AAAI 2018)
Named Entity Relation Extraction: Entity relation extraction using syntactic parsing
Pytorch Relation Extraction: PyTorch reproduction of PCNN + MIL (Zeng 2015) and PCNN + ATT (Lin 2016)
Zh NER TF: BiLSTM-CRF model for Chinese NER using TensorFlow
BERT BiLSTM CRF NER: NER using BERT with BiLSTM-CRF (TensorFlow)
Event Triples Extraction: Event triple extraction using dependency parsing and semantic role labeling
Important Event Extractor: Ranking news importance based on DoCRank and selecting timeline highlights
Text Grapher: Extracts and structures key document information into a semantic graph
Knowledge Graph from Scratch: Zhihu column on building knowledge graphs from scratch
Text Info Exp: Toolkit for TF-IDF, classification, clustering, word vectors, sentiment, and relation extraction
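TF-IDF, named in the Text Info Exp entry above, weights a term by how often it occurs in a document and how rare it is across the collection. The sketch below uses the plain `tf * log(N / df)` variant on pre-tokenized documents; the function name is hypothetical, and libraries often use smoothed variants that give slightly different numbers.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a list of token lists.

    Returns one {term: weight} dict per document.
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights
```

A term that appears in every document gets weight 0 under this variant, which is why it is often used to filter out uninformative words before keyword extraction.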
Text Generation
Texar: A toolkit for text generation and beyond
Awesome Text Generation: A curated list of recent models and applications in text generation
Ehud Reiter’s Blog: In-depth discussions on NLG techniques, evaluation, and applications
Talk Latent: Slides for “Controlling Text Generation” by Alexander Rush
Sequence Labeling
Reading Comprehension
CMRC 2017: First Chinese Machine Reading Comprehension evaluation workshop
CMRC 2018: Second Chinese Machine Reading Comprehension evaluation workshop
Neural Reading Comprehension and Beyond: PhD thesis by Danqi Chen on neural reading comprehension
Teaching Machines to Read: A review article on machine reading comprehension and key research papers
QA System
AnyQ: FAQ-based question answering system by Baidu
KBQA using Seq2Seq: Implementation of a knowledge-base QA system with a seq2seq model, GitHub
Knowledge Graph
Intro Guide to Knowledge Graph Technology: A beginner-friendly guide to knowledge graph technology and applications
Slides About Knowledge Graph: Collection of knowledge graph slides, Baidu Netdisk code: z5yb
Agriculture Knowledge Graph: Tools for entity recognition, relation extraction, classification trees, and data mining in agriculture
Person Relation Knowledge Graph: Chinese person relation knowledge graph construction, distant supervision, bootstrapping, and QA applications
Awesome Knowledge Graph: Curated resources for learning and building knowledge graphs
Automatic/Semi-automatic Knowledge Extraction: Blog post from Microsoft Research Asia on knowledge base construction
Bottom-Up Knowledge Graph Construction: Detailed tutorial on constructing knowledge graphs from the bottom up
Chinese Knowledge Graph APIs and Tools: A summary of Chinese APIs, tools, research institutes, and frameworks
Product-Oriented View on Knowledge Graphs: Analysis of the value and application of knowledge graphs from a product management perspective
Military Knowledge Graph and QA: QA system for a military knowledge graph (5800+ entries), including spacecraft, aircraft, and space equipment
Relation Extraction
Note: The order does not indicate ranking. This list is not exhaustive—contributions are welcome!
ACL Anthology
A digital archive of research papers in natural language processing and computational linguistics.
NLP Conference Calendar
A maintained calendar for major NLP conferences and submission deadlines.
NLP Group, Institute of Computing Technology, Chinese Academy of Sciences
NLP Group, Institute of Automation, Chinese Academy of Sciences
Chinese Information Processing Lab, Institute of Software, CAS
Natural Language Processing & Social Human Computing Lab, Tsinghua University – GitHub
THUIR Group, State Key Lab of Intelligent Technology and Systems, Tsinghua University
MOE Key Laboratory of Computational Linguistics, Peking University
Social Computing and Information Retrieval Center, Harbin Institute of Technology
Text Mining Group, CUHK
Chinese University of Hong Kong
Social Media Mining Group, PolyU
The Hong Kong Polytechnic University
Human Language Technology Center, HKUST
Hong Kong University of Science and Technology
NLP2CT Lab, University of Macau
NLP and Portuguese-Chinese Machine Translation Lab
Singapore / Japan / Israel / Australia
NUS Natural Language Processing Group(新加坡国立大学自然语言处理组)
NLP and Big Data Research Group in the ISTD pillar at the Singapore University of Technology and Design (新加坡科技设计大学自然语言处理和大数据研究组)
NLP Research Group at the Nanyang Technological University(南洋理工大学自然语言处理组)
Advanced Translation Technology Laboratory at National Institute of Information and Communications Technology(日本情报通讯研究所高级翻译技术实验室)
Nakayama Laboratory at University of Tokyo (东京大学中山实验室)
Natural Language Processing Lab at Bar-Ilan University (以色列巴伊兰大学自然语言处理实验室)
The University of Melbourne NLP Group(澳大利亚墨尔本大学自然语言处理组)
Natural Language Processing - Research at Google
Google NLP Research Group (Google自然语言处理组)
The Redmond-based Natural Language Processing group
Microsoft NLP Group (微软自然语言处理组)
Facebook AI Research (FAIR)
Facebook AI Research (Facebook AI研究部)
IBM Thomas J. Watson Research Center
IBM Thomas J. Watson Research Center(IBM研究中心)
The Stanford Natural Language Processing Group
Stanford NLP Group(斯坦福大学自然语言处理组)
The Berkeley Natural Language Processing Group
Berkeley NLP Group(加州大学伯克利分校自然语言处理组)
Natural Language Processing research at Columbia University
Columbia University NLP Group(哥伦比亚大学自然语言处理组)
Graham Neubig’s lab at the Language Technologies Institute of Carnegie Mellon University
CMU NeuLab(卡内基梅隆大学Graham Neubig实验室)
RPI Blender Lab
NLP Lab at Rensselaer Polytechnic Institute(伦斯勒理工学院NLP实验室)
UC Santa Barbara Natural Language Processing Group
UCSB NLP Group(加州大学圣塔芭芭拉分校NLP组)
The Natural Language Group at the USC Information Sciences Institute
USC ISI NLP Group(南加州大学信息科学研究所NLP组)
Natural Language Processing @USC
USC NLP Group(南加州大学自然语言处理组)
Natural Language Processing Group at University of Notre Dame
Notre Dame NLP Group(圣母大学NLP组)
Artificial Intelligence Research Group at Harvard
Harvard AI Group(哈佛大学人工智能研究组)
The Harvard natural-language processing group
Harvard NLP Group(哈佛大学NLP组)
Computational Linguistics and Information Processing at Maryland
UMD CLIP Lab(马里兰大学计算语言学与信息处理实验室)
Language and Speech Processing at Johns Hopkins University
JHU CLSP(约翰斯·霍普金斯大学语言与语音处理实验室)
Human Language Technology Center of Excellence at JHU
JHU HLTCOE(人类语言技术卓越中心)
Machine Translation Group at The Johns Hopkins University
JHU Machine Translation Group(约翰斯·霍普金斯大学机器翻译组)
Machine Translation Research at Rochester
University of Rochester MT Group(罗切斯特大学机器翻译研究)
NLP @ University of Illinois at Urbana-Champaign
UIUC NLP Group(伊利诺伊大学厄巴纳-香槟分校NLP组)
UIC Natural Language Processing Laboratory
UIC NLP Lab(伊利诺伊大学芝加哥分校NLP实验室)
Human Language Technology Research Institute at The University of Texas at Dallas
UT Dallas HLT Research Institute(德州大学达拉斯分校人类语言技术研究所)
Natural Language Processing Group at MIT CSAIL
MIT NLP Group(麻省理工学院计算机与人工智能实验室)
Natural Language Processing Group at Texas A&M University
Texas A&M NLP Group(德州农工大学NLP组)
The Natural Language Processing Group at Northeastern University
Northeastern NLP Group(东北大学NLP组)
Cornell NLP group
Cornell NLP Group(康奈尔大学自然语言处理组)
Natural Language Processing group at University Of Washington
UW NLP Group(华盛顿大学NLP组)
Natural Language Processing Research Group at University of Utah
University of Utah NLP Group(犹他大学自然语言处理组)
Natural Language Processing and Information Retrieval group at University of Pittsburgh
Pitt NLP & IR Group(匹兹堡大学NLP与信息检索组)
Brown Laboratory for Linguistic Information Processing (BLLIP)
BLLIP at Brown University(布朗大学语言信息处理实验室)
Natural Language Processing (NLP) group at University of British Columbia
UBC NLP Group(不列颠哥伦比亚大学NLP组)
Natural Language and Information Processing Research Group at University of Cambridge
University of Cambridge NLP Group(英国剑桥大学自然语言与信息处理组)
The Computational Linguistics Group at Oxford University
University of Oxford Computational Linguistics Group(英国牛津大学计算语言学组)
Human Language Technology and Pattern Recognition Group at the RWTH Aachen
RWTH Aachen HLT & Pattern Recognition Group(德国亚琛工业大学语言技术与模式识别组)
The Natural Language Processing Group at the University of Edinburgh (EdinburghNLP)
University of Edinburgh NLP Group(英国爱丁堡大学自然语言处理研究组)
Statistical Machine Translation Group at the University of Edinburgh
Edinburgh SMT Group(英国爱丁堡大学统计机器翻译组)
Natural Language Processing Research Group at The University of Sheffield
University of Sheffield NLP Group(英国谢菲尔德大学自然语言处理组)
Speech Research Group at University of Cambridge
University of Cambridge Speech Research Group(剑桥大学语音研究组)
Statistical Machine Translation Group at the University of Cambridge
Cambridge SMT Group(剑桥大学统计机器翻译组)
Computational Linguistics group at Uppsala University
Uppsala University CL Group(瑞典乌普萨拉大学计算语言学组)
The Center for Information and Language Processing at University of Munich
LMU Munich CIS(德国慕尼黑大学信息与语言处理中心)
National Centre for Language Technology at Dublin City University
DCU NCLT(爱尔兰都柏林城市大学国家语言技术中心)
The National Centre for Text Mining (NaCTeM) at University of Manchester
University of Manchester NaCTeM(英国曼彻斯特大学国家文本挖掘中心)
The Information and Language Processing Systems group at the University of Amsterdam
University of Amsterdam ILPS Group(荷兰阿姆斯特丹大学信息与语言处理系统组)
Institute of Formal and Applied Linguistics at Charles University
Charles University UFAL(捷克查理大学语言学应用与规范研究所)
DFKI Language Technology Lab
DFKI LT Lab(德国人工智能研究中心语言技术实验室)
IXA in University of the Basque Country
University of the Basque Country IXA Group(西班牙巴斯克大学自然语言处理组)
Statistical Natural Language Processing Group at the Institute for Computational Linguistics at Heidelberg University
Heidelberg University StatNLP Group(德国海德堡大学统计自然语言处理组)
NLP Research at the University of Helsinki
University of Helsinki NLP Group(芬兰赫尔辛基大学自然语言处理组)
This project draws upon and integrates outstanding open-source contributions from the following communities:
Special thanks to Jia Zheng (Institute of Software, Chinese Academy of Sciences) for his early compilation and curation efforts.
Our team has further updated, reorganized, and continues to maintain this repository.
If your work has been used without proper attribution, please contact us for timely acknowledgment.
📖 This repository is intended solely for academic learning and research.
Please refer to the original sources for licensing and usage terms.
If any content infringes on your rights, please contact us and we will promptly address it.
For questions, corrections, or collaboration inquiries, please reach out to:
Prof. Yang Li
Email: liyang@neepu.edu.cn
Website: https://CausalNLP.ai