Chinese Nested Named Entity Recognition Based on Suffix Labeling

Our code is largely based on W2NER. We optimize the span-based approach with a suffix labeling method, aiming to improve the performance of Chinese NER.

Environments

python 3.8.19
pytorch 2.4.0+cu118

Dependencies

numpy (1.24.1)
transformers (4.44.2)
scikit-learn (1.3.2)
prettytable (3.11.0)

Preparation

Download the datasets.
Process them to match the format of the example in data/.
Put the processed data into the directory data/.
Here is an example of the processed data:

{"sentence": ["中", "国", "科", "学", "院", "大", "学"], "ner": [{"index": [0,1,2,3,4,5,6], "type": "organization"} ] }

We provide the script process_CMeEE.py for processing the original CMeEE dataset.
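
To make the target format concrete, the following is a minimal sketch of converting an offset-annotated record into the character-level shape shown above and reading the entities back. It is not the code in process_CMeEE.py. The input field names (`text`, `entities`, `start_idx`, `end_idx`, `type`), the inclusive `end_idx`, and the nested `location` entity are assumptions for illustration, so check them against your raw data.

```python
import json


def to_processed(record, end_inclusive=True):
    """Convert one offset-annotated record into the character-level format.

    Assumed (hypothetical) input shape:
    {"text": str, "entities": [{"start_idx": int, "end_idx": int, "type": str}]}
    """
    chars = list(record["text"])
    ner = []
    for ent in record.get("entities", []):
        start = ent["start_idx"]
        # Treat end_idx as inclusive unless told otherwise (assumption).
        stop = ent["end_idx"] + 1 if end_inclusive else ent["end_idx"]
        ner.append({"index": list(range(start, stop)), "type": ent["type"]})
    return {"sentence": chars, "ner": ner}


def entity_strings(processed):
    """Rebuild each entity's surface text from its character indices."""
    return [
        ("".join(processed["sentence"][i] for i in ent["index"]), ent["type"])
        for ent in processed["ner"]
    ]


raw = {
    "text": "中国科学院大学",
    "entities": [
        {"start_idx": 0, "end_idx": 6, "type": "organization"},
        # Hypothetical nested entity, added only to show overlapping spans.
        {"start_idx": 0, "end_idx": 1, "type": "location"},
    ],
}

processed = to_processed(raw)
print(json.dumps(processed, ensure_ascii=False))
print(entity_strings(processed))
```

Because each entity stores its own list of character indices, nested entities are simply separate entries in "ner" whose index lists overlap.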

Create configuration files (*.json) in configs/.
We provide configuration files for Weibo, Resume, OntoNotes4 and CMeEE-V2.

Training

python main.py --config ./configs/example.json

Experiment records are stored in the directory train_logs/.
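
If you want to train on several datasets in turn, the command above can be generated once per configuration file. The sketch below is a hypothetical helper, not part of this repository. It assumes only what is stated above: that main.py accepts `--config` and that the configuration files are the *.json files in configs/. It prints the commands and leaves the actual launch commented out.

```python
import sys
from pathlib import Path


def build_commands(config_dir="./configs", entry="main.py"):
    """Return one argv list per *.json file in config_dir, sorted by name."""
    commands = []
    for cfg in sorted(Path(config_dir).glob("*.json")):
        # Same shape as: python main.py --config ./configs/example.json
        commands.append([sys.executable, entry, "--config", cfg.as_posix()])
    return commands


if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
        # To actually start training, uncomment the next two lines:
        # import subprocess
        # subprocess.run(cmd, check=True)
```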
