Named Entity Recognition (NER) with PyTorch


About

Pipeline for training NER models using PyTorch.
ONNX export supported.

Usage

The entire user interface is a single file, config.yaml.
Edit config.yaml to set the desired configuration, then start the pipeline with the following command:

python main.py --config config.yaml

If the --config argument is omitted, config.yaml is used by default.
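
The default described above can be sketched with argparse. This is an illustration only, not the project's actual main.py: the function name `parse_args` and the help text are assumptions, and only the `--config` flag and its `config.yaml` default come from this README.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments (hypothetical helper, not the real main.py)."""
    parser = argparse.ArgumentParser(description="NER training pipeline (sketch)")
    parser.add_argument(
        "--config",
        type=str,
        default="config.yaml",  # used when --config is not passed
        help="Path to the YAML configuration file.",
    )
    return parser.parse_args(argv)


# No flag given: falls back to config.yaml.
default_args = parse_args([])

# Explicit flag: the given path is used instead.
custom_args = parse_args(["--config", "my_config.yaml"])
```

Passing `argv` explicitly keeps the function easy to call from tests; with `argv=None`, argparse reads `sys.argv` as usual.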

To export the trained model to ONNX, set the following in config.yaml:

save:
  export_onnx: True

Data Format:

A text file with one token and its label per line, separated by whitespace. Sentences are separated by an empty line. Labels must already be in the required tagging scheme, e.g. IO, BIO, BILUO, ...

Example:

token_11    label_11
token_12    label_12

token_21    label_21
token_22    label_22
token_23    label_23

...
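
A minimal sketch of a reader for the format above, assuming each non-empty line holds exactly one token followed by one label, separated by whitespace. The function name `parse_ner_lines` is hypothetical and is not part of this project's API; the pipeline's own loader may behave differently.

```python
from typing import Iterable, List, Tuple

# One sentence is a pair: (list of tokens, list of labels).
Sentence = Tuple[List[str], List[str]]


def parse_ner_lines(lines: Iterable[str]) -> List[Sentence]:
    """Group token/label lines into sentences split on empty lines."""
    sentences: List[Sentence] = []
    tokens: List[str] = []
    labels: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # An empty line closes the current sentence, if any.
            if tokens:
                sentences.append((tokens, labels))
                tokens, labels = [], []
            continue
        # Assumption: the label is the last whitespace-separated field.
        token, label = line.rsplit(maxsplit=1)
        tokens.append(token)
        labels.append(label)
    # The file may not end with an empty line; flush the last sentence.
    if tokens:
        sentences.append((tokens, labels))
    return sentences


example = """\
token_11    label_11
token_12    label_12

token_21    label_21
token_22    label_22
token_23    label_23
"""

sentences = parse_ner_lines(example.splitlines())
```

For a file on disk, the same function can be called as `parse_ner_lines(open(path, encoding="utf-8"))`, since a file object iterates over its lines.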

Models

List of implemented models:

  • BiLSTM
  • BiLSTMCRF
  • BiLSTMAttn
  • BiLSTMAttnCRF
  • BiLSTMCNN
  • BiLSTMCNNCRF
  • BiLSTMCNNAttn
  • BiLSTMCNNAttnCRF

Docker

To simplify installation, you can deploy a container with all dependencies pre-installed.

Build container:
$ docker build -t pytorch_ner .

Run container (add --gpus all to use GPUs):
$ docker container run --rm -it -v ${PWD}:/workspace -p 6006:6006 pytorch_ner

Requirements

Python 3.6+