
A PyTorch Implementation of the R-BERT Relation Classification Model


This is an unofficial PyTorch implementation of the R-BERT model described in the paper Enriching Pre-trained Language Model with Entity Information for Relation Classification.

In addition to the SemEval-2010 dataset tested in the original paper, I also test the implementation on the more recent TACRED dataset.
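For orientation, here is a minimal sketch of the model's core idea as described in the paper; it is an illustration, not this repo's actual code, and the class and helper names are assumptions. R-BERT wraps the first entity in '$' markers and the second in '#' markers, then concatenates the [CLS] vector with the averaged hidden states of the two entity spans before classifying:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class RBERTSketch(nn.Module):
    """Illustrative R-BERT: combine [CLS] with averaged entity-span states."""

    def __init__(self, num_labels, model_name="bert-base-uncased", dropout=0.1):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        self.dropout = nn.Dropout(dropout)
        self.cls_fc = nn.Linear(hidden, hidden)
        self.ent_fc = nn.Linear(hidden, hidden)  # shared by both entities, as in the paper
        self.classifier = nn.Linear(hidden * 3, num_labels)

    def _span_mean(self, states, mask):
        # Average token states over an entity span (mask is 1 inside the span).
        mask = mask.unsqueeze(-1).float()
        return (states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)

    def forward(self, input_ids, attention_mask, e1_mask, e2_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        states = out.last_hidden_state  # (batch, seq_len, hidden)
        cls = torch.tanh(self.cls_fc(self.dropout(out.pooler_output)))
        e1 = torch.tanh(self.ent_fc(self.dropout(self._span_mean(states, e1_mask))))
        e2 = torch.tanh(self.ent_fc(self.dropout(self._span_mean(states, e2_mask))))
        return self.classifier(self.dropout(torch.cat([cls, e1, e2], dim=-1)))
```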

Requirements

Install

$ git clone https://github.com/mickeystroller/R-BERT
$ cd R-BERT

Train

SemEval-2010

The SemEval-2010 dataset is already included in this repo and you can directly run:

CUDA_VISIBLE_DEVICES=0 python r_bert.py --config config.ini
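For intuition about the input format, below is a hedged sketch of how a SemEval sentence with <e1>/<e2> tags could be converted into the '$'/'#'-marked form and per-entity span masks; this is not the repo's actual preprocessing, and mark_entities/span_mask are hypothetical helpers:

```python
import re
from transformers import BertTokenizer

def mark_entities(sent):
    # Swap SemEval <e1>...</e1> / <e2>...</e2> tags for R-BERT's markers.
    sent = re.sub(r"</?e1>", " $ ", sent)
    sent = re.sub(r"</?e2>", " # ", sent)
    return " ".join(sent.split())

def span_mask(tokens, marker):
    # 1 for tokens strictly between the two marker occurrences, else 0.
    first, second = [i for i, t in enumerate(tokens) if t == marker]
    return [1 if first < i < second else 0 for i in range(len(tokens))]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
text = mark_entities("The <e1>company</e1> fabricates plastic <e2>chairs</e2>.")
tokens = tokenizer.tokenize(text)   # '$' and '#' survive as standalone tokens
e1_mask = span_mask(tokens, "$")    # offsets for [CLS]/[SEP] omitted for brevity
e2_mask = span_mask(tokens, "#")
```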

TACRED

You first need to download the TACRED dataset from the LDC; due to license restrictions I cannot include it in this repo. Then, you can directly run:

CUDA_VISIBLE_DEVICES=0 python r_bert.py --config config_tacred.ini

Eval

SemEval-2010

We use the official scoring script for SemEval-2010 Task 8:

$ cd eval
$ bash test.sh
$ cat res.txt

TACRED

First, we generate the prediction file tac_res.txt:

$ python eval_tacred.py

You may change the test file and model paths in eval_tacred.py.

Then, we use the official scoring script for the TACRED dataset:

$ python ./eval/score.py -gold_file <TACRED_DIR/data/gold/test.gold> -pred_file ./eval/tac_res.txt

Results

SemEval-2010

Below are the Macro-F1 scores:

| Model              | Original Paper | Ours  |
|--------------------|----------------|-------|
| BERT-uncased-base  | ----           | 88.40 |
| BERT-uncased-large | 89.25          | 90.16 |

TACRED

Below are the evaluation results:

| Model              | Precision (Micro) | Recall (Micro) | F1 (Micro) |
|--------------------|-------------------|----------------|------------|
| BERT-uncased-base  | 72.99             | 62.50          | 67.34      |
| BERT-cased-base    | 71.27             | 64.84          | 67.91      |
| BERT-uncased-large | 72.91             | 66.20          | 69.39      |
| BERT-cased-large   | 70.86             | 65.96          | 68.32      |

References

  1. https://github.com/wang-h/bert-relation-classification

  2. Shanchan Wu and Yifan He. Enriching Pre-trained Language Model with Entity Information for Relation Classification. CIKM 2019.