Add new result for GEC (#642)
* Add new result for GEC

* Update grammatical_error_correction.md

Add EditScorer and GRECO

---------

Co-authored-by: Stuart Mesham <[email protected]>
mrqorib and StuartMesham authored Dec 15, 2023
1 parent 00a483f commit 0fbc440
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions english/grammatical_error_correction.md
@@ -18,12 +18,15 @@ The shared task setting restricts that systems use only publicly available datas

| Model | F0.5 | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
| GRECO (Qorib and Ng, EMNLP 2023) | 71.12 | [System Combination via Quality Estimation for Grammatical Error Correction](https://aclanthology.org/2023.emnlp-main.785) | [official](https://github.com/nusnlp/greco) |
| ESC (Qorib et al., NAACL 2022) | 69.51 | [Frustratingly Easy System Combination for Grammatical Error Correction](https://aclanthology.org/2022.naacl-main.143/) | [official](https://github.com/nusnlp/esc) |
| T5 ([t5.1.1.xxl](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md)) trained on [cLang-8](https://github.com/google-research-datasets/clang8) (Rothe et al., ACL-IJCNLP 2021) | 68.87 | [A Simple Recipe for Multilingual Grammatical Error Correction](https://arxiv.org/pdf/2106.03830.pdf) | [T5](https://github.com/google-research/text-to-text-transfer-transformer), [cLang-8](https://github.com/google-research-datasets/clang8) |
| Tagged corruptions - ensemble (Stahlberg and Kumar, 2021)| 68.3 | [Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models](https://www.aclweb.org/anthology/2021.bea-1.4.pdf)| [Official](https://github.com/google-research-datasets/C4_200M-synthetic-dataset-for-grammatical-error-correction) |
| Sequence tagging + token-level transformations + two-stage fine-tuning, DeBERTa + ELECTRA + RoBERTa ensemble (Mesham et al., EACL 2023) | 67.93 | [An Extended Sequence Tagging Vocabulary for Grammatical Error Correction](https://aclanthology.org/2023.findings-eacl.119.pdf) | [Official](https://github.com/StuartMesham/gector_experiment_public) |
| TMTC (Lai et al., ACL Findings 2022) | 67.02 | [Type-Driven Multi-Turn Corrections for Grammatical Error Correction](https://aclanthology.org/2022.findings-acl.254) | [official](https://github.com/DeepLearnXMU/TMTC) |
| Sequence tagging + token-level transformations + two-stage fine-tuning + (BERT, RoBERTa, XLNet), ensemble (Omelianchuk et al., BEA 2020) | 66.5 | [GECToR – Grammatical Error Correction: Tag, Not Rewrite](https://arxiv.org/pdf/2005.12592.pdf) | [Official](https://github.com/grammarly/gector) |
| Shallow Aggressive Decoding with BART (12+2), single model (beam=1) (Sun et al., ACL 2021) | 66.4 | [Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding](https://aclanthology.org/2021.acl-long.462.pdf) | [Official](https://github.com/AutoTemp/Shallow-Aggressive-Decoding) |
| Sequence tagging + token-level transformations + two-stage fine-tuning, DeBERTa (Mesham et al., EACL 2023) | 66.06 | [An Extended Sequence Tagging Vocabulary for Grammatical Error Correction](https://aclanthology.org/2023.findings-eacl.119.pdf) | [Official](https://github.com/StuartMesham/gector_experiment_public) |
| DeBERTa(L) + RoBERTa(L) + XLNet (Tarnavskyi et al., ACL 2022) | 65.3 | [Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction](https://aclanthology.org/2022.acl-long.266) | [Official](https://github.com/MaksTarnavskyi/gector-large) |
| Sequence tagging + token-level transformations + two-stage fine-tuning + XLNet, single model (Omelianchuk et al., BEA 2020) | 65.3 | [GECToR – Grammatical Error Correction: Tag, Not Rewrite](https://arxiv.org/pdf/2005.12592.pdf) | [Official](https://github.com/grammarly/gector) |
| Transformer + Pre-train with Pseudo Data + BERT (Kaneko et al., ACL 2020) | 65.2 | [Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction](https://arxiv.org/pdf/2005.00987.pdf) | [Official](https://github.com/kanekomasahiro/bert-gec) |
@@ -54,6 +57,7 @@ _**Restricted**_: uses only publicly available datasets. _**Unrestricted**_: use

| Model | F0.5 | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
| GRECO (Qorib and Ng, EMNLP 2023) | 85.21 | [System Combination via Quality Estimation for Grammatical Error Correction](https://aclanthology.org/2023.emnlp-main.785/) | [official](https://github.com/nusnlp/greco) |
| SMT + BiGRU (Grundkiewicz and Junczys-Dowmunt, 2018) | 72.04 | [Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation](http://aclweb.org/anthology/N18-2046)| NA |
| CNN Seq2Seq (Chollampatt and Ng, 2018)| 70.14 (measured by Ge et al., 2018) | [ A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17308/16137)| [Official](https://github.com/nusnlp/mlconvgec2018) |

@@ -131,14 +135,18 @@ Since current state-of-the-art systems rely on as much annotated learner data as

| Model | F0.5 | Paper / Source | Code |
| ------------- | :-----:| --- | :-----: |
| GRECO (Qorib and Ng, EMNLP 2023) | 80.84 | [System Combination via Quality Estimation for Grammatical Error Correction](https://aclanthology.org/2023.emnlp-main.785) | [official](https://github.com/nusnlp/greco) |
| ESC (Qorib et al., NAACL 2022) | 79.90| [Frustratingly Easy System Combination for Grammatical Error Correction](https://aclanthology.org/2022.naacl-main.143/) | [official](https://github.com/nusnlp/esc) |
| TMTC (Lai et al., ACL Findings 2022) | 77.93 | [Type-Driven Multi-Turn Corrections for Grammatical Error Correction](https://aclanthology.org/2022.findings-acl.254) | [official](https://github.com/DeepLearnXMU/TMTC) |
| RoBERTa(L) + EditScorer (Sorokin, EMNLP 2022) | 77.1 | [Improved grammatical error correction by ranking elementary edits](https://aclanthology.org/2022.emnlp-main.785) | [official](https://github.com/AlexeySorokin/EditScorer) |
| Sequence tagging + token-level transformations + two-stage fine-tuning, DeBERTa + ELECTRA + RoBERTa ensemble (Mesham et al., EACL 2023) | 76.17 | [An Extended Sequence Tagging Vocabulary for Grammatical Error Correction](https://aclanthology.org/2023.findings-eacl.119.pdf) | [Official](https://github.com/StuartMesham/gector_experiment_public) |
| DeBERTa(L) + RoBERTa(L) + XLNet (Tarnavskyi et al., ACL 2022) | 76.05 | [Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction](https://aclanthology.org/2022.acl-long.266) | [Official](https://github.com/MaksTarnavskyi/gector-large) |
| GECToR large without synthetic pre-training - ensemble (Tarnavskyi and Omelianchuk, 2021) | 76.05 | [Improving Sequence Tagging for Grammatical Error Correction](https://er.ucu.edu.ua/handle/1/2707) | [Official](https://github.com/MaksTarnavskyi/gector-large) |
| T5 ([t5.1.1.xxl](https://github.com/google-research/text-to-text-transfer-transformer/blob/master/released_checkpoints.md)) trained on [cLang-8](https://github.com/google-research-datasets/clang8) (Rothe et al., ACL-IJCNLP 2021) | 75.88 | [A Simple Recipe for Multilingual Grammatical Error Correction](https://arxiv.org/pdf/2106.03830.pdf) | [T5](https://github.com/google-research/text-to-text-transfer-transformer), [cLang-8](https://github.com/google-research-datasets/clang8) |
| Tagged corruptions - ensemble (Stahlberg and Kumar, 2021)| 74.9 | [Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models](https://www.aclweb.org/anthology/2021.bea-1.4.pdf)| [Official](https://github.com/google-research-datasets/C4_200M-synthetic-dataset-for-grammatical-error-correction) |
| Sequence tagging + token-level transformations + two-stage fine-tuning + (BERT, RoBERTa, XLNet), ensemble (Omelianchuk et al., BEA 2020) | 73.6 | [GECToR – Grammatical Error Correction: Tag, Not Rewrite](https://arxiv.org/pdf/2005.12592.pdf) | [Official](https://github.com/grammarly/gector) |
| BEA Combination | 73.18 | [Learning to Combine Grammatical Error Corrections ](https://www.aclweb.org/anthology/W19-4414/) | [official](https://github.com/IBM/learning-to-combine-grammatical-error-corrections) |
| Sequence tagging + token-level transformations + two-stage fine-tuning, DeBERTa (Mesham et al., EACL 2023) | 73.09 | [An Extended Sequence Tagging Vocabulary for Grammatical Error Correction](https://aclanthology.org/2023.findings-eacl.119.pdf) | [Official](https://github.com/StuartMesham/gector_experiment_public) |
| Shallow Aggressive Decoding with BART (12+2), single model (beam=1) (Sun et al., ACL 2021) | 72.9 | [Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding](https://aclanthology.org/2021.acl-long.462.pdf) | [Official](https://github.com/AutoTemp/Shallow-Aggressive-Decoding) |
| Sequence tagging + token-level transformations + two-stage fine-tuning + XLNet, single model (Omelianchuk et al., BEA 2020) | 72.4 | [GECToR – Grammatical Error Correction: Tag, Not Rewrite](https://arxiv.org/pdf/2005.12592.pdf) | [Official](https://github.com/grammarly/gector) |
| Transformer + Pre-train with Pseudo Data (Kiyono et al., EMNLP 2019) | 70.2 | [An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction](https://arxiv.org/abs/1909.00502) | NA |