diff --git a/docs/tasks/treebank.md b/docs/tasks/treebank.md
index a57b744..4f99a08 100644
--- a/docs/tasks/treebank.md
+++ b/docs/tasks/treebank.md
@@ -6,4 +6,6 @@
 | ----------------------------- | ------------------------------------------------------------ | ---------------------------------- | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
 | UD Thai PUD | This is a part of the Parallel Universal Dependencies (PUD) treebanks created for the CoNLL 2017 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. | 1,000 sentences | CC BY-SA 3.0 | Universal Dependencies | [GitHub](https://github.com/UniversalDependencies/UD_Thai-PUD) |
 | Thai Treebanks Dataset (thtb) | To enable research oppotunities with very few Thai Computational Linguitic resources, we willingly introduce fundamental high-level language resouces built with passion, Thai Treebanks, build from scratch for researchers and enthusiasts. | 5,200 sentences | CC BY 4.0 | Pechlada Seenual, Thodsaporn Chay-intr and Thanaruk Theeramunkong | [GitHub](https://github.com/tchayintr/thtb) |
-| Blackboard Treebank | Blackboard Treebank is a Thai dependency corpus based on the LST20 Annotation Guideline. It features dependency structures, constituency structures, word boundaries, named entities, clause boundaries, and sentence boundaries. | 122,851 clauses (38,558 sentences) | CC BY 3.0 | Prachya Boonkwan, NECTEC | [bitbucket](https://bitbucket.org/kaamanita/blackboard-treebank/) |
\ No newline at end of file
+| Blackboard Treebank | Blackboard Treebank is a Thai dependency corpus based on the LST20 Annotation Guideline. It features dependency structures, constituency structures, word boundaries, named entities, clause boundaries, and sentence boundaries. | 122,851 clauses (38,558 sentences) | CC BY 3.0 | Prachya Boonkwan, NECTEC | [bitbucket](https://bitbucket.org/kaamanita/blackboard-treebank/) |
+| Thai Universal Dependency Treebank (TUD) | Thai Universal Dependency Treebank, consisting of 3,627 trees annotated in accordance with the Universal Dependencies (UD) framework. | 3,627 trees | | Chulalongkorn University | [GitHub](https://github.com/nlp-chula/TUD) |
+| Thai Discourse Treebank | Thai Discourse Treebank is the first and largest Thai corpus annotated with explicit discourse relations in the style of the English Penn Discourse Treebank 3 scheme. The final corpus consists of 10,602 sentences from 384 documents, 180 of which have complete annotation of discourse connectives and its two argument spans. | | | Ponrawee Prasertsom, Apiwat Jaroonpol, Attapol T. Rutherford | [GitHub](https://github.com/nlp-chula/thai-discourse-treebank/tree/main/data/th-tdtb) |