`iapp_thaiqa_xquad` dataset

@cstorm125 cstorm125 released this 09 Jun 08:02
· 5 commits to master since this release
974fb53

Combines the iapp_wiki_qa_squad, thaiqa_squad and xquad training sets; the validation and test sets are taken from iapp_wiki_qa_squad. Training contexts that are similar (mUSE cosine similarity > 0.8) to validation or test contexts are removed from the training sets.
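
The similarity filter can be sketched roughly as follows. This is not the script used to build the release. It assumes each context has already been embedded (for example with mUSE) into a plain list of floats, and that training contexts are compared against the validation and test contexts. The names `cosine_similarity` and `filter_similar_contexts` are hypothetical, chosen for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_similar_contexts(
    train_vecs: Sequence[Sequence[float]],
    heldout_vecs: Sequence[Sequence[float]],
    threshold: float = 0.8,  # threshold stated in the release note
) -> List[int]:
    """Return indices of training contexts whose highest similarity to any
    held-out (validation/test) context does not exceed the threshold."""
    keep = []
    for i, train_vec in enumerate(train_vecs):
        max_sim = max(
            (cosine_similarity(train_vec, h) for h in heldout_vecs),
            default=0.0,
        )
        if max_sim <= threshold:
            keep.append(i)
    return keep


# Toy 2-d vectors standing in for real context embeddings (assumption).
train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
heldout = [[1.0, 0.1]]
kept_indices = filter_similar_contexts(train, heldout)
```

In this toy case only the first training vector points in nearly the same direction as the held-out vector, so it is the only one dropped. For datasets of the size listed below, a vectorized matrix product would be a more practical choice than this pairwise loop.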

DatasetDict({
    train: Dataset({
        features: ['question_id', 'article_id', 'title', 'context', 'question', 'answers'],
        num_rows: 10916
    })
    validation: Dataset({
        features: ['question_id', 'article_id', 'title', 'context', 'question', 'answers'],
        num_rows: 742
    })
    test: Dataset({
        features: ['question_id', 'article_id', 'title', 'context', 'question', 'answers'],
        num_rows: 739
    })
})