Files
README.md
run_bert.py
run_t5.py
tfidf_lgbm.py
tfidf_logistic_regression.py

README.md

Text Classification

This folder contains examples of text classification models.

What is Text Classification?

Text classification is a supervised learning task: predicting the category (class) of a document from its text content. State-of-the-art methods are based on neural networks of various architectures, as well as on pre-trained language models or word embeddings.

https://github.com/microsoft/nlp-recipes/blob/master/examples/text_classification/README.md
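As a rough illustration of the definition above, the sketch below trains a tiny classifier on labeled texts and predicts the class of unseen ones. It assumes scikit-learn is installed; the toy English corpus and the label names are hypothetical and are not the dataset or code used in this folder. For Japanese text, a tokenizer or a character n-gram analyzer would likely be needed as well.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus (hypothetical data, not the dataset used in this folder).
texts = [
    "the team won the match in the final minute",
    "the striker scored two goals in the game",
    "the coach praised the players after the match",
    "the new phone has a faster processor",
    "the laptop ships with a larger battery",
    "the software update improves the phone camera",
]
labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

# TF-IDF turns each document into a weighted bag-of-words vector;
# logistic regression then learns a linear decision boundary on top.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)

# Predict the class of documents the model has not seen.
new_texts = [
    "the players scored in the match",
    "the phone has a new processor",
]
predictions = model.predict(new_texts)
print(list(predictions))
```

Wrapping the vectorizer and the classifier in one pipeline keeps the vocabulary fitted on the training texts only, which avoids leaking information from evaluation data.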

Summary

Notebook	Environment	Description	ACC
TF-IDF & Logistic Regression	Local	Logistic Regression with TF-IDF vectors	0.9308
TF-IDF & LightGBM	Local	LightGBM with TF-IDF vectors	0.9512
BERT 'cl-tohoku/bert-base-japanese-v2'	Local	Transformers BERT	0.9362
BERT 'cl-tohoku/bert-base-japanese-char-v2'	Local	Transformers BERT	0.9274
BERT 'cl-tohoku/bert-base-large'	Local	Transformers BERT	-
T5	Local	T5 for Japanese	0.9566

Accuracy scores (ACC) were computed by running the code on fold 0 only, with the dataset divided into train/val/test sets at a ratio of 0.6/0.2/0.2. Note that the scores are highly sensitive to how the dataset is split and to hyperparameters such as the number of epochs.
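The split and the metric described above can be sketched with the standard library alone. This is a minimal illustration, not the code used by the scripts in this folder: `split_dataset` and `accuracy` are hypothetical helper names, and the `seed` argument merely stands in for the choice of fold, since how the folds are constructed is not stated here.

```python
import random


def split_dataset(items, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle items and split them into train/val/test lists by ratio."""
    shuffled = list(items)
    # A fixed seed makes the split reproducible; changing it changes the
    # split, which is one reason reported scores can vary.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 60 20 20
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```

Hyperparameters are typically tuned on the validation set, and the test set is used only once for the final score.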
