# Embeddings

This folder contains examples for getting pretrained embedding vectors.

## What is Word Embedding?

Word embedding is a technique for mapping words or phrases from a vocabulary to vectors of real numbers. The learned vector representations capture syntactic and semantic relationships between words and are therefore useful for tasks such as sentence similarity and text classification.

Reference: https://github.com/microsoft/nlp-recipes/blob/master/examples/embeddings/README.md
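The usual way to compare two embedding vectors is cosine similarity. A minimal sketch with hand-made toy vectors (the vectors and words below are illustrative, not taken from any real pretrained model):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0 = orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors; real pretrained embeddings are typically 100-300 dimensions.
vec_cat = [0.9, 0.8, 0.1]
vec_dog = [0.8, 0.9, 0.2]
vec_car = [0.1, 0.2, 0.9]

print(cosine_similarity(vec_cat, vec_dog))  # high: related words
print(cosine_similarity(vec_cat, vec_car))  # lower: unrelated words
```

With good pretrained vectors, semantically related words score close to 1.0 while unrelated words score much lower, which is exactly what makes these vectors useful for similarity and classification tasks.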

## Japanese pretrained models

The survey article "学習済み日本語word2vecとその評価について" ("Pretrained Japanese word2vec models and their evaluation") introduces many available Japanese pretrained embedding models and evaluates them.

## Summary

| Notebook | Environment | Description |
| --- | --- | --- |
| Word2vec | Local | Get word2vec vectors pretrained on Japanese Wikipedia |
| fastText | Local | Get fastText vectors pretrained on Japanese Common Crawl |
| Download Pre-trained Embeddings | Local | Download pre-trained embeddings with chakin |
| Universal Sentence Encoder | Local | Get the Universal Sentence Encoder |
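The word2vec and fastText downloads above are commonly distributed in the word2vec text format: a header line `<vocab_size> <dim>` followed by one line per word containing the token and its vector components. A minimal stdlib-only sketch of parsing that format (the two-word in-memory sample stands in for a real pretrained file; in practice you would more likely use a library such as gensim):

```python
import io

def load_word2vec_text(fileobj):
    """Parse the word2vec text format: a "<vocab_size> <dim>" header line,
    then one line per word: the token followed by its vector components."""
    vocab_size, dim = map(int, fileobj.readline().split())
    vectors = {}
    for line in fileobj:
        parts = line.rstrip().split(" ")
        word, values = parts[0], [float(x) for x in parts[1:]]
        assert len(values) == dim, f"bad vector length for {word!r}"
        vectors[word] = values
    assert len(vectors) == vocab_size, "header vocab size does not match file"
    return vectors

# Tiny in-memory sample standing in for a real pretrained file
# (e.g. vectors trained on Japanese Wikipedia; tokens and values are illustrative).
sample = io.StringIO("2 3\n猫 0.9 0.8 0.1\n犬 0.8 0.9 0.2\n")
vectors = load_word2vec_text(sample)
print(vectors["猫"])  # [0.9, 0.8, 0.1]
```

Real files are large (hundreds of thousands of words, 100-300 dimensions), so libraries usually memory-map or stream them rather than building a plain dict as done here.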