GitHub - luang008/dcca_naacl_2015: DCCA code for NAACL 2015 paper

Code for the NAACL 2015 paper: Deep Multilingual Correlation for Improved Word Embeddings

Author: Ang Lu

Based on the work of Manaal Faruqui and Weiran Wang

###Requirements: Python 2.7

MATLAB, callable from the shell

###Data: The original and aligned word embeddings are stored in .mat format; the header of the English embeddings is in head.txt. The original embeddings can be found on Manaal's homepage: http://www.cs.cmu.edu/~mfaruqui .

Before applying DCCA, please follow the instructions at https://github.com/mfaruqui/crosslingual-cca to align the two original embeddings and extract the subset used to train the model. Then save them in MATLAB format using makemat.m. The alignment file is en-de, which may differ from the original alignment file.
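The alignment step amounts to keeping only the word pairs that appear in both vocabularies and stacking their vectors in matching row order, so that row i of one matrix corresponds to row i of the other. A minimal Python sketch of that idea follows. It assumes the embeddings are already loaded into dictionaries and the alignment is a list of (English, German) pairs. The function name `aligned_subset` and the toy data are hypothetical; this is not the code from this repository or from crosslingual-cca.

```python
def aligned_subset(pairs, emb_en, emb_de):
    """Return row-aligned vector lists for word pairs found in both vocabularies.

    pairs   -- iterable of (english_word, german_word) tuples (assumed format)
    emb_en  -- dict mapping English words to vectors (lists of floats)
    emb_de  -- dict mapping German words to vectors (lists of floats)
    """
    rows_en, rows_de, kept = [], [], []
    for en_word, de_word in pairs:
        # Skip pairs where either word has no embedding.
        if en_word in emb_en and de_word in emb_de:
            rows_en.append(emb_en[en_word])
            rows_de.append(emb_de[de_word])
            kept.append((en_word, de_word))
    return rows_en, rows_de, kept


# Toy data for illustration only (hypothetical values).
emb_en = {"house": [0.1, 0.2], "dog": [0.3, 0.4]}
emb_de = {"haus": [0.5, 0.6], "hund": [0.7, 0.8]}
pairs = [("house", "haus"), ("cat", "katze"), ("dog", "hund")]

rows_en, rows_de, kept = aligned_subset(pairs, emb_en, emb_de)
```

The two row-aligned lists correspond to the paired matrices that would then be saved in MATLAB format for training.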

###Usage:

sh alltask.sh [128] [128] 0.0001 0.0001 3000 0.0001 0.99

###Reference:

@inproceedings{lu2015deep,
  title={Deep multilingual correlation for improved word embeddings},
  author={Lu, Ang and Wang, Weiran and Bansal, Mohit and Gimpel, Kevin and Livescu, Karen}
}

###Files:

- deepnet
- DCCA.txt
- DCCA_SGD.m
- DCCA_convgrad.m
- DCCA_corr.m
- EN-WS-353-ALL.txt
- EN-WS-353-REL.txt
- EN-WS-353-SIM.txt
- README.md
- SIMLEX999.txt
- alltask.m
- alltask.sh
- deep_project_vectors.m
- demo_SGD.m
- en-de
- filterVocab.py
- fullVocab.txt
- head.txt
- linCCA.m
- makemat.m
- naacl2015_code.tar
- normr.m
- paste.py
- ranking.py
- testDCCA.m
- wordsim_simlex999.py
- wordsim_ws353.py
