meedan/textsimilarity


Semantic text similarity on short text segments

Summary

Building on the SemEval Semantic Text Similarity (STS) datasets, this repository evaluates computationally efficient approaches for identifying short text segments that have nearly the same meaning in large-scale datasets.
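
As a rough illustration of this kind of evaluation, the sketch below scores sentence pairs with a simple bag-of-words cosine similarity and correlates those scores with human ratings. It is not one of the approaches implemented in this repository. The function names are hypothetical, and the three pairs and their 0-5 ratings are invented toy data, not taken from SemEval.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two short texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    # An empty segment has no direction, so treat its similarity as 0.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pearson(xs, ys) -> float:
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Invented toy pairs with made-up ratings on a 0-5 scale (NOT SemEval data).
toy_pairs = [
    ("a man is playing a guitar", "a man plays a guitar", 4.5),
    ("a man is playing a guitar", "a man is playing a flute", 2.5),
    ("a man is playing a guitar", "the stock market fell today", 0.0),
]

predicted = [cosine_similarity(s1, s2) for s1, s2, _ in toy_pairs]
gold = [score for _, _, score in toy_pairs]
print("Pearson correlation with toy ratings:", pearson(predicted, gold))
```

A word-overlap baseline like this is cheap to compute but cannot recognize paraphrases that share few words, which is the gap that the semantic approaches are meant to close.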

Getting started

The main entry point to the code in this repository is test_textsim.py. That file contains fuller comments describing the approaches being evaluated.

There are several similar files that build upon test_textsim.py.

density_plots.R plots the results, and the mwe* files are minimal working examples for some of the approaches.

Contact

Further information is available from Scott Hale. Meedan team members can contact Scott via Slack; others can reach him through comments or issues on this repository or by direct message on Twitter.

About

Matching short text segments
