Bootstrap

The text files in this repository, which are read by mill, can be bootstrapped from the WordNet RDF distributed at this link (PWN 3.0), or from RDF data generated from PWN’s database files using this tool.

You will need:

Python 3 (+ libraries, see requirements file)
(optional) Common Lisp, if you would like to generate the legacy RDF from the WNDB files yourself. This option is not documented (yet).

The input RDF mentioned above has a couple of outstanding issues that prevent a clean conversion to mill’s format:

mill assumes that every wordsense has a unique identifier composed of its language/WN, its lexicographer file, its lexical form, and a lexical identifier. This does not hold for adjective satellites in PWN, because they all share the same lexicographer file and the same lexical identifier of 0 (see related issue and its solution). The assumption allows simpler sense identifiers than the legacy sense keys, which have special cases for adjective satellites.
This distribution has a few wrong URIs that end up pointing to nonexistent nodes (see related issue and its solution).
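
To make the first issue concrete, here is a minimal sketch of how identifier collisions could be detected. It is not taken from the bootstrap script: the `SenseId` type, the `find_collisions` helper, and the sample records are all hypothetical and only illustrate the four-part identifier described above.

```python
from collections import defaultdict
from typing import NamedTuple


class SenseId(NamedTuple):
    """Hypothetical four-part sense identifier, as described in the text."""
    wordnet: str       # language/WN
    lexfile: str       # lexicographer file
    lexical_form: str
    lexical_id: int


def find_collisions(senses):
    """Return the identifiers that are shared by more than one synset.

    `senses` is an iterable of (SenseId, synset_name) pairs.
    """
    synsets_by_id = defaultdict(set)
    for sense_id, synset in senses:
        synsets_by_id[sense_id].add(synset)
    return {sid: syns for sid, syns in synsets_by_id.items() if len(syns) > 1}


# Made-up records: two senses of the same form that share a lexicographer
# file and lexical identifier 0, so their identifiers are not unique.
senses = [
    (SenseId("en", "adj.all", "hot", 0), "synset-A"),
    (SenseId("en", "adj.all", "hot", 0), "synset-B"),
    (SenseId("en", "noun.animal", "dog", 0), "synset-C"),
]

collisions = find_collisions(senses)
for sense_id, synsets in collisions.items():
    print(sense_id, sorted(synsets))
```

Any identifier reported by such a check would need a distinct lexical identifier before the uniqueness assumption holds.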

To generate an initial version of the text files read by mill, use the script in mill’s repository. It fixes the issues above and then performs the conversion. Run

python python/bootstrap-legacy-rdf.py --help

from the root of mill’s repository for help on how to run it.
