This project provides the Java SDK and components for the Evolution of Language and Information Technology (ELIT) platform. It is released under the Apache 2 license and is currently led by the Emory NLP research group.
- Latest release: 0.0.2 (release notes).
- Javadoc.
Add the following dependency to the pom.xml in your Maven project:
<dependency>
    <groupId>cloud.elit</groupId>
    <artifactId>elit-sdk</artifactId>
    <version>0.0.2</version>
</dependency>
The following code makes an HTTP request to retrieve NLP output for the input string from all components in spaCy.
- Replace Fields.ALL with Fields.TOK, Fields.LEM, Fields.POS, Fields.NER, or Fields.DEP if you wish to run the NLP pipeline only up to tokenization, lemmatization, part-of-speech tagging, named entity recognition, or dependency parsing, respectively.
- Replace Tools.SPACY with Tools.ELIT or Tools.NLP4J if you want to use components provided by ELIT or NLP4J instead.
import cloud.elit.sdk.api.Client;
import cloud.elit.sdk.api.TaskRequest;
import Document;
import Tools;
import Fields;

public class DecodeWebAPITest {
    public static void main(String[] args) {
        Client api = new Client();
        String input = "Hello World! Welcome to ELIT.";

        // Request every field, using the spaCy components.
        TaskRequest r = new TaskRequest(input, Fields.ALL, Tools.SPACY);

        // Send the request; the response is the raw JSON string.
        String output = api.decode(r);
        System.out.println(output);
    }
}
The web API then returns the NLP output in JSON as follows:
{
    "output": [
        {
            "sid": 0,
            "tok": ["Hello", "World", "!"],
            "lem": ["hello", "world", "!"],
            "pos": ["UH", "NN", "."],
            "ner": [],
            "dep": [[1, "intj"], [-1, "ROOT"], [1, "punct"]]
        },
        {
            "sid": 1,
            "tok": ["Welcome", "to", "ELIT", "."],
            "lem": ["welcome", "to", "elit", "."],
            "pos": ["VBP", "IN", "NN", "."],
            "ner": [[2, 3, "ORG"]],
            "dep": [[-1, "ROOT"], [2, "aux"], [0, "xcomp"], [0, "punct"]]
        }
    ],
    "pipeline": {
        "dep": "spacy",
        "lem": "spacy",
        "ner": "spacy",
        "pos": "spacy",
        "tok": "spacy"
    }
}
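As a reading aid for the JSON above, here is a minimal standalone sketch of how the "dep" and "ner" arrays of one sentence can be interpreted. It does not use the SDK: the class SentenceFields, its methods, and the "<root>" placeholder are hypothetical names chosen for illustration. The reading of "dep" as pairs of head index (with -1 for the root) and label matches the sample data; treating the end index of an "ner" entry as exclusive is an assumption based on the single example shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Standalone sketch: SentenceFields is a hypothetical helper, not part of the ELIT SDK.
class SentenceFields {
    // Placeholder printed for the head of the root token (illustrative only).
    static final String ROOT = "<root>";

    // Each "dep" entry pairs a head index (-1 for the root) with a dependency label.
    static List<String> arcs(String[] tok, int[] heads, String[] labels) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tok.length; i++) {
            String head = heads[i] < 0 ? ROOT : tok[heads[i]];
            out.add(String.format("%s(%s, %s)", labels[i], tok[i], head));
        }
        return out;
    }

    // Assumption: an "ner" entry [begin, end, label] covers tokens begin..end-1.
    static String entity(String[] tok, int begin, int end, String label) {
        return label + ": " + String.join(" ", Arrays.copyOfRange(tok, begin, end));
    }

    public static void main(String[] args) {
        // Values copied from the second sentence (sid 1) of the sample output.
        String[] tok = {"Welcome", "to", "ELIT", "."};
        int[] heads = {-1, 2, 0, 0};
        String[] labels = {"ROOT", "aux", "xcomp", "punct"};

        arcs(tok, heads, labels).forEach(System.out::println);
        System.out.println(entity(tok, 2, 3, "ORG"));
    }
}
```

Under the exclusive-end assumption, the entry [2, 3, "ORG"] selects the single token "ELIT".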
Our SDK provides a convenient wrapper class to read the JSON output and convert it into a structure (see the Javadoc for more details).
import cloud.elit.sdk.api.Client;
import cloud.elit.sdk.api.TaskRequest;
import Document;
import Tools;
import Fields;
import Sentence;
import NLPNode;

public class DecodeWebAPITest {
    public static void main(String[] args) {
        Client api = new Client();
        String input = "Hello World! Welcome to ELIT.";
        TaskRequest r = new TaskRequest(input, Fields.ALL, Tools.SPACY);
        String output = api.decode(r);

        // Parse the JSON response into sentences and nodes.
        Document doc = new Document(output);

        for (Sentence sen : doc) {
            // Print each dependency arc as label(token, head token).
            for (NLPNode node : sen)
                System.out.println(String.format("%s(%s, %s)",
                        node.getDependencyLabel(),
                        node.getToken(),
                        node.getParent().getToken()));

            // Blank line between sentences.
            System.out.println();
        }
    }
}
The above code generates the following output:
intj(Hello, World)
ROOT(World, @#r$%)
punct(!, World)

ROOT(Welcome, @#r$%)
aux(to, ELIT)
xcomp(ELIT, Welcome)
punct(., Welcome)
Our SDK also allows you to create an NLP pipeline consisting of multiple tools. The following code makes a request specifying ELIT for tokenization, NLP4J for part-of-speech tagging and spaCy for dependency parsing.
TaskRequest r = new TaskRequest(input, Fields.DEP, Tools.SPACY);
r.setDependencies(
        new TaskDependency(Fields.TOK, Tools.ELIT),
        new TaskDependency(Fields.POS, Tools.NLP4J));
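To make the idea of a mixed pipeline concrete, the following standalone sketch models which tool would handle each field for such a request. It is not SDK code: PipelinePlan and resolve are hypothetical names, and the lowercase tool names "elit" and "nlp4j" are assumed by analogy with the "spacy" value in the sample JSON. The source does not state how fields without an explicit dependency (here lemmatization and named entity recognition) are assigned, so the sketch assumes they fall back to the tool given in the TaskRequest.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone sketch: PipelinePlan is a hypothetical helper, not part of the ELIT SDK.
class PipelinePlan {
    // Pipeline order as described for the Fields constants: tokenization through dependency parsing.
    static final String[] ORDER = {"tok", "lem", "pos", "ner", "dep"};

    // Assigns a tool to every field up to and including the target field.
    // Assumption: fields without an override use the default tool.
    static Map<String, String> resolve(String target, String defaultTool,
                                       Map<String, String> overrides) {
        Map<String, String> plan = new LinkedHashMap<>();
        for (String field : ORDER) {
            plan.put(field, overrides.getOrDefault(field, defaultTool));
            if (field.equals(target)) {
                return plan;
            }
        }
        throw new IllegalArgumentException("Unknown field: " + target);
    }

    public static void main(String[] args) {
        // Mirrors the request above: ELIT for tokenization, NLP4J for POS tagging, spaCy otherwise.
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("tok", "elit");
        overrides.put("pos", "nlp4j");

        System.out.println(resolve("dep", "spacy", overrides));
    }
}
```

Under these assumptions the program prints {tok=elit, lem=spacy, pos=nlp4j, ner=spacy, dep=spacy}. The "pipeline" object returned by the web API is the authoritative record of which tool actually ran for each field.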