
Containerized pipeline used for annotation of Slovenian text corpora, composed for the MARCELL sustainability project.


pzajec/marcell-annotation-pipeline

 
 


Deploying

How do I get the model files?

Run get-models.sh, or download them manually from clarin.si.

Docker

Building

Build the Docker image:

$ ./build.sh

Deploying

Run with docker-compose:

$ docker-compose up -d
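Because `docker-compose up -d` returns as soon as the containers are started, a script that calls the API right afterwards may want to wait until the service answers. The following is a minimal sketch of such a check, not part of the repository: the function name `wait_for_api`, the retry count, and the delay are illustrative assumptions, and the default URL is the root path on port 80, where this README says the simple UI is served.

```shell
# Hypothetical helper: poll a URL until it responds successfully or we give up.
# Usage: wait_for_api [URL] [RETRIES] [DELAY_SECONDS]
wait_for_api() {
  local url="${1:-http://localhost:80/}"
  local retries="${2:-30}"
  local delay="${3:-2}"
  local i=0

  while [ "$i" -lt "$retries" ]; do
    # -f: treat HTTP errors (4xx/5xx) as failures; -sS: quiet, but show errors.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done

  echo "service at $url did not respond after $retries attempts" >&2
  return 1
}
```

For example, `docker-compose up -d && wait_for_api` would block until the UI page can be fetched, or fail after roughly a minute with the default settings.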

Usage - Anonymization API

The Anonymization API listens for HTTP POST requests on port 80 by default, under the /anonymize path. Send the XML file as the request body (with cURL, via the --data-binary option) and set the content type to application/octet-stream.

You can test the API with cURL:

$ curl -X POST --data-binary @<path_to_xml> -H "Content-Type: application/octet-stream" http://localhost:80/anonymize 
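For scripted use, the same request can be wrapped in a small shell function that saves the response to a file and checks the HTTP status. This is a sketch under assumptions the README does not confirm: that a successful request returns status 200 and that the response body is the anonymized document. The function name `anonymize_file` and the file names in the usage example are hypothetical.

```shell
# Hypothetical wrapper around the cURL call shown above.
# Usage: anonymize_file INPUT_XML OUTPUT_FILE [URL]
anonymize_file() {
  local input="$1"
  local output="$2"
  local url="${3:-http://localhost:80/anonymize}"
  local status

  if [ ! -f "$input" ]; then
    echo "input file not found: $input" >&2
    return 1
  fi

  # -o writes the response body to the output file;
  # -w prints only the HTTP status code on stdout, which we capture.
  status=$(curl -sS -X POST \
    --data-binary "@${input}" \
    -H "Content-Type: application/octet-stream" \
    -o "$output" \
    -w '%{http_code}' \
    "$url") || return 1

  # Assumption: the API signals success with HTTP 200.
  if [ "$status" != "200" ]; then
    echo "request failed with HTTP status $status" >&2
    return 1
  fi
}
```

A call such as `anonymize_file input.xml anonymized.xml` would then write the response to anonymized.xml and return a non-zero exit code if the request fails.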

Usage - Simple UI

A simple anonymization interface is available in a web browser at http://localhost:80/

Automatically start on system boot

To start the container automatically on system boot, first run the container, then execute the following as root (or with sudo):

$ cp docker-anonymization.service /etc/systemd/system/
$ systemctl enable docker-anonymization.service
$ systemctl start docker-anonymization

To start or stop the container manually, run:

$ systemctl start docker-anonymization
$ systemctl stop docker-anonymization
