[Text Example] Add Exchange Oracle #1350

Open
wants to merge 55 commits into
base: develop
5288ebb
create project skeleton.
simulacrum6 Oct 24, 2023
4489801
implement manifest download.
simulacrum6 Oct 25, 2023
c3de7b7
add capabilities to launch doccano project.
simulacrum6 Oct 25, 2023
60c54fc
add test data. fix project creation.
simulacrum6 Oct 26, 2023
98b8df7
move project creation to annotation.
simulacrum6 Oct 30, 2023
8a306a8
add database definition for job requests.
simulacrum6 Oct 30, 2023
dc2af69
add job request registration to api.
simulacrum6 Oct 30, 2023
85c2d23
add scheduled cron jobs for processing.
simulacrum6 Oct 30, 2023
be7c0a8
add recurring job.
simulacrum6 Oct 31, 2023
05101fa
update hmt-basemodels version to 0.2.6.
simulacrum6 Oct 31, 2023
1195933
set up multiple projects per job.
simulacrum6 Nov 1, 2023
f525fb7
add user creation and project registration.
simulacrum6 Nov 2, 2023
deef4e2
refactor process_pending_applications
simulacrum6 Nov 2, 2023
a42b802
split registration of user and application for job.
simulacrum6 Nov 2, 2023
6912a68
add processing function for in-progress job requests.
simulacrum6 Nov 28, 2023
0430951
add processing function for completed job requests.
simulacrum6 Nov 28, 2023
db2fada
add functions to upload job results and notify the recording oracle.
simulacrum6 Nov 28, 2023
2d03254
add cron jobs.
simulacrum6 Nov 28, 2023
07de8b2
add validations for api methods.
simulacrum6 Dec 4, 2023
f5680ac
add tests for api methods. (WIP)
simulacrum6 Dec 4, 2023
b6dd20f
add additional tests for api methods. (WIP)
simulacrum6 Dec 5, 2023
90a0006
finalise api tests.
simulacrum6 Dec 5, 2023
4043704
refactor test.
simulacrum6 Dec 5, 2023
c987b34
add tests for cron jobs.
simulacrum6 Dec 6, 2023
6a4cbf9
add mapping so users can be identified and paid.
simulacrum6 Dec 6, 2023
7eaa61c
align api test wording. add additional checks.
simulacrum6 Dec 6, 2023
cd9be37
add dockerfiles for tests.
simulacrum6 Dec 8, 2023
6c42b7c
make add necessary data and uploads to make test standalone.
simulacrum6 Dec 11, 2023
707f2ba
foolproof file extraction for tests.
simulacrum6 Dec 11, 2023
af494a3
reduce sleep time.
simulacrum6 Dec 11, 2023
e79fce1
add logging.
simulacrum6 Dec 11, 2023
8a612b2
move .env config file to docker.
simulacrum6 Dec 12, 2023
f748292
fix dockerfiles.
simulacrum6 Dec 12, 2023
ffeca05
add checks for signature.
simulacrum6 Dec 12, 2023
f24161a
add test for signatures.
simulacrum6 Dec 13, 2023
24e6b95
add docker container for production server.
simulacrum6 Dec 13, 2023
6723895
update readme.
simulacrum6 Dec 13, 2023
5b52eb4
add pre-commit. reformat files.
simulacrum6 Dec 13, 2023
696049d
refactor.
simulacrum6 Dec 13, 2023
58a0ffb
update aiohttp. remove sensitive logs.
simulacrum6 Dec 13, 2023
96d045f
update cryptography.
simulacrum6 Dec 13, 2023
7da7204
update logging.
simulacrum6 Dec 13, 2023
3c375c3
add project expiration feature.
simulacrum6 Dec 14, 2023
3cf2d82
add retry feature.
simulacrum6 Dec 14, 2023
b10e3cf
move endpoint definition to own module.
simulacrum6 Dec 14, 2023
11ed233
align /webhook endpoint with oracle spec.
simulacrum6 Dec 20, 2023
2f9c05c
align /jobs endpoint with oracle spec.
simulacrum6 Dec 20, 2023
d4cbbbe
align /jobs/details endpoint with oracle spec.
simulacrum6 Dec 20, 2023
2670246
switch authentication to oauth2 bearer token
simulacrum6 Dec 21, 2023
5474adb
align /webhook endpoint signature header name with spec.
simulacrum6 Dec 22, 2023
00ce74b
add signature to outgoing webhook call.
simulacrum6 Jan 2, 2024
d4b36d6
change default username to doccano username.
simulacrum6 Jan 2, 2024
d09bc6e
align webhook signature with specification.
simulacrum6 Jan 2, 2024
1c812bf
add datapoint_uri to raw results.
simulacrum6 Jan 2, 2024
3cdb149
add reverse mapping from label texts to label ids.
simulacrum6 Jan 4, 2024
@@ -0,0 +1,6 @@
repos:
  - repo: https://github.com/psf/black
    rev: 22.6.0
    hooks:
      - id: black
        language_version: python3.10
41 changes: 41 additions & 0 deletions packages/apps/text-example/exchange-oracle/README.md
@@ -0,0 +1,41 @@
# Text Annotation Example - Exchange Oracle

This directory contains an Exchange Oracle Server for text span annotation jobs on Human Protocol.

## Setup
Make sure you have the following software installed and available on your path:

- [Python](https://www.python.org/downloads/) (>=3.10)
- [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) (>=1.5.1)
- [Docker](https://docs.docker.com/desktop) (>=24.0)

Then install all dependencies:

```shell
poetry install
```

## Running the Oracle

1. Create an appropriate ``.env`` file under `docker/prod/`. (See `docker/test/.env` for an example.)
2. Run ``docker compose -f docker/prod/compose.yaml up``

## API

For details on the API, start the server and check the `/docs` route, e.g. http://127.0.0.1:8000/docs if you run the server locally.
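
As a rough illustration, the available routes can also be listed programmatically. The sketch below assumes the server exposes an OpenAPI schema at `/openapi.json` (the default for FastAPI applications, which the `/docs` route suggests but this README does not confirm). `list_routes` and `fetch_schema` are illustrative helper names, not part of the oracle.

```python
import json
from urllib.request import urlopen


def list_routes(schema: dict) -> list[str]:
    """Return 'METHOD /path' strings for every operation in an OpenAPI schema."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    routes = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items may also hold non-operation keys such as "parameters".
            if method in http_methods:
                routes.append(f"{method.upper()} {path}")
    return routes


def fetch_schema(base_url: str = "http://127.0.0.1:8000") -> dict:
    """Download the schema. The /openapi.json path is an assumption (FastAPI default)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)
```

With the server running locally, `list_routes(fetch_schema())` would print one entry per endpoint.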

## Tests

To run the tests, unzip `test/data/txt_files/data.zip` in its folder.
Then navigate to the exchange oracle directory and run

````shell
docker compose -f docker/test/compose.yaml up --build test --attach test --exit-code-from test
````
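
The unzip step can also be scripted. This is a minimal sketch using only the standard library; the helper name `extract_in_place` is illustrative and not part of the project.

```python
import zipfile
from pathlib import Path


def extract_in_place(archive: Path) -> list[str]:
    """Extract `archive` into the folder that contains it and return the member names."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(archive.parent)
        return zf.namelist()
```

From the exchange oracle directory, calling `extract_in_place(Path("test/data/txt_files/data.zip"))` would place the test files next to the archive.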

## Contributing

1. Install the pre-commit hooks: ``pre-commit install``
2. Make sure the PostgreSQL development packages are installed on your OS. They are required for the psycopg2 build phase
(`libpq-dev` on Debian/Ubuntu, `libpq-devel` on CentOS/Fedora/Cygwin/Babun,
the `postgres` package in Homebrew for macOS).
17 changes: 17 additions & 0 deletions packages/apps/text-example/exchange-oracle/docker/prod/Dockerfile
@@ -0,0 +1,17 @@
FROM python:3.10

WORKDIR /app

RUN apt-get update -y && \
pip install poetry


COPY pyproject.toml poetry.lock ./

RUN poetry config virtualenvs.create false \
&& poetry install --no-interaction --no-ansi --no-root

COPY . .
COPY docker/prod/.env src/.env

CMD ["uvicorn", "src.main:exchange_oracle", "--reload"]
157 changes: 157 additions & 0 deletions packages/apps/text-example/exchange-oracle/docker/prod/compose.yaml
@@ -0,0 +1,157 @@
version: "3.8"

services:
  # EXCHANGE ORACLE
  exchange_oracle:
    build:
      context: ../../
      dockerfile: docker/prod/Dockerfile
    network_mode: "host"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      minio:
        condition: service_started
      postgres_exchange_oracle:
        condition: service_started
      backend:
        condition: service_started
    tty: true
  postgres_exchange_oracle:
    image: postgres:14.4
    restart: on-failure
    environment:
      POSTGRES_PASSWORD: '${PG_PASSWORD}'
      POSTGRES_USER: '${PG_USER}'
      POSTGRES_DB: '${PG_DB}'
      PGDATA: '/var/lib/postgresql/data/pgdata'
    volumes:
      - postgres:/var/lib/postgresql/data/pgdata
    ports:
      - ${PG_PORT}:${PG_PORT}
    command: ["postgres", "-c", "log_statement=all", "-p", "${PG_PORT}"]
  minio:
    image: quay.io/minio/minio
    network_mode: "host"
    container_name: minio_server
    restart: on-failure
    volumes:
      - ~/minio:/data
    environment:
      - MINIO_ROOT_USER=${MINIO_ROOT_USER}
      - MINIO_ROOT_PASSWORD=${MINIO_ROOT_PASSWORD}
    extra_hosts:
      - "host.docker.internal:host-gateway"
    command: minio server --console-address ":9090" /data
  backend:
    image: doccano/doccano:backend
    container_name: doccano_backend
    volumes:
      - static_volume:/backend/staticfiles
      - media:/backend/media
      - tmp_file:/backend/filepond-temp-uploads
    environment:
      ADMIN_USERNAME: "${DOCCANO_ADMIN}"
      ADMIN_PASSWORD: "${DOCCANO_ADMIN_PASS}"
      ADMIN_EMAIL: ${DOCCANO_ADMIN_EMAIL}
      CELERY_BROKER_URL: "amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbitmq"
      DATABASE_URL: "postgres://${PG_USER}:${PG_PASSWORD}@postgres:${PG_PORT_DOCCANO}/${PG_DB_DOCCANO}?sslmode=disable"
      ALLOW_SIGNUP: "False"
      DEBUG: "False"
      DJANGO_SETTINGS_MODULE: "config.settings.production"
    ports:
      - 4358:8000
    depends_on:
      - postgres
      - celery
    networks:
      - network-backend
      - network-frontend
  celery:
    image: doccano/doccano:backend
    volumes:
      - media:/backend/media
      - tmp_file:/backend/filepond-temp-uploads
    entrypoint: ["/opt/bin/prod-celery.sh"]
    environment:
      PYTHONUNBUFFERED: "1"
      CELERY_BROKER_URL: "amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbitmq"
      DATABASE_URL: "postgres://${PG_USER}:${PG_PASSWORD}@postgres:${PG_PORT_DOCCANO}/${PG_DB_DOCCANO}?sslmode=disable"
      DJANGO_SETTINGS_MODULE: "config.settings.production"
    depends_on:
      - postgres
      - rabbitmq
    networks:
      - network-backend
  flower:
    image: doccano/doccano:backend
    entrypoint: ["/opt/bin/prod-flower.sh"]
    environment:
      PYTHONUNBUFFERED: "1"
      CELERY_BROKER_URL: "amqp://${RABBITMQ_DEFAULT_USER}:${RABBITMQ_DEFAULT_PASS}@rabbitmq"
      DATABASE_URL: "postgres://${PG_USER}:${PG_PASSWORD}@postgres:${PG_PORT_DOCCANO}/${PG_DB_DOCCANO}?sslmode=disable"
      DJANGO_SETTINGS_MODULE: "config.settings.production"
      FLOWER_BASIC_AUTH: "${FLOWER_BASIC_AUTH}" # Format "username:password"
    depends_on:
      - celery
    ports:
      - 5555:5555
    networks:
      - network-backend
      - network-frontend
  rabbitmq:
    image: rabbitmq:3.10.7-alpine
    environment:
      RABBITMQ_DEFAULT_USER: "${RABBITMQ_DEFAULT_USER}"
      RABBITMQ_DEFAULT_PASS: "${RABBITMQ_DEFAULT_PASS}"
    ports:
      - 5672:5672
    networks:
      - network-backend
  nginx:
    image: doccano/doccano:frontend
    command: >
      /bin/sh -c
      "envsubst '
      $${WORKER_PROCESSES}
      '< /etc/nginx/nginx.conf.template
      > /etc/nginx/nginx.conf
      && nginx -g 'daemon off;'"
    environment:
      API_URL: "http://backend:8000"
      GOOGLE_TRACKING_ID: ""
      WORKER_PROCESSES: "auto"
    volumes:
      - static_volume:/static
      - media:/media
    ports:
      - 80:8080
    depends_on:
      - backend
    networks:
      - network-frontend
  postgres:
    image: postgres:13.3-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    environment:
      POSTGRES_USER: "${PG_USER}"
      POSTGRES_PASSWORD: "${PG_PASSWORD}"
      POSTGRES_DB: "${PG_DB_DOCCANO}"
    networks:
      - network-backend
    command: ["-p", "${PG_PORT_DOCCANO}"]


volumes:
  postgres_data:
  static_volume:
  media:
  tmp_file:
  postgres:
  minio_data:
    driver: local

networks:
  network-backend:
  network-frontend:
69 changes: 69 additions & 0 deletions packages/apps/text-example/exchange-oracle/docker/test/.env
@@ -0,0 +1,69 @@
PORT=8000
ENVIRONMENT="test"
WORKERS_AMOUNT=1
LOG_LEVEL="DEBUG"
DEFAULT_JOB_EXPIRY_DAYS=7 # days until a job expires, unless overwritten by manifest
MAX_ATTEMPTS=1 # number of attempts to process a job request per stage. you might want to increase this in a production environment

# Cron Config
BACKGROUND_TASK_INTERVAL=150
BACKGROUND_TASK_CHUNK_SIZE=10

# postgres_config
PG_HOST="0.0.0.0"
PG_USER="admin"
PG_PASSWORD="admin"
PG_PORT="5437"
PG_DB="exchange_oracle"
PG_PORT_DOCCANO="5432"
PG_DB_DOCCANO="doccano"

# Polygon Mainnet Config
POLYGON_MAINNET_RPC_API_URL="https://some.url.ext"
POLYGON_MAINNET_PRIVATE_KEY="abcdef123456789" # 32 bytes hex string
POLYGON_MAINNET_ADDR="0x1" # 20 bytes hex string

# Polygon Mumbai Config
POLYGON_MUMBAI_RPC_API_URL="https://some.url.ext"
POLYGON_MUMBAI_PRIVATE_KEY="abcdef123456789" # 32 bytes hex string
POLYGON_MUMBAI_ADDR="0x0" # 20 bytes hex string

# Annotation Tool Config
DOCCANO_TASKS_PER_WORKER=25
DOCCANO_HOST="localhost"
DOCCANO_PORT=4358
DOCCANO_USE_SSL=0
DOCCANO_ADMIN="admin"
DOCCANO_ADMIN_PASS="password"
DOCCANO_ADMIN_EMAIL="[email protected]"

# S3 Storage Config
S3_ENDPOINT_URL="127.0.0.1:9000"
S3_REGION=""
S3_ACCESS_KEY="minioadmin"
S3_SECRET_KEY="minioadmin"
S3_USE_SSL=0
S3_RESULTS_BUCKET_NAME="snickering-eric-fivepin"
LOCAL_DATA_DIR="/tmp/txt-exo-data/"

# Human Protocol config
RECORDING_ORACLE_ENDPOINT_URL="http://127.0.0.1:8001/job/request"
HUMAN_APP_SIGNATURE="HUMAN_APP_SIGNATURE"
JOB_LAUNCHER_ADDRESS="0xAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" # 32 bytes hex string
REPUTATION_ORACLE_KEY="somekey"

# platform settings
ADMIN_USERNAME=admin
ADMIN_PASSWORD=password
[email protected]

# rabbit mq settings
RABBITMQ_DEFAULT_USER=doccano
RABBITMQ_DEFAULT_PASS=doccano

# Flower settings
FLOWER_BASIC_AUTH=""

# minio settings
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=minioadmin
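
To illustrate how settings of this shape can be consumed, here is a minimal standard-library sketch that parses `KEY=VALUE` lines (including quoted values and trailing `# ...` comments) and assembles a PostgreSQL connection URL from the `PG_*` entries. The oracle's actual configuration loader is not shown in this excerpt, so `parse_env` and `postgres_dsn` are hypothetical helpers, not the project's code.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines, comment lines, and lines without '='."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep everything up to the matching closing quote.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: an inline comment starts at " #".
            value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def postgres_dsn(env: dict[str, str]) -> str:
    """Build a connection URL from the PG_* settings (hypothetical helper)."""
    return (
        f"postgresql://{env['PG_USER']}:{env['PG_PASSWORD']}"
        f"@{env['PG_HOST']}:{env['PG_PORT']}/{env['PG_DB']}"
    )
```

Applied to the values above, `postgres_dsn` would point at the `exchange_oracle` database on port 5437.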
18 changes: 18 additions & 0 deletions packages/apps/text-example/exchange-oracle/docker/test/Dockerfile
@@ -0,0 +1,18 @@
FROM python:3.10

WORKDIR /app

RUN apt-get update -y && \
pip install poetry


COPY pyproject.toml poetry.lock ./

RUN poetry config virtualenvs.create false \
&& poetry install --no-interaction --no-ansi --no-root

COPY . .
COPY docker/test/.env src/.env

CMD sleep 12 && pytest -s
#CMD ["pytest", "-s"]
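
The `sleep 12` in the command above gives the dependent services a fixed head start before the tests run. One possible alternative, sketched here under the assumption that readiness can be judged by a TCP port accepting connections, is to poll the port instead. This is not what the PR does, and `wait_for_port` is a hypothetical helper name.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the test `.env` values, a check such as `wait_for_port("localhost", 4358)` would wait for the doccano backend. An open port does not guarantee the service has finished initialising, so a short extra delay or an application-level check may still be needed.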