
Commit

Rewrote description
DoctressWasTaken committed Feb 28, 2021
1 parent 6e3aefa commit 999dec7
Showing 15 changed files with 145 additions and 134 deletions.
1 change: 1 addition & 0 deletions .env
Original file line number Diff line number Diff line change
@@ -8,3 +8,4 @@ COMPOSE_HTTP_TIMEOUT=240
## Global settings :: TODO: Not yet implemented rabbitmq
RABBITMQ_HOST=rabbitmq
RABBITMQ_PORT=5672
TAG=dev
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,6 +1,6 @@
venv/*
.idea/*
secrets.env
secrets_template.env

.py3/
*/.py3/*
123 changes: 60 additions & 63 deletions README.md
@@ -3,91 +3,88 @@

This tool provides a fully automated system to permanently mirror the official Riot APIs content
with your local system. It uses a number of microservices to split requests done to different endpoints
across separated processes.
across separated processes and allows scaling each service according to your current data needs.
This is especially useful in the early stages of setting up a data repository.

The setup is meant for tasks using a lot of match data while not wanting to set up their own data polling.
The setup is meant for tasks using a lot of match data while not wanting to set up their own data pipeline structure.
Lightshield does currently not pull **all** match details data but only a select subset (for details
check the SQL schemas in the postgres folder). Changes can be made easily by simply expanding/replacing the service.

Lightshield is optimized to not repeat calls unnecessarily. This comes at the cost of having data added in a real time fashion.
*While unlikely, services can drop individual datasets on restarts/shutdowns; these will not be repeated unless manually forced.*
Lightshield is optimized to not repeat calls unnecessarily. This comes at the cost of having data added in a less than
real time fashion.

## Structure in Short

![Service](https://raw.githubusercontent.com/LightshieldDotDev/Lightshield/master/Service.png)
Lightshield handles data through a distributable, scalable network of triangular microservice structures.
All data collected is stored in a dedicated postgres database. Task handling and scalability are provided through a
buffering redis database.

- Each cluster of services is responsible for a single server and requires an additional proxy for rate limiting and
handling of the API key.

Each step of data processing is stored inside the Postgres database, from which a single manager service creates tasks.
Worker services then process the tasks, add the data to the db and close the tasks.

Services rely on a standardized structure:
- Services take input from other services using a WebSocket client that connects to the publishing service.
- Each service relies on its own redis database for buffering incoming as well as outgoing tasks. Buffers are limited
so as not to build up major overhead at any point in the service chain.
- Services use a local SQLite database to hold details on already performed calls / latest stats. This data is not
kept in the Redis database as the in-memory cost would be too high.
- Services use a centralized Service Class/Worker Class system to perform calls to the API in an asynchronous manner.
The Service Class handles pulls and pushes to either buffer.
- All calls are issued through a centralized proxy service, tasked with rate limiting the requests on the client side.
- Tasks are published to subscribed services using a websocket server.
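The buffered worker pattern described above can be sketched roughly as follows. This is an illustrative stand-in only: `asyncio.Queue` replaces the Redis buffers, and the class, task and result names are made up, not the actual Lightshield code.

```python
import asyncio


class Service:
    """Sketch of the Service/Worker pattern: bounded in/out buffers, N workers."""

    def __init__(self, workers: int, max_buffer: int):
        # Bounded queues stand in for the limited Redis buffers.
        self.incoming = asyncio.Queue(maxsize=max_buffer)
        self.outgoing = asyncio.Queue(maxsize=max_buffer)
        self.workers = workers

    async def worker(self):
        # Each worker pulls a task, "performs the call", pushes the result.
        while True:
            task = await self.incoming.get()
            if task is None:  # shutdown sentinel
                break
            await self.outgoing.put(f"processed:{task}")  # placeholder for the proxied API call

    async def run(self, tasks):
        for t in tasks:
            await self.incoming.put(t)
        for _ in range(self.workers):  # one sentinel per worker
            await self.incoming.put(None)
        await asyncio.gather(*(self.worker() for _ in range(self.workers)))
        results = []
        while not self.outgoing.empty():
            results.append(self.outgoing.get_nowait())
        return results


async def main():
    service = Service(workers=3, max_buffer=50)
    return await service.run(["s1", "s2", "s3"])


results = asyncio.run(main())
```

In the real services the out-buffer is drained by the publisher rather than collected at the end; the bounded queues are what keeps any single service from building up a large backlog.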

## Requirements
Lightshield runs on docker-compose, meaning that nothing is required outside of the container system.
Lightshield runs on docker and can either be built through this repository or by pulling the images
[directly from DockerHub](https://hub.docker.com/u/lightshield).


## Setup

### I. API Key
Create a `secrets.env` file in the project folder and add your API key.
```.env
API_KEY=RGAPI-xxx
```
### I. Network

### II. Base image & Network
Build the basic image used by all services:
```shell script
docker build -t lightshield_service ./services/base_image/
```
Initialize the network used:
Initialize the network used to bundle all services together and allow communication:
```shell script
docker network create lightshield
```
The name can be changed but has to be updated in the compose files as well.

### III. Centralized Postgres
Calls and services are initiated on a **per server** basis, meaning that servers can be added and removed
as interest in the data exists.
### II. Database
Set up a postgres database either locally, in docker (attached to the network) or remotely. The services currently expect
no password verification; as such, a remote postgres database should only be used if you can limit access through other means.

Data collected across **all** services is centralized in a final postgres database.
Settings for the postgres can be found in the `compose-persistent.yaml`.
Data is saved in the `data` database.\
Build and start the postgres db via:
```shell script
docker-compose -f compose-persistent.yaml build
docker-compose -f compose-persistent.yaml up -d
```
DB Connection details can be configured through a secrets.env file (template file included).

Lightshield requires the tables listed in the postgres folder to be set up in the specified database, under schemas
corresponding to the server they will contain data for. E.g. pulling data for NA1 means setting up a schema `na1` (lower case)
with all tables inside said schema.
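As a sketch, setting up the NA1 schema could look like this; the actual table definitions live in the SQL files in the postgres folder, and the comments below are only placeholders for them:

```sql
-- One schema per server, lower case.
CREATE SCHEMA IF NOT EXISTS na1;
-- Create the tables inside that schema:
SET search_path TO na1;
-- ...followed by the CREATE TABLE statements from the postgres folder.
```

Repeat the same steps with a different schema name for every additional server you want to pull data for.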

### III. Centralized Services
Services included in the `compose-global.yaml` file are centralized and as such do not run per server.
This currently mainly includes the redis buffer database, which should be started ahead of the other services.

### IV. Server Specific Structure
The project uses every container multiple times. For this to work a different `COMPOSE_PROJECT_NAME` has
to be set for each server. \
In addition, the currently selected `SERVER` code has to be passed into the command so that it can be passed on into
the services.\
**While the `COMPOSE_PROJECT_NAME` can be chosen and changed at will while running the project, the
`SERVER` variable has to be kept consistent in spelling (always capital letters), as the volumes used by
the Redis databases, the naming of the SQLite databases and the server names in the postgres db all rely on this variable.** \
Initiate each server chain as follows:

#### IV.0 Proxy
See the [Proxy Repository](https://github.com/LightshieldDotDev/Lightshield_proxy) for setup description.

#### For docker-compose
Run docker compose with default parameters. The services require the selected server to be passed into the container via
the environment variable `SERVER`. In addition, make sure to use different project names, either through `-p [project name]`
or through the env variable `COMPOSE_PROJECT_NAME`. This stops multiple server setups from overwriting one another.

```shell script
SERVER=EUW1 COMPOSE_PROJECT_NAME=lightshield_euw1 docker-compose build
# Build from source
docker-compose build
# Or pull from docker hub using TAG to specify the version you want
TAG=latest docker-compose pull
# Run either with the -p flag
SERVER=EUW1 docker-compose -p lightshield_euw1 up -d
# Or env variable
SERVER=EUW1 COMPOSE_PROJECT_NAME=lightshield_euw1 docker-compose up -d
```
#### For docker-swarm
Follow the same guidelines explained for docker-compose. The images can either be built or pulled from DockerHub.
`SERVER` still needs to be passed into the service.
The individual project name is passed through the stack name.
```shell script
SERVER=EUW1 docker stack deploy -c compose-services.yaml lightshield_euw1
```

<hr>

## Config
Each service takes a number of required arguments:
#### Default
`SERVER` contains the selected API server. For the proper codes please refer to the Riot API developer website.\
`WORKER` sets the number of parallel async workers started in the service.\
`MAX_TASK_BUFFER` sets the maximum number of incoming tasks buffered. *Outgoing tasks are currently not set via env variables.*\
`REQUIRED_SUBSCRIBER` [Optional] sets the services that need to be connected before the publisher allows output.
Only once all listed services are connected is the publisher allowed to broadcast tasks to all connected services.
*Service names are currently not set via environment variables.*
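The `REQUIRED_SUBSCRIBER` gate described above can be sketched as follows. This is an illustrative model, not the actual publisher code: an `asyncio.Event` opens once every required service name has connected, and broadcasts block until then.

```python
import asyncio


class Publisher:
    """Sketch: hold back tasks until all required subscribers are connected."""

    def __init__(self, required_subscribers):
        self.required = set(required_subscribers)
        self.connected = set()
        self.ready = asyncio.Event()
        self.sent = []  # stand-in for the websocket broadcast channel

    def on_connect(self, service_name):
        self.connected.add(service_name)
        if self.required <= self.connected:
            self.ready.set()  # all required services are present

    async def broadcast(self, task):
        await self.ready.wait()  # block until the gate opens
        self.sent.append(task)


async def demo():
    pub = Publisher(["match_history", "match_details"])
    sender = asyncio.create_task(pub.broadcast("task-1"))
    await asyncio.sleep(0)           # broadcast is now blocked: gate still closed
    pub.on_connect("match_history")
    pub.on_connect("match_details")  # last required service connects, gate opens
    await sender
    return pub.sent


sent = asyncio.run(demo())
```

The point of the gate is to avoid tasks being broadcast while a downstream consumer is still missing, which would otherwise drop those tasks from the chain.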

Services connecting to the postgres database accept host, port and user variables for the postgres db.
The parameters shown below are the defaults used in the services.
```
POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_USER=db_worker
```
In addition to the configs listed in the secrets.env file, there are default values used by docker in the .env file
as well as in the individual compose files (on the services) themselves.
2 changes: 1 addition & 1 deletion compose-global.yaml
@@ -24,7 +24,7 @@ services:
environment:
- SERVER=NA1,EUW1,KR
env_file:
- secrets.env
- secrets_template.env
ports:
- 8003:8000

36 changes: 18 additions & 18 deletions compose-services.yaml
@@ -6,46 +6,46 @@ services:
build:
dockerfile: Dockerfile
context: services/base
image: lightshield_base_image:latest
image: lightshield/base_image:${TAG}
entrypoint: ['ls']

league_rankings:
league_ranking:
hostname: league_rankings
build:
dockerfile: Dockerfile
context: services/league_rankings
image: lightshield.dev:5000/lightshield_league_rankings:latest
image: lightshield/lightshield_league_ranking:${TAG}
environment:
- SERVER=${SERVER}
- UPDATE_INTERVAL=1
- WORKER=5
env_file:
- secrets.env
- secrets_template.env
restart: always
volumes:
- ranking_progress:/project/configs/

summoner_ids_manager:
summoner_id_manager:
hostname: summoner_ids
build:
dockerfile: Dockerfile
context: services/summoner_ids_manager
image: lightshield.dev:5000/lightshield_summoner_ids_manager:latest
image: lightshield/lightshield_summoner_id_manager:${TAG}
environment:
- SERVER=${SERVER}
env_file:
- secrets.env
- secrets_template.env

summoner_ids: # SI
summoner_id: # SI
hostname: summoner_ids
build:
dockerfile: Dockerfile
context: services/summoner_ids
image: lightshield.dev:5000/lightshield_summoner_ids:latest
image: lightshield/lightshield_summoner_id:${TAG}
environment:
- SERVER=${SERVER}
env_file:
- secrets.env
- secrets_template.env
restart: always

match_history_manager:
@@ -56,48 +56,48 @@ services:
build:
dockerfile: Dockerfile
context: services/match_history_manager
image: lightshield.dev:5000/lightshield_match_history_manager:latest
image: lightshield/lightshield_match_history_manager:${TAG}
env_file:
- secrets.env
- secrets_template.env

match_history: # MH
hostname: match_history
build:
dockerfile: Dockerfile
context: services/match_history
image: lightshield.dev:5000/lightshield_match_history:latest
image: lightshield/lightshield_match_history:${TAG}
environment:
- SERVER=${SERVER}
- QUEUES=420
restart: always
env_file:
- secrets.env
- secrets_template.env

match_details_manager:
hostname: match_details
build:
dockerfile: Dockerfile
context: services/match_details_manager
image: lightshield.dev:5000/lightshield_match_details_manager:latest
image: lightshield/lightshield_match_details_manager:${TAG}
environment:
- LIMIT=5000
- SERVER=${SERVER}
env_file:
- secrets.env
- secrets_template.env

match_details: # MH
hostname: match_details
build:
dockerfile: Dockerfile
context: services/match_details
image: lightshield.dev:5000/lightshield_match_details:latest
image: lightshield/lightshield_match_details:${TAG}
environment:
- SERVER=${SERVER}
- QUEUES=420
- BATCH_SIZE=30
restart: always
env_file:
- secrets.env
- secrets_template.env

volumes:
ranking_progress:
7 changes: 2 additions & 5 deletions global/status_page/backend/run.py
@@ -7,6 +7,7 @@
from aiohttp import web
import traceback


class Server:
def __init__(self):
self.db_host = os.environ["DB_HOST"]
@@ -15,15 +16,13 @@ def __init__(self):
self.last = datetime.now()
self.cutoff = os.environ["DETAILS_CUTOFF"]


self.logging = logging.getLogger("Main")
level = logging.INFO
self.logging.setLevel(level)
handler = logging.StreamHandler()
handler.setLevel(level)
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))


self.data = {}

async def make_app(self):
@@ -140,9 +139,7 @@ async def return_status(self, request):

async def main():
server = Server()
await asyncio.gather(
server.generate_data()
)
await asyncio.gather(server.generate_data())


if __name__ == "__main__":
9 changes: 5 additions & 4 deletions services/base/connection_manager/buffer.py
@@ -5,18 +5,19 @@


class RedisConnector:

def __init__(self):
self.host = os.environ['BUFFER_HOST']
self.port = int(os.environ['BUFFER_PORT'])
self.host = os.environ["BUFFER_HOST"]
self.port = int(os.environ["BUFFER_PORT"])

self.connection = None
self.connection: aioredis.ConnectionsPool

@asynccontextmanager
async def get_connection(self):
if not self.connection or self.connection.closed:
self.connection = await aioredis.create_redis_pool((self.host, self.port), encoding="utf-8")
self.connection = await aioredis.create_redis_pool(
(self.host, self.port), encoding="utf-8"
)

try:
yield self.connection
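The reconnect pattern in `buffer.py` — lazily open a connection, reuse it while healthy, reopen only if it was closed — generalizes beyond aioredis. A minimal sketch, with a `DummyConnection` standing in for the aioredis pool so the example is self-contained:

```python
import asyncio
from contextlib import asynccontextmanager


class DummyConnection:
    """Stand-in for the aioredis connection pool."""

    def __init__(self):
        self.closed = False


class Connector:
    def __init__(self):
        self.connection = None
        self.opened = 0  # counts (re)connects, for illustration only

    @asynccontextmanager
    async def get_connection(self):
        # Reconnect only when there is no connection or it was closed.
        if not self.connection or self.connection.closed:
            self.connection = DummyConnection()  # real code: await aioredis.create_redis_pool(...)
            self.opened += 1
        try:
            yield self.connection
        finally:
            pass  # the pool stays open for reuse across calls


async def demo():
    conn = Connector()
    async with conn.get_connection():
        pass
    async with conn.get_connection():  # healthy connection is reused, no reconnect
        pass
    conn.connection.closed = True
    async with conn.get_connection():  # closed connection triggers a reconnect
        pass
    return conn.opened


opened = asyncio.run(demo())
```

Keeping the pool open in `finally` (instead of closing per call) is what makes the context manager cheap to enter on every task.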
20 changes: 11 additions & 9 deletions services/base/connection_manager/persistent.py
@@ -5,12 +5,12 @@
import traceback
import logging

class PostgresConnector:

def __init__(self, user='postgres'):
self.host = os.environ['PERSISTENT_HOST']
self.port = int(os.environ['PERSISTENT_PORT'])
self.db = os.environ['PERSISTENT_DATABASE']
class PostgresConnector:
def __init__(self, user="postgres"):
self.host = os.environ["PERSISTENT_HOST"]
self.port = int(os.environ["PERSISTENT_PORT"])
self.db = os.environ["PERSISTENT_DATABASE"]
self.user = user
self.connection = None
self.prepare = None
@@ -20,19 +20,21 @@ def __init__(self, user='postgres'):
self.logging.setLevel(level)
handler = logging.StreamHandler()
handler.setLevel(level)
handler.setFormatter(logging.Formatter("%(asctime)s [DB Connector] %(message)s"))
handler.setFormatter(
logging.Formatter("%(asctime)s [DB Connector] %(message)s")
)
self.logging.addHandler(handler)


@asynccontextmanager
async def get_connection(self, exclusive=False):
if exclusive:
await self.lock.acquire()
try:
if not self.connection or self.connection.is_closed():
self.connection = await asyncpg.connect(
"postgresql://%s@%s:%s/%s" % (
self.user, self.host, self.port, self.db))
"postgresql://%s@%s:%s/%s"
% (self.user, self.host, self.port, self.db)
)
if self.prepare:
await self.prepare(self.connection)
yield self.connection
1 change: 1 addition & 0 deletions services/league_rankings/run.py
@@ -207,6 +207,7 @@ async def run(self):
await self.rankmanager.update(key=(tier, division))
await self.db.close()


if __name__ == "__main__":
service = Service()
signal.signal(signal.SIGTERM, service.shutdown)