argoCollector

Collect Argo float profiles as NetCDF from the Global Data Assembly Centre and push to Erddap via Kafka

Requirements

  • Developed on Python 2.7
  • Dependencies are listed in the requirements.txt file

Installation

sudo mkdir /opt/argos
sudo chown dmuser:dmuser /opt/argos
cd /opt/argos
git clone https://github.com/IrishMarineInstitute/argoCollector.git .
pip install -r requirements.txt

Invocation

to store data from http to kafka:

./argo_collector.py -c argo.json --input http --output kafka

to store data from kafka to the file system:

./argo_collector.py -c argo.json --input kafka --output file --no-commit

or to store data from http to the file system:

./argo_collector.py -c argo.json --input http --output file --no-commit

Control File

The control file is a JSON file which specifies the following parameters:

  • logfile - A string giving the full path to which the app will log
  • http - a JSON object
    • url - A string giving the GDAC URL pattern to poll data from; {{floatID}}, {{quality}} and {{profileID}} will be replaced by the app
    • padProfileID - An integer giving how many digits to pad the profile ID out to (e.g. for a pattern of 001, 002, 003 use 3)
  • file - a JSON object
    • target - A string giving the target file path, {{float}} and {{basename}} will be replaced by the app
  • kafka - a JSON object
    • topic - a string giving the topic to send messages to
    • hosts - a comma separated string of server:port kafka servers
    • consumer_group - a string naming the consumer group used to commit offsets in kafka
  • floats - A JSON object of multiple "floatID": lastProfileIDProcessed key-value pairs
    • floatID - A string identifying this float
    • lastProfileIDProcessed - An integer giving the last profile processed for this float. Use null to begin with the last profile received by the GDAC. Use 0 to start from the first profile received for a given float
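
As a rough illustration of how the url pattern and padProfileID fit together, the sketch below fills in the three placeholders by plain string replacement. The function name expand_url is hypothetical and this is not the app's own code; the default quality prefix "R" is an assumption based on the GDAC naming of real-time profile files.

```python
def expand_url(pattern, float_id, profile_id, pad, quality="R"):
    """Fill the {{...}} placeholders in a GDAC URL pattern (illustrative only)."""
    # Zero-pad the profile ID, e.g. 7 -> "007" when pad is 3
    padded = str(profile_id).zfill(pad)
    return (pattern
            .replace("{{floatID}}", str(float_id))
            .replace("{{quality}}", quality)  # "R" is an assumed default
            .replace("{{profileID}}", padded))


pattern = ("http://www.usgodae.org/ftp/outgoing/argo/dac/bodc/"
           "{{floatID}}/profiles/{{quality}}{{floatID}}_{{profileID}}.nc")
url = expand_url(pattern, "6900444", 7, 3)
print(url)
```

With padProfileID set to 3, profile 7 of float 6900444 maps to a file name ending in R6900444_007.nc.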

An example control file is given below:

{
    "logfile": "~/logs/argo_log.txt", 
    "http": {
        "url": "http://www.usgodae.org/ftp/outgoing/argo/dac/bodc/{{floatID}}/profiles/{{quality}}{{floatID}}_{{profileID}}.nc", 
        "padProfileID": 3
    }, 
    "floats": {
        "6900444": 185, 
        "6900658": 185
    }, 
    "kafka": {
        "topic": "argo", 
        "hosts": "127.0.0.1:9092",
        "consumer_group": "kafka"
    }
}
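
A control file like the one above can be sanity-checked before use. The following is a minimal sketch, not part of argoCollector: the function name load_control and the choice of required keys are assumptions made for illustration, and expanding "~" in the logfile path is something this sketch does, not a documented behaviour of the app.

```python
import json
import os

# Assumption: the minimum set of keys this sketch insists on
REQUIRED_KEYS = ("logfile", "floats")


def load_control(text):
    """Parse control-file JSON and run a few basic checks (illustrative only)."""
    control = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in control]
    if missing:
        raise ValueError("control file is missing: " + ", ".join(missing))
    # Expand a leading "~" so the log path is absolute
    control["logfile"] = os.path.expanduser(control["logfile"])
    # Each float maps to an integer profile ID, or null (None in Python)
    for float_id, last in control["floats"].items():
        if last is not None and not isinstance(last, int):
            raise ValueError("bad lastProfileIDProcessed for float " + float_id)
    return control


example = '{"logfile": "/tmp/argo_log.txt", "floats": {"6900444": 185, "6900658": null}}'
control = load_control(example)
print(sorted(control["floats"]))
```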

Adding a new float

To add a new float, for example 6901933, use the add_float command:

/opt/argos/add_float 6901933
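
The add_float script itself is not reproduced here. As a sketch of the equivalent change to the control file, assuming the script simply adds a new entry to the floats object, the hypothetical helper below registers a float with null so that collection begins with the last profile received by the GDAC.

```python
import json


def add_float(control, float_id, last_profile=None):
    """Add a float to an in-memory control structure (illustrative only)."""
    floats = control.setdefault("floats", {})
    if float_id in floats:
        raise ValueError("float %s is already configured" % float_id)
    # None is written as null in JSON; pass 0 to start from the first profile
    floats[float_id] = last_profile
    return control


control = {"floats": {"6900444": 185}}
add_float(control, "6901933")
print(json.dumps(control, sort_keys=True))
```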

Scheduled update

Cron entry:

8 * * * * /opt/argos/update_floats >> /home/dmuser/update_floats.log 2>&1