Synchronizing GOKb instances
To synchronize a (new) GOKb instance with an existing database, this repository comes with multiple Groovy sync scripts. Their purpose is to copy the data rather than to create a like-for-like duplicate of the GOKb server (with users, curatorial groups, etc.). The order in which the scripts are run is important; we suggest Orgs, Titles, Platforms, Packages.
The sync scripts are written using groovysh. SDKMAN is a handy way to manage Groovy and Grails installations:
sudo apt-get install zip
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh" # Only needed the first time
sdk use groovy
# Prompted, reply "Y"
The simplest way to get the scripts is to clone the source repository:
git clone https://github.com/openlibraryenvironment/gokb.git
cd gokb/scripts
If you are running the scripts for the first time, Grape will download all required dependencies, which may take a moment. Eventually you should see status messages about blocks of data being downloaded, followed by a series of 200 OK responses showing that the data is being loaded.
Each of the four main sync scripts (sync_gokb_orgs.groovy, sync_gokb_platforms.groovy, sync_gokb_titles.groovy & sync_gokb_packages.groovy) supports a separate configuration file that overrides the default values (which include sending the data to a localhost instance). The config files must be in the same folder as the scripts, and each file is named after the pattern sync-gokb-{'orgs'|'platforms'|'titles'|'packages'}-cfg.json. A full configuration would look like this:
{
"uploadUser":"targetSystemUser",
"uploadPass":"targetSystemUserPass",
"targetBase":"http(s)://target.url/",
"sourceBase":"http(s)://source.url/"
}
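Since all four scripts accept the same keys, the config files can be generated in one pass. The following is a minimal sketch, not part of the repository: the variable names and the example.org URLs are placeholders chosen for illustration, and it assumes it is run from the folder that contains the sync scripts.

```shell
# Placeholder values (assumptions); replace them with real credentials and URLs.
UPLOAD_USER="targetSystemUser"
UPLOAD_PASS="targetSystemUserPass"
TARGET_BASE="https://target.example.org/"
SOURCE_BASE="https://source.example.org/"

# Write one config file per sync script, named after the documented pattern
# sync-gokb-{orgs|platforms|titles|packages}-cfg.json.
for kind in orgs platforms titles packages; do
  cat > "sync-gokb-${kind}-cfg.json" <<EOF
{
  "uploadUser": "${UPLOAD_USER}",
  "uploadPass": "${UPLOAD_PASS}",
  "targetBase": "${TARGET_BASE}",
  "sourceBase": "${SOURCE_BASE}"
}
EOF
done
```

Note that this overwrites existing config files, including any state the scripts have written to them, so it is only suitable for an initial setup. Values containing double quotes or backslashes would need JSON escaping first.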
Important: The scripts currently assume that both GOKb applications are hosted at the '/gokb/' webapp endpoint. If this is not the case for either side, the config values sourceContext and targetContext can be used to change this behaviour (the value '' denotes an app running under the main context).
During or after a run of each script, additional fields may be written to these files:
- resumptionToken: The resumptionToken of the last OAI API call. Useful for resuming after an interruption; it is empty after a finished run.
- lastTimestamp: The timestamp of the last item of the last API call.
- lastRun: The highest timestamp of the last script run. Should be equal to lastTimestamp after a finished run.
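These fields make it possible to tell whether a previous run was interrupted. The helper below is a rough sketch under assumptions: the function name sync_state is hypothetical, and it assumes that the token is stored as a JSON string value under the key "resumptionToken". It reports "pending" if a non-empty token is present and "none" otherwise.

```shell
# Report whether a sync config file still carries a resumption token.
# Assumption: the token is written as a JSON string value.
sync_state() {
  cfg="$1"
  # Extract the string value of "resumptionToken", if the key is present.
  token=$(sed -n 's/.*"resumptionToken"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$cfg" | head -n 1)
  if [ -n "$token" ]; then
    echo "pending"   # a previous run was interrupted and can be resumed
  else
    echo "none"      # no token: the last run finished, or no run has happened yet
  fi
}
```

It could be called as `sync_state sync-gokb-titles-cfg.json`. A proper JSON parser would be more robust than the sed expression used here.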
Each script may be run with the parameter --update to only request data that has been changed after the last successful run (config value lastRun). This is useful for continuous synchronization with another instance, as it avoids unnecessary API calls.
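For a recurring synchronization, the four scripts can be wrapped in a small function that runs them in the suggested order (Orgs, Titles, Platforms, Packages). This is a sketch under assumptions: the function name run_sync_in_order is hypothetical, and it assumes that the scripts are started through the groovy launcher and return a non-zero exit code on failure.

```shell
# Run the four sync scripts in the suggested order, passing any
# arguments (e.g. --update) through to each script.
run_sync_in_order() {
  for kind in orgs titles platforms packages; do
    echo "Running sync_gokb_${kind}.groovy $*"
    # Stop at the first failing script so later ones do not run on partial data.
    groovy "sync_gokb_${kind}.groovy" "$@" || return 1
  done
}
```

An incremental run would then be `run_sync_in_order --update`, and a full run `run_sync_in_order` without arguments.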