
Data backup and restore

Deepak Narayana Rao edited this page Oct 26, 2017 · 9 revisions

We follow a common backup, restore and purging process for all data.

Backup Process

  • Data backup is scheduled via *_Backup jobs in Jenkins. These jobs run the respective *-backup Ansible roles in this repo.
  • Backup jobs run once every day at midnight (@midnight)
  • Each backup is compressed and uploaded to Azure blob storage for long-term storage
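
The compression step above can be sketched roughly as follows. This is a minimal illustration, not the code of the actual *-backup roles: the function names, the archive naming scheme and the `postgresql-backup` prefix used in the demo are assumptions, and the upload to Azure blob storage is left out on purpose.

```python
import tarfile
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def archive_name(prefix: str, when: datetime) -> str:
    """Build a sortable archive name (hypothetical naming scheme)."""
    return f"{prefix}_{when.strftime('%Y-%m-%dT%H-%M-%S')}.tar.gz"


def compress_backup(backup_dir: Path, out_dir: Path, prefix: str, when: datetime) -> Path:
    """Compress backup_dir into a gzipped tarball inside out_dir and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / archive_name(prefix, when)
    with tarfile.open(target, "w:gz") as tar:
        # arcname stores only the folder name, not its absolute path, in the archive
        tar.add(backup_dir, arcname=backup_dir.name)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "dump"
        src.mkdir()
        (src / "all.sql").write_text("-- placeholder dump\n")
        archive = compress_backup(
            src, Path(tmp) / "out", "postgresql-backup", datetime.now(timezone.utc)
        )
        # The upload to Azure blob storage would follow here; it is not part of this sketch.
        print(archive.name)
```

A timestamp in the archive name keeps backups sortable by name, which makes it easier to pick a backup_name at restore time.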

Databases

Data stored in the following databases is periodically backed up using the process described above:

  • PostgreSQL - A full DB backup is taken using the Ansible role postgresql-backup, which uses pg_dumpall internally.
  • Elasticsearch - Snapshot-based backups are stored in Azure using the azure repository plugin (https://www.elastic.co/guide/en/elasticsearch/plugins/5.6/repository-azure.html). Uses the Ansible role es-azure-snapshot.
  • Cassandra - Full DB backups are taken using the Ansible role cassandra-backup
  • MongoDB - Full DB backups are taken using the Ansible role mongo-backup

Jenkins

  • The Jenkins ThinBackup plugin is used for Jenkins backup.
  • This plugin is scheduled to run @midnight. It saves a new backup folder for each backup
  • The Jenkins_Backup job compresses the latest backup folder and uploads it to Azure blob storage. This job is scheduled to run every day

Restore Process

  • Restore from backup can be done via *_Restore Jenkins jobs. These jobs run the respective *-restore Ansible roles in this repo
  • Restore jobs take backup_name as a parameter, which selects the backup (and therefore the point in time) to restore
  • Restore jobs are run on demand
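
As a rough sketch of how the backup_name parameter might be passed when a restore job is triggered remotely, the snippet below builds a URL for Jenkins' `buildWithParameters` remote-access endpoint. The host `jenkins.example.org`, the job name `Postgresql_Restore` and the backup file name are hypothetical placeholders; the request itself (a POST with valid credentials) is not shown.

```python
from urllib.parse import quote, urlencode


def restore_job_url(jenkins_url: str, job_name: str, backup_name: str) -> str:
    """Return the buildWithParameters URL for a parameterized restore job."""
    base = jenkins_url.rstrip("/")
    # backup_name is the only job parameter mentioned on this page
    query = urlencode({"backup_name": backup_name})
    return f"{base}/job/{quote(job_name, safe='')}/buildWithParameters?{query}"


if __name__ == "__main__":
    # All three arguments are illustrative values, not taken from a real setup.
    print(
        restore_job_url(
            "https://jenkins.example.org/",
            "Postgresql_Restore",
            "postgresql-backup_2017-10-26T00-00-00.tar.gz",
        )
    )
```

In practice the same job can simply be started from the Jenkins UI with backup_name filled in.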

Databases

These backups are stored in the Azure resource group <env>-db-backups. It contains a storage account named <env>dbbackups, which has a container (folder) for each backup

  • PostgreSQL - The name of the backup to restore can be found in the container postgresql-backup
  • Elasticsearch - The name of the backup to restore can be found in the console logs of the 'Elasticsearch_Backup' job

Note: Alternatively, you can get the snapshot name by looking at the last snapshot in the API response of curl --silent 10.10.3.7:9200/_snapshot/azurebackup/_all | jq . | less

  • Cassandra - The name of the backup to restore can be found in the container cassandra-backup
  • MongoDB - The name of the backup to restore can be found in the container mongodb-backup
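
For the Elasticsearch note above, picking the snapshot name out of the `_snapshot/azurebackup/_all` response could look roughly like this. It is a sketch that assumes the usual response shape (a top-level `snapshots` list whose entries carry `snapshot`, `state` and `start_time_in_millis`). Skipping snapshots that are not in state SUCCESS is an added choice, not something this page prescribes, and the snapshot names in the demo are made up.

```python
import json
from typing import Optional


def latest_snapshot(response_text: str) -> Optional[str]:
    """Return the name of the most recently started successful snapshot, or None."""
    snapshots = json.loads(response_text).get("snapshots", [])
    # Assumption: only completed snapshots are worth restoring from
    done = [s for s in snapshots if s.get("state") == "SUCCESS"]
    if not done:
        return None
    newest = max(done, key=lambda s: s["start_time_in_millis"])
    return newest["snapshot"]


if __name__ == "__main__":
    # Sample payload with made-up names, trimmed to the fields used here.
    sample = json.dumps(
        {
            "snapshots": [
                {"snapshot": "snapshot_1", "state": "SUCCESS", "start_time_in_millis": 1508976000000},
                {"snapshot": "snapshot_2", "state": "SUCCESS", "start_time_in_millis": 1509062400000},
            ]
        }
    )
    print(latest_snapshot(sample))
```

The output of the curl command in the note can be saved to a file and passed to `latest_snapshot` as text.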

Jenkins

These backups are stored in the Azure resource group admin-backups, which has a storage account named adminbackups. The name of the backup to restore can be found in the container jenkins-backup

Purging Process

  • Purging of backups in Azure blob storage is scheduled using an Azure Logic App. The steps are listed in Azure blob storage purge setup
  • Backup data is retained for 30 days by default
  • Purging has been set up for PostgreSQL, Cassandra and Jenkins.
  • Elasticsearch takes incremental snapshot-based backups, hence no purging is required
  • MongoDB will be removed soon, hence no purging has been set up
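
The 30-day retention rule above can be illustrated with a small selection function. This is only a sketch of the rule, not of the Logic App itself: the `(name, last_modified)` pairs stand in for whatever listing of a backup container is available, and the blob names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple

RETENTION_DAYS = 30  # default retention stated on this page


def blobs_to_purge(
    blobs: Iterable[Tuple[str, datetime]],
    now: datetime,
    retention_days: int = RETENTION_DAYS,
) -> List[str]:
    """Return the names of blobs last modified before the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return [name for name, modified in blobs if modified < cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    listing = [
        ("backup-old.tar.gz", now - timedelta(days=45)),
        ("backup-recent.tar.gz", now - timedelta(days=2)),
    ]
    print(blobs_to_purge(listing, now))
```

A blob that is exactly 30 days old is kept here, since only blobs strictly older than the cutoff are selected.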