[Bug]: Race condition when running backup and monitor at ~ the same time will result in failed backup #4496
Labels
command: backup
command: install
distro: openSUSE
distro: Ubuntu
info: alerts
info: query
type: bug
User story
As a server admin, I want the monitor command to not interrupt currently running backups.
Game
All ?!
Linux distro
Ubuntu 22.04
Command
command: backup
Further information
I noticed a bug when my Minecraft server was mistakenly restarted during a backup: an automated cronjob ran the monitor command at essentially the same time as the server was creating a backup. This occurred because I have the monitor command set to run every 5 minutes and the backup command set to run every day at 4:00 am.
After looking into what's happening, I think I found the issue:
If you start a backup and immediately run the monitor command after the server has already been stopped by the backup command but before the backup.lock file is created, the monitor command will pass the backup.lock file check and run normally (start querying the game server). Because the backup.lock file didn't exist, the monitor command will send out the alert "Unable to query mcserver. Game server has been restarted." and restart the server. This can cause the backup to fail with a tar error:
Backing up mcserver: Backup (9,7G) mcserver-2024-02-12-015407.tar.gz, in progress...
tar: ././serverfiles/world: file changed as we read it
FAIL
This will also lead to the backup.lock file not being deleted.
Note that this will not be noticed if the backup finishes within 60 seconds of the monitor command being called, in which case the monitor command will simply assume that the server is online and thus won't restart the server.
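The ordering described above can be reproduced in a small simulation. This is only a sketch: the functions and variables below (backup_stop_server, backup_create_lock, monitor, server_running, monitor_result) are hypothetical stand-ins, not the project's real code, and the lock file lives in a temporary directory.

```shell
#!/usr/bin/env bash
# Simulation only: hypothetical stand-ins for the real backup and monitor commands.
lockdir="$(mktemp -d)"
lockfile="${lockdir}/backup.lock"
server_running=true
monitor_result=""

# Backup, split into the two steps whose order matters here.
backup_stop_server() { server_running=false; }
backup_create_lock() { touch "${lockfile}"; }

# Monitor: skip while a backup lock exists, otherwise restart a stopped server.
monitor() {
  if [ -f "${lockfile}" ]; then
    monitor_result="skipped"
  elif [ "${server_running}" = false ]; then
    server_running=true
    monitor_result="restarted"
  else
    monitor_result="online"
  fi
}

backup_stop_server   # the backup has stopped the server...
monitor              # ...monitor runs inside the window, before the lock exists
backup_create_lock   # ...and the lock is created too late

echo "monitor_result=${monitor_result} server_running=${server_running}"
rm -rf "${lockdir}"
```

In this interleaving the monitor finds no lock file and a stopped server, so it restarts the server while the (simulated) backup is still running.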
If creating the backup takes a little longer, the monitor command will continue querying and spew out errors depending on how far the backup command got:
Another error that can occur depending on timing:
Possible Solutions:
Create the backup.lock file earlier in the backup flow.
Add a gameserver-monitor.lock file that forces commands that affect uptime to wait until the lock file is removed.
Relevant log output
Steps to reproduce
Enable stoponbackup
Run ./gameserver backup
Immediately afterwards, run ./gameserver monitor
If the monitor command detects the backup.lock file, you were too slow and the backup command has gotten too far already
Keep in mind to manually delete the backup.lock file between attempts to reproduce this bug, because it sometimes won't be deleted.
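As a rough illustration of the first idea under Possible Solutions, the sketch below creates the lock before the server is stopped, so a monitor run that lands in the same window finds the lock and skips. All names here (backup, monitor, server_running, monitor_result) are hypothetical stand-ins rather than the project's actual functions, and the tar step is only indicated by a comment.

```shell
#!/usr/bin/env bash
# Simulation only: hypothetical stand-ins, not the project's real functions.
lockdir="$(mktemp -d)"
lockfile="${lockdir}/backup.lock"
server_running=true
monitor_result=""

# Remove the temporary directory (and any leftover lock) however the script exits.
trap 'rm -rf "${lockdir}"' EXIT

# Monitor: skip while a backup lock exists, otherwise restart a stopped server.
monitor() {
  if [ -f "${lockfile}" ]; then
    monitor_result="skipped"
  elif [ "${server_running}" = false ]; then
    server_running=true
    monitor_result="restarted"
  else
    monitor_result="online"
  fi
}

backup() {
  touch "${lockfile}"     # take the lock BEFORE touching the server
  server_running=false    # stop the server for the backup
  monitor                 # a monitor run landing here now sees the lock
  # ... the archive would be created here ...
  server_running=true     # start the server again
  rm -f "${lockfile}"     # release the lock once the backup is done
}

backup
echo "monitor_result=${monitor_result} server_running=${server_running}"
```

A real script would probably also want to remove backup.lock from a trap, as is done for the temporary directory here, so that a failed tar run does not leave a stale lock behind.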