Change definition of "stuck VM" in manage_vms script #996
In the current version of the script, stuck VMs are those whose names start with `vgcnbwc-worker-` and that do not show up in `condor_status`. Change the definition so that stuck VMs are those that do not reply to ICMP echo requests. This prevents the script from removing machines that belong to the secondary HTCondor cluster, which match the name prefix but do not appear in `condor_status`.
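For illustration, the new definition could be sketched roughly as follows. This is not the code from the PR: the function names `replies_to_ping` and `find_stuck_vms`, the name-to-IP mapping, and the 5-second timeout are all assumptions made for the example, and the `ping` flags are those of Linux iputils.

```python
import subprocess
from typing import Callable, Dict, List


def replies_to_ping(ip: str, timeout_s: int = 5) -> bool:
    """Return True if `ip` answers a single ICMP echo request in time.

    The timeout value is an assumption; the PR does not state the one it uses.
    """
    # iputils ping: -c 1 sends one request, -W sets the reply timeout in seconds.
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_stuck_vms(
    vms: Dict[str, str],
    probe: Callable[[str], bool] = replies_to_ping,
) -> List[str]:
    """Return the names of VMs whose IP does not reply to the probe.

    `vms` maps a VM name to its IP address (hypothetical data structure).
    """
    return [name for name, ip in vms.items() if not probe(ip)]
```

Because the check depends only on reachability, a VM that belongs to the secondary HTCondor cluster is not flagged as long as it answers pings, regardless of whether it appears in `condor_status`.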
It looks fine. Was this function tested?
I need to take another look, as pinging the VMs by name might not work.
It pings the IPs of the VMs.
Cool, thank you for testing it!
Machines answer within a few hundred milliseconds, but better safe than sorry.
I assume you are OK with the ping timeout increase as well. I am merging; feel free to revert f199067 at any time (or to choose a different value).