if ansible-bender is killed the stale pid file can block future uses #223

Open
coreyoconnor opened this issue Apr 24, 2020 · 3 comments
Labels
bug (Something isn't working), UX (Improve user experience)

Comments

@coreyoconnor

Suppose the following:

  1. ansible-bender starts
  2. acquires db access; creates the pid file
  3. host computer dies, e.g. from a power failure
  4. ansible-bender list-builds is invoked

The expectation is that list-builds would work. However, running with --debug shows this line repeated indefinitely:

ab is running as PID 12947
ab is running as PID 12947
ab is running as PID 12947

This is not correct: the referenced ansible-bender process has died, so the current ansible-bender process will wait forever. See the while loop here:

Proposal:

  • check if a process with that pid exists
  • if no process with that pid exists for a period of time then delete the pid file and continue
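The two proposed steps could be sketched roughly as follows. This is a minimal illustration, not ansible-bender's actual code: the function names `pid_is_running` and `clear_stale_pid_file` are hypothetical, the pid file is assumed to contain a single integer, and the check is POSIX-only. It performs one check rather than waiting "for a period of time", and it cannot detect a PID that was reused by an unrelated process after a reboot.

```python
import os
from pathlib import Path


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        # PIDs <= 0 address process groups in kill(2); never probe them.
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is sent
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def clear_stale_pid_file(pid_file: Path) -> bool:
    """Delete pid_file if the PID it records is not running.

    Returns True if the file was removed, False otherwise.
    """
    try:
        pid = int(pid_file.read_text().strip())
    except FileNotFoundError:
        return False  # nothing to clean up
    except ValueError:
        return False  # unreadable content: leave it alone (assumption)
    if pid_is_running(pid):
        return False
    try:
        pid_file.unlink()
    except FileNotFoundError:
        pass  # another process removed it first
    return True
```

A caller would run `clear_stale_pid_file(...)` on the pid file path before entering the wait loop, so that a file left behind by a dead process no longer blocks it.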

As a separate feature:

  • have an option to time out and fail if the db cannot be acquired within a specified time. Not strictly required: the calling process could handle the timeout itself.
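A bounded wait of this kind might look like the sketch below. The function name `wait_for_pid_file_release` and its `timeout` and `poll_interval` parameters are hypothetical and not part of ansible-bender; the sketch only assumes that the lock is represented by the existence of a pid file.

```python
import time
from pathlib import Path


def wait_for_pid_file_release(
    pid_file: Path, timeout: float, poll_interval: float = 0.1
) -> None:
    """Block until pid_file disappears, or raise TimeoutError after `timeout` seconds."""
    # monotonic() is unaffected by system clock adjustments
    deadline = time.monotonic() + timeout
    while pid_file.exists():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{pid_file} still present after {timeout} s")
        time.sleep(poll_interval)
```

Raising an exception rather than looping forever lets the command exit with an error that a calling script can act on.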
@TomasTomecek added the bug and UX labels on Apr 27, 2020
@TomasTomecek
Collaborator

> check if a process with that pid exists
> if no process with that pid exists for a period of time then delete the pid file and continue

That's a good idea! Since you investigated so thoroughly, would you be interested in fixing it as well? :)

@coreyoconnor
Author

> > check if a process with that pid exists
> > if no process with that pid exists for a period of time then delete the pid file and continue
>
> That's a good idea! Since you investigated so thoroughly, would you be interested in fixing it as well? :)

Thanks for confirming my understanding :)

I'm not much of a Python programmer, but this should be within bounds. Unfortunately, my queue is long. :\ I'll submit a patch when I can. Still, anyone should feel free to fix it.

@areese
areese commented Jan 25, 2021

@coreyoconnor I found a workaround for this. I hit the same issue, added a log statement to show where the lock file is stored, and ended up removing the file manually:

rm -f ~/.cache/ab/ab.pid
