Backport(v1.16) Windows: Fix an issue where stopping the service immediately after startup could leave the processes (#4782) (#4802)

**Which issue(s) this PR fixes**:

Backport #4782 

**What this PR does / why we need it**:

Add a retry when setting the stop event for the Windows service, to fix #3937.

If `Event.open()`
([OpenEvent](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-openeventw))
is called before `Event.new()`
([CreateEvent](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-createeventw)),
`Event.open()` raises `Errno::ENOENT`.
As a result, the service stops while the supervisor and worker
processes remain running, which is the cause of #3937.
This PR fixes it by adding a retry.
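
The ordering problem described above can be illustrated with a small, portable sketch. It does not use the real `win32-event` API: `FakeEventRegistry`, `create`, `open_and_set`, and the event name are hypothetical stand-ins, chosen only to show that signalling a named event before it has been created fails with `Errno::ENOENT`, while the same call succeeds afterwards.

```ruby
# Hypothetical stand-in for a named Windows event; NOT the win32-event API.
class FakeEventRegistry
  def initialize
    @events = {}
  end

  # Analogue of CreateEvent: the supervisor registers the named event.
  def create(name)
    @events[name] = false
  end

  # Analogue of OpenEvent followed by SetEvent: fails if the event
  # has not been created yet.
  def open_and_set(name)
    raise Errno::ENOENT, name unless @events.key?(name)

    @events[name] = true
  end

  def signaled?(name)
    @events.fetch(name, false)
  end
end

registry = FakeEventRegistry.new
event_name = "fluentdwinsvc" # assumed name, for illustration only

# STOP arrives before the supervisor has created the event: the signal is lost.
early_stop_failed =
  begin
    registry.open_and_set(event_name)
    false
  rescue Errno::ENOENT
    true
  end

# Once the supervisor has created the event, the same call succeeds.
registry.create(event_name)
registry.open_and_set(event_name)
late_stop_succeeded = registry.signaled?(event_name)

puts "early stop failed: #{early_stop_failed}"
puts "late stop succeeded: #{late_stop_succeeded}"
```

In the real service the first case is the dangerous one: the stop request is dropped, so nothing tells the supervisor to exit.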


https://github.com/fluent/fluentd/blob/30c3ce00ff165b1b5d9f53fc0a027074bbcab0da/lib/fluent/winsvc.rb#L90


https://github.com/fluent/fluentd/blob/30c3ce00ff165b1b5d9f53fc0a027074bbcab0da/lib/fluent/supervisor.rb#L299

**Docs Changes**:
Not needed.

**Release Note**:
It would be good to have both of the following.

* Windows: Fixed an issue where stopping the service immediately after
startup could leave the processes running.
* Windows: Fixed an issue where stopping the service could sometimes
never complete.

Signed-off-by: Daijiro Fukuda <[email protected]>
Signed-off-by: Kentaro Hayashi <[email protected]>
Co-authored-by: Daijiro Fukuda <[email protected]>
kenhys and daipom authored Jan 29, 2025
1 parent 8e303cf commit b973802
Showing 1 changed file with 28 additions and 3 deletions.
31 changes: 28 additions & 3 deletions lib/fluent/winsvc.rb
@@ -63,10 +63,12 @@ def service_main
     end

     def service_stop
-      set_event(@service_name)
-      if @pid > 0
-        Process.waitpid(@pid)
+      if @pid <= 0
+        set_event(@service_name)
+        return
       end
+
+      wait_supervisor_finished
     end

     def service_paramchange
@@ -91,6 +93,29 @@ def set_event(event_name)
       ev.set
       ev.close
     end
+
+    def repeat_set_event_several_times_until_success(event_name)
+      retries = 0
+      max_retries = 10
+      delay_sec = 3
+
+      begin
+        set_event(event_name)
+      rescue Errno::ENOENT
+        # This error occurs when the supervisor process has not yet created the event.
+        # If STOP is immediately executed, this state will occur.
+        # Retry `set_event' to wait for the initialization of the supervisor.
+        retries += 1
+        raise if max_retries < retries
+        sleep(delay_sec)
+        retry
+      end
+    end
+
+    def wait_supervisor_finished
+      repeat_set_event_several_times_until_success(@service_name)
+      Process.waitpid(@pid)
+    end
   end

   FluentdService.new(opts[:service_name]).mainloop
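
To make the bounds of the retry loop concrete, here is a minimal standalone sketch of the same `begin`/`rescue`/`retry` pattern. The helper name `set_event_with_retry`, its keyword arguments, and the injected `sleeper` are assumptions made for this illustration and are not part of `winsvc.rb`. Injecting the sleeper lets the example run instantly while still recording how long the real code would wait.

```ruby
# Hypothetical helper mirroring the retry loop in the diff. The block stands in
# for `set_event`; `sleeper` is injected so no real sleeping happens here.
def set_event_with_retry(max_retries: 10, delay_sec: 3, sleeper: method(:sleep))
  retries = 0
  begin
    yield
  rescue Errno::ENOENT
    retries += 1
    raise if max_retries < retries # give up once the retry budget is exhausted

    sleeper.call(delay_sec)
    retry
  end
end

# Case 1: the event never appears, so the helper eventually re-raises.
attempts = 0
total_delay = 0
recorder = ->(sec) { total_delay += sec }

gave_up =
  begin
    set_event_with_retry(sleeper: recorder) do
      attempts += 1
      raise Errno::ENOENT
    end
    false
  rescue Errno::ENOENT
    true
  end

puts "gave up: #{gave_up}, attempts: #{attempts}, total delay: #{total_delay}s"

# Case 2: the event appears on the third attempt, so the helper returns normally.
tries = 0
set_event_with_retry(sleeper: ->(_sec) {}) do
  tries += 1
  raise Errno::ENOENT if tries < 3
end

puts "succeeded after #{tries} tries"
```

With the values from the diff (`max_retries = 10`, `delay_sec = 3`), the loop makes one initial attempt plus up to ten retries, so a stop request waits at most about 30 seconds for the supervisor to create the event before the error propagates.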