Healthz returns incorrect status with health_check_grpc_backend active #751
Comments
UPDATE:
D1112 21:01:19.167 28 external/envoy/source/common/network/connection_impl.cc:924] [28][connection][C40] connecting to 192.168.65.2:21411
Would it be possible to change the initial state to unhealthy?
@qiwzhang could you take a look?
Sure
ESPv2 is built on Envoy. Currently, it is not possible to change the initial_health_status to unhealthy. I just checked Envoy's health check config, and it seems that this is not supported.
Hello
I am using ESPv2 (2.39.0) active health checking to track backend status.
The parameters I use are:
--healthz=healthz
--health_check_grpc_backend
--health_check_grpc_backend_interval=5s
My expectation is:
when ESPv2 has started and the backend service has not started yet, the ESPv2 /healthz endpoint should fail.
However, the response I receive is 200 OK with body { "code": 200, "message": "" }
In the ESPv2 container log I periodically see lines like these:
"D1112 20:22:43.232 24 D1112 20:44:18.625 27 external/envoy/source/common/http/codec_client.cc:57] [27][client][C23] connecting
D1112 20:44:18.625 27 external/envoy/source/common/network/connection_impl.cc:924] [27][connection][C23] connecting to 192.168.65.2:21411
D1112 20:44:18.625 27 external/envoy/source/common/network/connection_impl.cc:943] [27][connection][C23] connection in progress
D1112 20:44:20.657 27 external/envoy/source/common/network/connection_impl.cc:695] [27][connection][C23] delayed connect error: 111
D1112 20:44:20.657 27 external/envoy/source/common/network/connection_impl.cc:250] [27][connection][C23] closing socket: 0
D1112 20:44:20.657 27 external/envoy/source/common/http/codec_client.cc:108] [27][client][C23] disconnect. resetting 1 pending requests
D1112 20:44:20.657 27 external/envoy/source/common/http/codec_client.cc:140] [27][client][C23] request reset
D1112 20:44:20.657 27 external/envoy/source/common/upstream/health_checker_impl.cc:787] [27][hc][C23] connection/stream error health_flags=healthy
Is that an intended behavior?
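To make the mismatch in the log easier to spot, here is a minimal Python sketch that pulls the event and the `health_flags` value out of a health checker line. The regular expression is only inferred from the single line quoted above, and `parse_hc_line` is a hypothetical helper name, so treat the format as an assumption rather than a documented Envoy log contract.

```python
import re

# Pattern inferred from the one health checker line quoted in this issue.
# The layout is an assumption, not a documented Envoy log format.
HC_LINE = re.compile(
    r"health_checker_impl\.cc:\d+\]\s+\[\d+\]\[hc\]\[(?P<conn>C\d+)\]\s+"
    r"(?P<event>.+?)\s+health_flags=(?P<flags>\S+)"
)


def parse_hc_line(line):
    """Return connection id, event text and health flags, or None if no match."""
    m = HC_LINE.search(line)
    if m is None:
        return None
    return {"conn": m["conn"], "event": m["event"], "flags": m["flags"]}


sample = (
    "D1112 20:44:20.657 27 external/envoy/source/common/upstream/"
    "health_checker_impl.cc:787] [27][hc][C23] connection/stream error "
    "health_flags=healthy"
)
result = parse_hc_line(sample)
print(result)
```

For the sample line, the parsed event is a connection/stream error while the flags still read `healthy`, which is the behavior being asked about.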
I use an ingress load balancer that monitors the /healthz endpoint to check pod status. A pod is added to or removed from the load balancer based on that status. With the current setup, during a rolling update a pod that is not ready is added to the balancer, which causes request timeouts: the backend is not ready yet, but ESPv2 reports it as ready. A couple of seconds later, after ESPv2 has run a couple of checks against the backend and changed the status from healthy to unhealthy, the pod is removed from the load balancer.
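One possible workaround, sketched here under assumptions and not something ESPv2 provides: gate readiness on a direct TCP check of the backend port in addition to /healthz, so the pod is not reported ready before the backend accepts connections. The function names, the URL, and the way the probe would be wired into the load balancer are all hypothetical.

```python
import socket
import urllib.request


def backend_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to the backend can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthz_ok(url, timeout=1.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False


def pod_ready(healthz_url, backend_host, backend_port):
    """Hypothetical readiness gate: backend reachable AND /healthz healthy."""
    return backend_port_open(backend_host, backend_port) and healthz_ok(healthz_url)
```

With this kind of gate, the initial 200 from /healthz alone would not be enough to mark the pod ready. A plain TCP connect is weaker than the gRPC health check ESPv2 runs, so it only covers the startup window described above.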
Thanks!