Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New probe port for Gossip Router when TLS is enabled #1675

Merged
merged 1 commit into from
Sep 7, 2022

Conversation

pruivo
Copy link
Member

@pruivo pruivo commented Sep 6, 2022

Closes #1602

@pruivo pruivo requested a review from ryanemerson September 6, 2022 10:39
@@ -51,6 +51,8 @@ func GossipRouter(i *ispnv1.Infinispan, ctx pipeline.Context) {
}
mutateFn := func() error {
routerLabels := i.GossipRouterPodLabels()
var enableJgrpDiag bool = consts.JGroupsDiagnosticsFlag == "TRUE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type declaration is redudant, this can just be enableJgrpDiag := ...

@@ -67,6 +69,10 @@ func GossipRouter(i *ispnv1.Infinispan, ctx pipeline.Context) {

var addKeystoreVolume, addTruststoreVolume bool
if i.IsSiteTLSEnabled() {
// enable diagnostic for k8s probes
enableJgrpDiag = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always want diagnostics enabled whenever XSIte TLS is enabled?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, because I need the Gossip Router to open the port so probe can "ping" it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we need to use the diagnostic port as it's a TCP probe and when TLS is enabled there's no other way for the health check to succeed on the router pod?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The health check succeeds with TLS but the issue is the gossip router logging javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake because the probe does not do the TLS handshake (it looks like it only tries to ping it).

With this PR, I'm enabling the diagnostic port when TLS is enabled; the health check can probe that port and we avoided the exceptions in the log.

Copy link
Member

@tristantarrant tristantarrant Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception reported when using openssl is different until wildfly-security/wildfly-openssl#123 is merged

@pruivo pruivo force-pushed the t_gossip_router_diag_port branch from 41c2615 to da47ced Compare September 6, 2022 17:08
@tristantarrant
Copy link
Member

Maybe it's worth investigating a micrometer-based endpoint for the GossipRouter so that it can be scraped by Prometheus

@ryanemerson ryanemerson merged commit 0ffc882 into infinispan:main Sep 7, 2022
@ryanemerson
Copy link
Contributor

Thanks @pruivo

@pruivo pruivo deleted the t_gossip_router_diag_port branch September 7, 2022 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cross Site GossipRouter TLS: java.io.EOFException: SSL peer shut down incorrectly
3 participants