Let KnowledgeDirectoryConnection deal with GOAWAY received better. #476

Open
bnouwt opened this issue Feb 9, 2024 · 1 comment

bnouwt commented Feb 9, 2024

Using cloud services means that services are sometimes temporarily unavailable. We ran into this with the Knowledge Directory, where a request failed with the message GOAWAY received. Apparently, this is something we just have to deal with.

Currently, when a request fails with GOAWAY received, the KnowledgeDirectoryConnection returns an empty list, and as a result all RemoteKerConnection objects are removed from the RemoteKerConnectionManager. I think that is too drastic for a temporary failure. Ideally we would handle this inside the KnowledgeDirectoryConnection, but if that is not possible we should handle it somewhere else.
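
One possible direction, as a minimal sketch only: let the lookup distinguish "the request failed" from "the directory has no other runtimes", and only replace the known connections after a successful lookup. The names below (DirectoryClient, PeerManager, tryFetch) and the use of plain strings as runtime ids are hypothetical stand-ins, not the actual Knowledge Engine classes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for illustration; not the real Knowledge Engine API.
class DirectoryLookupSketch {

    /** Hypothetical abstraction of the HTTP call to the Knowledge Directory. */
    interface DirectoryClient {
        List<String> fetchRuntimeIds() throws IOException;
    }

    /** Returns Optional.empty() on failure, so callers can tell "failed" from "no peers". */
    static Optional<List<String>> tryFetch(DirectoryClient client) {
        try {
            return Optional.of(client.fetchRuntimeIds());
        } catch (IOException e) {
            // Transient failure (e.g. "GOAWAY received"): report "unknown", not "empty".
            return Optional.empty();
        }
    }

    /** Hypothetical manager that only replaces its known peers after a successful lookup. */
    static class PeerManager {
        private List<String> knownPeers = new ArrayList<>();

        void refresh(DirectoryClient client) {
            // On failure the Optional is empty and the previous peers are kept.
            tryFetch(client).ifPresent(ids -> knownPeers = new ArrayList<>(ids));
        }

        List<String> knownPeers() {
            return List.copyOf(knownPeers);
        }
    }

    public static void main(String[] args) {
        PeerManager manager = new PeerManager();
        manager.refresh(() -> List.of("ker-a", "ker-b"));
        // Simulate the failure described in this issue.
        manager.refresh(() -> { throw new IOException("GOAWAY received"); });
        System.out.println(manager.knownPeers()); // prints [ker-a, ker-b]
    }
}
```

With this shape, a single failed lookup leaves the existing connections untouched, while a successful lookup that really returns an empty list still clears them.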

bnouwt self-assigned this Feb 27, 2024

bnouwt commented Mar 1, 2024

The exception looks like this when querying the Knowledge Directory for recent KER info:

2024-02-05T11:03:21.997224000Z 2024-02-05 11:03:21:996 +0000 [pool-11-thread-1] WARN KnowledgeDirectoryConnection - Was not able to retrieve KnowledgeEngineRuntimeConnectionDetails
2024-02-05T11:03:21.997354000Z java.io.IOException: /xx.xx.xx.x:48088: GOAWAY received
2024-02-05T11:03:21.997499000Z 	at java.net.http/jdk.internal.net.http.HttpClientImpl.send(HttpClientImpl.java:586)
2024-02-05T11:03:21.997578000Z 	at java.net.http/jdk.internal.net.http.HttpClientFacade.send(HttpClientFacade.java:123)
2024-02-05T11:03:21.997623000Z 	at eu.knowledge.engine.smartconnector.runtime.messaging.KnowledgeDirectoryConnection.getKnowledgeEngineRuntimeConnectionDetails(KnowledgeDirectoryConnection.java:160)
2024-02-05T11:03:21.997676000Z 	at eu.knowledge.engine.smartconnector.runtime.messaging.KnowledgeDirectoryConnection.getOtherKnowledgeEngineRuntimeConnectionDetails(KnowledgeDirectoryConnection.java:175)
2024-02-05T11:03:21.997716000Z 	at eu.knowledge.engine.smartconnector.runtime.messaging.RemoteKerConnectionManager.queryKnowledgeDirectory(RemoteKerConnectionManager.java:104)
2024-02-05T11:03:21.997770000Z 	at eu.knowledge.engine.smartconnector.runtime.messaging.RemoteKerConnectionManager.lambda$scheduleQueryKnowledgeDirectory$1(RemoteKerConnectionManager.java:81)
2024-02-05T11:03:21.997813000Z 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
2024-02-05T11:03:21.997852000Z 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2024-02-05T11:03:21.997892000Z 	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
2024-02-05T11:03:21.997923000Z 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
2024-02-05T11:03:21.997978000Z 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2024-02-05T11:03:21.998022000Z 	at java.base/java.lang.Thread.run(Thread.java:840)
2024-02-05T11:03:21.998060000Z Caused by: java.io.IOException: /xx.xx.xx.x:48088: GOAWAY received
2024-02-05T11:03:21.998104000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.handleGoAway(Http2Connection.java:1008)
2024-02-05T11:03:21.998153000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.handleConnectionFrame(Http2Connection.java:873)
2024-02-05T11:03:21.998205000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.processFrame(Http2Connection.java:748)
2024-02-05T11:03:21.998250000Z 	at java.net.http/jdk.internal.net.http.frame.FramesDecoder.decode(FramesDecoder.java:155)
2024-02-05T11:03:21.998292000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$FramesController.processReceivedData(Http2Connection.java:232)
2024-02-05T11:03:21.998333000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.asyncReceive(Http2Connection.java:674)
2024-02-05T11:03:21.998374000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.processQueue(Http2Connection.java:1310)
2024-02-05T11:03:21.998413000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
2024-02-05T11:03:21.998455000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
2024-02-05T11:03:21.998491000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
2024-02-05T11:03:21.998526000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:303)
2024-02-05T11:03:21.998573000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:256)
2024-02-05T11:03:21.998610000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.runOrSchedule(Http2Connection.java:1328)
2024-02-05T11:03:21.998651000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.onNext(Http2Connection.java:1354)
2024-02-05T11:03:21.998711000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.onNext(Http2Connection.java:1288)
2024-02-05T11:03:21.998757000Z 	at java.net.http/jdk.internal.net.http.common.SSLTube$DelegateWrapper.onNext(SSLTube.java:210)
2024-02-05T11:03:21.998801000Z 	at java.net.http/jdk.internal.net.http.common.SSLTube$SSLSubscriberWrapper.onNext(SSLTube.java:492)
2024-02-05T11:03:21.998842000Z 	at java.net.http/jdk.internal.net.http.common.SSLTube$SSLSubscriberWrapper.onNext(SSLTube.java:295)
2024-02-05T11:03:21.998885000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper$DownstreamPusher.run1(SubscriberWrapper.java:316)
2024-02-05T11:03:21.998939000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper$DownstreamPusher.run(SubscriberWrapper.java:259)
2024-02-05T11:03:21.998989000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
2024-02-05T11:03:21.999034000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
2024-02-05T11:03:21.999074000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
2024-02-05T11:03:21.999117000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:303)
2024-02-05T11:03:21.999156000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:256)
2024-02-05T11:03:21.999200000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper.outgoing(SubscriberWrapper.java:232)
2024-02-05T11:03:21.999248000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper.outgoing(SubscriberWrapper.java:198)
2024-02-05T11:03:21.999292000Z 	at java.net.http/jdk.internal.net.http.common.SSLFlowDelegate$Reader.processData(SSLFlowDelegate.java:444)
2024-02-05T11:03:21.999339000Z 	at java.net.http/jdk.internal.net.http.common.SSLFlowDelegate$Reader$ReaderDownstreamPusher.run(SSLFlowDelegate.java:268)
2024-02-05T11:03:21.999391000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
2024-02-05T11:03:21.999427000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
2024-02-05T11:03:21.999471000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
2024-02-05T11:03:21.999517000Z 	... 3 more

And it looks like this when renewing our lease at the Knowledge Directory (KD):

2024-02-05T07:15:11.735090000Z 2024-02-05 07:15:11:734 +0000 [pool-16-thread-1] WARN KnowledgeDirectoryConnection - Could not renew lease at Knowledge Directory https://kd.cyber-grid.com
2024-02-05T07:15:11.735301000Z java.io.IOException: /xx.xx.xx.x:33082: GOAWAY received
2024-02-05T07:15:11.735346000Z 	at java.net.http/jdk.internal.net.http.HttpClientImpl.send(HttpClientImpl.java:586)
2024-02-05T07:15:11.735387000Z 	at java.net.http/jdk.internal.net.http.HttpClientFacade.send(HttpClientFacade.java:123)
2024-02-05T07:15:11.735431000Z 	at eu.knowledge.engine.smartconnector.runtime.messaging.KnowledgeDirectoryConnection.tryRenewLease(KnowledgeDirectoryConnection.java:272)
2024-02-05T07:15:11.735471000Z 	at eu.knowledge.engine.smartconnector.runtime.messaging.KnowledgeDirectoryConnection.lambda$start$0(KnowledgeDirectoryConnection.java:110)
2024-02-05T07:15:11.735510000Z 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
2024-02-05T07:15:11.735567000Z 	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
2024-02-05T07:15:11.735611000Z 	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
2024-02-05T07:15:11.735647000Z 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
2024-02-05T07:15:11.735698000Z 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2024-02-05T07:15:11.735741000Z 	at java.base/java.lang.Thread.run(Thread.java:840)
2024-02-05T07:15:11.735776000Z Caused by: java.io.IOException: /xx.xx.xx.x:33082: GOAWAY received
2024-02-05T07:15:11.735812000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.handleGoAway(Http2Connection.java:1008)
2024-02-05T07:15:11.735854000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.handleConnectionFrame(Http2Connection.java:873)
2024-02-05T07:15:11.735889000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.processFrame(Http2Connection.java:748)
2024-02-05T07:15:11.735924000Z 	at java.net.http/jdk.internal.net.http.frame.FramesDecoder.decode(FramesDecoder.java:155)
2024-02-05T07:15:11.735967000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$FramesController.processReceivedData(Http2Connection.java:232)
2024-02-05T07:15:11.736006000Z 	at java.net.http/jdk.internal.net.http.Http2Connection.asyncReceive(Http2Connection.java:674)
2024-02-05T07:15:11.736062000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.processQueue(Http2Connection.java:1310)
2024-02-05T07:15:11.736105000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
2024-02-05T07:15:11.736149000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
2024-02-05T07:15:11.736205000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
2024-02-05T07:15:11.736251000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:303)
2024-02-05T07:15:11.736291000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:256)
2024-02-05T07:15:11.736333000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.runOrSchedule(Http2Connection.java:1328)
2024-02-05T07:15:11.736381000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.onNext(Http2Connection.java:1354)
2024-02-05T07:15:11.736427000Z 	at java.net.http/jdk.internal.net.http.Http2Connection$Http2TubeSubscriber.onNext(Http2Connection.java:1288)
2024-02-05T07:15:11.736471000Z 	at java.net.http/jdk.internal.net.http.common.SSLTube$DelegateWrapper.onNext(SSLTube.java:210)
2024-02-05T07:15:11.736510000Z 	at java.net.http/jdk.internal.net.http.common.SSLTube$SSLSubscriberWrapper.onNext(SSLTube.java:492)
2024-02-05T07:15:11.736552000Z 	at java.net.http/jdk.internal.net.http.common.SSLTube$SSLSubscriberWrapper.onNext(SSLTube.java:295)
2024-02-05T07:15:11.736595000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper$DownstreamPusher.run1(SubscriberWrapper.java:316)
2024-02-05T07:15:11.736638000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper$DownstreamPusher.run(SubscriberWrapper.java:259)
2024-02-05T07:15:11.736691000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
2024-02-05T07:15:11.736760000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
2024-02-05T07:15:11.736807000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
2024-02-05T07:15:11.736850000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:303)
2024-02-05T07:15:11.736888000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:256)
2024-02-05T07:15:11.736929000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper.outgoing(SubscriberWrapper.java:232)
2024-02-05T07:15:11.736974000Z 	at java.net.http/jdk.internal.net.http.common.SubscriberWrapper.outgoing(SubscriberWrapper.java:198)
2024-02-05T07:15:11.737012000Z 	at java.net.http/jdk.internal.net.http.common.SSLFlowDelegate$Reader.processData(SSLFlowDelegate.java:444)
2024-02-05T07:15:11.737054000Z 	at java.net.http/jdk.internal.net.http.common.SSLFlowDelegate$Reader$ReaderDownstreamPusher.run(SSLFlowDelegate.java:268)
2024-02-05T07:15:11.737092000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:205)
2024-02-05T07:15:11.737135000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
2024-02-05T07:15:11.737176000Z 	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:230)
2024-02-05T07:15:11.737218000Z 	... 3 more
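
In both traces the IOException comes out of HttpClient.send after the HTTP/2 connection handled a GOAWAY frame, which is how an HTTP/2 peer announces that it is closing the connection. A small retry wrapper around the call could be used at both call sites. This is a sketch under assumptions: RetrySketch, IoCall and withRetry are hypothetical names, and it assumes the request is safe to repeat and that a later attempt can succeed on a fresh connection.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not part of the Knowledge Engine code base.
class RetrySketch {

    /** An operation that may fail with an IOException, such as a call to HttpClient.send(...). */
    interface IoCall<T> {
        T call() throws IOException, InterruptedException;
    }

    /**
     * Runs the call up to maxAttempts times, sleeping delayMillis between attempts.
     * Rethrows the last IOException if every attempt fails.
     */
    static <T> T withRetry(IoCall<T> call, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // fixed delay; a backoff could be used instead
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger attempts = new AtomicInteger();
        // Simulated call: fails once with the message from this issue, then succeeds.
        String result = withRetry(() -> {
            if (attempts.incrementAndGet() == 1) {
                throw new IOException("GOAWAY received");
            }
            return "ok";
        }, 3, 10L);
        System.out.println(result + " after " + attempts.get() + " attempts"); // prints ok after 2 attempts
    }
}
```

Since both the directory query and the lease renewal already run on a schedule, the number of attempts and the delay would have to stay well within the scheduling interval.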
