IMDS returns SdkClientException on 4XX errors, but reading autoscaling lifecycle state can return 404. #5786

Open
2 tasks done
benjumanji opened this issue Jan 9, 2025 · 2 comments
Labels
feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged.

Comments

benjumanji commented Jan 9, 2025

Describe the feature

The IMDS client should throw SdkServiceException whenever an HTTP response is returned from the service. Its use of SdkClientException doesn't match the description in the docs: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/handling-exceptions.html#sdkclientexception, to wit:

An SdkClientException is generally more severe than an SdkServiceException, and indicates a major problem that is preventing the client from making service calls to AWS services

A request that receives a response cannot be a client exception. The client made the call, and the service rejected it.

Use Case

The IMDS client unconditionally throws SdkClientException on 4XX errors:

throw SdkClientException.builder().message(responseContent).build();

However, once an instance has been up for some length of time, reading the autoscaling lifecycle state can fail with a 404. This is not a client error, and it is retryable. We don't want to log these, but the status code isn't accessible to us because the client threw away that information.

Proposed Solution

Return an SdkServiceException any time there is an actual service response.
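
To illustrate the shape of the request, here is a minimal sketch that uses stand-in types rather than the SDK's own classes. ServiceResponseException, ImdsResponseMapper, bodyOrThrow and isRetryableNotFound are all hypothetical names invented for this example; the real change would presumably use SdkServiceException and live inside the IMDS client's response handling.

```java
// Hypothetical stand-in (NOT an SDK class): an exception that keeps the HTTP status code.
final class ServiceResponseException extends RuntimeException {
    private final int statusCode;

    ServiceResponseException(int statusCode, String body) {
        super(body);
        this.statusCode = statusCode;
    }

    int statusCode() {
        return statusCode;
    }
}

// Hypothetical helper showing where the status code would be preserved.
final class ImdsResponseMapper {
    private ImdsResponseMapper() {
    }

    // Returns the body for 2xx responses; otherwise throws with the status code attached.
    static String bodyOrThrow(int statusCode, String body) {
        if (statusCode >= 200 && statusCode < 300) {
            return body;
        }
        throw new ServiceResponseException(statusCode, body);
    }

    // Caller-side policy described in this issue: a 404 is retryable and not worth logging.
    static boolean isRetryableNotFound(Throwable t) {
        return t instanceof ServiceResponseException
                && ((ServiceResponseException) t).statusCode() == 404;
    }
}
```

With the status code carried on the exception, a caller can decide per status whether to retry quietly or log, instead of inspecting the message text.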

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

AWS Java SDK version used

2

JDK version used

17

Operating System and version

linux 6.11 (nixos)

@benjumanji benjumanji added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jan 9, 2025
Member

debora-ito commented Jan 14, 2025

@benjumanji do you have a stacktrace for the specific exception you have in mind?

@debora-ito debora-ito added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. and removed needs-triage This issue or PR still needs to be triaged. labels Jan 14, 2025
Author

benjumanji commented Jan 21, 2025

Do you mean a stack trace illustrating the question? Or the stack trace I would like to have in the future? Assuming it is the former:

{
  "level": "ERROR",
  "ts": "1736891693048",
  "name": "hypervolt.aws.imds.AwsMetadataClient",
  "message": "Failed to request target state from aws metadata service, will try again in 10 seconds seconds",
  "stacktrace": {
    "class": "software.amazon.awssdk.core.exception.SdkClientException",
    "message": "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\"\n\t\t \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\">\n <head>\n  <title>404 - Not Found</title>\n </head>\n <body>\n  <h1>404 - Not Found</h1>\n </body>\n</html>\n",
    "backtrace": [
      "at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)",
      "at software.amazon.awssdk.imds.internal.AsyncHttpRequestHelper.handleResponse(AsyncHttpRequestHelper.java:89)",
      "at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.lambda$prepare$0(AsyncResponseHandler.java:92)",
      "at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)",
      "at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)",
      "at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)",
      "at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler$BaosSubscriber.onComplete(AsyncResponseHandler.java:135)",
      "at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$DataCountingPublisher$1.onComplete(ResponseHandler.java:519)",
      "at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:254)",
      "at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.access$600(ResponseHandler.java:77)",
      "at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$PublisherAdapter$1.onComplete(ResponseHandler.java:375)",
      "at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.publishMessage(HandlerPublisher.java:402)",
      "at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.flushBuffer(HandlerPublisher.java:338)",
      "at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.receivedDemand(HandlerPublisher.java:291)",
      "at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.access$200(HandlerPublisher.java:61)",
      "at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher$ChannelSubscription$1.run(HandlerPublisher.java:495)",
      "at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)",
      "at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)",
      "at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)",
      "at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566)",
      "at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)",
      "at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)",
      "at java.base/java.lang.Thread.run(Thread.java:833)",
      "at delay @ hypervolt.aws.imds.AwsMetadataClient.readMetadata(AwsMetadataClient.scala:47)",
      "at fromCompletableFuture @ hypervolt.aws.imds.AwsMetadataClient.readMetadata(AwsMetadataClient.scala:47)",
      "at map @ hypervolt.aws.imds.AwsMetadataClient.readMetadata(AwsMetadataClient.scala:47)"
    ],
    "cause": null
  }
}

Note that this is an SdkClientException, so there is no status code member to interrogate. The best we can do is string matching on the message, which is objectively bad.
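
For reference, the string-matching workaround looks roughly like the sketch below. It assumes the error body is always the XHTML page shown in the log above, with the status in the title element. ImdsErrorBodies and statusFromMessage are hypothetical names, and the approach is fragile because it breaks as soon as the error page format changes.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: recovers the status code from the exception message text.
final class ImdsErrorBodies {
    // Matches the start of a title such as "<title>404 - Not Found</title>".
    private static final Pattern TITLE_STATUS = Pattern.compile("<title>\\s*(\\d{3})\\s*-");

    private ImdsErrorBodies() {
    }

    // Returns the three-digit status found in the title, or empty if none is present.
    static OptionalInt statusFromMessage(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher m = TITLE_STATUS.matcher(message);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }
}
```

A caller would pass the exception's getMessage() value and treat a result of 404 as retryable; anything that does not parse falls back to the normal error path.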

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Jan 21, 2025
@debora-ito debora-ito added the needs-triage This issue or PR still needs to be triaged. label Jan 21, 2025