lambda processor should retry for certain class of exceptions #5320

Open
wants to merge 4 commits into main

Conversation

srikanthjg
Collaborator

@srikanthjg srikanthjg commented Jan 9, 2025

Description

Categorize exceptions as retryable and non-retryable for the Lambda plugin.
Retryable exceptions should be retried with backoff a limited number of times.

Issue

#5340

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Collaborator

@kkondaka kkondaka left a comment

Can we also have a couple of IT tests with retryable errors? If IT tests are not possible, at least LambdaProcessor tests should be possible, right?

*/
private static final Set<Integer> TIMEOUT_ERRORS = new HashSet<>(
Arrays.asList(
408, // Request Timeout
Collaborator

Is this a retryable category?

if(response == null) return false;
int statusCode = response.statusCode();
// Example logic: 429 (Too Many Requests) or 5xx => retry
return statusCode == 429 || (statusCode >= 500 && statusCode < 600);
Collaborator

Why is 429 handled in multiple places?

final InvokeResponse previousResponse,
final Logger LOG
) {
int maxRetries = config.getClientOptions().getMaxConnectionRetries();
Collaborator

Is max connection retries the same as max Lambda retries? It is possible for a connection to be established immediately but for the call to fail due to server errors. Or are we trying to use the same config parameter for both?

Collaborator Author

I guess that was bad naming for that field. The corresponding config field is "max_retries", and I am re-using it. I hope that is OK? I think there was another school of thought that we should retry indefinitely, but because this is a processor I thought it better to retry only a limited number of times. We could also keep a separate config for this and assign it a big but finite default.
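
For reference, a bounded retry with exponential backoff could look roughly like the sketch below. `BoundedRetry`, its parameters, and the delay values are hypothetical and are not taken from this PR; later comments in this thread suggest the PR relies on the SDK's internal retry handling rather than a manual loop like this one.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper (not part of this PR): retries a call a bounded number of times.
final class BoundedRetry {
    private BoundedRetry() {
    }

    // Runs the action once, then retries up to maxRetries more times while the
    // thrown exception is classified as retryable.
    static <T> T call(final Supplier<T> action,
                      final Predicate<RuntimeException> isRetryable,
                      final int maxRetries,
                      final long baseDelayMillis) {
        int attempt = 0;
        while (true) {
            try {
                return action.get();
            } catch (final RuntimeException e) {
                if (attempt >= maxRetries || !isRetryable.test(e)) {
                    throw e; // non-retryable, or retry budget exhausted
                }
                // Exponential backoff: base, 2x base, 4x base, ...
                // The shift is capped so the delay cannot overflow.
                final long delayMillis = baseDelayMillis << Math.min(attempt, 20);
                try {
                    Thread.sleep(delayMillis);
                } catch (final InterruptedException interrupted) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    throw e;
                }
                attempt++;
            }
        }
    }
}
```

With `maxRetries` set to 3, the action runs at most four times in total, which keeps the processor from blocking indefinitely on a failing function.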

@srikanthjg srikanthjg requested a review from san81 as a code owner January 16, 2025 02:38
Signed-off-by: Srikanth Govindarajan <[email protected]>
Signed-off-by: Srikanth Govindarajan <[email protected]>
@@ -98,12 +108,6 @@ public InvokeRequest getRequestPayload(String functionName, String invocationTyp
return null;
}

try {
Collaborator

Is there any advantage in moving this to another public method? One disadvantage I see is that the caller needs to know, and must remember, to call completeCodec before calling getRequestPayload, right?

Collaborator Author

I don't think we are allowed to have private methods in an interface (the Buffer interface). That is the reason for using public here.

Collaborator

Why do we want this as a separate method instead of keeping the code where it is?

/**
* Possibly a set of “bad request” style errors which might fall
*/
private static final Set<Integer> BAD_REQUEST_ERRORS = new HashSet<>(
Collaborator

It would be better to use Set.of() instead of a HashSet wrapping Arrays.asList.
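
A minimal sketch of what that could look like, assuming Java 9 or later. The class name `RetryableStatusCodes`, the grouping into two sets, and the specific codes are illustrative assumptions, not the PR's final classification; whether 408 belongs in a retryable category is still an open question in this review.

```java
import java.util.Set;

// Hypothetical class (not part of this PR); the status codes are illustrative only.
final class RetryableStatusCodes {
    // Set.of returns an immutable set and rejects duplicate elements.
    private static final Set<Integer> TIMEOUT_ERRORS = Set.of(
            408, // Request Timeout
            504  // Gateway Timeout
    );

    // Keeping 429 in exactly one set avoids checking it in multiple places.
    private static final Set<Integer> THROTTLING_ERRORS = Set.of(
            429 // Too Many Requests
    );

    private RetryableStatusCodes() {
    }

    static boolean isRetryableStatusCode(final int statusCode) {
        return TIMEOUT_ERRORS.contains(statusCode)
                || THROTTLING_ERRORS.contains(statusCode)
                || (statusCode >= 500 && statusCode < 600);
    }
}
```

Because `Set.of` throws `IllegalArgumentException` on duplicate elements, accidentally listing the same code twice within one set fails fast at class initialization.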

)
);

public static boolean isRetryable(final int statusCode) {
Collaborator

I am not seeing any references to this method. Is it unused (other than in tests)? If it is unused, we can remove it. If we have to keep it, I would recommend renaming it to isRetryableStatuscode to be in line with the other method, isRetryableException.

Collaborator Author

Yes, I am not currently using it. It was part of my previous implementation, where I had manual retry logic. I can remove it now.


@ExtendWith(MockitoExtension.class)
@MockitoSettings(strictness = Strictness.LENIENT)
Collaborator

I would recommend avoiding this if possible. The default strict stubbing forces us to avoid unwanted stubbing. If things in the setup method are not needed by every test, we can repeat them in the corresponding tests and avoid stubbing them in the setup method.

Collaborator Author

OK, sure.


@Override
public void close() {
delegate.close();
Collaborator

Should close decrement the counter?

Collaborator Author

I am using this class only for testing. I will add it, but it won't be required or used.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class CountingHttpClient implements SdkAsyncHttpClient {
Collaborator

@san81 san81 Jan 24, 2025

Is this class only for test cases? I don't see any references to it other than in tests.

Collaborator Author

Yes, this is only for tests. The reason is that SDK retries are handled internally within the SDK, so I had no way of verifying the calls. This creates a client where I can check the calls explicitly. If there is any other way, please let me know and I can try it out.
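
The counting approach described above can be sketched independently of the SDK. In the sketch below, `AsyncClient` is a hypothetical stand-in for the real HTTP client interface and `CountingClient` is an illustrative name; neither reflects the actual `SdkAsyncHttpClient` signatures or the `CountingHttpClient` in this PR.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the real async HTTP client interface.
interface AsyncClient {
    CompletableFuture<String> execute(String request);

    void close();
}

// Test-only wrapper: forwards every call to the delegate and counts the requests,
// so a test can assert how many attempts (including retries) were actually made.
final class CountingClient implements AsyncClient {
    private final AsyncClient delegate;
    private final AtomicInteger requestCount = new AtomicInteger();

    CountingClient(final AsyncClient delegate) {
        this.delegate = delegate;
    }

    @Override
    public CompletableFuture<String> execute(final String request) {
        requestCount.incrementAndGet(); // count before delegating
        return delegate.execute(request);
    }

    @Override
    public void close() {
        delegate.close();
    }

    int getRequestCount() {
        return requestCount.get();
    }
}
```

A test would wrap the real or stubbed client in the counting wrapper, trigger the call under test, and then compare `getRequestCount()` with the expected number of attempts.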

Signed-off-by: Srikanth Govindarajan <[email protected]>
3 participants