Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting "random" ApiCallTimeoutException from SQS polling #1555

Closed
fbn-roussel opened this issue Dec 23, 2019 · 3 comments
Closed

Getting "random" ApiCallTimeoutException from SQS polling #1555

fbn-roussel opened this issue Dec 23, 2019 · 3 comments
Labels
closed-for-staleness guidance Question that needs advice or information. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days.

Comments

@fbn-roussel
Copy link

fbn-roussel commented Dec 23, 2019

I have an ECS Fargate service running with multiple threads using long polling on multiple queues and I am facing some random error messages related to ApiCallTimeoutException.

Number of polled queues : 6
Long polling duration : 20s
Api call timeout configured : 30s
Number of polling threads by queues : 10

Expected Behavior

I should not get any timeout as my api call timeout is greater than my long polling duration.

Current Behavior

I get random (at least to me) timeouts with following stacktrace :

Unable to poll messages: queueUrl=https://sqs.eu-central-1.amazonaws.com/xxxxxxx/xxxxxx
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution did not complete before the specified timeout configuration: 30000 millis
software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:87)
software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:147)
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:240) 
software.amazon.awssdk.core.client.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:96) 
software.amazon.awssdk.core.client.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:120) 
software.amazon.awssdk.core.client.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:73)
software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:44) 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:55)
software.amazon.awssdk.services.sqs.DefaultSqsClient.receiveMessage(DefaultSqsClient.java:1046) 
com.xxx.MessageSubscriber.consume(MessageSubscriber.java:105) 
com.xxx.MessageSubscriber.lambda$null$1(MessageSubscriber.java:74)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

Steps to Reproduce (for bugs)

SQS client configuration

public SqsClient amazonSqsClient() {
    return SqsClient.builder()
      .region(Region.of("eu-central-1"))
      .overrideConfiguration(ClientOverrideConfiguration.builder()
        .apiCallTimeout(Duration.ofMillis(30000))
        .build())
      .build();
  }

SQS poller thread method

  private void consume(final String queueUrl) {
      final ReceiveMessageRequest receiveMessageRequest = ReceiveMessageRequest.builder()
        .queueUrl(queueUrl)
        .maxNumberOfMessages(10)
        .waitTimeSeconds(20)
        .visibilityTimeout(60)
        .build();

    while (true) {
      try {
          log.debug("Polling messages: queueUrl={}", queueUrl);
          final ReceiveMessageResponse receiveMessageResult = sqsClient.receiveMessage(receiveMessageRequest);
          if (receiveMessageResult != null && CollectionUtils.isNotEmpty(receiveMessageResult.messages())) {
            // process message here
          }
          log.debug("Messages polled: queueUrl={}", queueUrl);
        }
      } catch (final Exception e) {
        log.error("Unable to poll messages: queueUrl={}", queueUrl, e);
        TimeUnit.MINUTES.sleep(1);
    }
  }

Context

When exception occurs, it goes to the catch block, sleep for 1 minute and then continue polling.

I am not sure but I think I was not getting these errors when I was using V1 of the SDK.

Your Environment

  • AWS Java SDK version used: 2.10.12
  • JDK version used: 8
  • Operating System and version: ECS Fargate service
@dagnir dagnir added the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Dec 30, 2019
@dagnir
Copy link
Contributor

dagnir commented Dec 30, 2019

The API call timeout takes into account any retries performed as well. It's possible some requests are failing and getting automatically retried, which is causing the total API call execution time to exceed 30s. I think one possible solution would be to set the API call timeout to be 30s x maxNumberOfRetries, which is 3 by default, and then set an apiCallAttemptTimeout of 30s so each individual HTTP request doesn't take more than 30s.

@debora-ito
Copy link
Member

@fbn-roussel Apologies for the lack of replies in here. Were you able to try the suggestions? Are you still experiencing the issue?

@debora-ito debora-ito added guidance Question that needs advice or information. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Oct 8, 2020
@github-actions
Copy link

It looks like this issue hasn’t been active in longer than a week. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please add a comment to prevent automatic closure, or if the issue is already closed please feel free to reopen it.

@github-actions github-actions bot added closing-soon This issue will close in 4 days unless further comments are made. closed-for-staleness and removed closing-soon This issue will close in 4 days unless further comments are made. labels Oct 15, 2020
aws-sdk-java-automation added a commit that referenced this issue Jul 21, 2021
…5f47ced8e

Pull request: release <- staging/685db2ad-cada-4e10-a824-ad15f47ced8e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-for-staleness guidance Question that needs advice or information. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days.
Projects
None yet
Development

No branches or pull requests

3 participants