SQS Source has unbounded parallelism #135

Closed
aserrallerios opened this issue Dec 30, 2016 · 5 comments

@aserrallerios
Contributor

aserrallerios commented Dec 30, 2016

Every time downstream pulls while the buffer is not full, a new async request is spawned using AmazonSQSAsyncClient, ignoring both the longPollingDuration and the maxBufferSize configuration. This makes the Source unusable and dangerous.

I understand that the throughput of this Source comes from issuing parallel requests through AmazonSQSAsyncClient. This client can be instantiated with a custom thread pool, but if the downstream is too fast, the Source will effectively starve other users of the same client instance. For example:

implicit val sqsClient : AmazonSQSAsyncClient = ???

alpakkaSqsSource(queueName1) ~> } ~> fastFlow ~> deleteSqsMessage
alpakkaSqsSource(queueName2) ~> }

The sqsSources will exhaust all the resources of the sqsClient. This can easily be fixed by giving each one a different sqsClient, but IMHO this should be clearly stated in the documentation, because it's not how you'd normally use this API:

alpakkaSqsSource(queueName1)(sqsClient1) ~> } ~> fastFlow ~> deleteSqsMessage(sqsClient3)
alpakkaSqsSource(queueName2)(sqsClient2) ~> }

Now we've avoided starvation, BUT if we check our SQS metrics in the AWS console we'll find astronomical polling rates (which are billed to your AWS account). This happens because, as I said before, the Source polls on every downstream pull whenever its buffer is not full.

One would expect that the Source honors the longPollingDuration and the maxBufferSize configuration in order to control its internal logic and throughput, but it fails under the most trivial scenarios:

Scenario 1:

Max buffer: 100
Current buffer: 85
Batch size: 10
Downstream pulls: 2

Outcome: 2 requests are performed, when only 1 of the responses would fit in the local buffer

Scenario 2:

Max buffer: 100
Current buffer: 0
Batch size: 10
Downstream pulls: 2

Outcome: 2 requests are performed even though the application knows nothing about the messages currently on the server, which makes the longPollingDuration parameter useless


Solution?
The Source could keep local state to track whether there are pending async requests and what the latest server response was.
I think that would be enough, and it'd make the Source behave as expected.
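A minimal sketch of the kind of local state described above, in plain Scala. Everything here is hypothetical (the `PollState` name, its fields, and the rule of allowing one request while the queue may be empty); it is not the Source's actual implementation, and emitting buffered messages downstream is left out for brevity.

```scala
// Hypothetical model for illustration only; not Alpakka's SqsSource code.
final case class PollState(
    maxBufferSize: Int,
    batchSize: Int,
    buffered: Int = 0, // messages currently held in the local buffer
    inFlight: Int = 0, // async requests still awaiting a response
    // Assumption: until a response says otherwise, treat the queue as possibly empty.
    lastResponseEmpty: Boolean = true
) {
  // Slots already promised to pending requests count against the buffer.
  private def reserved: Int = buffered + inFlight * batchSize

  def shouldPoll: Boolean = {
    val fits = reserved + batchSize <= maxBufferSize
    // While the queue may be empty, a single long-polling request is enough.
    val allowed = !lastResponseEmpty || inFlight == 0
    fits && allowed
  }

  // Downstream pulled: start a request only if shouldPoll allows it.
  def onPull: PollState =
    if (shouldPoll) copy(inFlight = inFlight + 1) else this

  // A response carrying `received` messages arrived.
  def onResponse(received: Int): PollState =
    copy(
      buffered = buffered + received,
      inFlight = inFlight - 1,
      lastResponseEmpty = received == 0
    )
}
```

With this model, Scenario 1 (`PollState(100, 10, buffered = 85, lastResponseEmpty = false)`) and Scenario 2 (`PollState(100, 10)`) each end up with a single request in flight after two pulls.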

@dpfeiffer
Contributor

@aserrallerios What do you think of a configurable number of parallel async requests? This would allow keeping throughput high if needed, but would also allow keeping costs low. I'll try to open a PR by the end of the week.

@aserrallerios
Contributor Author

There are some problems with that.

  • You'd need to configure AmazonSQSAsyncClient and alpakka Source with almost the same parameters. It feels inelegant.
  • You achieve the same thing configuring the AmazonSQSAsyncClient thread pool in the first place.
  • Even though parallelism would be explicitly bounded, it still wouldn't be an optimal usage of resources.
  • IMO parallelism shouldn't be higher than buffer max size / batch max size, it doesn't make any sense to me.

What would you think of a constructor that doesn't take an AmazonSQSAsyncClient as a parameter, and instead instantiates one with the recommended parameters (for the thread pool and for the HTTP client)? That would fix one of the problems: the parallelism would be bounded by a single "parallelism" parameter, or even calculated as buffer max size / batch max size, and users of the alpakka API would not starve their other usages of AmazonSQSAsyncClient.

Then, optimizing the usage of the queue would be a different thing, but we'd be pretty close if the max parallelism is calculated and not a user setting. The only thing remaining would be to keep track of the remote queue: if it's empty (the last request returned 0 results, for example), just limit parallelism to 1.
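The derived bound suggested here can be written down in a few lines. This is only a sketch of the arithmetic; `DerivedParallelism` and its method names are made up for illustration and are not part of alpakka.

```scala
// Hypothetical helper illustrating the proposal; names are not from alpakka.
object DerivedParallelism {
  // Upper bound: buffer max size / batch max size (integer division).
  def maxParallelism(maxBufferSize: Int, maxBatchSize: Int): Int = {
    require(maxBatchSize > 0, "maxBatchSize must be positive")
    require(maxBufferSize >= maxBatchSize, "the buffer must hold at least one batch")
    maxBufferSize / maxBatchSize
  }

  // Best effort: after an empty response, fall back to one long-polling request.
  def effectiveParallelism(
      maxBufferSize: Int,
      maxBatchSize: Int,
      lastResponseEmpty: Boolean
  ): Int =
    if (lastResponseEmpty) 1 else maxParallelism(maxBufferSize, maxBatchSize)
}
```

For a buffer of 100 and batches of 10, this allows at most 10 concurrent requests, and a single one after an empty response.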

@dpfeiffer
Contributor

Originally I decided to pass in the AmazonSQSAsyncClient so that it could be used in other places as well, so you don't have to spin up two thread pools when you simply want to achieve something like this:

SqsSource(...)
  .via(foo)
  .via(deleteMessageFromQueue)
  .runWith(...)

I agree that configuring everything correctly will get really hard if we add a parallelism parameter and also allow passing in an AmazonSQSAsyncClient. Maybe we could provide both constructors, one that takes a custom client and one that doesn't?

I wouldn't calculate the parallelism from buffer max size / batch max size, because that makes it opaque and even harder to configure when we provide 2 different constructors. A separate parallelism setting (with a check that the max buffer size isn't exceeded when all requests have finished) would allow more transparent configuration.
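Such a check could look roughly like the following. `PollingSettings` is a hypothetical class used only to illustrate the idea; it is not the real SqsSourceSettings, and the field names are assumptions.

```scala
// Hypothetical settings class for illustration; not the real SqsSourceSettings.
final case class PollingSettings(maxBufferSize: Int, maxBatchSize: Int, parallelism: Int) {
  require(maxBatchSize > 0, "maxBatchSize must be positive")
  require(parallelism > 0, "parallelism must be positive")
  // If every in-flight request returns a full batch, the results must still fit.
  require(
    parallelism * maxBatchSize <= maxBufferSize,
    s"parallelism ($parallelism) * maxBatchSize ($maxBatchSize) must not exceed maxBufferSize ($maxBufferSize)"
  )
}
```

Here `PollingSettings(100, 10, 10)` is accepted, while `PollingSettings(100, 10, 11)` fails with an IllegalArgumentException at construction time.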

I am not sure we can infer a limited parallelism from an empty response. As far as I know, an empty response can also happen when the queue is not empty (maybe we need a more sophisticated way of doing that).

@aserrallerios
Contributor Author

aserrallerios commented Jan 9, 2017

An empty response means that the remote queue may be empty, and you can rely on long polling until it has messages again. Another option is to query the queue directly through the AWS API, but the result would be the same: you cannot assume that the queue is empty even when the latest request reported a queue size of 0. What we have here is a best-effort approach, and I agree that it is not so important if the async requests are bounded.

A derived parallelism setting would be less transparent but reasonable: you'd always comply with the max buffer setting. The check that you propose sounds like an "ignore max buffer size" check to me.

@dpfeiffer
Contributor

@aserrallerios I'll try out a few options on the weekend.

ktoso pushed a commit that referenced this issue Mar 23, 2017
* Create a custom thread pool for the SqsSource and limit concurrency with the buffer size

* Provide AWSCredentials for all SqsSource tests

* Provide better java api

* Use ArrayDeque as FIFO queue
Messages for asserts in SqsSourceSettings

* Remove AmazonSQSAsyncClient from factory method

- Correct the javadoc

* Add implicit AmazonSQSAsync to factory method

- It now conforms again with the SqsSink
- Passing the clients from extern seems the preferred way (awslambda module)

* Update documentation to reflect thread pool usage

* Improvements from review by ktoso

- Replace IntStream with Source.range
- Use Sink.head instead of Sink.seq
@ktoso ktoso added this to the 0.7 milestone Mar 23, 2017
ennru pushed a commit to ennru/alpakka that referenced this issue Mar 27, 2017
s-soroosh added a commit to s-soroosh/alpakka that referenced this issue Jun 12, 2017
- Set default settings as default parameter
- Improve tests

Implement java dsl

[FTP] Critical fix for infinite loop of traversing "." and ".." directories

Upgrade to aws-java-sdk-dynamodb 1.11.106

Allow to pass a `SSLSocketFactory` to `MqttConnectionSettings`

FTP - attribute enrichment of FTPFile akka#153

Add KairosDB connector

= akka#135 Limit parallelism for the SqsSource (akka#163)

* Create a custom thread pool for the SqsSource and limit concurrency with the buffer size

* Provide AWSCredentials for all SqsSource tests

* Provide better java api

* Use ArrayDeque as FIFO queue
Messages for asserts in SqsSourceSettings

* Remove AmazonSQSAsyncClient from factory method

- Correct the javadoc

* Add implicit AmazonSQSAsync to factory method

- It now conforms again with the SqsSink
- Passing the clients from extern seems the preferred way (awslambda module)

* Update documentation to reflect thread pool usage

* Improvements from review by ktoso

- Replace IntStream with Source.range
- Use Sink.head instead of Sink.seq

=pro update akka http to 10.0.5 (akka#230)

Make Travis fail build on code format differences

=sqs fix typo in require in SqsSourceSettings (akka#228)

FTP - toPath sink akka#182

Add GCE Pubsub with publish and subscribe.

improve naming consistency of private vars.

PR feedback. Improve java api, java examples, better json marshalling.

use mapAsyncUnordered

s3: provide access to the returned response

s3: clean up test log

s3: add javadsl for request()

Time goes by, next try

Update connectors.md (akka#237)

Make external libs more visible (reactive kafka) (akka#229)

* Update TOC depth to 3, to show Reactive Kafka

Now people don't notice Kafka is in here since it's "external", expanding the TOC one more level makes it more visible.

WDYT?

* Update index.md

FTP: make lastModified test more robust (fixes akka#236)

Add SqsAckSink (akka#129)

Add SqsAckSink

* update elasticmq version
* update dependencies

Added possibility configure Sftp connection using private key akka#197.

- Added SftpIdentity case class to allow configuring private/public key,
- Added option to configure known_hosts file and test to check its usage.
- Added spec that should fail password based authentication and revert to private key one,
- Added docs paragraph to describe this option.

Upgrade to scalafmt 0.6.6

Remove deperecated binPack.callSite scalafmt setting

Format with updated scalafmt and fixed settings

S3 - add documentation akka#103

Fix alphabetical ordering in the docs

Separate out release docs

S3 path style access akka#64

Make SqsSourceTest less likely to fail

- Reduce amount of sent messages to 1 (multiple batch streaming is tested in the SqsSourceSpec)
- Increase timeout

Introduced "secure" boolean property for S3 which controls whether HTTPS is used akka#247

README: add scaladex, travis badges

And make docs links less scary to click on :)

Add CSV data transformation module (akka#213)

* Alpakka Issue akka#66: CSV component

* Alpakka Issue akka#66: revised CSV parser

* Alpakka Issue akka#60: CSV parsing stage

* wait for line end before issuing line

As the byte string may not contain a whole line the parser needs to read until a line end is reached.

* Add Java API and JUnit test; add a bit of documentation

* Introduce CsvToMap stage; more documentation

* Parse line even without line end at upstream finish

* Add Java API for CsvToMap; more documentation

* More restricted API, incorporated comments by @johanandren

* Format sequence as CSV in ByteString

* Add Scala CSV formatting stage

* Add Java API for CSV formatting; more docs

* Separate enums for Java and Scala DSLs

* Use Flow.fromGraph to construct flow

* Rename CsvFraming to CsvParsing

* Check for Byte Order Mark and ignore it for UTF-8

* Emit Byte Order Marks in formatting; CsvFormatting is just a map

* Byte Order Mark for Java API

* Add line number to error messages; sample files exported from third party software

* Use Charset directly instead of name

* csv: autoformatted files

* simplified dependency declaration

Fixes akka#60.

SQS flows + Embedded ElasticMQ akka#255

* Add a flow stage and use ElasticMQ
* Use flow-based stage for ACKs
* Use AmazonSQSAsync instead of AmazonSQSAsyncClient
* Using embedded ElasticMQ for tests

add SNS connector with publish sink akka#204

Await futures before performing assertion (fixes akka#235)

When an assertion fails after the test has already succeeded it will be
ignored, so Await the future before continuing with the check.

Document 'docker-compose' for running tests

Fail MqttSourceStage mat. value on connection loss

And increase the timeout. Might help with akka#189, or otherwise help generate a
better error message when it does happen again.

Ref akka#2 add IronMq integration

Refs akka#2 add at-least-one semantic to IronMq connector

Improve documentation and test coverage for IronMq ref akka#2

- Document the IronMq domain classes
- Document IronMq client
- Test the at-least-once producer/consumer mechanism
- Improve the IronMQ connector documentation

Ref akka#2 Preserve newline in reference.conf

Ref akka#2 Make seure the actor system is fully terminated after each test

Ref akka#2 Reformat code

Refs akka#2 define a different Committable and CommittableMessage for Java and Scala DSL

Refs akka#2 Fix typos in IronMQ documentations

Refs akka#2 Remove non needed Environment variables from TravisCI config file

Refs akka#2 Add a simple Java test and refactor the Java DSL to looks better in Java

FTP: Attempt to fix flaky test on Travis

Link to scaladex (akka#266)

s3: support for encryption, storage class, custom headers akka#109

s3: Added support for partial file download from S3 akka#264 (akka#265)

Add version info and links in index page (akka#273)

FTP - append mode for toPath sink + improved upstream failure handling akka#207

Fix broken recovery of EventSource (sse)

Replace scala.binaryVersion with scalaBinaryVersion (see akka#278)

Fix minor typo in alpakka MQTT Connector doc

Add Flow to support RabbitMQ RPC workflow akka#160

Changes Amqp sinks to materialize to Future[Done]. As currently it was
very difficult to determine when/if a sink failed due to a amqp error.

AMQP: add more options to configuration of the ConnectionFactory, akka#191

Directory sources akka#272

sse: Upgrade to Akka SSE 3 and make test more robust

CSV: Fixes ignored second double quote

S3: add listBucket method to S3 library (akka#253)

* Added recursive listBucket call to get all keys under a specific prefix.

* Properly using the request URI method and constructing queries with the Query type

* Added tests around query parsing

* Fixed formatting and removed recoverWithRetries on listbucket calls as they are already retried on the underlying layer with max-retries

* Using signAndGetAs instead of signAndGet as to not duplicate logic.

* Implemented quick fixes based on comments. Removed recursive call to get keys and used unfoldAsync to get all keys to run in constant memory.

* Added execution context. Fixed broken test

* Fixed formatting error.

* Cleaned up lisBucket call by added a ListBucketState object instead of the brutal type signature from earlier.

* Moved trait for listBucket into the def itself as to remove it from the public namespace.

azure-storage-queue connector akka#280

Add attribute parameters to sqs source settings akka#302

Formatting fix for akka#302

Streaming XML parser and utilities.

Prepare XML parser to join Alpakka family

Remove duplicated region argument in client methods akka#297

Build with Akka 2.5 as well

Add Azure Storage Queue documentation to TOC

Stub documentation for S3.listBucket

S3: fix formatting

Run the deployment only against Akka 2.4

PubSub: Add support for emulator host variables

Initial commit for apache geode connector

CSV: Emit all lines on completion akka#315

XML: make code in tests more consistent

Add whitesource plugin

Merge branch 'master' into add-kairosdb-connector

Add copyright header

update docker-compose

Make execution context optional in java api

Make execution context optional in scala api

remove ec from sink spec