ECS reporter throttled by AWS API #2050
This is the Sock Shop, run with the CloudFormation template (3x m4.xlarge instances) in Weave Cloud.
It seems AWS is throttling us (see the throttling errors in the logs linked below).
Also, we should correct the printf string format of some of the warnings.
First 1000 lines of the logs: http://sprunge.us/SNIb
My short-term thoughts on a long-term solution: we may need to get clever here with caching and careful use of immutable fields. For example, StartedBy for a task isn't going to change, which means we don't need to DescribeTasks every time - except this means any other metadata we may want to collect will also get stale. :S
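(A minimal sketch of what that caching might look like, assuming nothing about Scope's actual ECS reporter; all type and method names below are invented for illustration.)

```go
package ecs

import "sync"

// taskInfo holds fields of an ECS task that never change after creation
// (e.g. StartedBy), so they are safe to cache indefinitely.
type taskInfo struct {
	StartedBy      string
	TaskDefinition string
}

// taskCache remembers tasks we have already described, so DescribeTasks is
// only issued for task ARNs we have not seen before.
type taskCache struct {
	mu    sync.Mutex
	tasks map[string]taskInfo // keyed by task ARN
}

func newTaskCache() *taskCache {
	return &taskCache{tasks: map[string]taskInfo{}}
}

// lookup splits the requested ARNs into already-cached entries and the ones
// that still need a DescribeTasks call.
func (c *taskCache) lookup(arns []string) (cached map[string]taskInfo, missing []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	cached = map[string]taskInfo{}
	for _, arn := range arns {
		if info, ok := c.tasks[arn]; ok {
			cached[arn] = info
		} else {
			missing = append(missing, arn)
		}
	}
	return cached, missing
}

// store records freshly described tasks so later reports skip the API call.
func (c *taskCache) store(described map[string]taskInfo) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for arn, info := range described {
		c.tasks[arn] = info
	}
}
```

As noted above, the catch is that any mutable metadata we also want to report would have to be refreshed separately or accepted as stale.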
I've worked around it for now by creating the cluster in a separate region (AWS rate-limits per region).
A robust way to fix this would be to use the ECS event stream: https://aws.amazon.com/blogs/compute/monitor-cluster-state-with-amazon-ecs-event-stream/. However, I am not sure whether, or how easily, we can plug Scope into it.
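(For context, the event-stream route would mean a CloudWatch Events rule forwarding ECS task/container state changes to an SQS queue, with something on the Scope side long-polling that queue. A rough sketch using aws-sdk-go; the queue URL, wiring and handler are assumptions, not anything Scope does today.)

```go
package ecs

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

// pollEvents long-polls the given SQS queue for ECS state-change events and
// passes each raw message body to handle. Long polling (WaitTimeSeconds)
// keeps the request volume far below per-report DescribeTasks polling.
func pollEvents(queueURL string, handle func(body string)) error {
	svc := sqs.New(session.Must(session.NewSession()))
	for {
		out, err := svc.ReceiveMessage(&sqs.ReceiveMessageInput{
			QueueUrl:            aws.String(queueURL),
			WaitTimeSeconds:     aws.Int64(20),
			MaxNumberOfMessages: aws.Int64(10),
		})
		if err != nil {
			return err
		}
		for _, m := range out.Messages {
			handle(aws.StringValue(m.Body)) // JSON task/container state-change event
			svc.DeleteMessage(&sqs.DeleteMessageInput{
				QueueUrl:      aws.String(queueURL),
				ReceiptHandle: m.ReceiptHandle,
			})
		}
	}
}
```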
I think it's too much to ask the user to create CloudWatch rules and an SQS queue as part of setup - and even if we go that route later, it'd be nice if it were optional.
Taken together, these improvements will cut down at least 50% of all queries, and likely more in most situations (since there are more tasks than services, and we won't be fetching services that aren't present on the machine).
We could cut down on requests further by allowing our data to go stale up to some refresh interval (say, 1 minute), while still doing a shortcut refresh when needed to find the correct task for a service. But I'd like to avoid stale data in the details panel if at all possible - even a single instance of that can undermine user confidence in its accuracy in all cases.
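(A sketch of that trade-off, with invented names: serve from a cache up to a maximum age, but force a refresh in the cases where staleness is unacceptable, such as resolving tasks for the details panel.)

```go
package ecs

import "time"

// serviceCache serves a possibly-stale service -> task mapping, refreshing it
// when it is older than maxAge or when the caller forces a refresh.
type serviceCache struct {
	fetched  time.Time
	services map[string][]string // service name -> task ARNs
	refresh  func() map[string][]string
	maxAge   time.Duration
}

// get returns the mapping. Pass force=true when exact data is required
// (e.g. for the details panel), accepting the extra API calls in that case only.
func (c *serviceCache) get(force bool) map[string][]string {
	if force || c.services == nil || time.Since(c.fetched) > c.maxAge {
		c.services = c.refresh()
		c.fetched = time.Now()
	}
	return c.services
}
```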
Would this be something for the launch-generator to take care of? cc @lukemarsden @errordeveloper
I don't see how, at least not in the way we are currently using the launch-generator. You cannot create CloudWatch rules from cluster resources (be it Kubernetes, ECS, or what have you).
The AWS Blox project purportedly provides a CFN template for doing this. The launch-generator could do the same, no? Or at least provide a fragment. |
Sure, we could create the AWS resources through a CFN template, but I don't think the launch-generator would be involved. Scope could detect whether the ECS SQS queue is available at startup through the presence of a parameter (a rough sketch of that check follows below). In order to propagate the SQS credentials to Scope, I guess we could:
@errordeveloper Does this make sense? If it does, let's create separate issues for it here and in https://github.com/weaveworks/integrations (we still need a minimally performant solution when SQS is not available, and I would like to use this issue for that).
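(The startup check referred to above could be as simple as the following sketch; the parameter name is invented, and how the queue URL/credentials actually reach Scope is exactly the open question in this comment.)

```go
package ecs

import "os"

// queueFromEnv returns the hypothetical SQS queue URL parameter, if set.
// When absent, the reporter would fall back to polling the ECS API with
// caching, as discussed in this issue.
func queueFromEnv() (url string, ok bool) {
	url = os.Getenv("SCOPE_ECS_EVENT_QUEUE_URL") // invented parameter name
	return url, url != ""
}
```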
A user is experiencing this even after #2065 (Scope 1.2) in a 5-node cluster: https://weaveworks.slack.com/archives/weave-users/p1486634036001678. Reopening.
See also: prometheus/prometheus#2309
@2opremio I really like the idea of using CW to keep the state of the resources if an SQS parameter is provided, and otherwise falling back to API + cache.
Some things to keep in mind:
Is every probe polling the API and getting the same information?
@bboreham I'd agree, but currently nothing stops you from running Scope probes in different clusters, in which case how can we tell which probe is in which cluster? It'd make a lot of sense to externalise the Kubernetes and ECS code into plugins and deploy just one of those per cluster.
By “just one” I meant one per cluster. |
And if the node on which that runs gets removed, we'd have to enable the polling on another one. Not easy.
The only sensible thing I can imagine would be to run these integrations outside of the probe process, as containers, and let the orchestrator take care of where they run. It's probably a little easier than doing some kind of election among the probes, but a big change nevertheless, although it could help the plugin story.
Can Kubernetes run this as a deployment of 1? Or do the plugins need to be sidecars? Obvs, not going to work for ECS...
Yes it can, as long as there is also a probe pod on the same node (which should be the case under normal conditions).
I think it could be made to work...