[ENH] - Add ability to spinup dask workers in a single AZ (AWS) #1388

Closed
Tracked by #1309
aktech opened this issue Aug 3, 2022 · 3 comments · Fixed by #1428
Labels
area: integration/Dask Issues related to Dask on QHub provider: AWS type: enhancement 💅🏼 New feature or request

Comments

@aktech
Member

aktech commented Aug 3, 2022

Feature description

Ability to spin up Dask workers in a single Availability Zone in AWS.

Value and/or benefit

When running data-intensive tasks on Dask workers, the workers are often spun up across several AZs (Availability Zones). This can cause a lot of data transfer between AZs, which is not cheap.

This ability would make spinning up a large number of Dask workers much more cost-efficient.

Anything else?

No response

@aktech aktech added provider: AWS type: enhancement 💅🏼 New feature or request area: integration/Dask Issues related to Dask on QHub labels Aug 3, 2022
@iameskild
Member

@dharhas I believe that FAQ fixes another issue. I tried making the change that was suggested and new nodes are still split between the two AZs.

From my perspective, there is a short-term workaround and a long-term solution, the latter of which would likely require changing how we create AWS node-groups.

Short-term solution

Disable one of the network subnets for the associated Auto Scaling group.

  • To do this, in the AWS console, navigate to EC2 > Auto Scaling Groups and select the appropriate Auto Scaling group.
  • Under Network, remove all but one subnet. This forces all new nodes to spin up in that subnet (and consequently in only one AZ).

This workaround has the drawback that the associated node-group will raise a "Health Issue":

  • AutoScalingGroupInvalidConfiguration - it wants two subnets in separate AZs
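
For anyone who would rather script the console steps above, here is a minimal sketch in Python. It assumes a boto3 Auto Scaling client (`boto3.client("autoscaling")`) is passed in; the helper name `restrict_asg_to_single_subnet` and the group name in the usage comment are hypothetical and not part of QHub. It only automates the same workaround, so the same "Health Issue" drawback should be expected.

```python
def restrict_asg_to_single_subnet(client, asg_name, subnet_id=None):
    """Limit an Auto Scaling group to one subnet and return the subnet kept.

    `client` is assumed to be a boto3 Auto Scaling client (or any object
    exposing the same two methods). Hypothetical helper, not part of QHub.
    """
    resp = client.describe_auto_scaling_groups(AutoScalingGroupNames=[asg_name])
    groups = resp["AutoScalingGroups"]
    if not groups:
        raise ValueError(f"no Auto Scaling group named {asg_name!r}")

    # VPCZoneIdentifier is a comma-separated string of subnet IDs.
    raw = groups[0].get("VPCZoneIdentifier", "")
    subnets = [s.strip() for s in raw.split(",") if s.strip()]
    if not subnets:
        raise ValueError(f"{asg_name!r} has no subnets configured")

    # Keep the requested subnet, or default to the first one listed.
    keep = subnet_id or subnets[0]
    if keep not in subnets:
        raise ValueError(f"{keep!r} is not attached to {asg_name!r}")

    client.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        VPCZoneIdentifier=keep,
    )
    return keep


# Usage (requires boto3 and AWS credentials; the group name is a placeholder):
#   import boto3
#   client = boto3.client("autoscaling")
#   restrict_asg_to_single_subnet(client, "my-worker-node-group-asg")
```

Only newly launched nodes are affected; nodes already running in the other AZ stay where they are until they are scaled down.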

Long-term solution

I believe the long-term solution is to have an option to force the node-group to run in a single subnet (i.e. a single AZ). An initial attempt at this solution can be found on the aws_single_subnet branch.

@iameskild
Member

I tested the "long term solution" (on branch aws_single_subnet) and from what I can tell, all of the nodes in the worker node-group spawned in a single AZ (provided that the key single_subnet = true was set in the node-group section). It's probably worth testing this a little more to ensure there are no other unintended consequences.
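
For further testing, one way to check where the nodes landed is to count instances per AZ in the node-group's Auto Scaling group. The sketch below assumes the input is one entry from the `AutoScalingGroups` list returned by boto3's `describe_auto_scaling_groups`; the helper names and the group name in the usage comment are hypothetical.

```python
from collections import Counter


def zones_in_use(asg_description):
    """Count instances per Availability Zone for one Auto Scaling group.

    `asg_description` is assumed to be a single entry of the
    "AutoScalingGroups" list from describe_auto_scaling_groups.
    """
    instances = asg_description.get("Instances", [])
    return Counter(inst["AvailabilityZone"] for inst in instances)


def is_single_az(asg_description):
    """True when every instance in the group sits in the same AZ."""
    return len(zones_in_use(asg_description)) <= 1


# Usage (requires boto3 and AWS credentials; the group name is a placeholder):
#   import boto3
#   client = boto3.client("autoscaling")
#   resp = client.describe_auto_scaling_groups(
#       AutoScalingGroupNames=["my-worker-node-group-asg"]
#   )
#   for group in resp["AutoScalingGroups"]:
#       print(group["AutoScalingGroupName"], dict(zones_in_use(group)))
```

Running this after scaling the worker node-group up and down a few times would show whether any node ever lands in a second AZ.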
