Strategy htex_auto_scale removing blocks far too slow #2195

Closed
jrueb opened this issue Jan 20, 2022 · 0 comments · Fixed by #2196

jrueb commented Jan 20, 2022

Describe the bug
The "htex_auto_scale" strategy is very slow to remove blocks when there are more slots than tasks. It removes at most one block every 5 seconds.

To Reproduce
Steps to reproduce the behavior:

  1. Run the test script
  2. Check parsl.log and wait until the first handful of tasks have completed
  3. The log shows that removing the blocks takes around a minute after the idle timeout, even though all tasks but one have finished. Most of the blocks are idle.

Expected behavior
All idle blocks are removed as soon as possible. This becomes really important if you are dealing with, say, 500 blocks, most of which are idle.
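
To put a rough number on this, here is a back-of-the-envelope sketch. It assumes the strategy removes at most one block per 5-second pass, as described above, and it ignores the initial max_idletime wait. The function name and its parameters are made up for illustration and are not part of Parsl.

```python
import math


def drain_time_seconds(idle_blocks: int,
                       interval_s: float = 5.0,
                       blocks_per_pass: int = 1) -> float:
    """Estimate the time needed to remove `idle_blocks` blocks when at most
    `blocks_per_pass` blocks are removed every `interval_s` seconds."""
    passes = math.ceil(idle_blocks / blocks_per_pass)
    return passes * interval_s


# Test script case, if all 20 blocks were started and one is busy with long().
print(drain_time_seconds(19))   # 95.0
# Hypothetical large run: 500 blocks, one still busy.
print(drain_time_seconds(499))  # 2495.0
```

Under these assumptions, a 500-block run would keep idle blocks around for roughly 40 minutes after they became eligible for removal.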

Environment
Parsl 1.2.0, Python 3.8

Test script

import parsl
from parsl.providers import LocalProvider
from parsl.channels import LocalChannel
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.app.app import python_app


local_htex = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_Local",
            worker_debug=True,
            max_workers=1,
            provider=LocalProvider(
                channel=LocalChannel(),
                init_blocks=2,
                max_blocks=20,
            ),
        )
    ],
    strategy="htex_auto_scale",
    max_idletime=3,
)
parsl.load(local_htex)


@python_app
def long():
    import time
    time.sleep(10000)


@python_app
def short():
    import time
    time.sleep(1)


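# One long-running task keeps a single worker busy for the whole run.
a = long()
# Twenty short tasks make the strategy scale out; once they finish,
# most blocks are idle and should be scaled in.
b = [short() for i in range(20)]
a.result()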
jrueb added the bug label Jan 20, 2022
benclifford pushed a commit that referenced this issue Jan 26, 2022
For htex_auto_scale, instead of removing at most one block when there are more slots than tasks, try to remove enough blocks that the number of slots matches the number of tasks. This is of course still limited by how many blocks have reached the configured max_idletime.

Fixes #2195
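
The idea in the commit message can be sketched as a small calculation. This is only an illustration of the described behavior, not the code from the referenced commit. The function and parameter names are hypothetical, and the sketch assumes every block provides the same number of slots.

```python
import math


def blocks_to_remove(active_tasks: int,
                     active_blocks: int,
                     slots_per_block: int,
                     idle_blocks: int,
                     min_blocks: int = 0) -> int:
    """Number of blocks to scale in during one strategy pass (sketch).

    `idle_blocks` is the number of blocks that have already been idle for
    at least max_idletime; only those are eligible for removal.
    """
    # Blocks required to give every outstanding task a slot.
    needed = max(math.ceil(active_tasks / slots_per_block), min_blocks)
    # Blocks beyond what the current load requires.
    excess = max(active_blocks - needed, 0)
    # Never remove more blocks than have reached the idle timeout.
    return min(excess, idle_blocks)


# Test script case: 1 task left, 20 blocks with 1 slot each, 19 past the timeout.
print(blocks_to_remove(1, 20, 1, 19))  # 19
# Same load, but only 5 blocks have been idle long enough so far.
print(blocks_to_remove(1, 20, 1, 5))   # 5
```

With this rule, the number of passes needed depends on when blocks cross the idle timeout rather than on how many of them there are.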