Make scale in of htex_auto_scale more effective #2196
parsl/dataflow/strategy.py
Outdated
@@ -262,7 +262,10 @@ def _general_strategy(self, status_list, tasks, *, strategy_type):
             logger.debug("More slots than tasks")
             if isinstance(executor, HighThroughputExecutor):
                 if active_blocks > min_blocks:
-                    exec_status.scale_in(1, force=False, max_idletime=self.max_idletime)
+                    exec_status.scale_in(
+                        active_blocks - active_tasks // tasks_per_node // nodes_per_block,
I think this calculation should look similar to the calculation around line 246: both of them are calculating some kind of "target number of blocks", and it makes me uncomfortable that they don't look exactly the same (e.g. ceil, min and parallelism).
Other than that, this looks like the right thing to be doing.
Adjusted the computation. Indeed, I hadn't taken parallelism and min_blocks into account in the calculation.
parsl/dataflow/strategy.py
Outdated
@@ -262,7 +262,10 @@ def _general_strategy(self, status_list, tasks, *, strategy_type):
             logger.debug("More slots than tasks")
             if isinstance(executor, HighThroughputExecutor):
                 if active_blocks > min_blocks:
-                    exec_status.scale_in(1, force=False, max_idletime=self.max_idletime)
+                    excess = math.ceil(active_slots - (active_tasks * parallelism))
It might be clearer to rename the var excess to excess_slots.
Do you also want excess renamed above, in line 245? It was using this term before.
Sure, that isn't a bad idea.
@jrueb I think the logic here is correct. I have a minor recommendation about variable naming but otherwise, this PR is good to go. Thanks for writing this up! :)
Description
For htex_auto_scale, instead of removing at most one block when there are more slots than tasks, try to remove enough blocks that the number of slots matches the number of tasks. This is of course still limited by how many blocks have reached the configured max_idletime.
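As a rough illustration of the idea described above (not the code merged in this PR), the following standalone sketch computes how many blocks one might ask to remove and then picks only blocks that have been idle long enough. The function names `blocks_to_scale_in` and `pick_idle_blocks`, the dictionary of idle times, and the choice to round down to whole blocks are all assumptions made for this example; the variable names mirror those visible in the diff.

```python
import math


def blocks_to_scale_in(active_slots, active_tasks, parallelism,
                       tasks_per_node, nodes_per_block,
                       active_blocks, min_blocks):
    """Hypothetical helper: number of blocks to request for removal."""
    slots_per_block = tasks_per_node * nodes_per_block
    # Slots beyond what the current tasks need at the given parallelism.
    excess_slots = math.ceil(active_slots - active_tasks * parallelism)
    if excess_slots <= 0:
        return 0
    # Assumption: round down so only fully surplus blocks are requested.
    excess_blocks = excess_slots // slots_per_block
    # Never go below the configured minimum number of blocks.
    return max(0, min(excess_blocks, active_blocks - min_blocks))


def pick_idle_blocks(idle_seconds_by_block, requested, max_idletime):
    """Hypothetical helper: choose up to `requested` blocks whose idle
    time has reached `max_idletime`, longest-idle first."""
    eligible = [block for block, idle in idle_seconds_by_block.items()
                if idle >= max_idletime]
    eligible.sort(key=lambda block: idle_seconds_by_block[block], reverse=True)
    return eligible[:requested]


if __name__ == "__main__":
    n = blocks_to_scale_in(active_slots=40, active_tasks=10, parallelism=1,
                           tasks_per_node=4, nodes_per_block=2,
                           active_blocks=5, min_blocks=1)
    idle = {"block-a": 130.0, "block-b": 10.0, "block-c": 200.0}
    print(n, pick_idle_blocks(idle, n, max_idletime=120.0))
```

In this sketch the request can be for several blocks, but the idle-time filter may return fewer, which matches the limitation mentioned in the description.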
Fixes #2195
Type of change
Choose which options apply, and delete the ones which do not apply.