[FEA] Allow concurrentGpuTasks and possibly other configs to be dynamically set #1399
Comments
@chenrui17 the headlines imply these two are the same, but I think this is a bit different than what is being proposed in #635. The other issue is asking for automatic, dynamic scaling of the concurrent tasks based on GPU memory usage. This issue instead is asking for something that should be much simpler to implement: the ability to manually adjust the concurrent tasks setting at runtime. If #635 can be satisfied by implementing that then I agree this can be closed as a duplicate.
Because of the dupe I had an idea on how to implement this. #7521 (comment)
Added Needs Triage so we can look at this again now that we have a customer that really needs this to go into production.
…IDIA#1399) Signed-off-by: spark-rapids automation <[email protected]>
Fixed by #7527.
Is your feature request related to a problem? Please describe.
Currently some of our configs can't always be changed dynamically. For instance, spark.rapids.sql.concurrentGpuTasks is read and initialized on executor startup, so generally it has to be set on startup. If you are using dynamic allocation, executors started later may pick up a changed value, which could be very confusing. There are other configs like this, such as memory %.
Some environments, like EMR and Databricks, make this hard to change because you have to modify init scripts or restart entire clusters. So we should investigate a way to change these configs dynamically, for instance by reading the config value every so often and checking whether it has changed. We then have to take into account whether it's getting larger or smaller, since making it smaller may require draining things first.
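To make the polling and draining idea concrete, here is a minimal sketch in Python. It is illustrative only and is not how the plugin implements this: the names `AdjustableTaskLimiter`, `set_limit`, and `refresh_limit` are hypothetical, and `read_config` stands in for whatever would supply the current value of a setting such as spark.rapids.sql.concurrentGpuTasks. Raising the limit admits waiting tasks immediately; lowering it never interrupts running tasks, it just blocks new ones until enough running tasks have finished.

```python
import threading


class AdjustableTaskLimiter:
    """Admits at most `limit` concurrent tasks; the limit can change at runtime.

    Hypothetical helper for illustration, not part of any real library.
    """

    def __init__(self, limit):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self._limit = limit
        self._active = 0
        self._cond = threading.Condition()

    @property
    def limit(self):
        with self._cond:
            return self._limit

    def acquire(self, timeout=None):
        """Block until a slot is free; return False if the timeout expires."""
        with self._cond:
            admitted = self._cond.wait_for(
                lambda: self._active < self._limit, timeout
            )
            if admitted:
                self._active += 1
            return admitted

    def release(self):
        """Mark one task as finished and wake any waiting tasks."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def set_limit(self, new_limit):
        """Change the limit. A smaller limit drains naturally: running tasks
        keep going, and new tasks wait until the active count drops below it."""
        if new_limit < 1:
            raise ValueError("limit must be at least 1")
        with self._cond:
            self._limit = new_limit
            self._cond.notify_all()


def refresh_limit(limiter, read_config):
    """One polling step: re-read the setting and apply it if it changed.

    `read_config` is an assumed callable returning the current value.
    Returns True when the limit was updated.
    """
    new_limit = int(read_config())
    if new_limit != limiter.limit:
        limiter.set_limit(new_limit)
        return True
    return False
```

A background thread could call `refresh_limit` every few seconds, while each task wraps its GPU work in `acquire()` and `release()`.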