-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
YARN-10929. Refrain from creating new Configuration object in AbstractManagedParentQueue#initializeLeafQueueConfigs #3361
Conversation
…ntQueue#initializeLeafQueueConfigs
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @JackWangCS for working on this! I had a few comments on the change.
for (Map.Entry<String, String> confKeyValuePair : | ||
csContext.getConfiguration().getPropsWithPrefix(prefix).entrySet()) { | ||
configsWithPrefix.put(prefix + confKeyValuePair.getKey(), confKeyValuePair.getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using the slower getPropsWithPrefix the optimised version could be used which was introduced in YARN-10838 and will be integrated in YARN-10872.
|
||
CapacitySchedulerConfiguration leafQueueConfigs = new | ||
CapacitySchedulerConfiguration(new Configuration(false), false); | ||
CapacitySchedulerConfiguration(csContext.getConf(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't entirely what we had in mind. Now every time a new dynamic queue is created every single property of capacity-scheduler.xml and yarn-site.xml is loaded into the newly created leafQueueConfigs, which could hinder the performance. The jira was more about getting rid of the config object duplication in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @brumi1024 , thanks for your review. For my understanging, the AbstractManagedParentQueue#initializeLeafQueueConfigs
is used to only initialize the leafQueueTemplate when the parent queue is initialized/re-initialized. And for each time, a newly created leafQueue should use this leafQueueTemplate
to get its configuration via adding new configurations by replacing configurations with leaf-queue-template
.
But now, in ManagedParentQueue#getLeafQueueConfigs
, it creates a new configuration everytime. So we need to reuse getLeafQueueTemplate().getLeafQueueConfigs()
directly instead of creating a new one. I mean we can change this to
public CapacitySchedulerConfiguration getLeafQueueConfigs(
CapacitySchedulerConfiguration templateConfig, String leafQueueName) {
CapacitySchedulerConfiguration leafQueueConfigTemplate = templateConfig;
.....
}
What do you think of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JackWangCS,
Sorry I was not entirely clear. By now I mean the currently merged code creates a new Configuration object in ManagedParentQueue.getLeafQueueConfigs(CapacitySchedulerConfiguration templateConfig, String leafQueueName) (which is called every time an AutoCreatedLeafQueue is instantiated), but your patch solves that issue. However I'm thinking about a bigger change by completely removing the config object duplication (because it is error prone, see the bugfix YarnConfiguration.RESOURCE_TYPES property cloning) by doing something similar which the AQCv2 does (see the AutoCreatedQueueTemplate variable and its uses in ParentQueue and LeafQueue). This way the new template configs are added as if they would be normal user-defined configs, but it's better than creating a new half-filled config for every AutoCreatedLeafQueue or after your patch ManagedParent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @brumi1024 on this. We need to eliminate initializing an other configuration object, which is still happening in your current revision. My advise is to simply use csContext.getConfiguration to set the properties similar to AutoCreatedQueueTemplate class, as @brumi1024 suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @9uapaw and @brumi1024. I made the initializeLeafQueueConfigs to use the csContext.getConfiguration directly. I will check the AQCv2 related code to see where we can reduce these configuration object duplication further.
String resourceTypePrefix = YarnConfiguration.RESOURCE_TYPES + "."; | ||
Map<String, String> rtConfigs = getCSConfigurationsWithPrefix(resourceTypePrefix); | ||
for (Map.Entry<String, String> confKeyValuePair: rtConfigs.entrySet()) { | ||
leafQueueConfigs.set(confKeyValuePair.getKey(), confKeyValuePair.getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is necessary anymore, if the whole config is duplicated. This is a property that comes from yarn-site.xml and it is necessary for the dynamic queues as well. If however the CapacitySchedulerConfiguration object is copied into the new one there is no need for copying these properties separately.
💔 -1 overall
This message was automatically generated. |
|
||
CapacitySchedulerConfiguration leafQueueConfigs = new | ||
CapacitySchedulerConfiguration(new Configuration(false), false); | ||
CapacitySchedulerConfiguration(csContext.getConf(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JackWangCS,
Sorry I was not entirely clear. By now I mean the currently merged code creates a new Configuration object in ManagedParentQueue.getLeafQueueConfigs(CapacitySchedulerConfiguration templateConfig, String leafQueueName) (which is called every time an AutoCreatedLeafQueue is instantiated), but your patch solves that issue. However I'm thinking about a bigger change by completely removing the config object duplication (because it is error prone, see the bugfix YarnConfiguration.RESOURCE_TYPES property cloning) by doing something similar which the AQCv2 does (see the AutoCreatedQueueTemplate variable and its uses in ParentQueue and LeafQueue). This way the new template configs are added as if they would be normal user-defined configs, but it's better than creating a new half-filled config for every AutoCreatedLeafQueue or after your patch ManagedParent.
💔 -1 overall
This message was automatically generated. |
Closing this PR in favor of #3744 |
Description of PR
Refrain from creating new Configuration object and remove the unneccessary configuration sorting in AbstractManagedParentQueue#initializeLeafQueueConfigs.
How was this patch tested?
This patch just refactor code and should be covered by exists test.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?