expander optimization #11069
base: main
Conversation
…pass in ExpandExpressionCapture
Thank you for tackling performance, I just have a few comments before getting to the specifics of the code.
First of all, thank you for looking at my experimental changes. Much appreciated.
Thanks for digging into this!!
The main reason why I do not want to approve now: this is a nontrivial change in core logic and the PR has no mention of test coverage. If it's already covered by pre-existing tests, please call those out. If not, I'd prefer a tailored test (or tests) to be added first (can be in the same PR, but ideally as a separate commit that shows the tests were passing on the previous version).
Thank you for the review. I wasn't expecting approval anytime soon; this is mostly to open the discussion about this area in particular. Regarding the test coverage:
All sounds perfect! Can you run an exp insertion, request perf runs (DDRITs and Speedometer), and post the results here?
Thank you!
// Trim from the end.
// There is some extra logic here to avoid edge case where we would trim a whitespace only character with
// one slice from the start, and then once more from the end, resulting in invalid indices.
while (((firstSlice < lastSlice) || (firstSlice == lastSlice && firstSliceIdx < lastSliceIdx)) && Char.IsWhiteSpace(arg, lastSliceIdx - 1))
It looks like (firstSlice <= lastSlice && Char.IsWhiteSpace(arg, firstSliceIdx)) could be moved to a separate method with a descriptive name and reused both in the while loop here and on line 794.
This check is against lastSliceIdx, and it also guards against duplicate whitespace removal (the second part after the ||).
It was similar-ish before, then I ran into the "all whitespace" edge case and it crashed; that is what the firstSlice == lastSlice && firstSliceIdx < lastSliceIdx part is for.
So the function would have to be something like IsEdgeWhitespace(arg, firstSlice, lastSlice, idxToCheck) and then some extra logic for this case.
Is that more readable? I would like to say no, but that could just be my laziness / lack of feeling for cases like this.
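For illustration only - the helper name and shape below are my assumption, not code from the PR - the extraction being discussed might look roughly like this:

private static bool IsEdgeWhitespace(string arg, int firstSlice, int lastSlice, int idxToCheck)
    => firstSlice <= lastSlice && Char.IsWhiteSpace(arg, idxToCheck);

// Start-trim call site (the simpler case, as on line 794):
// while (IsEdgeWhitespace(arg, firstSlice, lastSlice, firstSliceIdx)) { ... }

// The end-trim call site would still need the all-whitespace guard on top of the predicate:
// while ((firstSlice < lastSlice || (firstSlice == lastSlice && firstSliceIdx < lastSliceIdx))
//        && Char.IsWhiteSpace(arg, lastSliceIdx - 1)) { ... }

So the shared predicate only covers the start-trim loop cleanly; the end-trim loop keeps its extra condition either way.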
// If the argument is in quotes, we want to remove those
if ((arg[firstSliceIdx] == '\'' && arg[lastSliceIdx - 1] == '\'') ||
    (arg[firstSliceIdx] == '`' && arg[lastSliceIdx - 1] == '`') ||
Why did you remove the static fields for these repeated chars?
The comparison didn't use them (I'm not 100% sure why, but I kept it that way). Since I'm working with indices, I didn't need them for the .Trim() calls anymore, so they became unused and ./build.cmd started complaining.
Copilot (AI) left a comment
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 suggestion.
Comments skipped due to low confidence (1)
src/Build/Evaluation/Expander.cs:778
- Add a test case to verify the behavior of the AddArgumentFromSlices method, especially for edge cases such as arguments with quotes and 'null' values.
private static void AddArgumentFromSlices(List<string> arguments, List<Tuple<int, int>> slices, string arg)
src/Build/Evaluation/Expander.cs (outdated)
while (firstSlice < lastSlice)
{
    argValue += arg.Substring(firstSliceIdx, slices[firstSlice].Item2 - firstSliceIdx);
I am a bit confused about why this would improve perf; I guess there are usually only a few arguments. In general, repeated concatenations are a perf antipattern. Couldn't the StringTools.SpanBasedStringBuilder be optimized instead?
From what I saw/googled, concatenation should be faster than (or similar in speed to) a StringBuilder up to 3-4 concatenations; after that the StringBuilder is faster.
Most resolved variables have 1 or 2 slices to concatenate (beyond the initial empty string), so I opted for simplicity.
There is an argument to be made for splitting on the number of slices and using a StringBuilder for 3+, as sketched below.
As for optimizing SpanBasedStringBuilder - that is not our code but something from Microsoft.NET.StringTools.
I think the main difference is that for most cases, the span-based string builder is overkill.
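To make that concrete, here is a rough sketch of the split - the helper name is hypothetical, and I'm assuming the slices are (start, end) index pairs into arg as in the rest of this PR:

private static string JoinSlices(string arg, List<Tuple<int, int>> slices)
{
    // Direct concatenation for the common 1-2 slice case, StringBuilder only for longer lists.
    if (slices.Count == 1)
    {
        return arg.Substring(slices[0].Item1, slices[0].Item2 - slices[0].Item1);
    }

    if (slices.Count == 2)
    {
        return arg.Substring(slices[0].Item1, slices[0].Item2 - slices[0].Item1)
            + arg.Substring(slices[1].Item1, slices[1].Item2 - slices[1].Item1);
    }

    System.Text.StringBuilder builder = new System.Text.StringBuilder();
    foreach (Tuple<int, int> slice in slices)
    {
        builder.Append(arg, slice.Item1, slice.Item2 - slice.Item1);
    }

    return builder.ToString();
}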
As for the perf improvement, the first two lines of the previous version of the function were doing this:
argumentBuilder.Trim();
string argValue = argumentBuilder.ToString();
and then doing everything with String, throwing away any and all advantage the span-based string builder might have had.
So there is an option to kill most of my changes and replace the string with SpanBasedChar - it could achieve similar, if not identical, results. The main cost would probably be further profiling, since it's a rather large replacement once again.
A bit more googling suggests that having a static StringBuilder and reusing the instance should get the best of both worlds: reasonable performance for small concatenation counts while avoiding the allocation overhead.
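A sketch of that reuse pattern - hypothetical names, and I'm using [ThreadStatic] rather than a plain static field on the assumption that the expander can run on several threads at once:

[ThreadStatic]
private static System.Text.StringBuilder s_argumentBuilder;

private static System.Text.StringBuilder AcquireArgumentBuilder()
{
    // Allocate once per thread, then clear and reuse the same instance for every argument.
    System.Text.StringBuilder builder = s_argumentBuilder ??= new System.Text.StringBuilder(capacity: 128);
    builder.Clear();
    return builder;
}

The usual caveats apply: nothing may hold onto the builder across calls, and it is worth capping the retained capacity so one huge argument doesn't keep a large buffer alive for the rest of the build.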
@SimaTian, I am slightly concerned about the code readability factor.
Changes:
- AddArgumentFromSlices is an optimization that bypasses the need for argumentBuilder, which has noticeable overhead. I use indices to calculate the argument directly instead.
- Lines 2126 - 2142: the expressionCapture.Captures?.Any iteration was visible in the profiler, so I pulled both branches into one loop since it looked slightly better. The performance improvement here is speculative at best and can be removed if we decide it is too risky.
- Finally, in src/StringTools/WeakStringCache.Concurrent.cs, the .Count call on the ConcurrentDictionary locks everything, which is not great since we do it quite often. My change introduces a separate count to limit the number of locks we take and to unblock the dictionary (a rough sketch of the idea follows below).
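A minimal sketch of that counting idea - hypothetical type and member names, not the actual WeakStringCache change - keeping an Interlocked counter next to the dictionary and reading that instead of ConcurrentDictionary.Count, which acquires all of the internal locks:

internal sealed class CountedConcurrentMap<TKey, TValue> where TKey : notnull
{
    private readonly System.Collections.Concurrent.ConcurrentDictionary<TKey, TValue> _map = new();
    private int _approximateCount;

    // Cheap to read; may briefly lag behind _map.Count under concurrent mutation.
    public int ApproximateCount => System.Threading.Volatile.Read(ref _approximateCount);

    public bool TryAdd(TKey key, TValue value)
    {
        bool added = _map.TryAdd(key, value);
        if (added)
        {
            System.Threading.Interlocked.Increment(ref _approximateCount);
        }

        return added;
    }

    public bool TryRemove(TKey key, out TValue value)
    {
        bool removed = _map.TryRemove(key, out value);
        if (removed)
        {
            System.Threading.Interlocked.Decrement(ref _approximateCount);
        }

        return removed;
    }
}

The counter can be slightly stale, which should be fine for cache-sizing or scavenging decisions but not for anything that needs an exact count.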
Altogether, when profiling on my machine after removing the outliers (MSBuild sometimes has random-ish spikes in build time that I decided to skip), my aggregated output was roughly this:
-maxcpucount:10 -tl:true
with an average time of 62.103s. These numbers are speculative at best, though; I've yet to find a reasonable and stable enough way of checking the impact.
While using the VS profiler to run my changes on a single-process basis, the improvement looked to be somewhere in the vicinity of 0.2-0.3%.