expander optimization #11069
base: main
Conversation
…pass in ExpandExpressionCapture
Thank you for tackling performance, I just have a few comments before getting to the specifics of the code.
First of all, thank you for looking at my experimental changes. Much appreciated.
Thanks for digging into this!!
The main reason why I do not want to approve now: this is a nontrivial change in core logic and the PR has no mention of test coverage. If it's already covered by pre-existing tests, please call those out. If not, I'd prefer a tailored test (or tests) to be added first (can be in the same PR, but ideally as a separate commit that shows the tests were passing on the previous version).
Thank you for the review. I wasn't expecting approval anytime soon; this is mostly to open the discussion about this area in particular. Regarding the test coverage:
All sounds perfect! Can you run an exp insertion, request perf runs (DDRITs and Speedometer), and post the results here?
Thank you!
// Trim from the end.
// There is some extra logic here to avoid edge case where we would trim a whitespace only character with
// one slice from the start, and then once more from the end, resulting in invalid indices.
while (((firstSlice < lastSlice) || (firstSlice == lastSlice && firstSliceIdx < lastSliceIdx)) && Char.IsWhiteSpace(arg, lastSliceIdx - 1))
It looks like (firstSlice <= lastSlice && Char.IsWhiteSpace(arg, firstSliceIdx)) could be moved to a separate method with a descriptive name and reused both in the while loop here and on line 794.
This check is against lastSliceIdx, and it also guards against duplicate whitespace removal (the second part after the ||).
It was similar-ish before, then I ran into the "all whitespace" edge case and it crashed; that is what the firstSlice == lastSlice && firstSliceIdx < lastSliceIdx part is for.
So the function would have to be something like IsEdgeWhitespace(arg, firstSlice, lastSlice, idxToCheck) and then some extra logic for this case.
Is that more readable? I would like to say no, but that could just be my laziness / lack of feeling for cases like this.
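For illustration only - the helper name and shape below are my assumption, not code from the PR - the extraction being discussed might look roughly like this:

private static bool IsEdgeWhitespace(string arg, int firstSlice, int lastSlice, int idxToCheck)
    => firstSlice <= lastSlice && Char.IsWhiteSpace(arg, idxToCheck);

// Start-trim call site (the simpler case, as on line 794):
// while (IsEdgeWhitespace(arg, firstSlice, lastSlice, firstSliceIdx)) { ... }

// The end-trim call site would still need the all-whitespace guard on top of the predicate:
// while ((firstSlice < lastSlice || (firstSlice == lastSlice && firstSliceIdx < lastSliceIdx))
//        && Char.IsWhiteSpace(arg, lastSliceIdx - 1)) { ... }

So the shared predicate only covers the start-trim loop cleanly; the end-trim loop keeps its extra condition either way.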
// If the argument is in quotes, we want to remove those
if ((arg[firstSliceIdx] == '\'' && arg[lastSliceIdx - 1] == '\'') ||
    (arg[firstSliceIdx] == '`' && arg[lastSliceIdx - 1] == '`') ||
Why did you remove the static fields for these repeated chars?
The comparison didn't use them (I'm not 100% sure why, but I kept it that way). Since I'm working with indices, I didn't need them for the .Trim() calls anymore, so they became unused and ./build.cmd started complaining.
Copilot (AI) left a comment
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 suggestion.
Comments skipped due to low confidence (1)
src/Build/Evaluation/Expander.cs:778
- Add a test case to verify the behavior of the AddArgumentFromSlices method, especially for edge cases such as arguments with quotes and 'null' values.
private static void AddArgumentFromSlices(List<string> arguments, List<Tuple<int, int>> slices, string arg)
src/Build/Evaluation/Expander.cs (outdated)
while (firstSlice < lastSlice)
{
    argValue += arg.Substring(firstSliceIdx, slices[firstSlice].Item2 - firstSliceIdx);
I am a bit confused about why this would improve perf; I guess there are usually only a few arguments. In general, repeated concatenations are a perf antipattern. Couldn't the StringTools.SpanBasedStringBuilder be optimized instead?
From what I saw/googled, concatenation should be faster than (or similar in speed to) a StringBuilder up to 3-4 concatenations; after that the StringBuilder is faster.
Most resolved variables have 1 or 2 slices to concatenate (beyond the initial empty string), so I opted for simplicity.
There is an argument to be made for splitting on the number of slices and using a StringBuilder for 3+, as sketched below.
As for optimizing SpanBasedStringBuilder - that is not our code but something from Microsoft.NET.StringTools.
I think the main difference is that for most cases, the span-based string builder is overkill.
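To make that concrete, here is a rough sketch of the split - the helper name is hypothetical, and I'm assuming the slices are (start, end) index pairs into arg as in the rest of this PR:

private static string JoinSlices(string arg, List<Tuple<int, int>> slices)
{
    // Direct concatenation for the common 1-2 slice case, StringBuilder only for longer lists.
    if (slices.Count == 1)
    {
        return arg.Substring(slices[0].Item1, slices[0].Item2 - slices[0].Item1);
    }

    if (slices.Count == 2)
    {
        return arg.Substring(slices[0].Item1, slices[0].Item2 - slices[0].Item1)
            + arg.Substring(slices[1].Item1, slices[1].Item2 - slices[1].Item1);
    }

    System.Text.StringBuilder builder = new System.Text.StringBuilder();
    foreach (Tuple<int, int> slice in slices)
    {
        builder.Append(arg, slice.Item1, slice.Item2 - slice.Item1);
    }

    return builder.ToString();
}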
As for the perf improvement, the first two lines of the previous version of the function were doing this:
argumentBuilder.Trim();
string argValue = argumentBuilder.ToString();
and then doing everything with String, throwing away any and all advantage the span-based string builder might have had.
So there is an option to kill most of my changes and replace the string with SpanBasedChar - it could achieve similar, if not identical, results. The main cost would probably be further profiling, since it's a rather large replacement once again.
A bit more googling suggests that having a static StringBuilder and reusing the instance should get the best of both worlds: reasonable performance for small concatenation counts while avoiding the allocation overhead.
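A sketch of that reuse pattern - hypothetical names, and I'm using [ThreadStatic] rather than a plain static field on the assumption that the expander can run on several threads at once:

[ThreadStatic]
private static System.Text.StringBuilder s_argumentBuilder;

private static System.Text.StringBuilder AcquireArgumentBuilder()
{
    // Allocate once per thread, then clear and reuse the same instance for every argument.
    System.Text.StringBuilder builder = s_argumentBuilder ??= new System.Text.StringBuilder(capacity: 128);
    builder.Clear();
    return builder;
}

The usual caveats apply: nothing may hold onto the builder across calls, and it is worth capping the retained capacity so one huge argument doesn't keep a large buffer alive for the rest of the build.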
@SimaTian, I am slightly concerned about the code readability factor.
Changes:
- AddArgumentFromSlices is an optimization that bypasses the need for argumentBuilder, which has noticeable overhead. I use indices to calculate the argument directly instead.
- Lines 2126 - 2142: the expressionCapture.Captures?.Any iteration was visible in the profiler, so I pulled both branches into one loop since it looked slightly better. The performance improvement here is speculative at best and can be removed if we decide it is too risky.
- Finally, in src/StringTools/WeakStringCache.Concurrent.cs, the .Count call on the ConcurrentDictionary locks everything, which is not great since we do it quite often. My change introduces a separate count to limit the number of locks we take and to unblock the dictionary (a rough sketch of the idea follows below).
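A minimal sketch of that counting idea - hypothetical type and member names, not the actual WeakStringCache change - keeping an Interlocked counter next to the dictionary and reading that instead of ConcurrentDictionary.Count, which acquires all of the internal locks:

internal sealed class CountedConcurrentMap<TKey, TValue> where TKey : notnull
{
    private readonly System.Collections.Concurrent.ConcurrentDictionary<TKey, TValue> _map = new();
    private int _approximateCount;

    // Cheap to read; may briefly lag behind _map.Count under concurrent mutation.
    public int ApproximateCount => System.Threading.Volatile.Read(ref _approximateCount);

    public bool TryAdd(TKey key, TValue value)
    {
        bool added = _map.TryAdd(key, value);
        if (added)
        {
            System.Threading.Interlocked.Increment(ref _approximateCount);
        }

        return added;
    }

    public bool TryRemove(TKey key, out TValue value)
    {
        bool removed = _map.TryRemove(key, out value);
        if (removed)
        {
            System.Threading.Interlocked.Decrement(ref _approximateCount);
        }

        return removed;
    }
}

The counter can be slightly stale, which should be fine for cache-sizing or scavenging decisions but not for anything that needs an exact count.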
Altogether, when profiling on my machine after removing the outliers (MSBuild sometimes has random-ish spikes in build time that I decided to skip), my aggregated output was roughly this:
-maxcpucount:10 -tl:true
with an average time of 62.103s. These numbers are speculative at best, though; I've yet to find a reasonable and stable enough way of checking the impact.
While using the VS profiler to run my changes on a single-process basis, the improvement looked to be somewhere in the vicinity of 0.2-0.3%.