[Core] Optimize `String::join` #92550

AThousandShips · 2024-05-30T14:54:19Z

Avoid reallocation by pre-computing size
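To illustrate the idea outside the engine, here is a minimal sketch using `std::string` and `std::vector` rather than Godot's `String`. The helper name `join_presized` is hypothetical and this is not the code from the PR; it only shows the two-pass pattern of computing the final length first and then appending into a buffer reserved once.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the Godot implementation: joins `parts` with `sep`,
// reserving the final size once so that appending does not need to reallocate.
std::string join_presized(const std::string &sep, const std::vector<std::string> &parts) {
	if (parts.empty()) {
		return std::string();
	}

	// First pass: total length is every part plus one separator between each pair.
	std::size_t total = sep.size() * (parts.size() - 1);
	for (const std::string &part : parts) {
		total += part.size();
	}

	std::string result;
	result.reserve(total);

	// Second pass: append into the already reserved buffer.
	bool first = true;
	for (const std::string &part : parts) {
		if (first) {
			first = false;
		} else {
			result += sep;
		}
		result += part;
	}
	return result;
}
```

The extra pass over `parts` only reads sizes, so it is cheap next to the repeated buffer growth it replaces.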

See also:

lawnjelly · 2024-06-03T13:50:54Z

core/string/ustring.cpp

+
+	bool first = true;
+	for (const String &part : parts) {
+		if (first) {


Does the compiler optimize this out? If not, taking it out of the loop might be an option for an ounce more performance (would have to profile though). 🙂

I assume it'll unroll it

Ah what is it they say about assumption.. 😉

It's only a minor thing anyway but ime optimizers don't always make the decisions you expect, as they don't have knowledge of the data, they can't assume you are feeding a list of zero items, 1 or 2, or 10000. Good practice to look at the assembly or look in godbolt (or just profile and see what is going on).

Having said that it should be an improvement so far anyway so happy to approve.

Tested a simple case with gcc and it unrolls a loop like this one, by jumping past the check and then looping back around, like this for a trivial case:

	cmpq	%rbx, %rsi
	jne	.L6
	jmp	.L1
	.p2align 4,,10
	.p2align 3
.L8:
	movl	(%rbx), %ecx
	call	_Z11call_me_tooi
.L6:
	movl	(%rbx), %ecx
	addq	$4, %rbx
	call	_Z7call_mei
	cmpq	%rbx, %rsi
	jne	.L8
.L1:
	addq	$40, %rsp
	popq	%rbx
	popq	%rsi
	ret

This is from this code:

void do_this(const std::vector<int> &p_arg) {
	bool first = true;
	for (const int &i : p_arg) {
		if (first) {
			first = false;
		} else {
			call_me_too(i);
		}
		call_me(i);
	}
}

So on the first iteration it jumps straight to .L6, which is the call_me line, and on every later iteration it loops back to .L8, which calls call_me_too first and then falls through to call_me.
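For comparison, the same transformation can be written by hand, which is roughly what "taking it out of the loop" would look like. This is only a sketch: `do_this_peeled` is a hypothetical name, and the `call_me` / `call_me_too` stubs below are assumed stand-ins that record their calls so the ordering can be checked.

```cpp
#include <vector>

// Assumed stubs for the opaque functions in the test above; they record
// each call so the resulting order can be inspected.
static std::vector<int> calls;
void call_me(int i) { calls.push_back(i); }
void call_me_too(int i) { calls.push_back(-i); }

// Hand-peeled variant: the first element is handled before the loop,
// so the loop body carries no `first` flag and no branch on it.
void do_this_peeled(const std::vector<int> &p_arg) {
	auto it = p_arg.begin();
	if (it == p_arg.end()) {
		return;
	}
	call_me(*it);
	for (++it; it != p_arg.end(); ++it) {
		call_me_too(*it);
		call_me(*it);
	}
}
```

Since gcc already produces this shape in the trivial case, the hand-written form would only be worth considering if profiling or the generated assembly showed the branch surviving in the real function.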

Mickeon · 2024-06-23T18:51:13Z

For the extra code added, how much faster is the method in this PR in comparison?

AThousandShips · 2024-06-23T20:11:24Z

Haven't done explicit checking with this one as benchmarking it is hard, but the improvement should be pretty non-trivial since it skips a lot of allocations for long inputs

Our current benchmarking tools don't cover this, and doing separate benchmarking with non-library code wouldn't be reliable as it can't cover the COW stuff easily

Calinou

Tested locally (rebased on top of master c73ac74), it works as expected.

Testing project: test_pr_92550.zip

Benchmark

PC specifications

CPU: Intel Core i9-13900K
GPU: NVIDIA GeForce RTX 4090
RAM: 64 GB (2×32 GB DDR5-5800 C30)
SSD: Solidigm P44 Pro 2 TB
OS: Linux (Fedora 40)

Using an optimized editor build (optimize=speed lto=full).

Before

join(2 strings) took 3155335 usecs
join(10 strings) took 7880363 usecs

After

join(2 strings) took 2511176 usecs
join(10 strings) took 5855268 usecs


akien-mga · 2024-08-16T08:48:37Z

Thanks! String ops go brrr 🚗

AThousandShips · 2024-08-16T08:48:51Z

Thank you!

AThousandShips added enhancement topic:core labels May 30, 2024

AThousandShips added this to the 4.x milestone May 30, 2024

AThousandShips requested a review from a team as a code owner May 30, 2024 14:54

AThousandShips mentioned this pull request May 30, 2024

[Core] Optimize String::insert #92555

Merged

lawnjelly reviewed Jun 3, 2024


lawnjelly approved these changes Jun 3, 2024


Calinou added the performance label Jun 3, 2024

Mickeon mentioned this pull request Jun 26, 2024

Optimize get_path() in EditorFileSystemDirectory #93611

Merged

AThousandShips force-pushed the join_improve branch from 485441c to 819a725 Compare July 12, 2024 12:37

Calinou approved these changes Aug 9, 2024


clayjohn modified the milestones: 4.x, 4.4 Aug 9, 2024

[Core] Optimize String::join

e211d08

Avoid reallocation by pre-computing size

AThousandShips force-pushed the join_improve branch from 819a725 to e211d08 Compare August 15, 2024 18:59

akien-mga merged commit e057c49 into godotengine:master Aug 16, 2024
18 checks passed

AThousandShips deleted the join_improve branch August 16, 2024 08:48

miv391 mentioned this pull request Aug 29, 2024

Add more unit tests for String insert and join. #96291

Merged

Tekisasu-JohnK mentioned this pull request Sep 16, 2024

4.3.0p5 Tekisasu-JohnK/Tekisasu-Engine#18

Merged

jss2a98aj mentioned this pull request Oct 18, 2024

[4.4 backport] Assorted ustring optimizations and a bugfix blazium-engine/blazium#68

Merged
