Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor optimisation of the string operations #42037

Merged
merged 1 commit into from
May 19, 2017
Merged

Conversation

nagisa
Copy link
Member

@nagisa nagisa commented May 16, 2017

The interesting benchmarks are

(<100% → better than original ) Delta Original ns/i This Patch ns/i
str::contains_bang_char::long_lorem_ipsum 24.98% 5565 1390
str::contains_bang_char::short_ascii 27.78% 144 40
str::contains_bang_char::short_mixed 33.33% 120 40
str::contains_bang_char::short_pile_of_poo 59.42% 69 41
str::ends_with_ascii_char::long_lorem_ipsum 45.26% 5457 2470
str::ends_with_ascii_char::short_ascii 45.01% 5486 2469
str::ends_with_ascii_char::short_mixed 46.31% 5452 2525
str::ends_with_ascii_char::short_pile_of_poo 45.34% 5443 2468
str::ends_with_unichar::long_lorem_ipsum 45.94% 5466 2511
str::ends_with_unichar::short_ascii 46.06% 5456 2513
str::ends_with_unichar::short_mixed 45.50% 5486 2496
str::ends_with_unichar::short_pile_of_poo 45.65% 5448 2487
str::find_underscore_char::long_lorem_ipsum 62.21% 5567 3463
str::find_underscore_char::short_ascii 66.44% 146 97
str::find_underscore_char::short_mixed 76.23% 122 93
str::find_underscore_str::short_pile_of_poo 47.62% 252 120
str::find_zzz_char::long_lorem_ipsum 62.17% 5569 3462
str::find_zzz_char::short_ascii 66.67% 147 98
str::find_zzz_char::short_mixed 75.41% 122 92
str::find_zzz_str::long_lorem_ipsum 72.86% 925 674
str::rfind_underscore_char::long_lorem_ipsum 34.19% 7128 2437
str::rfind_underscore_char::short_ascii 37.31% 193 72
str::rfind_underscore_char::short_mixed 41.42% 169 70
str::rfind_underscore_char::short_pile_of_poo 50.51% 99 50
str::rfind_zzz_char::long_lorem_ipsum 34.20% 7129 2438
str::rfind_zzz_char::short_ascii 37.11% 194 72
str::rfind_zzz_char::short_mixed 40.94% 171 70
str::rfind_zzz_char::short_pile_of_poo 54.81% 104 57
str::trim_right_ascii_char::short_mixed 110.42% 48 53
string::bench_from 117.24% 58 68
string::bench_from_str 117.24% 58 68
string::bench_push_str 111.11% 54 60
string::bench_to_string 112.07% 58 65
string::from_utf8_lossy_100_invalid 111.31% 1529 1702

@nagisa nagisa changed the title Minor optimisation of the char Pattern Minor optimisation of the string operations May 16, 2017
@cbreeden
Copy link
Contributor

I'm having a hard time finding where any of your changes impact string::bench_[from|from_str|push_str|to_string].

@arielb1 arielb1 added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 16, 2017
@arielb1
Copy link
Contributor

arielb1 commented May 16, 2017

r? @sfackler

@sfackler
Copy link
Member

How much do the #[inline]s matter here in comparison to the changes to the Pattern impl for char?

@nagisa
Copy link
Member Author

nagisa commented May 17, 2017

I did notice LLVM refusing to inline Pattern instantiations, resulting in significantly worse code. I guess inlines on the non Pattern methods could be removed.

@sfackler
Copy link
Member

It'd be nice to cut it down to the minimum set of changes required to get the perf improvements we're looking for, yeah.

@nagisa
Copy link
Member Author

nagisa commented May 19, 2017

Removing the unnecessary inline annotations removed some of the larger regressions as well.

@sfackler
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented May 19, 2017

📌 Commit 41debc3 has been approved by sfackler

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request May 19, 2017
Minor optimisation of the string operations

The interesting benchmarks are

<table cellspacing="0" border="0">
	<tr>
		<th height="17" align="right"><b> (&lt;100% → better than original )</b></th>
		<th align="left" sdnum="2057;0;0.00%">Delta</th>
		<th align="left">Original ns/i</th>
		<th align="left">This Patch ns/i</th>
	</tr>
	<tr>
		<td height="17" align="left">str::contains_bang_char::long_lorem_ipsum</td>
		<td align="right" sdval="0.249775381850854" sdnum="2057;0;0.00%">24.98%</td>
		<td align="right" sdval="5565" sdnum="2057;">5565</td>
		<td align="right" sdval="1390" sdnum="2057;">1390</td>
	</tr>
	<tr>
		<td height="17" align="left">str::contains_bang_char::short_ascii</td>
		<td align="right" sdval="0.277777777777778" sdnum="2057;0;0.00%">27.78%</td>
		<td align="right" sdval="144" sdnum="2057;">144</td>
		<td align="right" sdval="40" sdnum="2057;">40</td>
	</tr>
	<tr>
		<td height="17" align="left">str::contains_bang_char::short_mixed</td>
		<td align="right" sdval="0.333333333333333" sdnum="2057;0;0.00%">33.33%</td>
		<td align="right" sdval="120" sdnum="2057;">120</td>
		<td align="right" sdval="40" sdnum="2057;">40</td>
	</tr>
	<tr>
		<td height="17" align="left">str::contains_bang_char::short_pile_of_poo</td>
		<td align="right" sdval="0.594202898550725" sdnum="2057;0;0.00%">59.42%</td>
		<td align="right" sdval="69" sdnum="2057;">69</td>
		<td align="right" sdval="41" sdnum="2057;">41</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_ascii_char::long_lorem_ipsum</td>
		<td align="right" sdval="0.452629649990837" sdnum="2057;0;0.00%">45.26%</td>
		<td align="right" sdval="5457" sdnum="2057;">5457</td>
		<td align="right" sdval="2470" sdnum="2057;">2470</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_ascii_char::short_ascii</td>
		<td align="right" sdval="0.450054684651841" sdnum="2057;0;0.00%">45.01%</td>
		<td align="right" sdval="5486" sdnum="2057;">5486</td>
		<td align="right" sdval="2469" sdnum="2057;">2469</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_ascii_char::short_mixed</td>
		<td align="right" sdval="0.463132795304475" sdnum="2057;0;0.00%">46.31%</td>
		<td align="right" sdval="5452" sdnum="2057;">5452</td>
		<td align="right" sdval="2525" sdnum="2057;">2525</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_ascii_char::short_pile_of_poo</td>
		<td align="right" sdval="0.453426419254088" sdnum="2057;0;0.00%">45.34%</td>
		<td align="right" sdval="5443" sdnum="2057;">5443</td>
		<td align="right" sdval="2468" sdnum="2057;">2468</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_unichar::long_lorem_ipsum</td>
		<td align="right" sdval="0.459385290889133" sdnum="2057;0;0.00%">45.94%</td>
		<td align="right" sdval="5466" sdnum="2057;">5466</td>
		<td align="right" sdval="2511" sdnum="2057;">2511</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_unichar::short_ascii</td>
		<td align="right" sdval="0.460593841642229" sdnum="2057;0;0.00%">46.06%</td>
		<td align="right" sdval="5456" sdnum="2057;">5456</td>
		<td align="right" sdval="2513" sdnum="2057;">2513</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_unichar::short_mixed</td>
		<td align="right" sdval="0.454976303317536" sdnum="2057;0;0.00%">45.50%</td>
		<td align="right" sdval="5486" sdnum="2057;">5486</td>
		<td align="right" sdval="2496" sdnum="2057;">2496</td>
	</tr>
	<tr>
		<td height="17" align="left">str::ends_with_unichar::short_pile_of_poo</td>
		<td align="right" sdval="0.456497797356828" sdnum="2057;0;0.00%">45.65%</td>
		<td align="right" sdval="5448" sdnum="2057;">5448</td>
		<td align="right" sdval="2487" sdnum="2057;">2487</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_underscore_char::long_lorem_ipsum</td>
		<td align="right" sdval="0.622058559367703" sdnum="2057;0;0.00%">62.21%</td>
		<td align="right" sdval="5567" sdnum="2057;">5567</td>
		<td align="right" sdval="3463" sdnum="2057;">3463</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_underscore_char::short_ascii</td>
		<td align="right" sdval="0.664383561643836" sdnum="2057;0;0.00%">66.44%</td>
		<td align="right" sdval="146" sdnum="2057;">146</td>
		<td align="right" sdval="97" sdnum="2057;">97</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_underscore_char::short_mixed</td>
		<td align="right" sdval="0.762295081967213" sdnum="2057;0;0.00%">76.23%</td>
		<td align="right" sdval="122" sdnum="2057;">122</td>
		<td align="right" sdval="93" sdnum="2057;">93</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_underscore_str::short_pile_of_poo</td>
		<td align="right" sdval="0.476190476190476" sdnum="2057;0;0.00%">47.62%</td>
		<td align="right" sdval="252" sdnum="2057;">252</td>
		<td align="right" sdval="120" sdnum="2057;">120</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_zzz_char::long_lorem_ipsum</td>
		<td align="right" sdval="0.621655593463818" sdnum="2057;0;0.00%">62.17%</td>
		<td align="right" sdval="5569" sdnum="2057;">5569</td>
		<td align="right" sdval="3462" sdnum="2057;">3462</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_zzz_char::short_ascii</td>
		<td align="right" sdval="0.666666666666667" sdnum="2057;0;0.00%">66.67%</td>
		<td align="right" sdval="147" sdnum="2057;">147</td>
		<td align="right" sdval="98" sdnum="2057;">98</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_zzz_char::short_mixed</td>
		<td align="right" sdval="0.754098360655738" sdnum="2057;0;0.00%">75.41%</td>
		<td align="right" sdval="122" sdnum="2057;">122</td>
		<td align="right" sdval="92" sdnum="2057;">92</td>
	</tr>
	<tr>
		<td height="17" align="left">str::find_zzz_str::long_lorem_ipsum</td>
		<td align="right" sdval="0.728648648648649" sdnum="2057;0;0.00%">72.86%</td>
		<td align="right" sdval="925" sdnum="2057;">925</td>
		<td align="right" sdval="674" sdnum="2057;">674</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_underscore_char::long_lorem_ipsum</td>
		<td align="right" sdval="0.3418911335578" sdnum="2057;0;0.00%">34.19%</td>
		<td align="right" sdval="7128" sdnum="2057;">7128</td>
		<td align="right" sdval="2437" sdnum="2057;">2437</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_underscore_char::short_ascii</td>
		<td align="right" sdval="0.373056994818653" sdnum="2057;0;0.00%">37.31%</td>
		<td align="right" sdval="193" sdnum="2057;">193</td>
		<td align="right" sdval="72" sdnum="2057;">72</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_underscore_char::short_mixed</td>
		<td align="right" sdval="0.414201183431953" sdnum="2057;0;0.00%">41.42%</td>
		<td align="right" sdval="169" sdnum="2057;">169</td>
		<td align="right" sdval="70" sdnum="2057;">70</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_underscore_char::short_pile_of_poo</td>
		<td align="right" sdval="0.505050505050505" sdnum="2057;0;0.00%">50.51%</td>
		<td align="right" sdval="99" sdnum="2057;">99</td>
		<td align="right" sdval="50" sdnum="2057;">50</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_zzz_char::long_lorem_ipsum</td>
		<td align="right" sdval="0.341983447888904" sdnum="2057;0;0.00%">34.20%</td>
		<td align="right" sdval="7129" sdnum="2057;">7129</td>
		<td align="right" sdval="2438" sdnum="2057;">2438</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_zzz_char::short_ascii</td>
		<td align="right" sdval="0.371134020618557" sdnum="2057;0;0.00%">37.11%</td>
		<td align="right" sdval="194" sdnum="2057;">194</td>
		<td align="right" sdval="72" sdnum="2057;">72</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_zzz_char::short_mixed</td>
		<td align="right" sdval="0.409356725146199" sdnum="2057;0;0.00%">40.94%</td>
		<td align="right" sdval="171" sdnum="2057;">171</td>
		<td align="right" sdval="70" sdnum="2057;">70</td>
	</tr>
	<tr>
		<td height="17" align="left">str::rfind_zzz_char::short_pile_of_poo</td>
		<td align="right" sdval="0.548076923076923" sdnum="2057;0;0.00%">54.81%</td>
		<td align="right" sdval="104" sdnum="2057;">104</td>
		<td align="right" sdval="57" sdnum="2057;">57</td>
	</tr>
	<tr>
		<td height="17" align="left">str::trim_right_ascii_char::short_mixed</td>
		<td align="right" sdval="1.10416666666667" sdnum="2057;0;0.00%">110.42%</td>
		<td align="right" sdval="48" sdnum="2057;">48</td>
		<td align="right" sdval="53" sdnum="2057;">53</td>
	</tr>
	<tr>
		<td height="17" align="left">string::bench_from</td>
		<td align="right" sdval="1.17241379310345" sdnum="2057;0;0.00%">117.24%</td>
		<td align="right" sdval="58" sdnum="2057;">58</td>
		<td align="right" sdval="68" sdnum="2057;">68</td>
	</tr>
	<tr>
		<td height="17" align="left">string::bench_from_str</td>
		<td align="right" sdval="1.17241379310345" sdnum="2057;0;0.00%">117.24%</td>
		<td align="right" sdval="58" sdnum="2057;">58</td>
		<td align="right" sdval="68" sdnum="2057;">68</td>
	</tr>
	<tr>
		<td height="17" align="left">string::bench_push_str</td>
		<td align="right" sdval="1.11111111111111" sdnum="2057;0;0.00%">111.11%</td>
		<td align="right" sdval="54" sdnum="2057;">54</td>
		<td align="right" sdval="60" sdnum="2057;">60</td>
	</tr>
	<tr>
		<td height="17" align="left">string::bench_to_string</td>
		<td align="right" sdval="1.12068965517241" sdnum="2057;0;0.00%">112.07%</td>
		<td align="right" sdval="58" sdnum="2057;">58</td>
		<td align="right" sdval="65" sdnum="2057;">65</td>
	</tr>
	<tr>
		<td height="17" align="left">string::from_utf8_lossy_100_invalid</td>
		<td align="right" sdval="1.1131458469588" sdnum="2057;0;0.00%">111.31%</td>
		<td align="right" sdval="1529" sdnum="2057;">1529</td>
		<td align="right" sdval="1702" sdnum="2057;">1702</td>
	</tr>
</table>
bors added a commit that referenced this pull request May 19, 2017
@bors bors merged commit 41debc3 into rust-lang:master May 19, 2017
@lilianmoraru
Copy link

lilianmoraru commented May 24, 2017

Observed this in TWIR...
Not sure the inline is very good. In artificial benchmarks it will look good most of the time but in real code it might blow the instruction cache(bring these functions in but throw the other ones and then when you get back to the other ones, you have to throw these... imagine some kind of loop).
Of course "might" does not mean that it will. That's why in the C++ world it is usually recommended to let the compiler/optimizer analyze the code an decide.

@cbreeden
Copy link
Contributor

That's why in the C++ world it is usually recommended to let the compiler/optimizer analyze the code an decide.

Using #[inline] doesn't force inline, it just (now I'm really going to butcher this), exports the object file to give LLVM the chance to inline it (though I'm sure what I said isn't accurate). There is an #[inline(always)], but that's not what's being done here.

@lilianmoraru
Copy link

I call it a suggestion to the compiler to inline.
From what I've seen, it usually tries to inline, unless the function is too complicated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants