Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement StringPairs for efficient pairs(::AbstractString) #51631

Merged
merged 8 commits into from
Jun 3, 2024

Conversation

jakobnissen
Copy link
Contributor

The default pairs will iterate keys and values separately. For strings, this represents double work, since both these iterations will need to determine valid string indices.
The introduced StringPairs type will, whenever possible, only compute valid indices once.
Currently, this is only optimised for String and SubString{String}, and not for AbstractString, nor is it optimised when reversed.

Simple benchmark:

using BenchmarkTools
a = lpad("March", 20)
@btime lstrip($a)
  • Master: 71.6 ns
  • This PR: 23.9 ns

Closes #51624

@jakobnissen jakobnissen added performance Must go faster strings "Strings!" labels Oct 7, 2023
Copy link
Member

@StefanKarpinski StefanKarpinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and is a nice optimization.

@jakobnissen jakobnissen added the merge me PR is reviewed. Merge when all tests are passing label Oct 11, 2023
base/strings/util.jl Outdated Show resolved Hide resolved
base/strings/util.jl Outdated Show resolved Hide resolved
base/strings/util.jl Outdated Show resolved Hide resolved
@jakobnissen jakobnissen added merge me PR is reviewed. Merge when all tests are passing and removed merge me PR is reviewed. Merge when all tests are passing labels Oct 11, 2023
@oscardssmith
Copy link
Member

Is this a better solution than #51671?

@jakobnissen
Copy link
Contributor Author

See my comment here: #51631 (comment)
They both improve timings, and both PRs together are faster than either one.

@jakobnissen
Copy link
Contributor Author

Is there anything else that needs to be done here? FWIW, I profiled lstrip, and didn't see the SubString constructor take any significant time (as suggested by this comment) - most time was spend in string iteration.

@DilumAluthge DilumAluthge requested a review from vtjnash October 16, 2023 02:08
@DilumAluthge DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label Oct 16, 2023
@DilumAluthge
Copy link
Member

@vtjnash Could you give this another review?

@vtjnash
Copy link
Member

vtjnash commented Oct 23, 2023

It sounds like someone still needs to investigate and fix the actual performance issue (#51631 (comment))

@jakobnissen
Copy link
Contributor Author

jakobnissen commented Oct 23, 2023

Okay, so I did two benchmarks. One simply measures the performance of pairs(::String), which is what this issue is really about. This is the benchmark

function time_pairs(s::String)
    n = 0
    for (i, c) in pairs(s)
        n = xor(n, reinterpret(UInt32, c) * i)
    end
    n
end

for source in ["en", "dk", "cn"]
    src = read(source, String)
    short = first(src, 20)
    println(source, " long:  ", @belapsed time_pairs($src))
    println(source, " short: ", @belapsed time_pairs($short))
end

With three difference sources: English (all ASCII), Danish (mostly ASCII) and Chinese (little ASCII). Results

            master    #51671    #51671 & this
en, short  53.8 ns    37.3 ns   19.9 ns    
en, long   21.7 us    17.4 us   11.8 us
da, short  57.3 ns    26.1 ns   19.6 ns
da, long   28.9 us    13.1 us    9.3 us
cn, short  116  ns    112  ns   52.9  us
cn, long   16.9 us    18.3 us    9.7 us

And here, for a simple lstrip benchmark:

bar(s) = sum(ncodeunits(lstrip(c -> reinterpret(UInt32, c) >>> 24 > 10, s)) for i in 1:10000000)

This takes ~950 ms on #51671, and ~640 ms on #51671 + this PR
This uses - on #51671 + this approximately:

  • 91% of time in lstrip
  • Of that, approximately 18% in SubString
  • 44% in iterate
  • I don't know what the rest is. Most of it shows up as "overhead" of lstrip in benchmarking. The benchmark does not allocate.

So, this PR is a clear win.

@jakobnissen
Copy link
Contributor Author

Bump. What needs to be done to get this through?

The default `pairs` will iterate keys and values separately. For strings, this
represents double work, since both these iterations will need to determine valid
string indices.
The introduced StringPairs type will, whenever possible, only compute valid
indices once.
Currently, this is only optimised for `String` and `SubString{String}`, and not
for `AbstractString`, nor is it optimised when reversed.
@jakobnissen jakobnissen added the needs nanosoldier run This PR should have benchmarks run on it label Oct 29, 2023
@jakobnissen
Copy link
Contributor Author

I've added "needs nanosoldier" because the noinline in string iteration may have unintended side effects. For example, it makes iteration of Cyrillic 2x slower and Chinese 1.6x slower. On the other hand, it allows iterate to inline itself so it will be faster for ASCII in many contexts. Just like the recent changes to nextind. So let's check its impact.
If the impact is unacceptable then I'll close this PR because then I don't know how to add the algorithmic improvement in this PR without causing regressions

@vtjnash
Copy link
Member

vtjnash commented Apr 16, 2024

@nanosoldier runbenchmarks("shootout" && "misc" && "io" && "string" && "strings", vs=":master")

@vtjnash vtjnash added merge me PR is reviewed. Merge when all tests are passing and removed needs nanosoldier run This PR should have benchmarks run on it labels Apr 16, 2024
@vtjnash
Copy link
Member

vtjnash commented Apr 16, 2024

It may still be worthwhile for someone to figure out why this isn't handled in the compiler already (past analysis suggested it was a problem with the way it used the SubString constructor #51631 (comment)), but might as well take the win now

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed, but no benchmarks were actually executed. Perhaps your tag predicate contains misspelled tags? cc @

@vtjnash
Copy link
Member

vtjnash commented Apr 16, 2024

@nanosoldier runbenchmarks("shootout" || "misc" || "io" || "string" || "strings", vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@jakobnissen
Copy link
Contributor Author

@DilumAluthge thanks for keeping this up to date. This is good to go now.

@jakobnissen
Copy link
Contributor Author

Bump

@DilumAluthge DilumAluthge merged commit 5897a92 into JuliaLang:master Jun 3, 2024
7 checks passed
@DilumAluthge DilumAluthge removed the merge me PR is reviewed. Merge when all tests are passing label Jun 3, 2024
@jakobnissen jakobnissen deleted the string_pairs branch June 4, 2024 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lstrip could be more efficient: enumerate vs pairs
6 participants