Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve performance for string(...) #28876

Merged
merged 2 commits into from
Aug 26, 2018
Merged

improve performance for string(...) #28876

merged 2 commits into from
Aug 26, 2018

Conversation

KristofferC
Copy link
Member

@KristofferC KristofferC commented Aug 24, 2018

Before:

julia> @btime string('a') # what is even going on here...
  732.898 ns (4 allocations: 192 bytes)
"a"

julia> @btime string('a', "aa")
  168.540 ns (4 allocations: 192 bytes)
"aaa"

julia> @btime string("a", 'a', 3)
  467.898 ns (6 allocations: 288 bytes)
"aa3"

julia> @btime string('a', 3, "fdsfhjsdfjdlsfjldsjflsdjfkldsjflksdjflkdsjfldsjfk")
  537.153 ns (7 allocations: 368 bytes)
"a3fdsfhjsdfjdlsfjldsjflsdjfkldsjflksdjflkdsjfldsjfk"

After:

julia> @btime string('a')
  12.222 ns (1 allocation: 32 bytes)
"a"

julia> @btime string('a', "aa")
  37.108 ns (2 allocations: 64 bytes)
"aaa"

julia> @btime string("a", 'a', 3)
  293.483 ns (5 allocations: 272 bytes)
"aa3"

julia> @btime string('a', 3, "fdsfhjsdfjdlsfjldsjflsdjfkldsjflksdjflkdsjfldsjfk")
  313.825 ns (5 allocations: 320 bytes)
"a3fdsfhjsdfjdlsfjldsjflsdjfkldsjflksdjflkdsjfldsjfk"

Ref JuliaLang/LinearAlgebra.jl#557

@KristofferC
Copy link
Member Author

KristofferC commented Aug 24, 2018

It's funny, merging back the env keyword into the original function while keeping everything else as is gives

julia> @btime string('a')
  671.764 ns (3 allocations: 160 bytes)
"a"

julia> @btime string("a", 'a', 3)
  287.393 ns (5 allocations: 272 bytes)
"aa3"

julia> @btime string('a', 3, "fdsfhjsdfjdlsfjldsjflsdjfkldsjflksdjflkdsjfldsjfk")
  312.225 ns (5 allocations: 320 bytes)
"a3fdsfhjsdfjdlsfjldsjflsdjfkldsjflksdjflkdsjfldsjfk"

i.e. only the single Char argument is penalized. Why??

Edit: It is not only ::Char that is penalized, for example with this PR:

julia> @btime string(1, 2)
  243.444 ns (7 allocations: 368 bytes)
"12"

but with the methods merged:

julia> @btime string(1,2)
  923.842 ns (7 allocations: 368 bytes)
"12"

Anyway, I'll leave it like this since it is consistently faser.

@KristofferC KristofferC mentioned this pull request Aug 24, 2018
@bkamins
Copy link
Member

bkamins commented Aug 24, 2018

If special case of string(::Char...) is particularly relevant then it is possible to have a special method for this case.

Something like:

function string(c::Char...)
    sz = 0
    for v in c
        sz += codelen(v)
    end
    u = Vector{UInt8}(undef, sz)
    i = 1
    for v in c
        x = bswap(reinterpret(UInt32, v))
        for j in 1:codelen(v)
            @inbounds u[i] = x % UInt8
            i += 1
            x >>= 8
        end
    end
    String(u)
end

should be a bit faster.

EDIT: added @inbounds.

@KristofferC
Copy link
Member Author

KristofferC commented Aug 25, 2018

Why can't we bring back the previous optimized version for Union{String, Char}?

julia> function string2(a::Union{String,AbstractChar}...)
           n = 0
           for v in a
               if v isa Char
                   n += Base.codelen(v)
               else
                   n += sizeof(v)
               end
           end
           out = Base._string_n(n)
           offs = 1
           for v in a
               if v isa Char
                   x = bswap(reinterpret(UInt32, v))
                   for j in 1:Base.codelen(v)
                       unsafe_store!(pointer(out, offs), x % UInt8)
                       offs += 1
                       x >>= 8
                   end
               else
                   unsafe_copyto!(pointer(out,offs), pointer(v), sizeof(v))
                   offs += sizeof(v)
               end
           end
           return out
       end
string2 (generic function with 1 method)

julia> @btime string2('a', "fsf", 'a', "fdsf")
  42.563 ns (1 allocation: 32 bytes)
"afsfafdsf"

julia> @btime string('a', "fsf", 'a', "fdsf")
  259.211 ns (3 allocations: 176 bytes)

@bkamins
Copy link
Member

bkamins commented Aug 25, 2018

Good point - we can . Though the signature should be Union{String, SubString{String}, Char} not Union{String, AbstractChar} as we should handle substrings and characters other than Char might have a different implementation.

As you are probably aware what you propose is not the "old" implementation of Union{String, AbstarctChar} which used print and would be slower, but it is a very nice piece of code combining all what we had discussed up to this point 👍.

@KristofferC
Copy link
Member Author

Yeah, the optimization I was referring to bringing back was the one that used to look like:

function string(a::Union{String,Char}...)
n = 0
for d in a
if isa(d,Char)
n += codelen(d::Char)
else
n += (d::String).len
end
end
out = _string_n(n)
offs = 1
p = pointer(out)
for d in a
if isa(d,Char)
c = UInt32(d::Char)
if c < 0x80
unsafe_store!(p, c%UInt8, offs); offs += 1
elseif c < 0x800
unsafe_store!(p, (( c >> 6 ) | 0xC0)%UInt8, offs); offs += 1
unsafe_store!(p, (( c & 0x3F ) | 0x80)%UInt8, offs); offs += 1
elseif c < 0x10000
unsafe_store!(p, (( c >> 12 ) | 0xE0)%UInt8, offs); offs += 1
unsafe_store!(p, (((c >> 6) & 0x3F ) | 0x80)%UInt8, offs); offs += 1
unsafe_store!(p, (( c & 0x3F ) | 0x80)%UInt8, offs); offs += 1
elseif c < 0x110000
unsafe_store!(p, (( c >> 18 ) | 0xF0)%UInt8, offs); offs += 1
unsafe_store!(p, (((c >> 12) & 0x3F ) | 0x80)%UInt8, offs); offs += 1
unsafe_store!(p, (((c >> 6) & 0x3F ) | 0x80)%UInt8, offs); offs += 1
unsafe_store!(p, (( c & 0x3F ) | 0x80)%UInt8, offs); offs += 1
else
# '\ufffd'
unsafe_store!(p, 0xef, offs); offs += 1
unsafe_store!(p, 0xbf, offs); offs += 1
unsafe_store!(p, 0xbd, offs); offs += 1
end
else
l = (d::String).len
unsafe_copy!(pointer(out,offs), pointer(d::String), l)
offs += l
end
end
return out
end

I pushed an update. I kept the previos modifications to print_to_string since the improvements there are still valid. Please review :)

@bkamins
Copy link
Member

bkamins commented Aug 25, 2018

All looks good to me. The only fear I had is whether compiler efficiently handles the situation of 3 element union and isa tests and it looks that it does as it should. Thank you.

@KristofferC KristofferC merged commit 7314249 into master Aug 26, 2018
@KristofferC KristofferC deleted the kc/string_char_perf branch August 26, 2018 10:16
KristofferC added a commit that referenced this pull request Aug 26, 2018
* improve performance for various string methods

(cherry picked from commit 7314249)
KristofferC added a commit that referenced this pull request Sep 8, 2018
* improve performance for various string methods

(cherry picked from commit 7314249)
KristofferC added a commit that referenced this pull request Sep 8, 2018
* improve performance for various string methods

(cherry picked from commit 7314249)
This was referenced Nov 7, 2018
KristofferC added a commit that referenced this pull request Feb 11, 2019
* improve performance for various string methods

(cherry picked from commit 7314249)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants