Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve performance of complex number parsing #35727

Closed
wants to merge 1 commit into from

Conversation

KristofferC
Copy link
Member

@KristofferC KristofferC commented May 4, 2020

using DelimitedFiles
A = randn(Complex{Float64},300,50)
writedlm("A.csv",A,',')
@time readdlm("A.csv",',',Complex{Float64},'\n')

Before:

  9.316321 seconds (293.64 k allocations: 3.262 GiB, 3.35% gc time)

After

  0.558150 seconds (18.06 k allocations: 1.130 MiB)

It is still very slow but hey.

Ref #35721

The super slowness came from the generic findprev which needs to allocate a brand new string to for the reversal:

r = reverse(s)

@KristofferC KristofferC added performance Must go faster strings "Strings!" labels May 4, 2020
@KristofferC KristofferC requested a review from stevengj May 4, 2020 09:07
@KristofferC KristofferC changed the title improve performance of complex number printing improve performance of complex number parsiong May 4, 2020
@KristofferC KristofferC changed the title improve performance of complex number parsiong improve performance of complex number parsing May 4, 2020
@codecov-io
Copy link

Codecov Report

Merging #35727 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #35727   +/-   ##
=======================================
  Coverage   86.14%   86.14%           
=======================================
  Files         346      346           
  Lines       64615    64617    +2     
=======================================
+ Hits        55663    55666    +3     
+ Misses       8952     8951    -1     
Impacted Files Coverage Δ
base/fastmath.jl 98.33% <ø> (ø)
base/special/rem_pio2.jl 97.74% <ø> (ø)
base/range.jl 88.49% <100.00%> (+0.05%) ⬆️
base/tuple.jl 92.89% <100.00%> (ø)
base/bitset.jl 92.23% <0.00%> (-1.95%) ⬇️
base/floatfuncs.jl 90.24% <0.00%> (-1.22%) ⬇️
base/float.jl 86.72% <0.00%> (-0.57%) ⬇️
base/promotion.jl 81.77% <0.00%> (-0.53%) ⬇️
base/twiceprecision.jl 88.77% <0.00%> (-0.34%) ⬇️
stdlib/LinearAlgebra/src/triangular.jl 96.31% <0.00%> (-0.13%) ⬇️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ab87d2...8c30017. Read the comment docs.

@stevengj
Copy link
Member

stevengj commented May 4, 2020

Wouldn't it be better to just speed up findprev for strings with a predicate function? Isn't the needed implementation just a loop?

@stevengj
Copy link
Member

stevengj commented May 4, 2020

For that matter, even the findnext implementation for AbstractString seems to leave much to be desired (it creates a SubString object and a pairs iterator rather than just looping directly). Seems like it should be just:

function findnext(testf::Function, s::AbstractString, i::Integer)
    z = ncodeunits(s) + 1
    1  i  z || throw(BoundsError(s, i))
    @inbounds i == z || isvalid(s, i) || string_index_err(s, i)
    e = lastindex(s)
    @inbounds while i < e
        testf(s[i]) && return i
        i = nextind(s, i)
    end
    return nothing
end

function findprev(testf::Function, s::AbstractString, i::Integer)
    z = ncodeunits(s) + 1
    0  i  z || throw(BoundsError(s, i))
    i == z && return nothing # current behavior, but seems a bit odd?
    @inbounds i == 0 || isvalid(s, i) || string_index_err(s, i)
    @inbounds while i > 1
        testf(s[i]) && return i
        i = prevind(s, i)
    end
    return nothing
end

(Note that returning nothing for findprev(f, s, ncodeunits(s)+1) seems like a bug to me, but that is the current behavior of this function.)

@stevengj
Copy link
Member

stevengj commented May 4, 2020

In fact, a lot of the code in base/strings/search.jl seems like it over-relies on iterator tricks rather than just looping with nextind/prevind. Can't we always assume an AbstractString is indexable? cc @Keno, who banged on this code in the iterator overhaul.

@JeffBezanson
Copy link
Member

(it creates a SubString object and a pairs iterator rather than just looping directly)

It would be interesting to know if this got faster since we can now stack allocate those objects.

@KristofferC
Copy link
Member Author

Seems like it should be just:

Mission accomplished ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants