Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of readuntil #20621

Merged
merged 9 commits into from
Sep 28, 2017
Merged

Improve performance of readuntil #20621

merged 9 commits into from
Sep 28, 2017

Conversation

omus
Copy link
Member

@omus omus commented Feb 16, 2017

Was looking into readuntil and found a way to improve performance. Below are some of the benchmarks I used for comparison:

julia> function readuntil_old(s::IO, r::AbstractString)
           l = length(r)
           if l == 0
               return ""
           end
           out = IOBuffer()
           m = Array{Char}(l)  # last part of stream to match
           t = collect(r)
           i = 0
           while !eof(s)
               i += 1
               c = read(s, Char)
               write(out, c)
               if i <= l
                   m[i] = c
               else
                   # shift to last part of s
                   for j = 2:l
                       m[j-1] = m[j]
                   end
                   m[l] = c
               end
               if i >= l && m == t
                   break
               end
           end
           return String(take!(out))
       end
readuntil_old (generic function with 1 method)

julia> function readuntil_new(s::IO, t::AbstractString)
           l = length(t)
           if l == 0
               return ""
           end
           out = IOBuffer()
           i = 1
           while !eof(s)
               c = read(s, Char)
               write(out, c)
               i = c == t[i] ? i + 1 : 1
               if i > l
                   break
               end
           end
           return String(take!(out))
       end

readuntil_new (generic function with 1 method)

julia> using BenchmarkTools

julia> str = String([rand('A':'Z', 50000); '1']);

julia> io = IOBuffer(str);

julia> goal = str[end-1000:end];
julia> b1 = @benchmark readuntil_old(seekstart($io), $goal);
julia> b2 = @benchmark readuntil_new(seekstart($io), $goal);
julia> judge(median(b2),median(b1))
BenchmarkTools.TrialJudgement: 
  time:   -95.01% => improvement (5.00% tolerance)
  memory: -10.88% => improvement (1.00% tolerance)

julia> goal = str[end:end];
julia> b1 = @benchmark readuntil_old(seekstart($io), $goal);
julia> b2 = @benchmark readuntil_new(seekstart($io), $goal);
julia> judge(median(b2),median(b1))
BenchmarkTools.TrialJudgement: 
  time:   +2.53% => invariant (5.00% tolerance)
  memory: -0.28% => invariant (1.00% tolerance)

julia> goal = str;
julia> b1 = @benchmark readuntil_old($(seekstart(io)), $goal);
julia> b2 = @benchmark readuntil_new($(seekstart(io)), $goal);
julia>judge(median(b2),median(b1))
BenchmarkTools.TrialJudgement: 
  time:   +4.36% => invariant (5.00% tolerance)
  memory: -85.45% => improvement (1.00% tolerance)

I'll add these benchmarks to BaseBenchmarks.jl

@omus omus added io Involving the I/O subsystem: libuv, read, write, etc. performance Must go faster labels Feb 16, 2017
base/io.jl Outdated
m[l] = c
end
if i >= l && m == t
i = c == t[i] ? i + 1 : 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to advance 'i' here by a whole character to handle Unicode. I think it'll be easiest to do that by using start/next/done instead of 1/+1/l

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the original code i represents a character number and not a index into the string.

@vtjnash
Copy link
Member

vtjnash commented Feb 16, 2017

I think this proposed algorithm would fail with a terminator like "aab" matching against "aaab" (or in general, any string where the first character is repeated anywhere else in the string)

@omus
Copy link
Member Author

omus commented Feb 17, 2017

Benchmarks for latest version. Note the original benchmarks weren't rewinding the stream correctly so I also update the description's benchmarks.

julia> using BenchmarkTools
julia> str = String([rand('A':'Z', 50000); '1']);
julia> io = IOBuffer(str);

julia> goal = str[end-1000:end];
julia> b1 = @benchmark readuntil_old(seekstart($io), $goal);
julia> b2 = @benchmark readuntil_new(seekstart($io), $goal);
julia>judge(median(b2),median(b1))
BenchmarkTools.TrialJudgement: 
  time:   -95.66% => improvement (5.00% tolerance)
  memory: +5.27% => regression (1.00% tolerance)

julia> goal = str[end:end];
julia> b1 = @benchmark readuntil_old(seekstart($io), $goal);
julia> b2 = @benchmark readuntil_new(seekstart($io), $goal);
julia>judge(median(b2),median(b1))
BenchmarkTools.TrialJudgement: 
  time:   -16.59% => improvement (5.00% tolerance)
  memory: +0.00% => invariant (1.00% tolerance)

julia> goal = str;
julia> b1 = @benchmark readuntil_old(seekstart($io), $goal);
julia> b2 = @benchmark readuntil_new(seekstart($io), $goal);
julia>judge(median(b2),median(b1))
BenchmarkTools.TrialJudgement: 
  time:   +0.27% => invariant (5.00% tolerance)
  memory: +42.69% => regression (1.00% tolerance)

Memory regression is caused from the new array backtrack.

vtjnash added a commit that referenced this pull request Feb 17, 2017
@omus
Copy link
Member Author

omus commented Feb 17, 2017

Travis failure appears unrelated.

omus pushed a commit that referenced this pull request Mar 23, 2017
@omus
Copy link
Member Author

omus commented Mar 23, 2017

O(n + k) version of the algorithm which computes backtracking information on-the-fly. I ended up storing the backtracking information in a SparseVector which is more memory efficient than an Vector and computationally faster than a Dict.

@omus
Copy link
Member Author

omus commented Mar 23, 2017

I have a unicode test I want to add but currently that results in:

Error During Test
  Test threw an exception of type Base.UVError
  Expression: readuntil(io(t), s) == m
  read: network is down (ENETDOWN)

@StefanKarpinski
Copy link
Member

Wat?

@omus
Copy link
Member Author

omus commented Mar 23, 2017

It looks like the I/O producers "File" and "PipeEndpoint" choke with unicode input in test/read.jl

@tkelman
Copy link
Contributor

tkelman commented Mar 23, 2017

omus and others added 8 commits September 18, 2017 15:11
Looking over my original test I realized it potentially could be
offensive. I've used a different example to avoid any potential issues.
Caches backtracking information as it is needed. Using a SparseVector
which has a lower memory footprint than Vector but is more performant
than Dict.
Skip testing the I/O producers "File" and "PipeEndpoint" when working
with unicode.
makes it possible to use readuntil with any array (indexable) object
and optimizes a few more cases
@vtjnash
Copy link
Member

vtjnash commented Sep 19, 2017

@omus PTAL at my latest updates to your code :)

@omus
Copy link
Member Author

omus commented Sep 19, 2017

Looks really solid. I'll try to dig up some of my old benchmarks and give this a spin.

@vtjnash
Copy link
Member

vtjnash commented Sep 19, 2017

To be slightly more fair to the current algorithm, I was using the following modified version:

function readuntil_old(s::IO, r::Vector{UInt8})
                         l = sizeof(r)
                         if l == 0
                             return ""
                         end
                         out = Base.StringVector(0)
                         m = Array{UInt8}(l)  # last part of stream to match
                         i = 0
                         while !eof(s)
                             i += 1
                             c = read(s, UInt8)
                             push!(out, c)
                             if i <= l
                                 m[i] = c
                             else
                                 # shift to last part of s
                                 for j = 2:l
                                     m[j-1] = m[j]
                                 end
                                 m[l] = c
                             end
                             if i >= l && m == r
                                 break
                             end
                         end
                         return String(out)
                     end
 @time let g2 = Vector{UInt8}(goal); for i in 1:100; readuntil_old(seekstart(io), g2); end; end

I also added the worst case to my tests, where the new algo in this PR really shines:

str = ("A" ^ 50000) * "B";
goal = ("A" ^ 5000) * "Z";

@omus
Copy link
Member Author

omus commented Sep 19, 2017

I think you need to remove the String(...) from readuntil_old to make it a fair fight.

The only benchmark where the old algorithm was slightly faster is in this case:

goal = "A" ^ 50000;
str = "A" ^ 50000;

@omus
Copy link
Member Author

omus commented Sep 19, 2017

I'll make a PR to BaseBenchmarks.

@ararslan
Copy link
Member

Nanosoldier is now ready to run the readuntil benchmarks whenever you are.

@omus
Copy link
Member Author

omus commented Sep 22, 2017

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @ararslan

@omus
Copy link
Member Author

omus commented Sep 25, 2017

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please squash when merging

@omus
Copy link
Member Author

omus commented Sep 28, 2017

The new changes look great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
io Involving the I/O subsystem: libuv, read, write, etc. performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants