Optimize Deque#concat(Indexable) #13283

Merged

HertzDevil
Contributor

Like #13280, but for Deque. Depends on #13257.
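For context, a small usage sketch of the call being optimized. The four argument types below are the ones exercised in the benchmarks, and all of them include `Indexable`; the values are made up for illustration, and the observable result is the same with or without this PR.

```crystal
d = Deque{1, 2, 3}

d.concat([4, 5])            # Array
d.concat(Slice[6, 7])       # Slice
d.concat(StaticArray[8, 9]) # StaticArray
d.concat(Deque{10})         # Deque

puts d.to_a == (1..10).to_a # => true
```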

The main differences are that Deque had no existing overloads more specialized than the one taking Enumerable arguments, and that it already has only a single buffer growth policy. Benchmarks:

Source
require "benchmark"

class Deque(T)
  def concat2(other : Indexable)
    # this PR's implementation
  end
end

{% begin %}
  M = {{ env("M").to_i }}
  N = {{ env("N").to_i }}
{% end %}
puts "M=#{M}, N=#{N}"

x = Deque(Int64).new

puts "Array:"
y_array = Array(Int64).new
Benchmark.ips do |b|
  b.report("ctrl") { x = Deque(Int64).new(M, &.to_i64); y_array = Array(Int64).new(N, &.to_i64) }
  b.report("old")  { x = Deque(Int64).new(M, &.to_i64); y_array = Array(Int64).new(N, &.to_i64); x.concat(y_array) }
  b.report("new")  { x = Deque(Int64).new(M, &.to_i64); y_array = Array(Int64).new(N, &.to_i64); x.concat2(y_array) }
end

puts "Slice:"
y_slice = Slice(Int64).empty
Benchmark.ips do |b|
  b.report("ctrl") { x = Deque(Int64).new(M, &.to_i64); y_slice = Slice(Int64).new(N, &.to_i64) }
  b.report("old")  { x = Deque(Int64).new(M, &.to_i64); y_slice = Slice(Int64).new(N, &.to_i64); x.concat(y_slice) }
  b.report("new")  { x = Deque(Int64).new(M, &.to_i64); y_slice = Slice(Int64).new(N, &.to_i64); x.concat2(y_slice) }
end

puts "StaticArray:"
y_static_array = uninitialized Int64[N]
Benchmark.ips do |b|
  b.report("old")  { x = Deque(Int64).new(M, &.to_i64); x.concat(y_static_array) }
  b.report("new")  { x = Deque(Int64).new(M, &.to_i64); x.concat2(y_static_array) }
end

puts "Deque:"
y_deque = Deque(Int64).new
Benchmark.ips do |b|
  b.report("ctrl") { x = Deque(Int64).new(M, &.to_i64); y_deque = Deque(Int64).new(N, &.to_i64) }
  b.report("old")  { x = Deque(Int64).new(M, &.to_i64); y_deque = Deque(Int64).new(N, &.to_i64); x.concat(y_deque) }
  b.report("new")  { x = Deque(Int64).new(M, &.to_i64); y_deque = Deque(Int64).new(N, &.to_i64); x.concat2(y_deque) }
end

M=1000, N=100
Array:
ctrl 820.27k (  1.22µs) (± 0.95%)  8.89kB/op        fastest
 old 448.18k (  2.23µs) (± 0.72%)  24.5kB/op   1.83× slower
 new 489.15k (  2.04µs) (± 0.92%)  24.5kB/op   1.68× slower
Slice:
ctrl 837.32k (  1.19µs) (± 0.85%)  8.86kB/op        fastest
 old 446.36k (  2.24µs) (± 0.97%)  24.5kB/op   1.88× slower
 new 499.05k (  2.00µs) (± 1.02%)  24.5kB/op   1.68× slower
StaticArray:
old 474.77k (  2.11µs) (± 0.69%)  23.5kB/op   1.13× slower
new 534.29k (  1.87µs) (± 0.96%)  23.5kB/op        fastest
Deque:
ctrl 808.45k (  1.24µs) (± 0.73%)  8.89kB/op        fastest
 old 440.56k (  2.27µs) (± 0.95%)  24.5kB/op   1.84× slower
 new 490.72k (  2.04µs) (± 0.70%)  24.5kB/op   1.65× slower

M=1000, N=1000
Array:
ctrl 493.17k (  2.03µs) (± 1.11%)  15.7kB/op        fastest
 old 192.04k (  5.21µs) (± 0.56%)  31.3kB/op   2.57× slower
 new 345.75k (  2.89µs) (± 2.76%)  31.3kB/op   1.43× slower
Slice:
ctrl 495.68k (  2.02µs) (± 1.07%)  15.7kB/op        fastest
 old 194.21k (  5.15µs) (± 0.53%)  31.3kB/op   2.55× slower
 new 334.78k (  2.99µs) (± 0.75%)  31.3kB/op   1.48× slower
StaticArray:
old 232.06k (  4.31µs) (± 0.49%)  23.5kB/op   1.91× slower
new 444.11k (  2.25µs) (± 0.63%)  23.5kB/op        fastest
Deque:
ctrl 455.51k (  2.20µs) (± 0.84%)  15.7kB/op        fastest
 old 188.27k (  5.31µs) (± 0.68%)  31.3kB/op   2.42× slower
 new 321.27k (  3.11µs) (± 0.56%)  31.3kB/op   1.42× slower

M=100, N=1000
Array:
ctrl 939.99k (  1.06µs) (± 0.90%)  8.89kB/op        fastest
 old 219.48k (  4.56µs) (± 0.72%)  32.8kB/op   4.28× slower
 new 452.07k (  2.21µs) (± 0.70%)  21.4kB/op   2.08× slower
Slice:
ctrl 956.11k (  1.05µs) (± 0.88%)  8.86kB/op        fastest
 old 219.85k (  4.55µs) (± 0.56%)  32.8kB/op   4.35× slower
 new 459.57k (  2.18µs) (± 0.61%)  21.4kB/op   2.08× slower
StaticArray:
old 273.96k (  3.65µs) (± 0.50%)  25.0kB/op   2.51× slower
new 686.58k (  1.46µs) (± 0.66%)  13.6kB/op        fastest
Deque:
ctrl 820.61k (  1.22µs) (± 0.79%)  8.89kB/op        fastest
 old 212.27k (  4.71µs) (± 0.99%)  32.8kB/op   3.87× slower
 new 426.32k (  2.35µs) (± 1.14%)  21.4kB/op   1.92× slower
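
The lower kB/op in the M=100, N=1000 run would be consistent with growing the buffer once to its final size instead of doubling it repeatedly while appending element by element. The following is a minimal sketch of that idea only, not this PR's implementation: `ToyRing`, `concat_each`, `concat_indexable` and `grows` are hypothetical names, and the doubling growth policy is an assumption made for the illustration.

```crystal
# Hypothetical toy ring buffer; NOT the standard library's Deque.
class ToyRing(T)
  getter size : Int32 = 0
  getter grows : Int32 = 0 # number of reallocations so far

  @buffer : Pointer(T)
  @capacity : Int32
  @start : Int32 = 0

  def initialize(capacity : Int32 = 4)
    @capacity = capacity
    @buffer = Pointer(T).malloc(capacity)
  end

  def push(value : T)
    grow(@size + 1) if @size == @capacity
    @buffer[(@start + @size) % @capacity] = value
    @size += 1
    self
  end

  def shift : T
    raise IndexError.new if @size == 0
    value = @buffer[@start]
    @start = (@start + 1) % @capacity
    @size -= 1
    value
  end

  # Element-by-element append: the final size is unknown up front,
  # so the buffer may be reallocated several times.
  def concat_each(other : Enumerable(T))
    other.each { |x| push(x) }
    self
  end

  # Size-aware append: `other.size` is known, so grow at most once,
  # then write every element straight into its slot.
  def concat_indexable(other : Indexable(T))
    needed = @size + other.size
    grow(needed) if needed > @capacity
    other.each_with_index do |x, i|
      @buffer[(@start + @size + i) % @capacity] = x
    end
    @size = needed
    self
  end

  def to_a : Array(T)
    Array(T).new(@size) { |i| @buffer[(@start + i) % @capacity] }
  end

  # Reallocates and linearizes the contents so they start at index 0.
  private def grow(min_capacity : Int32)
    new_capacity = Math.max(@capacity * 2, min_capacity)
    new_buffer = Pointer(T).malloc(new_capacity)
    @size.times { |i| new_buffer[i] = @buffer[(@start + i) % @capacity] }
    @buffer = new_buffer
    @capacity = new_capacity
    @start = 0
    @grows += 1
  end
end

items = (1..1000).to_a

a = ToyRing(Int32).new(4).concat_each(items)
b = ToyRing(Int32).new(4).concat_indexable(items)

puts a.grows # => 8
puts b.grows # => 1
```

Both calls leave the same elements in the buffer; only the number of reallocations differs. The real Deque may grow, copy, and handle wrap-around differently.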

straight-shoota added this to the 1.9.0 milestone on Apr 19, 2023
straight-shoota merged commit 5908aee into crystal-lang:master on Apr 21, 2023
HertzDevil deleted the perf/deque-concat-indexable branch on April 21, 2023 09:31