Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize BitArray#reverse! #11363

Merged

Conversation

HertzDevil
Copy link
Contributor

With BitArray#reverse! available through Indexable::Mutable, this PR adds a much more performant implementation through the addition of the llvm.bitreverse.* intrinsics. These intrinsics are available in all LLVM releases supported by Crystal (latest down to 3.8).

On large BitArrays, this new implementation is ~20x faster for sizes that are multiples of 32, ~13x faster for sizes that aren't. On small BitArrays this becomes a constant-time operation. Benchmarks:

require "benchmark"
require "bit_array"

struct BitArray
  # implementation from `Indexable::Mutable`
  def reverse_old! : self
    return self if size <= 1

    index0 = 0
    index1 = size - 1

    while index0 < index1
      swap(index0, index1)
      index0 += 1
      index1 -= 1
    end

    self
  end
end

module A
  @@ba = uninitialized BitArray
  def self.ba; @@ba; end
  def self.ba=(@@ba); end
end

N = {0, 1, 2, 4, 5, 8, 10, 16, 32, 50, 64, 100, 128, 256, 500, 512, 1000, 1024, 2048, 4096, 5000}

N.each do |n|
  A.ba = BitArray.new(n)
  A.ba.map! { rand > 0.5 }

  Benchmark.ips do |b|
    b.report("old #{n}") do
      A.ba.reverse_old!
    end

    b.report("new #{n}") do
      A.ba.reverse!
    end
  end
end

reverse

@straight-shoota straight-shoota added this to the 1.3.0 milestone Oct 27, 2021
@straight-shoota straight-shoota merged commit d3bf53a into crystal-lang:master Oct 29, 2021
@HertzDevil HertzDevil deleted the perf/bitarray-reverse branch October 29, 2021 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants