Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply Array#push's resizing heuristic to #unshift #10750

Merged
merged 4 commits into from
Jun 26, 2021

Conversation

HertzDevil
Copy link
Contributor

@HertzDevil HertzDevil commented May 27, 2021

Fixes #10748.

The following illustrates some examples of the heuristic in action:

class Array(T)
  def to_s(io : IO) : Nil
    root = root_buffer
    io << '['
    @capacity.times do |i|
      io << ", " if i > 0
      if @offset_to_buffer <= i < @offset_to_buffer + @size
        root[i].inspect(io)
      else
        io << '_'
      end
    end
    io << ']'
  end
end
x = Array(Int32).new(5)
x.push(1, 2); puts x
x.unshift(3); puts x
x.push(4); puts x

# before:
# [1, 2, _, _, _]
# [_, _, _, _, 3, 1, 2, _, _, _]
# [_, _, _, _, 3, 1, 2, 4, _, _]

# after:
# [1, 2, _, _, _]
# [_, 3, 1, 2, _] # there is enough space; move 1 and 2 to the middle of the root buffer
# [_, 3, 1, 2, 4]
x = Array(Int32).new(4)
x.unshift(0); puts x
x.shift; x.unshift(1); puts x
x.shift; x.unshift(2); puts x
x.unshift(4, 3); puts x

# before:
# [_, _, _, 0, _, _, _, _]
# [_, _, _, _, _, _, _, 1, _, _, _, _, _, _, _, _]
# [_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 2, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _]
# [_, _, _, _, _, _, _, _, _, _, _, _, _, 4, 3, 2, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _]

# after:
# [_, 0, _, _] # there is pre-allocated space; start inserting elements from the middle
# [_, 1, _, _]
# [_, 2, _, _]
# [_, 4, 3, 2] # there is enough space; move 3 and 2 to the middle of the root buffer

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Copy link
Contributor

@stakach stakach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@stakach
Copy link
Contributor

stakach commented May 27, 2021

Hopefully this can make it into v1.0.1

@HertzDevil HertzDevil changed the title Apply Array#shift's resizing heuristic to #unshift Apply Array#push's resizing heuristic to #unshift May 27, 2021
Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably fine for a quick fix.

But I think it might be beneficial to take a closer look at the applied heuristics. I could see some room for improvement there.
A worst-case scenario with this is adding an element each to the front and back of an array that's filled to capacity (a typical example would be literals). Adding just these two elements quadruples the capacity: it's doubled by both push and unshift, regardless of their order. This means allocating a lot of unnecessary memory.
If the resize operations for push and unshift would not fill the buffer all the way to the front and back, respectively, this could be avoided.
I don't know what would be a better solution and I'm not even sure this needs to be changed, but it probably deserves an evaluation and discussion. Obviously, the best solution depends on a fair balance between different use cases.

spec/std/array_spec.cr Outdated Show resolved Hide resolved
@asterite
Copy link
Member

I could see some room for improvement there

I see what you did here 😄 😉

@asterite
Copy link
Member

We should probably merge this since it's an important fix

@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:collection labels Jun 18, 2021
Copy link
Member

@beta-ziliani beta-ziliani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this for 1.1.0. There's the unanswered question about the need for 10k iterations in tests, when probably a handful ones should suffice. But we can change that later (I'd like to hear @HertzDevil opinion on the matter).

@asterite asterite added this to the 1.1.0 milestone Jun 25, 2021
@asterite asterite merged commit a03539c into crystal-lang:master Jun 26, 2021
@HertzDevil HertzDevil deleted the bug/array-unshift-resize branch June 27, 2021 03:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:collection
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leak on pop followed by unshift
5 participants