Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize behavior of shrink in standard library data structures #2009

Open
andrewrk opened this issue Feb 25, 2019 · 2 comments
Open

optimize behavior of shrink in standard library data structures #2009

andrewrk opened this issue Feb 25, 2019 · 2 comments
Labels
standard library This issue involves writing Zig code for the standard library.
Milestone

Comments

@andrewrk
Copy link
Member

Related: #1306

As @schmee notes in #2008, ArrayList.shrink does not actually interact with the allocator.

I would like it to call shrink in the allocator. However the allocator currently has no way to communicate back, "hey I actually can't reclaim this memory, so if you just want to hang on to it, that would actually be better". So I want to look into this before deciding how shrink will work for array lists.

To resolve this issue, we need to enumerate the various use cases for the various data structures that can have the concept of shrink:

  • HashMap
  • ArrayList
  • PriorityQueue
  • (there are more)

And we need to consider the various allocators that may be used as a backing allocator for these data structures:

  • ArenaAllocator / FixedBufferAllocator (can never reclaim space when reallocating)
  • General Purpose Allocator (can sometimes reclaim space when reallocating)
  • (there are more)

And then make sure that the most optimal thing is happening in every case, for every data structure. If this isn't the case, then the Allocator API and/or data structure APIs need to be modified.

@andrewrk andrewrk added the standard library This issue involves writing Zig code for the standard library. label Feb 25, 2019
@andrewrk andrewrk added this to the 0.5.0 milestone Feb 25, 2019
@andrewrk
Copy link
Member Author

andrewrk commented Mar 1, 2019

Here's the plan to solve it #1306 (comment)

andrewrk added a commit that referenced this issue Mar 15, 2019
Before, allocator implementations had to provide `allocFn`,
`reallocFn`, and `freeFn`.

Now, they must provide only `reallocFn` and `shrinkFn`.
Reallocating from a zero length slice is allocation, and
shrinking to a zero length slice is freeing.

When the new memory size is less than or equal to the
previous allocation size, `reallocFn` now has the option
to return `error.OutOfMemory` to indicate that the allocator
would not be able to take advantage of the new size.

For more details see #1306. This commit closes #1306.

This commit paves the way to solving #2009.

This commit also introduces a memory leak to all coroutines.
There is an issue where a coroutine calls the function and it
frees its own stack frame, but then the return value of `shrinkFn`
is a slice, which is implemented as an sret struct. Writing to
the return pointer causes invalid memory write. We could work
around it by having a global helper function which has a void
return type and calling that instead. But instead this hack will
suffice until I rework coroutines to be non-allocating. Basically
coroutines are not supported right now until they are reworked as
in #1194.
@andrewrk
Copy link
Member Author

Have a look at the new Allocator interface API:

zig/std/mem.zig

Lines 11 to 71 in 9c13e9b

pub const Allocator = struct {
pub const Error = error{OutOfMemory};
/// Realloc is used to modify the size or alignment of an existing allocation,
/// as well as to provide the allocator with an opportunity to move an allocation
/// to a better location.
/// When the size/alignment is greater than the previous allocation, this function
/// returns `error.OutOfMemory` when the requested new allocation could not be granted.
/// When the size/alignment is less than or equal to the previous allocation,
/// this function returns `error.OutOfMemory` when the allocator decides the client
/// would be better off keeping the extra alignment/size. Clients will call
/// `shrinkFn` when they require the allocator to track a new alignment/size,
/// and so this function should only return success when the allocator considers
/// the reallocation desirable from the allocator's perspective.
/// As an example, `std.ArrayList` tracks a "capacity", and therefore can handle
/// reallocation failure, even when `new_n` <= `old_mem.len`. A `FixedBufferAllocator`
/// would always return `error.OutOfMemory` for `reallocFn` when the size/alignment
/// is less than or equal to the old allocation, because it cannot reclaim the memory,
/// and thus the `std.ArrayList` would be better off retaining its capacity.
/// When `reallocFn` returns,
/// `return_value[0..min(old_mem.len, new_byte_count)]` must be the same
/// as `old_mem` was when `reallocFn` is called. The bytes of
/// `return_value[old_mem.len..]` have undefined values.
/// The returned slice must have its pointer aligned at least to `new_alignment` bytes.
reallocFn: fn (
self: *Allocator,
// Guaranteed to be the same as what was returned from most recent call to
// `reallocFn` or `shrinkFn`.
// If `old_mem.len == 0` then this is a new allocation and `new_byte_count`
// is guaranteed to be >= 1.
old_mem: []u8,
// If `old_mem.len == 0` then this is `undefined`, otherwise:
// Guaranteed to be the same as what was returned from most recent call to
// `reallocFn` or `shrinkFn`.
// Guaranteed to be >= 1.
// Guaranteed to be a power of 2.
old_alignment: u29,
// If `new_byte_count` is 0 then this is a free and it is guaranteed that
// `old_mem.len != 0`.
new_byte_count: usize,
// Guaranteed to be >= 1.
// Guaranteed to be a power of 2.
// Returned slice's pointer must have this alignment.
new_alignment: u29,
) Error![]u8,
/// This function deallocates memory. It must succeed.
shrinkFn: fn (
self: *Allocator,
// Guaranteed to be the same as what was returned from most recent call to
// `reallocFn` or `shrinkFn`.
old_mem: []u8,
// Guaranteed to be the same as what was returned from most recent call to
// `reallocFn` or `shrinkFn`.
old_alignment: u29,
// Guaranteed to be less than or equal to `old_mem.len`.
new_byte_count: usize,
// If `new_byte_count == 0` then this is `undefined`, otherwise:
// Guaranteed to be less than or equal to `old_alignment`.
new_alignment: u29,
) []u8,

In particular, realloc now returns error.OutOfMemory when the new size is smaller than the old size and the allocator decides it has nothing useful to do with the reclaimed bytes.

And here's std.ArrayList now which takes advantage of it:

zig/std/array_list.zig

Lines 144 to 150 in 9c13e9b

pub fn shrink(self: *Self, new_len: usize) void {
assert(new_len <= self.len);
self.len = new_len;
self.items = self.allocator.realloc(self.items, new_len) catch |e| switch (e) {
error.OutOfMemory => return, // no problem, capacity is still correct then.
};
}

So what's left is to update the other data structures from the standard library to take advantage of this new interface.

@andrewrk andrewrk modified the milestones: 0.5.0, 0.6.0 Sep 20, 2019
@andrewrk andrewrk modified the milestones: 0.6.0, 0.7.0 Oct 17, 2019
@andrewrk andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 13, 2020
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 Nov 6, 2020
@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

No branches or pull requests

1 participant