-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer types for array backend #48728
Conversation
This is almost entirely copying SimpleVector implementations. When calling `Core.sbuf(...)` it still throws a segmentation fault 11. Right now my only guess is that I inadvertently have miscounted the amount of builtin tags that I inserted and that it's trying to access memory that isn't there. I'm still figuring out that entire bit of code though, so it's easier said than done.
Ha, @oscardssmith and I were just chatting about this about a month ago and we revived a branch I had started a couple years ago doing something similar: https://github.com/JuliaLang/julia/tree/add-buffer-type2. It takes a slightly different approach in basically copying (not saying this PR isn't good/right; just wanted to raise the fact that a parallel effort was restarted recently) |
@Tokazama I think a mix between our approaches is probably best. Specifically, I like that our version let you have a non UInt8 eltype, but I like your data layout for the struct a lot better. |
I think allowing the eltype to be included is a good idea if we can keep the structure itself the same. This does complicate things quite a bit (particularly with the garbage collector) because you have to figure out size and alignment manually a lot more without fields that explicitly reference that. I wasn't confident enough in everything else I was doing to spend more time implementing that too. |
I think all you need to do is make a type parameter for the struct that stores the type, and then you can just get the size information from the type parameter. |
If someone figured out the shared pointer thing so we could have a type that is truly transparent then it should be fairly simple. We'd basically have three pipelines for allocation:
Then we could have a
That would definitely work, but since I was still getting errors with a very basic implementation I wasn't sure it was a good idea to add more complexity yet. |
While it's on my mind I'll change around a couple things to make the transition to different types more approachable. |
I might not quite understand the details well enough, but I thought @vtjnash was against doing any kind of shared pointer exposure in a buffer type; |
I don't mean we expose the pointer in its current state. I mean we need a new dedicated type that can serve the same role but is more transparent about what it's doing. So we'd need to declare a new type in "julia.h" as typedef struct {
JL_DATA_TYPE
void *ptr
} jl_shared_ptr_t; There then we could have something that reference counts in the garbage collector to ensure it's properly rooted. EDIT: part of the strategy in #47735 is what I'm trying to get at here |
latest has a field for the type and the element size. I think I fixed all the code that is dependent on how this changes the amount of memory consumed but I could be wrong. I haven't had errors in my naive tests in repl but once we have settled on some other design decisions we clearly need tests here. One of the current assumptions here is that |
Is it really a good idea to add the immutable buffer in this PR? To me that seems like a lot of complexity that's mostly orthogonal. |
For me it was easier to implement them together because the compiler and builtin functions have to be different to accommodate the typename switch built on the same Also, I'm not worrying about the behemoth of code for optimizations that replace creating a new immutable buffer with mutating the typename of a mutable buffer instead. In terms of compiler work I've only done the basic escape analysis for builtin functions so that derivatives of |
@gbaraldi, I think you're more familiar with the internal mechanics behind allocation and GC here. Do you have any suggestions concerning what entry points into the GC I should be using to track newly assigned pointers through a |
Previously boxed values weren't being tracked by the GC when set in the native Julia code the set values and we didn't even bother worrying about immutables that contained pointer fields at all. This should appropriately track those values and ensures that we use a pointer safe variant of `memmove` for pointer fields
Use "kinds" instead of multiple fields to describe more of the layout. This could potentially be used to provide a clear way of switching up storage in the future without being disruptive to existing code, since a lot of this could change in the future anyway.
`Core.allocate_buffer` is intended to allocate buffer types, only requiring direct interaction with C when the GC collector needs to be notified. This isn't yet the default way to allocate because I'm still having trouble getting the GC to properly mark, track, and preserve stuff despite using exported C calls like `jl_gc_alloc`
Technically this executes but there are still some issues with rooting
Previous commit got resizing methods working in almost Pure Julia, but we can't allocate anything on the Julia side reliably. Unfortunately, this means any growth or deletion method must be predominantly done in C so that data at provided indices can actually be shifted around before freeing old data.
These don't need specific code generation anymore since this is all done using Julia code now
Despite some rough edges, I think this is ready for review. I'm still getting some odd errors when resizing |
Regarding garbage collection, is anyone here talking to the folks (e.g. @NHDaly et al) working on memory management toolkit. https://github.com/mmtk/mmtk-julia or https://github.com/mmtk/julia |
This is a large and important enough change that I think we need a detailed design document with everything thought through, and with wide buy-in before a lot of code is written. Every detail here can have wide-reaching implications. Some tricky points are how to accommodate non-moving resizing (which becomes very important for large data), and the existing |
👍 to growable and non-growable buffers (especially in the context of wrapping foreign pointers, where the underlying memory may not support it) |
@jpsamaroo to be clear, when we say "growable" we don't mean that it is guaranteed to be growable, but just that you can make a realloc-like call that might be able to grow the size of the buffer (if it happens to be feasible). This is unfortunately necessary to support efficient resizing of large arrays (where the OS can just give you more pages of memory), but since it is only sometimes possible, it wouldn't be a problem if some buffers don't successfully resize. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I highlighted the remaining fixme and Todo comments in the code
@@ -724,6 +724,7 @@ function perform_lifting!(compact::IncrementalCompact, | |||
return stmt_val # N.B. should never happen | |||
end | |||
|
|||
# FIXME: Buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixme
@@ -1017,6 +1017,7 @@ end | |||
|
|||
rewrite_invoke_exprargs!(expr::Expr) = (expr.args = invoke_rewrite(expr.args); expr) | |||
|
|||
# DECISION: Buffer? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decision
# TODO Buffer: escape analysis for jl_buffer_copy | ||
# is_buffer_copy(name::Symbol) = name === :jl_buffer_copy | ||
# # FIXME this implementation is very conservative, improve the accuracy and solve broken test cases |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todo and fixme
@@ -442,7 +447,8 @@ static size_t dereferenceable_size(jl_value_t *jt) | |||
// Return the min required / expected alignment of jltype (on the stack or heap) | |||
static unsigned julia_alignment(jl_value_t *jt) | |||
{ | |||
if (jl_is_array_type(jt)) { | |||
// TODO BUffer: is it safe to assume will also always have this alignment? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todo
@@ -694,6 +694,10 @@ function statement_cost(ex::Expr, line::Int, src::Union{CodeInfo, IRCode}, sptyp | |||
elseif (f === Core.arrayref || f === Core.const_arrayref || f === Core.arrayset) && length(ex.args) >= 3 | |||
atyp = argextype(ex.args[3], src, sptypes) | |||
return isknowntype(atyp) ? 4 : error_path ? params.inline_error_path_cost : params.inline_nonleaf_penalty | |||
# TODO optimize get_buffer_index && set_buffer_index!? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todo
Is this proposal in progress located anywhere or is the goal to have something more complete put together before getting too many people involved? |
Draft proposal in progress here: https://gist.github.com/JeffBezanson/a25dde3bebb5a734af87bb5ddcf31fb0 |
I'm going to close this PR since Jeff's proposal already has some clear advantages over this approach. I'd be happy to help further work in this area if it would be helpful going forward. On the other hand, I understand too much feedback at this stage may be counterproductive so I'll stand by until further involvement is solicited. |
Motivation
After a conversation on slack with @StefanKarpinski and @brenhinkeller it seemed like a lot of other PRs on improving arrays stalled out and the first fundamental step is having a generic buffer type. I still have crashes in the REPL with this current implementation and it appears I haven't fully implemented the GC side correctly, but I figured it was better to put this out there for someone else to start with then completely abandon this.
The goal here is to provide the core functionality for bare-bones storage of multiple repeated elements.
Implementation
There are three primary storage types this PR is intended to support:
Buffer{T}
: has a fixed size and mutable elementsDynamicBuffer{T}
: has a dynamic size and mutable elements.ImmutableBuffer{T}
: has a fixed size and immutable elements (not implemented here but should be as a future PR for with thefreeze
/thaw
compiler optimizations worked on in priorImmutableArray
PRs).The type layout is modeled after that of
jl_array_t
and is similar to the following native Julia structure:Just like
Array
, the number of bytes stored determines ifdata
points to inline allocations that extends the size ofMemoryChunk
viajl_gc_alloc
or a seperately stored chunk viajl_gc_managed_malloc
andjl_gc_track_malloced_buffer
.Storage of elements is similar as bit unions, bits, or pointers to boxed types is identical to that of
Array
orjl_array_t
.Note that there are not explicitly stored flags, offsets, additional dimensions, or element size. This means that:
Unlike
jl_array_t
, all information about the element type must be derived again everytime it's needed when running C code. Of course, this is not an issue once we are working with TBAA and that is all optimized away. However, I'm uncertain whether that will slow things down that don't often interact with TBAA directly (such as the garbage collector).We don't have
flags.how == 1
to mark a julia-allocated buffer that needs to be marked orflags.isshared
. I've tried to use the last two its of the data pointer to mark these (seejl_buffer_isshared
andjl_buffer_isunmarked
). It may make more sense to have a type that explicitly wraps a shared buffer, preserving that bit for some other future use and allowing more explicit representation of data storage through the type system. I've yet to spend much time grocking the unmarked allocated data aspect of this, so there may be a better approach that I've yet to consider.The lack of an offset or maximum size means that resizing for
DynamicBuffer
will always result in a reallocation or new allocation. Feature parity withVector
is intended to be accomplished through Julia code with something likewhere the length of
DynamicBuffer
here is functionally equivalent toArray
s maxsize.I'm in the process of moving the implementation to native Julia. This is a bit challenging since this is essentially an attempt to rewrite large portions of "array.c" where a lot of things aren't implemented in native Julia code (interacting with the GC, traversing
Union
s) and sometimes working efficiently with a pointer get tricky (storing pointers to immutable types).Remaining Work
Names may not be sufficiently self documenting and may even be misleading. The term "buffer" is often used to refer to something more akin to what our current
Vector
is with offsets and resizing. Perhaps something that is more clearly describing a continuous chunk of memory such asMemoryChunk
,ResizableMemoryChunk
, andImmutableMemoryChunk
.What needs to be done here to support better allocation practices? There's a lot of interest in providing users with the ability specify how things are allocated (allocating to the stack, bump allocators, smart pointers). I've assumed that most of that would be better addressed in future PRs by those who really understand how to optimize that sort of thing, but I'd also like to ensure that the implementation here is not prohibittive to future developments.
Could we provide better support for
Bool
here so that we don't need an explicitBitBuffer
type?More support for future implementation of
DynamicVector
with user friendly resizing methodsThe assumed effects for methods should change based on the buffer variant. This is currently buggy. For example, the effects for
length(::Buffer)
should not be the same aslength(::DynamicBuffer)
, but currently are.Would it make sense to implement
ImmutableBuffer
here without any of thethaw
/freeze
stuff that would complicate things?