Buffer types for array backend #48728

Tokazama · 2023-02-19T23:21:47Z

Motivation

After a conversation on slack with @StefanKarpinski and @brenhinkeller it seemed like a lot of other PRs on improving arrays stalled out and the first fundamental step is having a generic buffer type. I still have crashes in the REPL with this current implementation and it appears I haven't fully implemented the GC side correctly, but I figured it was better to put this out there for someone else to start with then completely abandon this.

The goal here is to provide the core functionality for bare-bones storage of multiple repeated elements.

Implementation

There are three primary storage types this PR is intended to support:

Buffer{T}: has a fixed size and mutable elements
DynamicBuffer{T}: has a dynamic size and mutable elements.
ImmutableBuffer{T}: has a fixed size and immutable elements (not implemented here but should be as a future PR for with the freeze/thaw compiler optimizations worked on in prior ImmutableArray PRs).

The type layout is modeled after that of jl_array_t and is similar to the following native Julia structure:

struct MemoryChunk{T}
    length::Int
    data::Ptr{T}
end

Just like Array, the number of bytes stored determines if data points to inline allocations that extends the size of MemoryChunk via jl_gc_alloc or a seperately stored chunk via jl_gc_managed_malloc and jl_gc_track_malloced_buffer.

Storage of elements is similar as bit unions, bits, or pointers to boxed types is identical to that of Array or jl_array_t.

Note that there are not explicitly stored flags, offsets, additional dimensions, or element size. This means that:

Unlike jl_array_t, all information about the element type must be derived again everytime it's needed when running C code. Of course, this is not an issue once we are working with TBAA and that is all optimized away. However, I'm uncertain whether that will slow things down that don't often interact with TBAA directly (such as the garbage collector).
We don't have flags.how == 1 to mark a julia-allocated buffer that needs to be marked or flags.isshared. I've tried to use the last two its of the data pointer to mark these (see jl_buffer_isshared and jl_buffer_isunmarked). It may make more sense to have a type that explicitly wraps a shared buffer, preserving that bit for some other future use and allowing more explicit representation of data storage through the type system. I've yet to spend much time grocking the unmarked allocated data aspect of this, so there may be a better approach that I've yet to consider.
The lack of an offset or maximum size means that resizing for DynamicBuffer will always result in a reallocation or new allocation. Feature parity with Vector is intended to be accomplished through Julia code with something like
```
mutable struct DynamicVector{T} <: DenseVector{T}
    buffer::DynamicBuffer{T}
    offset::Int
    length::Int
end
```
where the length of DynamicBuffer here is functionally equivalent to Arrays maxsize.

I'm in the process of moving the implementation to native Julia. This is a bit challenging since this is essentially an attempt to rewrite large portions of "array.c" where a lot of things aren't implemented in native Julia code (interacting with the GC, traversing Unions) and sometimes working efficiently with a pointer get tricky (storing pointers to immutable types).

Remaining Work

Names may not be sufficiently self documenting and may even be misleading. The term "buffer" is often used to refer to something more akin to what our current Vector is with offsets and resizing. Perhaps something that is more clearly describing a continuous chunk of memory such as MemoryChunk, ResizableMemoryChunk, and ImmutableMemoryChunk.
What needs to be done here to support better allocation practices? There's a lot of interest in providing users with the ability specify how things are allocated (allocating to the stack, bump allocators, smart pointers). I've assumed that most of that would be better addressed in future PRs by those who really understand how to optimize that sort of thing, but I'd also like to ensure that the implementation here is not prohibittive to future developments.
Could we provide better support for Bool here so that we don't need an explicit BitBuffer type?
More support for future implementation of DynamicVector with user friendly resizing methods
The assumed effects for methods should change based on the buffer variant. This is currently buggy. For example, the effects for length(::Buffer) should not be the same as length(::DynamicBuffer), but currently are.
Would it make sense to implement ImmutableBuffer here without any of the thaw/freeze stuff that would complicate things?

This is almost entirely copying SimpleVector implementations. When calling `Core.sbuf(...)` it still throws a segmentation fault 11. Right now my only guess is that I inadvertently have miscounted the amount of builtin tags that I inserted and that it's trying to access memory that isn't there. I'm still figuring out that entire bit of code though, so it's easier said than done.

quinnj · 2023-02-20T00:00:59Z

Ha, @oscardssmith and I were just chatting about this about a month ago and we revived a branch I had started a couple years ago doing something similar: https://github.com/JuliaLang/julia/tree/add-buffer-type2. It takes a slightly different approach in basically copying jl_array_t (allowing any eltype) but having it be fixed size. We got it mostly working, but there were still a few GC-related pieces to fix.

(not saying this PR isn't good/right; just wanted to raise the fact that a parallel effort was restarted recently)

oscardssmith · 2023-02-20T00:17:34Z

@Tokazama I think a mix between our approaches is probably best. Specifically, I like that our version let you have a non UInt8 eltype, but I like your data layout for the struct a lot better.

Tokazama · 2023-02-20T00:22:02Z

I think allowing the eltype to be included is a good idea if we can keep the structure itself the same. This does complicate things quite a bit (particularly with the garbage collector) because you have to figure out size and alignment manually a lot more without fields that explicitly reference that. I wasn't confident enough in everything else I was doing to spend more time implementing that too.

oscardssmith · 2023-02-20T00:25:59Z

I think all you need to do is make a type parameter for the struct that stores the type, and then you can just get the size information from the type parameter.

Tokazama · 2023-02-20T00:29:40Z

If someone figured out the shared pointer thing so we could have a type that is truly transparent then it should be fairly simple. We'd basically have three pipelines for allocation:

SimplBuffer{BitType}
SimpleBuffer{UnionBitType}
SimpleBuffer{SharedPtr{BigType}}

Then we could have a ThinArray type that wraps it, hiding some of the ugliness from users without losing the transparency for other use cases.

I think all you need to do is make a type parameter for the struct that stores the type, and then you can just get the size information from the type parameter.

That would definitely work, but since I was still getting errors with a very basic implementation I wasn't sure it was a good idea to add more complexity yet.

Tokazama · 2023-02-20T00:40:38Z

While it's on my mind I'll change around a couple things to make the transition to different types more approachable.

quinnj · 2023-02-20T01:45:46Z

I might not quite understand the details well enough, but I thought @vtjnash was against doing any kind of shared pointer exposure in a buffer type; jl_value_ts should only ever live in /src code and never exposed on the Julia side.

Tokazama · 2023-02-20T02:55:01Z

I might not quite understand the details well enough, but I thought @vtjnash was against doing any kind of shared pointer exposure in a buffer type; jl_value_ts should only ever live in /src code and never exposed on the Julia side.

I don't mean we expose the pointer in its current state. I mean we need a new dedicated type that can serve the same role but is more transparent about what it's doing. So we'd need to declare a new type in "julia.h" as

typedef struct {
    JL_DATA_TYPE
    void *ptr
} jl_shared_ptr_t;

There then we could have something that reference counts in the garbage collector to ensure it's properly rooted.

EDIT: part of the strategy in #47735 is what I'm trying to get at here

Tokazama · 2023-02-20T19:32:23Z

latest has a field for the type and the element size. I think I fixed all the code that is dependent on how this changes the amount of memory consumed but I could be wrong. I haven't had errors in my naive tests in repl but once we have settled on some other design decisions we clearly need tests here.

One of the current assumptions here is that SimpleBuffer always owns its data. So there's no pointer to another one containing the actual data. I prefer that this sort of transparency exists at this level, but I figured it needed to be explicitly pointed out given how this is different than Array.

oscardssmith · 2023-02-21T15:54:20Z

Is it really a good idea to add the immutable buffer in this PR? To me that seems like a lot of complexity that's mostly orthogonal.

Tokazama · 2023-02-21T16:12:30Z

For me it was easier to implement them together because the compiler and builtin functions have to be different to accommodate the typename switch built on the same jl_buffer_t. If it's too much I can just remove it though.

Also, I'm not worrying about the behemoth of code for optimizations that replace creating a new immutable buffer with mutating the typename of a mutable buffer instead. In terms of compiler work I've only done the basic escape analysis for builtin functions so that derivatives of jl_buffer_t work.

Tokazama · 2023-04-12T05:12:11Z

One thing I was thinking about with this PR is that we might want to make it have a ccall barrier, and move most of the logic either to an LLVM intrinsic or julia itself. Doing that would allow for further constant propagation and might make it easier for LLVM to do things like turning a heap allocation into a stack one.

@gbaraldi, I think you're more familiar with the internal mechanics behind allocation and GC here. Do you have any suggestions concerning what entry points into the GC I should be using to track newly assigned pointers through a ccall on the Julia side and facilitating stack allocations?

Previously boxed values weren't being tracked by the GC when set in the native Julia code the set values and we didn't even bother worrying about immutables that contained pointer fields at all. This should appropriately track those values and ensures that we use a pointer safe variant of `memmove` for pointer fields

Use "kinds" instead of multiple fields to describe more of the layout. This could potentially be used to provide a clear way of switching up storage in the future without being disruptive to existing code, since a lot of this could change in the future anyway.

`Core.allocate_buffer` is intended to allocate buffer types, only requiring direct interaction with C when the GC collector needs to be notified. This isn't yet the default way to allocate because I'm still having trouble getting the GC to properly mark, track, and preserve stuff despite using exported C calls like `jl_gc_alloc`

Technically this executes but there are still some issues with rooting

Previous commit got resizing methods working in almost Pure Julia, but we can't allocate anything on the Julia side reliably. Unfortunately, this means any growth or deletion method must be predominantly done in C so that data at provided indices can actually be shifted around before freeing old data.

These don't need specific code generation anymore since this is all done using Julia code now

Tokazama · 2023-04-27T01:20:08Z

Despite some rough edges, I think this is ready for review. I'm still getting some odd errors when resizing DynamicBuffer, but besides that all else seems to work fine.

mkitti · 2023-04-27T06:09:14Z

Regarding garbage collection, is anyone here talking to the folks (e.g. @NHDaly et al) working on memory management toolkit.

https://github.com/mmtk/mmtk-julia or https://github.com/mmtk/julia

JeffBezanson · 2023-05-18T20:49:18Z

This is a large and important enough change that I think we need a detailed design document with everything thought through, and with wide buy-in before a lot of code is written. Every detail here can have wide-reaching implications. Some tricky points are how to accommodate non-moving resizing (which becomes very important for large data), and the existing Array feature of wrapping foreign pointers. Do we really want both growable and non-growable buffers? We need everybody to be comfortable with the trade-offs chosen here, and compare some alternate designs. @vtjnash, @oscardssmith and I started working on a proposal, but this will take some time.

NHDaly · 2023-05-18T22:23:14Z

Thanks for the ping, @mkitti. CC: @diogo and @kpamnany

jpsamaroo · 2023-05-19T15:49:50Z

👍 to growable and non-growable buffers (especially in the context of wrapping foreign pointers, where the underlying memory may not support it)

oscardssmith · 2023-05-19T15:53:54Z

@jpsamaroo to be clear, when we say "growable" we don't mean that it is guaranteed to be growable, but just that you can make a realloc-like call that might be able to grow the size of the buffer (if it happens to be feasible). This is unfortunately necessary to support efficient resizing of large arrays (where the OS can just give you more pages of memory), but since it is only sometimes possible, it wouldn't be a problem if some buffers don't successfully resize.

mkitti

I highlighted the remaining fixme and Todo comments in the code

mkitti · 2023-05-19T15:59:13Z

base/compiler/ssair/passes.jl

@@ -724,6 +724,7 @@ function perform_lifting!(compact::IncrementalCompact,
    return stmt_val # N.B. should never happen
 end

+# FIXME: Buffer


mkitti · 2023-05-19T15:59:29Z

base/compiler/ssair/inlining.jl

@@ -1017,6 +1017,7 @@ end

 rewrite_invoke_exprargs!(expr::Expr) = (expr.args = invoke_rewrite(expr.args); expr)

+# DECISION: Buffer?


mkitti · 2023-05-19T16:00:00Z

base/compiler/ssair/EscapeAnalysis/EscapeAnalysis.jl

+# TODO Buffer: escape analysis for jl_buffer_copy
+# is_buffer_copy(name::Symbol) = name === :jl_buffer_copy
+# # FIXME this implementation is very conservative, improve the accuracy and solve broken test cases


Todo and fixme

mkitti · 2023-05-19T16:02:32Z

src/cgutils.cpp

@@ -442,7 +447,8 @@ static size_t dereferenceable_size(jl_value_t *jt)
 // Return the min required / expected alignment of jltype (on the stack or heap)
 static unsigned julia_alignment(jl_value_t *jt)
 {
-    if (jl_is_array_type(jt)) {
+    // TODO BUffer: is it safe to assume will also always have this alignment?


mkitti · 2023-05-19T16:04:00Z

base/compiler/optimize.jl

@@ -694,6 +694,10 @@ function statement_cost(ex::Expr, line::Int, src::Union{CodeInfo, IRCode}, sptyp
            elseif (f === Core.arrayref || f === Core.const_arrayref || f === Core.arrayset) && length(ex.args) >= 3
                atyp = argextype(ex.args[3], src, sptypes)
                return isknowntype(atyp) ? 4 : error_path ? params.inline_error_path_cost : params.inline_nonleaf_penalty
+            # TODO optimize get_buffer_index && set_buffer_index!?


Tokazama · 2023-05-22T18:36:30Z

This is a large and important enough change that I think we need a detailed design document with everything thought through, and with wide buy-in before a lot of code is written. Every detail here can have wide-reaching implications. Some tricky points are how to accommodate non-moving resizing (which becomes very important for large data), and the existing Array feature of wrapping foreign pointers. Do we really want both growable and non-growable buffers? We need everybody to be comfortable with the trade-offs chosen here, and compare some alternate designs. @vtjnash, @oscardssmith and I started working on a proposal, but this will take some time.

Is this proposal in progress located anywhere or is the goal to have something more complete put together before getting too many people involved?

JeffBezanson · 2023-05-22T19:34:59Z

Draft proposal in progress here: https://gist.github.com/JeffBezanson/a25dde3bebb5a734af87bb5ddcf31fb0

Tokazama · 2023-06-06T23:51:38Z

I'm going to close this PR since Jeff's proposal already has some clear advantages over this approach. I'd be happy to help further work in this area if it would be helpful going forward. On the other hand, I understand too much feedback at this stage may be counterproductive so I'll stand by until further involvement is solicited.

Tokazama added 11 commits February 9, 2023 15:18

SimpleBuffer type modeled after SimpleVector

3b4bfb8

initialize type

af35dd8

add tags for buffer type

c6cd7fd

Fix codegen

c50f5b7

add dedicated hash

7e8ea0c

fix typo

cf00f92

apply for sbuf

2965d18

builds but doesn't work

97167e5

enforce 8-bit values for SimpleBuffer

2c0e068

internal 8-bit methods

5c7c7cf

Support eltype for SimpleBuffer

0172e85

giordano added the arrays [a, r, r, a, y, s] label Feb 20, 2023

proper size for gc and more constructors

7bd19bb

Tokazama added 4 commits February 20, 2023 14:49

Merge branch 'master' into buffers

3424c11

escape analysis on buffers

4604d77

mutable vs immutable buffers

17b0b11

julia implimentation of immutable instantiation

f702552

Tokazama changed the title ~~SimpleBuffer Draft for array backend~~ Buffer types for array backend Feb 21, 2023

Fix buffer size calculation

18da8a2

Tokazama added 13 commits April 15, 2023 15:53

Restructure layout stuff

0099b36

Use "kinds" instead of multiple fields to describe more of the layout. This could potentially be used to provide a clear way of switching up storage in the future without being disruptive to existing code, since a lot of this could change in the future anyway.

Merge branch 'master' into buffers

5abe6c4

Fix moving pointers when copying

89b1bf3

Merge branch 'master' into buffers

b672ce4

All Julia unsafe_copyto!

a517561

Technically this executes but there are still some issues with rooting

copy for boxed types and begin native resizing

e6b3be6

Remove C code that is fully replaced by Julia impl

b26baf4

Merge branch 'master' into buffers

e2ace7f

Remove bufref and bufset codegen

649e500

These don't need specific code generation anymore since this is all done using Julia code now

More tests for resizing

cbf5b2c

Tokazama marked this pull request as ready for review April 27, 2023 01:18

Tokazama added 2 commits April 27, 2023 03:53

Merge branch 'master' into buffers

654ac8f

Remove unintended find/replace changes in flisp

0652811

mkitti reviewed May 19, 2023

View reviewed changes

Tokazama closed this Jun 6, 2023

brenhinkeller removed the triage This should be discussed on a triage call label Aug 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Buffer types for array backend #48728

Buffer types for array backend #48728

Tokazama commented Feb 19, 2023 •

edited

Loading

quinnj commented Feb 20, 2023

oscardssmith commented Feb 20, 2023

Tokazama commented Feb 20, 2023

oscardssmith commented Feb 20, 2023

Tokazama commented Feb 20, 2023

Tokazama commented Feb 20, 2023

quinnj commented Feb 20, 2023

Tokazama commented Feb 20, 2023 •

edited

Loading

Tokazama commented Feb 20, 2023

oscardssmith commented Feb 21, 2023

Tokazama commented Feb 21, 2023 •

edited

Loading

Tokazama commented Apr 12, 2023

Tokazama commented Apr 27, 2023

mkitti commented Apr 27, 2023

JeffBezanson commented May 18, 2023

NHDaly commented May 18, 2023

jpsamaroo commented May 19, 2023

oscardssmith commented May 19, 2023

mkitti left a comment

mkitti May 19, 2023

mkitti May 19, 2023

mkitti May 19, 2023

mkitti May 19, 2023

mkitti May 19, 2023

Tokazama commented May 22, 2023

JeffBezanson commented May 22, 2023

Tokazama commented Jun 6, 2023

		@@ -1017,6 +1017,7 @@ end

		rewrite_invoke_exprargs!(expr::Expr) = (expr.args = invoke_rewrite(expr.args); expr)

		# DECISION: Buffer?

Buffer types for array backend #48728

Buffer types for array backend #48728

Conversation

Tokazama commented Feb 19, 2023 • edited Loading

Motivation

Implementation

Remaining Work

quinnj commented Feb 20, 2023

oscardssmith commented Feb 20, 2023

Tokazama commented Feb 20, 2023

oscardssmith commented Feb 20, 2023

Tokazama commented Feb 20, 2023

Tokazama commented Feb 20, 2023

quinnj commented Feb 20, 2023

Tokazama commented Feb 20, 2023 • edited Loading

Tokazama commented Feb 20, 2023

oscardssmith commented Feb 21, 2023

Tokazama commented Feb 21, 2023 • edited Loading

Tokazama commented Apr 12, 2023

Tokazama commented Apr 27, 2023

mkitti commented Apr 27, 2023

JeffBezanson commented May 18, 2023

NHDaly commented May 18, 2023

jpsamaroo commented May 19, 2023

oscardssmith commented May 19, 2023

mkitti left a comment

Choose a reason for hiding this comment

mkitti May 19, 2023

Choose a reason for hiding this comment

mkitti May 19, 2023

Choose a reason for hiding this comment

mkitti May 19, 2023

Choose a reason for hiding this comment

mkitti May 19, 2023

Choose a reason for hiding this comment

mkitti May 19, 2023

Choose a reason for hiding this comment

Tokazama commented May 22, 2023

JeffBezanson commented May 22, 2023

Tokazama commented Jun 6, 2023

Tokazama commented Feb 19, 2023 •

edited

Loading

Tokazama commented Feb 20, 2023 •

edited

Loading

Tokazama commented Feb 21, 2023 •

edited

Loading