Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust: ADBC driver panics when consuming Arrow batches without properly aligned buffers #2526

Open
felipecrv opened this issue Feb 14, 2025 · 18 comments
Assignees
Labels
Type: bug Something isn't working

Comments

@felipecrv
Copy link
Contributor

felipecrv commented Feb 14, 2025

What happened?

Running SQL queries through the Snowflake ADBC driver, from Rust, can cause a crash due to buffer alignment issues.

thread 'main' panicked at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-buffer-53.4.0/src/buffer/scalar.rs:143:17:                                                                        
Memory pointer from external source (e.g, FFI) is not aligned with the specified scalar type. Before importing buffer through FFI, please make sure the allocation is aligned.

This is known problem for Rust [1] when it's consuming buffers from systems that don't care as much about alignment as rustc does. It's been fixed in the FFI integration with Python [2].

The ADBC Rust driver wrapper should do the same. How can it trigger the ArrayData::align_buffers() call that makes the Python FFI integration work without alignment issues?

[1] apache/arrow#43552
[2] apache/arrow-rs#6472

Stack Trace

thread 'tokio-runtime-worker' panicked at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-buffer/src/buffer/scalar.rs:138:17:
Memory pointer from external source (e.g, FFI) is not aligned with the specified scalar type. Before importing buffer through FFI, please make sure the allocation is aligned.
stack backtrace:
   0: rust_begin_unwind
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:662:5
   1: core::panicking::panic_fmt
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panicking.rs:74:14
   2: <arrow_buffer::buffer::scalar::ScalarBuffer<T> as core::convert::From<arrow_buffer::buffer::immutable::Buffer>>::from
   3: <T as core::convert::Into<U>>::into
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/convert/mod.rs:759:9
   4: arrow_buffer::buffer::scalar::ScalarBuffer<T>::new
             at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-buffer/src/buffer/scalar.rs:72:9
   5: <arrow_array::array::primitive_array::PrimitiveArray<T> as core::convert::From<arrow_data::data::ArrayData>>::from
             at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-array/src/array/primitive_array.rs:1538:22
   6: arrow_array::array::make_array
             at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-array/src/array/mod.rs:793:48
   7: <arrow_array::array::struct_array::StructArray as core::convert::From<arrow_data::data::ArrayData>>::from::{{closure}}
             at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-array/src/array/struct_array.rs:302:23
   8: core::iter::adapters::map::map_fold::{{closure}}
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/iter/adapters/map.rs:88:28
   9: <core::slice::iter::Iter<T> as core::iter::traits::iterator::Iterator>::fold
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/slice/iter/macros.rs:232:27
  10: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/iter/adapters/map.rs:128:9
  11: core::iter::traits::iterator::Iterator::for_each
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/iter/traits/iterator.rs:813:9
  12: alloc::vec::Vec<T,A>::extend_trusted
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/mod.rs:3121:17
  13: <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/spec_extend.rs:26:9
  14: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/spec_from_iter_nested.rs:60:9
  15: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/spec_from_iter.rs:33:9
  16: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/mod.rs:2985:9
  17: core::iter::traits::iterator::Iterator::collect
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/iter/traits/iterator.rs:2000:9
  18: <arrow_array::array::struct_array::StructArray as core::convert::From<arrow_data::data::ArrayData>>::from
             at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-array/src/array/struct_array.rs:299:22
  19: <arrow_array::ffi_stream::ArrowArrayStreamReader as core::iter::traits::iterator::Iterator>::next::{{closure}}
             at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-array/src/ffi_stream.rs:367:54
  20: core::result::Result<T,E>::map
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/result.rs:771:25
  21: <arrow_array::ffi_stream::ArrowArrayStreamReader as core::iter::traits::iterator::Iterator>::next
             at /home/felipe/.cargo/git/checkouts/arrow-rs-25fc54bc5e9ed300/f5b51ff/arrow-array/src/ffi_stream.rs:367:18
  22: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/iter/adapters/map.rs:107:9
  23: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/spec_from_iter_nested.rs:24:32
  24: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/spec_from_iter.rs:33:9
  25: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/alloc/src/vec/mod.rs:2985:9
  26: core::iter::traits::iterator::Iterator::collect
             at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/iter/traits/iterator.rs:2000:9

How can we reproduce the bug?

No response

Environment/Setup

No response

@felipecrv felipecrv added the Type: bug Something isn't working label Feb 14, 2025
@zeroshade
Copy link
Member

While this should definitely be done in ADBC rust to allow arbitrary drivers to work properly, should we also have the snowflake driver ensure that it always outputs aligned buffers?

@lidavidm
Copy link
Member

Does Go not align buffers?

Also please provide a backtrace if possible

@zeroshade
Copy link
Member

It should, but for ADBC we're using the mallocator so all the memory we use for Arrow is allocated in C. So, we're at the mercy of malloc

@felipecrv
Copy link
Contributor Author

Also please provide a backtrace if possible

Attached one now. I didn't have it when I filed the issue.

And I think I also figured how to fix this. But the fix will be in the arrow-array crate instead of this repo.

I think this is the place where the data.align_buffers() call can be placed. Right before the untyped ArrayData becomes a StructArray:

https://github.com/apache/arrow-rs/blob/main/arrow-array/src/ffi_stream.rs#L367

Very easy if I make it a requirement without the ability of opt-out from Rust callers like adbc-core.

@felipecrv felipecrv self-assigned this Feb 15, 2025
@CurtHagenlocher
Copy link
Contributor

FWIW, the allocator in the C# implementation of Arrow will overallocate by 64 bytes and then adjust the starting location to be on a 64-byte boundary.

@felipecrv
Copy link
Contributor Author

FWIW, the allocator in the C# implementation of Arrow will overallocate by 64 bytes and then adjust the starting location to be on a 64-byte boundary.

Which is great. That means that when ArrayData::align_buffers() is called on the Rust side, no buffer needs to be copied to be correctly aligned. I think the final stance on this topic should be that:

  1. producers must try their best to produce aligned buffers
  2. Rust consumers should always validate the alignment because they shouldn't trust external data and the risk of a panic is not worth the time saved skipping the alignment check (which will rarely trigger a copy of any buffers).

@lidavidm
Copy link
Member

The C++ impl does the overallocating trick too (or maybe it was the Java one). I think it wouldn't be hard to have mallocator do the same thing, right Matt?

@paleolimbot
Copy link
Member

It would be nice to know which data type Rust is complaining about (are there any data types that require an alignment >8 bytes anywhere in the spec?)

@tustvold
Copy link

tustvold commented Feb 15, 2025

FWIW I wonder if this is the same issue as apache/arrow#32276, e.g. if snowflake is using flight under the hood.

The C Data interface has the following to say on this topic

It is recommended, but not required, that the memory addresses of the buffers be aligned at least according to the type of primitive data that they contain. Consumers MAY decide not to support unaligned memory.

IMO copying unaligned buffers is a bandaid and arrow implementations should avoid creating such buffers.

are there any data types that require an alignment >8 bytes anywhere in the spec

IIRC i128 as used by Decimal128 requires 16 byte alignment on some architectures. I'm not aware of any types with alignment requirements greater than their size.

@felipecrv
Copy link
Contributor Author

It would be nice to know which data type Rust is complaining about (are there any data types that require an alignment >8 bytes anywhere in the spec?)

The integers coming from the Snowflake driver are actually decimal128 and arrow-rs uses mem::align_of::<u128>() to pick the BufferSpec::FixedWidth { alignment } for the decimal 128 buffers [1]. That is 16 bytes and C malloc only guarantees 8-byte alignment unless a malloc that takes alignment arguments is used. That's really hard to guarantee from Go or C code.

[1] https://github.com/apache/arrow-rs/blob/main/arrow-data/src/data.rs#L1591

@lidavidm
Copy link
Member

I created apache/arrow-go#282

@lidavidm
Copy link
Member

Ah, but Go directly uses []byte, so if we do the over-allocation trick we have nowhere to stash the "real" pointer...

@lidavidm
Copy link
Member

The best thing I can think of is to wrap aligned_alloc but that carries a number of caveats (requires C11, not portable to Windows, does not support realloc, cannot align an allocation smaller than the alignment).

@paleolimbot
Copy link
Member

It does seem unrealistic to achieve 16 byte alignment under most real-world circumstances (I don't think anything coming zero copy from IPC, for example, can ever guarantee more than 8).

@zeroshade
Copy link
Member

That actually might be the issue with the snowflake driver. The IPC reader is gonna try to zero copy where applicable, so it'll depend on how the network buffers allocate when reading data from the snowflake https requests

@tustvold
Copy link

tustvold commented Feb 17, 2025

but Go directly uses []byte, so if we do the over-allocation

How does Go support buffers sent over FFI / from IPC, which require using a release callback / ref count?

Perhaps it is worth raising on the mailing list, IMO either FFI should use aligned buffers, or we should remove the language from the spec and highlight instead that FFI may not be zero-copy. I suspect this will depend on how widespread the alignment challenges are, if it is just Go, it might make sense to just fix it there

The IPC reader is gonna try to zero copy where applicable, so it'll depend on how the network buffers allocate when reading data from the snowflake https requests

FWIW arrow-rs copies to align on IPC for this reason, but given network/disk transfers are involved there is less of an expectation of zero-copy here

@lidavidm
Copy link
Member

How does Go support buffers sent over FFI / from IPC, which require using a release callback / ref count?

Sorry, I was a bit imprecise with that - the allocator interface in arrow-go uses []byte but the rest of the library handles it through a Buffer which associates a []byte and an allocator object and has an associated reference count.

For the problem at hand, Matt and I were talking about adjusting Buffer to handle this. My point was more that we can't get away with just adjusting the allocator implementation. (The native-Go allocator actually does the overallocation trick, but there's no problem there because it's using the garbage collector. We can't use that allocator for ADBC FFI because Go does not like it when Go-allocated pointers escape Go code except under very specific circumstances.)

@lidavidm
Copy link
Member

Perhaps it is worth raising on the mailing list, IMO either FFI should use aligned buffers, or we should remove the language from the spec and highlight instead that FFI may not be zero-copy. I suspect this will depend on how widespread the alignment challenges are, if it is just Go, it might make sense to just fix it there

It would be a little unfortunate to lose zero-copy for in-process FFI, IMO. It takes away from one of the main selling points (even if I suspect that many applications would not notice in practice). I don't think the problem is unsolvable, it just isn't as trivial a change as I first thought.

@felipecrv felipecrv changed the title rust: ADBC driver crashes when consuming Arrow batches without properly aligned buffers rust: ADBC driver panics when consuming Arrow batches without properly aligned buffers Feb 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: bug Something isn't working
Projects
None yet
Development

No branches or pull requests

6 participants