46.0.0 (2023-08-21)
Breaking changes:
- API improvement: batches_to_flight_data forces clone #4656 [arrow]
- Add AnyDictionary Abstraction and Take ArrayRef in DictionaryArray::with_values #4707 [arrow] (tustvold)
- Cleanup parquet type builders #4706 [parquet] (tustvold)
- Take kernel dyn Array #4705 [arrow] (tustvold)
- Improve ergonomics of Scalar #4704 [arrow] (tustvold)
- Datum based comparison kernels (#4596) #4701 [parquet] [arrow] [arrow-flight] (tustvold)
- Improve Array Logical Nullability #4691 [parquet] [arrow] (tustvold)
- Validate ArrayData Buffer Alignment and Automatically Align IPC buffers (#4255) #4681 [arrow] (tustvold)
- More intuitive bool-to-string casting #4666 [arrow] (fsdvh)
- enhancement: batches_to_flight_data use a schema ref as param. #4665 [arrow] [arrow-flight] (jackwener)
- fix: from_thrift avoid panic when stats in invalid. #4642 [parquet] (jackwener)
- bug: Add some missing field in row group metadata: ordinal, total co… #4636 [parquet] (liurenjie1024)
- Remove deprecated limit kernel #4597 [arrow] (tustvold)
Implemented enhancements:
- parquet: support setting the field_id with an ArrowWriter #4702 [parquet]
- Support references in i256 arithmetic ops #4694 [arrow]
- Precision-Loss Decimal Arithmetic #4664 [arrow]
- Faster i256 Division #4663 [arrow]
- Support concat_batches for 0 columns #4661 [arrow]
- filter_record_batch should support filtering record batch without columns #4647 [arrow]
- Improve speed of lexicographical_partition_ranges #4614 [arrow]
- object_store: multipart ranges for HTTP #4612
- Add Rank Function #4606 [arrow]
- Datum Based Comparison Kernels #4596 [parquet] [arrow] [arrow-flight]
- Convenience method to create DataType::List correctly #4544 [arrow]
- Remove Deprecated Arithmetic Kernels #4481 [arrow]
- Equality kernel where null==null gives true #4438 [arrow]
Fixed bugs:
- Parquet ArrowWriter Ignores Nulls in Dictionary Values #4690 [parquet] [arrow]
- Schema Nullability Validation Fails to Account for Dictionary Nulls #4689 [parquet] [arrow]
- Comparison Kernels Ignore Nulls in Dictionary Values #4688 [parquet] [arrow]
- Casting List to String Ignores Format Options #4669 [arrow]
- Double free in C Stream Interface #4659 [arrow]
- CI Failing On Packed SIMD #4651 [arrow]
- RowInterner::size() much too low for high cardinality dictionary columns #4645 [arrow]
- Decimal PrimitiveArray change datatype after try_unary #4644
- Better explanation in docs for Dictionary field encoding using RowConverter #4639 [arrow]
- List(FixedSizeBinary) array equality check may return wrong result #4637 [arrow]
- arrow::compute::nullif panics if NullArray is provided #4634 [arrow]
- Empty lists in FixedSizeListArray::try_new is not handled #4623 [arrow]
- Bounds checking in MutableBuffer::set_null_bits can be bypassed #4620 [arrow]
- TypedDictionaryArray Misleading Null Behaviour #4616 [parquet] [arrow]
- bug: Parquet writer missing row group metadata fields such as compressed_size, file offset. #4610 [parquet]
- new_null_array generates an invalid union array #4600 [arrow]
- Footer parsing fails for very large parquet file. #4592 [parquet]
- bug(parquet): Disabling global statistics but enabling for particular column breaks reading #4587 [parquet]
- arrow::compute::concat panics for dense union arrays with non-trivial type IDs #4578 [arrow]
Closed issues:
- [object_store] when Create a AmazonS3 instance work with MinIO without set endpoint got error MissingRegion #4617
Merged pull requests:
- Add distinct kernels (#960) (#4438) #4716 [arrow] (tustvold)
- Update parquet object_store 0.7 #4715 [parquet] (tustvold)
- Support Field ID in ArrowWriter (#4702) #4710 [parquet] (tustvold)
- Remove rank kernels #4703 [arrow] (tustvold)
- Support references in i256 arithmetic ops #4692 [arrow] (viirya)
- Cleanup DynComparator (#2654) #4687 [arrow] (tustvold)
- Separate metadata fetch from ArrowReaderBuilder construction (#4674) #4676 [parquet] (tustvold)
- cleanup some assert() with error propagation #4673 [parquet] (zeevm)
- Faster i256 Division (2-100x) (#4663) #4672 [arrow] (tustvold)
- Fix MSRV CI #4671 (tustvold)
- Fix equality of nested nullable FixedSizeBinary (#4637) #4670 [arrow] (tustvold)
- Use ArrayFormatter in cast kernel #4668 [arrow] (tustvold)
- Minor: Improve API docs for FlightSQL metadata builders #4667 [arrow] [arrow-flight] (alamb)
- Support concat_batches for 0 columns #4662 [arrow] (Dandandan)
- fix ownership of c stream error #4660 [arrow] (wjones127)
- Minor: Fix illustration for dict encoding #4657 [arrow] (JayjeetAtGithub)
- minor: move comment to the correct location #4655 [arrow] (jackwener)
- Update packed_simd and run miri tests on simd code #4654 [arrow] (jhorstmann)
- impl From<Vec<T>> for BufferBuilder and MutableBuffer #4650 [arrow] (mbrobbel)
- Filter record batch with 0 columns #4648 [arrow] (Dandandan)
- Account for child Bucket size in OrderPreservingInterner #4646 [arrow] (alamb)
- Implement Default, Extend and FromIterator for BufferBuilder #4638 [arrow] (mbrobbel)
- fix(select): handle NullArray in nullif #4635 [arrow] (kawadakk)
- Move BufferBuilder to arrow-buffer #4630 [arrow] (mbrobbel)
- allow zero sized empty fixed #4626 [arrow] (smiklos)
- fix: compute_dictionary_mapping use wrong offsetSize #4625 [arrow] (jackwener)
- impl FromIterator for MutableBuffer #4624 [arrow] (mbrobbel)
- expand docs for FixedSizeListArray #4622 [arrow] (smiklos)
- fix(buffer): panic on end index overflow in MutableBuffer::set_null_bits #4621 [arrow] (kawadakk)
- impl Default for arrow_buffer::buffer::MutableBuffer #4619 [arrow] (mbrobbel)
- Minor: improve docs and add example for lexicographical_partition_ranges #4615 [arrow] (alamb)
- Cleanup sort #4613 [arrow] (tustvold)
- Add rank function (#4606) #4609 [arrow] (tustvold)
- Add more docs and examples for ListArray and OffsetsBuffer #4607 [arrow] (alamb)
- Simplify dictionary sort #4605 [arrow] (tustvold)
- Consolidate sort benchmarks #4604 [arrow] (tustvold)
- Don't Reorder Nulls in sort_to_indices (#4545) #4603 [arrow] (tustvold)
- fix(data): create child arrays of correct length when building a sparse union null array #4601 [arrow] (kawadakk)
- Use u32 metadata_len when parsing footer of parquet. #4599 [parquet] (Berrysoft)
- fix(data): map type ID to child index before indexing a union child array #4598 [arrow] (kawadakk)
- Remove deprecated arithmetic kernels (#4481) #4594 [arrow] (tustvold)
- Test Disabled Page Statistics (#4587) #4589 [parquet] (tustvold)
- Cleanup ArrayData::buffers #4583 [arrow] (tustvold)
- Use contains_nulls in ArrayData equality of byte arrays #4582 [arrow] (tustvold)
- Vectorized lexicographical_partition_ranges (~80% faster) #4575 [arrow] (tustvold)
- chore: add datatype new_list #4561 [arrow] (fansehep)
* This Changelog was automatically generated by github_changelog_generator