-
Notifications
You must be signed in to change notification settings - Fork 1.3k
/
Copy path41.0.0.md
363 lines (343 loc) · 32.8 KB
/
41.0.0.md
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Apache DataFusion 41.0.0 Changelog
This release consists of 245 commits from 69 contributors. See credits at the end of this changelog for more information.
**Breaking changes:**
- make unparser `Dialect` trait `Send` + `Sync` [#11504](https://github.com/apache/datafusion/pull/11504) (y-f-u)
- Implement physical plan serialization for csv COPY plans , add `as_any`, `Debug` to `FileFormatFactory` [#11588](https://github.com/apache/datafusion/pull/11588) (Lordworms)
- Consistent API to set parameters of aggregate and window functions (`AggregateExt` --> `ExprFunctionExt`) [#11550](https://github.com/apache/datafusion/pull/11550) (timsaucer)
- Rename `ColumnOptions` to `ParquetColumnOptions` [#11512](https://github.com/apache/datafusion/pull/11512) (alamb)
- Rename `input_type` --> `input_types` on AggregateFunctionExpr / AccumulatorArgs / StateFieldsArgs [#11666](https://github.com/apache/datafusion/pull/11666) (lewiszlw)
- Rename RepartitionExec metric `repart_time` to `repartition_time` [#11703](https://github.com/apache/datafusion/pull/11703) (alamb)
- Remove `AggregateFunctionDefinition` [#11803](https://github.com/apache/datafusion/pull/11803) (lewiszlw)
- Skipping partial aggregation when it is not helping for high cardinality aggregates [#11627](https://github.com/apache/datafusion/pull/11627) (korowa)
- Optionally create name of aggregate expression from expressions [#11776](https://github.com/apache/datafusion/pull/11776) (lewiszlw)
**Performance related:**
- feat: Optimize CASE expression for "column or null" use case [#11534](https://github.com/apache/datafusion/pull/11534) (andygrove)
- feat: Optimize CASE expression for usage where then and else values are literals [#11553](https://github.com/apache/datafusion/pull/11553) (andygrove)
- perf: Optimize IsNotNullExpr [#11586](https://github.com/apache/datafusion/pull/11586) (andygrove)
**Implemented enhancements:**
- feat: Add `fail_on_overflow` option to `BinaryExpr` [#11400](https://github.com/apache/datafusion/pull/11400) (andygrove)
- feat: add UDF to_local_time() [#11347](https://github.com/apache/datafusion/pull/11347) (appletreeisyellow)
- feat: switch to using proper Substrait types for IntervalYearMonth and IntervalDayTime [#11471](https://github.com/apache/datafusion/pull/11471) (Blizzara)
- feat: support UDWFs in Substrait [#11489](https://github.com/apache/datafusion/pull/11489) (Blizzara)
- feat: support `unnest` in GROUP BY clause [#11469](https://github.com/apache/datafusion/pull/11469) (JasonLi-cn)
- feat: support `COUNT()` [#11229](https://github.com/apache/datafusion/pull/11229) (tshauck)
- feat: consume and produce Substrait type extensions [#11510](https://github.com/apache/datafusion/pull/11510) (Blizzara)
- feat: Error when a SHOW command is passed in with an accompanying non-existant variable [#11540](https://github.com/apache/datafusion/pull/11540) (itsjunetime)
- feat: support Map literals in Substrait consumer and producer [#11547](https://github.com/apache/datafusion/pull/11547) (Blizzara)
- feat: add bounds for unary math scalar functions [#11584](https://github.com/apache/datafusion/pull/11584) (tshauck)
- feat: Add support for cardinality function on maps [#11801](https://github.com/apache/datafusion/pull/11801) (Weijun-H)
- feat: support `Utf8View` type in `starts_with` function [#11787](https://github.com/apache/datafusion/pull/11787) (tshauck)
- feat: Expose public method for optimizing physical plans [#11879](https://github.com/apache/datafusion/pull/11879) (andygrove)
**Fixed bugs:**
- fix: Fix eq properties regression from #10434 [#11363](https://github.com/apache/datafusion/pull/11363) (suremarc)
- fix: make sure JOIN ON expression is boolean type [#11423](https://github.com/apache/datafusion/pull/11423) (jonahgao)
- fix: `regexp_replace` fails when pattern or replacement is a scalar `NULL` [#11459](https://github.com/apache/datafusion/pull/11459) (Weijun-H)
- fix: unparser generates wrong sql for derived table with columns [#11505](https://github.com/apache/datafusion/pull/11505) (y-f-u)
- fix: make `UnKnownColumn`s not equal to others physical exprs [#11536](https://github.com/apache/datafusion/pull/11536) (jonahgao)
- fix: fixes trig function order by [#11559](https://github.com/apache/datafusion/pull/11559) (tshauck)
- fix: CASE with NULL [#11542](https://github.com/apache/datafusion/pull/11542) (Weijun-H)
- fix: panic and incorrect results in `LogFunc::output_ordering()` [#11571](https://github.com/apache/datafusion/pull/11571) (jonahgao)
- fix: expose the fluent API fn for approx_distinct instead of the module [#11644](https://github.com/apache/datafusion/pull/11644) (Michael-J-Ward)
- fix: dont try to coerce list for regex match [#11646](https://github.com/apache/datafusion/pull/11646) (tshauck)
- fix: regr_count now returns Uint64 [#11731](https://github.com/apache/datafusion/pull/11731) (Michael-J-Ward)
- fix: set `null_equals_null` to false when `convert_cross_join_to_inner_join` [#11738](https://github.com/apache/datafusion/pull/11738) (jonahgao)
- fix: Add additional required expression for natural join [#11713](https://github.com/apache/datafusion/pull/11713) (Lordworms)
- fix: hash join tests with forced collisions [#11806](https://github.com/apache/datafusion/pull/11806) (korowa)
- fix: `collect_columns` quadratic complexity [#11843](https://github.com/apache/datafusion/pull/11843) (crepererum)
**Documentation updates:**
- Minor: Add link to blog to main DataFusion website [#11356](https://github.com/apache/datafusion/pull/11356) (alamb)
- Add `to_local_time()` in function reference docs [#11401](https://github.com/apache/datafusion/pull/11401) (appletreeisyellow)
- Minor: Consolidate specification doc sections [#11427](https://github.com/apache/datafusion/pull/11427) (alamb)
- Combine the Roadmap / Quarterly Roadmap sections [#11426](https://github.com/apache/datafusion/pull/11426) (alamb)
- Minor: Add an example for backtrace pretty print [#11450](https://github.com/apache/datafusion/pull/11450) (goldmedal)
- Docs: Document creating new extension APIs [#11425](https://github.com/apache/datafusion/pull/11425) (alamb)
- Minor: Clarify which parquet options are used for reading/writing [#11511](https://github.com/apache/datafusion/pull/11511) (alamb)
- Support `newlines_in_values` CSV option [#11533](https://github.com/apache/datafusion/pull/11533) (connec)
- chore: Minor cleanup `simplify_demo()` example [#11576](https://github.com/apache/datafusion/pull/11576) (kavirajk)
- Move Datafusion Query Optimizer to library user guide [#11563](https://github.com/apache/datafusion/pull/11563) (devesh-2002)
- Fix typo in doc of Partitioning [#11612](https://github.com/apache/datafusion/pull/11612) (waruto210)
- Doc: A tiny typo in scalar function's doc [#11620](https://github.com/apache/datafusion/pull/11620) (2010YOUY01)
- Change default Parquet writer settings to match arrow-rs (except for compression & statistics) [#11558](https://github.com/apache/datafusion/pull/11558) (wiedld)
- Rename `functions-array` to `functions-nested` [#11602](https://github.com/apache/datafusion/pull/11602) (goldmedal)
- Add parser option enable_options_value_normalization [#11330](https://github.com/apache/datafusion/pull/11330) (xinlifoobar)
- Add reference to #comet channel in Arrow Rust Discord server [#11637](https://github.com/apache/datafusion/pull/11637) (ajmarcus)
- Extract catalog API to separate crate, change `TableProvider::scan` to take a trait rather than `SessionState` [#11516](https://github.com/apache/datafusion/pull/11516) (findepi)
- doc: why nullable of list item is set to true [#11626](https://github.com/apache/datafusion/pull/11626) (jcsherin)
- Docs: adding explicit mention of test_utils to docs [#11670](https://github.com/apache/datafusion/pull/11670) (edmondop)
- Ensure statistic defaults in parquet writers are in sync [#11656](https://github.com/apache/datafusion/pull/11656) (wiedld)
- Merge `string-view2` branch: reading from parquet up to 2x faster for some ClickBench queries (not on by default) [#11667](https://github.com/apache/datafusion/pull/11667) (alamb)
- Doc: Add Sail to known users list [#11791](https://github.com/apache/datafusion/pull/11791) (shehabgamin)
- Move min and max to user defined aggregate function, remove `AggregateFunction` / `AggregateFunctionDefinition::BuiltIn` [#11013](https://github.com/apache/datafusion/pull/11013) (edmondop)
- Change name of MAX/MIN udaf to lowercase max/min [#11795](https://github.com/apache/datafusion/pull/11795) (edmondop)
- doc: Add support for `map` and `make_map` functions [#11799](https://github.com/apache/datafusion/pull/11799) (Weijun-H)
- Improve readme page in crates.io [#11809](https://github.com/apache/datafusion/pull/11809) (lewiszlw)
- refactor: remove unneed mut for session context [#11864](https://github.com/apache/datafusion/pull/11864) (sunng87)
**Other:**
- Prepare 40.0.0 Release [#11343](https://github.com/apache/datafusion/pull/11343) (andygrove)
- Support `NULL` literals in where clause [#11266](https://github.com/apache/datafusion/pull/11266) (xinlifoobar)
- Implement TPCH substrait integration test, support tpch_6, tpch_10, t… [#11349](https://github.com/apache/datafusion/pull/11349) (Lordworms)
- Fix bug when pushing projection under joins [#11333](https://github.com/apache/datafusion/pull/11333) (jonahgao)
- Minor: some cosmetics in `filter.rs`, fix clippy due to logical conflict [#11368](https://github.com/apache/datafusion/pull/11368) (comphead)
- Update prost-derive requirement from 0.12 to 0.13 [#11355](https://github.com/apache/datafusion/pull/11355) (dependabot[bot])
- Minor: update dashmap `6.0.1` [#11335](https://github.com/apache/datafusion/pull/11335) (alamb)
- Improve and test dataframe API examples in docs [#11290](https://github.com/apache/datafusion/pull/11290) (alamb)
- Remove redundant `unalias_nested` calls for creating Filter's [#11340](https://github.com/apache/datafusion/pull/11340) (alamb)
- Enable `clone_on_ref_ptr` clippy lint on optimizer [#11346](https://github.com/apache/datafusion/pull/11346) (lewiszlw)
- Update termtree requirement from 0.4.1 to 0.5.0 [#11383](https://github.com/apache/datafusion/pull/11383) (dependabot[bot])
- Introduce `resources_err!` error macro [#11374](https://github.com/apache/datafusion/pull/11374) (comphead)
- Enable `clone_on_ref_ptr` clippy lint on common [#11384](https://github.com/apache/datafusion/pull/11384) (lewiszlw)
- Track parquet writer encoding memory usage on MemoryPool [#11345](https://github.com/apache/datafusion/pull/11345) (wiedld)
- Minor: remove clones and unnecessary Arcs in `from_substrait_rex` [#11337](https://github.com/apache/datafusion/pull/11337) (alamb)
- Minor: Change no-statement error message to be clearer [#11394](https://github.com/apache/datafusion/pull/11394) (itsjunetime)
- Change `array_agg` to return `null` on no input rather than empty list [#11299](https://github.com/apache/datafusion/pull/11299) (jayzhan211)
- Minor: return "not supported" for `COUNT DISTINCT` with multiple arguments [#11391](https://github.com/apache/datafusion/pull/11391) (jonahgao)
- Enable `clone_on_ref_ptr` clippy lint on sql [#11380](https://github.com/apache/datafusion/pull/11380) (lewiszlw)
- Move configuration information out of example usage page [#11300](https://github.com/apache/datafusion/pull/11300) (alamb)
- chore: reuse a single function to create the Substrait TPCH consumer test contexts [#11396](https://github.com/apache/datafusion/pull/11396) (Blizzara)
- refactor: change error type for "no statement" [#11411](https://github.com/apache/datafusion/pull/11411) (crepererum)
- Implement prettier SQL unparsing (more human readable) [#11186](https://github.com/apache/datafusion/pull/11186) (MohamedAbdeen21)
- Move `overlay` planning to`ExprPlanner` [#11398](https://github.com/apache/datafusion/pull/11398) (dharanad)
- Coerce types for all union children plans when eliminating nesting [#11386](https://github.com/apache/datafusion/pull/11386) (gruuya)
- Add customizable equality and hash functions to UDFs [#11392](https://github.com/apache/datafusion/pull/11392) (joroKr21)
- Implement ScalarFunction `MAKE_MAP` and `MAP` [#11361](https://github.com/apache/datafusion/pull/11361) (goldmedal)
- Improve `CommonSubexprEliminate` rule with surely and conditionally evaluated stats [#11357](https://github.com/apache/datafusion/pull/11357) (peter-toth)
- fix(11397): surface proper errors in ParquetSink [#11399](https://github.com/apache/datafusion/pull/11399) (wiedld)
- Minor: Add note about SQLLancer fuzz testing to docs [#11430](https://github.com/apache/datafusion/pull/11430) (alamb)
- Trivial: use arrow csv writer's timestamp_tz_format [#11407](https://github.com/apache/datafusion/pull/11407) (tmi)
- Improved unparser documentation [#11395](https://github.com/apache/datafusion/pull/11395) (alamb)
- Avoid calling shutdown after failed write of AsyncWrite [#11415](https://github.com/apache/datafusion/pull/11415) (joroKr21)
- Short term way to make `AggregateStatistics` still work when min/max is converted to udaf [#11261](https://github.com/apache/datafusion/pull/11261) (Rachelint)
- Implement TPCH substrait integration test, support tpch_13, tpch_14,16 [#11405](https://github.com/apache/datafusion/pull/11405) (Lordworms)
- Minor: fix giuthub action labeler rules [#11428](https://github.com/apache/datafusion/pull/11428) (alamb)
- Minor: change internal error to not supported error for nested field … [#11446](https://github.com/apache/datafusion/pull/11446) (alamb)
- Minor: change Datafusion --> DataFusion in docs [#11439](https://github.com/apache/datafusion/pull/11439) (alamb)
- Support serialization/deserialization for custom physical exprs in proto [#11387](https://github.com/apache/datafusion/pull/11387) (lewiszlw)
- remove termtree dependency [#11416](https://github.com/apache/datafusion/pull/11416) (Kev1n8)
- Add SessionStateBuilder and extract out the registration of defaults [#11403](https://github.com/apache/datafusion/pull/11403) (Omega359)
- integrate consumer tests, implement tpch query 18 to 22 [#11462](https://github.com/apache/datafusion/pull/11462) (Lordworms)
- Docs: Explain the usage of logical expressions for `create_aggregate_expr` [#11458](https://github.com/apache/datafusion/pull/11458) (jayzhan211)
- Return scalar result when all inputs are constants in `map` and `make_map` [#11461](https://github.com/apache/datafusion/pull/11461) (Rachelint)
- Enable `clone_on_ref_ptr` clippy lint on functions\* [#11468](https://github.com/apache/datafusion/pull/11468) (lewiszlw)
- minor: non-overlapping `repart_time` and `send_time` metrics [#11440](https://github.com/apache/datafusion/pull/11440) (korowa)
- Minor: rename `row_groups.rs` to `row_group_filter.rs` [#11481](https://github.com/apache/datafusion/pull/11481) (alamb)
- Support alternate formats for unparsing `datetime` to `timestamp` and `interval` [#11466](https://github.com/apache/datafusion/pull/11466) (y-f-u)
- chore: Add criterion benchmark for CaseExpr [#11482](https://github.com/apache/datafusion/pull/11482) (andygrove)
- Initial support for `StringView`, merge changes from `string-view` development branch [#11402](https://github.com/apache/datafusion/pull/11402) (alamb)
- Replace to_lowercase with to_string in sql example [#11486](https://github.com/apache/datafusion/pull/11486) (lewiszlw)
- Minor: Make execute_input_stream Accessible for Any Sinking Operators [#11449](https://github.com/apache/datafusion/pull/11449) (berkaysynnada)
- Enable `clone_on_ref_ptr` clippy lints on proto [#11465](https://github.com/apache/datafusion/pull/11465) (lewiszlw)
- upgrade sqlparser 0.47 -> 0.48 [#11453](https://github.com/apache/datafusion/pull/11453) (MohamedAbdeen21)
- Add extension hooks for encoding and decoding UDAFs and UDWFs [#11417](https://github.com/apache/datafusion/pull/11417) (joroKr21)
- Remove element's nullability of array_agg function [#11447](https://github.com/apache/datafusion/pull/11447) (jayzhan211)
- Get expr planners when creating new planner [#11485](https://github.com/apache/datafusion/pull/11485) (jayzhan211)
- Support alternate format for Utf8 unparsing (CHAR) [#11494](https://github.com/apache/datafusion/pull/11494) (sgrebnov)
- implement retract_batch for xor accumulator [#11500](https://github.com/apache/datafusion/pull/11500) (drewhayward)
- Refactor: more clearly delineate between `TableParquetOptions` and `ParquetWriterOptions` [#11444](https://github.com/apache/datafusion/pull/11444) (wiedld)
- chore: fix typos of common and core packages [#11520](https://github.com/apache/datafusion/pull/11520) (JasonLi-cn)
- Move spill related functions to spill.rs [#11509](https://github.com/apache/datafusion/pull/11509) (findepi)
- Add tests that show the different defaults for `ArrowWriter` and `TableParquetOptions` [#11524](https://github.com/apache/datafusion/pull/11524) (wiedld)
- Create `datafusion-physical-optimizer` crate [#11507](https://github.com/apache/datafusion/pull/11507) (lewiszlw)
- Minor: Assert `test_enabled_backtrace` requirements to run [#11525](https://github.com/apache/datafusion/pull/11525) (comphead)
- Move handlign of NULL literals in where clause to type coercion pass [#11491](https://github.com/apache/datafusion/pull/11491) (xinlifoobar)
- Update parquet page pruning code to use the `StatisticsExtractor` [#11483](https://github.com/apache/datafusion/pull/11483) (alamb)
- Enable SortMergeJoin LeftAnti filtered fuzz tests [#11535](https://github.com/apache/datafusion/pull/11535) (comphead)
- chore: fix typos of expr, functions, optimizer, physical-expr-common,… [#11538](https://github.com/apache/datafusion/pull/11538) (JasonLi-cn)
- Minor: Remove clone in `PushDownFilter` [#11532](https://github.com/apache/datafusion/pull/11532) (jayzhan211)
- Minor: avoid a clone in type coercion [#11530](https://github.com/apache/datafusion/pull/11530) (alamb)
- Move array `ArrayAgg` to a `UserDefinedAggregate` [#11448](https://github.com/apache/datafusion/pull/11448) (jayzhan211)
- Move `MAKE_MAP` to ExprPlanner [#11452](https://github.com/apache/datafusion/pull/11452) (goldmedal)
- chore: fix typos of sql, sqllogictest and substrait packages [#11548](https://github.com/apache/datafusion/pull/11548) (JasonLi-cn)
- Prevent bigger files from being checked in [#11508](https://github.com/apache/datafusion/pull/11508) (findepi)
- Add dialect param to use double precision for float64 in Postgres [#11495](https://github.com/apache/datafusion/pull/11495) (Sevenannn)
- Minor: move `SessionStateDefaults` into its own module [#11566](https://github.com/apache/datafusion/pull/11566) (alamb)
- refactor: rewrite mega type to an enum containing both cases [#11539](https://github.com/apache/datafusion/pull/11539) (LorrensP-2158466)
- Move `sql_compound_identifier_to_expr ` to `ExprPlanner` [#11487](https://github.com/apache/datafusion/pull/11487) (dharanad)
- Support SortMergeJoin spilling [#11218](https://github.com/apache/datafusion/pull/11218) (comphead)
- Fix unparser invalid sql for query with order [#11527](https://github.com/apache/datafusion/pull/11527) (y-f-u)
- Provide DataFrame API for `map` and move `map` to `functions-array` [#11560](https://github.com/apache/datafusion/pull/11560) (goldmedal)
- Move OutputRequirements to datafusion-physical-optimizer crate [#11579](https://github.com/apache/datafusion/pull/11579) (xinlifoobar)
- Minor: move `Column` related tests and rename `column.rs` [#11573](https://github.com/apache/datafusion/pull/11573) (jonahgao)
- Fix SortMergeJoin antijoin flaky condition [#11604](https://github.com/apache/datafusion/pull/11604) (comphead)
- Improve Union Equivalence Propagation [#11506](https://github.com/apache/datafusion/pull/11506) (mustafasrepo)
- Migrate `OrderSensitiveArrayAgg` to be a user defined aggregate [#11564](https://github.com/apache/datafusion/pull/11564) (jayzhan211)
- Minor:Disable flaky SMJ antijoin filtered test until the fix [#11608](https://github.com/apache/datafusion/pull/11608) (comphead)
- support Decimal256 type in datafusion-proto [#11606](https://github.com/apache/datafusion/pull/11606) (leoyvens)
- Chore/fifo tests cleanup [#11616](https://github.com/apache/datafusion/pull/11616) (ozankabak)
- Fix Internal Error for an INNER JOIN query [#11578](https://github.com/apache/datafusion/pull/11578) (xinlifoobar)
- test: get file size by func metadata [#11575](https://github.com/apache/datafusion/pull/11575) (zhuliquan)
- Improve unparser MySQL compatibility [#11589](https://github.com/apache/datafusion/pull/11589) (sgrebnov)
- Push scalar functions into cross join [#11528](https://github.com/apache/datafusion/pull/11528) (lewiszlw)
- Remove ArrayAgg Builtin in favor of UDF [#11611](https://github.com/apache/datafusion/pull/11611) (jayzhan211)
- refactor: simplify `DFSchema::field_with_unqualified_name` [#11619](https://github.com/apache/datafusion/pull/11619) (jonahgao)
- Minor: Use upstream `concat_batches` from arrow-rs [#11615](https://github.com/apache/datafusion/pull/11615) (alamb)
- Fix : `signum` function bug when `0.0` input [#11580](https://github.com/apache/datafusion/pull/11580) (getChan)
- Enforce uniqueness of `named_struct` field names [#11614](https://github.com/apache/datafusion/pull/11614) (dharanad)
- Minor: unecessary row_count calculation in `CrossJoinExec` and `NestedLoopsJoinExec` [#11632](https://github.com/apache/datafusion/pull/11632) (alamb)
- ExprBuilder for Physical Aggregate Expr [#11617](https://github.com/apache/datafusion/pull/11617) (jayzhan211)
- Minor: avoid copying order by exprs in planner [#11634](https://github.com/apache/datafusion/pull/11634) (alamb)
- Unify CI and pre-commit hook settings for clippy [#11640](https://github.com/apache/datafusion/pull/11640) (findepi)
- Parsing SQL strings to Exprs with the qualified schema [#11562](https://github.com/apache/datafusion/pull/11562) (Lordworms)
- Add some zero column tests covering LIMIT, GROUP BY, WHERE, JOIN, and WINDOW [#11624](https://github.com/apache/datafusion/pull/11624) (Kev1n8)
- Refactor/simplify window frame utils [#11648](https://github.com/apache/datafusion/pull/11648) (ozankabak)
- Minor: use `ready!` macro to simplify `FilterExec` [#11649](https://github.com/apache/datafusion/pull/11649) (alamb)
- Temporarily pin toolchain version to avoid clippy errors [#11655](https://github.com/apache/datafusion/pull/11655) (findepi)
- Fix clippy errors for Rust 1.80 [#11654](https://github.com/apache/datafusion/pull/11654) (findepi)
- Add `CsvExecBuilder` for creating `CsvExec` [#11633](https://github.com/apache/datafusion/pull/11633) (connec)
- chore(deps): update sqlparser requirement from 0.48 to 0.49 [#11630](https://github.com/apache/datafusion/pull/11630) (dependabot[bot])
- Add support for USING to SQL unparser [#11636](https://github.com/apache/datafusion/pull/11636) (wackywendell)
- Run CI with latest (Rust 1.80), add ticket references to commented out tests [#11661](https://github.com/apache/datafusion/pull/11661) (alamb)
- Use `AccumulatorArgs::is_reversed` in `NthValueAgg` [#11669](https://github.com/apache/datafusion/pull/11669) (jcsherin)
- Implement physical plan serialization for json Copy plans [#11645](https://github.com/apache/datafusion/pull/11645) (Lordworms)
- Minor: improve documentation on `SessionState` [#11642](https://github.com/apache/datafusion/pull/11642) (alamb)
- Add LimitPushdown optimization rule and CoalesceBatchesExec fetch [#11652](https://github.com/apache/datafusion/pull/11652) (alihandroid)
- Update to arrow/parquet `52.2.0` [#11691](https://github.com/apache/datafusion/pull/11691) (alamb)
- Minor: Rename `RepartitionMetrics::repartition_time` to `RepartitionMetrics::repart_time` to match metric [#11478](https://github.com/apache/datafusion/pull/11478) (alamb)
- Update cache key used in rust CI script [#11641](https://github.com/apache/datafusion/pull/11641) (findepi)
- Fix bug in `remove_join_expressions` [#11693](https://github.com/apache/datafusion/pull/11693) (jonahgao)
- Initial changes to support using udaf min/max for statistics and opti… [#11696](https://github.com/apache/datafusion/pull/11696) (edmondop)
- Handle nulls in approx_percentile_cont [#11721](https://github.com/apache/datafusion/pull/11721) (Dandandan)
- Reduce repetition in try_process_group_by_unnest and try_process_unnest [#11714](https://github.com/apache/datafusion/pull/11714) (JasonLi-cn)
- Minor: Add example for `ScalarUDF::call` [#11727](https://github.com/apache/datafusion/pull/11727) (alamb)
- Use `cargo release` in `bench.sh` [#11722](https://github.com/apache/datafusion/pull/11722) (alamb)
- expose some fields on session state [#11716](https://github.com/apache/datafusion/pull/11716) (waynexia)
- Make DefaultSchemaAdapterFactory public [#11709](https://github.com/apache/datafusion/pull/11709) (adriangb)
- Check hashes first during probing the aggr hash table [#11718](https://github.com/apache/datafusion/pull/11718) (Rachelint)
- Implement physical plan serialization for parquet Copy plans [#11735](https://github.com/apache/datafusion/pull/11735) (Lordworms)
- Support cross-timezone `timestamp` comparison via coercsion [#11711](https://github.com/apache/datafusion/pull/11711) (jeffreyssmith2nd)
- Minor: Improve documentation for AggregateUDFImpl::state_fields [#11740](https://github.com/apache/datafusion/pull/11740) (lewiszlw)
- Do not push down Sorts if it violates the sort requirements [#11678](https://github.com/apache/datafusion/pull/11678) (alamb)
- Use upstream `StatisticsConverter` from arrow-rs in DataFusion [#11479](https://github.com/apache/datafusion/pull/11479) (alamb)
- Fix `plan_to_sql`: Add wildcard projection to SELECT statement if no projection was set [#11744](https://github.com/apache/datafusion/pull/11744) (LatrecheYasser)
- Use upstream `DataType::from_str` in arrow-cast [#11254](https://github.com/apache/datafusion/pull/11254) (alamb)
- Fix documentation warnings, make CsvExecBuilder and Unparsed pub [#11729](https://github.com/apache/datafusion/pull/11729) (alamb)
- [Minor] Add test for only nulls (empty) as input in APPROX_PERCENTILE_CONT [#11760](https://github.com/apache/datafusion/pull/11760) (Dandandan)
- Add `TrackedMemoryPool` with better error messages on exhaustion [#11665](https://github.com/apache/datafusion/pull/11665) (wiedld)
- Derive `Debug` for logical plan nodes [#11757](https://github.com/apache/datafusion/pull/11757) (lewiszlw)
- Minor: add "clickbench extended" queries to slt tests [#11763](https://github.com/apache/datafusion/pull/11763) (alamb)
- Minor: Add comment explaining rationale for hash check [#11750](https://github.com/apache/datafusion/pull/11750) (alamb)
- Fix bug that `COUNT(DISTINCT)` on StringView panics [#11768](https://github.com/apache/datafusion/pull/11768) (XiangpengHao)
- [Minor] Refactor approx_percentile [#11769](https://github.com/apache/datafusion/pull/11769) (Dandandan)
- minor: always time batch_filter even when the result is an empty batch [#11775](https://github.com/apache/datafusion/pull/11775) (andygrove)
- Improve OOM message when a single reservation request fails to get more bytes. [#11771](https://github.com/apache/datafusion/pull/11771) (wiedld)
- [Minor] Short circuit `ApplyFunctionRewrites` if there are no function rewrites [#11765](https://github.com/apache/datafusion/pull/11765) (gruuya)
- Fix #11692: Improve doc comments within macros [#11694](https://github.com/apache/datafusion/pull/11694) (Rafferty97)
- Extract `CoalesceBatchesStream` to a struct [#11610](https://github.com/apache/datafusion/pull/11610) (alamb)
- refactor: move ExecutionPlan and related structs into dedicated mod [#11759](https://github.com/apache/datafusion/pull/11759) (waynexia)
- Minor: Add references to github issue in comments [#11784](https://github.com/apache/datafusion/pull/11784) (findepi)
- Add docs and rename param for `Signature::numeric` [#11778](https://github.com/apache/datafusion/pull/11778) (matthewmturner)
- Support planning `Map` literal [#11780](https://github.com/apache/datafusion/pull/11780) (goldmedal)
- Support `LogicalPlan` `Debug` differently than `Display` [#11774](https://github.com/apache/datafusion/pull/11774) (lewiszlw)
- Remove redundant Aggregate when `DISTINCT` & `GROUP BY` are in the same query [#11781](https://github.com/apache/datafusion/pull/11781) (mertak-synnada)
- Minor: add ticket reference and fmt [#11805](https://github.com/apache/datafusion/pull/11805) (alamb)
- Improve MSRV CI check to print out problems to log [#11789](https://github.com/apache/datafusion/pull/11789) (alamb)
- Improve log func tests stability [#11808](https://github.com/apache/datafusion/pull/11808) (lewiszlw)
- Add valid Distinct case for aggregation [#11814](https://github.com/apache/datafusion/pull/11814) (mertak-synnada)
- Don't implement `create_sliding_accumulator` repeatedly [#11813](https://github.com/apache/datafusion/pull/11813) (lewiszlw)
- chore(deps): update rstest requirement from 0.21.0 to 0.22.0 [#11811](https://github.com/apache/datafusion/pull/11811) (dependabot[bot])
- Minor: Update exected output due to logical conflict [#11824](https://github.com/apache/datafusion/pull/11824) (alamb)
- Pass scalar to `eq` inside `nullif` [#11697](https://github.com/apache/datafusion/pull/11697) (simonvandel)
- refactor: move `aggregate_statistics` to `datafusion-physical-optimizer` [#11798](https://github.com/apache/datafusion/pull/11798) (Weijun-H)
- Minor: refactor probe check into function `should_skip_aggregation` [#11821](https://github.com/apache/datafusion/pull/11821) (alamb)
- Minor: consolidate `path_partition` test into `core_integration` [#11831](https://github.com/apache/datafusion/pull/11831) (alamb)
- Move optimizer integration tests to `core_integration` [#11830](https://github.com/apache/datafusion/pull/11830) (alamb)
- Bump deprecated version of SessionState::new_with_config_rt to 41.0.0 [#11839](https://github.com/apache/datafusion/pull/11839) (kezhuw)
- Fix partial aggregation skipping with Decimal aggregators [#11833](https://github.com/apache/datafusion/pull/11833) (alamb)
- Fix bug with zero-sized buffer for StringViewArray [#11841](https://github.com/apache/datafusion/pull/11841) (XiangpengHao)
- Reduce clone of `Statistics` in `ListingTable` and `PartitionedFile` [#11802](https://github.com/apache/datafusion/pull/11802) (Rachelint)
- Add `LogicalPlan::CreateIndex` [#11817](https://github.com/apache/datafusion/pull/11817) (lewiszlw)
- Update `object_store` to 0.10.2 [#11860](https://github.com/apache/datafusion/pull/11860) (danlgrca)
- Add `skipped_aggregation_rows` metric to aggregate operator [#11706](https://github.com/apache/datafusion/pull/11706) (alamb)
- Cast `Utf8View` to `Utf8` to support `||` from `StringViewArray` [#11796](https://github.com/apache/datafusion/pull/11796) (dharanad)
- Improve nested loop join code [#11863](https://github.com/apache/datafusion/pull/11863) (lewiszlw)
- [Minor]: Refactor to use Result.transpose() [#11882](https://github.com/apache/datafusion/pull/11882) (djanderson)
- support `ANY()` op [#11849](https://github.com/apache/datafusion/pull/11849) (samuelcolvin)
## Credits
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
```
48 Andrew Lamb
20 张林伟
9 Jay Zhan
9 Jonah Gao
8 Andy Grove
8 Lordworms
8 Piotr Findeisen
8 wiedld
7 Oleks V
6 Jax Liu
5 Alex Huang
5 Arttu
5 JasonLi
5 Trent Hauck
5 Xin Li
4 Dharan Aditya
4 Edmondo Porcu
4 dependabot[bot]
4 kamille
4 yfu
3 Daniël Heres
3 Eduard Karacharov
3 Georgi Krastev
2 Chris Connelly
2 Chunchun Ye
2 June
2 Marco Neumann
2 Marko Grujic
2 Mehmet Ozan Kabak
2 Michael J Ward
2 Mohamed Abdeen
2 Ruihang Xia
2 Sergei Grebnov
2 Xiangpeng Hao
2 jcsherin
2 kf zheng
2 mertak-synnada
1 Adrian Garcia Badaracco
1 Alexander Rafferty
1 Alihan Çelikcan
1 Ariel Marcus
1 Berkay Şahin
1 Bruce Ritchie
1 Devesh Rahatekar
1 Douglas Anderson
1 Drew Hayward
1 Jeffrey Smith II
1 Kaviraj Kanagaraj
1 Kezhu Wang
1 Leonardo Yvens
1 Lorrens Pantelis
1 Matthew Cramerus
1 Matthew Turner
1 Mustafa Akur
1 Namgung Chan
1 Ning Sun
1 Peter Toth
1 Qianqian
1 Samuel Colvin
1 Shehab Amin
1 Simon Vandel Sillesen
1 Tim Saucer
1 Wendell Smith
1 Yasser Latreche
1 Yongting You
1 danlgrca
1 tmi
1 waruto
1 zhuliquan
```
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.