Comparing changes

base repository: duckdb/duckdb
base: 131787252cc0506d0fdeb4e8b9de10b68118d156
...
head repository: duckdb/duckdb
compare: ca5af32c331f9d5ea49f7158d5c83a47f25b8b79

Commits on Oct 8, 2024

  1. 9d2b005

Commits on Oct 9, 2024

  1. 52b0302
  2. Actually add shell code

    Mytherin committed Oct 9, 2024
    88145e3
  3. f2dd2e1
  4. Rename to .thousand_sep

    Mytherin committed Oct 9, 2024
    9cc655b
  5. 3444bc2

Commits on Oct 15, 2024

  1. upgrade to the nullable dtype variant if the produced array is masked, pandas can't make this decision on its own apparently

    Tishj committed Oct 15, 2024
    b2d2956
  2. even wrapping the numpy array in a Series with an explicit nullable dtype does not stop the conversion to float...

    Tishj committed Oct 15, 2024
    bfcc570
  3. update tests

    Tishj committed Oct 15, 2024
    e62ccb3

Commits on Oct 16, 2024

  1. f068562
  2. 8a8eb98
  3. also not covered by suppress

    Tishj committed Oct 16, 2024
    472d959
  4. use different way to construct the Series, add clarifying comment to the validity mask AllValid(V entry) method

    Tishj committed Oct 16, 2024
    8155e29

Commits on Oct 17, 2024

  1. fc4f668

Commits on Oct 21, 2024

  1. a8bd0ae
  2. run format fix

    fanyang01 committed Oct 21, 2024
    f1e5e33

Commits on Oct 22, 2024

  1. 5f19273

Commits on Oct 23, 2024

  1. 41a674f
  2. Run format fix

    fanyang01 committed Oct 23, 2024
    3a17d59
  3. 01cafb7

Commits on Oct 24, 2024

  1. 42fc541
  2. cd6e7f2
  3. 2e88e4c
  4. Run format fix

    fanyang01 committed Oct 24, 2024
    e50db01

Commits on Oct 25, 2024

  1. 5742a21
  2. b5800be
  3. 5287d82
  4. test: escape \n

    fanyang01 committed Oct 25, 2024
    7635e2f

Commits on Oct 28, 2024

  1. c8f209c

Commits on Oct 29, 2024

  1. WIP: add ColumnIndex class that is used instead of column_t in column_ids, which can refer to sub-columns

    Mytherin committed Oct 29, 2024
    74042c7
  2. 95ccb3f
  3. WithDefault

    Mytherin committed Oct 29, 2024
    0175614
  4. 974d031
  5. a41b1d7
  6. Add add_months function

    binste committed Oct 29, 2024
    0411da2
  7. 41c1387
  8. Add array function

    binste committed Oct 29, 2024
    ab92b99
  9. Add array_agg function

    binste committed Oct 29, 2024
    f600791
  10. Add array_append function

    binste committed Oct 29, 2024
    e36d978

Commits on Oct 30, 2024

  1. 5ee186b

Commits on Oct 31, 2024

  1. 72babd0
  2. Run format fix

    fanyang01 committed Oct 31, 2024
    ca7c9d3
  3. 3cb21fd
  4. 3975c9d
  5. 96482ac
  6. Run format fix

    fanyang01 committed Oct 31, 2024
    573b15f
  7. Update nodes.json

    fanyang01 authored Oct 31, 2024
    48e4429
  8. need to initialize the grouping sets when there is a group by all, and also rebind the having

    Tmonster committed Oct 31, 2024
    f3f1bc9

Commits on Nov 1, 2024

  1. Add array_insert function

    binste committed Nov 1, 2024
    8eb49bb
  2. 3c78f64
Showing with 7,856 additions and 1,165 deletions.
  1. +30 −5 .github/patches/extensions/delta/{multifilereader_shared_ptr.patch → fixes.patch}
  2. +0 −26 .github/patches/extensions/delta/multifilereader_tablefunction_param.patch
  3. +44 −4 .github/patches/extensions/spatial/random_test_fix.patch
  4. +10 −1 .github/patches/extensions/substrait/or_filter_pushdown.patch
  5. +134 −1 .github/patches/extensions/vss/partitioning.patch
  6. +1 −0 Makefile
  7. +18 −0 benchmark/tpch/struct/tpch_q1_struct.benchmark
  8. +18 −0 benchmark/tpch/struct/tpch_q1_struct_nested.benchmark
  9. BIN data/csv/null_terminator.csv
  10. +27 −0 data/csv/unquoted_escape/basic.tsv
  11. +339 −0 data/csv/unquoted_escape/human_eval.csv
  12. +339 −0 data/csv/unquoted_escape/human_eval.tsv
  13. +6 −0 data/csv/unquoted_escape/identical.csv
  14. +8 −0 data/csv/unquoted_escape/mixed.csv
  15. +10 −0 data/csv/unquoted_escape/plain.csv
  16. BIN data/parquet-testing/issue6630_1.parquet
  17. BIN data/parquet-testing/issue6630_2.parquet
  18. +1 −1 extension/core_functions/aggregate/nested/binned_histogram.cpp
  19. +10 −2 extension/core_functions/core_functions_config.py
  20. +1 −6 extension/core_functions/scalar/enum/enum_functions.cpp
  21. +4 −2 extension/core_functions/scalar/string/starts_with.cpp
  22. +0 −1 extension/icu/icu_extension.cpp
  23. +22 −0 extension/jemalloc/include/malloc_ncpus.h
  24. +5 −0 extension/jemalloc/jemalloc/README.md
  25. +5 −0 extension/jemalloc/jemalloc/src/jemalloc.c
  26. +11 −0 extension/jemalloc/jemalloc_extension.cpp
  27. +2 −0 extension/parquet/parquet_writer.cpp
  28. +3 −0 scripts/generate_metric_enums.py
  29. +3 −3 src/catalog/catalog_entry/duck_table_entry.cpp
  30. +1 −1 src/common/bind_helpers.cpp
  31. +64 −8 src/common/box_renderer.cpp
  32. +35 −12 src/common/enum_util.cpp
  33. +12 −8 src/common/enums/metric_type.cpp
  34. +4 −1 src/common/multi_file_list.cpp
  35. +1 −1 src/common/multi_file_reader.cpp
  36. +1 −1 src/common/serializer/buffered_file_writer.cpp
  37. +9 −1 src/common/string_util.cpp
  38. +151 −129 src/common/types/value.cpp
  39. +1 −1 src/common/types/vector.cpp
  40. +6 −2 src/execution/adaptive_filter.cpp
  41. +4 −2 src/execution/index/fixed_size_buffer.cpp
  42. +61 −38 src/execution/operator/csv_scanner/scanner/string_value_scanner.cpp
  43. +3 −1 src/execution/operator/csv_scanner/sniffer/dialect_detection.cpp
  44. +57 −3 src/execution/operator/csv_scanner/state_machine/csv_state_machine_cache.cpp
  45. +18 −1 src/execution/operator/csv_scanner/util/csv_reader_options.cpp
  46. +17 −7 src/execution/operator/persistent/physical_batch_insert.cpp
  47. +6 −0 src/execution/operator/persistent/physical_copy_to_file.cpp
  48. +1 −1 src/execution/operator/persistent/physical_delete.cpp
  49. +7 −4 src/execution/operator/persistent/physical_insert.cpp
  50. +1 −1 src/execution/operator/projection/physical_tableinout_function.cpp
  51. +5 −4 src/execution/operator/scan/physical_table_scan.cpp
  52. +1 −1 src/execution/physical_plan/plan_aggregate.cpp
  53. +13 −11 src/execution/physical_plan/plan_get.cpp
  54. +3 −2 src/function/function.cpp
  55. +80 −1 src/function/function_binder.cpp
  56. +1 −0 src/function/scalar/string/contains.cpp
  57. +45 −20 src/function/scalar/string/like.cpp
  58. +1 −16 src/function/scalar/struct/struct_extract.cpp
  59. +12 −4 src/function/table/copy_csv.cpp
  60. +12 −7 src/function/table/query_function.cpp
  61. +1 −0 src/function/table/read_csv.cpp
  62. +1 −1 src/function/table/system/duckdb_types.cpp
  63. +51 −43 src/function/table/system/test_all_types.cpp
  64. +31 −17 src/function/table/table_scan.cpp
  65. +32 −0 src/include/duckdb.h
  66. +9 −1 src/include/duckdb/common/box_renderer.hpp
  67. +72 −0 src/include/duckdb/common/column_index.hpp
  68. +8 −0 src/include/duckdb/common/enum_util.hpp
  69. +17 −0 src/include/duckdb/common/enums/collation_type.hpp
  70. +5 −4 src/include/duckdb/common/enums/metric_type.hpp
  71. +6 −2 src/include/duckdb/common/file_buffer.hpp
  72. +2 −1 src/include/duckdb/common/multi_file_list.hpp
  73. +1 −1 src/include/duckdb/common/serializer/buffered_file_writer.hpp
  74. +3 −0 src/include/duckdb/common/string_util.hpp
  75. +1 −0 src/include/duckdb/common/types/validity_mask.hpp
  76. +8 −10 src/include/duckdb/common/types/value.hpp
  77. +2 −0 src/include/duckdb/execution/adaptive_filter.hpp
  78. +3 −0 src/include/duckdb/execution/operator/csv_scanner/base_scanner.hpp
  79. +4 −1 src/include/duckdb/execution/operator/csv_scanner/csv_reader_options.hpp
  80. +2 −0 src/include/duckdb/execution/operator/csv_scanner/csv_state.hpp
  81. +10 −1 src/include/duckdb/execution/operator/csv_scanner/csv_state_machine.hpp
  82. +1 −1 src/include/duckdb/execution/operator/csv_scanner/csv_state_machine_cache.hpp
  83. +7 −3 src/include/duckdb/execution/operator/csv_scanner/state_machine_options.hpp
  84. +3 −0 src/include/duckdb/execution/operator/csv_scanner/string_value_scanner.hpp
  85. +1 −1 src/include/duckdb/execution/operator/helper/physical_reservoir_sample.hpp
  86. +1 −1 src/include/duckdb/execution/operator/persistent/physical_insert.hpp
  87. +2 −2 src/include/duckdb/execution/operator/projection/physical_tableinout_function.hpp
  88. +3 −2 src/include/duckdb/execution/operator/scan/physical_table_scan.hpp
  89. +11 −0 src/include/duckdb/function/function.hpp
  90. +33 −0 src/include/duckdb/function/scalar/struct_utils.hpp
  91. +18 −4 src/include/duckdb/function/table_function.hpp
  92. +8 −0 src/include/duckdb/main/capi/extension_api.hpp
  93. +5 −1 src/include/duckdb/main/capi/header_generation/apis/v0/dev/dev.json
  94. +66 −1 src/include/duckdb/main/capi/header_generation/functions/value_interface.json
  95. +1 −0 src/include/duckdb/main/connection.hpp
  96. +3 −0 src/include/duckdb/main/relation.hpp
  97. +2 −0 src/include/duckdb/main/relation/table_relation.hpp
  98. +2 −0 src/include/duckdb/main/relation/value_relation.hpp
  99. +1 −1 src/include/duckdb/optimizer/filter_combiner.hpp
  100. +17 −1 src/include/duckdb/optimizer/remove_unused_columns.hpp
  101. +1 −0 src/include/duckdb/parser/parsed_data/sample_options.hpp
  102. +3 −0 src/include/duckdb/parser/parser.hpp
  103. +1 −0 src/include/duckdb/parser/qualified_name.hpp
  104. +2 −1 src/include/duckdb/parser/simplified_token.hpp
  105. +4 −3 src/include/duckdb/planner/bind_context.hpp
  106. +4 −3 src/include/duckdb/planner/collation_binding.hpp
  107. +3 −1 src/include/duckdb/planner/expression_binder.hpp
  108. +4 −4 src/include/duckdb/planner/operator/logical_get.hpp
  109. +4 −3 src/include/duckdb/planner/table_binding.hpp
  110. +2 −1 src/include/duckdb/planner/table_filter.hpp
  111. +3 −0 src/include/duckdb/storage/block_manager.hpp
  112. +74 −27 src/include/duckdb/storage/buffer/block_handle.hpp
  113. +1 −1 src/include/duckdb/storage/buffer/buffer_handle.hpp
  114. +1 −2 src/include/duckdb/storage/compression/alp/alp_compress.hpp
  115. +1 −2 src/include/duckdb/storage/compression/alprd/alprd_compress.hpp
  116. +4 −4 src/include/duckdb/storage/data_table.hpp
  117. +25 −2 src/include/duckdb/storage/serialization/nodes.json
  118. +1 −1 src/include/duckdb/storage/standard_buffer_manager.hpp
  119. +70 −0 src/include/duckdb/storage/storage_index.hpp
  120. +2 −1 src/include/duckdb/storage/table/column_checkpoint_state.hpp
  121. +1 −1 src/include/duckdb/storage/table/delete_state.hpp
  122. +3 −1 src/include/duckdb/storage/table/row_group.hpp
  123. +7 −5 src/include/duckdb/storage/table/row_group_collection.hpp
  124. +11 −6 src/include/duckdb/storage/table/scan_state.hpp
  125. +13 −9 src/include/duckdb/storage/write_ahead_log.hpp
  126. +5 −5 src/include/duckdb/transaction/local_storage.hpp
  127. +8 −0 src/include/duckdb_extension.h
  128. +43 −3 src/main/capi/duckdb_value-c.cpp
  129. +1 −0 src/main/client_context.cpp
  130. +5 −0 src/main/connection.cpp
  131. +14 −20 src/main/extension/CMakeLists.txt
  132. +19 −5 src/main/profiling_info.cpp
  133. +1 −0 src/main/query_profiler.cpp
  134. +15 −0 src/main/relation.cpp
  135. +11 −0 src/main/relation/table_relation.cpp
  136. +18 −0 src/main/relation/value_relation.cpp
  137. +8 −1 src/optimizer/build_probe_side_optimizer.cpp
  138. +8 −0 src/optimizer/expression_heuristics.cpp
  139. +13 −12 src/optimizer/filter_combiner.cpp
  140. +6 −5 src/optimizer/join_order/relation_statistics_helper.cpp
  141. +156 −22 src/optimizer/remove_unused_columns.cpp
  142. +5 −2 src/optimizer/rule/comparison_simplification.cpp
  143. +3 −3 src/optimizer/statistics/operator/propagate_get.cpp
  144. +2 −2 src/optimizer/unnest_rewriter.cpp
  145. +2 −0 src/parser/parsed_data/sample_options.cpp
  146. +117 −0 src/parser/parser.cpp
  147. +5 −0 src/parser/qualified_name.cpp
  148. +35 −6 src/parser/query_error_context.cpp
  149. +5 −1 src/parser/statement/delete_statement.cpp
  150. +5 −1 src/parser/statement/update_statement.cpp
  151. +1 −0 src/parser/transform/helpers/transform_sample.cpp
  152. +1 −1 src/parser/transform/statement/transform_create_view.cpp
  153. +10 −4 src/planner/bind_context.cpp
  154. +2 −2 src/planner/binder.cpp
  155. +2 −2 src/planner/binder/expression/bind_comparison_expression.cpp
  156. +10 −0 src/planner/binder/query_node/bind_select_node.cpp
  157. +3 −2 src/planner/binder/query_node/plan_setop.cpp
  158. +5 −0 src/planner/binder/statement/bind_drop.cpp
  159. +11 −2 src/planner/binder/statement/bind_insert.cpp
  160. +1 −1 src/planner/binder/statement/bind_vacuum.cpp
  161. +45 −12 src/planner/binder/tableref/bind_pivot.cpp
  162. +1 −1 src/planner/binder/tableref/bind_showref.cpp
  163. +6 −2 src/planner/binder/tableref/bind_table_function.cpp
  164. +16 −4 src/planner/collation_binding.cpp
  165. +4 −3 src/planner/expression_binder/index_binder.cpp
  166. +35 −20 src/planner/operator/logical_get.cpp
  167. +1 −1 src/planner/planner.cpp
  168. +19 −13 src/planner/table_binding.cpp
  169. +6 −5 src/planner/table_filter.cpp
  170. +87 −16 src/storage/buffer/block_handle.cpp
  171. +33 −25 src/storage/buffer/block_manager.cpp
  172. +2 −2 src/storage/buffer/buffer_handle.cpp
  173. +26 −23 src/storage/buffer/buffer_pool.cpp
  174. +1 −2 src/storage/compression/bitpacking.cpp
  175. +1 −1 src/storage/compression/dictionary_compression.cpp
  176. +4 −1 src/storage/compression/fixed_size_uncompressed.cpp
  177. +1 −1 src/storage/compression/fsst.cpp
  178. +1 −1 src/storage/compression/rle.cpp
  179. +8 −8 src/storage/data_table.cpp
  180. +13 −8 src/storage/local_storage.cpp
  181. +1 −1 src/storage/metadata/metadata_manager.cpp
  182. +15 −0 src/storage/serialization/serialize_nodes.cpp
  183. +62 −60 src/storage/standard_buffer_manager.cpp
  184. +1 −1 src/storage/storage_info.cpp
  185. +4 −24 src/storage/storage_manager.cpp
  186. +7 −1 src/storage/table/array_column_data.cpp
  187. +6 −1 src/storage/table/column_checkpoint_state.cpp
  188. +43 −18 src/storage/table/row_group.cpp
  189. +18 −16 src/storage/table/row_group_collection.cpp
  190. +7 −6 src/storage/table/scan_state.cpp
  191. +42 −4 src/storage/table/struct_column_data.cpp
  192. +1 −1 src/storage/table_index_list.cpp
  193. +4 −8 src/storage/temporary_file_manager.cpp
  194. +36 −10 src/storage/wal_replay.cpp
  195. +25 −15 src/storage/write_ahead_log.cpp
  196. +6 −3 src/transaction/duck_transaction_manager.cpp
  197. +2 −0 test/api/capi/test_capi_profiling.cpp
  198. +37 −0 test/api/capi/test_capi_values.cpp
  199. +19 −0 test/api/test_api.cpp
  200. +12 −0 test/api/test_prepared_api.cpp
  201. +8 −5 test/appender/test_nested_appender.cpp
  202. +43 −0 test/issues/general/test_14540.test
  203. +61 −0 test/optimizer/pushdown/test_pushdown_cte_group_by_all.test
  204. +1 −1 test/sql/aggregate/group/group_by_all_having.test
  205. +1 −3 test/sql/attach/attach_force_checkpoint_deadlock.test_slow
  206. +50 −0 test/sql/catalog/function/attached_macro.test
  207. +1 −1 test/sql/catalog/function/query_function.test
  208. +98 −0 test/sql/collate/collate_like.test
  209. +106 −0 test/sql/collate/icu_collation_propagation.test
  210. +116 −0 test/sql/collate/test_collation_propagation.test
  211. +6 −8 test/sql/collate/test_icu_collate.test
  212. +2 −4 test/sql/collate/test_unsupported_collations.test
  213. +36 −0 test/sql/copy/csv/null_terminator.test
  214. +24 −0 test/sql/copy/csv/unquoted_escape/32k_rows.test
  215. +43 −0 test/sql/copy/csv/unquoted_escape/basic.test
  216. +86 −0 test/sql/copy/csv/unquoted_escape/human_eval.test
  217. +15 −0 test/sql/copy/csv/unquoted_escape/identical.test
  218. +16 −0 test/sql/copy/csv/unquoted_escape/mixed.test
  219. +2 −2 test/sql/copy/parquet/parquet_6630_union_by_name.test
  220. +13 −1 test/sql/copy/parquet/writer/test_parquet_write.test
  221. +80 −0 test/sql/error/lineitem_errors.test
  222. +5 −0 test/sql/explain/test_explain_analyze.test
  223. +2 −2 test/sql/function/string/test_jaro_winkler.test
  224. +1 −1 test/sql/join/external/{simple_external_join.test_coverage → simple_external_join.test_slow}
  225. +32 −0 test/sql/optimizer/test_no_pushdown_cast_into_cte.test
  226. +3 −3 test/sql/pivot/pivot_errors.test
  227. +36 −0 test/sql/pivot/pivot_expressions.test
  228. +6 −0 test/sql/pragma/profiling/test_custom_profiling_optimizer.test
  229. +1 −0 test/sql/pragma/profiling/test_default_profiling_settings.test
  230. +14 −0 test/sql/pragma/test_show_tables.test
  231. +33 −1 test/sql/sample/same_seed_same_sample.test_slow
  232. +31 −0 test/sql/storage/struct_default_entries.test_slow
  233. +51 −0 test/sql/storage/wal_torn_write.cpp
  234. +3 −3 test/sql/subquery/scalar/correlated_missing_columns.test
  235. +159 −0 test/sql/types/struct/nested_struct_projection_pushdown.test
  236. +67 −0 test/sql/types/struct/struct_projection_pushdown.test
  237. +31 −0 test/sql/types/struct/struct_projection_pushdown_index.test_slow
  238. +31 −0 test/sql/types/struct/tpch_struct_projection_pushdown.test_slow
  239. +1 −0 test/sqlite/sqllogic_test_logger.cpp
  240. +6 −0 third_party/zstd/dict/cover.cpp
  241. +6 −0 third_party/zstd/dict/fastcover.cpp
  242. +3 −0 third_party/zstd/dict/zdict.cpp
  243. +33 −7 tools/pythonpkg/duckdb-stubs/__init__.pyi
  244. +4 −0 tools/pythonpkg/duckdb/__init__.py
  245. +2 −0 tools/pythonpkg/duckdb/experimental/spark/context.py
  246. +4 −4 tools/pythonpkg/duckdb/experimental/spark/sql/column.py
  247. +1,243 −29 tools/pythonpkg/duckdb/experimental/spark/sql/functions.py
  248. +2 −3 tools/pythonpkg/duckdb_python.cpp
  249. +86 −2 tools/pythonpkg/scripts/cache_data.json
  250. +2 −2 tools/pythonpkg/scripts/connection_methods.json
  251. +14 −1 tools/pythonpkg/scripts/generate_connection_methods.py
  252. +12 −1 tools/pythonpkg/scripts/generate_connection_wrapper_methods.py
  253. +12 −0 tools/pythonpkg/scripts/imports.py
  254. +5 −0 tools/pythonpkg/src/include/duckdb_python/expression/pyexpression.hpp
  255. +2 −1 tools/pythonpkg/src/include/duckdb_python/import_cache/modules/numpy_module.hpp
  256. +16 −1 tools/pythonpkg/src/include/duckdb_python/import_cache/modules/pandas_module.hpp
  257. +1 −1 tools/pythonpkg/src/include/duckdb_python/pyconnection/pyconnection.hpp
  258. +5 −1 tools/pythonpkg/src/include/duckdb_python/pyrelation.hpp
  259. +0 −1 tools/pythonpkg/src/include/duckdb_python/pyresult.hpp
  260. +0 −8 tools/pythonpkg/src/native/python_conversion.cpp
  261. +3 −1 tools/pythonpkg/src/numpy/array_wrapper.cpp
  262. +67 −8 tools/pythonpkg/src/pyconnection.cpp
  263. +56 −6 tools/pythonpkg/src/pyexpression.cpp
  264. +14 −0 tools/pythonpkg/src/pyexpression/initialize.cpp
  265. +98 −1 tools/pythonpkg/src/pyrelation.cpp
  266. +6 −1 tools/pythonpkg/src/pyrelation/initialize.cpp
  267. +72 −9 tools/pythonpkg/src/pyresult.cpp
  268. +1 −1 tools/pythonpkg/tests/fast/api/test_dbapi00.py
  269. +110 −5 tools/pythonpkg/tests/fast/api/test_to_parquet.py
  270. +4 −4 tools/pythonpkg/tests/fast/pandas/test_fetch_nested.py
  271. +1 −1 tools/pythonpkg/tests/fast/pandas/test_pandas_category.py
  272. +73 −5 tools/pythonpkg/tests/fast/pandas/test_pandas_types.py
  273. +6 −15 tools/pythonpkg/tests/fast/spark/test_replace_empty_value.py
  274. +7 −0 tools/pythonpkg/tests/fast/spark/test_spark_arrow_table.py
  275. +15 −1 tools/pythonpkg/tests/fast/spark/test_spark_column.py
  276. +106 −0 tools/pythonpkg/tests/fast/spark/test_spark_functions_array.py
  277. +14 −0 tools/pythonpkg/tests/fast/spark/test_spark_functions_date.py
  278. +51 −0 tools/pythonpkg/tests/fast/spark/test_spark_functions_numeric.py
  279. +56 −0 tools/pythonpkg/tests/fast/spark/test_spark_functions_sort.py
  280. +55 −0 tools/pythonpkg/tests/fast/spark/test_spark_functions_string.py
  281. +15 −0 tools/pythonpkg/tests/fast/spark/test_spark_group_by.py
  282. +79 −0 tools/pythonpkg/tests/fast/spark/test_spark_order_by.py
  283. +18 −12 tools/pythonpkg/tests/fast/test_all_types.py
  284. +110 −0 tools/pythonpkg/tests/fast/test_expression.py
  285. +86 −0 tools/pythonpkg/tests/fast/test_relation.py
  286. +39 −26 tools/pythonpkg/tests/fast/types/test_object_int.py
  287. +6 −0 tools/pythonpkg/tests/spark_namespace/sql/dataframe.py
  288. +4 −0 tools/shell/include/shell_state.hpp
  289. +209 −28 tools/shell/shell.cpp
  290. +65 −0 tools/shell/tests/test_errors.py
  291. +36 −0 tools/shell/tests/test_shell_basics.py
  292. +4 −5 tools/sqlite3_api_wrapper/sqlite3_api_wrapper.cpp
Original file line number Diff line number Diff line change
@@ -1,8 +1,30 @@
diff --git a/src/functions/delta_scan.cpp b/src/functions/delta_scan.cpp
index 23482f1..968f116 100644
index 65eb34f..9b45db2 100644
--- a/src/functions/delta_scan.cpp
+++ b/src/functions/delta_scan.cpp
@@ -599,12 +599,12 @@ void DeltaMultiFileReader::FinalizeBind(const MultiFileReaderOptions &file_optio
@@ -464,7 +464,11 @@ unique_ptr<MultiFileList> DeltaSnapshot::ComplexFilterPushdown(ClientContext &co
for (const auto &filter : filters) {
combiner.AddFilter(filter->Copy());
}
- auto filterstmp = combiner.GenerateTableScanFilters(info.column_ids);
+ vector<ColumnIndex> column_indexes;
+ for(auto column_id : info.column_ids) {
+ column_indexes.emplace_back(column_id);
+ }
+ auto filterstmp = combiner.GenerateTableScanFilters(column_indexes);

// TODO: can/should we figure out if this filtered anything?
auto filtered_list = make_uniq<DeltaSnapshot>(context, paths[0]);
@@ -529,7 +533,7 @@ unique_ptr<NodeStatistics> DeltaSnapshot::GetCardinality(ClientContext &context)
return nullptr;
}

-unique_ptr<MultiFileReader> DeltaMultiFileReader::CreateInstance() {
+unique_ptr<MultiFileReader> DeltaMultiFileReader::CreateInstance(const TableFunction &table_function) {
return std::move(make_uniq<DeltaMultiFileReader>());
}

@@ -618,12 +622,12 @@ void DeltaMultiFileReader::FinalizeBind(const MultiFileReaderOptions &file_optio
}
}

@@ -18,12 +40,15 @@ index 23482f1..968f116 100644

// Generate the correct Selection Vector Based on the Raw delta KernelBoolSlice dv and the row_id_column
diff --git a/src/include/functions/delta_scan.hpp b/src/include/functions/delta_scan.hpp
index 23c937d..84220f9 100644
index aac35cc..84220f9 100644
--- a/src/include/functions/delta_scan.hpp
+++ b/src/include/functions/delta_scan.hpp
@@ -105,7 +105,7 @@ struct DeltaMultiFileReaderGlobalState : public MultiFileReaderGlobalState {
@@ -103,9 +103,9 @@ struct DeltaMultiFileReaderGlobalState : public MultiFileReaderGlobalState {
};

struct DeltaMultiFileReader : public MultiFileReader {
static unique_ptr<MultiFileReader> CreateInstance(const TableFunction &table_function);
- static unique_ptr<MultiFileReader> CreateInstance();
+ static unique_ptr<MultiFileReader> CreateInstance(const TableFunction &table_function);
//! Return a DeltaSnapshot
- unique_ptr<MultiFileList> CreateFileList(ClientContext &context, const vector<string> &paths,
+ shared_ptr<MultiFileList> CreateFileList(ClientContext &context, const vector<string> &paths,

This file was deleted.

48 changes: 44 additions & 4 deletions .github/patches/extensions/spatial/random_test_fix.patch
@@ -13,11 +13,51 @@ index 007d386..8754619 100644

state.current_idx++;
}
diff --git a/spatial/src/spatial/core/index/rtree/rtree_index_plan_scan.cpp b/spatial/src/spatial/core/index/rtree/rtree_index_plan_scan.cpp
index b420233..f904fdd 100644
--- a/spatial/src/spatial/core/index/rtree/rtree_index_plan_scan.cpp
+++ b/spatial/src/spatial/core/index/rtree/rtree_index_plan_scan.cpp
@@ -46,7 +46,7 @@ public:
column_t referenced_column = column_ids[bound_colref.binding.column_index];
// search for the referenced column in the set of column_ids
for (idx_t i = 0; i < get_column_ids.size(); i++) {
- if (get_column_ids[i] == referenced_column) {
+ if (get_column_ids[i].GetPrimaryIndex() == referenced_column) {
bound_colref.binding.column_index = i;
return;
}
@@ -213,7 +213,7 @@ public:
auto &type = get.returned_types[column_id];
bool found = false;
for (idx_t i = 0; i < column_ids.size(); i++) {
- if (column_ids[i] == column_id) {
+ if (column_ids[i].GetPrimaryIndex() == column_id) {
column_id = i;
found = true;
break;
diff --git a/spatial/src/spatial/core/index/rtree/rtree_index_scan.cpp b/spatial/src/spatial/core/index/rtree/rtree_index_scan.cpp
index 9168790..7fd53a2 100644
index 01f2966..0fabe44 100644
--- a/spatial/src/spatial/core/index/rtree/rtree_index_scan.cpp
+++ b/spatial/src/spatial/core/index/rtree/rtree_index_scan.cpp
@@ -208,7 +208,6 @@ TableFunction RTreeIndexScanFunction::GetFunction() {
@@ -31,7 +31,7 @@ BindInfo RTreeIndexScanBindInfo(const optional_ptr<FunctionData> bind_data_p) {
struct RTreeIndexScanGlobalState : public GlobalTableFunctionState {
ColumnFetchState fetch_state;
TableScanState local_storage_state;
- vector<storage_t> column_ids;
+ vector<StorageIndex> column_ids;

// Index scan state
unique_ptr<IndexScanState> index_state;
@@ -54,7 +54,7 @@ static unique_ptr<GlobalTableFunctionState> RTreeIndexScanInitGlobal(ClientConte
if (id != DConstants::INVALID_INDEX) {
col_id = bind_data.table.GetColumn(LogicalIndex(id)).StorageOid();
}
- result->column_ids.push_back(col_id);
+ result->column_ids.emplace_back(col_id);
}

// Initialize the storage scan state
@@ -205,7 +205,6 @@ TableFunction RTreeIndexScanFunction::GetFunction() {
func.pushdown_complex_filter = nullptr;
func.to_string = RTreeIndexScanToString;
func.table_scan_progress = nullptr;
@@ -56,10 +96,10 @@ index 465cb87..5aa49dd 100644

ExtensionUtil::RegisterFunction(db, read);
diff --git a/spatial/src/spatial/gdal/functions/st_read.cpp b/spatial/src/spatial/gdal/functions/st_read.cpp
index b730baa..8d08898 100644
index 9bd92e8..600fad8 100644
--- a/spatial/src/spatial/gdal/functions/st_read.cpp
+++ b/spatial/src/spatial/gdal/functions/st_read.cpp
@@ -676,7 +676,7 @@ void GdalTableFunction::Register(DatabaseInstance &db) {
@@ -675,7 +675,7 @@ void GdalTableFunction::Register(DatabaseInstance &db) {
GdalTableFunction::InitGlobal, GdalTableFunction::InitLocal);

scan.cardinality = GdalTableFunction::Cardinality;
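
Note on the `push_back` to `emplace_back` change in the rtree_index_scan.cpp hunk above: once `column_ids` holds `StorageIndex` values instead of raw `storage_t` ids, appending a raw id needs an explicit construction. The sketch below illustrates that under an assumption: the `StorageIndex` here is a hypothetical stand-in with an `explicit` single-argument constructor, not DuckDB's actual class, and `BuildColumnIds` is an illustrative helper that does not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

using idx_t = std::uint64_t;

// Hypothetical stand-in for a wrapper index type (NOT DuckDB's real StorageIndex).
// The constructor is assumed to be explicit for the purpose of this sketch.
struct StorageIndex {
    explicit StorageIndex(idx_t index) : index(index) {}
    idx_t GetPrimaryIndex() const { return index; }
    idx_t index;
};

// An explicit constructor rules out the implicit idx_t -> StorageIndex
// conversion that push_back(col_id) would rely on.
static_assert(!std::is_convertible<idx_t, StorageIndex>::value, "no implicit conversion");
static_assert(std::is_constructible<StorageIndex, idx_t>::value, "explicit construction works");

// Illustrative helper: wrap raw storage ids one by one.
std::vector<StorageIndex> BuildColumnIds(const std::vector<idx_t> &raw_ids) {
    std::vector<StorageIndex> column_ids;
    column_ids.reserve(raw_ids.size());
    for (auto col_id : raw_ids) {
        // column_ids.push_back(col_id);  // would not compile with an explicit constructor
        column_ids.emplace_back(col_id);  // constructs StorageIndex(col_id) in place
    }
    return column_ids;
}
```

With such a type, `emplace_back(col_id)` forwards the raw id to the constructor, so the call sites in the patch change by a single token.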
11 changes: 10 additions & 1 deletion .github/patches/extensions/substrait/or_filter_pushdown.patch
@@ -1,5 +1,5 @@
diff --git a/src/to_substrait.cpp b/src/to_substrait.cpp
index a466fb2..44a9923 100644
index a466fb2..0dd0766 100644
--- a/src/to_substrait.cpp
+++ b/src/to_substrait.cpp
@@ -1296,6 +1296,17 @@ substrait::Rel *DuckDBToSubstrait::TransformGet(LogicalOperator &dop) {
@@ -20,3 +20,12 @@ index a466fb2..44a9923 100644
if (!dget.table_filters.filters.empty()) {
// Pushdown filter
auto filter = CreateConjunction(dget.table_filters.filters,
@@ -1317,7 +1328,7 @@ substrait::Rel *DuckDBToSubstrait::TransformGet(LogicalOperator &dop) {
auto &column_ids = dget.GetColumnIds();
for (auto col_idx : dget.projection_ids) {
auto struct_item = select->add_struct_items();
- struct_item->set_field(static_cast<int32_t>(column_ids[col_idx]));
+ struct_item->set_field(static_cast<int32_t>(column_ids[col_idx].GetPrimaryIndex()));
// FIXME do we need to set the child? if yes, to what?
}
projection->set_allocated_select(select);
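
Note on the recurring `GetPrimaryIndex()` edits in the patches above: they follow from the commit "WIP: add ColumnIndex class that is used instead of column_t in column_ids, which can refer to sub-columns". A minimal sketch of the two patterns that show up (wrapping plain ids, and comparing on the top-level index) might look as follows. The `ColumnIndex` class here is a simplified stand-in written for illustration, not the class from src/include/duckdb/common/column_index.hpp, and `ToColumnIndexes`, `FindColumn` and `GetChildIndexes` are assumed names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using idx_t = std::uint64_t;
using column_t = idx_t;

// Simplified stand-in for the ColumnIndex idea (NOT DuckDB's actual class):
// a top-level column id plus optional child indexes for sub-columns.
class ColumnIndex {
public:
    explicit ColumnIndex(idx_t index) : index_(index) {}
    ColumnIndex(idx_t index, std::vector<ColumnIndex> children)
        : index_(index), children_(std::move(children)) {}

    idx_t GetPrimaryIndex() const { return index_; }
    const std::vector<ColumnIndex> &GetChildIndexes() const { return children_; }

private:
    idx_t index_;
    std::vector<ColumnIndex> children_;
};

// Wrap plain column ids, mirroring the loop added in the delta patch.
std::vector<ColumnIndex> ToColumnIndexes(const std::vector<column_t> &column_ids) {
    std::vector<ColumnIndex> result;
    result.reserve(column_ids.size());
    for (auto column_id : column_ids) {
        result.emplace_back(column_id);
    }
    return result;
}

// Position of the first entry whose top-level column matches, or -1 if none does.
// This mirrors the `column_ids[i].GetPrimaryIndex() == column_id` comparisons.
std::ptrdiff_t FindColumn(const std::vector<ColumnIndex> &column_ids, column_t referenced_column) {
    for (std::size_t i = 0; i < column_ids.size(); i++) {
        if (column_ids[i].GetPrimaryIndex() == referenced_column) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

Callers that only care about whole columns keep comparing against a plain `column_t`, while callers that understand sub-columns can additionally inspect the child indexes.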
135 changes: 134 additions & 1 deletion .github/patches/extensions/vss/partitioning.patch
Original file line number Diff line number Diff line change
@@ -1,7 +1,47 @@
diff --git a/src/hnsw/hnsw_index.cpp b/src/hnsw/hnsw_index.cpp
index b8bfd1f..50cb165 100644
--- a/src/hnsw/hnsw_index.cpp
+++ b/src/hnsw/hnsw_index.cpp
@@ -581,7 +581,7 @@ void HNSWIndex::VerifyAllocations(IndexLock &state) {
// Can rewrite index expression?
//------------------------------------------------------------------------------
static void TryBindIndexExpressionInternal(Expression &expr, idx_t table_idx, const vector<column_t> &index_columns,
- const vector<column_t> &table_columns, bool &success, bool &found) {
+ const vector<ColumnIndex> &table_columns, bool &success, bool &found) {

if (expr.type == ExpressionType::BOUND_COLUMN_REF) {
found = true;
@@ -592,7 +592,7 @@ static void TryBindIndexExpressionInternal(Expression &expr, idx_t table_idx, co

const auto referenced_column = index_columns[ref.binding.column_index];
for (idx_t i = 0; i < table_columns.size(); i++) {
- if (table_columns[i] == referenced_column) {
+ if (table_columns[i].GetPrimaryIndex() == referenced_column) {
ref.binding.column_index = i;
return;
}
diff --git a/src/hnsw/hnsw_index_scan.cpp b/src/hnsw/hnsw_index_scan.cpp
index bd4826c..746e22d 100644
index bd4826c..16f4953 100644
--- a/src/hnsw/hnsw_index_scan.cpp
+++ b/src/hnsw/hnsw_index_scan.cpp
@@ -29,7 +29,7 @@ BindInfo HNSWIndexScanBindInfo(const optional_ptr<FunctionData> bind_data_p) {
struct HNSWIndexScanGlobalState : public GlobalTableFunctionState {
ColumnFetchState fetch_state;
TableScanState local_storage_state;
- vector<storage_t> column_ids;
+ vector<StorageIndex> column_ids;

// Index scan state
unique_ptr<IndexScanState> index_state;
@@ -52,7 +52,7 @@ static unique_ptr<GlobalTableFunctionState> HNSWIndexScanInitGlobal(ClientContex
if (id != DConstants::INVALID_INDEX) {
col_id = bind_data.table.GetColumn(LogicalIndex(id)).StorageOid();
}
- result->column_ids.push_back(col_id);
+ result->column_ids.emplace_back(col_id);
}

// Initialize the storage scan state
@@ -141,7 +141,6 @@ TableFunction HNSWIndexScanFunction::GetFunction() {
func.pushdown_complex_filter = nullptr;
func.to_string = HNSWIndexScanToString;
@@ -10,3 +50,96 @@ index bd4826c..746e22d 100644
func.projection_pushdown = true;
func.filter_pushdown = false;
func.get_bind_info = HNSWIndexScanBindInfo;
diff --git a/src/hnsw/hnsw_optimize_join.cpp b/src/hnsw/hnsw_optimize_join.cpp
index fb79fdf..9201a3b 100644
--- a/src/hnsw/hnsw_optimize_join.cpp
+++ b/src/hnsw/hnsw_optimize_join.cpp
@@ -19,6 +19,7 @@
#include "duckdb/planner/expression_iterator.hpp"
#include "duckdb/storage/table/scan_state.hpp"
#include "duckdb/transaction/duck_transaction.hpp"
+#include "duckdb/storage/storage_index.hpp"

#include "hnsw/hnsw.hpp"
#include "hnsw/hnsw_index.hpp"
@@ -74,7 +75,7 @@ public:

ColumnFetchState fetch_state;
TableScanState local_storage_state;
- vector<storage_t> phyiscal_column_ids;
+ vector<StorageIndex> physical_column_ids;

// Index scan state
unique_ptr<IndexScanState> index_state;
@@ -85,7 +86,7 @@ unique_ptr<OperatorState> PhysicalHNSWIndexJoin::GetOperatorState(ExecutionConte
auto result = make_uniq<HNSWIndexJoinState>();

auto &local_storage = LocalStorage::Get(context.client, table.catalog);
- result->phyiscal_column_ids.reserve(inner_column_ids.size());
+ result->physical_column_ids.reserve(inner_column_ids.size());

// Figure out the storage column ids from the projection expression
for (auto &id : inner_column_ids) {
@@ -93,14 +94,14 @@ unique_ptr<OperatorState> PhysicalHNSWIndexJoin::GetOperatorState(ExecutionConte
if (id != DConstants::INVALID_INDEX) {
col_id = table.GetColumn(LogicalIndex(id)).StorageOid();
}
- result->phyiscal_column_ids.push_back(col_id);
+ result->physical_column_ids.emplace_back(col_id);
}

// Initialize selection vector
result->match_sel.Initialize();

// Initialize the storage scan state
- result->local_storage_state.Initialize(result->phyiscal_column_ids, nullptr);
+ result->local_storage_state.Initialize(result->physical_column_ids, nullptr);
local_storage.InitializeScan(table.GetStorage(), result->local_storage_state.local_state, nullptr);

// Initialize the index scan state
@@ -152,7 +153,7 @@ OperatorResultType PhysicalHNSWIndexJoin::Execute(ExecutionContext &context, Dat
const auto &row_ids = hnsw_index.GetMultiScanResult(*state.index_state);

// Execute one big fetch for the LHS
- table.GetStorage().Fetch(transcation, chunk, state.phyiscal_column_ids, row_ids, output_idx, state.fetch_state);
+ table.GetStorage().Fetch(transcation, chunk, state.physical_column_ids, row_ids, output_idx, state.fetch_state);

// Now slice the chunk so that we include the rhs too
chunk.Slice(input, state.match_sel, output_idx, OUTER_COLUMN_OFFSET);
@@ -573,7 +574,9 @@ bool HNSWIndexJoinOptimizer::TryOptimize(Binder &binder, ClientContext &context,
//------------------------------------------------------------------------------

auto index_join = make_uniq<LogicalHNSWIndexJoin>(binder.GenerateTableIndex(), duck_table, *index_ptr, k_value);
- index_join->inner_column_ids = inner_get.GetColumnIds();
+ for(auto &column_id : inner_get.GetColumnIds()) {
+ index_join->inner_column_ids.emplace_back(column_id.GetPrimaryIndex());
+ }
index_join->inner_projection_ids = inner_get.projection_ids;
index_join->inner_returned_types = inner_get.returned_types;

diff --git a/src/hnsw/hnsw_optimize_scan.cpp b/src/hnsw/hnsw_optimize_scan.cpp
index 28cee3a..d5aded4 100644
--- a/src/hnsw/hnsw_optimize_scan.cpp
+++ b/src/hnsw/hnsw_optimize_scan.cpp
@@ -170,7 +170,7 @@ public:
auto &type = get.returned_types[column_id];
bool found = false;
for (idx_t i = 0; i < column_ids.size(); i++) {
- if (column_ids[i] == column_id) {
+ if (column_ids[i].GetPrimaryIndex() == column_id) {
column_id = i;
found = true;
break;
diff --git a/src/hnsw/hnsw_optimize_topk.cpp b/src/hnsw/hnsw_optimize_topk.cpp
index 6f78cea..14967d3 100644
--- a/src/hnsw/hnsw_optimize_topk.cpp
+++ b/src/hnsw/hnsw_optimize_topk.cpp
@@ -198,7 +198,7 @@ public:
auto &type = get.returned_types[column_id];
bool found = false;
for (idx_t i = 0; i < column_ids.size(); i++) {
- if (column_ids[i] == column_id) {
+ if (column_ids[i].GetPrimaryIndex() == column_id) {
column_id = i;
found = true;
break;
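The scan and top-k hunks above share one lookup: walk the projected column list, compare each entry's primary index against a plain column id, and rewrite the id to the matching position. A minimal sketch of that lookup, assuming a hypothetical `ColumnIndexSketch` type and a hypothetical helper `TryRemapColumn` (neither is part of DuckDB):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a wrapped column index; NOT DuckDB's ColumnIndex.
struct ColumnIndexSketch {
	uint64_t primary;
	uint64_t GetPrimaryIndex() const {
		return primary;
	}
};

// If column_id appears in column_ids (by primary index), overwrite column_id
// with its position in the list and return true; otherwise leave column_id
// untouched and return false.
bool TryRemapColumn(const std::vector<ColumnIndexSketch> &column_ids, uint64_t &column_id) {
	for (std::size_t i = 0; i < column_ids.size(); i++) {
		if (column_ids[i].GetPrimaryIndex() == column_id) {
			column_id = i;
			return true;
		}
	}
	return false;
}
```

Comparing through `GetPrimaryIndex()` keeps the comparison between two plain integers, so the wrapper type needs no `operator==` against raw ids.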
1 change: 1 addition & 0 deletions Makefile
@@ -483,6 +483,7 @@ generate-files:
python3 scripts/generate_settings.py
python3 scripts/generate_serialization.py
python3 scripts/generate_enum_util.py
python3 scripts/generate_metric_enums.py
-@python3 tools/pythonpkg/scripts/generate_connection_code.py || echo "Warning: generate_connection_code.py failed, cxxheaderparser & pcpp are required to perform this step"
# Run the formatter again after (re)generating the files
$(MAKE) format-main
18 changes: 18 additions & 0 deletions benchmark/tpch/struct/tpch_q1_struct.benchmark
@@ -0,0 +1,18 @@
# name: benchmark/tpch/struct/tpch_q1_struct.benchmark
# description: Run Q01 over lineitem stored in structs
# group: [struct]

name Q01 Structs
group tpch
subgroup sf1

require tpch

load
CALL dbgen(sf=1, suffix='_normalized');
CREATE TABLE lineitem_struct AS SELECT lineitem_normalized AS struct_val FROM lineitem_normalized;
CREATE VIEW lineitem AS SELECT UNNEST(struct_val) FROM lineitem_struct;

run extension/tpch/dbgen/queries/q01.sql

result extension/tpch/dbgen/answers/sf1/q01.csv
18 changes: 18 additions & 0 deletions benchmark/tpch/struct/tpch_q1_struct_nested.benchmark
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# name: benchmark/tpch/struct/tpch_q1_struct_nested.benchmark
# description: Run Q01 over lineitem stored in nested structs
# group: [struct]

name Q01 Structs
group tpch
subgroup sf1

require tpch

load
CALL dbgen(sf=1, suffix='_normalized');
CREATE TABLE lineitem_struct AS SELECT {'id': rowid, 'values': lineitem_normalized} AS struct_val FROM lineitem_normalized;
CREATE VIEW lineitem AS SELECT UNNEST(struct_val, recursive := true) FROM lineitem_struct;

run extension/tpch/dbgen/queries/q01.sql

result extension/tpch/dbgen/answers/sf1/q01.csv
Binary file added data/csv/null_terminator.csv
Binary file not shown.
27 changes: 27 additions & 0 deletions data/csv/unquoted_escape/basic.tsv
@@ -0,0 +1,27 @@
0 \\ 0
1 \ 1
2 \
2
3 a\\a 3
4 b\ b 4
5 c\
c 5
6 \\d 6
7 \ e 7
8 \
f 8
9 g\\ 9
10 h\ 10
11 i\
11
12 \\j 12
13 \ k 13
14 \
l 14
15 \\\\ 15
16 \ \ 16
17 \
\
17
18 \\\ \
18