WIP(iox-11398): (moar) patched df upgrade 2024-07-02 #32

Closed
wants to merge 5 commits into from

wiedld (Collaborator) commented Jul 12, 2024

⚠️ This will not be merged. ⚠️

Same as the existing branch, plus two additional cherry-picked commits:

apache@6038f4c

commit a33679d0bf33d960e83c6d47d3c420922d0c3271
Author: wiedld <[email protected]>
Date:   Wed Jul 10 11:21:01 2024 -0700

    Track parquet writer encoding memory usage on MemoryPool (#11345)

apache@1dfac86

commit f07e3431e356eeed5d7a1cf9d3cc9eeeafaf8c5f (HEAD -> wiedld/update-df-july-week-1, influx_origin/wiedld/update-df-july-week-1)
Author: wiedld <[email protected]>
Date:   Fri Jul 12 04:04:42 2024 -0700

    fix(11397): surface proper errors in ParquetSink (#11399)

appletreeisyellow and others added 5 commits July 11, 2024 16:27
* feat: add UDF `to_local_time()`

* chore: support column value in array

* chore: lint

* chore: fix conversion for us, ms, and s

* chore: add more tests for daylight savings time

* chore: add function description

* refactor: update tests and add examples in description

* chore: add description and example

* chore: doc

chore: doc

chore: doc

chore: doc

chore: doc

* chore: stop copying

* chore: fix typo

* chore: mention that the offset varies based on daylight savings time

* refactor: parse timezone once and update examples in description

* refactor: replace map..concat with flat_map

* chore: add hard code timestamp value in test

chore: doc

chore: doc

* chore: handle errors and remove panics

* chore: move some test to slt

* chore: clone time_value

* chore: typo

---------

Co-authored-by: Andrew Lamb <[email protected]>
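
For readers unfamiliar with what the `to_local_time()` commits above are aiming at, here is a minimal Python sketch of the idea as described in the commit messages: take a timezone-aware timestamp and return the local wall-clock reading with the timezone dropped. This is not the DataFusion implementation. The function body, the fixed `WINTER` and `SUMMER` offsets, and the pass-through behavior for naive input are assumptions made for illustration; a real zone would supply the offset itself, and that offset varies with daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def to_local_time(ts: datetime) -> datetime:
    """Drop the timezone but keep the local wall-clock reading (illustrative only)."""
    if ts.tzinfo is None:
        # Assumption: a timestamp without a timezone is returned unchanged.
        return ts
    return ts.replace(tzinfo=None)


# Hypothetical fixed offsets standing in for one zone's winter and summer offsets.
WINTER = timezone(timedelta(hours=1))
SUMMER = timezone(timedelta(hours=2))

utc_winter = datetime(2024, 1, 1, 0, 0, 20, tzinfo=timezone.utc)
utc_summer = datetime(2024, 7, 1, 0, 0, 20, tzinfo=timezone.utc)

# Same UTC time of day, different local wall-clock time because the offset differs.
local_winter = to_local_time(utc_winter.astimezone(WINTER))
local_summer = to_local_time(utc_summer.astimezone(SUMMER))
```

The result carries no offset, so it can be compared or binned by local calendar time, which appears to be the motivation for the daylight saving tests listed above.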
…1203)

* fix: Incorrect LEFT JOIN evaluation result on OR conditions

* Add a few more test cases

* Don't push join filter predicates into join_conditions

* Add test case and fix typo

* Add test case

---------

Co-authored-by: Andrew Lamb <[email protected]>
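
As background for the LEFT JOIN fix above, the following sketch uses Python's built-in `sqlite3` module to show the standard SQL semantics that such a fix has to preserve: an `OR` predicate inside the `ON` clause of a `LEFT JOIN` only decides which right-side rows match, while the same predicate in `WHERE` filters the joined rows. It does not reproduce the DataFusion bug, and the table names `l` and `r` and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE l(id INTEGER, a INTEGER);
    CREATE TABLE r(id INTEGER, b INTEGER);
    INSERT INTO l VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO r VALUES (1, 100), (2, 200);
    """
)

# OR predicate as part of the join condition: every left row survives,
# and unmatched left rows get NULL for the right-side column.
on_rows = conn.execute(
    "SELECT l.id, r.b FROM l "
    "LEFT JOIN r ON l.id = r.id AND (l.a = 10 OR r.b = 999) "
    "ORDER BY l.id"
).fetchall()

# Same predicate applied after the join: rows where it is false or NULL are dropped.
where_rows = conn.execute(
    "SELECT l.id, r.b FROM l "
    "LEFT JOIN r ON l.id = r.id "
    "WHERE (l.a = 10 OR r.b = 999) "
    "ORDER BY l.id"
).fetchall()

conn.close()
```

Because the two placements give different results, an optimizer that moves a predicate between the filter and the join condition can change which rows a query returns.
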
* feat(11344): track memory used for non-parallel writes

* feat(11344): track memory usage during parallel writes

* test(11344): create bounded stream for testing

* test(11344): test ParquetSink memory reservation

* feat(11344): track bytes in file writer

* refactor(11344): tweak the ordering to add col bytes to rg_reservation, before selecting shrinking for data bytes flushed

* refactor: move each col_reservation and rg_reservation to match the parallelized call stack for col vs rg

* test(11344): add memory_limit enforcement test for parquet sink

* chore: cleanup to remove unnecessary reservation management steps

* fix: fix CI test failure due to file extension rename
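
The commits above describe reserving memory against a pool while a writer buffers data, and releasing it when bytes are flushed. A toy Python model of that pattern is sketched below. It is not DataFusion's API: `BoundedPool`, `Reservation`, `PoolExhausted`, `write_batches`, and the `flush_threshold` parameter are hypothetical names chosen for illustration, and the real code is Rust.

```python
class PoolExhausted(Exception):
    """Raised when a reservation would exceed the pool limit."""


class BoundedPool:
    """Toy memory pool with a hard byte limit (hypothetical)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0


class Reservation:
    """Tracks the bytes one consumer holds against a pool (hypothetical)."""

    def __init__(self, pool: BoundedPool) -> None:
        self.pool = pool
        self.size = 0

    def try_grow(self, n: int) -> None:
        if self.pool.used + n > self.pool.limit:
            raise PoolExhausted(f"cannot reserve {n} bytes")
        self.pool.used += n
        self.size += n

    def free(self) -> None:
        self.pool.used -= self.size
        self.size = 0


def write_batches(pool: BoundedPool, batch_sizes: list[int], flush_threshold: int) -> int:
    """Buffer batches, reserving memory for each; flush once the buffer is large enough."""
    reservation = Reservation(pool)
    flushed = 0
    try:
        for size in batch_sizes:
            reservation.try_grow(size)  # account for the bytes before buffering them
            if reservation.size >= flush_threshold:
                flushed += reservation.size
                reservation.free()  # flushed bytes no longer count against the pool
        flushed += reservation.size  # final flush of whatever is still buffered
        return flushed
    finally:
        reservation.free()  # release on success and on error alike
```

With a limit of 100 bytes, `write_batches(BoundedPool(100), [30, 30, 30], 50)` flushes 90 bytes in total; with a limit of 40 the second batch raises `PoolExhausted`, which is the kind of behavior a memory-limit enforcement test would check.
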
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
wiedld (Collaborator, Author) commented Aug 7, 2024

We have advanced beyond this branch. Closing.

@wiedld wiedld closed this Aug 7, 2024