window_lead test appears to be non-deterministic #135

Closed
andygrove opened this issue Jan 19, 2023 · 0 comments · Fixed by #182
Labels
bug Something isn't working

Comments

@andygrove
Member

Describe the bug

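The test passes for me locally but fails in CI. In the failing run the rows come back in a different order than the test expects: each value of a is still paired with the same a_next value, but the row with a = 3 comes first.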

2023-01-19T00:06:25.0267445Z df = <datafusion.DataFrame object at 0x7ffbd80ccc70>
2023-01-19T00:06:25.0268026Z 
2023-01-19T00:06:25.0268336Z     def test_window_lead(df):
2023-01-19T00:06:25.0268610Z         df = df.select(
2023-01-19T00:06:25.0268869Z             column("a"),
2023-01-19T00:06:25.0269119Z             f.alias(
2023-01-19T00:06:25.0269347Z                 f.window(
2023-01-19T00:06:25.0269666Z                     "lead", [column("b")], order_by=[f.order_by(column("b"))]
2023-01-19T00:06:25.0269979Z                 ),
2023-01-19T00:06:25.0270215Z                 "a_next",
2023-01-19T00:06:25.0270448Z             ),
2023-01-19T00:06:25.0270671Z         )
2023-01-19T00:06:25.0270865Z     
2023-01-19T00:06:25.0271152Z         table = pa.Table.from_batches(df.collect())
2023-01-19T00:06:25.0271426Z     
2023-01-19T00:06:25.0271707Z         expected = {"a": [1, 2, 3], "a_next": [5, 6, None]}
2023-01-19T00:06:25.0272031Z >       assert table.to_pydict() == expected
2023-01-19T00:06:25.0272883Z E       AssertionError: assert {'a': [3, 1, ... [None, 5, 6]} == {'a': [1, 2, ... [5, 6, None]}
2023-01-19T00:06:25.0273211Z E         Differing items:
2023-01-19T00:06:25.0273583Z E         {'a_next': [None, 5, 6]} != {'a_next': [5, 6, None]}
2023-01-19T00:06:25.0273949Z E         {'a': [3, 1, 2]} != {'a': [1, 2, 3]}
2023-01-19T00:06:25.0274208Z E         Full diff:
2023-01-19T00:06:25.0274555Z E         - {'a': [1, 2, 3], 'a_next': [5, 6, None]}
2023-01-19T00:06:25.0274917Z E         ?            ---              ------
2023-01-19T00:06:25.0275260Z E         + {'a': [3, 1, 2], 'a_next': [None, 5, 6]}
2023-01-19T00:06:25.0275725Z E         ?        +++                      ++++++
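
The diff above suggests the rows themselves match and only their order differs. As a minimal sketch of one way a test could be made independent of row order, the comparison can be done on rows sorted by a key column. This is not necessarily how the issue was fixed; `sort_pydict_rows` is a hypothetical helper written for illustration and is not part of datafusion or pyarrow. It works on the plain dict that `table.to_pydict()` returns, using the values from the CI log.

```python
def sort_pydict_rows(columns, key):
    """Return a copy of a column-oriented dict with rows sorted by `key`.

    Hypothetical helper for illustration only. `columns` maps column names
    to equal-length lists, as produced by pyarrow's Table.to_pydict().
    """
    names = list(columns)
    # Turn the columns into row tuples so each row moves as a unit.
    rows = list(zip(*(columns[name] for name in names)))
    key_index = names.index(key)
    rows.sort(key=lambda row: row[key_index])
    # Rebuild the column-oriented layout from the sorted rows.
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


# Values copied from the failing CI run above.
actual = {"a": [3, 1, 2], "a_next": [None, 5, 6]}
expected = {"a": [1, 2, 3], "a_next": [5, 6, None]}

assert actual != expected                        # the order-sensitive comparison fails
assert sort_pydict_rows(actual, "a") == expected  # the same rows, once sorted by "a"
```

Sorting on column a works here because it contains no nulls; a column with None values would need a key function that handles them.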


andygrove added the bug label Jan 19, 2023
yahoNanJing pushed a commit to yahoNanJing/arrow-ballista that referenced this issue Jan 29, 2023
yahoNanJing added a commit to apache/datafusion-ballista that referenced this issue Jan 30, 2023
* Update datafusion dependency to the latest version

* Fix python

* Skip ut of test_window_lead due to apache/datafusion-python#135

* Fix clippy

---------

Co-authored-by: yangzhong <[email protected]>
fsdvh added a commit to coralogix/arrow-ballista that referenced this issue Feb 2, 2023
* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend:Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* feat: update script such that ballista-cli image is built as well (apache#601)

* Fix Cargo.toml format issue (apache#616)

* Refactor executor main (apache#614)

* Refactor executor main

* copy all configs

* toml fmt

* Refactor scheduler main (apache#615)

* refactor scheduler main

* toml fmt

* Python: add method to get explain output as a string (apache#593)

* Update contributor guide (apache#617)

* Cluster state refactor part 1 (apache#560)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* replace master with main (apache#621)

* implement new release process (apache#623)

* add docs on who can release (apache#632)

* Upgrade to DataFusion 16 (again) (apache#636)

* Update datafusion dependency to the latest version (apache#612)

* Update datafusion dependency to the latest version

* Fix python

* Skip ut of test_window_lead due to apache/datafusion-python#135

* Fix clippy

---------

Co-authored-by: yangzhong <[email protected]>

* Upgrade to DataFusion 17 (apache#639)

* Upgrade to DF 17

* Restore original error handling functionality

* Customize session builder

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* Add ClusterState trait

* Expose active job count

* Remove println

* Resubmit jobs when no resources available for scheduling

* Make parse_physical_expr public

* Reduce log spam

* Fix job submitted metric by ignoring resubmissions

* Record when job is queued in scheduler metrics (#28)

* Record when job is queueud in scheduler metrics

* add additional buckets for exec times

* Upstream rebase (#29)

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend:Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* Add ClusterState trait

* Expose active job count

* Remove println

* Resubmit jobs when no resources available for scheduling

* Make parse_physical_expr public

* Reduce log spam

* Fix job submitted metric by ignoring resubmissions

* Record when job is queued in scheduler metrics (#28)

* Record when job is queueud in scheduler metrics

* add additional buckets for exec times

* fmt

* clippy

* tomlfmt

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Dan Harris <[email protected]>

* Post merge update

* update message formatting

* post merge update

* another post-merge updates

* update github actions

* clippy

* update script

* fmt

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Tim Van Wassenhove <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
fsdvh added a commit to coralogix/arrow-ballista that referenced this issue Feb 15, 2023
fsdvh added a commit to coralogix/arrow-ballista that referenced this issue Feb 17, 2023
* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend::Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* feat: update script such that ballista-cli image is built as well (apache#601)

* Fix Cargo.toml format issue (apache#616)

* Refactor executor main (apache#614)

* Refactor executor main

* copy all configs

* toml fmt

* Refactor scheduler main (apache#615)

* refactor scheduler main

* toml fmt

* Python: add method to get explain output as a string (apache#593)

* Update contributor guide (apache#617)

* Cluster state refactor part 1 (apache#560)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* replace master with main (apache#621)

* implement new release process (apache#623)

* add docs on who can release (apache#632)

* Upgrade to DataFusion 16 (again) (apache#636)

* Update datafusion dependency to the latest version (apache#612)

* Update datafusion dependency to the latest version

* Fix python

* Skip ut of test_window_lead due to apache/datafusion-python#135

* Fix clippy

---------

Co-authored-by: yangzhong <[email protected]>

* Upgrade to DataFusion 17 (apache#639)

* Upgrade to DF 17

* Restore original error handling functionality

* check in benchmark image (apache#647)

* Remove `python` dir & python-related workflows (apache#654)

* refactor: remove python dir & python-related workflows

* remove brackets

* Handle job resubmission (apache#586)

* Handle job resubmission

* Make resubmission configurable and add test

* Fix debug log

* Add executor self-registration mechanism in the heartbeat service (apache#649)

Co-authored-by: yangzhong <[email protected]>

* Cluster state refactor Part 2 (apache#658)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

* WIP

* Implement JobState

* Tests and fixes

* do not hold ref across await point

* Fix clippy warnings

* Fix tomlfmt github action

* uncomment test

---------

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* Upgrade to DataFusion 18.0.0-rc1 (apache#664)

* Add executor terminating status for graceful shutdown

* Remove empty file

* Minor refactor to reduce duplicate code (apache#659)

* move test_util to ballista-examples package (apache#661)

* Upgrade to DataFusion 18 (apache#668)

* Enable physical plan round-trip tests (apache#666)

* Customize session builder

* Construct Executor with functions

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* Add ClusterState trait

* Expose active job count

* Make parse_physical_expr public

* Fix job submitted metric by ignoring resubmissions

* Record when job is queued in scheduler metrics (#28)

* Record when job is queued in scheduler metrics

* add additional buckets for exec times

* Upstream rebase (#29)

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend::Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* Add ClusterState trait

* Expose active job count

* Remove println

* Resubmit jobs when no resources available for scheduling

* Make parse_physical_expr public

* Reduce log spam

* Fix job submitted metric by ignoring resubmissions

* Record when job is queued in scheduler metrics (#28)

* Record when job is queued in scheduler metrics

* add additional buckets for exec times

* fmt

* clippy

* tomlfmt

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Dan Harris <[email protected]>

* Update from upstream (#30)

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend::Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* feat: update script such that ballista-cli image is built as well (apache#601)

* Fix Cargo.toml format issue (apache#616)

* Refactor executor main (apache#614)

* Refactor executor main

* copy all configs

* toml fmt

* Refactor scheduler main (apache#615)

* refactor scheduler main

* toml fmt

* Python: add method to get explain output as a string (apache#593)

* Update contributor guide (apache#617)

* Cluster state refactor part 1 (apache#560)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* replace master with main (apache#621)

* implement new release process (apache#623)

* add docs on who can release (apache#632)

* Upgrade to DataFusion 16 (again) (apache#636)

* Update datafusion dependency to the latest version (apache#612)

* Update datafusion dependency to the latest version

* Fix python

* Skip ut of test_window_lead due to apache/datafusion-python#135

* Fix clippy

---------

Co-authored-by: yangzhong <[email protected]>

* Upgrade to DataFusion 17 (apache#639)

* Upgrade to DF 17

* Restore original error handling functionality

* Customize session builder

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* Add ClusterState trait

* Expose active job count

* Remove println

* Resubmit jobs when no resources available for scheduling

* Make parse_physical_expr public

* Reduce log spam

* Fix job submitted metric by ignoring resubmissions

* Record when job is queued in scheduler metrics (#28)

* Record when job is queued in scheduler metrics

* add additional buckets for exec times

* Upstream rebase (#29)

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend::Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* Add ClusterState trait

* Expose active job count

* Remove println

* Resubmit jobs when no resources available for scheduling

* Make parse_physical_expr public

* Reduce log spam

* Fix job submitted metric by ignoring resubmissions

* Record when job is queued in scheduler metrics (#28)

* Record when job is queued in scheduler metrics

* add additional buckets for exec times

* fmt

* clippy

* tomlfmt

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Dan Harris <[email protected]>

* Post merge update

* update message formatting

* post merge update

* another post-merge update

* update github actions

* clippy

* update script

* fmt

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Tim Van Wassenhove <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Dan Harris <[email protected]>

* post merge fixes

* fix branch naming in github actions

* cleanup

* fmt

* update imports

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Tim Van Wassenhove <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Ian Alexander Joiner <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: jiangzhx <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
ch-sc added a commit to coralogix/arrow-ballista that referenced this issue Mar 31, 2023
* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend::Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* feat: update script such that ballista-cli image is built as well (apache#601)

* Fix Cargo.toml format issue (apache#616)

* Refactor executor main (apache#614)

* Refactor executor main

* copy all configs

* toml fmt

* Refactor scheduler main (apache#615)

* refactor scheduler main

* toml fmt

* Python: add method to get explain output as a string (apache#593)

* Update contributor guide (apache#617)

* Cluster state refactor part 1 (apache#560)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* replace master with main (apache#621)

* implement new release process (apache#623)

* add docs on who can release (apache#632)

* Upgrade to DataFusion 16 (again) (apache#636)

* Update datafusion dependency to the latest version (apache#612)

* Update datafusion dependency to the latest version

* Fix python

* Skip ut of test_window_lead due to apache/datafusion-python#135

* Fix clippy

---------

Co-authored-by: yangzhong <[email protected]>

* Upgrade to DataFusion 17 (apache#639)

* Upgrade to DF 17

* Restore original error handling functionality

* check in benchmark image (apache#647)

* Remove `python` dir & python-related workflows (apache#654)

* refactor: remove python dir & python-related workflows

* remove brackets

* Handle job resubmission (apache#586)

* Handle job resubmission

* Make resubmission configurable and add test

* Fix debug log

* Add executor self-registration mechanism in the heartbeat service (apache#649)

Co-authored-by: yangzhong <[email protected]>

* Cluster state refactor Part 2 (apache#658)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

* WIP

* Implement JobState

* Tests and fixes

* do not hold ref across await point

* Fix clippy warnings

* Fix tomlfmt github action

* uncomment test

---------

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* Upgrade to DataFusion 18.0.0-rc1 (apache#664)

* Minor refactor to reduce duplicate code (apache#659)

* move test_util to ballista-examples package (apache#661)

* Upgrade to DataFusion 18 (apache#668)

* Enable physical plan round-trip tests (apache#666)

* Prep 0.11 (apache#682)

* Change version to 0.11.0

* changelog

* update react-timeago version

* yarn upgrade

* fix

* fix

* revert yarn change

* Print versions

* Print locations

* Avoid github shenanigans

* Try to get runners running

* Try to get runners running

* already root

---------

Co-authored-by: Andy Grove <[email protected]>

* [minor] remove todo (apache#683)

* Add executor terminating status for graceful shutdown (apache#667)

* Add executor terminating status for graceful shutdown

* Remove empty file

* Update ballista/executor/src/executor_process.rs

Co-authored-by: Brent Gardner <[email protected]>

---------

Co-authored-by: Brent Gardner <[email protected]>

* Allow `BallistaContext::read_*` methods to read multiple paths. (apache#679)

* updated dependency in cargo, added read_json method, modified read_* methods to read multiple paths.

* ran cargo fmt

* Added revision for proper builds.

* Update scheduler.md (apache#657)

* Mark `SchedulerState` as pub (apache#688)

* Mark as pub

* Fmt

---------

Co-authored-by: Daniël Heres <[email protected]>

* Update graphviz-rust requirement from 0.5.0 to 0.6.1 (apache#651)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Upgrade DataFusion to 19.0.0 (apache#691)

* update release notes (apache#692)

* Make task launcher pub (apache#695)

Co-authored-by: Daniël Heres <[email protected]>

* Make task_manager pub (apache#698)

Co-authored-by: Daniël Heres <[email protected]>

* Add ExecutionEngine abstraction (apache#687)

* Allow accessing s3 locations in client mode (apache#700)

* Allow accessing s3 locations in client mode

* Removed s3 feature from test dependencies.

* fixed cargo-tomlfmt issues

* deployment/docker-compose.md incorrect remote ref (apache#699)

* Fix for error message during testing (apache#707)

* Fix cargo clippy

* Fix for error message during testing

* Remove unwrap for dealing with JobQueued event

* log task ids when launch tasks

---------

Co-authored-by: yangzhong <[email protected]>

* Upgrade datafusion to 20.0.0 & sqlparser to 0.32.0 (apache#711)

* Upgrade datafusion & sqlparser

* Move ballista_round_trip tests of benchmark into a separate feature to avoid stack overflow

* Fix failed tests of scheduler

* Update README.md (apache#729)

* Update link to proto file in dev docs (apache#713)

* Fix `show tables` fails (apache#715)

* Remove cancelled jobs from active cache (#36)

* Downgrade expected error to warning (#37)

* Downgrade expected error to warning

* add context

* Serialize configoptions and pass them to executor (#34)

* serialize configoptions and pass them to executor and allow extensions for TaskContext

* use ConfigOptions::with_extensions

* fix usage of ConfigOptions

* clippy

* Add wait_drained to SchedulerServer and Executor (#41)

* Add missing code from previous commits

* Fixes after merging from master

* Reintroduce Executor::with_functions

* Adapt prometheus histogram buckets

* cargo tomlfmt

* cargo fmt --all

* Allow too_many_arguments lint

* Cargo tomlfmt

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Tim Van Wassenhove <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Ian Alexander Joiner <[email protected]>
Co-authored-by: jiangzhx <[email protected]>
Co-authored-by: Yang Jiang <[email protected]>
Co-authored-by: Lakkam Sai Krishna Reddy <[email protected]>
Co-authored-by: Vrishabh <[email protected]>
Co-authored-by: Daniël Heres <[email protected]>
Co-authored-by: Daniël Heres <[email protected]>
Co-authored-by: Joe Williams <[email protected]>
Co-authored-by: Jaap Aarts <[email protected]>
Co-authored-by: mpurins-coralogix <[email protected]>
fsdvh added a commit to coralogix/arrow-ballista that referenced this issue Apr 26, 2023
* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Prepare 0.10.0 Release (apache#522)

* bump version

* CHANGELOG

* Ballista gets a docker image!!! (apache#521)

* Ballista gets a docker image!!!

* Enable flight sql

* Allow executing startup script

* Allow executing executables

* Clippy

* Remove capture group (apache#527)

* fix python build in CI (apache#528)

* fix python build in CI

* save progress

* use same min rust version in all crates

* fix

* use image from pyo3

* use newer image from pyo3

* do not require protoc

* wheels now generated

* rat - exclude generated file

* Update docs for simplified instructions (apache#532)

* Update docs for simplified instructions

* Fix whoopsie

* Update docs/source/user-guide/flightsql.md

Co-authored-by: Andy Grove <[email protected]>

* remove --locked (apache#533)

* Bump actions/labeler from 4.0.2 to 4.1.0 (apache#525)

* Provide a memory StateBackendClient (apache#523)

* Rename StateBackend::Standalone to StateBackend::Sled

* Copy utility files from sled crate since they cannot be used directly

* Provide a memory StateBackendClient

* Fix dashmap deadlock issue

* Fix for the comments

Co-authored-by: yangzhong <[email protected]>

* only build docker images on rc tags (apache#535)

* docs: fix style in the Helm readme (apache#551)

* Fix Helm chart's image format (apache#550)

* Update datafusion requirement from 14.0.0 to 15.0.0 (apache#552)

* Update datafusion requirement from 14.0.0 to 15.0.0

* Fix UT

* Fix python

* Fix python

* Fix Python

Co-authored-by: yangzhong <[email protected]>

* Make it concurrently to launch tasks to executors (apache#557)

* Make it concurrently to launch tasks to executors

* Refine for comments

Co-authored-by: yangzhong <[email protected]>

* fix(ui): fix last seen (apache#562)

* Support Alibaba Cloud OSS with ObjectStore (apache#567)

* Fix cargo clippy (apache#571)

Co-authored-by: yangzhong <[email protected]>

* Super minor spelling error (apache#573)

* Update env_logger requirement from 0.9 to 0.10 (apache#539)

Updates the requirements on [env_logger](https://github.com/rust-cli/env_logger) to permit the latest version.
- [Release notes](https://github.com/rust-cli/env_logger/releases)
- [Changelog](https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md)
- [Commits](rust-cli/env_logger@v0.9.0...v0.10.0)

---
updated-dependencies:
- dependency-name: env_logger
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update graphviz-rust requirement from 0.4.0 to 0.5.0 (apache#574)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* updated readme to contain correct versions of dependencies. (apache#580)

* Fix benchmark image link (apache#596)

* Add support for Azure (apache#599)

* Remove outdated script and use evergreen version of rust (apache#597)

* Remove outdated script and use evergreen version of rust

* Use debian protobuf

* feat: update script such that ballista-cli image is built as well (apache#601)

* Fix Cargo.toml format issue (apache#616)

* Refactor executor main (apache#614)

* Refactor executor main

* copy all configs

* toml fmt

* Refactor scheduler main (apache#615)

* refactor scheduler main

* toml fmt

* Python: add method to get explain output as a string (apache#593)

* Update contributor guide (apache#617)

* Cluster state refactor part 1 (apache#560)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* replace master with main (apache#621)

* implement new release process (apache#623)

* add docs on who can release (apache#632)

* Upgrade to DataFusion 16 (again) (apache#636)

* Update datafusion dependency to the latest version (apache#612)

* Update datafusion dependency to the latest version

* Fix python

* Skip ut of test_window_lead due to apache/datafusion-python#135

* Fix clippy

---------

Co-authored-by: yangzhong <[email protected]>

* Upgrade to DataFusion 17 (apache#639)

* Upgrade to DF 17

* Restore original error handling functionality

* check in benchmark image (apache#647)

* Remove `python` dir & python-related workflows (apache#654)

* refactor: remove python dir & python-related workflows

* remove brackets

* Handle job resubmission (apache#586)

* Handle job resubmission

* Make resubmission configurable and add test

* Fix debug log

* Add executor self-registration mechanism in the heartbeat service (apache#649)

Co-authored-by: yangzhong <[email protected]>

* Cluster state refactor Part 2 (apache#658)

* Customize session builder

* Add setter for executor slots policy

* Construct Executor with functions

* Add queued and completed timestamps to successful job status

* Add public methods to SchedulerServer

* Public method for getting execution graph

* Public method for stage metrics

* Use node-level local limit (#20)

* Use node-level local limit

* serialize limit in shuffle writer

* Revert "Merge pull request #19 from coralogix/sc-5792"

This reverts commit 08140ef, reversing
changes made to a7f1384.

* add log

* make sure we don't forget limit for shuffle writer

* update accum correctly and try to break early

* Check local limit accumulator before polling for more data

* fix build

Co-authored-by: Martins Purins <[email protected]>

* configure_me_codegen retroactively reserved on our `bind_host` parame… (apache#520)

* configure_me_codegen retroactively reserved on our `bind_host` parameter name

* Add label and pray

* Add more labels why not

* Add ClusterState trait

* Refactor slightly for clarity

* Revert "Use node-level local limit (#20)"

This reverts commit ff96bcd.

* Revert "Public method for stage metrics"

This reverts commit a802315.

* Revert "Public method for getting execution graph"

This reverts commit 490bda5.

* Revert "Add public methods to SchedulerServer"

This reverts commit 5ad27c0.

* Revert "Add queued and completed timestamps to successful job status"

This reverts commit c615fce.

* Revert "Construct Executor with functions"

This reverts commit 24d4830.

* Always forget the apache header

* WIP

* Implement JobState

* Tests and fixes

* do not hold ref across await point

* Fix clippy warnings

* Fix tomlfmt github action

* uncomment test

---------

Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>

* Upgrade to DataFusion 18.0.0-rc1 (apache#664)

* Minor refactor to reduce duplicate code (apache#659)

* move test_util to ballista-examples package (apache#661)

* Upgrade to DataFusion 18 (apache#668)

* Enable physical plan round-trip tests (apache#666)

* Prep 0.11 (apache#682)

* Change version to 0.11.0

* changelog

* update react-timeago version

* yarn upgrade

* fix

* fix

* revert yarn change

* Print versions

* Print locations

* Avoid github shenanigans

* Try to get runners running

* Try to get runners running

* already root

---------

Co-authored-by: Andy Grove <[email protected]>

* [minor] remove todo (apache#683)

* Add executor terminating status for graceful shutdown (apache#667)

* Add executor terminating status for graceful shutdown

* Remove empty file

* Update ballista/executor/src/executor_process.rs

Co-authored-by: Brent Gardner <[email protected]>

---------

Co-authored-by: Brent Gardner <[email protected]>

* Allow `BallistaContext::read_*` methods to read multiple paths. (apache#679)

* updated dependency in cargo, added read_json method, modified read_* methods to read multiple paths.

* ran cargo fmt

* Added revision for proper builds.

* Update scheduler.md (apache#657)

* Mark `SchedulerState` as pub (apache#688)

* Mark as pub

* Fmt

---------

Co-authored-by: Daniël Heres <[email protected]>

* Update graphviz-rust requirement from 0.5.0 to 0.6.1 (apache#651)

Updates the requirements on [graphviz-rust](https://github.com/besok/graphviz-rust) to permit the latest version.
- [Release notes](https://github.com/besok/graphviz-rust/releases)
- [Changelog](https://github.com/besok/graphviz-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/besok/graphviz-rust/commits)

---
updated-dependencies:
- dependency-name: graphviz-rust
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Upgrade DataFusion to 19.0.0 (apache#691)

* update release notes (apache#692)

* Make task launcher pub (apache#695)

Co-authored-by: Daniël Heres <[email protected]>

* Make task_manager pub (apache#698)

Co-authored-by: Daniël Heres <[email protected]>

* Add ExecutionEngine abstraction (apache#687)

* Allow accessing s3 locations in client mode (apache#700)

* Allow accessing s3 locations in client mode

* Removed s3 feature from test dependencies.

* fixed cargo-tomlfmt issues

* deployment/docker-compose.md incorrect remote ref (apache#699)

* Fix for error message during testing (apache#707)

* Fix cargo clippy

* Fix for error message during testing

* Remove unwrap for dealing with JobQueued event

* log task ids when launch tasks

---------

Co-authored-by: yangzhong <[email protected]>

* Upgrade datafusion to 20.0.0 & sqlparser to 0.32.0 (apache#711)

* Upgrade datafusion & sqlparser

* Move ballista_round_trip tests of benchmark into a separate feature to avoid stack overflow

* Fix failed tests of scheduler

* Update README.md (apache#729)

* Update link to proto file in dev docs (apache#713)

* Fix `show tables` fails (apache#715)

* Remove cancelled jobs from active cache (#36)

* Downgrade expected error to warning (#37)

* Downgrade expected error to warning

* add context

* Serialize configoptions and pass them to executor (#34)

* serialize configoptions and pass them to executor and allow extensions for TaskContext

* use ConfigOptions::with_extensions

* fix usage of ConfigOptions

* clippy

* Add wait_drained to SchedulerServer and Executor (#41)

* Add missing code from previous commits

* Fixes after merging from master

* Reintroduce Executor::with_functions

* Adapt prometheus histogram buckets

* cargo tomlfmt

* cargo fmt --all

* Allow too_many_arguments lint

* sc-16350: introducing notion of external and internal error in the failed job status

* Cargo tomlfmt

* sc-16350: small test fix

* sc-16350: partially implemented ballista error serialization

* sc-16350: update failed job proto definition

* sc-16350: cleanup

* sc-16350: update protoc

* sc-16350: update action

* sc-16350: more action update

* sc-16350: update test

* sc-16350: allow optional fields

* VTX-522: fix models

* VTX-522: cleanup

* Change error to warn

* VTX-522: fixes

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: yahoNanJing <[email protected]>
Co-authored-by: yangzhong <[email protected]>
Co-authored-by: Xin Hao <[email protected]>
Co-authored-by: Duyet Le <[email protected]>
Co-authored-by: r.4ntix <[email protected]>
Co-authored-by: Jeremy Dyer <[email protected]>
Co-authored-by: Sai Krishna Reddy Lakkam <[email protected]>
Co-authored-by: Aidan Kovacic <[email protected]>
Co-authored-by: Tim Van Wassenhove <[email protected]>
Co-authored-by: Dan Harris <[email protected]>
Co-authored-by: Martins Purins <[email protected]>
Co-authored-by: Brent Gardner <[email protected]>
Co-authored-by: Ian Alexander Joiner <[email protected]>
Co-authored-by: jiangzhx <[email protected]>
Co-authored-by: Yang Jiang <[email protected]>
Co-authored-by: Lakkam Sai Krishna Reddy <[email protected]>
Co-authored-by: Vrishabh <[email protected]>
Co-authored-by: Daniël Heres <[email protected]>
Co-authored-by: Daniël Heres <[email protected]>
Co-authored-by: Joe Williams <[email protected]>
Co-authored-by: Jaap Aarts <[email protected]>
Co-authored-by: mpurins-coralogix <[email protected]>
Co-authored-by: Christoph Schulze <[email protected]>
Co-authored-by: Dan Harris <[email protected]>