Support for LIMIT clause with DataFusion (#529)
* Condition for BinaryExpr, filter, input_ref, rexcall, and rexliteral
* Updates for test_filter
* more of test_filter.py working with the exception of some date pytests
* Add workflow to keep datafusion dev branch up to date (#440)
* Include setuptools-rust in conda build recipe, in host and run
* Remove PyArrow dependency
* rebase with datafusion-sql-planner
* refactor changes that were inadvertent during rebase
* timestamp with loglca time zone
* Bump DataFusion version (#494)
* bump DataFusion version
* remove unnecessary downcasts and use separate structs for TableSource and TableProvider
* Include RelDataType work
* Include RelDataType work
* Introduced SqlTypeName Enum in Rust and mappings for Python
* impl PyExpr.getIndex()
* add getRowType() for logical.rs
* Introduce DaskTypeMap for storing correlating SqlTypeName and DataTypes
* use str values instead of Rust Enums, Python is unable to Hash the Rust Enums if used in a dict
* linter changes, why did that work on my local pre-commit??
* linter changes, why did that work on my local pre-commit??
* Convert final strs to SqlTypeName Enum
* removed a few print statements
* commit to share with colleague
* updates
* checkpoint
* Temporarily disable conda run_test.py script since it uses features not yet implemented
* formatting after upstream merge
* expose fromString method for SqlTypeName to use Enums instead of strings for type checking
* expanded SqlTypeName from_string() support
* accept INT as INTEGER
* tests update
* checkpoint
* checkpoint
* Refactor PyExpr by removing From trait, and using recursion to expand expression list for rex calls
* skip test that uses create statement for gpuci
* Basic DataFusion Select Functionality (#489)
* Condition for BinaryExpr, filter, input_ref, rexcall, and rexliteral
* Updates for test_filter
* more of test_filter.py working with the exception of some date pytests
* Add workflow to keep datafusion dev branch up to date (#440)
* Include setuptools-rust in conda build recipe, in host and run
* Remove PyArrow dependency
* rebase with datafusion-sql-planner
* refactor changes that were inadvertent during rebase
* timestamp with loglca time zone
* Include RelDataType work
* Include RelDataType work
* Introduced SqlTypeName Enum in Rust and mappings for Python
* impl PyExpr.getIndex()
* add getRowType() for logical.rs
* Introduce DaskTypeMap for storing correlating SqlTypeName and DataTypes
* use str values instead of Rust Enums, Python is unable to Hash the Rust Enums if used in a dict
* linter changes, why did that work on my local pre-commit??
* linter changes, why did that work on my local pre-commit??
* Convert final strs to SqlTypeName Enum
* removed a few print statements
* Temporarily disable conda run_test.py script since it uses features not yet implemented
* expose fromString method for SqlTypeName to use Enums instead of strings for type checking
* expanded SqlTypeName from_string() support
* accept INT as INTEGER
* Remove print statements
* Default to UTC if tz is None
* Delegate timezone handling to the arrow library
* Updates from review

Co-authored-by: Charles Blackmon-Luca <[email protected]>

* updates for expression
* uncommented pytests
* uncommented pytests
* code cleanup for review
* code cleanup for review
* Enabled more pytest that work now
* Enabled more pytest that work now
* Output Expression as String when BinaryExpr does not contain a named alias
* Output Expression as String when BinaryExpr does not contain a named alias
* Disable 2 pytest that are causing gpuCI issues. They will be addressed in a follow-up PR
* Handle Between operation for case-when
* adjust timestamp casting
* Refactor projection _column_name() logic to the _column_name logic in expression.rs
* removed println! statements
* Updates from review
* Add Offset and point to repo with offset in datafusion
* Introduce offset
* limit updates
* commit before upstream merge
* Code formatting
* update Cargo.toml to use Arrow-DataFusion version with LIMIT logic
* Bump DataFusion version to get changes around variant_name()
* Use map partitions for determining the offset (sketched below)
* Refactor offset partition func
* Update to use TryFrom logic
* Add cloudpickle to independent scheduler requirements

Co-authored-by: Charles Blackmon-Luca <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
1 parent f7c57b3, commit e10b3f2: 17 changed files with 187 additions and 101 deletions.
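The commit message notes that OFFSET is applied with Dask's map_partitions ("Use map partitions for determining the offset"). That Python-side code is not among the Rust files shown below, so the following is only a minimal, dependency-free sketch of the idea: a hypothetical helper that turns a global offset/limit into per-partition (skip, take) pairs so each partition can be sliced independently. Names and types are stand-ins, not dask-sql or DataFusion APIs.

fn per_partition_slices(
    partition_lens: &[usize],
    offset: usize,
    limit: Option<usize>,
) -> Vec<(usize, usize)> {
    let mut to_skip = offset;
    let mut to_take = limit.unwrap_or(usize::MAX);
    let mut slices = Vec::with_capacity(partition_lens.len());
    for &len in partition_lens {
        let skip = to_skip.min(len);        // rows of this partition consumed by OFFSET
        let take = to_take.min(len - skip); // rows of this partition kept for LIMIT
        slices.push((skip, take));
        to_skip -= skip;
        to_take = to_take.saturating_sub(take);
    }
    slices
}

fn main() {
    // Three partitions of 4 rows each; OFFSET 5 LIMIT 6 keeps global rows 5..11.
    let slices = per_partition_slices(&[4, 4, 4], 5, Some(6));
    assert_eq!(slices, vec![(4, 0), (1, 3), (0, 3)]);
    println!("per-partition (skip, take): {:?}", slices);
}

Partitions that fall entirely inside the offset get an empty slice, so only the partitions that actually contribute rows need to be materialized.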
@@ -0,0 +1,35 @@
use crate::expression::PyExpr;
use crate::sql::exceptions::py_type_err;

use datafusion::scalar::ScalarValue;
use pyo3::prelude::*;

use datafusion::logical_expr::{logical_plan::Limit, Expr, LogicalPlan};

#[pyclass(name = "Limit", module = "dask_planner", subclass)]
#[derive(Clone)]
pub struct PyLimit {
    limit: Limit,
}

#[pymethods]
impl PyLimit {
    #[pyo3(name = "getLimitN")]
    pub fn limit_n(&self) -> PyResult<PyExpr> {
        Ok(PyExpr::from(
            Expr::Literal(ScalarValue::UInt64(Some(self.limit.n.try_into().unwrap()))),
            Some(self.limit.input.clone()),
        ))
    }
}

impl TryFrom<LogicalPlan> for PyLimit {
    type Error = PyErr;

    fn try_from(logical_plan: LogicalPlan) -> Result<Self, Self::Error> {
        match logical_plan {
            LogicalPlan::Limit(limit) => Ok(PyLimit { limit }),
            _ => Err(py_type_err("unexpected plan")),
        }
    }
}
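The TryFrom impl above is how a specific node is extracted from a generic LogicalPlan without panicking: a mismatched node becomes a typed error that PyO3 surfaces to Python as an exception. Below is a minimal, dependency-free sketch of the same pattern; the enum, struct, and error type are stand-ins, not DataFusion or dask_planner types.

use std::convert::TryFrom;

#[derive(Debug)]
enum PlanNode {
    Limit { n: usize },
    Projection,
}

struct LimitNode {
    n: usize,
}

impl TryFrom<PlanNode> for LimitNode {
    type Error = String;

    fn try_from(plan: PlanNode) -> Result<Self, Self::Error> {
        match plan {
            // Only a Limit node can be wrapped; anything else becomes an error,
            // just as the real impl returns py_type_err("unexpected plan").
            PlanNode::Limit { n } => Ok(LimitNode { n }),
            other => Err(format!("unexpected plan: {:?}", other)),
        }
    }
}

fn main() {
    assert_eq!(LimitNode::try_from(PlanNode::Limit { n: 10 }).unwrap().n, 10);
    assert!(LimitNode::try_from(PlanNode::Projection).is_err());
    println!("TryFrom-based plan extraction behaves as expected");
}

Returning a Result instead of unwrapping keeps an unexpected plan variant as a recoverable Python exception rather than a Rust panic.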
@@ -0,0 +1,44 @@
use crate::expression::PyExpr;
use crate::sql::exceptions::py_type_err;

use datafusion::scalar::ScalarValue;
use pyo3::prelude::*;

use datafusion::logical_expr::{logical_plan::Offset, Expr, LogicalPlan};

#[pyclass(name = "Offset", module = "dask_planner", subclass)]
#[derive(Clone)]
pub struct PyOffset {
    offset: Offset,
}

#[pymethods]
impl PyOffset {
    #[pyo3(name = "getOffset")]
    pub fn offset(&self) -> PyResult<PyExpr> {
        Ok(PyExpr::from(
            Expr::Literal(ScalarValue::UInt64(Some(self.offset.offset as u64))),
            Some(self.offset.input.clone()),
        ))
    }

    #[pyo3(name = "getFetch")]
    pub fn offset_fetch(&self) -> PyResult<PyExpr> {
        // TODO: Still need to implement fetch size! For now get everything from offset on with '0'
        Ok(PyExpr::from(
            Expr::Literal(ScalarValue::UInt64(Some(0))),
            Some(self.offset.input.clone()),
        ))
    }
}

impl TryFrom<LogicalPlan> for PyOffset {
    type Error = PyErr;

    fn try_from(logical_plan: LogicalPlan) -> Result<Self, Self::Error> {
        match logical_plan {
            LogicalPlan::Offset(offset) => Ok(PyOffset { offset }),
            _ => Err(py_type_err("unexpected plan")),
        }
    }
}
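getFetch() above hard-codes a fetch of 0, which the TODO comment says should be read as "get everything from the offset on". A minimal sketch of that interpretation applied to a plain slice, assuming a hypothetical apply_offset_fetch helper that is not part of the repository:

fn apply_offset_fetch<T: Clone>(rows: &[T], offset: u64, fetch: u64) -> Vec<T> {
    let start = (offset as usize).min(rows.len());
    let rest = &rows[start..];
    if fetch == 0 {
        // fetch == 0 is treated as "no fetch limit": keep every row after the offset
        rest.to_vec()
    } else {
        rest.iter().take(fetch as usize).cloned().collect()
    }
}

fn main() {
    let rows: Vec<i32> = (0..10).collect();
    // OFFSET 3 with no fetch limit: rows 3..10
    assert_eq!(apply_offset_fetch(&rows, 3, 0), vec![3, 4, 5, 6, 7, 8, 9]);
    // OFFSET 3, fetch 2 (LIMIT 2 OFFSET 3): rows 3 and 4
    assert_eq!(apply_offset_fetch(&rows, 3, 2), vec![3, 4]);
    println!("offset/fetch slicing verified");
}

Overloading 0 as "unbounded" means a fetch of exactly zero rows cannot be expressed, which is presumably why the fetch size is still marked as a TODO.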