-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: dask select
method
#667
Conversation
ListOfCompliantExpr = Union[list[PandasLikeExpr], list[ArrowExpr]] | ||
CompliantDataFrame = Union[PandasLikeDataFrame, ArrowDataFrame] | ||
ListOfCompliantSeries = Union[ | ||
list[PandasLikeSeries], list[ArrowSeries], list[DaskExpr] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this ok? Using DaskExpr
instead of Series?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, yes
df: DaskLazyFrame, | ||
*exprs: IntoDaskExpr, | ||
**named_exprs: IntoDaskExpr, | ||
) -> list[DaskExpr]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, Expr instead of Series
|
||
data = {"a": [1, 3, 2], "b": [4, 4, 6], "z": [7.0, 8, 9]} | ||
df = nw.from_native(dd.from_pandas(pd.DataFrame(data))) | ||
result = df.select("a", nw.col("b") + 1, (nw.col("z") * 2).alias("z*2")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few things to test to be (a bit more) confident about the implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, nice one!
at this rate, we might be running the tpc-h q1 query with dask within a couple of weeks π₯
ListOfCompliantExpr = Union[list[PandasLikeExpr], list[ArrowExpr]] | ||
CompliantDataFrame = Union[PandasLikeDataFrame, ArrowDataFrame] | ||
ListOfCompliantSeries = Union[ | ||
list[PandasLikeSeries], list[ArrowSeries], list[DaskExpr] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, yes
def __narwhals_namespace__(self) -> DaskNamespace: # pragma: no cover | ||
from narwhals._dask.namespace import DaskNamespace | ||
|
||
return DaskNamespace(backend_version=self._backend_version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exists for mypy only for now, so no cover
. I imagine it can be useful in future anyway
) | ||
raise NotImplementedError(msg) | ||
|
||
def _create_expr_from_callable( # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used only in reuse_series_implementation
and reuse_series_namespace_implementation
for now, so we are not hitting it, and may never will. If that's the case we can also move it to NotImplementedError
group
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm yeah i'm tempted to say NotImplementedError
then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't make mypy to shut up, let's address this later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome, thanks @FBruzzesi ! exciting how quickly this is moving
couple of minor comments, but feel to merge and address separately
) | ||
raise NotImplementedError(msg) | ||
|
||
def _create_expr_from_callable( # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm yeah i'm tempted to say NotImplementedError
then
What type of PR is this? (check all applicable)
Related issues
Checklist
If you have comments or can explain your changes, please do so below.
Edit: For DaskNamespace, as of now:
_create_expr_from_callable
: implemented_create_expr_from_series
,_create_compliant_series
,_create_series_from_scalar
raiseNotImplementedError
feedbacks are welcomed!