Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Dynamic Attribute Access for Namespaces for Improved Typing #14475

Closed
max-muoto opened this issue Feb 14, 2024 · 13 comments
Closed

Allow Dynamic Attribute Access for Namespaces for Improved Typing #14475

max-muoto opened this issue Feb 14, 2024 · 13 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@max-muoto
Copy link
Contributor

max-muoto commented Feb 14, 2024

Description

Currently if you follow the example for defining a custom DataFrame namespace:

@pl.api.register_dataframe_namespace("split")
class SplitFrame:
    def __init__(self, df: pl.DataFrame):
        self._df = df

    def by_alternate_rows(self) -> list[pl.DataFrame]:
        df = self._df.with_row_index(name="n")
        return [
            df.filter((pl.col("n") % 2) == 0).drop("n"),
            df.filter((pl.col("n") % 2) != 0).drop("n"),
        ]


pl.DataFrame(
    data=["aaa", "bbb", "ccc", "ddd", "eee", "fff"],
    schema=[("txt", pl.String)],
).split.by_alternate_rows()

split will be flagged as being an invalid attribute access when using Pyright or MyPy, since this is dynamically added onto the DataFrame.

We could add general dynamic attribute access to get around this:

class DataFrame:
    if typing.TYPE_CHECKING:
        def __getattr__(self, attr_name: str, /) -> typing.Any: ...

I do see this being problematic from the perspective of having to over over to see if the attribute actually exists in case you were to mistype an actual attribute on DataFrame.

Perhaps, we could add an explicit namespace lookup table to avoid this.

df.ns.split.by_alternate_rows() # `ns` has `__getattr__` return `Any` for non-standard namespaces.

This way, one can opt out of strict typing for these dynamic namespaces without having to add typing ignore comments. We could still keep in the old way of accessing these namespaces for those who don't care about the typing. This has been flagged on StackOverflow as an issue for reference: https://stackoverflow.com/questions/77733446/polars-api-registering-and-type-checkers

@max-muoto max-muoto added the enhancement New feature or an improvement of an existing feature label Feb 14, 2024
@max-muoto max-muoto changed the title Allow Dynamic Attribute Access for Namespaces Allow Dynamic Attribute Access for Namespaces for Improved Typing Feb 14, 2024
@max-muoto
Copy link
Contributor Author

Happy to put up a PR if this is something we're interested in.

@max-muoto
Copy link
Contributor Author

I have yet to test this, but it would look something like this. Draft PR: #14476

@stinodego
Copy link
Member

stinodego commented Feb 14, 2024

This is something that is brought up often on Stack Overflow and on our Discord. If there's an elegant workaround, that would be great.

I am not sure how I feel about adding a namespace just for working around this issue. If you have something that works, I'd be happy to review / play around with it so I can make up my mind.

@dangotbanned
Copy link
Contributor

dangotbanned commented Feb 14, 2024

#13899 related, which has info on some existing workarounds.

@max-muoto by opting out of typing, is this also not providing code completion?

Personally, I think the ergonomic benefits of accessors are outweighed by the lack of information the methods will have when compared to the rest of polars.

I'd really appreciate a simpler solution than polugins, if possible

@MarcoGorelli
Copy link
Collaborator

I think the ergonomic benefits of accessors are outweighed by the lack of information the methods will have when compared to the rest of polars

Just checking I'm reading this sentence correctly: are you essentially saying "the lack of typing / code completion negates the ergonomic benefits of accessors, so accessors are not worth it"?

@dangotbanned
Copy link
Contributor

Just checking I'm reading this sentence correctly: are you essentially saying "the lack of typing / code completion negates the ergonomic benefits of accessors, so accessors are not worth it"?

Yeah that's what I was going for @MarcoGorelli, sorry it wasn't clear.

The only thing I'd add is I'm talking about 3rd-party accessors, in their current form.
All of the 1st-party accessors make using polars a really intuitive experience.

It's the contrast between the two, currently, that I was trying to highlight. Which I'm unsure if this proposal would resolve (but would really appreciate if it did)

@deanm0000
Copy link
Collaborator

I'm not well versed on typing and linting so bear with me. Would this make type hinting work or it'll just keep linters from complaining about non-existent attributes or something else?

@max-muoto
Copy link
Contributor Author

This is something that is brought up often on Stack Overflow and on our Discord. If there's an elegant workaround, that would be great.

I am not 100% how I feel about adding a namespace just for working around this issue. If you have something that works, I'd be happy to review / play around with it so I can make up my mind.

Sounds good, I'll wrap up my PR and ping you here once that's good to look at. Thanks!

@max-muoto
Copy link
Contributor Author

I'm not well versed on typing and linting so bear with me. Would this make type hinting work or it'll just keep linters from complaining about non-existent attributes or something else?

This would mainly serve as a way of stopping your type-checker from complaining about these attributes, but it wouldn't give specific return annotations, instead returning Any.

@max-muoto
Copy link
Contributor Author

max-muoto commented Feb 14, 2024

#13899 related, which has info on some existing workarounds.

@max-muoto by opting out of typing, is this also not providing code completion?

Personally, I think the ergonomic benefits of accessors are outweighed by the lack of information the methods will have when compared to the rest of polars.

I'd really appreciate a simpler solution than polugins, if possible

Yes, unlike polugins this won't provide code completion, but will primarily serve as a way of avoiding typing errors. Personally ran into some friction with using polugins, with some missing imports. Obviously can just fix these things myself, but it would be nice to have a solution that's low friction and a good way of working around type-checkers.

A MyPy plugin would likely work well for MyPy users, but having something compatible with the native Python typing system is important for Pyright users.

@stinodego
Copy link
Member

As pointed out by @alexander-beedie in the linked PR, this doesn't look like the right solution to the problem.

Users can either ignore the type errors explicitly, or use something like polugins. Maybe we should create a MyPy plugin. There is probably some good approach here, but users ideally shouldn't change the way they write their code to appease a type checker / linter / formatter.

I'll close this issue as not planned, specifically because the proposed solution is not desirable.

@stinodego stinodego closed this as not planned Won't fix, can't repro, duplicate, stale Feb 18, 2024
@Distortedlogic
Copy link

I have a reasonable solution to this I believe. I have published a package for it. Very alpha but it does the main thing.

https://github.com/Summit-Sailors/polar_patch

@Distortedlogic
Copy link

@stinodego can you review the concept of my proposed solution to this issue? at your convenience of course :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature
Projects
None yet
Development

No branches or pull requests

6 participants