Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail to extend a Stream with new methods #442

Open
florentbr opened this issue Dec 5, 2021 · 6 comments
Open

Fail to extend a Stream with new methods #442

florentbr opened this issue Dec 5, 2021 · 6 comments

Comments

@florentbr
Copy link

Inheritance on a Stream seems to be broken.

The following example fails with 'map' object has no attribute 'draw' :

import streamz

class MyStream(streamz.Stream) :
    def draw(self) :
        print('called draw')

@MyStream.register_api()
class draw(MyStream) :
    def __init__(self, *args, **kwargs) :
        print('called draw')

s = MyStream()
s.map(lambda x : x + 5).draw()
s.emit(99)

I couldn't find a class factory in the source and it looks like all the nodes are instantiated directlly with Stream when it should use the type of the main instance.

@martindurant
Copy link
Member

It would work with @Stream.register_api - will that suffice?

@florentbr
Copy link
Author

@martindurant, the issue with @Stream.register_api is that it will mutate the original Stream class which will impact all the instances everywere else. So it doesn't fullfill the requirement for extending a stream. For instance, if I were to override the sink method, it would mess/break the other streams. What I'm expecting is basic polymorphism.

@florentbr
Copy link
Author

It's called subclassing in the pandas documentation :
https://pandas.pydata.org/docs/development/extending.html#subclassing-pandas-data-structures

@martindurant
Copy link
Member

martindurant commented Dec 6, 2021

I think I see what you are trying to do, but you face the following problem: map is a method on Stream (and defined by the map class), so that when you call .map, there is no easy way to know that this was called from within a subclass - map is not a child of your custom class and cannot be made so dynamically.

That leaves you with the two obvious paths, as you point out: come up with unique names for all node types, or allow name clashes and understand that you may clobber for all streams. For your example, I'm not totally sure how you expect the method draw to be called, since it doesn't actually take any events. All existing nodes are implemented at the update (incoming events) and emit (outgoing events) points.

However, another possibility, is maybe what you are actually after. You can use Stream.connect to tie together nodes without using the register_api route. For example, the following are equivalent, where the second and third versions avoid using the automatically created method.

    s = Stream()
    m = s.map(func)

or

    s = Stream()
    m = streamz.core.map(s, func)

or

    s = Stream()
    m = streamz.core.map(None, func)
    s.connect(m)

@florentbr
Copy link
Author

map is a method on Stream (and defined by the map class), so that when you call .map, there is no easy way to know that this was called from within a subclass

map is a wrapped function. The class/subclass is available at the time it's registered and called :

return func(*args, **kwargs)

map is not a child of your custom class and cannot be made so dynamically.

The following Stream.register_api() would adds the expected inheritance at registration :

    @classmethod
    def register_api(cls, func):
        @functools.wraps(func)
        def wrapped(self, *args, **kwargs):
            clazz = type(func.__name__, (func, type(self)), {})
            node = clazz(self, *args, **kwargs)
            return node
        setattr(Stream, func.__name__, wrapped)

For your example, I'm not totally sure how you expect the method draw to be called, since it doesn't actually take any events.

This is just an example to demonstrate the issue with inheritance. Don't read too much into it.

You can use Stream.connect to tie together nodes without using the register_api route.

This library threats it's methods as plugings but doesn't keep track of them.

So I'd have to rewrite all the methods that are registered by Stream.register_api to connect the nodes. It kind of defeat the purpose of inheritance.

@martindurant
Copy link
Member

I haven't had a chance yet to understand what difference your alternate decorator has, but please do propose a PR to demonstrate it! I daresay there are others who have thought along these lines and not known how to proceed.

florentbr added a commit to florentbr/streamz that referenced this issue Dec 7, 2021
florentbr added a commit to florentbr/streamz that referenced this issue Dec 8, 2021
florentbr added a commit to florentbr/streamz that referenced this issue Dec 10, 2021
florentbr added a commit to florentbr/streamz that referenced this issue Dec 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants