Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: generate groupby method wrapper code #16959

Closed
jbrockmendel opened this issue Jul 15, 2017 · 7 comments · Fixed by #28651
Closed

CLN: generate groupby method wrapper code #16959

jbrockmendel opened this issue Jul 15, 2017 · 7 comments · Fixed by #28651

Comments

@jbrockmendel
Copy link
Member

$ grep -r "[^_]exec(" pandas
pandas/compat/__init__.py:    exec("""
pandas/core/groupby.py:        exec(_def_str)
pandas/core/groupby.py:        exec(_def_str)

The usage in compat is for raise_with_traceback where the py2 version is a syntax error in py3. This is pretty straightforward to de-exec by putting the py2-specific code in a module that only gets imported in py2.

The usage in core.groupby is tougher. The string being exec'ed defines a method in some cases and property in others. The property version I think I've got a working fix, but the method version has me stumped.

These _def_strs are constructed in core.groupby._whitelist_method_generator:

    method_wrapper_template = \
        """def %(name)s(%(sig)s) :
    \"""
    %(doc)s
    \"""
    f = %(self)s.__getattr__('%(name)s')
    return f(%(args)s)"""
    property_wrapper_template = \
        """@property
def %(name)s(self) :
    \"""
    %(doc)s
    \"""
    return self.__getattr__('%(name)s')"""
    for name in whitelist:
        # don't override anything that was explicitly defined
        # in the base class
        if hasattr(GroupBy, name):
            continue
        # ugly, but we need the name string itself in the method.
        f = getattr(klass, name)
        doc = f.__doc__
        doc = doc if type(doc) == str else ''
        if isinstance(f, types.MethodType):
            wrapper_template = method_wrapper_template
            decl, args = make_signature(f)
            # pass args by name to f because otherwise
            # GroupBy._make_wrapper won't know whether
            # we passed in an axis parameter.
            args_by_name = ['{0}={0}'.format(arg) for arg in args[1:]]
            params = {'name': name,
                      'doc': doc,
                      'sig': ','.join(decl),
                      'self': args[0],
                      'args': ','.join(args_by_name)}
        else:
            wrapper_template = property_wrapper_template
            params = {'name': name, 'doc': doc}
        yield wrapper_template % params
@jreback
Copy link
Contributor

jreback commented Jul 15, 2017

not sure this is a big deal. this is just a way to code generate.

@jbrockmendel
Copy link
Member Author

For the raise_with_traceback I agree it's not that big a deal, I've just been indoctrinated to react to exec. Let's forget about that one.

In the case of method_wrapper_template the fact that I couldn't reason about how it works makes it a problem for me. Moving forward on the requests made by @gfyoung in #16913 is going to require poking around in there. I'll take another look at it next week.

@jreback
Copy link
Contributor

jreback commented Jul 16, 2017

yeah what we usually do is to generate code (for cython) using tempita. We actually could do that here, its cleaner and more pythonic. (see the pxi.in -> pxi files).

@jreback jreback changed the title Get rid of exec() usage? CLN: generate groupby method wrapper code Jul 16, 2017
@gfyoung
Copy link
Member

gfyoung commented Jul 16, 2017

Moving forward on the requests made by @gfyoung in #16913 is going to require poking around in there.

@jbrockmendel : Could you explain why deprecations are requiring you to go through this refactoring?

@jbrockmendel
Copy link
Member Author

Could you explain why deprecations are requiring you to go through this refactoring?

The non-invasive part of the deprecation does not. What I was referring to was the suggestion that removing these methods entirely might fix the problem of the circular import:

[...] That might go away once those hist and plot methods are removed.

Removing these methods from Series/DataFrame does cascade into requiring changes in core.groupby. Like with #16913, the stumbled-upon issue (exec) is more important than the original motivator (refactoring that's a ways down the road).

@gfyoung
Copy link
Member

gfyoung commented Jul 16, 2017

The non-invasive part of the deprecation does not. What I was referring to was the suggestion that removing these methods entirely might fix the problem of the circular import:

Got it. That makes sense. 👍

@jbrockmendel
Copy link
Member Author

Digging into this a bit further...

    method_wrapper_template = \
        """def %(name)s(%(sig)s) :
    \"""
    %(doc)s
    \"""
    f = %(self)s.__getattr__('%(name)s')
    return f(%(args)s)"""

The %(self)s there in most cases resolves to just self (the exception is hist, which uses an accessor, which is another layer of redirection...). So at first this looks like it should get into some kind of loop trying to access itself. But it turns out __getattr__ has been patched so that this actually accesses self.obj.__getattr__('%(name)s').

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants