Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: make attribute setting de-facto insert column #9033

Closed
jreback opened this issue Dec 7, 2014 · 8 comments
Closed

API: make attribute setting de-facto insert column #9033

jreback opened this issue Dec 7, 2014 · 8 comments
Labels
Closing Candidate May be closeable, needs more eyeballs Enhancement Indexing Related to indexing on series/frames, not to indexes themselves

Comments

@jreback
Copy link
Contributor

jreback commented Dec 7, 2014

xref #8994
xref #5904
xref #8572

This might be a bit controversial, but the issues raised in #8994 and #5904
point to some continued confusion w.r.t. attribute setting 'being' column setting

so if we now have

df = DataFrame({'A' : [1,2,3], 'B' : 5 })
df.C = 5

is an attribute set

it could be a column set
e.g. de-facto df['C'] = 5

If someone actually wants an attribute to 'stick' around. (meaning they would have to intercept the __finalize__ methods and actually deal with them properly, then I think it is reasonable to also have them add to the _internal_names as well (see #8572)

So basically would try to set a column (unless its an internal name).

(note that getattr is de-facto already equivalent to __getitem__, e.g. df.B === df['B'])
(don't mind my JS equivalence notion :)

@jreback jreback added API Design Compat pandas objects compatability with Numpy or Python functions labels Dec 7, 2014
@jreback jreback added this to the 0.16.0 milestone Dec 7, 2014
@jreback
Copy link
Contributor Author

jreback commented Dec 7, 2014

cc @jakevdp
cc @kjordahl
cc @hugadams
@jorisvandenbossche @shoyer @cpcloud @hayd

@jreback jreback changed the title API: make attribute setting de-facto setattr API: make attribute setting de-facto insert column Dec 7, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@BrenBarn
Copy link

I think this change should be considered very very very carefully. It is a gigantic change to the API. It means that any later additions of new methods or attributes to DataFrame may unexpectedly break tons of code. For instance, if I have code that does df.blah = 5, and we later add a method or attribute called blah, my code will now overwrite that attribute. Allowing quick attribute access for reads is very different, because even if we add a blah method, code that does df.blah will not modify the object. I think it would be wise not to into changes like this that not only change behavior, but have major implications for any and all future changes to the DataFrame API.

@shoyer
Copy link
Member

shoyer commented May 21, 2015

@BrenBarn Thanks for commenting, you've convinced me. This indeed seems very dangerous.

@jreback
Copy link
Contributor Author

jreback commented May 21, 2015

yeh, this was sort of pie-in-the-sky. To promote consistency. In theory its a nice idea, but by auto-converting to a column, then you pretty much preclude future method expansion. Fundamentally there is really ONE way to create columns, namely __setitem__, but for convience attribute access is not symmetric.

@jreback jreback modified the milestones: Someday, Next Major Release May 21, 2015
@hughesadam87
Copy link

Personally, this would break my library as attribute setting is hacked in. I'm not sure if it would affect GeoPandas, since they have honest-to-goodness dataframe subclasses, and I'm using composite classes. Not that I think this should hold you back from a major release, but personally I'm not a fan of this behavior. The attribute setting syntax is pretty consistent throughout Python, and it's not really that difficult to do df['C'] = 50. If anything, maybe raise a warning when trying to set the attribute?

If you do put this through, can you do a small writeup about how to use the _finalize_ method as you mentioned?

@jreback
Copy link
Contributor Author

jreback commented May 21, 2015

@hugadams you might find this interesting reading (near the bottom): http://pandas.pydata.org/pandas-docs/stable/internals.html

@hughesadam87
Copy link

Ha wow, I need to keep up I guess!

On Thu, May 21, 2015 at 12:24 PM, jreback [email protected] wrote:

@hugadams https://github.com/hugadams you might find this interesting
reading (near the bottom):
http://pandas.pydata.org/pandas-docs/stable/internals.html


Reply to this email directly or view it on GitHub
#9033 (comment).

Adam Hughes
Physics Ph.D Candidate
George Washington University

@mroeschke mroeschke added Enhancement and removed Compat pandas objects compatability with Numpy or Python functions labels Apr 10, 2020
@mroeschke mroeschke added Indexing Related to indexing on series/frames, not to indexes themselves and removed API Design labels Apr 11, 2021
@mroeschke mroeschke removed this from the Someday milestone Oct 13, 2022
@jbrockmendel
Copy link
Member

closeble? NDFrame.__setattr__ not has a warning specifically about not doing this

@jbrockmendel jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Jan 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Closing Candidate May be closeable, needs more eyeballs Enhancement Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

No branches or pull requests

6 participants