Deprecate pd.Index(names='') #19295

toobaz · 2018-01-18T10:55:53Z

pd.Index accepts name - great
pd.MultiIndex accepts names - after all, they are multiple names
pd.Index accepts names for compatibility with pd.MultiIndex - well, OK, after all, that constructor can result in a MultiIndex
pd.Index should accept ( BUG: Creating Index name using names names argument, doesn't set index name #19082 ) names even when it results in a flat index - well, OK, still for compatibility
pd.MultiIndex accepts name for compatibility - wait, wasn't names already provided for compatibility?! OK, forget it, go for name.
pd.Index accepts name even for multiple levels' names, for compatibility - with MultiIndex, which accepts it for compatibility with pd.Index - aaaaalright
All pd.Index subclasses (which will never result in multiple levels, and which currently in some cases discard names, in other cases raise an error) should accept names for compatibility - no, wait.

Proposal 1

There is one way to have compatibility across any kind of Index (sub)class constructor, and it is name. When the constructor results in a MultiIndex (which can happen with pd.Index or pd.MultiIndex), then name should be list-like, and each level will "receive" the respective component
in those cases - and only in those cases - in which a constructor actually results in a MultiIndex, names can be used as an alias for name. In all other cases, it is not accepted.

or alternatively:

(b) (my preference) names is deprecated (in all cases in which it is currently supported, and remains unsupported in other constructors/cases)
(c) (tradeoff between mental health and backward compatibility) names is supported in pd.MultiIndex, still supported but deprecated in pd.Index when resulting in MultiIndex (and remains unsupported in other constructors/cases)

Corollary:

names will not be supported by any constructor that is not pd.Index or pd.MultiIndex.

Notice that a 1-level MultiIndex is still a MultiIndex. That is,

pd.Index([('a',), ('b',)], name=('only_one_name',)) will still work
pd.Index([('a',), ('b',)], names=('only_one_name',)) will still work (at least as long as we don't deprecate names entirely)
pd.Index([('a',), ('b',)], name='only_one_name') will still not work

Proposal 2

There is one way to have compatibility across any kind of Index (sub)class constructor, and it is names, which must always be list-like.
In those cases in which the constructor results in a flat index, name is also accepted, and interpreted as names=(name,); instead name is deprecated for pd.MultiIndex, and for pd.Index when resulting in a MultiIndex (even if with 1 level)

Corollary:

implementation-wise, we will want to decorate all __new__ of non-MultiIndex subclasses to convert names to name

The text was updated successfully, but these errors were encountered:

jreback · 2018-01-18T11:27:34Z

its easiest just to only accept name only. If its a scalar, make it a list. Then the len of names must match the number of levels. Very simple this way. I don't think it would break any external APIS (of course we need to actually deprcate names).

SO Proposal 1. 2b

toobaz · 2018-01-18T11:36:47Z

its easiest just to only accept name only.

I like this!

If its a scalar, make it a list.

I don't like this! You mean in a 1-level MultiIndex? I think the user must still pass ("name",), we don't need to guess.

jreback · 2018-01-18T11:47:16Z

yeah that’s fine ; let’s make it explicit

toobaz · 2018-01-29T14:02:11Z

What about pd.MultiIndex.from_tuples, pd.MultiIndex.from_product... ? Replace names -> name there too?

toobaz · 2018-01-29T14:06:01Z

By the way, the more I think about it, the more my "Proposal 2" looks cleaner to me. But it's more a vague sensation than an objective assessment.

jorisvandenbossche · 2018-02-02T12:52:50Z

I think we should look back here at what is the goal? Why do we need/want compatibility? Is it because people want to be able to write code to creates an index without knowing in advance the values and names it receives will contains tuples or not / will result in MultiIndex or not?

I think that is a legitimate case, but, one that is only relevant for Index itself.

But that would also lead me to conclue: for the single level Index subclasses (Int64Index, DatetimeIndex, ...) this will never be the case, so for those we don't care and we use the 'normal' keyword, i.e. name, and there is no reason here to support names.

So I think this is more one of the options of Proposal 1?

Can you explain more what you find attractive in Proposal 2? It feels very strange to me to see name as a kind of alias of names for DatetimeIndex, etc.

toobaz · 2018-02-02T16:53:33Z

@jorisvandenbossche you are right that (in my proposal 2) the support for names is only compelling for Index, which is the only case in which we don't know what is going to come out.

This said, I think (although I don't think it is a priority) that

supporting names in all classes is just a matter of creating the decorator
we might want to support names e.g. in copy():

idx = Index(some_data_which_I_ignore, names=tuple_with_maybe_1_element)
idx2 = idx.copy(names=another_tuple_with_maybe_1_element)

... and hence (presumably) also in the constructor

people might reasonably expect that something that results (for instance) in a Int64Index (that is, Index()) takes similar arguments than a Int64Index, where applicable
I'm not claiming we should present (e.g. in the docs) name as an alias... it is just the way you could see it if you want to "think general"

But again, I'm relatively indifferent on this.

jorisvandenbossche · 2018-02-09T14:54:45Z

supporting names in all classes is just a matter of creating the decorator

It might well be that it is not difficult to support it, but still, what's the point to support it?

we might want to support names e.g. in copy():

That could be interesting (not sure if it is needed, you can also do .copy().rename()), but I don't think that necessarily means it should be the same in the constructor.

people might reasonably expect that something that results (for instance) in a Int64Index (that is, Index()) takes similar arguments than a Int64Index, where applicable

"where applicable", yes, and names is not applicable, I would say :)

But again, I'm relatively indifferent on this.

Me as well, as indeed, we can also simply allow it and convert it under the hood, without advertising it. But, if we allow it, it should be documented, and that will add to the docstring for something that is not really needed IMO.

jorisvandenbossche · 2018-02-09T14:56:05Z

So I would vote for: Index subclasses only name, MultiIndex only names, Index both

jreback · 2018-02-24T17:31:14Z

cc @PoppyBagel

TomAugspurger · 2018-04-23T19:13:40Z

Pushing to the next release.

jbrockmendel · 2021-06-19T03:54:36Z

possibly closed by #38597

mroeschke · 2022-06-07T21:59:00Z

This looks fairly addressed by #38597 so closing. We can reopen a new issue if there were specific followups needed

toobaz mentioned this issue Jan 18, 2018

BUG: Creating Index with the names argument #19168

Closed

3 tasks

jreback added Indexing Related to indexing on series/frames, not to indexes themselves API Design Deprecate Functionality to remove in pandas labels Jan 18, 2018

jreback added this to the 0.23.0 milestone Jan 18, 2018

toobaz changed the title ~~Deprecate pd.*Index*(names=''), at least when resulting in flat index - or vice-versa~~ Deprecate pd.*Index*(names='') Jan 20, 2018

arminv mentioned this issue Apr 10, 2018

ERR: disallow non-hashables in Index/MultiIndex construction & rename #20548

Merged

4 tasks

TomAugspurger modified the milestones: 0.23.0, Next Major Release Apr 23, 2018

toobaz mentioned this issue May 21, 2018

(Assuming we don't want to disallow 1-level MultiIndexes), stop identifying MultiIndex by nlevels == 1 #21149

Closed

toobaz added Index Related to the Index class or subclasses and removed Indexing Related to indexing on series/frames, not to indexes themselves labels Jun 29, 2019

mroeschke added MultiIndex and removed API Design labels Jun 12, 2021

jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Jun 19, 2021

mroeschke closed this as completed Jun 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deprecate pd.Index(names='') #19295

Deprecate pd.Index(names='') #19295

toobaz commented Jan 18, 2018

jreback commented Jan 18, 2018 •

edited

Loading

toobaz commented Jan 18, 2018

jreback commented Jan 18, 2018

toobaz commented Jan 29, 2018

toobaz commented Jan 29, 2018 •

edited

Loading

jorisvandenbossche commented Feb 2, 2018

toobaz commented Feb 2, 2018

jorisvandenbossche commented Feb 9, 2018

jorisvandenbossche commented Feb 9, 2018

jreback commented Feb 24, 2018

TomAugspurger commented Apr 23, 2018

jbrockmendel commented Jun 19, 2021

mroeschke commented Jun 7, 2022

Deprecate pd.*Index*(names='') #19295

Deprecate pd.*Index*(names='') #19295

Comments

toobaz commented Jan 18, 2018

Proposal 1

Proposal 2

jreback commented Jan 18, 2018 • edited Loading

toobaz commented Jan 18, 2018

jreback commented Jan 18, 2018

toobaz commented Jan 29, 2018

toobaz commented Jan 29, 2018 • edited Loading

jorisvandenbossche commented Feb 2, 2018

toobaz commented Feb 2, 2018

jorisvandenbossche commented Feb 9, 2018

jorisvandenbossche commented Feb 9, 2018

jreback commented Feb 24, 2018

TomAugspurger commented Apr 23, 2018

jbrockmendel commented Jun 19, 2021

mroeschke commented Jun 7, 2022

Deprecate pd.Index(names='') #19295

Deprecate pd.Index(names='') #19295

jreback commented Jan 18, 2018 •

edited

Loading

toobaz commented Jan 29, 2018 •

edited

Loading