Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we override the matplotlib color shorthands? #293

Closed
mwaskom opened this issue Sep 15, 2014 · 27 comments · Fixed by #554
Closed

Should we override the matplotlib color shorthands? #293

mwaskom opened this issue Sep 15, 2014 · 27 comments · Fixed by #554

Comments

@mwaskom
Copy link
Owner

mwaskom commented Sep 15, 2014

Matplotlib lets you specify colors like "b" for blue, "r" for red, etc. which is nice. But, those colors are mapped to the corners of RGB space, which, bleh.

This is additionally problematic as some functions (like plt.scatter) hardcode these values as default.

It occurs to me that seaborn could override this by messing with the values in the mpl.colors.ColorConverter.colors dictionary. But that might be too much manipulation?

Opening this issue for discussion.

@shoyer
Copy link
Contributor

shoyer commented Sep 16, 2014

For what it's worth, I think this would be fine (and appreciated). Users who are concerned preserving the exact original styling of their matplotlib figures won't be using import seaborn anyways.

@mwaskom
Copy link
Owner Author

mwaskom commented Sep 16, 2014

Here are some issues that come to mind:

  • What function should this be done in? Should there be a user-facing function or should it just be done in sns.set? Should people be able to pass their own mapping dictionary? Should we do something clever like, given a palette, map the colors to the closest shorthand based on some RGB space math?
  • What color palette to use? Always the seaborn default (deep)? Should sns.set_palette check whether the palette is other variants (muted, dark, etc.) and reset the color dict then?
  • It's possible to reset the default and import-time style parameters with sns.reset_default sns.reset_orig. Should these also reset the color shorthands?

@shoyer
Copy link
Contributor

shoyer commented Sep 16, 2014

I would integrate remapping of color shortcuts with seaborn's style changes and thus sns.reset_default/ sns.reset_orig. That's one less new concept/API for users to figure out.

@tacaswell
Copy link
Contributor

👎 I foresee this causing headaches on the mpl side. If you really want the ability to re-map the shortcuts on the fly I think it should be done upstream.

More generally, I think much of the non-default-color/depends-on-pandas functionality of seaborn should be pulled upstream to mpl.

@tacaswell
Copy link
Contributor

cc @mdboom @WeatherGod @efiring

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

I foresee this causing headaches on the mpl side.

Do you mean headaches for you guys (issues saying "wtf my plots changed!") or do you actually expect anything to break? If it's just changing the RGB tuples in the dictionary values (not adding or subtracting any entries), everything should run fine, right?

More generally, I think much of the non-default-color/depends-on-pandas functionality of seaborn should be pulled upstream to mpl.

I agree though that kind of thing would have to wait until 2.0 right? And then anything else until 3.0? I think your "don't break tests on point releases" policy is very sound, but for stuff like this it's better to have an independent actor.

@WeatherGod
Copy link

The issue with hard-coded defaults is being slowly worked on with the style work in matplotlib, so for the purposes of this discussion, I would consider that being taken care of. As for what should 'r', 'g', and 'b' should mean, you do realize that 'red', 'green', and 'blue' are also available, right? The shorthand colors are the "pure" ones, while the longhand names are the HTML-defined colors. If you want to start messing around with how color strings are interpreted, go ahead and do that within seaborn and make sure that it is opt-in (I hate it when, just because a dependency of a dependency got imported, it starts mucking about with the packages I directly use).

Now, what I could see happen is something along the lines of the CSS-like work (MEP 22, IIRC?). In there, maybe it could be possible to map stringy color names to hexcodes? Users could have colors like "dark1", "light3" and such, but they would be defined in the style sheet. Just me thinking out loud.

@tacaswell
Copy link
Contributor

I suspect it will mostly be people with broken plots not realizing that seaborn was reaching in and changing things. My issue is more that it violates encapsulation, than adding another knob. Once you start reaching in and mucking with the internals to that degree seaborn has gone from being a layer on top to a monkey-patch-implemented fork.

I think many things (like the pallet context managers, the mutable color maps, what ever issues the syle package has that makes it annoying for you to use, the color pallets) can be pulled in in backwards compatible way. These are useful enough tools I would like to get them as widely distributed as possible.

I also have the sense that there is a lot of over-lapping effort between pandas plotting, prettyplotlib, and seaborn. I think it would be worth everyone's time to centralize that effort.

[edited to fix word salad]

@efiring
Copy link

efiring commented Oct 4, 2014

I agree with @tacaswell's objection to this.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

My issue is more that it violates encapsulation, than adding another knob. Once you start reaching in and mucking with the internals to that degree seaborn has gone from being a layer on top to a monkey-patch-implemented fork.

I think this is a valid concern, which is why I opened the issue instead of just doing it. But I also think this is a borderline case, and that actually more consistency would be gained from implementing it. And I do think that there's a difference between changing the behavior of a function and changing the lexicon used in a dictionary. Plus, it's not like it's called mpl.colors.ColorConverter._colors. I'm principled about what I mess with :)

What I have in mind is quite restricted (although not fully worked out). Because the default seaborn color cycle is based on the matplotlib one, it would just replace each entry in the colors dict with the "seaborn" version of that color. That way when you use "b" and "r" you'll get the blue and red you expect in the context of seaborn. (People get confused goes both ways -- I get issues from people surprised that plt.scatter and friends aren't following the seaborn color palette). Nothing more than that. Calling sns.reset_defaults will return these values to their original state. I could also add a "matplotlib" palette so it would be easy to undo this behavior without totally reseting the seaborn style changes.

I suspect it will mostly be people with broken plots not realizing that seaborn was reaching in and changing things.

Perhaps, although my hope is that it will be fairly obvious that it is congruent with everything else seaborn is doing.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

The issue with hard-coded defaults is being slowly worked on with the style work in matplotlib, so for the purposes of this discussion, I would consider that being taken care of.

Except it won't actually be "taken care of" for users until 2.0, which I'm assuming is months if not years in the future (based on the fact that you seem to be in development for 1.5 at the moment). And won't be taken care of at all for people stuck on the 1.x series.

As for what should 'r', 'g', and 'b' should mean, you do realize that 'red', 'green', and 'blue' are also available, right? The shorthand colors are the "pure" ones, while the longhand names are the HTML-defined colors.

Yes. I am aware of how you can specify colors in matplotlib. My point is that it would be nice to have the shorthands map to useful colors. I'd much rather write plt.plot(x, y, "g") than plt.plot(x, y, sns.colors["g"]) or similar.

I don't really see what's "pure" about "b", "r", "g". If anything I see it as the opposite -- I would never mess with what the HTML color names get mapped to because that has a very clear definition. But I don't think "b" obviously means RGB (0, 0, 1), and the RGB color space has no particular relationship to human color vision. I'll point out too that the shorthands aren't completely "pure" in the sense of RBG space even; for example, "g" is not (0, 1, 0) but (0, .5, 0), presumably because the former is really obnoxious.

@mwaskom mwaskom closed this as completed Oct 4, 2014
@mwaskom mwaskom reopened this Oct 4, 2014
@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

Whoops, wrong button.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

A middle ground would be a function to do this that people have to explicitly call independently of other changes to the default color cycle.

@tacaswell
Copy link
Contributor

I think we are talking past each other. The issue is not that the ability to re-define the meanings of color short cuts, I am tentatively on board with this, the issue is that the machinery for this should live up-stream.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

Ah! Gotcha. That makes sense.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

Though I would still want a seaborn interface on top of that to go from the named seaborn palettes to particular values for the shortcuts, of course.

@efiring
Copy link

efiring commented Oct 4, 2014

It looks like in the present mpl color dictionaries, r, g, and b correspond exactly to red, green, and blue. The minor difference is that c, m, and y use coefficients of 0.75 where cyan, magenta and yellow use 1.0. You might not like those colors, but they are conventional definitions; I would leave them alone.

It seems to me that even within mpl, messing with the present name:color mappings is not a good idea. It sounds like what is needed is a facility for any default color to be set (as is the case with the color cycle), and for there to be a separate facility allowing the user to assign non-conflicting names to colors, with or without seaborn. Then seaborn could define "sr", and set this as the default wherever "r" was the original mpl default. I think this could even work with single-letter colors other than rgbcmykw, so one could define a color "x", and then plot(data, 'xo') to get circles of color "x".
I think all of this can be done without an API break, so it would not have to be far in the future. Of course, there could always be a "gotcha" I am missing.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

It looks like in the present mpl color dictionaries, r, g, and b correspond exactly to red, green, and blue. The minor difference is that c, m, and y use coefficients of 0.75 where cyan, magenta and yellow use 1.0. You might not like those colors, but they are conventional definitions; I would leave them alone.

This isn't the main point here, but we don't perceive colors in RGB channels, so I don't really think that calling (1, 0, 0) "exactly red" makes any sense.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

The broader point about "monkey patching" is this: Python as a language has ways to afford changes and other ways to restrict them. If you want to signal that people shouldn't mess with this color dictionary, I would suggest either renaming it _colors or changing it from a dict to named tuple. To me, a global mutable object that isn't named with an underscore is fair game for manipulation.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

Then seaborn could define "sr", and set this as the default wherever "r" was the original mpl default. I think this could even work with single-letter colors other than rgbcmykw, so one could define a color "x", and then plot(data, 'xo') to get circles of color "x".

This does not seem very useful. The goal is not to define colors with completely obscure names, it is to bring the colors referred to by a specific set of shorthand tokens in line with the rest of the aesthetic defaults.

@efiring
Copy link

efiring commented Oct 4, 2014

I agree it is not the main point; but does this mean you think it is also OK to redefine "red", the html value?

Python, monkey-patching, and mpl: yes, those are reasonable suggestions, among the vast number of ways mpl code could be improved. MPL was not designed from the top down as a beautiful edifice, with perfect foresight as to what should be private and what public.

@efiring
Copy link

efiring commented Oct 4, 2014

OK, you are dead-set on manipulating the meanings of rgbcmykw, and this might indeed be a good idea. If so, it can be done with an interface added to mpl, as I think @tacaswell is advocating; it seems like it should be quite simple.

@mwaskom
Copy link
Owner Author

mwaskom commented Oct 4, 2014

I agree it is not the main point; but does this mean you think it is also OK to redefine "red", the html value?

No, because that has a specific meaning. Named color strings in matplotlib are interpreted as lookup values in the HTML color table. I think it's aesthetically unfortunate that "HTML red' maps to (1, 0, 0), but I wouldn't go into the cnames and arbitrarily change the meaning of individual colors. If there were some facility for swapping out the entire set, that might be interesting (I think something like this was mentioned upthread), but that's not something I'm concerned with.

On the other hand, from the user's perspective is "b" a lookup into a specific color lexicon, or should it correspond with the first color in the default cycle? People might be split in what they expect, but the latter both feels more consistent and lets people using seaborn specify colors with a very useful shorthand. (And it fixes the issue with hard-encoded defaults).

@WeatherGod
Copy link

So, if the issue is to have the color in the color cycle match a color used
elsewhere, then what is wrong with just simply adding to that dictionary
non-conflicting names. You can even use namespaces if you wish! When
seaborn is imported, it can augment the colors dictionary with values such
as "sr.blue", "sr.red", "sr.motherofpearl", and then set the color cycle to
use those values.

The only thing that remains is the issue of the default values of the
function signatures, which -- contary to your claims -- does not have to
wait for 2.0 to happen. We are in the transition process of establishing a
logical "styles" such that we can have a "current" style, "legacy" style,
and maybe even per-version styles. Through that, we can emulate default
function parameters in a forward-compatible way. We need help implementing
it though.

On Sat, Oct 4, 2014 at 3:38 PM, Michael Waskom [email protected]
wrote:

I agree it is not the main point; but does this mean you think it is also
OK to redefine "red", the html value?

No, because that has a specific meaning. Named color strings in matplotlib
are interpreted as lookup values in the HTML color table. I think it's
aesthetically unfortunate that "HTML red' maps to (1, 0, 0), but I
wouldn't go into the cnames and arbitrarily change the meaning of
individual colors. If there were some facility for swapping out the entire
set, that might be interesting (I think something like this was mentioned
upthread), but that's not something I'm concerned with.

On the other hand, from the user's perspective is "b" a lookup into a
specific color lexicon or should it correspond with the first color in the
default cycle? People might be split in what they expect, but the latter
both feels more consistent and lets people using seaborn specify colors
with a very useful shorthand. (And it fixes the issue with hard-encoded
defaults).


Reply to this email directly or view it on GitHub
#293 (comment).

@alexbw
Copy link

alexbw commented Nov 26, 2014

I don't know if this is the right thread, but I find Seaborn's behavior regarding shorthand linestyle to be frustrating.

Before importing seaborn:
unknown-1

After importing seaborn:
unknown

The plot undoubtedly looks nicer, but lacks the markers, which can be incredibly useful as a reference. I'm aware that sns.pointplot() will provide something similar, but it is confusing as a long-time matplotlib user, and aspiring seaborn user, for this very easy plotting shorthand to lose its functionality.

@mwaskom
Copy link
Owner Author

mwaskom commented Nov 26, 2014

@alexbw this is a matplotlib bug. See #344.

@alexbw
Copy link

alexbw commented Dec 8, 2014

Thanks, I see the fix in 0.5.1, sorry to muddle the current thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants