DOC: lack of legend for ds.count_cat #248

Closed
mortonjt opened this issue Sep 28, 2016 · 3 comments

@mortonjt

In the last figure in the NYC taxi cab notebook, the order of the colors was given:

Here the order of colors is roughly red (midnight), yellow (4am), green (8am), cyan (noon), blue (4pm), purple (8pm), and back to red (since hours and colors are both cyclic).

But it isn't clear exactly where this order is obtained. Was it obtained from the ordering of df['hour']? It would be really nice if we could have some sort of legend to explain the colors for these sorts of categories.

@jbednar
Member

jbednar commented Sep 28, 2016

The colors were specified directly, per hour:

colors = ["#FF0000","#FF3F00","#FF7F00","#FFBF00","#FFFF00","#BFFF00","#7FFF00","#3FFF00",
          "#00FF00","#00FF3F","#00FF7F","#00FFBF","#00FFFF","#00BFFF","#007FFF","#003FFF",
          "#0000FF","#3F00FF","#7F00FF","#BF00FF","#FF00FF","#FF00BF","#FF007F","#FF003F",]

I.e., there are 24 colors in this list, starting with red, and each one is used in order for the 24 hours of the day starting at midnight (binning time by hour). The legends notebook shows how to add a legend. We're still working on it being less awkward to do all this (see issue #126).
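As a minimal sketch of that mapping, the list can be turned into an explicit hour-to-color lookup. The name `hour_to_color` is hypothetical and not part of datashader; the sketch only assumes, as described above, that the color at position i is used for hour i.

```python
# The 24 colors quoted above, one per hour of the day, starting at midnight.
colors = ["#FF0000", "#FF3F00", "#FF7F00", "#FFBF00", "#FFFF00", "#BFFF00", "#7FFF00", "#3FFF00",
          "#00FF00", "#00FF3F", "#00FF7F", "#00FFBF", "#00FFFF", "#00BFFF", "#007FFF", "#003FFF",
          "#0000FF", "#3F00FF", "#7F00FF", "#BF00FF", "#FF00FF", "#FF00BF", "#FF007F", "#FF003F"]

# Assumption: hour i (0 = midnight) is drawn with colors[i].
# `hour_to_color` is an illustrative name, not a datashader API.
hour_to_color = {hour: color for hour, color in enumerate(colors)}

# Print the anchor hours mentioned in the notebook text.
for hour in (0, 4, 8, 12, 16, 20):
    print(f"{hour:02d}:00 -> {hour_to_color[hour]}")
```

The printed entries line up with the red, yellow, green, cyan, blue, purple sequence quoted from the notebook.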

@jbednar jbednar closed this as completed Sep 28, 2016
@mortonjt
Author

Oh, awesome! I didn't see that issue. Would there be any interest in just passing a pd.Series? For a first pass at the data, a legend painted on the plot is nice but not strictly required. It is awkward, though, to figure out which colors map to which categories, because what I'm noticing is that the ordering of the categories in the data doesn't actually affect the ordering of the colors.

For example, when I run the following code

import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf
from bokeh.plotting import output_notebook

df = pd.DataFrame(
    {'x': [3, 3, 3, 6, 6, 6, 8, 8, 8],
     'y': [2, 5, 8, 2, 5, 8, 2, 5, 8],
     'group' : ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']
    }
)
df['group'] = df['group'].astype('category')
x_range, y_range = ((0, 10), (0, 10))
colors = ['#FF0000', "#00FF00", "#0000FF"]

plot_width  = int(750)
plot_height = int(plot_width//1.2)

cvs = ds.Canvas(plot_width=plot_width, plot_height=plot_height, 
                x_range=x_range, y_range=y_range)
agg = cvs.points(df, 'x', 'y', ds.count_cat('group'))
img = tf.shade(agg, how='linear', color_key=colors) 
tf.dynspread(img, threshold=0.2, max_px=4)

I get the following image:

[Image: datashader1]

But when I run the following code

df = pd.DataFrame(
    {'x': [8, 3, 3, 6, 6, 6, 8, 8, 3],
     'y': [5, 5, 8, 2, 5, 8, 2, 8, 2],
     'group' : ['c', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'a']
    }
)
df['group'] = df['group'].astype('category')
x_range, y_range = ((0, 10), (0, 10))
colors = ['#FF0000', "#00FF00", "#0000FF"]

plot_width  = int(750)
plot_height = int(plot_width//1.2)

cvs = ds.Canvas(plot_width=plot_width, plot_height=plot_height, 
                x_range=x_range, y_range=y_range)
agg = cvs.points(df, 'x', 'y', ds.count_cat('group'))
img = tf.shade(agg, how='linear', color_key=colors)
tf.dynspread(img, threshold=0.2, max_px=4)

I get the exact same image.

So it doesn't look like the ordering of the categories in the dataframe matters, which makes it a bit confusing which colors correspond to which categories.
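One likely explanation, sketched below with pandas only: `astype('category')` stores the unique values in sorted order no matter how the rows are arranged, so both dataframes end up with the categories `a`, `b`, `c`. The sketch assumes that `ds.count_cat` follows that category order and that a list passed as `color_key` is matched to it position by position. Neither assumption is confirmed in this thread.

```python
import pandas as pd

colors = ['#FF0000', '#00FF00', '#0000FF']

# The 'group' columns from the two examples above.
ordered = pd.Series(['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']).astype('category')
shuffled = pd.Series(['c', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'a']).astype('category')

# Row order differs, but the stored categories are identical (sorted).
print(list(ordered.cat.categories))   # ['a', 'b', 'c']
print(list(shuffled.cat.categories))  # ['a', 'b', 'c']

# Assumption: colors are assigned in category order, so an explicit
# mapping can be built by zipping the categories with the color list.
color_key = dict(zip(shuffled.cat.categories, colors))
print(color_key)  # {'a': '#FF0000', 'b': '#00FF00', 'c': '#0000FF'}
```

If `tf.shade` accepts a dict for `color_key` in the installed version (worth checking in its docstring), passing this mapping instead of a bare list would make the category-to-color assignment explicit.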

@jbednar
Member

jbednar commented Sep 29, 2016

See the PR above, which provides a function to report which colors correspond to which categories. I'll merge that if there are no objections or test failures, and hopefully that will help!
