
Duplicating the legend example with other data does not seem to work #276

Open
StevenCHowell opened this issue Feb 1, 2017 · 2 comments


StevenCHowell commented Feb 1, 2017

I am trying to adapt the code from the legend example notebook to another data set. I replaced the data with five Gaussian distributions and updated the corresponding inputs, but the legend is entirely black.

Here is the code I ran (in a Jupyter notebook):

import pandas as pd
import numpy as np

from bokeh.io import output_notebook, show
from bokeh.plotting import Figure
output_notebook()

import datashader as ds
import datashader.transfer_functions as tf

from datashader.colors import Hot
from datashader.bokeh_ext import create_ramp_legend, create_categorical_legend

# create sample dataset
np.random.seed(1)
num=1000000

dists = {cat: pd.DataFrame(dict(x=np.random.normal(x,s,num),
                                y=np.random.normal(y,s,num),
                                val=val,cat=cat))
         for x,y,s,val,cat in 
         [(2,2,0.01,10,"d1"), (2,-2,0.1,20,"d2"), (-2,-2,0.5,30,"d3"), (-2,2,1.0,40,"d4"), (0,0,3,50,"d5")]}

df = pd.concat(dists,ignore_index=True)
df["cat"]=df["cat"].astype("category")
df.tail()  # show some of the data in an interactive setting

def create_base_plot():
    
    # plot extents are taken from the data (the taxi example's units do not apply here)
    xmin = df.x.min()
    ymin = df.y.min()
    xmax = df.x.max()
    ymax = df.y.max()

    cvs = ds.Canvas(plot_width=900,
                    plot_height=600,
                    x_range=(xmin, xmax),
                    y_range=(ymin, ymax))

    agg = cvs.points(df, 'x', 'y')
    img = tf.shade(agg, cmap=Hot, how='eq_hist')
    fig = Figure(x_range=(xmin, xmax),
                 y_range=(ymin, ymax),
                 plot_width=600,
                 plot_height=600,
                 tools='')
    
    fig.background_fill_color = 'black'
    fig.toolbar_location = None
    fig.axis.visible = False
    fig.grid.grid_line_alpha = 0
    fig.min_border_left = 0
    fig.min_border_right = 0
    fig.min_border_top = 0
    fig.min_border_bottom = 0

    fig.image_rgba(image=[img.data],
                   x=[xmin],
                   y=[ymin],
                   dw=[xmax-xmin],
                   dh=[ymax-ymin])
    return fig, (xmin, ymin, xmax, ymax), agg

fig, extent, datashader_agg = create_base_plot()
show(fig)

legend_fig = create_ramp_legend(datashader_agg,
                                Hot,
                                how='eq_hist',
                                width=600)
show(legend_fig)

Here is the result:
[Image: legend_fail]

I noticed the range of my aggregation is much larger than in the taxi example: [0, 728852] compared to [0, 1968].

>>> datashader_agg.min()
<xarray.DataArray ()>
array(0, dtype=int32)
>>> datashader_agg.max()
<xarray.DataArray ()>
array(728852, dtype=int32)

The increased range should not be responsible for the error, but I will look into that.

I am not certain whether this is a bug or simply a misunderstanding of the example on my part.

StevenCHowell (Author) commented

Here is a simpler testing script.

imports and setup:

# imports
import pandas as pd
import numpy as np

import datashader as ds
import datashader.transfer_functions as tf

from datashader.bokeh_ext import create_ramp_legend, create_categorical_legend

import bokeh.plotting
bokeh.plotting.output_notebook()

# create sample dataset
np.random.seed(1)
num=1000000
dists = {cat: pd.DataFrame(dict(x=np.random.normal(x,s,num),
                                y=np.random.normal(y,s,num),
                                val=val,cat=cat))
         for x,y,s,val,cat in 
         [(2,2,0.01,10,"d1"), (2,-2,0.1,20,"d2"), (-2,-2,0.5,30,"d3"), 
          (-2,2,1.0,40,"d4"), (0,0,3,50,"d5")]}
df = pd.concat(dists,ignore_index=True)
df["cat"]=df["cat"].astype("category")
df.tail()  # view data sample in interactive view

actual plotting script:

# generate the plot with a legend
height = 600
width = 600

# palette = ['white', 'navy']
from bokeh.palettes import Viridis256 as palette
# from datashader.colors import Hot as palette

how = 'eq_hist'
# how = 'linear'
# how = 'log'

x_range = [df.x.min(), df.x.max()]
y_range = [df.y.min(), df.y.max()]

cvs = ds.Canvas(plot_width=width, plot_height=height,
                x_range=x_range, y_range=y_range)

agg = cvs.points(df, 'x', 'y')
img = tf.shade(agg, cmap=palette, how=how)
fig = bokeh.plotting.Figure(x_range=x_range, y_range=y_range, 
                            plot_width=width, plot_height=height, 
                            tools='')

fig.image_rgba(image=[img.data], x=x_range[0], y=y_range[0], 
               dw=[x_range[1]-x_range[0]], dh=[y_range[1]-y_range[0]])

bokeh.plotting.show(fig)

legend_fig = create_ramp_legend(agg, palette, how=how, width=width)
bokeh.plotting.show(legend_fig)

sample output demonstrating the problem:

[Image]

jbednar (Member) commented Feb 15, 2017

It looks to me like this is mostly a documentation problem; the docstring for create_ramp_legend implies that any 'how' option is supported, but at present the actual code only supports 'linear' and 'log', without ever checking for other options. So it is not currently safe to use anything but those two 'how' options. I have a plan for how to support other options (#126), but meanwhile I've updated master to show that only those two options are allowed (commit 6101791e8).
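
Until other options are supported, one way to avoid a silently wrong legend is to check the `how` value before it reaches `create_ramp_legend`. The snippet below is only a sketch: `check_legend_how` and `SUPPORTED_LEGEND_HOW` are hypothetical names that are not part of datashader, and the list of supported values is taken from the comment above rather than from the library source.

```python
# Hypothetical guard, not part of datashader: it only encodes the statement
# above that create_ramp_legend currently handles 'linear' and 'log'.
SUPPORTED_LEGEND_HOW = ("linear", "log")


def check_legend_how(how):
    """Return `how` unchanged if the ramp legend is said to support it.

    Raises ValueError otherwise, so an unsupported option such as
    'eq_hist' fails loudly instead of producing a misleading legend.
    """
    if how not in SUPPORTED_LEGEND_HOW:
        raise ValueError(
            "how=%r is not supported for ramp legends; use one of %s"
            % (how, ", ".join(SUPPORTED_LEGEND_HOW))
        )
    return how


# Usage in the script above (shown as comments because it needs datashader):
#   how = check_legend_how('log')
#   img = tf.shade(agg, cmap=palette, how=how)
#   legend_fig = create_ramp_legend(agg, palette, how=how, width=width)
```

Passing the same checked value to both `tf.shade` and `create_ramp_legend` keeps the image and the legend on the same scale.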
