-
-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problems with _colorize() #899
Comments
Looking at the code, I think there's actually a more insidious problem if total = data.sum(axis=2)
# zero-count pixels will be 0/0, but it's safe to ignore that when dividing
with np.errstate(divide='ignore', invalid='ignore'):
r = (data.dot(rs)/total).astype(np.uint8)
g = (data.dot(gs)/total).astype(np.uint8)
b = (data.dot(bs)/total).astype(np.uint8) ...and also the same assumption down below, where I guess the solution is to shift and rescale the data into a positive interval first up. In fact, as a workaround, I can do this to the raster before I even call |
We've improved a lot of the Theory/DesignColorize (part of tf.shade) is always meant to use color to indicate the average category in this pixel according to the For non-count aggregates that can include negative or zero values, it's difficult to work out precisely what should happen to the colors and alpha. My intuition is that colors should still reflect an average of the categories present, now weighted by value rather than count, and that alpha should map the minimum value found (even if negative or zero) to As suspected by @o-smirnov , the Color handling for zero and negative valuesFor the
only the The medium blob (third largest and third smallest, in the lower left) shows up gray instead of the correct green color, indicating that it was mapped to black (with some transparency) rather than green. That blob has a value of 0, and in the current version of the quoted bit of _colorize code I think it is clear that pixels with 0 value will not end up with a meaningful color (0/0):
Here the color_data replaces nans with 0 to avoid having nans propagate throughout the results, but we have to find another approach. The other colors all look correct; the largest is orange as expected, and if you take that away you can see each of the others matches the colors from After fixing what happens for zero values, we'll need to look more closely at the color mixing to see if there are any other problems with how the colors are weighted. It's possible that it's working properly when negative values are present only when one color dominates the results, so that the sign in the dot product matches between the alpha mapping for negative and zero valuesWe can look at the alpha handling by editing _colorize to force the color to be red for all cases:
When we do that, the alpha channel mapping is: Remember that the values here are, from smallest blob to largest blob, d1(10), d2(-20), d3(0), d4(-40), d5(50), so we'd expect the blobs to be ordered d4(-40), d2(-20), d3(0), d1(10), d5(50) So, to make progress, we need to first ensure that zero-valued categories still get the correct color, then can look at the overall results again. Offsetting the dataI did try adding |
As explained on #910, I ended up having to add a special case for any pixels mapping to zero, as the color was undefined in those cases. Defining the color as the average color for all non-NaN category values seems to work well in practice. |
Fixed in #910. |
Thanks for looking into this @jbednar! I have been distracted by a different software release here, so I couldn't keep up from my side. I'm going to have to find some time over the next few days to absorb the changes into shadems, and report back here. I think we're really on the bleeding edge here in terms of using colour and alpha to represent data, so the uncertainty is to be expected, and kind of exciting! |
Thanks! The notebook 2_Pipeline.ipynb illustrates how it all works now, so please look at the new examples in that section and see if it fits your intuitions. There's a breaking change to the behavior that we'll mention in the release notes, with |
@jbednar, thanks, I have belated verified that this all seems to work as expected now, at least with positive-valued reductions. I have taken all my workarounds out of shades, and the colours and alphas come up as expected. The negative values (and, even worse, total=0) case still breaks my head a little bit. I'm not sure what the right approach is, or if there even is one... Just thinking out loud now:
I still don't know -- but if I come up with something clever, I'll post here... |
* Fixes performance issue raised in #899
ALL software version info
0.10.0 release, but same problem on master.
Description of expected behavior and the observed behavior
I'm using an aggregator similar to
color_cat()
to colorize data. The resulting raster comes out very under-saturated, as it seems to have only two values in the alpha channel: 0 and 40.Stepping through with the debugger, I can see the problem is here: https://github.com/holoviz/datashader/blob/master/datashader/transfer_functions/__init__.py#L375
I haven't got an explicit span set, so
span
is equal to [0, max(total)] at this point (the latter being some very large number), while thea
array is normalized to [0,1]. Calling interp with a huge span and such a smalla
results in all non-zero values of a pinned to min_alpha=40.Comparing to the implementation of
_interpolate
, span should be set to min(a),max(a) at this point for correct results...The text was updated successfully, but these errors were encountered: