Refactor semantic mapping operations #2090

Merged
merged 48 commits into from
May 23, 2020
e90067f
Prototype of rugplot that passes original tests
mwaskom May 19, 2020
4b145aa
Update test style
mwaskom May 19, 2020
872a103
Implement idea for less-verbose interaction with Plotter classes
mwaskom May 19, 2020
3838f69
Explore an idea about how to abstract hue mapping
mwaskom May 20, 2020
8a31b95
Shush Flake8
mwaskom May 20, 2020
8ff1f9c
Define semantics with tuples, not lists, to make immutable
mwaskom May 20, 2020
8f29936
Define semantic mappings with some complex higher-order magic
mwaskom May 20, 2020
44dbf88
Move some of the hue mapping logic
mwaskom May 20, 2020
0dc1333
Continued refactoring of variable assignment and hue mapping
mwaskom May 20, 2020
aa05d14
Refactor lineplot and get tests to pass
mwaskom May 21, 2020
7f9a15f
Get most RelationalPlotter tests passing
mwaskom May 21, 2020
9758ff3
Fix error introduced during refactoring
mwaskom May 21, 2020
dddc857
Move hue mapping tests from test_relational to test_core
mwaskom May 21, 2020
45a1b5f
Avoid treating string palette arg as signaling categorical
mwaskom May 21, 2020
9730248
Set map_type to include datetime, add note about missing implementation
mwaskom May 21, 2020
eb8b390
Change semantic inheritance to be restrictive rather than expansive
mwaskom May 21, 2020
e191856
Consider boolean data categorical at Plotter level
mwaskom May 21, 2020
2756efb
Sort out where utils/core funcs should go
mwaskom May 21, 2020
b240c38
Strip nose out of the utils tests
mwaskom May 21, 2020
b39b89c
Move new decorator to where it belongs and add a test
mwaskom May 21, 2020
771b807
Clean up a few leftovers from utils reorg
mwaskom May 21, 2020
bd8b00d
Add more HueMapping tests
mwaskom May 21, 2020
8a41017
Make core module private
mwaskom May 21, 2020
97bc4e5
Make objects in core non-private
mwaskom May 21, 2020
79116ed
Add initial version of SizeMapping object
mwaskom May 21, 2020
0df8593
Messy first pass at replacing parse_size with SizeMapping
mwaskom May 21, 2020
1183736
Fix size mapping to match current behavior, defer decoupling from plo…
mwaskom May 22, 2020
3a34532
Add test to capture relplot bug
mwaskom May 22, 2020
8f99541
Fix relplot numeric hues
mwaskom May 22, 2020
f1ae674
Move all hue/size lookup logic into corresponding Mapping objects
mwaskom May 22, 2020
303628f
Finalize refactoring of size mapping
mwaskom May 22, 2020
ce66b38
Add prototype of StyleMapping
mwaskom May 22, 2020
42026af
Integrate StyleMapping into relational plots
mwaskom May 22, 2020
5d80109
Get relational tests to pass
mwaskom May 22, 2020
00fc9bb
Move StyleMapping tests to core and excise parse_style from relational
mwaskom May 22, 2020
5ab9a63
Point rugplot at old code for now
mwaskom May 22, 2020
e1d6e5f
Add some more basic tests
mwaskom May 22, 2020
baf0524
Treat units as a normal semantic
mwaskom May 22, 2020
d75789d
Rename assign_variables method
mwaskom May 23, 2020
2aaef33
Address some TODOs about style/organization/defaults
mwaskom May 23, 2020
35e4457
Address more small TODOs and flesh out docs
mwaskom May 23, 2020
feae13b
LogNorm now fails with non-positive data (as it arguably should)
mwaskom May 23, 2020
266157b
Handle units in relplot (fixes #2080)
mwaskom May 23, 2020
5c0de8a
Ignore false-alarm warning from numpy on string/number comparison
mwaskom May 23, 2020
12b4d28
Catch a few pieces of residual cruft
mwaskom May 23, 2020
2d5a0bd
Ignore a separate dubious numpy warning
mwaskom May 23, 2020
1d88a50
Improve test coverage
mwaskom May 23, 2020
ddad631
Avoid error in relational user guide page
mwaskom May 23, 2020
Move some of the hue mapping logic
mwaskom committed May 20, 2020
commit 44dbf88252ee36a6f20914c5b45a0bdf0bffb829
159 changes: 0 additions & 159 deletions seaborn/core.py
Original file line number Diff line number Diff line change
@@ -457,165 +457,6 @@ def establish_variables_longform(self, data=None, **kwargs):
        return plot_data, variables


class HueMapping:

    # Default attributes (TODO use data class?)
    map_type = None
    levels = [None]
    limits = None
    norm = None
    cmap = None
    palette = {}

    def __init__(
        self, plotter, palette=None, order=None, norm=None,
    ):

        # Infer the type of mapping to use
        from .palettes import QUAL_PALETTES  # Avoid circular import

        if palette in QUAL_PALETTES:
            map_type = "categorical"
        elif norm is not None:
            map_type = "numeric"
        elif isinstance(palette, (Mapping, Sequence)):
            map_type = "categorical"
        else:
            # TODO we will likely need to implement datetime mapping
            if plotter.var_types["hue"] == "numeric":
                map_type = "numeric"
            else:
                map_type = "categorical"

        data = plotter.plot_data["hue"]
        if data.notna().any():

            # Our goal is to end up with a dictionary mapping every unique
            # value in `data` to a color. We will also keep track of the
            # metadata about this mapping we will need for, e.g., a legend

            # --- Option 1: numeric mapping with a matplotlib colormap

            if map_type == "numeric":

                data = pd.to_numeric(data)
                levels, palette, cmap, norm = self.numeric_to_palette(
                    data, order, palette, norm
                )
                limits = norm.vmin, norm.vmax

            # --- Option 2: categorical mapping using seaborn palette

            else:

                cmap = None
                limits = None
                levels, palette = self.categorical_to_palette(
                    # Casting data to list to handle differences in the way
                    # pandas represents numpy datetime64 data
                    list(data), order, palette
                )

            self.palette = palette
            self.levels = levels
            self.limits = limits
            self.norm = norm
            self.cmap = cmap

    def color_vector(self, data):

        # TODO need to debug why data.map(self.palette) doesn't work
        # TODO call this "mapping" and keep palette for the orig var?
        return [self.palette.get(val) for val in data]

    def categorical_to_palette(self, data, order, palette):
        """Determine colors when the hue variable is qualitative."""
        # Avoid circular import
        from .palettes import color_palette

        # -- Identify the order and name of the levels

        if order is None:
            levels = categorical_order(data)
        else:
            levels = order
        n_colors = len(levels)

        # -- Identify the set of colors to use

        if isinstance(palette, dict):

            # Test the set itself so that a falsy missing key (e.g. 0) is caught
            missing = set(levels) - set(palette)
            if missing:
                err = "The palette dictionary is missing keys: {}"
                raise ValueError(err.format(missing))

        else:

            if palette is None:
                if n_colors <= len(get_color_cycle()):
                    colors = color_palette(None, n_colors)
                else:
                    colors = color_palette("husl", n_colors)
            elif isinstance(palette, list):
                if len(palette) != n_colors:
                    err = "The palette list has the wrong number of colors."
                    raise ValueError(err)
                colors = palette
            else:
                colors = color_palette(palette, n_colors)

            palette = dict(zip(levels, colors))

        return levels, palette

    def numeric_to_palette(self, data, order, palette, norm):
        """Determine colors when the hue variable is quantitative."""
        levels = list(np.sort(remove_na(data.unique())))

        # TODO do we want to do something complicated to ensure contrast
        # at the extremes of the colormap against the background?

        # Identify the colormap to use
        # Avoid circular import
        from .palettes import cubehelix_palette, _parse_cubehelix_args

        palette = "ch:" if palette is None else palette
        if isinstance(palette, mpl.colors.Colormap):
            cmap = palette
        elif str(palette).startswith("ch:"):
            args, kwargs = _parse_cubehelix_args(palette)
            cmap = cubehelix_palette(0, *args, as_cmap=True, **kwargs)
        elif isinstance(palette, dict):
            colors = [palette[k] for k in sorted(palette)]
            cmap = mpl.colors.ListedColormap(colors)
        else:
            try:
                cmap = mpl.cm.get_cmap(palette)
            except (ValueError, TypeError):
                # Fill in the placeholder so the message names the bad palette
                err = "Palette {} not understood"
                raise ValueError(err.format(palette))

        if norm is None:
            norm = mpl.colors.Normalize()
        elif isinstance(norm, tuple):
            norm = mpl.colors.Normalize(*norm)
        elif not isinstance(norm, mpl.colors.Normalize):
            err = "``hue_norm`` must be None, tuple, or Normalize object."
            raise ValueError(err)

        if not norm.scaled():
            norm(np.asarray(data.dropna()))

        # TODO this should also use color_lookup, but that needs the
        # class attributes that get set after using this function...
        if not isinstance(palette, dict):
            palette = dict(zip(levels, cmap(norm(levels))))
            # palette = {l: cmap(norm([l, 1]))[0] for l in levels}

        return levels, palette, cmap, norm
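
For readers following the move, the categorical branch of the class above boils down to building a level-to-color dictionary and looking values up in it. A minimal standalone sketch of that idea, assuming a hypothetical `build_lookup` helper and a hard-coded `DEFAULT_COLORS` list standing in for seaborn's `color_palette` (neither name is part of seaborn):

```python
from collections.abc import Mapping

# Hypothetical stand-in for the active color cycle; not seaborn's palette.
DEFAULT_COLORS = ["C0", "C1", "C2", "C3"]


def build_lookup(levels, palette=None):
    """Return a dict mapping each level to a color (illustrative only)."""
    levels = list(levels)

    if isinstance(palette, Mapping):
        # A dict palette must cover every level. Test the set itself:
        # any(missing) would overlook a falsy missing key such as 0.
        missing = set(levels) - set(palette)
        if missing:
            raise ValueError(f"The palette dictionary is missing keys: {missing}")
        return dict(palette)

    if palette is None:
        # The sketch simply fails when it runs out of default colors.
        if len(levels) > len(DEFAULT_COLORS):
            raise ValueError("Not enough default colors in this sketch.")
        colors = DEFAULT_COLORS[:len(levels)]
    else:
        colors = list(palette)
        if len(colors) != len(levels):
            raise ValueError("The palette list has the wrong number of colors.")

    return dict(zip(levels, colors))


# Look up a color per observation; values without an entry map to None.
lookup = build_lookup(["a", "b"], ["red", "blue"])
colors = [lookup.get(val) for val in ["a", "b", "a", None]]
```

Using `dict.get` for the per-value lookup means missing data falls through to `None` rather than raising, which matches the behavior of `color_vector` in the removed code.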


def variable_type(vector, boolean_type="numeric"):
    """Determine whether a vector contains numeric, categorical, or datetime data.

2 changes: 1 addition & 1 deletion seaborn/distributions.py
Original file line number Diff line number Diff line change
@@ -53,7 +53,7 @@ def __init__(
         height=None,
     ):
 
-        self.establish_variables(data, **variables)
+        super().__init__(data, **variables)
 
         self.height = height
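
The one-line change in seaborn/distributions.py moves variable assignment into the base-class constructor. A minimal sketch of that pattern, assuming hypothetical `BasePlotter` and `RugSketch` classes (the names and the simplified `establish_variables` body are illustrative, not seaborn's actual code):

```python
class BasePlotter:
    """Hypothetical base class that owns variable assignment."""

    def __init__(self, data=None, **variables):
        # Subclasses get plot_data and variables by calling super().__init__
        self.plot_data, self.variables = self.establish_variables(
            data, **variables
        )

    def establish_variables(self, data=None, **variables):
        # Simplified stand-in: look each named variable up in a dict of columns.
        data = data or {}
        plot_data = {
            name: data[key] for name, key in variables.items() if key is not None
        }
        return plot_data, variables


class RugSketch(BasePlotter):
    """Hypothetical subclass mirroring the constructor shown in the diff."""

    def __init__(self, data=None, variables=None, height=None):
        # One call replaces an explicit self.establish_variables(...) here.
        super().__init__(data, **(variables or {}))
        self.height = height


p = RugSketch({"a": [1, 2, 3]}, variables={"x": "a"}, height=0.05)
```

Routing the call through `super().__init__` keeps each subclass constructor down to its own attributes, so any later change to how variables are assigned only has to happen in one place.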