Fix multiple resolution when hue variable has no name #2462

Merged
merged 5 commits into from
Feb 3, 2021
6 changes: 5 additions & 1 deletion doc/releases/v0.12.0.txt
@@ -18,13 +18,17 @@ v0.12.0 (Unreleased)

- |Enhancement| In :func:`histplot`, added `stat="percent"` as an option for normalization such that bar heights sum to 100 (:pr:`2461`).

- |Enhancement| |Fix| Improved integration with the matplotlib color cycle in most axes-level functions (:pr:`2449`).
Review comment from the PR author (repository owner):
This note is not relevant to this PR; I think I was tying off loose ends from a previous PR.


- |Fix| In :func:`lineplot`, allowed the `dashes` keyword to set the style of a line without mapping a `style` variable (:pr:`2449`).

- |Fix| In :func:`rugplot`, fixed a bug that prevented the use of datetime data (:pr:`2458`).

- |Fix| In :func:`histplot` and :func:`kdeplot`, fixed a bug where the `alpha` parameter was ignored when `fill=False` (:pr:`2460`).

- |Fix| |Enhancement| Improved integration with the matplotlib color cycle in most axes-level functions (:pr:`2449`).
- |Fix| In :func:`histplot` and :func:`kdeplot`, fixed a bug where the `multiple` parameter was ignored when `hue` was provided as a vector without a name (:pr:`2462`).

- |Defaults| In :func:`displot`, the default alpha value now adjusts to a provided `multiple` parameter even when `hue` is not assigned (:pr:`2462`).

- Made `scipy` an optional dependency and added `pip install seaborn[all]` as a method for ensuring the availability of compatible `scipy` and `statsmodels` libraries at install time. This has a few minor implications for existing code, which are explained in the GitHub pull request (:pr:`2398`).

2 changes: 1 addition & 1 deletion doc/tutorial/distributions.ipynb
@@ -245,7 +245,7 @@
"metadata": {},
"outputs": [],
"source": [
"sns.displot(penguins, x=\"flipper_length_mm\", col=\"sex\", multiple=\"dodge\")"
"sns.displot(penguins, x=\"flipper_length_mm\", col=\"sex\")"
]
},
{
33 changes: 16 additions & 17 deletions seaborn/distributions.py
@@ -225,8 +225,16 @@ def _default_discrete(self):
return discrete

def _resolve_multiple(self, curves, multiple):
"""Modify the density data structure to handle multiple densities."""

# Default baselines have all densities starting at 0
baselines = {k: np.zeros_like(v) for k, v in curves.items()}

# TODO we should have some central clearinghouse for checking if any
# "grouping" (terminology?) semantics have been assigned
if "hue" not in self.variables:
return curves, baselines

# Modify the density data structure to handle multiple densities
if multiple in ("stack", "fill"):

# Setting stack or fill means that the curves share a
@@ -261,11 +269,6 @@ def _resolve_multiple(self, curves, multiple):
.shift(1, axis=1)
.fillna(0))

else:

# All densities will start at 0
baselines = {k: np.zeros_like(v) for k, v in curves.items()}

if multiple == "dodge":

# Account for the unique semantic (non-faceting) levels
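
To make the reworked control flow of `_resolve_multiple` easier to follow, here is a minimal sketch that uses plain Python lists instead of the NumPy and pandas structures in the diff. The name `resolve_multiple_sketch`, the `has_hue` flag (standing in for `"hue" in self.variables`), and the way `"fill"` is normalized are assumptions for illustration only; the collapsed part of the real method may differ, for example in how it orders the levels.

```python
def resolve_multiple_sketch(curves, multiple, has_hue):
    """Illustrative stand-in for the stacking rule; not seaborn's code."""
    # Default baselines have every curve starting at 0
    baselines = {k: [0.0] * len(v) for k, v in curves.items()}

    # With no hue grouping there is nothing to combine, so return early
    if not has_hue:
        return curves, baselines

    if multiple in ("stack", "fill"):
        # Per-position totals, used only to normalize in "fill" mode
        # (assumption: "fill" means the stacked heights sum to 1)
        totals = [sum(col) for col in zip(*curves.values())]
        running = [0.0] * len(totals)
        stacked = {}
        for key, values in curves.items():
            if multiple == "fill":
                values = [v / t if t else 0.0 for v, t in zip(values, totals)]
            # Each curve starts where the previous cumulative curve ended,
            # which mirrors the shift-by-one-column step in the diff
            baselines[key] = list(running)
            running = [r + v for r, v in zip(running, values)]
            stacked[key] = list(running)
        curves = stacked

    return curves, baselines


curves = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
stacked, bases = resolve_multiple_sketch(curves, "stack", has_hue=True)
print(stacked)
print(bases)
```

The point of the early return is that callers no longer need to set `multiple = None` themselves when no `hue` is assigned; the default zero baselines are computed once at the top and reused by every branch that does not stack.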
@@ -413,13 +416,6 @@ def plot_univariate_histogram(
else:
common_norm = False

# Turn multiple off if no hue or if hue exists but is redundant with faceting
facet_vars = [self.variables.get(var, None) for var in ["row", "col"]]
if "hue" not in self.variables:
multiple = None
elif self.variables["hue"] in facet_vars:
multiple = None

# Estimate the smoothed kernel densities, for use later
if kde:
# TODO alternatively, clip at min/max bins?
@@ -503,7 +499,8 @@ def plot_univariate_histogram(

# Default alpha should depend on other parameters
if fill:
if multiple == "layer":
# Note: will need to account for other grouping semantics if added
if "hue" in self.variables and multiple == "layer":
default_alpha = .5 if element == "bars" else .25
elif kde:
default_alpha = .5
@@ -903,7 +900,7 @@ def plot_univariate_density(
log_scale,
)

# Note: raises when no hue and multiple != layer. A problem?
# Adjust densities based on the `multiple` rule
densities, baselines = self._resolve_multiple(densities, multiple)

# Control the interaction with autoscaling by defining sticky_edges
@@ -916,9 +913,11 @@ def plot_univariate_density(
else:
sticky_support = []

# XXX unfilled kdeplot is ignoring
if fill:
default_alpha = .25 if multiple == "layer" else .75
if multiple == "layer":
default_alpha = .25
else:
default_alpha = .75
else:
default_alpha = 1
alpha = plot_kws.pop("alpha", default_alpha) # TODO make parameter?
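
The default-alpha rule in the density hunk above can be read as a small pure function. The sketch below restates it in isolation; the names `default_density_alpha` and `resolve_alpha` are hypothetical helpers, not part of seaborn, and only the numeric defaults and the `pop("alpha", ...)` pattern are taken from the diff.

```python
def default_density_alpha(fill, multiple):
    """Restate the default alpha rule from the density hunk."""
    if fill:
        # Layered fills overlap, so they get a more transparent default
        if multiple == "layer":
            return .25
        return .75
    # Unfilled curves are drawn fully opaque by default
    return 1


def resolve_alpha(plot_kws, fill, multiple):
    """Prefer a user-supplied alpha, else fall back to the default."""
    plot_kws = dict(plot_kws)  # copy so the caller's dict is not mutated
    alpha = plot_kws.pop("alpha", default_density_alpha(fill, multiple))
    return alpha, plot_kws


print(resolve_alpha({}, fill=True, multiple="layer"))
print(resolve_alpha({"alpha": .9, "lw": 2}, fill=True, multiple="stack"))
```

An explicit `alpha` in the keyword arguments always wins; the defaults only apply when the caller leaves it out.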
11 changes: 11 additions & 0 deletions seaborn/tests/test_distributions.py
@@ -1217,6 +1217,17 @@ def test_hue_dodge(self, long_df):
assert_array_almost_equal(layer_xs[1], dodge_xs[1])
assert_array_almost_equal(layer_xs[0], dodge_xs[0] - bw / 2)

def test_hue_as_numpy_dodged(self, long_df):
# https://github.com/mwaskom/seaborn/issues/2452

ax = histplot(
long_df,
x="y", hue=long_df["a"].to_numpy(),
multiple="dodge", bins=1,
)
# Note hue order reversal
assert ax.patches[1].get_x() < ax.patches[0].get_x()
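
As a rough picture of what the new test asserts, the sketch below computes where dodged bars would sit inside a single bin. It assumes each bin is split into equal-width slots, one per hue level, which matches the `bw / 2` offset checked in `test_hue_dodge` for two levels; `dodge_positions` is a hypothetical helper, not a seaborn function, and the reversed list only mimics the patch ordering that the test comment calls "hue order reversal".

```python
def dodge_positions(left_edge, bin_width, n_levels):
    """Return (x, width) for each hue level within one bin.

    Assumes the bin is divided into equal slots, one per level.
    """
    width = bin_width / n_levels
    return [(left_edge + i * width, width) for i in range(n_levels)]


# One bin (as with bins=1 in the test) and two hue levels
positions = dodge_positions(left_edge=0.0, bin_width=2.0, n_levels=2)

# Mimic the reversed order in which the patches end up on the axes
patches = list(reversed(positions))

# The bar stored at index 1 lies to the left of the bar at index 0
print(patches[1][0] < patches[0][0])
```

If `multiple="dodge"` were silently ignored, as in the linked issue, both bars would share the same left edge and the strict inequality in the test would fail.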

def test_multiple_input_check(self, flat_series):

with pytest.raises(ValueError, match="`multiple` must be"):