Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: df.boxplot fails to use existing axis/subplot (#3578) #6991

Closed
wants to merge 1 commit into from

Conversation

onesandzeroes
Copy link
Contributor

Alright, I think I have a fix for issue #3578. In plotting._grouped_plot_by_column, there was no check being done for whether ax was None, which was the main source of the issue. When the boxplot is of multiple columns, I don't think there's anything sensible that can be done with the ax argument, so that
now raises a ValueError.

I've written some tests, but I'm not sure if they really get to the heart of the problem, so any insight on how to improve them would be appreciated.

Testing code adapted from the original bug report to demonstrate the new behaviour:

import matplotlib.pyplot as plt
import pandas
from pandas import DataFrame, Series

data = {'day': Series([1, 1, 1, 2, 2, 2, 3, 3, 3]),
        'group2': Series([2, 2, 2, 2, 2, 1, 1, 1, 1]),
        'val': Series([3, 4, 5, 6, 7, 8, 9, 10, 11]),
        'val2': Series([8, 9, 10, 11, 12, 13, 14, 15])}
df = pandas.DataFrame(data)

# Single-column using existing axis: should create a single plot
plt.figure()
plt.subplot(2, 2, 1)
plt.plot([1, 2, 3])
plt.subplot(2, 2, 4)
ax = plt.gca()
fig = ax.get_figure()
axes = df.boxplot('val', 'day', ax=ax)
print("Testing identity of returned axes: (should be True)")
print(id(axes) == id(ax.get_axes()))
plt.show()

# Multiple column, not using existing axis: should create two plots
plt.figure()
plt.subplot(2, 2, 1)
plt.plot([1, 2, 3])
plt.subplot(2, 2, 4)
ax = plt.gca()
axes = df.boxplot(['val', 'val2'], 'day')
print("Testing identity of returned axes: (should be False)")
print(id(axes) == id(ax.get_axes()))
plt.show()


# Multiple column using existing axis: should raise an exception,
# since it's hard to know what to do: we need an axis for each
# column and ax is just a single axis object
plt.figure()
plt.subplot(2, 2, 1)
plt.plot([1, 2, 3])
plt.subplot(2, 2, 4)
ax = plt.gca()
try:
    df.boxplot(['val', 'val2'], 'day', ax=ax)
except ValueError:
    print("Raising exception as expected")
plt.show()

@jreback
Copy link
Contributor

jreback commented Apr 28, 2014

@TomAugspurger @cpcloud

@cpcloud
Copy link
Member

cpcloud commented May 5, 2014

@onesandzeroes can you rehash and force push to run travis ci again? thx

here's a line to do just that

git commit --amend -C HEAD && git push --force

@onesandzeroes
Copy link
Contributor Author

Alright, I've just pushed it up now. I have a feeling that this might overlap with the new #7035 pull request, which is fixing a pretty closely related issue, so if that gets merged first I might have to go back over and check this one.

@onesandzeroes
Copy link
Contributor Author

Sorry, was I supposed to rebase as well? The build still seems to be failing for unrelated reasons (reading finance data from Yahoo), is this likely to be fixed if I rebase on master?

@onesandzeroes
Copy link
Contributor Author

I've rebased on master and the tests seem to be passing now.

@jreback jreback added this to the 0.15.0 milestone May 6, 2014
@jreback
Copy link
Contributor

jreback commented May 6, 2014

@TomAugspurger @cpcloud ?

@jreback jreback modified the milestones: 0.14.1, 0.15.0 May 6, 2014
@jreback
Copy link
Contributor

jreback commented May 6, 2014

@onesandzeroes if you can squash would be great, see https://github.com/pydata/pandas/wiki/Using-Git

@TomAugspurger
Copy link
Contributor

Could you add a test for when by is None? In this case we return a dict (which may be changing in the nearish future, but it would still be good to test it for now). Something like

import itertools

fig, ax = plt.subplots()
d = df.boxplot(ax=ax)
lines = list(chain.from_iterable(d.values()))
self.assertEqual(len(ax.get_lines()), len(lines))

sharex=True, sharey=True,
figsize=figsize, ax=ax)
else:
if ngroups > 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could handle this case, assuming that ax is an array with size equal to ngroups. Want to open a separate issue for that?

@TomAugspurger
Copy link
Contributor

This is out of scope for this issue, but do we have a specified way for when a plot call with None for figure / ax creates a new fig / ax vs. updates an existing one? I'll pay more attention, but I seem to remember it being inconsistent.


Just that one additional test for when by is None and then I think this is ready.

@onesandzeroes
Copy link
Contributor Author

Added the new test, rebased and squashed it down to a single commit, so it should be good to go.

I've also created a new issue for allowing multiple axes in the ax argument as @TomAugspurger mentioned.

axes needs to be a plt.Axes object

use ax get_axes() to set axes

Add tests for use of existing axis

Add test for when by is None

Add fix to the release notes
@TomAugspurger
Copy link
Contributor

Merged via f851b57

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants