FacetGrid.map_dataframe passes disallowed keyword arguments to pointplot #3004

Closed
desilinguist opened this issue Sep 7, 2022 · 12 comments · Fixed by #3016

Comments

@desilinguist

My code that used to work perfectly fine with 0.11.0 breaks with the new 0.12.0 release. The code creates a FacetGrid and then applies pointplot() to each of the cells as follows:

...
g = sns.FacetGrid(df_fs, row="metric", col="learner_name",
                  hue="variable", height=2.5, aspect=1,
                  margin_titles=True, despine=True, sharex=False,
                  sharey=False, legend_out=False, palette="Set1")
g = g.map_dataframe(sns.pointplot, "training_set_size", "value",
                    scale=.5, ci=None)
...

The relevant traceback is as follows:

Traceback (most recent call last):
 ...
  File "/builds/EducationalTestingService/skll/skll/experiments/output.py", line 127, in generate_learning_curve_plots
    g = g.map_dataframe(sns.pointplot, "training_set_size", "value",
  File "/root/sklldev/lib/python3.8/site-packages/seaborn/axisgrid.py", line 819, in map_dataframe
    self._facet_plot(func, ax, args, kwargs)
  File "/root/sklldev/lib/python3.8/site-packages/seaborn/axisgrid.py", line 848, in _facet_plot
    func(*plot_args, **plot_kwargs)
TypeError: pointplot() got an unexpected keyword argument 'label'

Looking at the code for FacetGrid.map_dataframe, it does indeed create a label keyword argument, which I guess is what causes the failure when pointplot() is called. From reading the release notes, it looks like this follows from this item:

Removed the (previously-unused) option to pass additional keyword arguments to pointplot()

That keyword argument is created whenever hue is specified, and I am not sure how to get around that, since I have multiple variables that I want to represent with different colors. If there's another way to achieve this, I'd really appreciate any guidance.
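To make the failure mode concrete, here is a minimal pure-Python sketch of what seems to be happening. It does not use seaborn at all: `facet_map` and `strict_plot` are hypothetical stand-ins for FacetGrid.map_dataframe and a plotting function with a fixed signature, and `drop_kwargs` is a hypothetical helper showing one possible stopgap (discarding the injected keyword before the call). None of these names come from seaborn.

```python
import functools


def facet_map(func, hue_levels, **kwargs):
    """Hypothetical stand-in for FacetGrid.map_dataframe.

    Calls `func` once per hue level and injects a `label` keyword,
    mirroring the behavior described in the traceback above.
    """
    return [func(label=level, **kwargs) for level in hue_levels]


def strict_plot(x, y, scale=1.0):
    """Hypothetical plotting function with a fixed signature (no **kwargs)."""
    return (x, y, scale)


def drop_kwargs(func, *names):
    """Wrap `func` so the named keyword arguments are discarded before the call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for name in names:
            kwargs.pop(name, None)  # silently ignore names that are absent
        return func(*args, **kwargs)
    return wrapper


levels = ["train_score_mean", "test_score_mean"]

# Passing the strict function directly fails, because `label` is not accepted.
try:
    facet_map(strict_plot, levels, x="training_set_size", y="value", scale=0.5)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Wrapping the function so that `label` is dropped lets the calls go through.
results = facet_map(drop_kwargs(strict_plot, "label"), levels,
                    x="training_set_size", y="value", scale=0.5)
```

In principle the same wrapper could be applied to sns.pointplot before handing it to map_dataframe, but that is an untested assumption, and dropping label would also discard the information a legend relies on.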

@desilinguist
Author

I tried modifying the code to move the hue variable from the FacetGrid() instantiation call to the map_dataframe() call instead:

...
g = sns.FacetGrid(df_fs, row="metric", col="learner_name", height=2.5,
                  aspect=1, margin_titles=True, despine=True, sharex=False,
                  sharey=False, legend_out=False, palette="Set1")
g = g.map_dataframe(sns.pointplot, x="training_set_size", y="value",
                    hue="variable", scale=.5, errorbar=None)
...

While this code runs without errors, it does not produce the correct plot. Here's the plot generated by the above code with v0.12.0:

[Image: plot generated with v0.12.0]

For comparison, here's the plot as produced by the original code with v0.11.2:

[Image: plot generated with v0.11.2]

@desilinguist
Author

desilinguist commented Sep 7, 2022

I figured out how to make this work by following the recommendations that:

  • hue levels and keywords should be handled by the plotting function, not by FacetGrid;
  • the variable in the data frame that maps to hue levels should be categorical;
  • palette colors should be explicitly assigned to hue levels.

Here's the new code:

...
df_fs["variable"] = df_fs["variable"].astype("category")
g = sns.FacetGrid(df_fs, row="metric", col="learner_name",
                  height=2.5, aspect=1, margin_titles=True,
                  despine=True, sharex=False,
                  sharey=False, legend_out=False)
g = g.map_dataframe(sns.pointplot, x="training_set_size",
                    y="value", hue="variable", scale=.5,
                    errorbar=None,
                    palette={"train_score_mean": train_color,
                             "test_score_mean": test_color})
...

This code now produces the same (correct) plot as with v0.11.2.

mwaskom reopened this on Sep 7, 2022
@mwaskom
Owner

mwaskom commented Sep 7, 2022

We should unbreak this, even if it's discouraged usage.

Glad you were able to work out the right thing to do here, but I am a little curious why you didn't opt for catplot, which would do all this complicated bookkeeping for you.

mwaskom added this to the v0.12.1 milestone on Sep 7, 2022
mwaskom added the bug label on Sep 7, 2022
@desilinguist
Author

Indeed, catplot would have been simpler, and I did try it, but the markers came out much larger than with FacetGrid, which was less to my liking.

@mwaskom
Owner

mwaskom commented Sep 8, 2022

Do you have an example? Catplot should basically just be generating the code in your third post.

@desilinguist
Author

Sure! Attached are two plots saved at 300 DPI using plt.savefig(). The first was generated using FacetGrid + pointplot and the second using catplot. I am doing a bunch of matplotlib-level processing to add the plot titles and legend manually, but that part is identical between the two scenarios.

[Image: FacetGrid + pointplot]

[Image: catplot]

@mwaskom
Owner

mwaskom commented Sep 8, 2022

Thanks but I’d need to see the actual code to make sense of the example.

@desilinguist
Author

Ah, sorry. Please take a look at the generate_learning_curve_plots() function here.

@mwaskom
Owner

mwaskom commented Sep 8, 2022

That link 404s (is it a private repo?)

I can't reproduce whatever you might be seeing with a simple example though:

[Image: simple example]

(The tips example dataframe loads with categorical dtypes so it simplifies the bookkeeping when using FacetGrid).

@desilinguist
Author

Apologies, that branch was probably merged by the time you got to it. It's now in the main branch.

@mwaskom
Owner

mwaskom commented Sep 8, 2022

I don't see any use of catplot on that page?

@desilinguist
Author

Yeah, as I mentioned, I didn't use catplot in production because of the marker size.

Here's a gist that shows how I combined the FacetGrid and pointplot calls together.

However, I am extremely embarrassed to say that it now works fine 😬! Looking back, this is probably because I forgot to include the scale=0.5 keyword in the catplot call when I originally ran the test.

Apologies for wasting your time on this secondary issue.

mwaskom added a commit that referenced this issue Sep 12, 2022
Unbreaks FacetGrid + pointplot (fixes #3004) and is generally useful.
mwaskom added a commit that referenced this issue Oct 12, 2022
* Add label parameter to pointplot

Unbreaks FacetGrid + pointplot (fixes #3004) and is generally useful.

* Update default kws in pointplot tests

* Update release notes