Replies: 3 comments
-
You need to account for the fact that the points are drawn by iterating over the hue levels first. Sorting is one way to accomplish this. Grouping your dataframe and then indexing into it appropriately would be another.
-
Thanks for the pointer @mwaskom! So I've implemented your second solution, "Grouping your dataframe and then indexing into it appropriately", as such:

import seaborn as sns
import matplotlib.pyplot as plt

# Demo dataset and apply some culling for visibility
tips = sns.load_dataset("tips")
tips = tips[::5]
x_var = 'day'
y_var = 'total_bill'
h_var = 'sex'  # or None

# Create the stripplot
ax = sns.stripplot(x=x_var, y=y_var, hue=h_var, data=tips, jitter=True, dodge=True)

# Group the DataFrame by 'x_var' and 'h_var' if hue is present
grouped = tips.groupby([x_var, h_var] if h_var else [x_var])

# Label the points within each group
for collection, (group_key, group_data) in zip(ax.collections, grouped):
    for i, (x, y) in enumerate(collection.get_offsets()):
        y_value = group_data.iloc[i]['total_bill']
        ax.text(x, y, f"{y_value:.0f}", ha='center', va='bottom')

# Add labels
ax.set_xlabel("Day of the Week")
ax.set_ylabel("Total Bill Amount")
plt.show()
I don't think I'm understanding this, because for this to work I still need to iterate over
Of the two solutions, do you have an opinion on which might be preferable (clarity of code, efficiency...)?
Cheers,
-
Not sure I have an opinion between those two. Another way to do it that might be more robust is to use the (x, y) coordinate tuple as a key into a dictionary whose values are the labels you want. That way you would not depend on the internals of seaborn, although of course it requires unique (x, y) tuples.

Alternatively, this is a lot easier (if still a little awkward, pending #3120) in the objects interface:

import seaborn.objects as so

(
    so.Plot(tips, x="day", y="total_bill", color="smoker")
    .add(so.Dot(), so.Dodge(), so.Jitter(seed=0))
    .add(so.Text(halign="center", valign="bottom"), so.Dodge(by=["color"]), so.Jitter(seed=0), text="time")
)
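For the dictionary idea mentioned above, here is a minimal sketch of how the lookup could work. It is not seaborn API: `build_label_lookup` and `label_for_offset` are hypothetical helper names, and the demo rows and offsets are made up. Because jitter and dodge shift the points along the categorical axis, the sketch assumes a vertical stripplot with categories at integer positions 0, 1, 2, ... and shifts smaller than 0.5, so that rounding the plotted x recovers the category index while y is left unchanged.

```python
from typing import Dict, Hashable, Iterable, Optional, Sequence, Tuple

Key = Tuple[int, float]


def build_label_lookup(
    rows: Iterable[Tuple[Hashable, float, str]],
    category_order: Sequence[Hashable],
    ndigits: int = 6,
) -> Dict[Key, str]:
    """Map (category index, rounded y) -> label; refuse duplicate keys."""
    position = {cat: i for i, cat in enumerate(category_order)}
    lookup: Dict[Key, str] = {}
    for category, y, label in rows:
        key = (position[category], round(float(y), ndigits))
        if key in lookup:
            # The approach only works when every (category, y) pair is unique.
            raise ValueError(f"duplicate point {key!r}: labels would be ambiguous")
        lookup[key] = label
    return lookup


def label_for_offset(
    lookup: Dict[Key, str], x: float, y: float, ndigits: int = 6
) -> Optional[str]:
    """Recover the label for a plotted point from its (x, y) offset.

    Assumption: jitter/dodge move x by less than 0.5 around the integer
    category position, so rounding x gives the category index.
    """
    key = (int(round(x)), round(float(y), ndigits))
    return lookup.get(key)


# Made-up demo data: (category, y value, label) per dataframe row.
rows = [("Thur", 10.5, "a.csv"), ("Fri", 22.0, "b.csv"), ("Fri", 7.25, "c.csv")]
lookup = build_label_lookup(rows, category_order=["Thur", "Fri"])

# Made-up offsets, standing in for values read from the plotted points.
offsets = [(-0.21, 10.5), (1.18, 22.0), (0.83, 7.25)]
labels = [label_for_offset(lookup, x, y) for x, y in offsets]
print(labels)
```

With a real plot, the offsets would come from looping over `ax.collections` and calling `collection.get_offsets()`, as in the earlier snippet, and each recovered label would be passed to `ax.text`. The order in which seaborn draws the points then no longer matters, at the cost of requiring unique (category, y) pairs.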
-
I do apologize if this has been asked before, I couldn't find a similar discussion.
This is the code I use for labelling points in a stripplot with values from a specific column in the dataframe that was used to generate the stripplot (hue h_var='sex' or h_var=None):

The only way I found to reconcile the index of the stripplot dot positions with that of the dataframe rows is to first sort the dataframe, either by [x_var, y_var] or by [x_var, h_var, y_var] if a hue is provided. Then I extract the offsets of the stripplot collections, which do match the sorted dataframe rows. Without sorting first, the dataframe rows do not match the offset indices as extracted in my code.
I was wondering if there is a better way (the sns way?) to achieve this.
In my application, the labels applied to the stripplot dots could be a filename for the outliers outside the whiskers of a boxplot when a boxplot is displayed on top of a stripplot. And I also use adjustText to try and avoid overlapping labels.
Cheers,
Egor