WIP: Beeswarm plot #61
base: master
This will be used to change the number of bins for the beeswarm plot.
There's evidently something I'm missing, but when I do
Why do
You need to add
Sorry for the wait, by the way: I've been on holiday, and I seem to be the only JuliaPlots member who currently maintains StatPlots actively. I'll try to do a real review within a few days. As soon as you have it working, a PNG image for the README would be helpful.
This'll be really nice. I have some stylistic comments. I should note this doesn't currently work on my machine (an error is thrown from RecipesBase) but I can't off the top of my head say what the problem is. Does the plot work on your system?
@@ -25,18 +26,24 @@ using StatPlots
gr(size=(400,300))
``` | |||

The `DataFrames` support allows passing `DataFrame` columns as symbols. Operations on DataFrame column can be specified using quoted expressions, e.g.
The `DataFrames` support allows passing `DataFrame` columns as
Thanks for also taking the time to clean up files etc. But I'd like to keep changes like this separate from changes that add new functionality. Can you cherry-pick this and the other changes (deleted/inserted lines) into a separate PR and keep this PR on the beeswarm recipe?
This bit of the readme has been changed, so this can be scrapped when you rebase.
src/beeswarm.jl
Outdated
    info("side set to :$side")
end
x, y = Float64[], Float64[]
glabels = sort(collect(unique(x)))
Part of the code here is identical across beeswarm, violin and boxplot - I haven't checked carefully, but would it not be possible to extract a general function?
Not sure yet -- I'll try to get the beeswarm function working, then see if we can extract some things.
@mkborregaard thanks for the comments -- it's not working for me either, but I did not expect it to. Still working on it.
Stopping here for the day -- the internals work, but there is nothing displayed. I'm at a loss, and the documentation of plots and recipes is not helpful.
I'll have a look in the morning :-)
There appears to be a bug in how recipes work together with Plots, where the default
x := xp
y := yp
seriestype := :scatter
Here, add
if get!(d, :markershape, :circle) == :none
    d[:markershape] = :circle
end
Cf. JuliaPlots/Plots.jl#989, a PR to fix this behaviour in Plots so that this code won't be necessary.
This was merged so disregard this comment.
It's not quite done yet (I need to change the way the widths are calculated), but it is at least working as of bec231d:
using DataFrames, Plots, Distributions
include("src/StatPlots.jl")
using StatPlots
N = 100
labels = repeat(["a", "b", "c", "d"], inner=N)
values = vcat(
    rand(Normal(0.0), N),
    rand(Normal(2.0), N),
    rand(Normal(3.0), N),
    rand(Normal(1.0, 0.5), N)
)
d = DataFrame(labels = labels, values = values)
violin(d, :labels, :values, leg=false, c=:lightgrey, side=:right)
beeswarm!(d, :labels, :values, c=repeat([:green, :orange, :blue, :purple], inner=N), side=:left, bins=:scott)
very nice :-)
bump? :-)
# make the violin
xcenter = Plots.discrete_value!(d[:subplot][:xaxis], glabel)[1]

for i in 2:length(centers)
Using `i` here is what throws off your width calculations: `i` is also the index in the outer loop. Change it to `j`.
for i in 2:length(centers)
    inside = Bool[centers[i-1] < u <= centers[i] for u in lab_y]
    if sum(inside) > 1
You need
if sum(inside) == 1
    lab_x[inside] .+= xcenter
elseif sum(inside) > 1
for when there's only a single point.
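To make the single-point case concrete, here is a minimal standalone sketch of the per-bin placement logic. It is not the code from this PR: `spread_in_bins`, its `halfwidth` keyword and the even-spacing rule are hypothetical choices made for illustration, and it assumes Julia 1.1 or later for the two-argument form of `range`.

```julia
# Hypothetical helper (not from the PR): assign an x position to every point
# of one group, working bin by bin along the y axis.
function spread_in_bins(ys::AbstractVector{<:Real}, edges::AbstractVector{<:Real},
                        xcenter::Real; halfwidth::Real = 0.4)
    # Start with every point on the centre line of its group.
    xs = fill(float(xcenter), length(ys))
    # The bin index is `j`, so it cannot clash with an outer loop over `i`.
    for j in 2:length(edges)
        inside = findall(u -> edges[j-1] < u <= edges[j], ys)
        n = length(inside)
        if n > 1
            # Several points in the bin: spread them evenly around the centre.
            xs[inside] .= range(xcenter - halfwidth, xcenter + halfwidth; length = n)
        end
        # n == 1: the single point stays at xcenter; n == 0: nothing to place.
    end
    return xs
end

ys = [0.1, 0.2, 0.3, 1.5]
xs = spread_in_bins(ys, [0.0, 1.0, 2.0], 1.0)
```

In this sketch the lone point in the second bin ends up exactly at `xcenter`, which is what the `sum(inside) == 1` branch is meant to guarantee, while the three points in the first bin are spread across the available width.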
There are a number of difficulties with making a good violin plot. One of them is highly local: we generally don't know the markersize in points in StatPlots across the different backends. That is going to be remedied at some point, though, as Plots will move to a more explicit definition of dpi in terms of physical units. It just needs to be coded up, but it means that we could put the basic functionality in here.
Another issue is the binning. As it is now, the histogram bins are much larger on the y axis than on the x axis, which makes the points avoid each other along the x axis but not the y axis:
Making the bins much smaller (here
I think if you do want to use binning for something like this, you'd need to align the points along the bin centers. Doing that makes it look more regular, but still slightly strange:
For the case for aligning to bin centers, and also a general critique of using dot-plots to observe densities, check this paper by Wilkinson (the Grammar of Graphics guy): http://moderngraphics11.pbworks.com/f/wilkinson_1999.DotPlots.pdf
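As a rough illustration of aligning points to bin centers, the sketch below snaps each y value to the midpoint of the fixed bin it falls in. The function name `snap_to_bin_centers` is hypothetical, and this is plain histogram-style binning with half-open bins, assumed here for simplicity; it is not the dot-density algorithm described in Wilkinson's paper.

```julia
# Hypothetical helper (not from the PR): replace each y value by the centre
# of the bin (edges[j-1], edges[j]] that contains it.
function snap_to_bin_centers(ys::AbstractVector{<:Real}, edges::AbstractVector{<:Real})
    snapped = float.(ys)                 # new array; values outside all bins are kept
    for j in 2:length(edges)
        lo, hi = edges[j-1], edges[j]
        center = (lo + hi) / 2
        for k in eachindex(ys)
            if lo < ys[k] <= hi
                snapped[k] = center
            end
        end
    end
    return snapped
end

aligned = snap_to_bin_centers([0.1, 0.9, 1.2], [0.0, 1.0, 2.0])
```

Points that share a bin then share a y coordinate, so only their x offsets distinguish them, which is what gives the more regular rows mentioned above.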
This is currently a work in progress.
I'm working on adding a draft implementation of beeswarm plots, to be iterated on. Very much looking for feedback.