-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wquantile #117
wquantile #117
Conversation
Just to clarify: you didn't read any R source code when implementing this? |
@@ -158,6 +158,10 @@ Quantile and Friends | |||
w = rand(n) | |||
xk = median(x, weights(w)) | |||
|
|||
.. function:: wquantile(x, w, q) | |||
|
|||
Compute the weighted quantile of ``x``, using weights given by a weight vector ``w`` (of type ``WeightVec``). Weights must not be negative. The weight and data vectors must have the same length `n`. Quantiles as defined as follow: assume for simplicity that weights sum to `n`. Denote :math:`x_h` the smallest element of ``x`` such that the cumulated sum up of weights up to `x_h` is strictly superior to :math:`lv * q + m`. Denote :math:`x_l` the element just before `x_h`. The sample quantile is defined as `(1-gamma) x_l + gamma x_h`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The weight and data vectors" -> "weights"
"as follow" -> "as follows"
"for simplicity": looks like you can always assume this, right? Better remove these words.
"cumulated sum up of weights up to" -> "cumulative sum of weights up to"
"gamma" -> "\gamma"
Also, looks like several expressions lack :math:
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and you say "quantile" (singular) while a vector of probabilities is supported. It would make sense to use the plural, and to say a word about the probabilities, following the vocabulary used for ?quantile
.
Thanks! I've only had a look at most obvious things. Indeed, @johnmyleswhite is right that you must not have taken inspiration from the R code to write this, or it can't be licensed under MIT. It's a bit hard to use their notation without reading their code, though you can always make an implementation from scratch just based on the behaviour described in R docs. |
Thanks for your comments. I changed my code accordingly. I don't understand what coverage/coveralls is so I'm not sure why it fails. No I did not look at the R code (which does not compute weighted quantile btw). A question about your workflow: when I modify a local copy of a package like StatsBase, how can I install this local version as a package? Also, is there a way to load every function in this local version, including non exported function? |
Coveralls is used to measure the portion of the code which is covered by tests. Apparently there are some issues with it at the moment. When you modify a package in |
@@ -158,6 +158,9 @@ Quantile and Friends | |||
w = rand(n) | |||
xk = median(x, weights(w)) | |||
|
|||
.. function:: wquantile(x, p, w) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q
is p
in the code now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather the contrary.
@matthieugomez, can you answer the question about whether you read the R source when writing this code? If you did, then I'm afraid that iterating over this patch is a waste of time since the project can't accept it in any case. |
He did answer it, @StefanKarpinski |
|
||
|
||
###### Weighted quantile ##### | ||
# Variables m, g, low, high, correspond respectively to m, gamma, x[j], x[j+1] in the R help file for quantile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, you can use γ
to be closer to the math in Julia.
l = 1 | ||
wthreshold = cumulative_weight - wt[vpermute[i-1]] - index[k] | ||
# loop over posterior elements with same value | ||
while (i + l <= lv && v[vpermute[i+l]] == high[k]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard Julia style is to put no parentheses after while
.
The tests fail since you rebased, looks like you forgot to export the function. |
I removed I also used the weighted quantile function to improve the function |
I'm all for using |
|
||
|
||
function describe{T<:Real}(a::AbstractArray{T}; detail = false) | ||
detail? show(detailedsummarystats(a)) : show(summarystats(a)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a space before ?
(same below).
Thanks for the feedback again. What's the best way to use the code for the method of one type to define the method for another type ? Define Abstract type? Write a convert function from weighted to unweighted summary ? I don't see how this would make the code simpler.
|
I think you can define an abstract type, say |
Even simpler: you could merge all (non-)detailed and (un-)weighted printing methods into one method parameterized on |
I've simplified the |
You'll have to rebase. @nalimilan You've done most of the review, is this ready for merging? |
Bump? |
It's been a while. Can this be merged? I've suppressed the types to display weighted summary stats and I'll write it as a different pull request. |
This looks good to merge. What's going on with Travis? |
There have been some certificate problems on the nightlies, but they should vbe fixed now. I've restarted the build and it seems to work. |
Thanks for working on this, @matthieugomez. |
This function implements a weighted quantile. It is a generalization of quantile type 7 as implemented in Julia base (by that I mean results are the same with w = 1).
The function follows the notation in the R help file for quantile, which hopefully will make it easy to add more quantile types. Only non negative weights are allowed.
I've added some tests and documentation.
This is my first pull request so I may be doing some things incorrectly. In particular, I'm not sure about the order of arguments (weight vs quantiles).