-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Addition of a function to compute the k-th order central moment of an array #153
Conversation
commit c9f73b5 Author: Vandenplas, Jeremie <[email protected]> Date: Sun Mar 1 20:20:58 2020 +0100 moment_dev commit df4fedb Author: Vandenplas, Jeremie <[email protected]> Date: Sat Feb 29 22:26:53 2020 +0100 moment_dev: addition of TOCs commit 20a76b8 Author: Vandenplas, Jeremie <[email protected]> Date: Sat Feb 29 22:19:03 2020 +0100 moment_dev: addition of spec commit 7c33e00 Author: Vandenplas, Jeremie <[email protected]> Date: Sat Feb 29 21:07:24 2020 +0100 moment_dev: addition of tests for complex commit 6c2630b Author: Vandenplas, Jeremie <[email protected]> Date: Fri Feb 28 23:33:00 2020 +0100 moment_dev: test progress commit 1743f87 Author: Vandenplas, Jeremie <[email protected]> Date: Fri Feb 28 22:25:43 2020 +0100 moment_dev: correction of moment for complex commit dfbdd97 Merge: 195c62d 5f35833 Author: Vandenplas, Jeremie <[email protected]> Date: Fri Feb 28 20:12:35 2020 +0100 Merge remote-tracking branch 'upstream/master' into moment_dev commit 195c62d Author: Vandenplas, Jeremie <[email protected]> Date: Sun Feb 23 20:26:09 2020 +0100 moment_dev: removed some explicit statements commit 7c5233f Author: Vandenplas, Jeremie <[email protected]> Date: Sat Feb 22 16:51:06 2020 +0100 moment_dev: addtition of test commit 5ff0968 Author: Vandenplas, Jeremie <[email protected]> Date: Sat Feb 22 16:09:54 2020 +0100 moment_dev: addition of a function for central moment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, along the lines of previous work in stdlib.
I don't have strong opinions on the questions you asked at the end. Overall +1.
I just implemented (absolute) central/raw moments in this branch. Personally I only use central moments, which is the default returned value for most packages I checked (that sometimes supports only the central moment). |
I noticed the Scipy version uses exponentiation by squares for efficiency. Do you think in Fortran we can simply trust the compiler to do the right thing? Second thing I am wondering is what is the sensitivity of this function to underflow/overflow? The reason I am asking is because I recall reading a paper by Hanson and Hopkin concerning some subtleties of the intrinsic Fortran Concerning the API, the SciPy version allows to pass a list of moments orders, e.g.
Since I have not used this function before, I don't know whether this is particularly useful or not? It sure would complicate the allocation of a LHS variable... |
Concerning this point, I find the Octave solution of passing a character ("c", "a", "ac", "r", "ar") to an optional
|
Thank you @ivan-pi for the comments.
I hope so. At least I always trust it, while never check it. If it is not the case, it could a nice procedure for
Good point. The function mainly relies on intrinsic functions (e.g., I think that this point is applicable to all procedures in
I am not in favor of such a behavior (the user can call several times the same function, with different order). The main issue would be the dimension of the result (what would happen if only one value is given? Is the result a scalar ( |
Nice suggestion. I agree that using characters may simplify the API. Here is my reasoning to propose logicals and not characters:
Again, I also find that the API with characters is nicer, but have some (unfounded?) concerns about performance. |
Thank you for explaining these points. As long as the API does not change, we can always modify for performance/accuracy later, or as you suggest add specialized versions. I agree with your concern about the dimension of the result. MATLAB does not have it either, so I suppose there is no need really. The current behavior is also fairly easy to remember, for the case of computing the moment along a single axis, the user simply needs to remember, the target array is one rank smaller than the original array:
For array slices, or chained calls to moments it becomes more tricky (I don't know whether there are such use cases). The purpose of this function is primarily statistics, right? I have used central moments before for image analysis purposes (e.g. to locate the centroid or area of an object), but I think a different function would be necessary for this. |
I think your concerns about performance are unjustified. If you check the routines in other languages they performs checks for NaN, dimensions, etc., but they are still more popular than Fortran (simply because they provide these functions). LAPACK also uses characters in the API to specify the form of the system of equations (transpose, no transpose...), but they return the I do agree with your concern about the multiple character combinations. I suppose renaming the |
Indeed, it is primarily statistics. I think the function you mentioned is an extension to 2D.
You convinced me. I will start an implementation with characters, instead of logicals. Then we can further discuss pros and cons based on the code. However, it could be easier to first merge this PR (if there is a consensus about it), and then submit another PR with the different moments.
#95 could be usefull in these cases.
I agree. |
After taking a closer look at the source code, the error message for "wrong dimension" currently refers to the
As the raw and absolute do not break backward compatibility of API I think this might be easier too. It would be nice to get an opinion from the remaining developers concerning their preference between logicals vs characters for option passing. |
Thank you for reviewing the code. It is now corrected as suggested.
I agree! It needs more discussions. |
Sorry to barging in late to the discussion. I do not have strong feelings about the function since I do not usually use it. I do not want to disrupt the implementation when it already has a lot of work in it but there are two points that are not clear. First, probably the simplest is that I do not understand what happens when a scalar mask is provided. I though that perhaps for functions as Secondly, when started to ponder about the interface with logical vs character flags I went to look to other implementations and started to think on the usefulness of the options. Because is not my area of expertise, I only was able to think of Some Applications
For these reasons I would favor (slightly) an API more similar to Julia in that the center be given as a float point number (or rather a number of the same type than the array).
Additionally, in the same spirit we could consider adding an array of weights Thus, we only would have a remaining flag for absolute, which (because of my ignorance) I do not know how used/needed is. Because of the advanced state of the current pull request, I'd be perfectly happy to leave it as it is now, and if there is any interest keep the discussion for a future next iteration. |
Thank you for your comments @fiolj . There is no problem! I mainly implemented it to facilitate discussion.
Personally I do not understand the usefullness of a scalar logical, because it will always return Regarding the implementation, it could be indeed passed to the intrinsic function
This is a nice idea. result = moment(x, 2) ! raw moment
result = moment(x, 2, center = mean(x)) ! central moment
result = moment(x, 2, dim = 3, center = mean(x, 3, mask=(x >3)), mask = (x >3)) !central moment Your proposed API adds some flexibility to the function
What are your opinions about this API? @fiolj @ivan-pi @certik @milancurcic Re: weights, the aim was to implement a function for the moment in terms of mathematics/statistics. However, if the addition of an argument can extend its used in physics, I would agree. |
n = size(x, kind = int64) | ||
mean = sum(x) / n | ||
|
||
res = sum((x - mean)**order) / n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not
res = mean((x - mean(x))**order)
here? Is there a concern about the overhead from calling mean()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was indeed to avoid the overhead of calling mean
(especially when mean
is an array).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late review. This is excellent, thank you. I especially appreciate the hyperlinked TOC in the spec file.
I only had one question about the implementation. Otherwise, good to go.
Thank you @milancurcic for the review. Any thoughs about a different API: result = moment(array, order [, center] [,absolute] [, mask])
result = moment(array, order, dim [, center] [,absolute] [, mask]) where |
I've started some simple timing tests with gfortran on the overhead of using mean(), I do not have conclusive results yet, but noticed a few (minor) points in the code. Some of these will change with the new API implementation, but I mention them anyway.
Also, we could put all calculations under the right case (although it perhaps may become quite irrelevant with the new API, unless we decide to use "center moment" as a default)
|
Funny. I implemented it in the way you proposed for I aimed to review the code of |
The API with optional I suggest to:
Small and specific PRs will allow for a focused discussion. |
I agree with @milancurcic about how to move forward. I am not a target audience either for this, so I do not have a strong opinion either. |
@milancurcic @certik Thank you for your advices. I will merge this PR. @fiolj I would be interested by your results regarding the overhead of calling |
A (naive) comparison of times for moment is in: https://gist.github.com/fiolj/7ebb1611e200d1fcd248b5866f851b76 Strangely enough, I am not finding differences between using |
Thank you @fiolj for these time comparisons. |
Related to #113
In this draft pull request I propose the implementation of a function for the k-th order central moment of an array.
Other languages
Matlab moment
SciPy moment
Julia moment
R moment
Octave moment
Note: The functions of R and Octave can also return absolute central moment, raw moment, and absolute raw moment.
Proposed API and implementation
moment
- central moment of array elementsDescription
Returns the k-th order central moment of all the elements of
array
, or of the elements ofarray
along dimensiondim
if provided, and if the corresponding element inmask
istrue
.The k-th order central moment is defined as :
where n is the number of elements.
Syntax
result = moment(array, order [, mask])
result = moment(array, order, dim [, mask])
Arguments
array
: Shall be an array of typeinteger
,real
, orcomplex
.order
: Shall be an scalar of typeinteger
.dim
: Shall be a scalar of typeinteger
with a value in the range from 1 to n, where n is the rank ofarray
.mask
(optional): Shall be of typelogical
and either by a scalar or an array of the same shape asarray
.Return value
If
array
is of typereal
orcomplex
, the result is of the same type asarray
.If
array
is of typeinteger
, the result is of typereal(dp)
.If
dim
is absent, a scalar with the k-th central moment of all elements inarray
is returned. Otherwise, an array of rank n-1, where n equals the rank ofarray
, and a shape similar to that of
array
with dimensiondim
dropped is returned.If
mask
is specified, the result is the k-th central moment of all elements ofarray
corresponding totrue
elements ofmask
. If every element ofmask
isfalse
, the result is IEEE
NaN
.Example
Questions for discussion
absolute
andraw
? (see example in this branch )