
Add regularized incomplete beta and gamma functions. Use them and their auxiliary functions as much as possible. #148

Closed
wants to merge 1 commit

Conversation

andreasnoack
Member

This is a big one. The three major functions are the incomplete beta and gamma functions and the inverse of the latter, but they bring along some auxiliary functions that are of interest in their own right. For example, some are useful for precise calculation of pdfs. Please have a look at the headers and see if they can be used anywhere else.

I have done a lot of testing of the functions against Rmath and the Fortran version of NSWC, but there are so many possible combinations of parameter values that I cannot guarantee that all errors are gone. When the new implementation differed (relatively) from the old one by more than 1e-14, I compared against Mathematica, and in all cases both versions were inaccurate by more than they differed from each other.

I have also written fallback quantile functions. The continuous one is just Newton's method, and the discrete one is a normal approximation followed by a search. The two quantile functions also allow for a fallback random number method; I did this because I saw that a couple of R's quantile functions and RNGs work through "generic" methods. Please consider the choice of initial value in the continuous quantile function. I am not sure my choice is the right one.
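
A minimal sketch of the kind of Newton fallback described above (purely illustrative, not the code in this PR; newton_quantile_sketch is a made-up name, and the crude starting value stands in for whatever initial guess we settle on):

# Hypothetical sketch of a Newton-type fallback quantile for a continuous distribution d.
function newton_quantile_sketch(d, p::Real; tol=1e-12, maxiter=100)
    x = mean(d)                              # crude starting value; better choices are the open question above
    for i in 1:maxiter
        step = (cdf(d, x) - p) / pdf(d, x)   # Newton step for F(x) - p = 0
        x -= step
        abs(step) <= tol * (abs(x) + tol) && break
    end
    return x
end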

What remains is specialised RNGs for at least the binomial and Poisson distributions. The code R uses is not licensed freely enough for our needs.

@johnmyleswhite
Member

This is a pretty amazing body of work. Well done!

My first-order reaction is that (1) these new special functions should probably be in Base and (2) it would be great to use less Fortran-like names.

@dmbates
Contributor

dmbates commented Sep 17, 2013

Indeed, thanks @andreasnoackjensen. This is amazing.

@simonbyrne
Member

Agreed, thanks @andreasnoackjensen, it's some very impressive work.

I agree with @johnmyleswhite on the names. There are a couple of cases where multiple dispatch could be used to cut down on prefixes. We could also take hints from functions like log1p, so that dgmln1 could become lgamma1p.
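
To illustrate the dispatch idea (lgamma1p here is only the proposed name, and these naive one-liners would be replaced by accurate kernels):

# Sketch only: one exported name, with precision selected by dispatch instead of a Fortran-style prefix.
lgamma1p(x::Float64) = lgamma(1.0 + x)
lgamma1p(x::Float32) = lgamma(1.0f0 + x)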

@lindahua
Contributor

This is really impressive! The 3000+ lines of code in specialfuns.jl are amazing.

Some of these functions would have broader use than within this package -- actually, I think they deserve to be in Base or in a package of their own (say, SpecialFunctions.jl)?

cc @StefanKarpinski @ViralBShah

@simonbyrne
Member

Do we want to merge this straight into master, or into the rm-rmath branch first?

@simonbyrne
Member

These are probably the most useful functions, which we might want to consider renaming ('/' separates the Float32 and Float64 versions). I've also noted some possible names; a brief sketch of a couple of them follows the lists below:

  • gratio / dgrat = incomplete gamma function ratio
  • rcomp / drcomp = exp(-x) * x^a / gamma(a)
    • I don't know of a good name for this function
  • dpdel = lgamma(x) - (x-0.5)*log(x) + x - 0.5*log(2pi) (Stirling series remainder)
    • I've called this lstirling in the rm-rmath branch, with stirling(x) = exp(lstirling(x)), but I'm open to better names for this.
  • rlog / drlog = x - 1 - log(x)
    • I've called its negation logmxp1 in rm-rmath
  • gaminv / dginv = inverse of the incomplete gamma function ratio
  • bratio = incomplete beta function ratio
  • brcomp = x^a * y^b / beta(a,b)
  • drlog1 = x - log(1+x)
    • Its negation could be log1pmx?
  • dbcorr = dpdel(a) + dpdel(b) - dpdel(a+b)
  • dlgdiv = log(gamma(b)/gamma(a+b))
  • gam1 / dgam1 = 1/gamma(1+x) - 1
    • rgamma1p? (r for reciprocal)
  • gamln1 / dgmln1 = lgamma(1+x)
    • lgamma1p?
  • exparg / dxparg = log(realmin()) or log(realmax()) (depending on a Boolean argument)
    • logrealmax / logrealmin? It would also be great to define this for BigFloats (actually, what I really need is the largest x such that exp(x) < Inf).
    • UPDATE: it turns out that is how the documentation actually defines it. So how about realmaxexp?

One potential problem is that some of these functions are only valid for certain values (e.g. dpdel is only valid for arguments > 10).

Also, it seems the following already have equivalents in Base:

  • alnrel / dlnrel = log1p
  • dbetln = lbeta
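
For concreteness, a naive sketch of the two negated log functions proposed in the list above (log1pmx and logmxp1 are only suggested names, and accurate implementations would switch to series expansions near the points where these direct formulas cancel):

# Naive reference definitions only; they lose precision exactly where the NSWC kernels are careful.
log1pmx(x) = log1p(x) - x      # negation of drlog1 = x - log(1+x)
logmxp1(x) = log(x) - x + 1    # negation of rlog = x - 1 - log(x)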

@andreasnoack
Member Author

@simonbyrne I don't know what the easiest way to handle the merging is. Please tell me if it is better to make the pull request against rm-rmath.

I kept the old names to make it easier to compare with the source, but now that the translation is done, I think it is good to change the names and use multiple dispatch. I think your name suggestions are fine. Some specific comments:

  • rcomp and brcomp are the two pdfs multiplied by x and x(1-x), respectively. We could keep the names and export the pdfs for those two distributions, or just make them available through pdf.
  • Maybe we could check the argument in dpdel and throw an exception if it is outside the domain where the approximation is valid.
  • gamln1/dgmln1 is not worth exporting, as I just defined it as a dummy function. It was the only auxiliary function that was not more precise than just calculating the expression directly.
  • I would also like realmax and realmin for BigFloat, because then gratio and gaminv would work for that type as well. Maybe I should just remove exparg and call log(realmax()) / log(realmin()) directly.
  • I just missed log1p. I have tried to use built-in functions as much as possible, e.g. the error function, so I'll update that. However, dbetln is another matter. In Base, lbeta is calculated from lgamma, which loses precision, so dbetln is a more accurate version of that function (see the sketch below). Still, I guess that for BigFloat it is better to use the version from Base, because MPFR ships with a gamma function.
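
To see the lbeta point, one hypothetical way to display the digits lost by the lgamma-based formula, using BigFloat as a reference (naive_lbeta is just a stand-in for the current Base definition):

# lbeta computed from lgamma cancels significant digits for large arguments.
naive_lbeta(a, b) = lgamma(a) + lgamma(b) - lgamma(a + b)
a, b = 1e4, 2e4
ref = lgamma(big(a)) + lgamma(big(b)) - lgamma(big(a) + big(b))  # BigFloat reference
println(naive_lbeta(a, b) - ref)                                 # the gap is the cancellation error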

@simonbyrne
Member

I think I've made some conflicting changes, so I'll try to merge it in manually.

  • Ideally, it would be great to extend the functions to the whole domain, but in the meantime I think throwing an error is probably best (a minimal sketch is below).
  • A true lgamma1p function could be useful, as lgamma(1+x) is inaccurate for small x.
  • In that case, perhaps we should change the lbeta in Base (leaving the existing definition for BigFloats only)?
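
A minimal sketch of such a domain check (lstirling_sketch and the error message are illustrative; the body is just the first few terms of the standard Stirling remainder series, not the NSWC coefficients):

# Sketch only: refuse arguments outside the range where the asymptotic series is valid.
function lstirling_sketch(x::Float64)
    x >= 10 || error("lstirling_sketch is only valid for x >= 10")
    t = 1 / (x * x)
    return (1/12 - t * (1/360 - t / 1260)) / x   # 1/(12x) - 1/(360x^3) + 1/(1260x^5)
end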

@simonbyrne
Member

How about:

import Base.realmax
realmax(::Type{BigFloat}) = prevfloat(inf(BigFloat))
realmaxexp(::Type{BigFloat}) = with_bigfloat_rounding(()-> log(realmax(BigFloat)), RoundDown)

@andreasnoack
Member Author

Just a clarification to avoid double work: are you manually copying this into rm-rmath? If so, I will not make further updates to this pull request.

I just had a look at the source: it defines dgmln(x) = -log1p(dgam1(x)) on the interval -0.5 <= x <= 1.5. Outside that interval I think lgamma(1+x) is fine.
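
A sketch of what that could look like under the proposed names (rgamma1pm1 and lgamma1p are hypothetical, and the naive 1/gamma(1+x) - 1 stand-in would of course defeat the accuracy purpose of the real dgam1 kernel):

# Naive stand-ins to show the structure; NSWC's dgam1 uses a careful rational approximation instead.
rgamma1pm1(x) = 1 / gamma(1 + x) - 1                                    # plays the role of dgam1
lgamma1p(x) = -0.5 <= x <= 1.5 ? -log1p(rgamma1pm1(x)) : lgamma(1 + x)  # matches the source's split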

The realmax for BigFloat seems fine. Can you also do a realmin?

@simonbyrne
Member

I've merged it in via git merge, so feel free to do it on either branch.

You could try nextfloat(zero(BigFloat)) for realmin.

@andreasnoack
Member Author

I think that is too small. At least for Float64, I get

julia> realmin()
2.2250738585072014e-308

julia> nextfloat(zero(Float64))
5.0e-324

@simonbyrne
Member

Ahh, I forgot about subnormals.

However, it seems that MPFR doesn't do subnormals (or at least, if it does, it doesn't do them the same way). For regular IEEE-754 floats:

julia> z = zero(Float64)
0.0

julia> nextfloat(zero(Float64))
5.0e-324

julia> nextfloat(nextfloat(zero(Float64)))
1.0e-323

the second value is double the first, whereas for BigFloats

julia> z = zero(BigFloat)
0e+00 with 256 bits of precision

julia> nextfloat(z)
2.382564904887951073216169781732674520415196125559239787955023752600945386104324e-323228497 with 256 bits of precision

julia> nextfloat(nextfloat(z))
2.382564904887951073216169781732674520415196125559239787955023752600945386104366e-323228497 with 256 bits of precision

there isn't a loss of granularity.

Might want to get someone who understands this stuff to confirm it though...

@lindahua
Contributor

As specialfuns.jl gets much bigger, it would be easier to read and maintain if we divided it into smaller files and put them in a specialfuns directory.

@dmbates
Contributor

dmbates commented Sep 18, 2013

Has there been further discussion of which of the special functions should go into Base? In particular, have @ViralBShah or @StefanKarpinski expressed opinions yet? It may be my parochial point of view, but I feel the incomplete beta and incomplete gamma functions should be in Base along with the Bessel functions, etc., though my anti-Bessel-function stance may have been influenced by a poorly taught undergraduate applied math course.

@StefanKarpinski

cc: @JeffBezanson as well. Have to catch up on this thread myself.

@StefanKarpinski

It may be desirable to open an issue against https://github.com/JuliaLang/julia, linking back to this discussion, in order to have a little more visibility.

@JeffBezanson

What is the original Fortran code some of this is based on? Can we call it?

@lindahua
Contributor

@andreasnoackjensen Would you please incorporate the latest changes from master to make this mergeable again?

@andreasnoack
Member Author

@simonbyrne You did some further work on this, right? What is the status on that?

@lindahua
Contributor

My refactoring was not related to this PR, but I have moved stuff around. In particular, I moved the generic functions (and fallbacks) to src/univariates.jl. I think that's what is blocking this PR.

@simonbyrne
Member

I did merge this into #138, but I've since abandoned that in favour of a piecewise approach. I've added some of the utility functions (log1pmx and lstirling_asym are based on functions from this), though I've found that the NSWC code is not always optimal. Part of the reason is that it predates the IEEE floating-point standard, and so was written without such niceties as gradual underflow and exact round-to-nearest arithmetic: you can see this in lines such as Q = 0.5 - (0.5 - P).

I've been slowly working on replacing these and comparing against alternative implementations (Rmath, Boost, GSL, etc.), but it's a slow process. We should probably implement some automatic tests that compute the error at a large number of points (similar to what Boost does): this would require a reference BigFloat implementation, but it might be worth the effort.
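
A rough sketch of what such a test could look like (max_rel_error and the log1p example are purely illustrative; a real harness would also track ulps and sweep many parameter combinations):

# Worst-case relative error of a Float64 implementation against a BigFloat reference.
function max_rel_error(f64, fref, xs)
    worst = zero(BigFloat)
    for x in xs
        exact = fref(big(x))
        err = abs((f64(x) - exact) / exact)   # promoted to BigFloat arithmetic
        worst = max(worst, err)
    end
    return worst
end
# e.g. max_rel_error(log1p, log1p, linspace(-0.5, 10.0, 10000))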

@lindahua
Contributor

@andreasnoack Can you close this and make a PR to the StatsFuns package?

@andreasnoack
Member Author

I'll close it, but please keep the branch. I don't know when I'll prepare the new pull request.

@matbesancon matbesancon deleted the anj/beta branch November 16, 2018 21:21
@andreasnoack andreasnoack restored the anj/beta branch September 22, 2020 07:05
Labels
feature: new distribution (feature request for a new distribution)
feature: new generic function (feature request for a new generic function, i.e. with methods for all distributions)
7 participants