List of special mathematical functions to include #179

Open
certik opened this issue May 4, 2020 · 16 comments
Labels
topic: mathematics linear algebra, sparse matrices, special functions, FFT, random numbers, statistics, ...

Comments

@certik
Member

certik commented May 4, 2020

Here is a paper from 2007 that was submitted to the Fortran Committee, but ultimately rejected:

https://wg5-fortran.org/N1651-N1700/N1688.pdf

The functions there seem to be exactly in the scope of stdlib, so we should include them and we can use this paper as a starting point.

@ivan-pi
Member

ivan-pi commented May 4, 2020

Indeed, this document looks like a nice starting point. Would this go to stdlib_experimental_specfun (or stdlib_experimental_special_functions) for now or would we borrow the name iso_fortran_special_functions?

I suppose many of these functions could be adapted from the following sources:

Also the scipy.special module gives references to many of these older codes.
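For example, a prospective stdlib implementation could be spot-checked against an existing oracle. A minimal sketch using Python's built-in math module (scipy.special exposes these functions and many more; the specific values below are just illustrative checks, not from the original discussion):

```python
import math

# Spot-check reference values one might use when validating a new
# special-function implementation. math.gamma and math.erf come from
# the Python standard library; scipy.special covers a far wider set.
assert abs(math.gamma(5.0) - 24.0) < 1e-12   # Gamma(5) = 4! = 24
assert abs(math.erf(0.0)) < 1e-15            # erf(0) = 0
print(math.gamma(5.0), math.erf(1.0))
```

In practice one would tabulate such reference values over the whole domain and compare against the Fortran routine's output.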

@jvdp1
Member

jvdp1 commented May 5, 2020

Here is a paper from 2007 that was submitted to the Fortran Committee, but ultimately rejected:

Do we know why it was rejected? Maybe the author(s) of this proposal already have some implementations that could be integrated into stdlib. Of course, @ivan-pi's list will be useful too.
Anyway, I think it would be nice to have them in stdlib.
I am in favor of the name stdlib_experimental_special_functions.

@certik
Member Author

certik commented May 5, 2020

It was @vansnyder's proposal. Van, do you know why it was rejected?

@vansnyder

vansnyder commented May 5, 2020 via email

@vansnyder

JPL Math77 is now available without license from netlib.

@vansnyder

I've revised my 2009 proposal to correspond to Fortran 2020. ISO doesn't allow the kind of revision suffix that J3 allows, but I did it anyway. I just can't ask Steve to put it on the WG5 server with that number.

I think the stdlib should be packaged in modules, each one a coherent set of related procedures (and types and constants). These should ultimately all be described in a consistent style, as an optional part of the standard. Procedures should be described in the same format as in subclause 16.9 of part 1 of the standard. Constants and opaque types can be described as in subclause 16.10. Types that have public components and bindings could be described by a type definition that shows only type parameters, and the public components and bindings. Type-bound procedures should be described as in subclause 16.9.

N1688r2.pdf

@certik
Member Author

certik commented May 6, 2020

Thanks @vansnyder for the updated document. Yes, our goal is to have stdlib organized as modules, each a coherent set of procedures, documented in a Standard-compatible way. Thanks for the tips on how to do that.

@ivan-pi
Member

ivan-pi commented May 7, 2020

I've found this post from 2017 on Julia Discourse by Steven Johnson, where he comments on the implementation of special functions in Fortran vs Julia:

For example, I implemented an erfinv function in Julia (JuliaLang/julia#2987), and it was about 3x faster than Matlab or SciPy’s erfinv function, both of which are taken from standard Fortran libraries. (This is benchmarking single-threaded vectorized calls on large arrays where the Matlab/Python overhead should be negligible.) The underlying algorithm is similar to those used in the Fortran routines (in Matlab’s case this is only a guess), because almost everyone uses the same rational-function approximations published in the 1970s.

I have found similar gains (compared to Fortran code called in SciPy) for other special functions, e.g. polygamma functions (JuliaLang/julia#7125) and exponential integrals (JuliaMath/SpecialFunctions.jl#19).

The reason Julia can beat the Fortran code is that metaprogramming makes it easy to apply performance optimizations that are awkward in Fortran. We have metaprogramming macros (@evalpoly) that can easily inline polynomial evaluations, whereas the Fortran code makes function calls that loop over look-up tables of polynomial coefficients. Even greater speedups are possible for evaluating polynomials of complex arguments, where there is a fancy recurrence from Knuth that is almost impossible to use effectively without code generation. In principle, the Fortran authors could have done the same inlining and gotten similar performance, but the code would have been much more painful to write by hand. (They could even have written a program to generate Fortran code, but that is even more painful.)

While I don't think we should focus strongly on optimization to begin with, it would be interesting to see if we can do something similar with fypp.
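For reference, the Horner scheme that @evalpoly (and a fypp macro) would inline can be sketched in a few lines. This loop-based Python version computes exactly the same nesting that the inlined expansion produces:

```python
def horner(t, *coeffs):
    """Evaluate a polynomial by Horner's scheme.

    coeffs are ordered from highest to lowest degree, matching the
    nesting (((c_n*t + c_{n-1})*t + ...)*t + c_0) that a
    metaprogramming macro would expand into straight-line code.
    """
    acc = 0.0
    for c in coeffs:
        acc = acc * t + c
    return acc

# p(t) = 2*t**2 + 3*t + 4 evaluated at t = 2
print(horner(2.0, 2.0, 3.0, 4.0))  # -> 18.0
```

The speedup Johnson describes comes from unrolling this loop at compile time, so the coefficients become literals and no coefficient table is traversed at run time.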

@certik
Member Author

certik commented May 7, 2020

@ivan-pi nice find. Initially we should indeed focus on functionality, but later our goal should definitely be to be as fast as or faster than Julia. It might also be a nice benchmark for Flang and LFortran.

@ivan-pi
Member

ivan-pi commented Jun 21, 2020

As my weekend project, I had a go at implementing Horner's algorithm with fypp (see the related issue aradi/fypp#8). This way polynomials can be efficiently inlined.

The syntax looks like

        @:horner(p,t,0.160304955844066229311e2,&
                      -0.90784959262960326650e2,&
                       0.18644914861620987391e3,&
                      -0.16900142734642382420e3,&
                       0.6545466284794487048e2,&
                      -0.864213011587247794e1,&
                       0.1760587821390590,prec=dp)

and gets expanded into

  p = ((((((0.1760587821390590_dp*t + (-0.864213011587247794e1_dp))*t + (0.6545466284794487048e2_dp))*t +&
      & (-0.16900142734642382420e3_dp))*t + (0.18644914861620987391e3_dp))*t + (-0.90784959262960326650e2_dp))*t +&
      & (0.160304955844066229311e2_dp))

There is still some space for improvement with respect to the preprocessor syntax.

I've implemented the inverse error function using the Horner macro. The expanded code is:

elemental function erfinv(x) result(res)
    use, intrinsic:: ieee_arithmetic, only: ieee_value, &
      ieee_positive_inf, ieee_negative_inf, ieee_quiet_nan
    real(dp), intent(in) :: x
    real(dp) :: a, t, res, p, q

    a = abs(x)
    if (a >= 1.0_dp) then
      if (x == 1.0_dp) then
        res = ieee_value(1._dp, ieee_positive_inf)
      else if (x == -1.0_dp) then
        res = ieee_value(1._dp, ieee_negative_inf)
      else
        ! domain error
        res = ieee_value(1._dp,ieee_quiet_nan)
      end if
    else if (a <= 0.75_dp) then ! Table 17 in Blair et al.
        t = x*x - 0.5625_dp
  p = ((((((0.1760587821390590_dp*t + (-0.864213011587247794e1_dp))*t + (0.6545466284794487048e2_dp))*t +&
      & (-0.16900142734642382420e3_dp))*t + (0.18644914861620987391e3_dp))*t + (-0.90784959262960326650e2_dp))*t +&
      & (0.160304955844066229311e2_dp))
  q = ((((((0.1e1_dp*t + (-0.206010730328265443e2_dp))*t + (0.10760453916055123830e3_dp))*t + (-0.22210254121855132366e3_dp))*t +&
      & (0.21015790486205317714e3_dp))*t + (-0.91374167024260313936e2_dp))*t + (0.147806470715138316110e2_dp))
        res = x * p / q
    else if (a <= 0.9375_dp) then ! Table 37 in Blair et al.
        t = x*x - 0.87890625_dp
  p = (((((((0.237516689024448_dp*t + (-0.5478927619598318769e1_dp))*t + (0.19121334396580330163e2_dp))*t +&
      & (-0.22655292823101104193e2_dp))*t + (0.11763505705217827302e2_dp))*t + (-0.29344398672542478687e1_dp))*t +&
      & (0.3444556924136125216_dp))*t + (-0.152389263440726128e-1_dp))
  q = (((((((0.1e1_dp*t + (-0.10014376349783070835e2_dp))*t + (0.24640158943917284883e2_dp))*t + (-0.23716715521596581025e2_dp))*t&
      & + (0.10695129973387014469e2_dp))*t + (-0.24068318104393757995e1_dp))*t + (0.2610628885843078511_dp))*t +&
      & (-0.108465169602059954e-1_dp))
        res = x * p/q
    else ! Table 58 in Blair et al.
      t = 1.0_dp / sqrt(-log(1.0_dp - a))
  p = ((((((((((0.22419563223346345828e-2_dp*t + (-0.177910045751117599791e-1_dp))*t + (0.668168077118049895750e-1_dp))*t +&
      & (0.72718806231556811306121_dp))*t + (0.207897426301749172289354e1_dp))*t + (0.262556728794480727266643e1_dp))*t +&
      & (0.283026779017544899742694e1_dp))*t + (0.1042615854929826612283637e1_dp))*t + (0.129695500997273524030254_dp))*t +&
      & (0.5350414748789301376564e-2_dp))*t + (0.56451977709864482298e-4_dp))
  q = ((((((((0.1e1_dp*t + (0.203724318174121779298258e1_dp))*t + (0.387828582770420112635182e1_dp))*t +&
      & (0.376311685364050289010232e1_dp))*t + (0.303793311735222062372456e1_dp))*t + (0.105429322326264911952443e1_dp))*t +&
      & (0.129866154169116469345513_dp))*t + (0.5350558706793065395335e-2_dp))*t + (0.56451699862760651514e-4_dp))
        res = p / (sign(t,x) * q)
    end if
end function

Edit: there was an error in my erfinv version near the ends of the domain (-1,1); I've now replaced the code above with a corrected version.

Swapping axes to compare with the error function in gnuplot, I can see the code works correctly:

[figure: erfinv curve plotted against the mirrored error function]

@vansnyder

vansnyder commented Jun 22, 2020 via email

@urbanjost

Curious. Did you time your example against any intrinsics?

@ivan-pi
Member

ivan-pi commented Jun 23, 2020

Thanks @vansnyder for the suggestions. After tabulating the error, I found the precision dropped slightly in one of the intervals, so I replaced the code above with slightly more accurate coefficients. After tabulating ERF(ERFINV(X)) - X (calculated in double precision) over the range (-0.995,0.995), the error does not exceed 2.e-16, but I will do more tests before making this a pull request.
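The round-trip check described above can be sketched as follows. Python's math module lacks erfinv, so a simple bisection stands in for it here (a hypothetical stand-in, not the Fortran routine under test, which would be called instead; scipy.special.erfinv would also do):

```python
import math

def erfinv_bisect(y, lo=-6.0, hi=6.0):
    """Invert erf by bisection; erf is monotone increasing on [lo, hi]."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Tabulate erf(erfinv(x)) - x over (-0.995, 0.995) in steps of 0.001.
max_err = max(abs(math.erf(erfinv_bisect(x)) - x)
              for x in (i / 1000.0 for i in range(-995, 996)))
print(max_err)
```

With 80 bisection steps the inverse is resolved far below double-precision rounding, so max_err reflects only the erf evaluation and the round trip.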

Curious. Did you time your example against any intrinsics?

Not yet. I went down the rabbit hole, so to speak, and started designing some benchmarking macros similar to those in BenchmarkTools.jl.
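A minimal timing harness in the spirit of BenchmarkTools.jl's @btime might look like the sketch below (in Python with timeit for illustration; the actual macros would be fypp/Fortran, and the reported minimum is the usual noise-robust statistic):

```python
import math
import timeit

# Time many calls to a special function and report the best
# per-call cost across several repeats.
xs = [i / 1000.0 for i in range(-999, 1000)]

def run():
    for x in xs:
        math.erf(x)

best = min(timeit.repeat(run, number=100, repeat=5))
per_call_ns = best / (100 * len(xs)) * 1e9
print(f"~{per_call_ns:.0f} ns per erf call")
```

The same pattern (warm up, repeat, take the minimum) would transfer directly to a Fortran benchmark driver comparing against compiler intrinsics.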

For comparison, the inverse error function is available in

@awvwgk awvwgk added the topic: mathematics linear algebra, sparse matrices, special functions, FFT, random numbers, statistics, ... label Sep 18, 2021
@Beliavsky

On comp.lang.fortran, Al Greynolds asked "Why no complex ERF intrinsic?" and @arjenmarkus mentioned that it is a candidate for stdlib.

@vansnyder

vansnyder commented Oct 25, 2021 via email

@rebcabin

rebcabin commented Nov 27, 2022

I've revised my 2009 proposal to correspond to Fortran 2020. ISO doesn't allow the kind of revision suffix that J3 allows, but I did it anyway. I just can't ask Steve to put it on the WG5 server with that number.

I think the stdlib should be packaged in modules, each one a coherent set of related procedures (and types and constants). These should ultimately all be described in a consistent style, as an optional part of the standard. Procedures should be described in the same format as in subclause 16.9 of part 1 of the standard. Constants and opaque types can be described as in subclause 16.10. Types that have public components and bindings could be described by a type definition that shows only type parameters, and the public components and bindings. Type-bound procedures should be described as in subclause 16.9.

N1688r2.pdf

@vansnyder on your 2.3 line 24, I can easily see someone designing hardware for spherical harmonics for geoid models, so having an intrinsic for code generation is not too far-fetched.

(Hi, Van, yes, it's the same Brian Beckman from 40 years ago at JPL and Earth models in MASTERFIT :)
