Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for "Optional" features when Optim.jl is loaded #469

Closed
elcritch opened this issue Feb 19, 2016 · 6 comments
Closed

Support for "Optional" features when Optim.jl is loaded #469

elcritch opened this issue Feb 19, 2016 · 6 comments

Comments

@elcritch
Copy link

Reading through a few of the issues (e.g. #194 ) it appears Optim.jl would be beneficial in a number of cases. However, there is good reason to hesitate adding a dependency to a "core" type package like Distributions. It'd be nice to have some discussion on how/if to handle this issue.

I'm working on cleaning up code I've created for fitting the Weibull distributions using MLE. Currently my code relies on Optim.jl since the analytic solutions are quite complex and would still require a optimization library to solve. It's far easier to use Optim.jl to solve since different MLE equations are solved better by various optimization routines.

Either I could provide the package as an unofficial Distributions-Extras.jl or include code to only load Optim.jl when a user loads or requests the additional features. The latter would become be a "soft" dependency. There's a PR 6884 in JuliaLang to support option module reloads in 0.5.0.

@johnmyleswhite
Copy link
Member

FWIW, I tend to think this kind of linkage is almost impossible to get right because there's no clearly correct ordering of dependencies without breaking things into very fine-grained chunks.

Although some of the recent issues support making Optim.jl a dependency of this package, historically the dependency actually went in the exact opposite direction: see, e.g., https://github.com/JuliaOpt/Optim.jl/blob/a81583777b4715f85fd04aad2e0b3e0cf756bd4b/src/Optim.jl

My personal preference: the core of Distributions.jl shouldn't depend on Optim.jl (or the other way around).

@elcritch
Copy link
Author

That's exactly the problem.

What's the possibility of using a conditional modules once they become available? It'd probably be the only feasible way to introduce support for using Optim.jl or similar.

Optional Module loading:

module Distributions
   module OptionalFitting requires Optim
     # code only loaded after user imports / uses Optim.jl
     using Optim
     ... define extra fitting / extra bayesian methods / etc ... 
   end
end

@elcritch
Copy link
Author

The other option might be provide an internal Newton-Rhapson method. Actually it looks like Dirichelt.jl already implements one. That'd likely be fine for Weibull statistics. Of course this wouldn't benefit from any improvements/algorithms added to Optim.

Alternatively what about a "Distributions-Extra.jl" package that could be a child of Distributions and Optim and include these extras?

@johnmyleswhite
Copy link
Member

I'm generally not excited about conditional modules, but perhaps that's the best thing in the long run.

For some problems, I think you get a lot of benefits out of writing a custom Newton-Raphson method by hand. The Gamma MLE is doing this for instance; after initializing using a method of moments approach. These hand-written optimization functions can be much more well tailored to the problem (i.e. using stack allocation for everything in the whole inner loop and using no function pointers).

Having Distributions-Extra.jl is ok, but I'd be pretty worried about how well it would get maintained given how hard we struggle to maintain just Distributions.jl.

@johnmyleswhite
Copy link
Member

In case my previous comment seemed too negative, I think lots of people should build packages that extend Distributions.jl -- and those packages are totally safe to depend on any combination of Optim + Distributions.

@elcritch
Copy link
Author

@johnmyleswhite You comments didn't seem negative – maintaining this type of code is tricky. Thanks for the feedback. Especially the stack allocation part. Currently it takes a bit to run 1k+ iterations when doing bootstrapping. I'll take a look at the gamma solver since the dirichelt is multivariate.

It seems in general it seems better for implementers in Distributions.jl to bake their own optimizer routines. Though I'll take a look if I can reuse the gamma one. It might be feasible to have a simple internal optimizer for MLE fitting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants