Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Update to ForwardDiff 0.2.x #248

Merged
merged 1 commit into from
Jul 27, 2016

Conversation

jrevels
Copy link
Collaborator

@jrevels jrevels commented Jul 26, 2016

This updates the code to utilize the new ForwardDiff. I had trouble locally aligning the versions of Celeste's dependencies, so I'm relying on Travis to tell me whether or not this PR works.

I'm using WIP tag for this PR because I was a bit confused by the convert methods in test/derivative_utils.jl. They break the semantic interface of Base.convert, i.e. these convert methods do not obey convert(T, x)::T). It seems like their purpose is to map ElboArgs{T} to ElboArgs{GradientNumber{N,T}} or ElboArgs{HessianNumber{N,T}}, but I'm not sure where/how this is getting used.

I technically could do a naive rewrite of these methods using ForwardDiff's new Dual type, but as I was unsure, I chose to comment them out and see what happens.

@jeff-regier
Copy link
Owner

Hi Jarrett, This looks good. Thanks a lot for the contribution! I'm glad to be using the latest version of ForwardDiff.

I'm surprised those convert methods were still around...I thought I removed all the dead code. I go ahead and delete them.

@jeff-regier jeff-regier merged commit f2ec518 into jeff-regier:master Jul 27, 2016
@jeff-regier
Copy link
Owner

Uggg...our unit tests start taking 25 minutes rather than 11 minutes with this PR. I didn't notice that before merging it. Any idea what's happening? I'm going to try reverting these changes in the meantime.

@jrevels
Copy link
Collaborator Author

jrevels commented Jul 27, 2016

I would guess that a lot of the extra time is compile time. This is because the new ForwardDiff is revealing much more type information for the compiler can use for type inference/code generation. While the new ForwardDiff is generally much faster (it's far more optimized w.r.t. GC and memory usage), this means there's a lot of overhead for "one-off" differentiations like unit tests.

I've noticed that the extra compiler overhead is especially pronounced for the new ForwardDiff.hessian, which now stack-allocates gradient components. This stack allocation can result in huge improvements w.r.t. memory usage, but can incur a large compile-time cost.

That being said, you should definitely run benchmarks before a change like this, to make sure I didn't accidentally introduce regressions. I see that you have some benchmarks, but I don't know how to run them locally. I'd be very interested to see the results if you get a chance to run them.

@jrevels
Copy link
Collaborator Author

jrevels commented Jul 27, 2016

Also, if you do get around to benchmarking this, it'd be interesting to see if the performance difference between this branch, and this branch if you use ForwardDiff's usecache=false option.

@jeff-regier
Copy link
Owner

I wonder if the long compile time with the latest ForwardDiff.jl is due to JuliaLang/julia#14743 . Celeste only support Julia 0.5 currently -- perhaps with 0.4 the runtime of the units tests wouldn't have increased so much.

@rgiordan
Copy link
Contributor

rgiordan commented Aug 8, 2016

This does seem to be all attributable to compilation time, and usecache=false doesn't make any difference in that compile time.

rgiordan pushed a commit that referenced this pull request Aug 11, 2016
[WIP] Update to ForwardDiff 0.2.x

Former-commit-id: f2ec518
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants