
Move predict from Turing #716

Draft · wants to merge 10 commits into master

Conversation

@sunxd3 (Member) commented Nov 12, 2024

This PR aims to move the predict function from the Turing.jl repo to here (DynamicPPL). This PR won't change the way that predict is fundamentally implemented. (Later, in #651, we will transition to using fix to implement predict.)
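
For context, a minimal sketch of the fix-based idea (the model and the fixed value below are illustrative, not from the PR): fix the parameter values of one posterior draw in the model, then sample the remaining variables.

```julia
using DynamicPPL, Distributions

@model function demo(y)
    μ ~ Normal()
    y ~ Normal(μ, 1)
end

m = demo(missing)          # y is unobserved, so it is the quantity to predict
m_fixed = fix(m; μ = 0.3)  # pin μ to the value from one posterior draw
y_draw = rand(m_fixed)     # re-running the model now only samples the non-fixed variables
```

predict would then essentially repeat this for every posterior draw.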

The challenges of this PR are:

  1. predict returns an MCMCChains.Chains (see the usage sketch after this list)
  2. the implementation in Turing.jl uses the Chains-generation pipeline in Turing.jl (the same pipeline called at the end of sample)
  3. it doesn't really make sense to move all the Chains-related util functions into DynamicPPL
  4. so we need to separate out a subset of the util functions and add them to DynamicPPL
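
For context, a hedged sketch of how predict is used today at the Turing.jl level (the model, data, and sampler settings are illustrative):

```julia
using Turing

@model function linreg(x, y)
    σ ~ truncated(Normal(), 0, Inf)
    β ~ Normal()
    for i in eachindex(x)
        y[i] ~ Normal(β * x[i], σ)
    end
end

x = randn(20)
chain = sample(linreg(x, 0.5 .* x .+ 0.1 .* randn(20)), NUTS(), 200)

# Pass a model whose observations are `missing` to predict them;
# the result comes back as an MCMCChains.Chains.
y_pred = predict(linreg(x, Vector{Union{Missing,Float64}}(undef, 20)), chain)
```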

What I have done so far:

  1. moved the predict function and recovered the subset of the util functions needed to make it functional
  2. sample in tests now uses the LogDensityFunction interface (see the sketch after this list)
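
To illustrate point 2, a hedged sketch of sampling through the LogDensityFunction interface (the model, the choice of AdvancedMH, and the sample count are illustrative; the PR's actual tests may differ):

```julia
using DynamicPPL, Distributions, AbstractMCMC, AdvancedMH, LogDensityProblems, LinearAlgebra

@model function demo(y)
    μ ~ Normal()
    y ~ Normal(μ, 1)
end

ldf = DynamicPPL.LogDensityFunction(demo(0.7))
d = LogDensityProblems.dimension(ldf)

# Wrap the log density for AdvancedMH and run a random-walk Metropolis sampler;
# the result is a plain vector of transitions rather than an MCMCChains.Chains.
density_model = AdvancedMH.DensityModel(θ -> LogDensityProblems.logdensity(ldf, θ))
transitions = sample(density_model, AdvancedMH.RWMH(MvNormal(zeros(d), I)), 1_000)
```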

Modifications made to the moved util functions:

  1. AbstractMCMC.bundle_samples is renamed to _bundle_samples; unused keyword arguments are removed
  2. the Transition type is copied from the Turing.jl repo, but the stat field is removed as it is never used in predict (see the sketch after this list)
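
Roughly, the copied type then has this shape (a hedged illustration; field names follow Turing.jl's Transition, minus the removed stat field):

```julia
# Sketch only: the PR's actual definition may differ in details such as type parameters.
struct Transition{T,L<:Real}
    θ::T    # the parameter values of this sample
    lp::L   # the log joint probability of this sample
end
```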

That said, most of the functions in this PR should be identical to, or straightforwardly identifiable in, the Turing.jl code.

@sunxd3 marked this pull request as draft November 13, 2024 08:52
sunxd3 and others added 2 commits November 13, 2024 09:21
@sunxd3 (Member Author) commented Nov 14, 2024

Some tests still fail: the mean of the predictions looks correct, but the variance seems too high. I'm not certain where it goes wrong, so this needs further investigation.

The reason is that some tests implicitly rely on the variance of the posterior samples. Discarding some initial samples fixes this. Turing does this by default, but via LogDensityFunction we need to do the discarding explicitly (see the sketch below).
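
For example (a hedged sketch; the model, sampler, and cutoffs are illustrative): sampling through the LogDensityFunction interface yields a plain vector of draws, so the warm-up portion has to be dropped by hand, either by slicing or via AbstractMCMC's discard_initial keyword.

```julia
using DynamicPPL, Distributions, AbstractMCMC, AdvancedMH, LogDensityProblems, LinearAlgebra

@model function demo(y)
    μ ~ Normal()
    y ~ Normal(μ, 1)
end

ldf = DynamicPPL.LogDensityFunction(demo(0.7))
density_model = AdvancedMH.DensityModel(θ -> LogDensityProblems.logdensity(ldf, θ))
spl = AdvancedMH.RWMH(MvNormal(zeros(LogDensityProblems.dimension(ldf)), I))

# Drop the first half of the draws as warm-up before computing posterior statistics...
transitions = sample(density_model, spl, 2_000)
posterior_draws = transitions[1_001:end]

# ...or ask AbstractMCMC to discard them up front.
posterior_draws2 = sample(density_model, spl, 1_000; discard_initial=1_000)
```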

sunxd3 and others added 4 commits November 18, 2024 11:06
@coveralls commented Nov 18, 2024

Pull Request Test Coverage Report for Build 12007336979

Details

  • 47 of 48 (97.92%) changed or added relevant lines in 2 files are covered.
  • 25 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.2%) to 84.55%

Changes Missing Coverage          Covered Lines   Changed/Added Lines   %
ext/DynamicPPLMCMCChainsExt.jl    34              35                    97.14%

Files with Coverage Reduction     New Missed Lines   %
src/model.jl                      1                  94.44%
src/varinfo.jl                    6                  86.3%
src/simple_varinfo.jl             6                  86.6%
src/threadsafe.jl                 12                 57.76%
Totals Coverage Status
Change from base Build 11934706726: 0.2%
Covered Lines: 3601
Relevant Lines: 4259

💛 - Coveralls


codecov bot commented Nov 18, 2024

Codecov Report

Attention: Patch coverage is 97.91667% with 1 line in your changes missing coverage. Please review.

Project coverage is 84.55%. Comparing base (ba490bf) to head (53b6749).

Files with missing lines          Patch %   Lines
ext/DynamicPPLMCMCChainsExt.jl    97.14%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #716      +/-   ##
==========================================
+ Coverage   84.35%   84.55%   +0.19%     
==========================================
  Files          30       30              
  Lines        4211     4259      +48     
==========================================
+ Hits         3552     3601      +49     
+ Misses        659      658       -1     

☔ View full report in Codecov by Sentry.

@sunxd3 (Member Author) commented Nov 18, 2024

We had a quick discussion on this at today's meeting. Tor raised that we should probably implement predict so that it takes a generic Vector as the second argument (instead of just a Chains); this is because predict works with sample, and sample can produce non-Chains return types.

Also, although we don't use fix in this PR yet, it is worthwhile to have a nicer and better-thought-out implementation.

@torfjelde (Member) commented:

> Vector as the second argument

Specifically, I was thinking Vector{<:VarInfo} :) But otherwise, this sounds very good 👍
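
For reference, a hedged sketch of what that suggestion might look like as a method signature (the name predict_from_varinfos and the body are purely illustrative; working out the actual implementation is the point of this PR):

```julia
using DynamicPPL, Random

function predict_from_varinfos(
    rng::Random.AbstractRNG,
    model::DynamicPPL.Model,
    varinfos::AbstractVector{<:DynamicPPL.AbstractVarInfo},
)
    return map(varinfos) do varinfo
        # For each posterior draw: fix its values in `model` and sample the
        # remaining (unobserved) variables. Deliberately left unimplemented here.
        error("illustrative sketch only")
    end
end
```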

sunxd3 and others added 3 commits November 21, 2024 12:42
The following review thread is on these lines of the diff:

    varinfos::AbstractArray{<:AbstractVarInfo};
    include_all=false,
    )
    predictive_samples = Array{PredictiveSample}(undef, size(varinfos))
A reviewer (Member) commented:
Do we really need the PredictiveSample here?

My original suggestion was just to use Vector{<:OrderedDict} for the return-value (an abstractly typed PredictiveSample doesn't really offer anything beyond this, does it?)

@sunxd3 (Member Author) replied:

I haven't thought too deeply about this. A new type is certainly easier to dispatch on, but it may not be necessary. Let me look into it.
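
For concreteness, a hedged sketch contrasting the two return-value shapes discussed in this thread (the PredictiveSample field shown here is illustrative, not the PR's actual definition):

```julia
using DynamicPPL, OrderedCollections

# (a) A dedicated wrapper type, one instance per posterior draw:
struct PredictiveSample
    values::OrderedDict{VarName,Any}  # predicted value for each variable name
end

# (b) The suggestion: return the per-draw dictionaries directly as a
#     Vector{<:OrderedDict}, with no wrapper type around them.
predictions = [
    OrderedDict{VarName,Any}(@varname(y) => 0.1),
    OrderedDict{VarName,Any}(@varname(y) => -0.4),
]
```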

@torfjelde (Member) commented:

Otherwise stuff is starting to look nice though :)

Labels: None yet
Projects: None yet
3 participants