diff --git a/dev/examples/initializing-hmc/index.html b/dev/examples/initializing-hmc/index.html
index eeb79f5a..404376aa 100644
--- a/dev/examples/initializing-hmc/index.html
+++ b/dev/examples/initializing-hmc/index.html
@@ -9,143 +9,143 @@
scatter(x, y; xlabel="x", ylabel="y", legend=false, msw=0, ms=2)
[scatter plot of the simulated data; only the embedded SVG markup changed between builds]

We'll fit this using a simple polynomial regression model:

\[\sigma \sim \text{Half-Normal}(\mu=0, \sigma=1)\]

@@ -190,44 +190,44 @@
∇P = ADgradient(:ForwardDiff, P)

ForwardDiff AD wrapper for TransformedLogDensity of dimension 5, w/ chunk size 5

Pathfinder can take any object that implements this interface.
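The interface in question is small: a type needs methods for `LogDensityProblems.capabilities`, `LogDensityProblems.dimension`, and `LogDensityProblems.logdensity`. As a minimal sketch, here is a hypothetical 2-dimensional standard-normal problem (`StdNormalProblem` is not part of the original example):

```julia
using LogDensityProblems

# Hypothetical example problem: a 2-dimensional standard normal.
struct StdNormalProblem end

# We provide only the log density itself (order 0);
# gradients can be supplied via an AD wrapper such as ADgradient.
function LogDensityProblems.capabilities(::Type{StdNormalProblem})
    return LogDensityProblems.LogDensityOrder{0}()
end
LogDensityProblems.dimension(::StdNormalProblem) = 2
# Log density up to an additive constant.
LogDensityProblems.logdensity(::StdNormalProblem, x) = -sum(abs2, x) / 2
```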

result_pf = pathfinder(∇P)
Single-path Pathfinder result
   tries: 2
   draws: 5
-  fit iteration: 22 (total: 30)
-  fit ELBO: -117.74 ± 0.25
+  fit iteration: 32 (total: 32)
+  fit ELBO: -116.82 ± 0.61
   fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
 dim: 5
-μ: [-0.351938851773518, 0.24689540392550613, 0.05827742217490308, 0.11655069217591063, 0.17482994194820867]
-Σ: [0.005794060117888708 0.004501772414217777 … 0.005154335534465927 -0.004280279761853394; 0.004501772414217777 0.03531824833492471 … 0.007991672055998154 -0.017293830014585607; … ; 0.005154335534465927 0.007991672055998154 … 0.7015097471184754 -0.440833068526388; -0.004280279761853394 -0.017293830014585607 … -0.440833068526388 0.28326699991508697]
+μ: [-0.35193742732366645, 0.2469005998325288, 0.05827588221943315, 0.11655176419252547, 0.17482764675446022]
+Σ: [0.004783202383029375 -0.001170612806498005 … 0.017642243198185595 -0.010817997078899744; -0.001170612806498005 0.020979748064002175 … -0.03205170538208288 0.01130698208468521; … ; 0.017642243198185595 -0.03205170538208288 … 1.0789712537388085 -0.6702794586413363; -0.010817997078899744 0.01130698208468521 … -0.6702794586413363 0.42226112257957793]
 )
 
init_params = result_pf.draws[:, 1]
5-element Vector{Float64}:
- -0.3490992268968827
-  0.1768169119835949
-  0.1570690692732149
- -0.25930709446876066
-  0.4601383974449641
+ -0.20140055125980677
+  0.12271379282185454
+ -0.1694336237882354
+  0.8614423999825955
+ -0.16004585896461493
inv_metric = result_pf.fit_distribution.Σ
5×5 Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}:
-  0.00579406   0.00450177  -0.00167075   0.00515434  -0.00428028
-  0.00450177   0.0353182   -0.00747888   0.00799167  -0.0172938
- -0.00167075  -0.00747888   0.0168776   -0.0892416    0.0570341
-  0.00515434   0.00799167  -0.0892416    0.70151     -0.440833
- -0.00428028  -0.0172938    0.0570341   -0.440833     0.283267
+  0.0047832    -0.00117061   -0.000928193   0.0176422  -0.010818
+ -0.00117061    0.0209797    -0.000129508  -0.0320517   0.011307
+ -0.000928193  -0.000129508   0.0187343    -0.112272    0.0692776
+  0.0176422    -0.0320517    -0.112272      1.07897    -0.670279
+ -0.010818      0.011307      0.0692776    -0.670279    0.422261

Initializing from Pathfinder's draws

Here we just need to pass one of the draws as the initial point q:

result_dhmc1 = mcmc_with_warmup(
     Random.GLOBAL_RNG,
     ∇P,
     ndraws;
     initialization=(; q=init_params),
     reporter=NoProgressReport(),
-)
(posterior_matrix = [-0.2758483430657324 -0.2893555991876607 … -0.37227629978523785 -0.3736822563814165; 0.31279906133060686 0.39336339879549875 … 0.271963235723552 0.27608919719580766; … ; 2.1809397121524112 0.6999844721757637 … 0.5842955088857441 1.8260496793967949; -1.5722167409245562 -0.6838662776940696 … -0.1548026495712692 -0.7703779597891128], tree_statistics = DynamicHMC.TreeStatisticsNUTS[DynamicHMC.TreeStatisticsNUTS(-117.66725488232109, 6, turning at positions -32:31, 0.9983291009431183, 63, DynamicHMC.Directions(0xbc27959f)), DynamicHMC.TreeStatisticsNUTS(-118.83703125483953, 6, turning at positions -35:28, 0.9777449382939619, 63, DynamicHMC.Directions(0xd07c275c)), DynamicHMC.TreeStatisticsNUTS(-117.35774859313008, 6, turning at positions -24:39, 0.9981679732178887, 63, DynamicHMC.Directions(0xdf6ec2e7)), DynamicHMC.TreeStatisticsNUTS(-117.95148087539897, 6, turning at positions -30:33, 0.9993693337084574, 63, DynamicHMC.Directions(0x3fe72a21)), DynamicHMC.TreeStatisticsNUTS(-124.00673222308028, 5, turning at positions -24:7, 0.8104172296411057, 31, DynamicHMC.Directions(0x315b4727)), DynamicHMC.TreeStatisticsNUTS(-122.49865390925322, 5, turning at positions -27:4, 0.9991340230919505, 31, DynamicHMC.Directions(0xbc5ece44)), DynamicHMC.TreeStatisticsNUTS(-115.91470245218702, 6, turning at positions -37:26, 0.9630110228799411, 63, DynamicHMC.Directions(0x6c355f9a)), DynamicHMC.TreeStatisticsNUTS(-114.39996506023576, 6, turning at positions -59:4, 0.9979480921106051, 63, DynamicHMC.Directions(0x0ab27f44)), DynamicHMC.TreeStatisticsNUTS(-115.41146259913958, 5, turning at positions 0:31, 0.9841617186594932, 31, DynamicHMC.Directions(0x9258eeff)), DynamicHMC.TreeStatisticsNUTS(-116.26828121119772, 6, turning at positions 0:63, 0.9908824354639745, 63, DynamicHMC.Directions(0x7b5b407f))  …  DynamicHMC.TreeStatisticsNUTS(-114.99478723539855, 6, turning at positions -63:0, 0.9907022431104505, 63, DynamicHMC.Directions(0xef2aaa40)), 
DynamicHMC.TreeStatisticsNUTS(-116.97869290793183, 6, turning at positions -33:30, 0.7143602645686365, 63, DynamicHMC.Directions(0x8b77191e)), DynamicHMC.TreeStatisticsNUTS(-117.11186074123758, 6, turning at positions -48:15, 0.9737851940269915, 63, DynamicHMC.Directions(0x4da332cf)), DynamicHMC.TreeStatisticsNUTS(-116.65213028014398, 5, turning at positions -12:19, 0.985788223742232, 31, DynamicHMC.Directions(0xe9613733)), DynamicHMC.TreeStatisticsNUTS(-115.36237866813609, 2, turning at positions 1:4, 0.955490897307274, 7, DynamicHMC.Directions(0x4f520a0c)), DynamicHMC.TreeStatisticsNUTS(-118.5153762924302, 6, turning at positions -39:24, 0.9854234355029934, 63, DynamicHMC.Directions(0x70cec658)), DynamicHMC.TreeStatisticsNUTS(-114.58940759461872, 6, turning at positions -52:11, 0.9746921531995503, 63, DynamicHMC.Directions(0xc17210cb)), DynamicHMC.TreeStatisticsNUTS(-115.55715635279923, 6, turning at positions -29:34, 0.9074377963974533, 63, DynamicHMC.Directions(0xa15f60e2)), DynamicHMC.TreeStatisticsNUTS(-114.51994962110324, 6, turning at positions -3:60, 0.9939306867613937, 63, DynamicHMC.Directions(0x22920bbc)), DynamicHMC.TreeStatisticsNUTS(-117.22219257244788, 5, turning at positions -30:-33, 0.7625347575553412, 35, DynamicHMC.Directions(0x2695cf02))], κ = Gaussian kinetic energy (Diagonal), √diag(M⁻¹): [0.09617852377050368, 0.1467357966306328, 0.9427362465213873, 0.841927276752426, 0.5581416342226222], ϵ = 0.05204229470662355)

+)
(posterior_matrix = [-0.42919697100974286 -0.21139987374860927 … -0.22891457642505386 -0.33697543481854547; 0.48762605177160445 0.20262889511823914 … 0.2780834137705154 0.5447343569669959; … ; 0.9816611739784356 0.35854051771698725 … -0.29019017804493763 0.3642718591782569; -0.42761123506228177 0.016430712777248335 … 0.3864856201949099 -0.44487446738133996], tree_statistics = DynamicHMC.TreeStatisticsNUTS[DynamicHMC.TreeStatisticsNUTS(-117.41394406969577, 7, turning at positions -102:25, 0.996969674355359, 127, DynamicHMC.Directions(0x94ec7119)), DynamicHMC.TreeStatisticsNUTS(-116.38727364732652, 6, turning at positions -10:53, 0.9408602272997408, 63, DynamicHMC.Directions(0xd35d94b5)), DynamicHMC.TreeStatisticsNUTS(-122.34842074548729, 5, turning at positions 22:37, 0.9685683448308003, 47, DynamicHMC.Directions(0x658f88b5)), DynamicHMC.TreeStatisticsNUTS(-117.73587548128637, 6, turning at positions -8:55, 0.9205483370267166, 63, DynamicHMC.Directions(0x0933d177)), DynamicHMC.TreeStatisticsNUTS(-116.7826214110699, 6, turning at positions -45:-108, 0.9229544878924183, 127, DynamicHMC.Directions(0xeb352013)), DynamicHMC.TreeStatisticsNUTS(-117.46496917475748, 5, turning at positions -5:26, 0.9950471930429885, 31, DynamicHMC.Directions(0xc667fe3a)), DynamicHMC.TreeStatisticsNUTS(-121.8205179028736, 6, turning at positions -52:11, 0.5377305954120832, 63, DynamicHMC.Directions(0xd739974b)), DynamicHMC.TreeStatisticsNUTS(-114.4297516222513, 6, turning at positions -42:21, 0.9999415419313027, 63, DynamicHMC.Directions(0x25ed57d5)), DynamicHMC.TreeStatisticsNUTS(-117.97526690130654, 6, turning at positions 51:114, 0.9542847420596517, 127, DynamicHMC.Directions(0x831fc4f2)), DynamicHMC.TreeStatisticsNUTS(-118.14671669864885, 6, turning at positions -23:40, 0.9996859381468479, 63, DynamicHMC.Directions(0x9ea797a8))  …  DynamicHMC.TreeStatisticsNUTS(-116.28226876069697, 5, turning at positions -17:-24, 0.9941580906090818, 55, DynamicHMC.Directions(0x64d9c15f)), 
DynamicHMC.TreeStatisticsNUTS(-117.55520220821121, 6, turning at positions -16:47, 0.8627163765317972, 63, DynamicHMC.Directions(0x49afbf6f)), DynamicHMC.TreeStatisticsNUTS(-116.90846339831225, 5, turning at positions -50:-53, 0.9492362141772703, 59, DynamicHMC.Directions(0x68cd4bc6)), DynamicHMC.TreeStatisticsNUTS(-115.69232080039238, 7, turning at positions -55:72, 0.8534494973113602, 127, DynamicHMC.Directions(0x9146ca48)), DynamicHMC.TreeStatisticsNUTS(-120.41351488156272, 5, turning at positions 21:24, 0.9943538863976024, 55, DynamicHMC.Directions(0x9b63fc60)), DynamicHMC.TreeStatisticsNUTS(-117.75486776360326, 3, turning at positions -11:-14, 1.0, 15, DynamicHMC.Directions(0xb5c7fdb1)), DynamicHMC.TreeStatisticsNUTS(-117.88943724110086, 6, turning at positions -25:38, 0.9960354418105171, 63, DynamicHMC.Directions(0x5a6ce8e6)), DynamicHMC.TreeStatisticsNUTS(-115.2129750928417, 6, turning at positions -42:21, 0.9763375164470813, 63, DynamicHMC.Directions(0x900c5495)), DynamicHMC.TreeStatisticsNUTS(-116.1749007866109, 6, turning at positions 55:118, 0.9063254939971952, 127, DynamicHMC.Directions(0xf5b5a2f6)), DynamicHMC.TreeStatisticsNUTS(-120.17330342805336, 6, turning at positions -48:15, 0.9248222337081514, 63, DynamicHMC.Directions(0x75275acf))], κ = Gaussian kinetic energy (Diagonal), √diag(M⁻¹): [0.10455343664482653, 0.16510796593373156, 0.9239050246960616, 0.9815622112734754, 0.6972805974168755], ϵ = 0.03724657208882331)

Initializing metric adaptation from Pathfinder's estimate

To start with Pathfinder's inverse metric estimate, we just need to initialize a GaussianKineticEnergy object with it as input:

result_dhmc2 = mcmc_with_warmup(
     Random.GLOBAL_RNG,
     ∇P,
     ndraws;
     initialization=(; q=init_params, κ=GaussianKineticEnergy(inv_metric)),
     warmup_stages=default_warmup_stages(; M=Symmetric),
     reporter=NoProgressReport(),
-)
(posterior_matrix = [-0.32648878112876384 -0.4498067859073784 … -0.36253933637973246 -0.2942447143233157; 0.1789489916056201 0.2705448204425909 … 0.3067924203299595 0.12069966548599022; … ; 1.1951679257639007 -0.13642757907793546 … -1.9301625886255915 1.786347939406547; -0.5786945141617406 0.013197186437861663 … 0.7754402322031595 -1.005541126401132], tree_statistics = DynamicHMC.TreeStatisticsNUTS[DynamicHMC.TreeStatisticsNUTS(-116.70836299448789, 2, turning at positions -2:-5, 0.7301853041411208, 7, DynamicHMC.Directions(0x207aac42)), DynamicHMC.TreeStatisticsNUTS(-117.667763940461, 2, turning at positions -1:-4, 0.9023143210668639, 7, DynamicHMC.Directions(0xe93c3103)), DynamicHMC.TreeStatisticsNUTS(-121.63867308383959, 2, turning at positions 4:5, 0.7280634557080287, 7, DynamicHMC.Directions(0xe7270e6d)), DynamicHMC.TreeStatisticsNUTS(-123.10679167712954, 3, turning at positions -1:6, 0.7837276732237234, 7, DynamicHMC.Directions(0x5f1018ee)), DynamicHMC.TreeStatisticsNUTS(-119.99658317186748, 3, turning at positions -1:6, 0.9999999999999999, 7, DynamicHMC.Directions(0x7edde8ee)), DynamicHMC.TreeStatisticsNUTS(-116.74761696010485, 3, turning at positions -5:2, 0.9999999999999999, 7, DynamicHMC.Directions(0x24a3ed02)), DynamicHMC.TreeStatisticsNUTS(-115.76997523869682, 3, turning at positions -7:0, 0.9614756882882664, 7, DynamicHMC.Directions(0xe3258bd0)), DynamicHMC.TreeStatisticsNUTS(-116.71885589205205, 3, turning at positions 0:7, 0.9227570915349447, 7, DynamicHMC.Directions(0xb7928e0f)), DynamicHMC.TreeStatisticsNUTS(-119.53868695864278, 3, turning at positions -2:5, 0.947333613939426, 7, DynamicHMC.Directions(0xc8b216cd)), DynamicHMC.TreeStatisticsNUTS(-118.58913181581636, 2, turning at positions 0:3, 0.9810358527557205, 3, DynamicHMC.Directions(0xf854e9db))  …  DynamicHMC.TreeStatisticsNUTS(-116.23247473967454, 3, turning at positions -7:0, 0.972153105250679, 7, DynamicHMC.Directions(0xddea2d58)), DynamicHMC.TreeStatisticsNUTS(-116.75693992925254, 3, 
turning at positions -2:5, 0.8179952447787305, 7, DynamicHMC.Directions(0x6b1a053d)), DynamicHMC.TreeStatisticsNUTS(-115.31016319240376, 3, turning at positions 2:9, 0.9150281108627871, 15, DynamicHMC.Directions(0x2f68fbc9)), DynamicHMC.TreeStatisticsNUTS(-115.03079604985959, 3, turning at positions -5:2, 0.966953505236604, 7, DynamicHMC.Directions(0x1811d08a)), DynamicHMC.TreeStatisticsNUTS(-119.04301779347969, 3, turning at positions -7:0, 0.8141752871237031, 7, DynamicHMC.Directions(0xffc25420)), DynamicHMC.TreeStatisticsNUTS(-120.66561638733852, 3, turning at positions -7:-14, 0.8152733003536984, 15, DynamicHMC.Directions(0xd4254bc1)), DynamicHMC.TreeStatisticsNUTS(-118.22994577380338, 3, turning at positions -4:3, 0.993389555180804, 7, DynamicHMC.Directions(0xe02aca4b)), DynamicHMC.TreeStatisticsNUTS(-119.44027609408025, 3, turning at positions -1:6, 0.9149157662478588, 7, DynamicHMC.Directions(0xdc4bb1ee)), DynamicHMC.TreeStatisticsNUTS(-120.7510753218437, 3, turning at positions -1:6, 0.8915789365969319, 7, DynamicHMC.Directions(0xaff5b83e)), DynamicHMC.TreeStatisticsNUTS(-121.8043833441733, 3, turning at positions -4:3, 0.9439045716910872, 7, DynamicHMC.Directions(0x512426d3))], κ = Gaussian kinetic energy (Symmetric), √diag(M⁻¹): [0.09619301578778361, 0.15690946921549248, 0.993146819375534, 0.9035124315845696, 0.6267610944004433], ϵ = 0.4111885323414418)

+)
(posterior_matrix = [-0.31014276886825337 -0.3106878294599664 … -0.3730454801184784 -0.3533404536024512; 0.267717123829322 0.04660326476567415 … 0.414620009341598 0.37932939382376846; … ; -0.568646642074459 -1.0292760706849555 … -0.11134341163745792 -0.02827637419331233; 0.9611002844502203 0.7325931335649423 … 0.3119996940588185 0.27599663245071826], tree_statistics = DynamicHMC.TreeStatisticsNUTS[DynamicHMC.TreeStatisticsNUTS(-115.7756349008694, 3, turning at positions -7:0, 0.9999999999999999, 7, DynamicHMC.Directions(0x55266cd8)), DynamicHMC.TreeStatisticsNUTS(-115.90610868981345, 3, turning at positions -6:1, 0.9421034479296162, 7, DynamicHMC.Directions(0x450e1409)), DynamicHMC.TreeStatisticsNUTS(-117.48957192032064, 2, turning at positions -2:1, 0.6707089911316859, 3, DynamicHMC.Directions(0x9de7dd79)), DynamicHMC.TreeStatisticsNUTS(-120.98646703298404, 4, turning at positions -5:10, 0.981521795667835, 15, DynamicHMC.Directions(0xb2bd883a)), DynamicHMC.TreeStatisticsNUTS(-118.1772153607903, 2, turning at positions -3:0, 0.7999750870288872, 3, DynamicHMC.Directions(0xb2038924)), DynamicHMC.TreeStatisticsNUTS(-116.04538304166215, 2, turning at positions -3:-6, 0.7421969365056256, 7, DynamicHMC.Directions(0x7159f471)), DynamicHMC.TreeStatisticsNUTS(-117.3078094358584, 3, turning at positions 0:7, 0.98376858854086, 7, DynamicHMC.Directions(0xc8cd499f)), DynamicHMC.TreeStatisticsNUTS(-116.21112748208503, 3, turning at positions -2:5, 0.9361700699063016, 7, DynamicHMC.Directions(0x804b1a3d)), DynamicHMC.TreeStatisticsNUTS(-116.6627131682672, 3, turning at positions 7:14, 0.9524854418867338, 15, DynamicHMC.Directions(0x6c92645e)), DynamicHMC.TreeStatisticsNUTS(-116.23730157895373, 4, turning at positions -9:6, 0.9946496227184538, 15, DynamicHMC.Directions(0x1fee9526))  …  DynamicHMC.TreeStatisticsNUTS(-115.2486485726483, 3, turning at positions -6:1, 0.9360185875556345, 7, DynamicHMC.Directions(0x75080b21)), DynamicHMC.TreeStatisticsNUTS(-116.61287951930387, 3, 
turning at positions -1:6, 0.930339753460138, 7, DynamicHMC.Directions(0x610e2ca6)), DynamicHMC.TreeStatisticsNUTS(-116.21017508013456, 3, turning at positions -4:3, 0.9863437560699042, 7, DynamicHMC.Directions(0xfc34992b)), DynamicHMC.TreeStatisticsNUTS(-114.54254427477282, 3, turning at positions -2:-9, 0.8817877246875803, 15, DynamicHMC.Directions(0xca036cd6)), DynamicHMC.TreeStatisticsNUTS(-116.7009212820656, 3, turning at positions -4:-11, 0.8886135247324539, 15, DynamicHMC.Directions(0x26abcb44)), DynamicHMC.TreeStatisticsNUTS(-118.96741935047831, 2, turning at positions 4:7, 0.737620892185108, 7, DynamicHMC.Directions(0x31d22197)), DynamicHMC.TreeStatisticsNUTS(-116.64681022206888, 4, turning at positions -13:2, 0.9417505471018972, 15, DynamicHMC.Directions(0x17dcdae2)), DynamicHMC.TreeStatisticsNUTS(-113.87803518129803, 2, turning at positions -3:0, 0.9768712482851519, 3, DynamicHMC.Directions(0x8f5d82b8)), DynamicHMC.TreeStatisticsNUTS(-113.61032551791598, 3, turning at positions 0:7, 0.9969578182704871, 7, DynamicHMC.Directions(0x8ede1cd7)), DynamicHMC.TreeStatisticsNUTS(-115.34685088248375, 2, turning at positions -3:-6, 0.7695188241780929, 7, DynamicHMC.Directions(0xd01e9ad1))], κ = Gaussian kinetic energy (Symmetric), √diag(M⁻¹): [0.0965317466481926, 0.15847293750719074, 0.9664008766490653, 0.8736134921867713, 0.6113141649474361], ϵ = 0.36953154011916056)

We also specified that DynamicHMC should tune a dense Symmetric matrix. However, we could also have requested a Diagonal metric.
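As a sketch of the alternative (assuming the same `∇P`, `ndraws`, `init_params`, and `inv_metric` as above; `result_dhmc2_diag` is a hypothetical name), requesting a diagonal metric only requires changing the `M` keyword:

```julia
using LinearAlgebra: Diagonal

# Same call as above, but adapt a Diagonal metric (DynamicHMC's default)
# instead of a dense Symmetric one.
result_dhmc2_diag = mcmc_with_warmup(
    Random.GLOBAL_RNG,
    ∇P,
    ndraws;
    initialization=(; q=init_params, κ=GaussianKineticEnergy(inv_metric)),
    warmup_stages=default_warmup_stages(; M=Diagonal),
    reporter=NoProgressReport(),
)
```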

Use Pathfinder's metric estimate for sampling

To turn off metric adaptation entirely and use Pathfinder's estimate, we just set the number and size of the metric adaptation windows to 0.

result_dhmc3 = mcmc_with_warmup(
     Random.GLOBAL_RNG,
     ∇P,
     ndraws;
     initialization=(; q=init_params, κ=GaussianKineticEnergy(inv_metric)),
     warmup_stages=default_warmup_stages(; middle_steps=0, doubling_stages=0),
     reporter=NoProgressReport(),
-)
(posterior_matrix = [-0.3220999997719651 -0.3539428454128892 … -0.30121528354499405 -0.4041290268392466; 0.20922838146662068 0.27503336515201066 … 0.3321879640126638 0.1402967215906879; … ; -1.4596534441690088 0.49830920997529193 … -0.4195580015564275 -0.1108684729509048; 0.7874958291454363 -0.5349471228235019 … 0.4298572143808084 0.26806480905203345], tree_statistics = DynamicHMC.TreeStatisticsNUTS[DynamicHMC.TreeStatisticsNUTS(-116.08868723825647, 3, turning at positions 0:7, 0.9639151099664913, 7, DynamicHMC.Directions(0x739551f7)), DynamicHMC.TreeStatisticsNUTS(-115.61834900625175, 3, turning at positions -6:1, 0.9859740075959113, 7, DynamicHMC.Directions(0xaaafbb81)), DynamicHMC.TreeStatisticsNUTS(-117.0701364285515, 3, turning at positions -4:3, 0.7854735691209184, 7, DynamicHMC.Directions(0x92f0b80b)), DynamicHMC.TreeStatisticsNUTS(-120.92139131786823, 4, turning at positions 13:16, 0.8356562395419037, 19, DynamicHMC.Directions(0xd1ba64dc)), DynamicHMC.TreeStatisticsNUTS(-117.01121219875554, 5, turning at positions -56:-63, 0.9839987580066771, 63, DynamicHMC.Directions(0xb59a4cc0)), DynamicHMC.TreeStatisticsNUTS(-113.61636832717885, 3, turning at positions -1:6, 0.9843239993844046, 7, DynamicHMC.Directions(0x3cbbdd0e)), DynamicHMC.TreeStatisticsNUTS(-118.1064001496842, 5, turning at positions -24:-31, 0.9068905544707647, 55, DynamicHMC.Directions(0xdbfb21d8)), DynamicHMC.TreeStatisticsNUTS(-119.20844748257664, 3, turning at positions -1:6, 0.9769238655159613, 7, DynamicHMC.Directions(0x412e7b16)), DynamicHMC.TreeStatisticsNUTS(-117.49956392367964, 3, turning at positions -1:-8, 0.9916980256441906, 15, DynamicHMC.Directions(0xa0ff0dc7)), DynamicHMC.TreeStatisticsNUTS(-117.6369647715093, 3, turning at positions 0:7, 0.981386254331903, 7, DynamicHMC.Directions(0xe818f797))  …  DynamicHMC.TreeStatisticsNUTS(-118.25919129308764, 4, turning at positions -9:-16, 0.9353725519452714, 23, DynamicHMC.Directions(0x427b8067)), 
DynamicHMC.TreeStatisticsNUTS(-118.76351868444107, 5, turning at positions -13:-20, 0.9968995056924803, 47, DynamicHMC.Directions(0x10d9e19b)), DynamicHMC.TreeStatisticsNUTS(-116.97208589801134, 2, turning at positions -3:-6, 0.9530442805291315, 7, DynamicHMC.Directions(0x6172f4c1)), DynamicHMC.TreeStatisticsNUTS(-115.01592069581415, 3, turning at positions 0:7, 0.9405106205666193, 7, DynamicHMC.Directions(0x0d18df97)), DynamicHMC.TreeStatisticsNUTS(-114.17429417003753, 2, turning at positions 0:3, 0.9431001984970614, 3, DynamicHMC.Directions(0x427be733)), DynamicHMC.TreeStatisticsNUTS(-115.81278660512051, 2, turning at positions -3:0, 0.7880384318709096, 3, DynamicHMC.Directions(0x7df42a40)), DynamicHMC.TreeStatisticsNUTS(-116.69748819751109, 3, turning at positions 1:8, 0.9960526312879888, 15, DynamicHMC.Directions(0x7e8ddff8)), DynamicHMC.TreeStatisticsNUTS(-116.33728773819973, 4, turning at positions -10:-17, 0.9976433446532477, 31, DynamicHMC.Directions(0xf14966ae)), DynamicHMC.TreeStatisticsNUTS(-115.56938599776959, 4, turning at positions 15:18, 0.993730396814222, 31, DynamicHMC.Directions(0xf99a8bb2)), DynamicHMC.TreeStatisticsNUTS(-116.16448886228234, 2, turning at positions -2:1, 0.8615756698640805, 3, DynamicHMC.Directions(0x3a0ecd95))], κ = Gaussian kinetic energy (WoodburyPDMat), √diag(M⁻¹): [0.07611872383250201, 0.1879314990493204, 0.12991366130961352, 0.8375617870452757, 0.5322283343782884], ϵ = 0.5921487293357944)

+)
(posterior_matrix = [-0.4181625730398314 -0.23877134506937422 … -0.35614010602074525 -0.24722202784428687; 0.05571152811327862 0.619978254287339 … 0.20045614019028773 0.31306989981244226; … ; 0.5655661954890383 0.4215438738188121 … 0.6400758154365427 1.0100512173072085; 0.26902213235922845 0.045262270609987165 … -0.31061309450176394 -0.5660013259462144], tree_statistics = DynamicHMC.TreeStatisticsNUTS[DynamicHMC.TreeStatisticsNUTS(-118.53676632555118, 3, turning at positions -3:4, 0.9833313602551751, 7, DynamicHMC.Directions(0xbe9b702c)), DynamicHMC.TreeStatisticsNUTS(-120.8838229100884, 3, turning at positions -5:2, 0.7341930608698936, 7, DynamicHMC.Directions(0xd26a56a2)), DynamicHMC.TreeStatisticsNUTS(-119.95672868844748, 2, turning at positions 1:4, 0.9202879625885396, 7, DynamicHMC.Directions(0xfa36c5f4)), DynamicHMC.TreeStatisticsNUTS(-123.86640832029974, 3, turning at positions -2:5, 0.6493471024251117, 7, DynamicHMC.Directions(0x760a88cd)), DynamicHMC.TreeStatisticsNUTS(-124.29934923967515, 2, turning at positions -4:-7, 0.9720951305027359, 7, DynamicHMC.Directions(0x1dd13028)), DynamicHMC.TreeStatisticsNUTS(-119.1539650995537, 4, turning at positions 15:22, 0.9854062272332355, 23, DynamicHMC.Directions(0xda9cfc3e)), DynamicHMC.TreeStatisticsNUTS(-121.39213425234674, 3, turning at positions -7:0, 0.9454032875301186, 7, DynamicHMC.Directions(0xe15d7378)), DynamicHMC.TreeStatisticsNUTS(-119.96011164509231, 4, turning at positions 9:12, 0.9415326017234932, 19, DynamicHMC.Directions(0x06e1ac98)), DynamicHMC.TreeStatisticsNUTS(-121.0541668513089, 4, turning at positions -15:-18, 0.9918490575831471, 31, DynamicHMC.Directions(0xb1acee4d)), DynamicHMC.TreeStatisticsNUTS(-116.44757077188903, 3, turning at positions -1:6, 0.9122997019401838, 7, DynamicHMC.Directions(0x20e9e9ce))  …  DynamicHMC.TreeStatisticsNUTS(-121.56185800529579, 2, turning at positions 0:3, 0.9999999999999999, 3, DynamicHMC.Directions(0x6e800713)), 
DynamicHMC.TreeStatisticsNUTS(-117.27002583868729, 4, turning at positions 14:21, 0.9566276366622275, 31, DynamicHMC.Directions(0x57eaab35)), DynamicHMC.TreeStatisticsNUTS(-115.95077961112513, 3, turning at positions 0:7, 0.9766196003987198, 7, DynamicHMC.Directions(0xdbea3ebf)), DynamicHMC.TreeStatisticsNUTS(-116.18470511396829, 3, turning at positions -1:6, 0.9458181707424528, 7, DynamicHMC.Directions(0x63d16046)), DynamicHMC.TreeStatisticsNUTS(-115.20026995617256, 3, turning at positions -4:3, 0.9160195963698293, 7, DynamicHMC.Directions(0x2c472c0b)), DynamicHMC.TreeStatisticsNUTS(-115.04048344120976, 3, turning at positions -6:-13, 1.0, 15, DynamicHMC.Directions(0x3c99b0c2)), DynamicHMC.TreeStatisticsNUTS(-114.43740020033614, 2, turning at positions -1:2, 0.9999999999999999, 3, DynamicHMC.Directions(0xeeda9292)), DynamicHMC.TreeStatisticsNUTS(-114.00367254041109, 3, turning at positions -3:4, 0.9895162034571758, 7, DynamicHMC.Directions(0x1512902c)), DynamicHMC.TreeStatisticsNUTS(-114.65821370763774, 2, turning at positions -1:-4, 0.9146540642893474, 7, DynamicHMC.Directions(0xc7973343)), DynamicHMC.TreeStatisticsNUTS(-115.9492589091601, 3, turning at positions -4:3, 0.9256976182440564, 7, DynamicHMC.Directions(0xe43d4c2b))], κ = Gaussian kinetic energy (WoodburyPDMat), √diag(M⁻¹): [0.06916069970025875, 0.14484387478938207, 0.13687335049101199, 1.0387354108428228, 0.6498162221579102], ϵ = 0.6642437176866007)

AdvancedHMC.jl

Similar to Pathfinder and DynamicHMC, AdvancedHMC can also work with a LogDensityProblem:

using AdvancedHMC
 
 nadapts = 500;

Initializing from Pathfinder's draws

metric = DiagEuclideanMetric(dim)
 hamiltonian = Hamiltonian(metric, ∇P)
@@ -275,4 +275,4 @@
     nadapts;
     drop_warmup=true,
     progress=false,
-)
+)
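The sampler construction elided by the hunk above follows AdvancedHMC's usual pattern. The following is a sketch only, under the assumption that a NUTS kernel with Stan-style warmup is used (the exact settings are not shown in this diff):

```julia
# Sketch: NUTS with Stan-style adaptation, started from one of Pathfinder's draws.
# Assumes metric, hamiltonian, init_params, ndraws, and nadapts from above.
ϵ = find_good_stepsize(hamiltonian, init_params)
integrator = Leapfrog(ϵ)
kernel = HMCKernel(Trajectory{MultinomialTS}(integrator, GeneralisedNoUTurn()))
adaptor = StanHMCAdaptor(MassMatrixAdaptor(metric), StepSizeAdaptor(0.8, integrator))
samples, stats = sample(
    hamiltonian, kernel, init_params, ndraws, adaptor, nadapts;
    drop_warmup=true, progress=false,
)
```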
diff --git a/dev/examples/quickstart/index.html b/dev/examples/quickstart/index.html
index f4a96005..c5694ea9 100644
--- a/dev/examples/quickstart/index.html
+++ b/dev/examples/quickstart/index.html
@@ -33,15 +33,15 @@
   fit ELBO: 4.2 ± 0.16
   fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
 dim: 5
-μ: [-0.5493677556227341, 0.4900589068506337, -0.759985156515199, 0.24999020550772186, 0.9407722695354226]
-Σ: [2.150970010634484 0.44793188030504283 … 0.07865677145605805 0.35715917262291524; 0.44793188030504283 1.1053053241218844 … -0.16922520628877635 -0.14360616907571785; … ; 0.07865677145605805 -0.16922520628877635 … 0.10987234999062208 -0.19942469590579673; 0.35715917262291524 -0.14360616907571785 … -0.19942469590579673 7.815927657688505]
+μ: [-0.549367755630811, 0.49005890685159853, -0.759985156515975, 0.24999020550713552, 0.9407722695299942]
+Σ: [2.1509700197410884 0.4479318796411965 … 0.0786567719428354 0.3571591798224109; 0.4479318796411965 1.1053053239805566 … -0.16922520623211834 -0.14360617025821132; … ; 0.0786567719428354 -0.16922520623211834 … 0.10987234997574531 -0.19942469524921236; 0.3571591798224109 -0.14360617025821132 … -0.19942469524921236 7.815927661689837]
 )

result is a PathfinderResult. See its docstring for a description of its fields.

The L-BFGS optimizer constructs an approximation to the inverse Hessian of the negative log density using the limited history of previous points and gradients. For each iteration, Pathfinder uses this estimate as an approximation to the covariance matrix of a multivariate normal that approximates the target distribution. The distribution that maximizes the evidence lower bound (ELBO) is stored in result.fit_distribution. Its mean and covariance are quite close to our target distribution's.
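Since result.fit_distribution is an ordinary Distributions.MvNormal, the usual Distributions.jl API applies to it. For example (a sketch, assuming the `result` object from above is in scope):

```julia
using Distributions, Random

# Draw extra samples from the ELBO-maximizing normal approximation.
extra_draws = rand(Random.default_rng(), result.fit_distribution, 100)  # 5×100 matrix

# Evaluate its log density, e.g. at its own mean.
logpdf(result.fit_distribution, result.fit_distribution.μ)
```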

result.fit_distribution.μ
5-element Vector{Float64}:
- -0.5493677556227341
-  0.4900589068506337
- -0.759985156515199
-  0.24999020550772186
-  0.9407722695354226
+ -0.549367755630811
+  0.49005890685159853
+ -0.759985156515975
+  0.24999020550713552
+  0.9407722695299942
result.fit_distribution.Σ
5×5 Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}:
  2.15097     0.447932    0.176873    0.0786568   0.357159
  0.447932    1.10531    -0.0812456  -0.169225   -0.143606
  0.176873   -0.0812456   0.259695    0.070208   -0.716034
@@ -89,7 +89,7 @@
     result, logp_mvnormal_marginal, xrange, yrange, 20;
     xlabel="x₁", ylabel="x₂",
 )
-gif(anim; fps=5)

+gif(anim; fps=5)

A banana-shaped distribution

Now we will run Pathfinder on the following banana-shaped distribution with density

\[\pi(x_1, x_2) = e^{-x_1^2 / 2} e^{-5 (x_2 - x_1^2)^2 / 2}.\]

First we define the log density problem:

Random.seed!(23)
 
 struct BananaProblem end
 function LogDensityProblems.capabilities(::Type{<:BananaProblem})
@@ -106,84 +106,84 @@
 contour(xrange, yrange, exp ∘ logp_banana ∘ Base.vect; xlabel="x₁", ylabel="x₂")
[contour plot of the banana density; only the embedded SVG markup changed between builds]
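The `logp_banana` used above is the log of the banana density; its definition is partly elided by the diff hunk, but up to an additive constant it is simply (a sketch; `logp_banana_sketch` is a hypothetical name):

```julia
# Log of the banana density π(x₁, x₂) above, up to an additive constant.
logp_banana_sketch(x) = -x[1]^2 / 2 - 5 * (x[2] - x[1]^2)^2 / 2
```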

Now we run pathfinder.

result = pathfinder(prob_banana; init_scale=10)
Single-path Pathfinder result
   tries: 1
@@ -192,14 +192,14 @@
   fit ELBO: 0.95 ± 0.45
   fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
 dim: 2
μ: [0.015763420706665268, -0.01866987405376694]
Σ: [0.8977728310791517 0.2802043789334102; 0.2802043789334102 0.28037921907318625]
)
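The banana log density is simple enough to write down directly; here is a minimal Python sketch of the same unnormalized formula (the example itself is in Julia, and the function name here is only illustrative):

```python
def logp_banana(x1, x2):
    # Unnormalized log of pi(x1, x2) = exp(-x1^2 / 2) * exp(-5 (x2 - x1^2)^2 / 2)
    return -0.5 * x1**2 - 2.5 * (x2 - x1**2) ** 2

# The density ridge follows the parabola x2 = x1^2, which gives the banana shape.
```

Any single multivariate normal must compromise between the two arms of this ridge, which is why the fits below struggle.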
 

As before we can visualise each iteration of the algorithm.

anim = plot_pathfinder_trace(
    result, logp_banana, xrange, yrange, 20;
    xlabel="x₁", ylabel="x₂",
)
gif(anim; fps=5)

Since the distribution is far from normal, Pathfinder is unable to fit the distribution well. Especially for such complicated target distributions, it's always a good idea to run multipathfinder, which runs single-path Pathfinder multiple times.

ndraws = 1_000
 result = multipathfinder(prob_banana, ndraws; nruns=20, init_scale=10)
Multi-path Pathfinder result
   runs: 20
   draws: 1000
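Multi-path Pathfinder combines its runs by importance-resampling the draws from the component fits against the target density. A toy one-dimensional Python sketch of importance resampling (a simplification: Pathfinder's actual procedure uses Pareto-smoothed weights over the mixture of fits):

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: standard normal (unnormalized log density).
logp = lambda x: -0.5 * x**2
# Proposal "fit": normal with sd 2 (unnormalized log density).
logq = lambda x: -0.5 * (x / 2.0) ** 2

draws = rng.normal(0.0, 2.0, size=5000)
logw = logp(draws) - logq(draws)
w = np.exp(logw - logw.max())
w /= w.sum()                        # self-normalized importance weights

# Resample with replacement according to the weights; the resampled draws
# are approximately distributed as the target.
resampled = rng.choice(draws, size=1000, p=w)
```

Resampling can only reweight draws the proposals actually produced, which is why tails the fits miss stay under-covered.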
 plot!(xlims=extrema(xrange), ylims=extrema(yrange), xlabel="x₁", ylabel="x₂", legend=false)

While the draws do a poor job of covering the tails of the distribution, they are still useful for identifying the nonlinear correlation between these two parameters.

A 100-dimensional funnel

As we have seen above, running multi-path Pathfinder is much more useful for target distributions that are far from normal. One particularly difficult distribution to sample is Neal's funnel:

\[\begin{aligned} \tau &\sim \mathrm{Normal}(\mu=0, \sigma=3)\\ \beta_i &\sim \mathrm{Normal}(\mu=0, \sigma=e^{\tau/2}) \end{aligned}\]

prob_funnel = ADgradient(:ForwardDiff, FunnelProblem(100))
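The funnel's joint log density follows directly from these two sampling statements; here is a Python sketch of it, assuming the standard parameterization of Neal's funnel (illustrative, not the document's FunnelProblem):

```python
import math

def logp_funnel(tau, beta):
    # tau ~ Normal(0, 3); beta_i ~ Normal(0, exp(tau / 2)); additive
    # constants (-0.5 * log(2 * pi) per term) are dropped.
    lp = -0.5 * (tau / 3.0) ** 2 - math.log(3.0)
    sigma = math.exp(tau / 2.0)
    for b in beta:
        lp += -0.5 * (b / sigma) ** 2 - math.log(sigma)
    return lp

# As tau decreases, the conditional scale of each beta_i shrinks toward the
# "neck" of the funnel, which is what makes this geometry hard to sample.
```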

First, let's fit this posterior with single-path Pathfinder.

result_single = pathfinder(prob_funnel; init_scale=10)
Single-path Pathfinder result
   tries: 1
   draws: 5
  fit iteration: 7 (total: 837)
  fit ELBO: 80.79 ± 2.7
   fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
 dim: 100
μ: [-1.868838990931804, 0.024837021391949404, 0.11525244777276428, 0.058206714593271716, -0.12080271646426252, 0.09834085858411482, 0.018901202844480977, 0.11621438173510676, 0.0432064256863324, 0.05499296034710646  …  -0.03523316637564368, -0.008302726992459809, -0.03672464078623023, -0.05221660538143662, 0.11075031001428179, -0.01435866960768015, 0.0012089710319747784, 0.03667496715908613, -0.039439822750521304, -0.011929007852928865]
Σ: [0.01565720380276408 -0.00035250623327071935 … 0.0005597070504154974 0.00016931390358133945; -0.00035250623327071935 0.2564577094134916 … -5.0019460752579546e-5 -1.8023151920408084e-5; … ; 0.0005597070504154974 -5.0019460752579546e-5 … 0.25894700512055224 2.5570273572077715e-5; 0.00016931390358133945 -1.8023151920408084e-5 … 2.5570273572077715e-5 0.25518670933479515]
 )
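The fit covariance above is stored as a WoodburyPDMat: a diagonal matrix plus a low-rank update, which is what L-BFGS naturally produces and what keeps 100-dimensional solves cheap. A generic Python sketch of the Woodbury identity for a matrix D + BBᵀ (illustrative only, not Pathfinder's implementation, which also caches a factorization):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 4                       # dimension, rank of the update
d = rng.uniform(0.5, 2.0, size=n)   # diagonal entries of D
B = rng.normal(size=(n, k))         # low-rank factor
x = rng.normal(size=n)

# Direct solve with the dense matrix, O(n^3):
A = np.diag(d) + B @ B.T
y_direct = np.linalg.solve(A, x)

# Woodbury solve: (D + B B')^-1 x = D^-1 x - D^-1 B (I + B' D^-1 B)^-1 B' D^-1 x.
# Only a k-by-k system is ever factorized, O(n k^2):
Dinv_x = x / d
Dinv_B = B / d[:, None]
S = np.eye(k) + B.T @ Dinv_B        # k-by-k capacitance matrix
y_wood = Dinv_x - Dinv_B @ np.linalg.solve(S, B.T @ Dinv_x)
# y_wood agrees with the direct solve
```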
 

Let's visualize this sequence of multivariate normals for the first two dimensions.

β₁_range = -5:0.01:5
 τ_range = -15:0.01:5
anim = plot_pathfinder_trace(
    result_single, logp_funnel, τ_range, β₁_range, 15;
    show_elbo=true, xlabel="τ", ylabel="β₁",
)
gif(anim; fps=2)

For this challenging posterior, we can again see that most of the approximations are not great, because this distribution is not normal. Also, this distribution has a pole instead of a mode, so there is no MAP estimate, and no Laplace approximation exists. As optimization proceeds, the approximation goes from very bad to less bad and finally extremely bad. The ELBO-maximizing distribution is at the neck of the funnel, which would be a good location to initialize MCMC.

Now we run multipathfinder.

ndraws = 1_000
 result = multipathfinder(prob_funnel, ndraws; nruns=20, init_scale=10)
Multi-path Pathfinder result
   runs: 20
   draws: 1000
  Pareto shape diagnostic: 0.86 (bad)

Again, the poor Pareto shape diagnostic indicates we should run MCMC to get draws suitable for computing posterior estimates.

We can see that the bulk of Pathfinder's draws come from the neck of the funnel, where the fit from the single path we examined was located.

τ_approx = result.draws[1, :]
 β₁_approx = result.draws[2, :]
 
 contour(τ_range, β₁_range, exp ∘ logp_funnel ∘ Base.vect)
 plot!(; xlims=extrema(τ_range), ylims=extrema(β₁_range), xlabel="τ", ylabel="β₁", legend=false)
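The Pareto shape diagnostic is estimated from the tail of the importance weights of the draws: when the normal fit is lighter-tailed than the target, a few weights dominate and the shape estimate grows. A toy Python sketch of that failure mode, with a Cauchy target and a normal proposal standing in for the funnel and its fit:

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.standard_normal(20_000)      # draws from a light-tailed normal "fit"
log_target = -np.log1p(x**2)         # unnormalized Cauchy log density (heavy tails)
log_prop = -0.5 * x**2               # unnormalized normal log density
w = np.exp(log_target - log_prop)    # importance ratios

# A handful of extreme draws dominate the weights: the weight distribution
# has a heavy right tail, which is exactly what a large Pareto shape
# estimate detects. The effective sample size collapses accordingly.
ess = w.sum() ** 2 / (w**2).sum()
```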
fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
dim: 3
μ: [1.4756791589636236, 2.0060480974694594, -1.5519095930330966]
Σ: [0.001763074266651822 -0.00025292501945253554 -0.0001797379945779572; -0.00025292501945253554 5.069345304521976e-5 1.8086024595530317e-5; -0.0001797379945779572 1.8086024595530317e-5 0.0048990292501730134]
)
multipathfinder(fun.func, 1_000; dim, nruns=8)
Multi-path Pathfinder result
   runs: 8
   fit ELBO: -0.38 ± 0.1
   fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
 dim: 3
μ: [1.4756463647535056, 2.00605192573605, -1.551964549132786]
Σ: [0.002722911823156756 -0.0004064098114399603 0.00021035284868706146; -0.0004064098114399603 7.40877240303502e-5 -1.7048889064825907e-5; 0.00021035284868706146 -1.7048889064825907e-5 0.007139166137205458]
 )
 
result_multi = multipathfinder(model, 1_000; nruns=8)
Multi-path Pathfinder result
   runs: 8
   draws: 1000
  Pareto shape diagnostic: 0.36 (good)

Here, the Pareto shape diagnostic indicates that it is likely safe to use these draws to compute posterior estimates.

When passed a Model, Pathfinder also gives access to the posterior draws in a familiar MCMCChains.Chains object.

result_multi.draws_transformed
Chains MCMC chain (1000×3×1 Array{Float64, 3}):
 
 Iterations        = 1:1:1000
 Number of chains  = 1
 parameters        = α, β, σ
 
 Summary Statistics
  parameters      mean       std   naive_se      mcse        ess      rhat
      Symbol   Float64   Float64    Float64   Float64    Float64   Float64

           α    1.4775    0.0444     0.0014    0.0013   978.3001    1.0028
           β    2.0057    0.0072     0.0002    0.0002   946.8461    1.0010
           σ    0.2174    0.0147     0.0005    0.0005   991.4018    0.9991
 
 Quantiles
   parameters      2.5%     25.0%     50.0%     75.0%     97.5%
       Symbol   Float64   Float64   Float64   Float64   Float64
 
           α    1.3864    1.4510    1.4758    1.5101    1.5629
           β    1.9915    2.0009    2.0056    2.0104    2.0205
           σ    0.1914    0.2076    0.2172    0.2259    0.2504
 

We can also use these posterior draws to initialize MCMC sampling.

init_params = collect.(eachrow(result_multi.draws_transformed.value[1:4, :, 1]))
4-element Vector{Vector{Float64}}:
 [1.535410397404318, 1.991499991802661, 0.23699572783562703]
 [1.5124689845460457, 2.0062575407148313, 0.2104953568964579]
 [1.350661325614133, 2.025933683672095, 0.2244779584816331]
 [1.455048638174994, 2.0117139639447634, 0.212075640360078]
chns = sample(model, Turing.NUTS(), MCMCThreads(), 1_000, 4; init_params, progress=false)
Chains MCMC chain (1000×15×4 Array{Float64, 3}):
 
 Iterations        = 501:1:1500
 Number of chains  = 4
 Samples per chain = 1000
Wall duration     = 8.19 seconds
Compute duration  = 5.91 seconds
 parameters        = α, β, σ
 internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size
 
   parameters      mean       std   naive_se      mcse         ess      rhat    ⋯
       Symbol   Float64   Float64    Float64   Float64     Float64   Float64    ⋯
 
           α    1.4757    0.0431     0.0007    0.0011   1690.3217    0.9993    ⋯
           β    2.0060    0.0075     0.0001    0.0002   1866.8666    0.9993    ⋯
           σ    0.2168    0.0155     0.0002    0.0003   2500.6496    1.0012    ⋯
                                                                 1 column omitted
 
 Quantiles
   parameters      2.5%     25.0%     50.0%     75.0%     97.5%
       Symbol   Float64   Float64   Float64   Float64   Float64
 
           α    1.3925    1.4460    1.4755    1.5042    1.5601
           β    1.9910    2.0010    2.0060    2.0111    2.0206
           σ    0.1892    0.2059    0.2157    0.2267    0.2504
 

To use Pathfinder's estimate of the metric and skip warm-up, at the moment one needs to use AdvancedHMC directly.

ℓπ(x) = -fun.func.f(x, nothing)
 function ∂ℓπ∂θ(x)
     g = similar(x)
     nadapts;
     drop_warmup=true,
     progress=false,
)
([[1.4676402233577517, 2.0074038351557344, -1.4841937242140848], [1.4807946823073779, 2.0093123538946296, -1.5181587082519967], [1.415393196701997, 2.007056906013418, -1.4321505435894895], [1.493231958550808, 2.00341124900399, -1.5931804982180824], [1.4561374815832329, 2.011388144537265, -1.5185743201225377], [1.5218218898915163, 1.9997281004626335, -1.6081599902590913], [1.4865222773504163, 2.0060288523741434, -1.5960983644526323], [1.4645278977026965, 2.009341376402341, -1.6767083888554495], [1.512515685316033, 2.0027271841784025, -1.4073616456821285], [1.5036001911203791, 1.995789055325274, -1.5486660645040526]  …  [1.5129472987618144, 1.9931476408498332, -1.605188822644906], [1.4362468608990842, 2.01238639794947, -1.630789444029088], [1.5302979980280567, 1.995947091140596, -1.462321807636107], [1.4907390646514165, 2.0065112037496045, -1.5450615247120634], [1.490634870113431, 2.0010298354323384, -1.6301395083494823], [1.4646460455078356, 2.009722646886248, -1.552417987563257], [1.5300956489836317, 1.997405454393315, -1.6451578523810704], [1.436088289703882, 2.01085264592819, -1.591661426829158], [1.4824733020478118, 1.9996745881169993, -1.5531501748711314], [1.4809078303928631, 1.9994430010724284, -1.5688139887827495]], NamedTuple[(n_steps = 7, is_accept = true, acceptance_rate = 0.883600253830029, log_density = 7.761488891605511, hamiltonian_energy = -5.449857209110606, hamiltonian_energy_error = -0.004598836381440208, max_hamiltonian_energy_error = 0.2744620290551536, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.9392494156413083, log_density = 7.528433449761108, hamiltonian_energy = -6.681374538522578, hamiltonian_energy_error = 0.019214956520409388, max_hamiltonian_energy_error = 0.12379508840124753, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), 
(n_steps = 3, is_accept = true, acceptance_rate = 0.7120275508653812, log_density = 4.186245521102456, hamiltonian_energy = -3.6754079091725127, hamiltonian_energy_error = 0.438488030607314, max_hamiltonian_energy_error = 0.438488030607314, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 7, is_accept = true, acceptance_rate = 0.9605800748703299, log_density = 7.949650072506822, hamiltonian_energy = -3.1610190479831237, hamiltonian_energy_error = -0.5854110216058186, max_hamiltonian_energy_error = -0.5854110216058186, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 7, is_accept = true, acceptance_rate = 0.96554694407735, log_density = 7.802574934285351, hamiltonian_energy = -7.459218488786334, hamiltonian_energy_error = 0.03961194141551161, max_hamiltonian_energy_error = 0.0555739612847761, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.9505344253847955, log_density = 7.200063535130109, hamiltonian_energy = -6.840269168776417, hamiltonian_energy_error = 0.058406665731227214, max_hamiltonian_energy_error = 0.09613982298851642, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 11, is_accept = true, acceptance_rate = 0.9969004647515886, log_density = 7.876706043725011, hamiltonian_energy = -7.016030854399295, hamiltonian_energy_error = -0.09956255341367015, max_hamiltonian_energy_error = -0.11975268992651777, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.7818256896080591, log_density = 6.329867834048604, hamiltonian_energy = 
-5.638796993011745, hamiltonian_energy_error = 0.23821975937041362, max_hamiltonian_energy_error = 0.3838240200238836, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 7, is_accept = true, acceptance_rate = 0.8885254637118603, log_density = 5.861245671846356, hamiltonian_energy = -3.8157290926649043, hamiltonian_energy_error = 0.11037000607528213, max_hamiltonian_energy_error = 0.33485362654746265, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 11, is_accept = true, acceptance_rate = 0.9511796450153046, log_density = 6.607957774659404, hamiltonian_energy = -3.8377368477041816, hamiltonian_energy_error = -0.09436292398567758, max_hamiltonian_energy_error = -0.3038257291549553, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false)  …  (n_steps = 7, is_accept = true, acceptance_rate = 0.8916885801927464, log_density = 5.226404588072593, hamiltonian_energy = -4.277896352320278, hamiltonian_energy_error = 0.2690165699846876, max_hamiltonian_energy_error = 0.2690165699846876, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 11, is_accept = true, acceptance_rate = 1.0, log_density = 7.019780437862182, hamiltonian_energy = -4.819960784311828, hamiltonian_energy_error = -0.3677626374235512, max_hamiltonian_energy_error = -0.42035826865857473, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 7, is_accept = true, acceptance_rate = 0.9132036146804784, log_density = 6.612394319079511, hamiltonian_energy = -5.386581233294251, hamiltonian_energy_error = 0.10012500899977006, max_hamiltonian_energy_error = 0.21324112747079038, tree_depth = 3, 
numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 1.0, log_density = 7.878321485404268, hamiltonian_energy = -6.66661460787871, hamiltonian_energy_error = -0.2183710411307045, max_hamiltonian_energy_error = -0.2183710411307045, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.7797955547387869, log_density = 7.153461855847842, hamiltonian_energy = -4.549506622611551, hamiltonian_energy_error = -0.007392300189944301, max_hamiltonian_energy_error = 0.6187067956498566, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.9900488836858519, log_density = 8.030537772910673, hamiltonian_energy = -6.619381490352603, hamiltonian_energy_error = -0.11693593797947965, max_hamiltonian_energy_error = -0.11693593797947965, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 7, is_accept = true, acceptance_rate = 0.7160772739874185, log_density = 6.264624132178908, hamiltonian_energy = -3.1266014248405947, hamiltonian_energy_error = 0.12538246369640227, max_hamiltonian_energy_error = 0.7535328451430714, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 7, is_accept = true, acceptance_rate = 0.9481407681503587, log_density = 7.5173471057431875, hamiltonian_energy = -4.2587346571289055, hamiltonian_energy_error = -0.16005835781577904, max_hamiltonian_energy_error = 0.17225479155535695, tree_depth = 3, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 3, is_accept = true, 
acceptance_rate = 0.9513250028584923, log_density = 7.121690491698855, hamiltonian_energy = -6.636997266552586, hamiltonian_energy_error = 0.08947353718440709, max_hamiltonian_energy_error = 0.08947353718440709, tree_depth = 2, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false), (n_steps = 1, is_accept = true, acceptance_rate = 0.9528697555900233, log_density = 6.8601547814818105, hamiltonian_energy = -6.703189205681179, hamiltonian_energy_error = 0.04827705246423708, max_hamiltonian_energy_error = 0.04827705246423708, tree_depth = 1, numerical_error = false, step_size = 0.7264850453013639, nom_step_size = 0.7264850453013639, is_adapt = false)])
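Passing Pathfinder's covariance estimate to AdvancedHMC as the metric fixes the mass matrix M ≈ Σ⁻¹ of the kinetic energy. A minimal Python sketch of where a (here diagonal) mass matrix enters the leapfrog integrator, on a toy Gaussian target (illustrative only, not AdvancedHMC's implementation):

```python
import numpy as np

def leapfrog(q, p, grad_logp, eps, n_steps, m_diag):
    # Leapfrog with kinetic energy p' M^-1 p / 2, where M = diag(m_diag).
    q, p = q.copy(), p.copy()
    p += 0.5 * eps * grad_logp(q)
    for _ in range(n_steps - 1):
        q += eps * p / m_diag        # the metric enters here: dq/dt = M^-1 p
        p += eps * grad_logp(q)
    q += eps * p / m_diag
    p += 0.5 * eps * grad_logp(q)
    return q, p

# Toy target N(0, diag(sigma2)); a well-matched metric is M = Sigma^-1.
sigma2 = np.array([1.0, 100.0])
grad_logp = lambda q: -q / sigma2
m_diag = 1.0 / sigma2

def energy(q, p):
    return 0.5 * np.sum(q**2 / sigma2) + 0.5 * np.sum(p**2 / m_diag)

q0, p0 = np.array([1.0, 10.0]), np.array([0.5, 0.05])
q1, p1 = leapfrog(q0, p0, grad_logp, eps=0.1, n_steps=50, m_diag=m_diag)
# With the matched metric the dynamics is well conditioned, and the
# Hamiltonian is nearly conserved over the trajectory.
```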

Now we pack the samples into an MCMCChains.Chains:

samples_transformed = reduce(vcat, fun.transform.(samples)')
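The reduce(vcat, …') call stacks the per-draw parameter vectors into a draws × parameters matrix; the NumPy equivalent of that reshaping, with hypothetical values, looks like this:

```python
import numpy as np

# Each draw is a length-3 parameter vector, as in the example above.
samples = [np.array([1.47, 2.01, -1.48]),
           np.array([1.48, 2.01, -1.52]),
           np.array([1.42, 2.01, -1.43])]

# Stack row-wise: one row per draw, one column per parameter.
samples_mat = np.vstack(samples)
```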
false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false), (n_steps = 15, is_accept = true, acceptance_rate = 0.7183012209717858, log_density = 2.125536061715974, hamiltonian_energy = 0.8002028846063025, hamiltonian_energy_error = 0.5110369324654185, max_hamiltonian_energy_error = 0.7205762061592034, tree_depth = 3, numerical_error = false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 1.0, log_density = 7.9398939062048, hamiltonian_energy = -2.796630947823016, hamiltonian_energy_error = -0.7795285752002163, max_hamiltonian_energy_error = -0.7795285752002163, tree_depth = 2, numerical_error = false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.8971893464299524, log_density = 6.956243956522768, hamiltonian_energy = -6.740456103395156, hamiltonian_energy_error = 0.15638766340715105, max_hamiltonian_energy_error = 0.15638766340715105, tree_depth = 2, numerical_error = false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.8009452909366125, log_density = 7.9398838489398145, hamiltonian_energy = -4.520313626376483, hamiltonian_energy_error = -0.20572759349833447, max_hamiltonian_energy_error = 0.437393378055142, tree_depth = 2, numerical_error = false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false), (n_steps = 3, is_accept = true, acceptance_rate = 0.758449776862797, log_density = 7.608920626665104, hamiltonian_energy = -5.635998941369806, hamiltonian_energy_error = 0.11261034283846261, max_hamiltonian_energy_error = 0.4972558146558246, tree_depth = 2, numerical_error = false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false), (n_steps = 19, is_accept = true, acceptance_rate = 
0.9949662919820594, log_density = 7.8705920752571705, hamiltonian_energy = -7.297503627254557, hamiltonian_energy_error = -0.06226479993551948, max_hamiltonian_energy_error = -0.09578892650285908, tree_depth = 4, numerical_error = false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false), (n_steps = 11, is_accept = true, acceptance_rate = 0.954410598110778, log_density = 7.464713369734034, hamiltonian_energy = -7.375261478962942, hamiltonian_energy_error = 0.09882393074116447, max_hamiltonian_energy_error = 0.09882393074116447, tree_depth = 3, numerical_error = false, step_size = 0.7965357614972391, nom_step_size = 0.7965357614972391, is_adapt = false)])

Now we pack the samples into an MCMCChains.Chains object:

samples_transformed = reduce(vcat, fun.transform.(samples)')
varnames = Pathfinder.flattened_varnames_list(model)
chns = MCMCChains.Chains(samples_transformed, varnames)

See Initializing HMC with Pathfinder for further examples.


Pathfinder.jl: Parallel quasi-Newton variational inference

This package implements Pathfinder [Zhang2021], a variational method for initializing Markov chain Monte Carlo (MCMC) methods.

Single-path Pathfinder

Single-path Pathfinder (pathfinder) attempts to return draws in or near the typical set, usually with many fewer gradient evaluations than MCMC warm-up requires. Pathfinder uses the limited-memory BFGS (L-BFGS) optimizer to construct a maximum a posteriori (MAP) estimate of a target posterior distribution $p$. It then uses the trace of the optimization to construct a sequence of multivariate normal approximations to the target distribution, returning the approximation that maximizes the evidence lower bound (ELBO), or equivalently minimizes the Kullback-Leibler (KL) divergence from the target distribution, along with draws from that approximation.
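For example, a minimal sketch of a single-path run (StdNormalTarget is a hypothetical stand-in for a real model; the LogDensityProblemsAD and ForwardDiff packages are assumed to be installed):

```julia
using Pathfinder, LogDensityProblems, LogDensityProblemsAD, ForwardDiff

# Hypothetical 5-dimensional standard-normal target for illustration.
struct StdNormalTarget end
LogDensityProblems.logdensity(::StdNormalTarget, x) = -sum(abs2, x) / 2
LogDensityProblems.dimension(::StdNormalTarget) = 5
LogDensityProblems.capabilities(::Type{StdNormalTarget}) =
    LogDensityProblems.LogDensityOrder{0}()

∇P = ADgradient(:ForwardDiff, StdNormalTarget())  # add gradients via ForwardDiff
result = pathfinder(∇P; ndraws=100)
result.fit_distribution  # ELBO-maximizing MvNormal approximation
result.draws             # 5 × 100 matrix of approximate draws
```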

Multi-path Pathfinder

Multi-path Pathfinder (multipathfinder) runs Pathfinder multiple times. It returns a uniformly weighted mixture model of the multivariate normal approximations from the individual runs. It also uses importance resampling to return samples that better approximate the target distribution and to assess the quality of the approximation.
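A minimal sketch of a multi-path run (MixExampleTarget is a hypothetical stand-in for a real model; LogDensityProblemsAD and ForwardDiff are assumed to be installed):

```julia
using Pathfinder, LogDensityProblems, LogDensityProblemsAD, ForwardDiff

# Hypothetical 5-dimensional standard-normal target for illustration.
struct MixExampleTarget end
LogDensityProblems.logdensity(::MixExampleTarget, x) = -sum(abs2, x) / 2
LogDensityProblems.dimension(::MixExampleTarget) = 5
LogDensityProblems.capabilities(::Type{MixExampleTarget}) =
    LogDensityProblems.LogDensityOrder{0}()

∇P = ADgradient(:ForwardDiff, MixExampleTarget())
# 8 runs combined into a uniform mixture, importance-resampled to 1_000 draws
result = multipathfinder(∇P, 1_000; nruns=8)
result.fit_distribution  # MixtureModel with 8 MvNormal components
result.draws             # 5 × 1_000 matrix of resampled draws
```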

Uses

Using the Pathfinder draws

Folk theorem of statistical computing

When you have computational problems, often there’s a problem with your model.

Visualizing posterior draws is a common way to diagnose problems with a model. However, problematic models are often slow to warm up. Even if the draws returned by Pathfinder are only approximations to the posterior, they can still be used to diagnose basic issues such as highly correlated parameters, parameters with very different posterior variances, and multimodality.

Initializing MCMC

Pathfinder can be used to initialize MCMC. This is especially useful when sampling with Hamiltonian Monte Carlo (HMC). See Initializing HMC with Pathfinder for details.
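A hedged sketch of the handoff (StdNormal5 below is a hypothetical stand-in for a real model; how the point and metric are actually passed depends on the sampler):

```julia
using Pathfinder, LogDensityProblems, LogDensityProblemsAD, ForwardDiff

# Hypothetical target; replace with your own log-density problem.
struct StdNormal5 end
LogDensityProblems.logdensity(::StdNormal5, x) = -sum(abs2, x) / 2
LogDensityProblems.dimension(::StdNormal5) = 5
LogDensityProblems.capabilities(::Type{StdNormal5}) =
    LogDensityProblems.LogDensityOrder{0}()

result = pathfinder(ADgradient(:ForwardDiff, StdNormal5()); ndraws=1)
θ₀ = result.draws[:, 1]         # initial point in or near the typical set
Σ₀ = result.fit_distribution.Σ  # covariance estimate, a candidate
                                # (inverse) metric for HMC warm-up
```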

Integration with the Julia ecosystem

Pathfinder uses several packages for extended functionality:

  • Optimization.jl: allows the default L-BFGS optimizer to be replaced with any of the many Optimization-compatible optimizers that support callbacks. Note that any such non-default configuration of Pathfinder is experimental.
  • Transducers.jl: parallelization support
  • Distributions.jl/PDMats.jl: fits can be used anywhere a Distribution can be used
  • LogDensityProblems.jl: defining the log-density function, gradient, and Hessian
  • ProgressLogging.jl: In Pluto, Juno, and VSCode, nested progress bars are shown. In the REPL, use TerminalLoggers.jl to get progress bars.
[Zhang2021]: Lu Zhang, Bob Carpenter, Andrew Gelman, Aki Vehtari (2021). Pathfinder: Parallel quasi-Newton variational inference. arXiv: 2108.03782 [stat.ML]. Code
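For example, a minimal sketch of enabling REPL progress bars (assumes the TerminalLoggers package is installed):

```julia
using Logging, TerminalLoggers

# Route log messages (including ProgressLogging progress reports)
# to terminal progress bars.
global_logger(TerminalLogger())
```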

source
Pathfinder.lbfgs_inverse_hessiansMethod
lbfgs_inverse_hessians(
     θs, ∇logpθs; Hinit=gilbert_init, history_length=5, ϵ=1e-12
 ) -> Tuple{Vector{WoodburyPDMat},Int}

From an L-BFGS trajectory and gradients, compute the inverse Hessian approximations at each point.

Given positions θs with gradients ∇logpθs, construct LBFGS inverse Hessian approximations with the provided history_length.

The second returned value is the number of BFGS updates to the inverse Hessian matrices that were rejected in order to keep the inverse Hessian positive definite.

source
Pathfinder.pdfactorizeMethod
pdfactorize(A, B, D) -> WoodburyPDFactorization

Factorize the positive definite matrix $W = A + B D B^\mathrm{T}$.

The result is the "square root" factorization (L, R), where $W = L R$ and $L = R^\mathrm{T}$.

Let $U^\mathrm{T} U = A$ be the Cholesky decomposition of $A$, and let $Q X = U^{-\mathrm{T}} B$ be a thin QR decomposition. Define $C = I + XDX^\mathrm{T}$, with the Cholesky decomposition $V^\mathrm{T} V = C$. Then, $W = R^\mathrm{T} R$, where

\[R = \begin{pmatrix} V & 0 \\ 0 & I \end{pmatrix} Q^\mathrm{T} U.\]

The positive definite requirement is equivalent to the requirement that both $A$ and $C$ are positive definite.

For a derivation of this decomposition for the special case of diagonal $A$, see appendix A of [Zhang2021].
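The construction can be checked numerically with plain linear algebra; an illustrative sketch (independent of Pathfinder's own pdfactorize, using randomly generated A, B, D):

```julia
using LinearAlgebra

# Build a random positive definite W = A + B D Bᵀ with diagonal A.
n, k = 6, 2
A = Diagonal(rand(n) .+ 1.0)
B = randn(n, k)
D = Matrix(Diagonal(rand(k) .+ 1.0))

U = cholesky(A).U              # UᵀU = A
Fqr = qr(U' \ B)               # QX = U⁻ᵀB (thin QR)
Q = Fqr.Q * Matrix(I, n, n)    # expand to the full orthogonal Q
X = Fqr.R                      # k×k upper triangular
C = I + X * D * X'
V = cholesky(Symmetric(C)).U   # VᵀV = C

# R = blockdiag(V, I) Qᵀ U, so that W = RᵀR
R = [V zeros(k, n - k); zeros(n - k, k) Matrix(I, n - k, n - k)] * Q' * U
W = A + B * D * B'
maximum(abs, R' * R - W)  # ≈ 0 up to floating-point error
```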

See pdunfactorize, WoodburyPDFactorization, WoodburyPDMat

source
Pathfinder.pdunfactorizeMethod
pdunfactorize(F::WoodburyPDFactorization) -> (A, B, D)

Perform a reverse operation to pdfactorize.

Note that this function does not invert pdfactorize. It only computes matrices A, B, and D that produce the same matrix $W = A + B D B^\mathrm{T}$ as the original inputs to pdfactorize.

source
Pathfinder.resampleMethod
resample(rng, x, log_weights, ndraws) -> (draws, psis_result)
resample(rng, x, ndraws) -> draws

Draw ndraws samples from x, with replacement.

If log_weights is provided, perform Pareto smoothed importance resampling.

source

Public Documentation

Documentation for Pathfinder.jl's public interface.

See the Internals section for documentation of internal functions.

Index

Public Interface

Pathfinder.pathfinderFunction
pathfinder(ℓ; kwargs...)
pathfinder(fun::SciMLBase.OptimizationFunction; kwargs...)
pathfinder(prob::SciMLBase.OptimizationProblem; kwargs...)

Find the best multivariate normal approximation encountered while maximizing a log density.

From an optimization trajectory, Pathfinder constructs a sequence of (multivariate normal) approximations to the distribution specified by a log density function. The approximation that maximizes the evidence lower bound (ELBO), or equivalently, minimizes the KL divergence between the approximation and the true distribution, is returned.

The covariance of the multivariate normal distribution is an inverse Hessian approximation constructed using at most the previous history_length steps.

Arguments

  • ℓ: an object, representing the log-density of the target distribution and its gradient, that implements the LogDensityProblems interface.
  • fun::SciMLBase.OptimizationFunction: an optimization function that represents the negative log density with its gradient. It must have the necessary features (e.g. a Hessian function, if applicable) for the chosen optimization algorithm. For details, see Optimization.jl: OptimizationFunction.
  • prob::SciMLBase.OptimizationProblem: an optimization problem containing a function with the same properties as fun, as well as an initial point, in which case init and dim are ignored.

Keywords

  • dim::Int: dimension of the target distribution, needed only if fun is provided and init is not.
  • init::AbstractVector{<:Real}: initial point of length dim from which to begin optimization. If not provided, an initial point of type Vector{Float64} and length dim is created and filled using init_sampler.
  • init_scale::Real: scale factor $s$ such that the default init_sampler samples entries uniformly in the range $[-s, s]$.
  • init_sampler: function with the signature (rng, x) -> x that modifies a vector of length dim in place to generate an initial point.
  • ndraws_elbo::Int=5: Number of draws used to estimate the ELBO
  • ndraws::Int=ndraws_elbo: number of approximate draws to return
  • rng::Random.AbstractRNG: The random number generator to be used for drawing samples
  • executor::Transducers.Executor=Transducers.SequentialEx(): Transducers.jl executor that determines if and how to perform ELBO computation in parallel. The default (SequentialEx()) performs no parallelization. If rng is known to be thread-safe, and the log-density function is known to have no internal state, then Transducers.PreferParallel() may be used to parallelize log-density evaluation. This is generally only faster for expensive log density functions.
  • history_length::Int=6: Size of the history used to approximate the inverse Hessian.
  • optimizer: Optimizer to be used for constructing trajectory. Can be any optimizer compatible with Optimization.jl, so long as it supports callbacks. Defaults to Optim.LBFGS(; m=history_length, linesearch=LineSearches.MoreThuente()). See the Optimization.jl documentation for details.
  • ntries::Int=1_000: Number of times to try the optimization, restarting if it fails. Before every restart, a new initial point is drawn using init_sampler.
  • fail_on_nonfinite::Bool=true: If true, optimization fails if the log-density is NaN or Inf or if the gradient is ever non-finite. If ntries > 1, optimization will then be retried after reinitialization.
  • kwargs... : Remaining keywords are forwarded to Optimization.solve.

Returns

source
Pathfinder.PathfinderResultType
PathfinderResult

Container for results of single-path Pathfinder.

Fields

  • input: User-provided input object, e.g. a LogDensityProblem, optim_fun, optim_prob, or another object.
  • optimizer: Optimizer used for maximizing the log-density
  • rng: Pseudorandom number generator that was used for sampling
  • optim_prob::SciMLBase.OptimizationProblem: Optimization problem used for optimization
  • logp: Log-density function
  • fit_distribution::Distributions.MvNormal: ELBO-maximizing multivariate normal distribution
  • draws::AbstractMatrix{<:Real}: draws from multivariate normal with size (dim, ndraws)
  • fit_distribution_transformed: fit_distribution transformed to the same space as the user-supplied target distribution. This is only different from fit_distribution when integrating with other packages, and its type depends on the type of input.
  • draws_transformed: draws transformed to be draws from fit_distribution_transformed.
  • fit_iteration::Int: Iteration at which ELBO estimate was maximized
  • num_tries::Int: Number of tries until Pathfinder succeeded
  • optim_solution::SciMLBase.OptimizationSolution: Solution object of optimization.
  • optim_trace::Pathfinder.OptimizationTrace: container for optimization trace of points, log-density, and gradient. The first point is the initial point.
  • fit_distributions::AbstractVector{Distributions.MvNormal}: Multivariate normal distributions for each point in optim_trace, where fit_distributions[fit_iteration + 1] == fit_distribution
  • elbo_estimates::AbstractVector{<:Pathfinder.ELBOEstimate}: ELBO estimates for all but the first distribution in fit_distributions.
  • num_bfgs_updates_rejected::Int: Number of times a BFGS update to the reconstructed inverse Hessian was rejected to keep the inverse Hessian positive definite.

Returns

source
Pathfinder.multipathfinderFunction
multipathfinder(ℓ, ndraws; kwargs...)
multipathfinder(fun::SciMLBase.OptimizationFunction, ndraws; kwargs...)

Run pathfinder multiple times to fit a multivariate normal mixture model.

For nruns=length(init), nruns parallel runs of pathfinder produce nruns multivariate normal approximations $q_k = q(\phi | \mu_k, \Sigma_k)$ of the posterior. These are combined into a mixture model $q$ with uniform weights.

$q$ is augmented with the component index to generate random samples, that is, elements $(k, \phi)$ are drawn from the augmented mixture model

\[\tilde{q}(\phi, k | \mu, \Sigma) = K^{-1} q(\phi | \mu_k, \Sigma_k),\]

where $k$ is a component index, and $K=$ nruns. These draws are then resampled with replacement. Discarding $k$ from the samples would reproduce draws from $q$.

If importance=true, then Pareto smoothed importance resampling is used, so that the resulting draws better approximate draws from the target distribution $p$ instead of $q$. This also prints a warning message if the importance weighted draws are unsuitable for approximating expectations with respect to $p$.
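The mixture-then-resample scheme can be illustrated in miniature; a hedged sketch with hypothetical one-dimensional normal components and plain multinomial resampling (Pathfinder instead uses Pareto smoothed importance resampling via PSIS.jl):

```julia
using Random, Statistics

rng = MersenneTwister(42)
K, ndraws = 4, 1_000
μs = randn(rng, K)  # hypothetical component means

# Draw (k, ϕ) from the augmented uniform mixture q̃
ks = rand(rng, 1:K, ndraws)
ϕs = μs[ks] .+ randn(rng, ndraws)

# Importance log-weights log p(ϕ) - log q(ϕ) for a standard-normal target p
# (shared normalizing constants cancel in the ratio)
logq(ϕ) = log(mean(exp.(-(ϕ .- μs) .^ 2 ./ 2)))
logp(ϕ) = -ϕ^2 / 2
logw = logp.(ϕs) .- logq.(ϕs)
w = exp.(logw .- maximum(logw))
w ./= sum(w)

# Resample with replacement according to the weights
cw = cumsum(w)
idx = [min(searchsortedfirst(cw, rand(rng)), ndraws) for _ in 1:ndraws]
resampled = ϕs[idx]  # better approximates draws from the target p
```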

Arguments

  • ℓ: an object, representing the log-density of the target distribution and its gradient, that implements the LogDensityProblems interface.
  • fun::SciMLBase.OptimizationFunction: an optimization function that represents a negative log density with its gradient. It must have the necessary features (e.g. a Hessian function) for the chosen optimization algorithm. For details, see Optimization.jl: OptimizationFunction.
  • ndraws::Int: number of approximate draws to return

Keywords

  • init: iterator of length nruns of initial points of length dim from which each single-path Pathfinder run will begin. length(init) must be implemented. If init is not provided, nruns must be provided, and dim must also be provided if fun is used.
  • nruns::Int: number of runs of Pathfinder to perform. Ignored if init is provided.
  • ndraws_per_run::Int: The number of draws to take for each component before resampling. Defaults to a number such that ndraws_per_run * nruns > ndraws.
  • importance::Bool=true: Perform Pareto smoothed importance resampling of draws.
  • rng::AbstractRNG=Random.GLOBAL_RNG: Pseudorandom number generator. It is recommended to use a parallelization-friendly PRNG like the default PRNG on Julia 1.7 and up.
  • executor::Transducers.Executor=Transducers.SequentialEx(): Transducers.jl executor that determines if and how to run the single-path runs in parallel. If a transducer for multi-threaded computation is selected, you must first verify that rng and the log density function are thread-safe.
  • executor_per_run::Transducers.Executor=Transducers.SequentialEx(): Transducers.jl executor used within each run to parallelize PRNG calls. Defaults to no parallelization. See pathfinder for a description.
  • kwargs... : Remaining keywords are forwarded to pathfinder.

Returns

source
Pathfinder.MultiPathfinderResultType
MultiPathfinderResult

Container for results of multi-path Pathfinder.

Fields

  • input: User-provided input object, e.g. either logp, (logp, ∇logp), optim_fun, optim_prob, or another object.
  • optimizer: Optimizer used for maximizing the log-density
  • rng: Pseudorandom number generator that was used for sampling
  • optim_prob::SciMLBase.OptimizationProblem: Optimization problem used for optimization
  • logp: Log-density function
  • fit_distribution::Distributions.MixtureModel: uniformly-weighted mixture of ELBO-maximizing multivariate normal distributions from each run.
  • draws::AbstractMatrix{<:Real}: draws from fit_distribution with size (dim, ndraws), potentially resampled using importance resampling to be closer to the target distribution.
  • draw_component_ids::Vector{Int}: component id of each draw in draws.
  • fit_distribution_transformed: fit_distribution transformed to the same space as the user-supplied target distribution. This is only different from fit_distribution when integrating with other packages, and its type depends on the type of input.
  • draws_transformed: draws transformed to be draws from fit_distribution_transformed.
  • pathfinder_results::Vector{<:PathfinderResult}: results of each single-path Pathfinder run.
  • psis_result::Union{Nothing,<:PSIS.PSISResult}: If importance resampling was used, the result of Pareto-smoothed importance resampling. psis_result.pareto_shape also diagnoses whether draws can be used to compute estimates from the target distribution. See PSIS.PSISResult for details
source