Supporting Population Samplers (implemented DE-MCMC) #2735

Merged on Dec 5, 2017 (60 commits)
Commits
b95af1f
catching other error types that may occur when gradients are not avai…
michaelosthege Oct 18, 2017
168c3f5
Merge branch 'master' of https://github.com/pymc-devs/pymc3
michaelosthege Nov 9, 2017
671b387
Merge branch 'master' of https://github.com/pymc-devs/pymc3
michaelosthege Nov 14, 2017
48ef144
Merge branch 'master' of https://github.com/pymc-devs/pymc3
michaelosthege Nov 16, 2017
c0ce6ec
Merge branch 'master' of https://github.com/pymc-devs/pymc3
michaelosthege Nov 21, 2017
fd16ba5
added a benchmark example for correlated dimensions
michaelosthege Nov 21, 2017
8ec0b80
marking Metropolis as COMPATIBLE for all types, added line breaks (co…
michaelosthege Nov 21, 2017
1d04f53
specifying single job to force the sample_many function
michaelosthege Nov 21, 2017
2bd6a5a
modified sampling procedure to iterate chains in parallel (instead o…
michaelosthege Nov 21, 2017
df37cbb
updated description
michaelosthege Nov 21, 2017
ba7d083
indexing samplers by chain number instead of chain id
michaelosthege Nov 22, 2017
a4894bf
print transposes result table
michaelosthege Nov 22, 2017
cd14d6a
created PopulationArrayStepShared base class that allows the individu…
michaelosthege Nov 23, 2017
dbc42bc
modified sampling loop to account for the PopulationArrayStepShared s…
michaelosthege Nov 23, 2017
0ca3804
added the DEMetropolis sampler
michaelosthege Nov 23, 2017
2206b45
raising an error when the population is too small
michaelosthege Nov 23, 2017
6e63f49
verbose debug logging
michaelosthege Nov 23, 2017
9395de3
removed debug print
michaelosthege Nov 23, 2017
6105dec
forcing CompoundStep type
michaelosthege Nov 23, 2017
0494bc0
formatting
michaelosthege Nov 23, 2017
e25ce19
setting DEMetropolis as a blocked step method
michaelosthege Nov 23, 2017
2841442
measuring the runtime, example with both 2D z-variable and two 1D x,y…
michaelosthege Nov 23, 2017
fd6a1ef
changed the initialization order such that variable transforms are ap…
michaelosthege Nov 24, 2017
cf4ee2f
fixed a bug caused by start=None
michaelosthege Nov 24, 2017
6733ae1
fixes a bug in computing lambda
michaelosthege Nov 27, 2017
368f4aa
using a Uniform proposal with low initial scale
michaelosthege Nov 27, 2017
abcb12f
renamed local variable
michaelosthege Nov 27, 2017
acf1311
logging the crossover and scaling
michaelosthege Nov 28, 2017
3b6a2d9
fixed a bug that caused step methods to not be copied
michaelosthege Nov 28, 2017
27e8263
smarter multiprocessing
michaelosthege Nov 28, 2017
d83c5f4
automatic multiprocessing decision, reporting relative sampling rates
michaelosthege Nov 28, 2017
fc4e1d0
print format
michaelosthege Nov 28, 2017
569731b
inheriting PopulationArraySharedStep from ArrayStepShared, using a bi…
michaelosthege Nov 28, 2017
8de3977
printing the number of effective samples per variable
michaelosthege Nov 28, 2017
67e07a4
docstrings and comments
michaelosthege Nov 28, 2017
5769b9d
falling back to sequential sampling if no population samplers are used
michaelosthege Nov 28, 2017
5f6c29c
removed debugging stats logging
michaelosthege Nov 28, 2017
b53b510
fixed nested if else
michaelosthege Nov 28, 2017
4af2aa5
updated print statement
michaelosthege Nov 28, 2017
e5f7ff2
fixed a bug related to bijection updating
michaelosthege Nov 29, 2017
874e6b2
docstring and comments
michaelosthege Nov 29, 2017
eba4a9d
refactoring for better clarity and less diff
michaelosthege Nov 29, 2017
f71a37c
code style
michaelosthege Nov 29, 2017
63ae017
removed unused import
michaelosthege Nov 29, 2017
e20295a
fixed a bug where Slice was preferred on multidimensional variables
michaelosthege Nov 29, 2017
acc7538
printing the stepper hierarchy, fixed a variable name, handling non-C…
michaelosthege Nov 29, 2017
c3c233c
fixed a bug where DEMetropolis assigned itself to discrete vars, fixe…
michaelosthege Nov 29, 2017
01adad8
improved code style, including Slice in comparison
michaelosthege Nov 29, 2017
44b1115
including DEMetropolis in existing tests, added test case for Populat…
michaelosthege Nov 30, 2017
9a07a43
fixes python 2.7 compatibility
michaelosthege Nov 30, 2017
e2cfbbb
Using multiprocessing to parallelize iteration of chain populations (…
michaelosthege Nov 30, 2017
30f437d
added references
michaelosthege Dec 1, 2017
4998005
added a warning that DEMetropolis is experimental
michaelosthege Dec 1, 2017
107a618
forgotten space
michaelosthege Dec 1, 2017
b94906a
modified the PopulationStepper to automatically use parallelization o…
michaelosthege Dec 1, 2017
58fa336
avoiding a reimport, disabled chain parallelization by default
michaelosthege Dec 1, 2017
3bcf0fd
increased nchains
michaelosthege Dec 1, 2017
c919742
resolving conflicts
michaelosthege Dec 4, 2017
24b4d63
resolving conflicts
michaelosthege Dec 4, 2017
7db836a
included DEMetropolis in new features
michaelosthege Dec 5, 2017
3 changes: 3 additions & 0 deletions RELEASE-NOTES.md
@@ -5,6 +5,9 @@
- Improve NUTS initialization `advi+adapt_diag_grad` and add `jitter+adapt_diag_grad` (#2643)
- Fixed `compareplot` to use `loo` output.

### New Features
- Michael Osthege added support for population samplers and implemented Differential Evolution Metropolis (`DEMetropolis`). For models with correlated dimensions that cannot use gradient-based samplers, the `DEMetropolis` sampler can give higher effective sampling rates. (also see [PR#2735](https://github.com/pymc-devs/pymc3/pull/2735))
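
For readers unfamiliar with the method named in this entry, the following is a minimal, standard-library-only sketch of a differential evolution Metropolis update. It is not the code added by this PR. The function names `de_metropolis` and `logp_bivariate_normal`, the parameters `scale` and `seed`, and the choice of `gamma = 2.38 / sqrt(2 * n_dim)` (the scaling suggested by ter Braak, 2006) are assumptions made for illustration. Each chain proposes a jump along the difference of two other randomly chosen chains, which lets the population adapt to correlated dimensions without gradients.

```python
import math
import random


def logp_bivariate_normal(point, rho):
    """Unnormalized log-density of a standard bivariate normal with correlation rho."""
    x, y = point
    return -(x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho * rho))


def de_metropolis(logp, n_chains, n_draws, n_dim, scale=1e-3, seed=0):
    """Illustrative DE-MCMC: all chains advance together, one draw at a time."""
    if n_chains < 3:
        # each proposal needs two *other* chains to form a difference vector
        raise ValueError("population too small: need at least 3 chains")
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2.0 * n_dim)  # assumed scaling (ter Braak, 2006)
    population = [[rng.gauss(0.0, 1.0) for _ in range(n_dim)]
                  for _ in range(n_chains)]
    logps = [logp(point) for point in population]
    draws = [[] for _ in range(n_chains)]

    for _ in range(n_draws):
        for i in range(n_chains):
            others = [j for j in range(n_chains) if j != i]
            r1, r2 = rng.sample(others, 2)
            # jump along the difference of two other chains plus small uniform noise
            proposal = [
                population[i][d]
                + gamma * (population[r1][d] - population[r2][d])
                + rng.uniform(-scale, scale)
                for d in range(n_dim)
            ]
            logp_new = logp(proposal)
            # standard Metropolis acceptance; 1 - random() lies in (0, 1]
            if math.log(1.0 - rng.random()) < logp_new - logps[i]:
                population[i], logps[i] = proposal, logp_new
            draws[i].append(list(population[i]))
    return draws


if __name__ == "__main__":
    chains = de_metropolis(lambda pt: logp_bivariate_normal(pt, 0.9),
                           n_chains=8, n_draws=1000, n_dim=2)
    print(len(chains), "chains with", len(chains[0]), "draws each")
```

Because the proposal for one chain depends on the current state of the others, the chains cannot be sampled independently one after another; they have to be stepped in lockstep, which is the reason the PR changes the sampling loop.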


## PyMC3 3.2 (October 10, 2017)

100 changes: 100 additions & 0 deletions pymc3/examples/samplers_mvnormal.py
@@ -0,0 +1,100 @@
"""
Comparing different samplers on a correlated bivariate normal distribution.

This example will sample a bivariate normal with Metropolis, Slice, NUTS and
DEMetropolis at two correlations (0, 0.9) and print out the effective sample
sizes, runtime and normalized effective sampling rates.
"""


import numpy as np
import time
import pandas as pd
import pymc3 as pm
import theano.tensor as tt

# With this flag one can switch between defining the bivariate normal as
# either a 2D MvNormal (USE_XY = False) or splitting the two dimensions into
# two variables 'x' and 'y' (USE_XY = True). The latter is recommended because
# it highlights different behaviour with respect to blocking.
USE_XY = True

def run(steppers, p):
    steppers = set(steppers)
    traces = {}
    effn = {}
    runtimes = {}

    with pm.Model() as model:
        if USE_XY:
            x = pm.Flat('x')
            y = pm.Flat('y')
            mu = np.array([0.,0.])
            cov = np.array([[1.,p],[p,1.]])
            z = pm.MvNormal.dist(mu=mu, cov=cov, shape=(2,)).logp(tt.stack([x,y]))
            pot = pm.Potential('logp_xy', z)
            start = {'x': 0, 'y': 0}
        else:
            mu = np.array([0.,0.])
            cov = np.array([[1.,p],[p,1.]])
            z = pm.MvNormal('z', mu=mu, cov=cov, shape=(2,))
            start={'z': [0, 0]}

        for step_cls in steppers:
            name = step_cls.__name__
            t_start = time.time()
            mt = pm.sample(
                draws=10000,
                chains=16, parallelize=False,
                step=step_cls(),
                start=start
            )
            runtimes[name] = time.time() - t_start
            print('{} samples across {} chains'.format(len(mt) * mt.nchains, mt.nchains))
            traces[name] = mt
            en = pm.diagnostics.effective_n(mt)
            print('effective: {}\r\n'.format(en))
            if USE_XY:
                effn[name] = np.mean(en['x']) / len(mt) / mt.nchains
            else:
                effn[name] = np.mean(en['z']) / len(mt) / mt.nchains
    return traces, effn, runtimes


if __name__ == '__main__':
    methods = [
        pm.Metropolis,
        pm.Slice,
        pm.NUTS,
        pm.DEMetropolis
    ]
    names = [c.__name__ for c in methods]

    df_base = pd.DataFrame(columns=['p'] + names)
    df_base['p'] = [.0,.9]
    df_base = df_base.set_index('p')

    df_effectiven = df_base.copy()
    df_runtime = df_base.copy()
    df_performance = df_base.copy()

    for p in df_effectiven.index:
        trace, rate, runtime = run(methods, p)
        for name in names:
            # DataFrame.at is used instead of the deprecated DataFrame.set_value
            df_effectiven.at[p, name] = rate[name]
            df_runtime.at[p, name] = runtime[name]
            df_performance.at[p, name] = rate[name] / runtime[name]

    print('\r\nEffective sample size [0...1]')
    print(df_effectiven.T.to_string(float_format='{:.3f}'.format))

    print('\r\nRuntime [s]')
    print(df_runtime.T.to_string(float_format='{:.1f}'.format))

    if 'NUTS' in names:
        # normalize all rates to the NUTS rate at p = 0
        print('\r\nNormalized effective sampling rate [0...1]')
        df_performance = df_performance.T / df_performance.loc[0]['NUTS']
    else:
        print('\r\nNormalized effective sampling rate [1/s]')
        df_performance = df_performance.T
    print(df_performance.to_string(float_format='{:.3f}'.format))
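
The arithmetic behind the three printed tables can be followed in isolation. The sketch below is not part of the PR: the helper names `effective_fraction` and `normalized_rates` are hypothetical and all numbers are made up. It mirrors the script's logic: the effective sample size is divided by the total number of draws, that fraction is divided by the runtime, and each rate is expressed relative to a reference sampler. Unlike the script, which normalizes every row by the NUTS rate at p = 0, this sketch normalizes within a single table.

```python
def effective_fraction(ess, n_draws, n_chains):
    """Effective sample size as a fraction of all draws across all chains."""
    return ess / float(n_draws * n_chains)


def normalized_rates(fractions, runtimes, reference):
    """Effective fraction per second, relative to the reference sampler."""
    rates = {name: fractions[name] / runtimes[name] for name in fractions}
    return {name: rate / rates[reference] for name, rate in rates.items()}


if __name__ == "__main__":
    # made-up numbers, for illustration only
    fractions = {
        "NUTS": effective_fraction(80000.0, 10000, 16),
        "Metropolis": effective_fraction(16000.0, 10000, 16),
    }
    runtimes = {"NUTS": 20.0, "Metropolis": 10.0}
    for name, value in sorted(normalized_rates(fractions, runtimes, "NUTS").items()):
        print("{}: {:.3f}".format(name, value))
```

Dividing by the runtime matters because a sampler with a low effective fraction can still come out ahead if each of its draws is cheap.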