diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 503f0fb6..b67f2cf2 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-11-10T09:53:02","documenter_version":"1.7.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-11-10T09:59:54","documenter_version":"1.7.0"}} \ No newline at end of file diff --git a/dev/index.html b/dev/index.html index ab4deea1..cf805089 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,3 +1,3 @@ Home · RestrictedBoltzmannMachines.jl

RestrictedBoltzmannMachines.jl Documentation

A Julia package to train and simulate Restricted Boltzmann Machines. The package is registered. Install it with:

import Pkg
-Pkg.add("RestrictedBoltzmannMachines")

The source code is hosted on GitHub.

https://github.com/cossio/RestrictedBoltzmannMachines.jl

This package doesn't export any symbols. It can be imported like this:

import RestrictedBoltzmannMachines as RBMs

to avoid typing the long name every time.

Most functions have a helpful docstring; see the Reference section.

See also the Examples in the menu on the left sidebar to understand how the package works as a whole.

+Pkg.add("RestrictedBoltzmannMachines")

The source code is hosted on GitHub.

https://github.com/cossio/RestrictedBoltzmannMachines.jl

This package doesn't export any symbols. It can be imported like this:

import RestrictedBoltzmannMachines as RBMs

to avoid typing the long name every time.

Most functions have a helpful docstring; see the Reference section.

See also the Examples in the menu on the left sidebar to understand how the package works as a whole.

diff --git a/dev/literate/MNIST/1e8e1eed.png b/dev/literate/MNIST/1e8e1eed.png deleted file mode 100644 index 540f38a4..00000000 Binary files a/dev/literate/MNIST/1e8e1eed.png and /dev/null differ diff --git a/dev/literate/MNIST/3e7f5320.png b/dev/literate/MNIST/3e7f5320.png deleted file mode 100644 index dadb38b0..00000000 Binary files a/dev/literate/MNIST/3e7f5320.png and /dev/null differ diff --git a/dev/literate/MNIST/73d0743f.png b/dev/literate/MNIST/73d0743f.png new file mode 100644 index 00000000..4bd3c9c6 Binary files /dev/null and b/dev/literate/MNIST/73d0743f.png differ diff --git a/dev/literate/MNIST/9cb3df5e.png b/dev/literate/MNIST/9cb3df5e.png new file mode 100644 index 00000000..32cd9112 Binary files /dev/null and b/dev/literate/MNIST/9cb3df5e.png differ diff --git a/dev/literate/MNIST/d0ac7736.png b/dev/literate/MNIST/d0ac7736.png deleted file mode 100644 index dedf3420..00000000 Binary files a/dev/literate/MNIST/d0ac7736.png and /dev/null differ diff --git a/dev/literate/MNIST/d521a019.png b/dev/literate/MNIST/d521a019.png new file mode 100644 index 00000000..67732109 Binary files /dev/null and b/dev/literate/MNIST/d521a019.png differ diff --git a/dev/literate/MNIST/fa269099.png b/dev/literate/MNIST/fa269099.png deleted file mode 100644 index 54a65dd2..00000000 Binary files a/dev/literate/MNIST/fa269099.png and /dev/null differ diff --git a/dev/literate/MNIST/fd4fb255.png b/dev/literate/MNIST/fd4fb255.png new file mode 100644 index 00000000..a2de6e0e Binary files /dev/null and b/dev/literate/MNIST/fd4fb255.png differ diff --git a/dev/literate/MNIST/index.html b/dev/literate/MNIST/index.html index d05e0135..6fac227a 100644 --- a/dev/literate/MNIST/index.html +++ b/dev/literate/MNIST/index.html @@ -26,9 +26,9 @@ Makie.image!(ax, imggrid(digits), colorrange=(1,0)) Makie.hidedecorations!(ax) Makie.hidespines!(ax) -figExample block output

Initialize an RBM with 400 hidden units.

rbm = BinaryRBM(Float, (28,28), 400)
-initialize!(rbm, train_x) # match single-site statistics

Initially, the RBM assigns a poor pseudolikelihood to the data.

println("log(PL) = ", mean(@time log_pseudolikelihood(rbm, train_x)))
  1.974835 seconds (8.58 M allocations: 484.612 MiB, 4.35% gc time, 89.79% compilation time)
-log(PL) = -0.2544163

Now we train the RBM on the data.

batchsize = 256
+fig
Example block output

Initialize an RBM with 400 hidden units.

rbm = BinaryRBM(Float, (28,28), 400)
+initialize!(rbm, train_x) # match single-site statistics

Initially, the RBM assigns a poor pseudolikelihood to the data.

println("log(PL) = ", mean(@time log_pseudolikelihood(rbm, train_x)))
  1.956182 seconds (8.58 M allocations: 484.734 MiB, 5.10% gc time, 90.16% compilation time)
+log(PL) = -0.2590893

Now we train the RBM on the data.

batchsize = 256
 iters = 10000
 history = MVHistory()
 @time pcd!(
@@ -39,25 +39,25 @@
             @trace history iter lpl
         end
     end
-)
268.708819 seconds (20.83 M allocations: 205.582 GiB, 21.09% gc time, 6.22% compilation time)

After training, the pseudolikelihood score of the data improves significantly. Plot of the log-pseudolikelihood of the train data during learning.

fig = Makie.Figure(resolution=(500,300))
+)
268.390142 seconds (20.85 M allocations: 205.583 GiB, 20.59% gc time, 6.25% compilation time)

After training, the pseudolikelihood score of the data improves significantly. Plot of the log-pseudolikelihood of the train data during learning.

fig = Makie.Figure(resolution=(500,300))
 ax = Makie.Axis(fig[1,1], xlabel = "train time", ylabel="pseudolikelihood")
 Makie.lines!(ax, get(history, :lpl)...)
-fig
Example block output

Sample digits from the RBM starting from a random condition.

nsteps = 3000
+fig
Example block output

Sample digits from the RBM starting from a random condition.

nsteps = 3000
 fantasy_F = zeros(nrows*ncols, nsteps)
 fantasy_x = bitrand(28,28,nrows*ncols)
 fantasy_F[:,1] .= free_energy(rbm, fantasy_x)
 @time for t in 2:nsteps
     fantasy_x .= sample_v_from_v(rbm, fantasy_x)
     fantasy_F[:,t] .= free_energy(rbm, fantasy_x)
-end
 17.973418 seconds (244.20 k allocations: 11.988 GiB, 2.05% gc time, 1.14% compilation time)

Check equilibration of sampling

fig = Makie.Figure(resolution=(400,300))
+end
 19.657277 seconds (244.20 k allocations: 11.988 GiB, 2.68% gc time, 1.06% compilation time)

Check equilibration of sampling

fig = Makie.Figure(resolution=(400,300))
 ax = Makie.Axis(fig[1,1], xlabel="sampling time", ylabel="free energy")
 fantasy_F_μ = vec(mean(fantasy_F; dims=1))
 fantasy_F_σ = vec(std(fantasy_F; dims=1))
 Makie.band!(ax, 1:nsteps, fantasy_F_μ - fantasy_F_σ/2, fantasy_F_μ + fantasy_F_σ/2)
 Makie.lines!(ax, 1:nsteps, fantasy_F_μ)
-fig
Example block output

Plot the sampled digits.

fig = Makie.Figure(resolution=(40ncols, 40nrows))
+fig
Example block output

Plot the sampled digits.

fig = Makie.Figure(resolution=(40ncols, 40nrows))
 ax = Makie.Axis(fig[1,1], yreversed=true)
 Makie.image!(ax, imggrid(reshape(fantasy_x, 28, 28, ncols, nrows)), colorrange=(1,0))
 Makie.hidedecorations!(ax)
 Makie.hidespines!(ax)
-fig
Example block output

This page was generated using Literate.jl.

+figExample block output

This page was generated using Literate.jl.

diff --git a/dev/literate/ais/02d616e3.png b/dev/literate/ais/02d616e3.png new file mode 100644 index 00000000..0237d1df Binary files /dev/null and b/dev/literate/ais/02d616e3.png differ diff --git a/dev/literate/ais/3c5c2a46.png b/dev/literate/ais/3c5c2a46.png deleted file mode 100644 index e395683b..00000000 Binary files a/dev/literate/ais/3c5c2a46.png and /dev/null differ diff --git a/dev/literate/ais/index.html b/dev/literate/ais/index.html index 5b8717e0..66dd5162 100644 --- a/dev/literate/ais/index.html +++ b/dev/literate/ais/index.html @@ -11,7 +11,7 @@ train_x = Array{Float}(train_x[:, :, train_y .== 0] .> 0.5)
┌ Warning: MNIST.traindata() is deprecated, use `MNIST(split=:train)[:]` instead.
 └ @ MLDatasets ~/.julia/packages/MLDatasets/0MkOE/src/datasets/vision/mnist.jl:187

Train an RBM

rbm = BinaryRBM(Float, (28,28), 128)
 initialize!(rbm, train_x)
-@time pcd!(rbm, train_x; iters=10000, batchsize=128)
 73.000979 seconds (2.54 M allocations: 69.159 GiB, 21.74% gc time, 0.46% compilation time)

Get some equilibrated samples from model

v = train_x[:, :, rand(1:size(train_x, 3), 1000)]
+@time pcd!(rbm, train_x; iters=10000, batchsize=128)
 72.364014 seconds (2.48 M allocations: 69.156 GiB, 21.84% gc time, 0.38% compilation time)

Get some equilibrated samples from model

v = train_x[:, :, rand(1:size(train_x, 3), 1000)]
 v = sample_v_from_v(rbm, v; steps=1000)

Estimate Z with AIS and reverse AIS.

nsamples=100
 ndists = [10, 100, 1000, 10_000, 100_000]
 R_ais = Vector{Float64}[]
@@ -25,16 +25,16 @@
     push!(R_rev,
         @time raise(rbm; nbetas, init, v=v[:,:,rand(1:size(v, 3), nsamples)])
     )
-end
  5.279541 seconds (9.03 M allocations: 518.537 MiB, 0.87% gc time, 99.27% compilation time)
-  0.331629 seconds (416.61 k allocations: 81.735 MiB, 8.02% gc time, 81.80% compilation time)
-  0.828958 seconds (12.82 k allocations: 703.543 MiB, 43.30% gc time)
-  0.396869 seconds (12.80 k allocations: 731.346 MiB, 5.96% gc time)
-  4.699458 seconds (129.82 k allocations: 6.979 GiB, 4.75% gc time)
-  4.089396 seconds (129.81 k allocations: 7.269 GiB, 5.49% gc time)
- 46.905996 seconds (1.30 M allocations: 69.899 GiB, 4.72% gc time)
- 41.013590 seconds (1.30 M allocations: 72.818 GiB, 5.49% gc time)
-467.561514 seconds (13.00 M allocations: 699.103 GiB, 4.73% gc time)
-409.161304 seconds (13.00 M allocations: 728.305 GiB, 5.46% gc time)

Plots

fig = Makie.Figure()
+end
  5.520453 seconds (9.03 M allocations: 518.459 MiB, 0.84% gc time, 99.32% compilation time)
+  0.343873 seconds (416.61 k allocations: 81.735 MiB, 11.18% gc time, 78.94% compilation time)
+  0.567918 seconds (12.82 k allocations: 703.543 MiB, 4.21% gc time)
+  0.407625 seconds (12.80 k allocations: 731.346 MiB, 6.80% gc time)
+  4.692093 seconds (129.82 k allocations: 6.979 GiB, 5.30% gc time)
+  4.174864 seconds (129.81 k allocations: 7.269 GiB, 6.19% gc time)
+ 47.945588 seconds (1.30 M allocations: 69.899 GiB, 5.31% gc time)
+ 42.443840 seconds (1.30 M allocations: 72.818 GiB, 6.10% gc time)
+479.548087 seconds (13.00 M allocations: 699.103 GiB, 5.48% gc time)
+422.967367 seconds (13.00 M allocations: 728.305 GiB, 6.20% gc time)

Plots

fig = Makie.Figure()
 ax = Makie.Axis(
     fig[1,1], width=700, height=400, xscale=log10, xlabel="interpolating distributions", ylabel="log(Z)"
 )
@@ -59,4 +59,4 @@
 Makie.xlims!(extrema(ndists)...)
 Makie.axislegend(ax, position=:rb)
 Makie.resize_to_layout!(fig)
-fig
Example block output

This page was generated using Literate.jl.

+figExample block output

This page was generated using Literate.jl.

diff --git a/dev/literate/layers/Gaussian/795ba454.png b/dev/literate/layers/Gaussian/795ba454.png new file mode 100644 index 00000000..650d6e3f Binary files /dev/null and b/dev/literate/layers/Gaussian/795ba454.png differ diff --git a/dev/literate/layers/Gaussian/dcd25d7b.png b/dev/literate/layers/Gaussian/dcd25d7b.png deleted file mode 100644 index e0095581..00000000 Binary files a/dev/literate/layers/Gaussian/dcd25d7b.png and /dev/null differ diff --git a/dev/literate/layers/Gaussian/index.html b/dev/literate/layers/Gaussian/index.html index 20cfc428..427c2c9d 100644 --- a/dev/literate/layers/Gaussian/index.html +++ b/dev/literate/layers/Gaussian/index.html @@ -11,4 +11,4 @@ lines!(xs[iθ, iγ, :], ps[iθ, iγ, :], linewidth=2) end axislegend(ax) -figExample block output

This page was generated using Literate.jl.

+figExample block output

This page was generated using Literate.jl.

diff --git a/dev/literate/layers/ReLU/51146dcc.png b/dev/literate/layers/ReLU/51146dcc.png new file mode 100644 index 00000000..e1ed7e89 Binary files /dev/null and b/dev/literate/layers/ReLU/51146dcc.png differ diff --git a/dev/literate/layers/ReLU/90b9b714.png b/dev/literate/layers/ReLU/90b9b714.png deleted file mode 100644 index 39ef493f..00000000 Binary files a/dev/literate/layers/ReLU/90b9b714.png and /dev/null differ diff --git a/dev/literate/layers/ReLU/index.html b/dev/literate/layers/ReLU/index.html index 2a3da809..b4d08c4a 100644 --- a/dev/literate/layers/ReLU/index.html +++ b/dev/literate/layers/ReLU/index.html @@ -11,4 +11,4 @@ lines!(xs[iθ, iγ, :], ps[iθ, iγ, :], linewidth=2) end axislegend(ax) -figExample block output

This page was generated using Literate.jl.

+figExample block output

This page was generated using Literate.jl.

diff --git a/dev/literate/layers/dReLU/272dcce9.png b/dev/literate/layers/dReLU/272dcce9.png new file mode 100644 index 00000000..b7b9f893 Binary files /dev/null and b/dev/literate/layers/dReLU/272dcce9.png differ diff --git a/dev/literate/layers/dReLU/aaf691cc.png b/dev/literate/layers/dReLU/aaf691cc.png deleted file mode 100644 index dcf0edfc..00000000 Binary files a/dev/literate/layers/dReLU/aaf691cc.png and /dev/null differ diff --git a/dev/literate/layers/dReLU/index.html b/dev/literate/layers/dReLU/index.html index 7f22b788..f71e6083 100644 --- a/dev/literate/layers/dReLU/index.html +++ b/dev/literate/layers/dReLU/index.html @@ -24,4 +24,4 @@ Makie.axislegend(ax) end end -figExample block output

This page was generated using Literate.jl.

+figExample block output

This page was generated using Literate.jl.

diff --git a/dev/literate/metropolis/9a960ca8.png b/dev/literate/metropolis/9a960ca8.png new file mode 100644 index 00000000..d54dcc99 Binary files /dev/null and b/dev/literate/metropolis/9a960ca8.png differ diff --git a/dev/literate/metropolis/e58bc247.png b/dev/literate/metropolis/e58bc247.png deleted file mode 100644 index f7b39dc2..00000000 Binary files a/dev/literate/metropolis/e58bc247.png and /dev/null differ diff --git a/dev/literate/metropolis/index.html b/dev/literate/metropolis/index.html index 8bdb9bd2..09d33251 100644 --- a/dev/literate/metropolis/index.html +++ b/dev/literate/metropolis/index.html @@ -28,4 +28,4 @@ Makie.scatter!(ax, log.([get(freqs, v, 0.0) for v in 𝒱]), -β * ℱ .- logsumexp(-β * ℱ)) Makie.scatter!(ax, log.([get(freqs, v, 0.0) for v in 𝒱]), -ℱ .- logsumexp(-ℱ)) Makie.abline!(ax, 0, 1, color=:red) -figExample block output

Correlation

cor([get(freqs, v, 0.0) for v in 𝒱], exp.(-β * ℱ .- logsumexp(-β * ℱ)))
0.9999026041008862

This page was generated using Literate.jl.

+figExample block output

Correlation

cor([get(freqs, v, 0.0) for v in 𝒱], exp.(-β * ℱ .- logsumexp(-β * ℱ)))
0.9998856148596774

This page was generated using Literate.jl.

diff --git a/dev/reference/index.html b/dev/reference/index.html index 9a20d0ca..0e682a54 100644 --- a/dev/reference/index.html +++ b/dev/reference/index.html @@ -1,7 +1,7 @@ -Reference · RestrictedBoltzmannMachines.jl

Reference

RestrictedBoltzmannMachines.PottsType
Potts(θ)

Layer with Potts units, with external fields θ. Encodes categorical variables as one-hot vectors. The number of classes is the size of the first dimension.

source
RestrictedBoltzmannMachines.SpinType
Spin(θ)

Layer with spin units, with external fields θ. The energy of a layer with units $s_i$ is given by:

\[E = -\sum_i \theta_i s_i\]

where each spin $s_i$ takes values $\pm 1$.

source
RestrictedBoltzmannMachines.BinaryRBMMethod
BinaryRBM(a, b, w)
-BinaryRBM(N, M)

Construct an RBM with binary visible and hidden units, which has an energy function:

\[E(v, h) = -a'v - b'h - v'wh\]

Equivalent to RBM(Binary(a), Binary(b), w).

source
RestrictedBoltzmannMachines.HopfieldRBMMethod
HopfieldRBM(g, θ, γ, w)
-HopfieldRBM(g, w)

Construct an RBM with spin visible units and Gaussian hidden units. If not given, θ = 0 and γ = 1 by default.

\[E(v, h) = -g'v - θ'h + \sum_\mu \frac{γ_\mu}{2} h_\mu^2 - v'wh\]

source
RestrictedBoltzmannMachines.aisMethod
ais(rbm0, rbm1, v0, βs)

Provided v0 is an equilibrated sample from rbm0, returns F such that mean(exp.(F)) is an unbiased estimator of Z1/Z0, the ratio of partition functions of rbm1 and rbm0.

!!! tip Use logmeanexp logmeanexp(F), using the logmeanexp function provided in this package, tends to give a better approximation of log(Z1) - log(Z0) than mean(F).

source
RestrictedBoltzmannMachines.aiseMethod
aise(rbm, [βs]; [nbetas], init=rbm.visible, nsamples=1)

AIS estimator of the log-partition function of rbm. It is recommended to fit init to the single-site statistics of rbm (or the data).

!!! tip Use large nbetas For more accurate estimates, use larger nbetas. It is usually better to have large nbetas and small nsamples, rather than large nsamples and small nbetas.

source
RestrictedBoltzmannMachines.annealMethod
anneal(rbm0, rbm1; β)

Returns an RBM that interpolates between rbm0 and rbm1. Denoting by E0(v, h) and E1(v, h) the energies assigned by rbm0 and rbm1, respectively, the returned RBM assigns energies given by:

E(v,h) = (1 - β) * E0(v, h) + β * E1(v, h)
source
RestrictedBoltzmannMachines.block_matrix_invertMethod
block_matrix_invert(A, B, C, D)

Inversion of a block matrix, using the formula:

\[\begin{bmatrix} +Reference · RestrictedBoltzmannMachines.jl

Reference

RestrictedBoltzmannMachines.PottsType
Potts(θ)

Layer with Potts units, with external fields θ. Encodes categorical variables as one-hot vectors. The number of classes is the size of the first dimension.

source
RestrictedBoltzmannMachines.SpinType
Spin(θ)

Layer with spin units, with external fields θ. The energy of a layer with units $s_i$ is given by:

\[E = -\sum_i \theta_i s_i\]

where each spin $s_i$ takes values $\pm 1$.

source
RestrictedBoltzmannMachines.BinaryRBMMethod
BinaryRBM(a, b, w)
+BinaryRBM(N, M)

Construct an RBM with binary visible and hidden units, which has an energy function:

\[E(v, h) = -a'v - b'h - v'wh\]

Equivalent to RBM(Binary(a), Binary(b), w).

source
RestrictedBoltzmannMachines.HopfieldRBMMethod
HopfieldRBM(g, θ, γ, w)
+HopfieldRBM(g, w)

Construct an RBM with spin visible units and Gaussian hidden units. If not given, θ = 0 and γ = 1 by default.

\[E(v, h) = -g'v - θ'h + \sum_\mu \frac{γ_\mu}{2} h_\mu^2 - v'wh\]

source
RestrictedBoltzmannMachines.aisMethod
ais(rbm0, rbm1, v0, βs)

Provided v0 is an equilibrated sample from rbm0, returns F such that mean(exp.(F)) is an unbiased estimator of Z1/Z0, the ratio of partition functions of rbm1 and rbm0.

!!! tip Use logmeanexp logmeanexp(F), using the logmeanexp function provided in this package, tends to give a better approximation of log(Z1) - log(Z0) than mean(F).

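Why logmeanexp is preferred over mean can be illustrated outside the package. Below is a minimal, language-agnostic sketch in Python (not the package's Julia implementation; the helper name logmeanexp mirrors the one documented here) showing the numerically stable log-mean-exp that avoids overflow when the AIS log-weights F are large:

```python
import math

def logmeanexp(F):
    """Numerically stable log(mean(exp(F))).

    Subtracting the maximum before exponentiating keeps every
    exp() argument <= 0, so the sum never overflows even when
    the log-weights span many orders of magnitude.
    """
    m = max(F)
    return m + math.log(sum(math.exp(f - m) for f in F) / len(F))

# Log-weights near 1000 would overflow a naive exp() directly.
F = [1000.0, 1000.5, 999.5]
print(logmeanexp(F))  # finite, close to 1000
```

This is the standard log-sum-exp shift; mean(F) instead estimates the geometric mean of the weights, which biases the log-partition estimate downward.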
source
RestrictedBoltzmannMachines.aiseMethod
aise(rbm, [βs]; [nbetas], init=rbm.visible, nsamples=1)

AIS estimator of the log-partition function of rbm. It is recommended to fit init to the single-site statistics of rbm (or the data).

!!! tip Use large nbetas For more accurate estimates, use larger nbetas. It is usually better to have large nbetas and small nsamples, rather than large nsamples and small nbetas.

source
RestrictedBoltzmannMachines.annealMethod
anneal(rbm0, rbm1; β)

Returns an RBM that interpolates between rbm0 and rbm1. Denoting by E0(v, h) and E1(v, h) the energies assigned by rbm0 and rbm1, respectively, the returned RBM assigns energies given by:

E(v,h) = (1 - β) * E0(v, h) + β * E1(v, h)
source
RestrictedBoltzmannMachines.block_matrix_invertMethod
block_matrix_invert(A, B, C, D)

Inversion of a block matrix, using the formula:

\[\begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix}^{-1} @@ -13,11 +13,11 @@ \begin{bmatrix} \mathbf{I} & -\mathbf{B} \mathbf{D}^{-1} \\ -\mathbf{C} \mathbf{A}^{-1} & \mathbf{I} -\end{bmatrix}\]

Assumes that A and D are square and invertible.

source
RestrictedBoltzmannMachines.block_matrix_logdetMethod
block_matrix_logdet(A, B, C, D)

Log-determinant of a block matrix using the determinant lemma.

\[\det\left( \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix} \right) = \det(A) \det(D - CA^{-1}B) -= \det(D) \det(A - BD^{-1}C)\]

Here we assume that A and D are invertible, and moreover are easy to invert (for example, if they are diagonal). We use this to choose one or the other of the two formulas above.

source
RestrictedBoltzmannMachines.categorical_sampleMethod
categorical_sample(P)

Given a probability array P of size (q, *), returns an array C of size (*), such that C[i] ∈ 1:q is a random sample from the categorical distribution P[:,i]. You must ensure that P defines a proper probability distribution.

source
RestrictedBoltzmannMachines.collect_statesMethod
collect_states(layer)

Returns an array of all states of layer. Only defined for discrete layers.

Warning

Use only for small layers. For large layers, the exponential number of states will not fit in memory.

source
RestrictedBoltzmannMachines.initialize!Function
initialize!(rbm, [data]; ϵ = 1e-6)

Initializes the RBM and returns it. If provided, matches average visible unit activities from data.

initialize!(layer, [data]; ϵ = 1e-6)

Initializes a layer and returns it. If provided, matches average unit activities from data.

source
RestrictedBoltzmannMachines.log_likelihoodMethod
log_likelihood(rbm, v)

Log-likelihood of v under rbm, with the partition function computed by extensive enumeration. For discrete layers, this is exponentially slow for large machines.

source
RestrictedBoltzmannMachines.log_partitionMethod
log_partition(rbm)

Log-partition of rbm, computed by extensive enumeration of visible states (except for particular cases such as Gaussian-Gaussian RBM). This is exponentially slow for large machines.

If your RBM has a smaller hidden layer, consider mirroring the layers of the rbm first (see mirror).

source
RestrictedBoltzmannMachines.log_pseudolikelihoodMethod
log_pseudolikelihood(rbm, v; exact = false)

Log-pseudolikelihood of v. If exact is true, the exact pseudolikelihood is returned. But this is slow if v consists of many samples. Therefore by default exact is false, in which case the result is a stochastic approximation, where a random site is selected for each sample and its conditional probability is calculated. On average, the results with exact = false coincide with the deterministic result, and the estimate becomes more precise as the number of samples increases.

source
RestrictedBoltzmannMachines.log_pseudolikelihood_sitesMethod
log_pseudolikelihood_sites(rbm, v, sites)

Log-pseudolikelihood of a site conditioned on the other sites, where sites is an array of site indices (CartesianIndex), one for each sample. Returns an array of log-pseudolikelihood values, one for each sample.

source
RestrictedBoltzmannMachines.log_pseudolikelihood_stochMethod
log_pseudolikelihood_stoch(rbm, v)

Log-pseudolikelihood of v. This function computes a stochastic approximation, by tracing over a random site for each sample. For a large number of samples, this is on average close to the exact value of the pseudolikelihood.

source
RestrictedBoltzmannMachines.metropolis!Method
metropolis!(v, rbm; β = 1)

Metropolis-Hastings sampling from rbm at inverse temperature β. Uses v[:,:,..,:,1] as initial configurations, and writes the Monte-Carlo chains in v[:,:,..,:,2:end].

source
RestrictedBoltzmannMachines.metropolisMethod
metropolis(rbm, v; β = 1, steps = 1)

Metropolis-Hastings sampling from rbm at inverse temperature β, starting from configuration v. Moves are proposed by normal Gibbs sampling.

source
RestrictedBoltzmannMachines.raiseMethod
raise(rbm::RBM, βs; v, init)

Reverse AIS estimator of the log-partition function of rbm. While aise tends to underestimate the log of the partition function, raise tends to overestimate it. v must be an equilibrated sample from rbm.

!!! tip Use logmeanexp If F = raise(...), then -logmeanexp(-F), using the logmeanexp function provided in this package, tends to give a better approximation of log(Z) than mean(F).

!!! tip Sandwiching the log-partition function If Rf = aise(...), Rr = raise(...) are the AIS and reverse AIS estimators, we have the stochastic bounds logmeanexp(Rf) ≤ log(Z) ≤ -logmeanexp(-Rr).

source
RestrictedBoltzmannMachines.rescale_activations!Method
rescale_activations!(layer, λ::AbstractArray)

For continuous layers with scale parameters, re-parameterizes such that unit activations are divided by λ, and returns true. For other layers, does nothing and returns false.

source
RestrictedBoltzmannMachines.rescale_hidden!Method
rescale_hidden!(rbm, λ::AbstractArray)

For continuous hidden units with a scale parameter, scales parameters such that hidden unit activations are divided by λ, and returns true. For other hidden units does nothing and returns false. The modified RBM is equivalent to the original one.

source
RestrictedBoltzmannMachines.sample_h_from_hMethod
sample_h_from_h(rbm, h; steps=1)

Samples a hidden configuration conditional on another hidden configuration h. Ensures type stability by requiring that the returned array is of the same type as h.

source
RestrictedBoltzmannMachines.sample_v_from_vMethod
sample_v_from_v(rbm, v; steps=1)

Samples a visible configuration conditional on another visible configuration v. Ensures type stability by requiring that the returned array is of the same type as v.

source
RestrictedBoltzmannMachines.substitution_matrix_exhaustiveFunction
substitution_matrix_exhaustive(rbm, v)

Returns a q x N x B tensor of free energies F, where q is the number of possible values of each site, B the number of data points, and N the sequence length:

q, N, B = size(v)

Thus F and v have the same size. The entry F[x,i,b] gives the free energy cost of flipping site i of v[b] from its original value to x, that is:

F[x,i,b] = free_energy(rbm, v_) - free_energy(rbm, v[b])

where v_ is the same as v[b] in all sites but i, where v_ has the value x.

Note that i can be a set of indices.

source
RestrictedBoltzmannMachines.substitution_matrix_sitesFunction
substitution_matrix_sites(rbm, v, sites)

Returns a q x B matrix of free energies F, where q is the number of possible values of each site, and B the number of data points. The entry F[x,b] equals the free energy cost of flipping site[b] of v[b] to x, that is (schematically):

F[x, b] = free_energy(rbm, v_) - free_energy(rbm, v)

where v = v[b], and v_ is the same as v in all sites except site[b], where v_ has the value x.

source
RestrictedBoltzmannMachines.tnmeanvarMethod
tnmeanvar(a)

Mean and variance of the standard normal distribution truncated to the interval (a, +∞). Equivalent to tnmean(a), tnvar(a), but saves some common computations. Warning: tnvar(a) can fail for very large values of a.

source
RestrictedBoltzmannMachines.∂cgfFunction
∂cgf(layer, inputs = 0; wts = 1)

Unit activation moments, conjugate to layer parameters. These are obtained by differentiating cgfs with respect to the layer parameters. Averages over configurations (weighted by wts).

source
+= \det(D) \det(A - BD^{-1}C)\]

Here we assume that A and D are invertible, and moreover are easy to invert (for example, if they are diagonal). We use this to choose one or the other of the two formulas above.

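For 1x1 blocks the determinant lemma reduces to ordinary arithmetic, which makes it easy to sanity-check. A small Python sketch (an illustration only, not the package's implementation; function names are hypothetical) verifying det = det(A) det(D - C A⁻¹ B) against the direct 2x2 determinant:

```python
import math

def logabsdet_2x2(a, b, c, d):
    # Direct log|det| of the 2x2 matrix [[a, b], [c, d]].
    return math.log(abs(a * d - b * c))

def block_logabsdet(a, b, c, d):
    # Determinant lemma with scalar (1x1) blocks:
    # det = det(A) * det(D - C A^{-1} B) = a * (d - c * b / a).
    return math.log(abs(a)) + math.log(abs(d - c * b / a))

print(logabsdet_2x2(3.0, 1.0, 2.0, 5.0))   # log(13)
print(block_logabsdet(3.0, 1.0, 2.0, 5.0))  # same value
```

Working in log space, as block_matrix_logdet does, matters when the block determinants individually overflow or underflow.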
source
RestrictedBoltzmannMachines.categorical_sampleMethod
categorical_sample(P)

Given a probability array P of size (q, *), returns an array C of size (*), such that C[i] ∈ 1:q is a random sample from the categorical distribution P[:,i]. You must ensure that P defines a proper probability distribution.

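The per-column sampling can be sketched with inverse-CDF sampling. A minimal Python version for a single distribution (illustration only; it returns 0-based indices, whereas the package's categorical_sample uses Julia's 1-based convention):

```python
import random

def categorical_sample(p):
    """Draw an index in range(len(p)) from the categorical
    distribution p (assumed to sum to 1), by inverting the CDF."""
    u = random.random()
    acc = 0.0
    for i, pi in enumerate(p):
        acc += pi
        if u < acc:
            return i
    return len(p) - 1  # guard against floating-point round-off

# A degenerate distribution always returns its only supported class.
print(categorical_sample([0.0, 1.0, 0.0]))  # 0-based index 1
```

Applying this to every column of a (q, *) probability array yields an index array of size (*), as the docstring describes.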
source
RestrictedBoltzmannMachines.collect_statesMethod
collect_states(layer)

Returns an array of all states of layer. Only defined for discrete layers.

Warning

Use only for small layers. For large layers, the exponential number of states will not fit in memory.

source
RestrictedBoltzmannMachines.initialize!Function
initialize!(rbm, [data]; ϵ = 1e-6)

Initializes the RBM and returns it. If provided, matches average visible unit activities from data.

initialize!(layer, [data]; ϵ = 1e-6)

Initializes a layer and returns it. If provided, matches average unit activities from data.

source
RestrictedBoltzmannMachines.log_likelihoodMethod
log_likelihood(rbm, v)

Log-likelihood of v under rbm, with the partition function computed by extensive enumeration. For discrete layers, this is exponentially slow for large machines.

source
RestrictedBoltzmannMachines.log_partitionMethod
log_partition(rbm)

Log-partition of rbm, computed by extensive enumeration of visible states (except for particular cases such as Gaussian-Gaussian RBM). This is exponentially slow for large machines.

If your RBM has a smaller hidden layer, consider mirroring the layers of the rbm first (see mirror).

source
RestrictedBoltzmannMachines.log_pseudolikelihoodMethod
log_pseudolikelihood(rbm, v; exact = false)

Log-pseudolikelihood of v. If exact is true, the exact pseudolikelihood is returned. But this is slow if v consists of many samples. Therefore by default exact is false, in which case the result is a stochastic approximation, where a random site is selected for each sample and its conditional probability is calculated. On average, the results with exact = false coincide with the deterministic result, and the estimate becomes more precise as the number of samples increases.

source
RestrictedBoltzmannMachines.log_pseudolikelihood_sitesMethod
log_pseudolikelihood_sites(rbm, v, sites)

Log-pseudolikelihood of a site conditioned on the other sites, where sites is an array of site indices (CartesianIndex), one for each sample. Returns an array of log-pseudolikelihood values, one for each sample.

source
RestrictedBoltzmannMachines.log_pseudolikelihood_stochMethod
log_pseudolikelihood_stoch(rbm, v)

Log-pseudolikelihood of v. This function computes a stochastic approximation, by tracing over a random site for each sample. For a large number of samples, this is on average close to the exact value of the pseudolikelihood.

source
RestrictedBoltzmannMachines.metropolis!Method
metropolis!(v, rbm; β = 1)

Metropolis-Hastings sampling from rbm at inverse temperature β. Uses v[:,:,..,:,1] as initial configurations, and writes the Monte-Carlo chains in v[:,:,..,:,2:end].

source
RestrictedBoltzmannMachines.metropolisMethod
metropolis(rbm, v; β = 1, steps = 1)

Metropolis-Hastings sampling from rbm at inverse temperature β, starting from configuration v. Moves are proposed by normal Gibbs sampling.

source
RestrictedBoltzmannMachines.raiseMethod
raise(rbm::RBM, βs; v, init)

Reverse AIS estimator of the log-partition function of rbm. While aise tends to underestimate the log of the partition function, raise tends to overestimate it. v must be an equilibrated sample from rbm.

!!! tip Use logmeanexp If F = raise(...), then -logmeanexp(-F), using the logmeanexp function provided in this package, tends to give a better approximation of log(Z) than mean(F).

!!! tip Sandwiching the log-partition function If Rf = aise(...), Rr = raise(...) are the AIS and reverse AIS estimators, we have the stochastic bounds logmeanexp(Rf) ≤ log(Z) ≤ -logmeanexp(-Rr).

source
RestrictedBoltzmannMachines.rescale_activations!Method
rescale_activations!(layer, λ::AbstractArray)

For continuous layers with scale parameters, re-parameterizes such that unit activations are divided by λ, and returns true. For other layers, does nothing and returns false.

source
RestrictedBoltzmannMachines.rescale_hidden!Method
rescale_hidden!(rbm, λ::AbstractArray)

For continuous hidden units with a scale parameter, scales parameters such that hidden unit activations are divided by λ, and returns true. For other hidden units does nothing and returns false. The modified RBM is equivalent to the original one.

source
RestrictedBoltzmannMachines.sample_h_from_hMethod
sample_h_from_h(rbm, h; steps=1)

Samples a hidden configuration conditional on another hidden configuration h. Ensures type stability by requiring that the returned array is of the same type as h.

source
RestrictedBoltzmannMachines.sample_v_from_vMethod
sample_v_from_v(rbm, v; steps=1)

Samples a visible configuration conditional on another visible configuration v. Ensures type stability by requiring that the returned array is of the same type as v.

source
RestrictedBoltzmannMachines.substitution_matrix_exhaustiveFunction
substitution_matrix_exhaustive(rbm, v)

Returns a q x N x B tensor of free energies F, where q is the number of possible values of each site, B the number of data points, and N the sequence length:

q, N, B = size(v)

Thus F and v have the same size. The entry F[x,i,b] gives the free energy cost of flipping site i of v[b] from its original value to x, that is:

F[x,i,b] = free_energy(rbm, v_) - free_energy(rbm, v[b])

where v_ is the same as v[b] in all sites but i, where v_ has the value x.

Note that i can be a set of indices.

source
RestrictedBoltzmannMachines.substitution_matrix_sitesFunction
substitution_matrix_sites(rbm, v, sites)

Returns a q x B matrix of free energies F, where q is the number of possible values of each site, and B the number of data points. The entry F[x,b] equals the free energy cost of flipping site[b] of v[b] to x, that is (schematically):

F[x, b] = free_energy(rbm, v_) - free_energy(rbm, v)

where v = v[b], and v_ is the same as v in all sites except site[b], where v_ has the value x.

source
RestrictedBoltzmannMachines.tnmeanvarMethod
tnmeanvar(a)

Mean and variance of the standard normal distribution truncated to the interval (a, +∞). Equivalent to tnmean(a), tnvar(a), but saves some common computations. Warning: tnvar(a) can fail for very large values of a.

source
RestrictedBoltzmannMachines.∂cgfFunction
∂cgf(layer, inputs = 0; wts = 1)

Unit activation moments, conjugate to layer parameters. These are obtained by differentiating cgfs with respect to the layer parameters. Averages over configurations (weighted by wts).

source