A set of Bayesian neural network layers to perform stochastic variational inference
- Variational layers with reparameterized Monte Carlo estimators [Blundell et al. 2015]
- Variational layers with Flipout Monte Carlo estimators [Wen et al. 2018]
Abstract base class which inherits from torch.nn.Module.
Calculates the Kullback-Leibler divergence from a normal distribution Q (parameterized by mu_q, sigma_q) to a normal distribution P (parameterized by mu_p, sigma_p).
- mu_q: torch.Tensor -> mu parameter of distribution Q
- sigma_q: torch.Tensor -> sigma parameter of distribution Q
- mu_p: float -> mu parameter of distribution P
- sigma_p: float -> sigma parameter of distribution P
- returns: torch.Tensor of shape [] (a scalar) containing the KL divergence
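For reference, the KL divergence between two univariate normal distributions has the closed form log(σ_p/σ_q) + (σ_q² + (μ_q − μ_p)²)/(2σ_p²) − 1/2, summed over all parameters. The standalone function below is an illustrative sketch of that computation, not part of the library API:

```python
import torch

def kl_between_normals(mu_q, sigma_q, mu_p, sigma_p):
    # Element-wise KL(Q || P) for normal distributions:
    #   KL = log(sigma_p / sigma_q) + (sigma_q^2 + (mu_q - mu_p)^2) / (2 * sigma_p^2) - 1/2
    kl = (torch.log(sigma_p / sigma_q)
          + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
          - 0.5)
    # Summing over all elements yields the scalar (0-dimensional) tensor described above.
    return kl.sum()

mu_q = torch.zeros(10)
sigma_q = 0.5 * torch.ones(10)
print(kl_between_normals(mu_q, sigma_q, mu_p=0.0, sigma_p=1.0))
```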
bayesian_torch.layers.LinearReparameterization(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_features: int -> size of each input sample,
- out_features: int -> size of each output sample,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with reparameterization and performs torch.nn.functional.linear.
- X: torch.Tensor with shape (batch_size, in_features)
- returns: torch.Tensor with shape (X.shape[0], out_features), and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
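A minimal usage sketch of LinearReparameterization; the hyperparameter values are illustrative choices, not documented defaults:

```python
import torch
from bayesian_torch.layers import LinearReparameterization

layer = LinearReparameterization(
    in_features=64,
    out_features=10,
    prior_mean=0.0,          # prior N(0, 1) on the weights
    prior_variance=1.0,
    posterior_mu_init=0.0,   # initialization of the variational parameters
    posterior_rho_init=-3.0, # sigma = softplus(-3.0), roughly 0.05 at initialization
)

x = torch.randn(32, 64)      # (batch_size, in_features)
out, kl = layer(x)           # out: (32, 10); kl: scalar KL term to add to the loss
print(out.shape, kl.item())
```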
bayesian_torch.layers.Conv1dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with reparameterization and performs torch.nn.functional.conv1d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.Conv2dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with reparameterization and performs torch.nn.functional.conv2d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
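A similar usage sketch for the 2D convolutional variant (values illustrative):

```python
import torch
from bayesian_torch.layers import Conv2dReparameterization

conv = Conv2dReparameterization(
    in_channels=3,
    out_channels=16,
    kernel_size=3,
    stride=1,
    padding=1,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(8, 3, 32, 32)   # (batch_size, C, H, W)
out, kl = conv(x)               # out: (8, 16, 32, 32) with padding=1; kl: scalar
print(out.shape, kl.item())
```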
bayesian_torch.layers.Conv3dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with reparameterization and performs torch.nn.functional.conv3d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W, L)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.ConvTranspose1dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with reparameterization and performs torch.nn.functional.conv_transpose1d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.ConvTranspose2dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with reparameterization and performs torch.nn.functional.conv_transpose2d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
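A usage sketch for the transposed (upsampling) variant; with stride=2 the spatial resolution roughly doubles, as with the deterministic torch.nn.ConvTranspose2d (values illustrative):

```python
import torch
from bayesian_torch.layers import ConvTranspose2dReparameterization

deconv = ConvTranspose2dReparameterization(
    in_channels=16,
    out_channels=8,
    kernel_size=2,
    stride=2,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(4, 16, 14, 14)   # (batch_size, C, H, W)
out, kl = deconv(x)              # out: (4, 8, 28, 28); kl: scalar
print(out.shape, kl.item())
```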
bayesian_torch.layers.ConvTranspose3dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with reparameterization and performs torch.nn.functional.conv_transpose3d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W, L)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.LSTMReparameterization(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_features: int -> size of each input sample,
- out_features: int -> size of each output sample,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X, hidden_states=None)
Samples the weights with reparameterization and performs the LSTM feedforward operation.
- X: torch.Tensor with shape (batch_size, seq_len, in_features)
- hidden_states: None, or a tuple of two torch.Tensors each with shape (X.shape[0], seq_len, out_features)
- returns: a tuple (torch.Tensor with shape (X.shape[0], seq_len, out_features), tuple of two torch.Tensors each with shape (X.shape[0], seq_len, out_features)), and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
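A usage sketch for the variational LSTM. The unpacking of the return value follows the description above (output sequence, hidden-state tuple, KL term) and should be verified against the installed version; the hyperparameters are illustrative:

```python
import torch
from bayesian_torch.layers import LSTMReparameterization

lstm = LSTMReparameterization(
    in_features=32,
    out_features=64,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(16, 20, 32)   # (batch_size, seq_len, in_features)
# Unpacking per the description above: output sequence, (h, c) tuple, scalar KL term.
# Verify the exact return structure against your installed version.
out, hidden, kl = lstm(x)
print(out.shape, kl.item())
```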
bayesian_torch.layers.LinearFlipout(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_features: int -> size of each input sample,
- out_features: int -> size of each output sample,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with flipout reparameterization and performs torch.nn.functional.linear.
- X: torch.Tensor with shape (batch_size, in_features)
- returns: torch.Tensor with shape (X.shape[0], out_features), and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
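Flipout layers are drop-in counterparts of the reparameterization layers; because new weight perturbations are drawn on every call, repeated forward passes yield Monte Carlo samples of the output distribution. A sketch (values illustrative):

```python
import torch
from bayesian_torch.layers import LinearFlipout

layer = LinearFlipout(
    in_features=64,
    out_features=10,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(32, 64)
# Each forward pass draws a new weight perturbation, so stacking several
# passes gives Monte Carlo samples of the layer's output distribution.
samples = torch.stack([layer(x)[0] for _ in range(10)])   # (10, 32, 10)
pred_mean, pred_std = samples.mean(dim=0), samples.std(dim=0)
print(pred_mean.shape, pred_std.shape)
```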
bayesian_torch.layers.Conv1dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with flipout reparameterization and performs torch.nn.functional.conv1d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.Conv2dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with flipout reparameterization and performs torch.nn.functional.conv2d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
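The layers compose like ordinary torch.nn modules, and a model simply accumulates the KL terms returned by each layer. A minimal sketch of a small Bayesian CNN; the architecture and names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from bayesian_torch.layers import Conv2dFlipout, LinearFlipout

class TinyBayesianCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = Conv2dFlipout(3, 16, kernel_size=3, padding=1,
                                  prior_mean=0.0, prior_variance=1.0,
                                  posterior_mu_init=0.0, posterior_rho_init=-3.0)
        self.fc = LinearFlipout(16 * 16 * 16, 10,
                                prior_mean=0.0, prior_variance=1.0,
                                posterior_mu_init=0.0, posterior_rho_init=-3.0)

    def forward(self, x):
        kl_total = 0.0
        x, kl = self.conv(x)
        kl_total = kl_total + kl
        x = F.max_pool2d(F.relu(x), 2)   # 32x32 -> 16x16
        x = x.flatten(1)
        x, kl = self.fc(x)
        kl_total = kl_total + kl
        return x, kl_total

model = TinyBayesianCNN()
logits, kl = model(torch.randn(8, 3, 32, 32))
print(logits.shape, kl.item())
```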
bayesian_torch.layers.Conv3dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with flipout reparameterization and performs torch.nn.functional.conv3d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W, L)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.ConvTranspose1dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with flipout reparameterization and performs torch.nn.functional.conv_transpose1d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.ConvTranspose2dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with flipout reparameterization and performs torch.nn.functional.conv_transpose2d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.ConvTranspose3dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_channels: int -> number of channels in the input image,
- out_channels: int -> number of channels produced by the convolution,
- kernel_size: int -> size of the convolving kernel,
- stride: int -> stride of the convolution. Default: 1,
- padding: int -> zero-padding added to both sides of the input. Default: 0,
- dilation: int -> spacing between kernel elements. Default: 1,
- groups: int -> number of blocked connections from input channels to output channels. Default: 1,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X)
Samples the weights with flipout reparameterization and performs torch.nn.functional.conv_transpose3d. Check the official PyTorch documentation for the output tensor shape.
- X: torch.Tensor with shape (batch_size, C, H, W, L)
- returns: torch.Tensor, and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
bayesian_torch.layers.LSTMFlipout(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)
- in_features: int -> size of each input sample,
- out_features: int -> size of each output sample,
- prior_mean: float -> mean of the prior distribution used in the complexity cost (KL term),
- prior_variance: float -> variance of the prior distribution used in the complexity cost (KL term),
- posterior_mu_init: float -> initial value of the trainable mu parameter, the mean of the approximate posterior,
- posterior_rho_init: float -> initial value of the trainable rho parameter, which determines the sigma of the approximate posterior through the softplus σ = log(1 + exp(ρ)),
- bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,
forward(X, hidden_states=None)
Samples the weights with Flipout and performs the LSTM feedforward operation.
- X: torch.Tensor with shape (batch_size, seq_len, in_features)
- hidden_states: None, or a tuple of two torch.Tensors each with shape (X.shape[0], seq_len, out_features)
- returns: a tuple (torch.Tensor with shape (X.shape[0], seq_len, out_features), tuple of two torch.Tensors each with shape (X.shape[0], seq_len, out_features)), and a float corresponding to the KL divergence from the sampled weights' distribution to the prior
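During training, the accumulated KL term is added to the negative log-likelihood to form the loss (the negated ELBO), usually scaled by the number of minibatches so the complexity cost is counted once per epoch. A sketch of one training step with LSTMFlipout; the names, scaling, and return unpacking are illustrative and should be checked against the installed version:

```python
import torch
import torch.nn.functional as F
from bayesian_torch.layers import LSTMFlipout, LinearFlipout

lstm = LSTMFlipout(in_features=32, out_features=64,
                   prior_mean=0.0, prior_variance=1.0,
                   posterior_mu_init=0.0, posterior_rho_init=-3.0)
head = LinearFlipout(64, 5,
                     prior_mean=0.0, prior_variance=1.0,
                     posterior_mu_init=0.0, posterior_rho_init=-3.0)
optimizer = torch.optim.Adam(list(lstm.parameters()) + list(head.parameters()), lr=1e-3)

x = torch.randn(16, 20, 32)      # (batch_size, seq_len, in_features)
y = torch.randint(0, 5, (16,))   # dummy classification labels
num_batches = 100                # illustrative: number of minibatches per epoch

out, hidden, kl_lstm = lstm(x)   # unpacking follows the return structure described above
logits, kl_head = head(out[:, -1, :])   # use the last time step for classification
loss = F.cross_entropy(logits, y) + (kl_lstm + kl_head) / num_batches

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())
```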