3. Running the MCMC

Catalina Vallejos edited this page Jun 7, 2020 · 5 revisions

‼️ THIS WIKI IS NO LONGER MAINTAINED - PLEASE REFER TO THE VIGNETTE INSTEAD ‼️

The BASiCS_MCMC function runs the MCMC sampler on the SingleCellExperiment object created in the previous section. The following arguments control the run:

  • Data: The SingleCellExperiment object. It must contain at least a counts slot with the raw transcript counts and, when running the sampler without spike-in genes, a BatchInfo vector in the colData.
  • N: The total number of iterations.
  • Thin: The thinning factor. Only every Thin-th iteration is stored.
  • Burn: The burn-in period, i.e. the number of initial iterations that are discarded because the chain may not yet have converged (usually half of the total number of iterations).
  • Regression: A boolean indicating whether the regression model should be run (see below).
  • WithSpikes: A boolean indicating whether the data contains spike-in genes (default: TRUE).
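As a rough illustration of how these arguments fit together, the sketch below sets up one possible configuration in R. The numeric values are illustrative only, `run_chain` is a hypothetical wrapper that is not part of BASiCS, and the stored-draw arithmetic assumes that N and Burn are both multiples of Thin.

```r
# Chain settings (illustrative values, not recommendations).
N    <- 20000  # total number of iterations
Thin <- 20     # keep every 20th iteration
Burn <- 10000  # discard the first half as burn-in

# Number of posterior draws left after burn-in and thinning
# (assumes N and Burn are multiples of Thin).
n_stored <- (N - Burn) / Thin
n_stored  # 500

# Hypothetical wrapper (not part of BASiCS): runs the sampler with the
# settings above on a prepared SingleCellExperiment object `Data`.
# Calling it requires the BASiCS package to be installed.
run_chain <- function(Data) {
  BASiCS::BASiCS_MCMC(
    Data       = Data,
    N          = N,
    Thin       = Thin,
    Burn       = Burn,
    Regression = TRUE,
    WithSpikes = TRUE
  )
}
```

With the object from the previous section at hand, `Chain <- run_chain(Data)` would start the sampler.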

The last two arguments, Regression and WithSpikes, define four different running modes. These are summarised in the following diagram.
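Put differently, the four modes are the four combinations of the two boolean flags. A minimal sketch in base R that enumerates them (the variable name `modes` is just for illustration):

```r
# Every combination of the two flags corresponds to one running mode.
modes <- expand.grid(
  WithSpikes = c(TRUE, FALSE),
  Regression = c(TRUE, FALSE)
)
modes

# One row per running mode.
nrow(modes)  # 4
```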

Spikes vs no-spikes

If WithSpikes = TRUE, BASiCS employs a vertical integration approach in which information from technical spike-in genes is used to infer technical variability. When technical spike-in genes are not available, BASiCS uses a horizontal integration strategy, which borrows information across multiple technical replicates (Eling et al. 2018). This mode is selected by setting WithSpikes = FALSE. Note: in this mode, BASiCS_MCMC will fail to run if only a single batch of samples is provided. See 2. Input preparation for details about how to provide batch information.
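Because the no-spikes mode depends on having more than one batch, it can be worth checking the batch labels before starting a long run. The sketch below uses a hypothetical helper, `has_multiple_batches`, which is not part of BASiCS, and a toy batch vector.

```r
# Hypothetical helper (not part of BASiCS): TRUE if the batch labels
# describe at least two distinct batches.
has_multiple_batches <- function(batch_info) {
  length(unique(batch_info)) > 1
}

# Toy example: six cells split across two batches.
batch_info <- c(1, 1, 1, 2, 2, 2)
has_multiple_batches(batch_info)

# With a real object, the labels would be read from the colData, e.g.
# SummarizedExperiment::colData(Data)$BatchInfo
```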

Regression vs no regression

When Regression = FALSE, BASiCS uses independent priors for the gene-specific mean ($\mu_i$) and over-dispersion ($\delta_i$) parameters. In contrast, when Regression = TRUE, the BASiCS model uses a joint informative prior to account for the relationship between these two parameters. This joint prior is used to infer a global regression trend between mean and over-dispersion and, subsequently, to derive a residual over-dispersion ($\epsilon_i$) measure, defined as the departure of each gene from this trend.
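The residual over-dispersion can be written compactly. This is a sketch that assumes the formulation in Eling et al. (2018), with $f(\mu_i)$ denoting the regression trend evaluated at the mean expression of gene $i$ (the symbol $f$ is introduced here for illustration):

$$\epsilon_i = \log(\delta_i) - f(\mu_i)$$

Under this definition, a positive $\epsilon_i$ means gene $i$ is more variable than expected for genes with similar mean expression, and a negative $\epsilon_i$ means it is less variable.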
