Add a background section

TensorBFS · Jul 13, 2023 · 09871f3
1 parent bce1321
commit 09871f3

Showing 6 changed files with 270 additions and 3 deletions.
diff --git a/docs/Project.toml b/docs/Project.toml
@@ -3,3 +3,4 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
 LiveServer = "16fef848-5104-11e9-1b77-fb7a48bbb589"
 TensorInference = "c2297e78-99bd-40ad-871d-f50e56b81012"
+TikzPictures = "37f6aa50-8035-52d0-81c2-5a1d08754b2d"
diff --git a/docs/make.jl b/docs/make.jl
@@ -30,6 +30,7 @@ makedocs(;
     ),
     pages=[
         "Home" => "index.md",
+        "Background" => "background.md",
         "Examples" => [
             "Asia network" => "generated/asia/main.md",
            ],

diff --git a/docs/src/assets/preambles/asia-network.tex b/docs/src/assets/preambles/asia-network.tex
@@ -0,0 +1,17 @@
+\usepackage{tikz}
+\usepackage{xcolor-material} % https://ctan.org/pkg/xcolor-material
+\usetikzlibrary{arrows.meta}
+
+\colorlet{A}{MaterialRed!00}
+\colorlet{B}{MaterialPurple!00}
+\colorlet{D}{MaterialIndigo!00}
+\colorlet{E}{MaterialLightBlue!00}
+\colorlet{L}{MaterialTeal!00}
+\colorlet{S}{MaterialLightGreen!00}
+\colorlet{T}{MaterialYellow!00}
+\colorlet{X}{MaterialOrange!00}
+
+\tikzset {
+  myarrow/.style= {-{Stealth[scale=1.0]},shorten >=2pt, draw=gray, line width=1pt},
+  myvar/.style={circle, thick, draw=gray, fill=white},
+}
diff --git a/docs/src/assets/preambles/the-inference-tasks.tex b/docs/src/assets/preambles/the-inference-tasks.tex
@@ -0,0 +1,19 @@
+\usepackage{tikz}
+\usetikzlibrary{positioning}
+\usetikzlibrary{shadows}
+\usetikzlibrary{arrows.meta}
+\usepackage{bm} % used for bold letters in math environments
+
+\tikzset {
+  every node/.style={node distance=20mm and 1.0mm},
+  myroundbox/.style= {
+    rectangle, rounded corners=3mm, drop shadow, minimum height=1.0cm,
+    font=\small, minimum width=\columnwidth*0.28, align=center, fill=white,
+    draw=gray, line width=1pt,
+  },
+  myrectbox/.style= {
+    rectangle, drop shadow, minimum height=1.0cm, font=\small, minimum
+    width=\columnwidth*0.28, align=center, fill=white, draw=gray, line width=1pt,
+  },
+  myarrow/.style={draw=gray, -{Stealth[scale=1.0]}, line width=1pt, shorten >=2pt},
+}
diff --git a/docs/src/background.md b/docs/src/background.md
@@ -0,0 +1,228 @@
+# Background
+
+*TensorInference* implements efficient methods to perform Bayesian inference in
+probabilistic graphical models, such as Bayesian Networks or Markov random
+fields.
+
+## Probabilistic graphical models (PGMs)
+
+PGMs capture the mathematical modeling of reasoning in the presence of
+uncertainty. Bayesian networks and Markov random fields are popular types of
+PGMs. Consider the following Bayesian network known as the *ASIA network*
+[^lauritzen1988local]. 
+
+| **Random variable**  | **Meaning**                     |
+|        :---:         | :---                            |
+|        ``A``         | Recent trip to Asia             |
+|        ``T``         | Patient has tuberculosis        |
+|        ``S``         | Patient is a smoker             |
+|        ``L``         | Patient has lung cancer         |
+|        ``B``         | Patient has bronchitis          |
+|        ``E``         | Patient has ``T`` and/or ``L``  |
+|        ``X``         | Chest X-Ray is positive         |
+|        ``D``         | Patient has dyspnoea            |
+
+```@eval
+using TikzPictures
+
+tp = TikzPicture(
+  L"""
+    % The various elements are conveniently placed using a matrix:
+    \matrix[row sep=0.5cm,column sep=0.5cm] {
+      % First line
+      \node (a) [myvar] {$A$};  &
+                                &
+                                &
+      \node (s) [myvar] {$S$};  &
+                               \\
+      % Second line
+      \node (t) [myvar] {$T$};  &
+                                &
+      \node (l) [myvar] {$L$};  &
+                                &
+      \node (b) [myvar] {$B$}; \\
+      % Third line
+                                &
+      \node (e) [myvar] {$E$};  &
+                                &
+                                &
+                               \\
+      % Fourth line
+      \node (x) [myvar] {$X$};  &
+                                &
+                                &
+      \node (d) [myvar] {$D$};  &
+                               \\
+  };
+
+  \draw [myarrow] (a) edge (t);
+  \draw [myarrow] (s) edge (l);
+  \draw [myarrow] (s) edge (b);
+  \draw [myarrow] (t) edge (e);
+  \draw [myarrow] (l) edge (e);
+  \draw [myarrow] (e) edge (x);
+  \draw [myarrow] (e) edge (d);
+  \draw [myarrow] (b) edge (d);
+  """,
+  options="transform shape, scale=1.4",
+  preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "asia-network") * "}",
+)
+save(SVG(joinpath(@__DIR__, "asia-bayesian-network")), tp)
+```
+![](asia-bayesian-network.svg)
+
+The ASIA network is a simplified example from the context of medical
+diagnosis that describes the probabilistic relationships between different
+random variables corresponding to possible diseases, symptoms, risk factors and
+test results. It consists of a graph ``G = (\bm{V},\mathcal{E})`` and a
+probability distribution ``P(\bm{V})`` where ``G`` is a directed acyclic graph,
+``\bm{V}`` is the set of variables and ``\mathcal{E}`` is the set of edges
+connecting the variables. We assume all variables to be discrete. Each variable
+``V`` is quantified with a *conditional probability distribution* ``P(V \mid
+pa(V))`` where ``pa(V)`` are the parents of ``V``. These conditional probability
+distributions together with the graph ``G`` induce a *joint probability
+distribution* ``P(\bm{V})``, given by
+
+```math
+P(\bm{V}) = \prod_{V\in\bm{V}} P(V \mid pa(V)).
+```
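+
+For the ASIA network above, reading each variable's parents off the edges of
+the graph, this factorization expands to:
+
+```math
+P(A, S, T, L, B, E, X, D) = P(A) \, P(S) \, P(T \mid A) \, P(L \mid S) \, P(B \mid S) \, P(E \mid T, L) \, P(X \mid E) \, P(D \mid E, B).
+```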
+
+
+## The inference tasks
+
+Given a set of **random variables** ``\bm{V}`` and their **joint
+distribution** ``P(\bm{V})``, compute one or more conditional
+distributions over a set of **query variables** ``\bm{Q}`` given observations
+``\bm{e}`` for the set of **observed variables** ``\bm{E}``.
+
+Each task is defined with respect to a graphical model ``\mathcal{M} = \{\bm{V},
+\bm{D}, \bm{\phi}\}``, where:
+
+- ``\bm{V} = \{ V_1 , V_2 , \dots , V_N \}`` is the set of the model’s variables,
+- ``\bm{D} = \{ D_{V_1} , D_{V_2} , \dots , D_{V_N} \}`` is the set of discrete domains, one for each variable,
+- ``\bm{\phi} = \{ \phi_1 , \phi_2 , \dots , \phi_N \}`` is the set of the model’s functions.
+
+``\bm{V}`` can be further partitioned into two sets, evidence variables
+``\bm{E}`` and the rest ``\bm{V}^\prime = \bm{V} \setminus \bm{E}``.
+
+```@eval
+using TikzPictures
+
+tp = TikzPicture(
+  L"""
+    %\draw[help lines] (0,0) grid (10,-7);
+
+    % mrv: the "node distances" refer to the distance between the edge of a shape
+    % to the edge of the other shape. That is why I use "ie_aux" and "tasks_aux"
+    % below: to have equal distances between nodes with respect to the center of
+    % the shapes.
+
+    % row 1
+    \node[myroundbox] (rv) {Random Variables\\$\bm{V}$};
+    \node[right=of rv](aux1) {};
+    \node[right=of aux1,myroundbox] (jd) {Joint Distribution\\$P(\bm{V})$};
+    \node[right=of jd](aux2) {};
+    \node[right=of aux2,myroundbox] (e) {Evidence\\$\bm{E=e}$};
+    \node[right=of e](aux3) {};
+    \node[right=of aux3,myroundbox] (qv) {Query Variables\\$\bm{Q}$};
+    % row 2
+    \node[below=of aux2,myrectbox] (ie) {Inference Engine};
+    \node[below=of aux2] (ie_aux) {};
+    % row 3
+    \node[below=of ie_aux] (tasks_aux) {};
+    \node[left=of tasks_aux,myroundbox] (mar) {MAR};
+    \node[left=of mar] (aux4) {};
+    \node[left=of aux4,myroundbox] (pr) {PR};
+    \node[right=of tasks_aux,myroundbox] (map) {MAP};
+    \node[right=of map] (aux5) {};
+    \node[right=of aux5,myroundbox] (mmap) {MMAP};
+    % row 0
+    \node[above=of aux2,yshift=-12mm,text=gray] (in) {\textbf{Input}};
+    % row 4
+    \node[below=of tasks_aux,yshift=7mm,text=gray] (out) {\textbf{Output}};
+
+    %% edges
+    \draw[myarrow] (rv) -- (ie);
+    \draw[myarrow] (jd) -- (ie);
+    \draw[myarrow] (e)  -- (ie);
+    \draw[myarrow] (qv) -- (ie);
+    \draw[myarrow] (ie) -- (pr);
+    \draw[myarrow] (ie) -- (mar);
+    \draw[myarrow] (ie) -- (map);
+    \draw[myarrow] (ie) -- (mmap);
+  """,
+  options="transform shape, scale=1.4",
+  preamble="\\input{" * joinpath(@__DIR__, "assets", "preambles", "the-inference-tasks") * "}",
+)
+save(SVG("the-inference-tasks"), tp)
+```
+![](the-inference-tasks.svg)
+
+### Probability of evidence (PR)
+
+Computing the partition function (i.e., the normalizing constant), or
+probability of evidence:
+
+```math
+PR(\bm{V}^{\prime} \mid \bm{E}=\bm{e}) = \sum_{V^{\prime} \in \bm{V}^{\prime}} \prod_{\phi \in \bm{\phi}} \phi(V^{\prime},\bm{e})
+```
+
+This task involves calculating the probability of the observed evidence, which
+can be useful for model comparison or anomaly detection. It requires summing
+the joint probability over all possible states of the unobserved variables in
+the model, with the observed variables fixed to their evidence values. This is
+a fundamental task in Bayesian statistics and is often used as a stepping stone
+for other types of inference.
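+
+As a rough illustration of the sum above, the following sketch enumerates a
+hypothetical three-variable chain ``A \to B \to C`` with binary variables and
+evidence on ``C``. The tables `pA`, `pB_A` and `pC_B` and their numbers are
+made up for this example, and the brute-force enumeration is not how
+*TensorInference* computes the result; it only spells out the definition.
+
+```julia
+# Hypothetical chain A → B → C; states are indexed 1 and 2.
+pA   = [0.55, 0.45]            # pA[a]      = P(A = a)
+pB_A = [0.5 0.5; 0.1 0.9]      # pB_A[a, b] = P(B = b | A = a)
+pC_B = [0.6 0.4; 0.5 0.5]      # pC_B[b, c] = P(C = c | B = b)
+
+c_obs = 2                      # evidence: C is observed in state 2
+
+# PR: sum the product of all factors over the unobserved variables A and B.
+pr = sum(pA[a] * pB_A[a, b] * pC_B[b, c_obs] for a in 1:2, b in 1:2)
+println("P(C = 2) = ", pr)
+```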
+
+### Marginal inference (MAR)
+
+Computing the marginal probability distribution over all variables given
+evidence:
+
+```math
+MAR(V_i \mid \bm{E}=\bm{e}) = \frac{ \sum_{V^{\prime\prime} \in \bm{V}^{\prime}
+\setminus \{V_i\}} \prod_{\phi \in \bm{\phi}} \phi(V_i, V^{\prime\prime},\bm{e}) }{
+    PR(\bm{V}^{\prime} \mid \bm{E}=\bm{e}) }
+```
+
+This task involves computing the marginal probability of a subset of variables,
+summing out the others. In other words, it computes the probability
+distribution of some variables of interest regardless of the states of all other
+variables. This is useful when we're interested in the probabilities of some
+specific variables in the model, but not the entire model.
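+
+A minimal sketch of this computation on a hypothetical binary chain
+``A \to B \to C`` with made-up tables (`pA`, `pB_A` and `pC_B` are illustrative
+names, and brute-force enumeration stands in for the method used by
+*TensorInference*): the marginal of ``A`` given ``C = 2`` is the sum over ``B``
+of the product of all factors, divided by the probability of evidence.
+
+```julia
+# Hypothetical chain A → B → C; states are indexed 1 and 2.
+pA   = [0.55, 0.45]            # pA[a]      = P(A = a)
+pB_A = [0.5 0.5; 0.1 0.9]      # pB_A[a, b] = P(B = b | A = a)
+pC_B = [0.6 0.4; 0.5 0.5]      # pC_B[b, c] = P(C = c | B = b)
+
+c_obs = 2                      # evidence: C is observed in state 2
+
+# Numerator: for each state a, sum out the remaining unobserved variable B.
+unnorm = [sum(pA[a] * pB_A[a, b] * pC_B[b, c_obs] for b in 1:2) for a in 1:2]
+
+# Denominator: the probability of evidence (PR).
+pr = sum(unnorm)
+
+mar_A = unnorm ./ pr           # P(A | C = 2), a vector that sums to one
+println(mar_A)
+```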
+
+### Maximum a Posteriori Probability estimation (MAP)
+
+Computing the most likely assignment to all variables given evidence:
+
+```math
+MAP(V_i \mid \bm{E}=\bm{e}) = \arg \max_{V^{\prime} \in \bm{V}^{\prime}}
+\prod_{\phi \in \bm{\phi}} \phi(V^{\prime},\bm{e})
+```
+
+In the MAP task, given some observed variables, the goal is to find the most
+probable joint assignment of values to all of the unobserved variables. It
+provides the states of variables that maximize the posterior probability given
+some observed evidence. This is often used when we want the most likely
+explanation or prediction according to the model.
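+
+The maximization can be sketched by brute force on a hypothetical binary chain
+``A \to B \to C`` with evidence on ``C``. The tables and their values are
+assumptions made for illustration, and exhaustive enumeration is used here only
+because the example is tiny; it is not the algorithm used by *TensorInference*.
+
+```julia
+# Hypothetical chain A → B → C; states are indexed 1 and 2.
+pA   = [0.55, 0.45]            # pA[a]      = P(A = a)
+pB_A = [0.5 0.5; 0.1 0.9]      # pB_A[a, b] = P(B = b | A = a)
+pC_B = [0.6 0.4; 0.5 0.5]      # pC_B[b, c] = P(C = c | B = b)
+
+c_obs = 2                      # evidence: C is observed in state 2
+
+# Unnormalized posterior score of every joint assignment (a, b).
+scores = [pA[a] * pB_A[a, b] * pC_B[b, c_obs] for a in 1:2, b in 1:2]
+
+# MAP: the assignment with the highest score. Dividing by PR would rescale
+# all scores equally, so it would not change the argmax.
+a_map, b_map = Tuple(argmax(scores))
+println("MAP assignment: A = ", a_map, ", B = ", b_map)
+```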
+
+### Marginal Maximum a Posteriori (MMAP)
+
+Computing the most likely assignment to the query variables ``\bm{V}_M \subset
+\bm{V}^{\prime}``, after marginalizing out the remaining variables ``\bm{V}_S =
+\bm{V}^{\prime} \setminus \bm{V}_M``:
+
+```math
+MMAP(V_i \mid \bm{E}=\bm{e}) = \arg \max_{V_M \in \bm{V}_M} \sum_{V_S \in \bm{V}_S}
+\prod_{\phi \in \bm{\phi}} \phi(V_M, V_S, \bm{e})
+```
+
+This task is essentially a combination of the MAR and MAP tasks. The MMAP task
+involves finding the most probable assignment (the MAP estimate) for a subset of
+the variables, while marginalizing over (summing out) the remaining variables.
+This task is useful when we want to know the most likely state of some
+variables, but there's some uncertainty over others that we need to average out.
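+
+A small sketch of the sum-then-maximize order, again on a hypothetical binary
+chain ``A \to B \to C`` with evidence on ``C``, taking ``\bm{V}_M = \{A\}`` and
+``\bm{V}_S = \{B\}``. The tables are made up for illustration and the
+enumeration is a stand-in, not the method used by *TensorInference*.
+
+```julia
+# Hypothetical chain A → B → C; states are indexed 1 and 2.
+pA   = [0.55, 0.45]            # pA[a]      = P(A = a)
+pB_A = [0.5 0.5; 0.1 0.9]      # pB_A[a, b] = P(B = b | A = a)
+pC_B = [0.6 0.4; 0.5 0.5]      # pC_B[b, c] = P(C = c | B = b)
+
+c_obs = 2                      # evidence: C is observed in state 2
+
+# MMAP: sum out B first, then maximize over A.
+summed = [sum(pA[a] * pB_A[a, b] * pC_B[b, c_obs] for b in 1:2) for a in 1:2]
+a_mmap = argmax(summed)
+
+# For comparison: maximizing jointly over (a, b), as in the MAP task.
+joint = argmax([pA[a] * pB_A[a, b] * pC_B[b, c_obs] for a in 1:2, b in 1:2])
+
+println("MMAP: A = ", a_mmap, "; joint MAP: (A, B) = ", Tuple(joint))
+```
+
+With these made-up numbers the MMAP assignment is ``A = 1``, whereas the joint
+maximization selects ``A = 2, B = 2``, which shows that summing and maximizing
+do not commute in general.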
+
+[^lauritzen1988local]:
+    Steffen L Lauritzen and David J Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. *Journal of the Royal Statistical Society: Series B (Methodological)*, 50(2):157–194, 1988.
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -12,6 +12,9 @@ probabilistic inference in graphical models.
 
 Solutions to the most common probabilistic inference tasks, including:
 
+- **Probability of evidence (PR)**: Calculates the total probability of the
+  observed evidence across all possible states of the unobserved variables.
+
 - **Marginal inference (MAR)**: Computes the probability distribution of a
   subset of variables, ignoring the states of all other variables.
 
@@ -21,12 +24,10 @@ Solutions to the most common probabilistic inference tasks, including:
 - **Marginal Maximum a Posteriori (MMAP)**: Finds the most probable state of a
   subset of variables, averaging out the uncertainty over the remaining ones.
 
-- **Probability of evidence (PR)**: Calculates the total probability of the
-  observed evidence across all possible states of the unobserved variables.
-
 ## Outline
 ```@contents
 Pages = [
+  "background.md",
   "generated/asia/README.md",
   "uai-file-formats.md",
   "performance.md",