Merge pull request #76 from Zinoex/fm/docs
Update docs
Zinoex authored Dec 4, 2024
2 parents 88af6a6 + c09fd22 commit 0653ea0
Showing 7 changed files with 259 additions and 78 deletions.
48 changes: 38 additions & 10 deletions docs/src/reference/systems.md
@@ -6,29 +6,57 @@ num_states(s::IntervalMarkovProcess)
initial_states(s::IntervalMarkovProcess)
AllStates
transition_prob(mp::IntervalMarkovProcess)
IntervalMarkovChain
IntervalMarkovDecisionProcess
IntervalMarkovChain
stateptr(mdp::IntervalMarkovDecisionProcess)
OrthogonalIntervalMarkovChain
OrthogonalIntervalMarkovDecisionProcess
OrthogonalIntervalMarkovChain
stateptr(mdp::OrthogonalIntervalMarkovDecisionProcess)
MixtureIntervalMarkovChain
MixtureIntervalMarkovDecisionProcess
MixtureIntervalMarkovChain
stateptr(mdp::MixtureIntervalMarkovDecisionProcess)
```

## Probability representation

### Interval ambiguity sets
```@docs
IntervalProbabilities
lower(p::IntervalProbabilities)
lower(p::IntervalProbabilities, i, j)
upper(p::IntervalProbabilities)
upper(p::IntervalProbabilities, i, j)
gap(p::IntervalProbabilities)
gap(p::IntervalProbabilities, i, j)
sum_lower(p::IntervalProbabilities)
sum_lower(p::IntervalProbabilities, j)
num_source(p::IntervalProbabilities)
num_target(p::IntervalProbabilities)
axes_source(p::IntervalProbabilities)
```
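
For orientation, here is a minimal sketch of building an interval ambiguity set and querying it with the accessors listed above. The keyword constructor mirrors the `IntervalProbabilities` calls appearing later in this commit; the module name `IntervalMDP` and the exact semantics of `gap` and `sum_lower` noted in the comments are assumptions.

```julia
using IntervalMDP  # assumed module name

# Columns are source/action pairs, rows are target states (two of each here).
prob = IntervalProbabilities(;
    lower = [
        0.0 0.5
        0.1 0.3
    ],
    upper = [
        0.5 0.7
        0.7 0.4
    ],
)

lower(prob)       # lower-bound matrix
upper(prob)       # upper-bound matrix
gap(prob)         # presumably upper(prob) - lower(prob)
sum_lower(prob)   # presumably the column sums of the lower bounds
num_source(prob)  # number of source/action pairs (columns), here 2
num_target(prob)  # number of target states (rows), here 2
```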

### Marginal interval ambiguity sets
```@docs
OrthogonalIntervalProbabilities
lower(p::OrthogonalIntervalProbabilities, l)
lower(p::OrthogonalIntervalProbabilities, l, i, j)
upper(p::OrthogonalIntervalProbabilities, l)
upper(p::OrthogonalIntervalProbabilities, l, i, j)
gap(p::OrthogonalIntervalProbabilities, l)
gap(p::OrthogonalIntervalProbabilities, l, i, j)
sum_lower(p::OrthogonalIntervalProbabilities, l)
sum_lower(p::OrthogonalIntervalProbabilities, l, j)
num_source(p::OrthogonalIntervalProbabilities)
num_target(p::OrthogonalIntervalProbabilities)
axes_source(p::OrthogonalIntervalProbabilities)
```
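
A short sketch of the per-marginal accessors, where the leading `l` argument selects the marginal; the construction follows the `OrthogonalIntervalProbabilities` calls shown later in this commit, and the second tuple is taken to be `source_dims` as described in the docstrings. The indexing convention in the last comment is an assumption.

```julia
# Two marginals over two states each (four joint states, one action per state),
# so every marginal carries four source/action columns.
p1 = IntervalProbabilities(;
    lower = [0.0 0.5 0.1 0.3; 0.1 0.3 0.2 0.4],
    upper = [0.5 0.7 0.6 0.5; 0.7 0.4 0.8 0.7],
)
p2 = IntervalProbabilities(;
    lower = [0.1 0.4 0.2 0.0; 0.3 0.0 0.1 0.2],
    upper = [0.4 0.6 0.5 0.6; 0.7 0.5 0.7 0.8],
)

prob = OrthogonalIntervalProbabilities((p1, p2), (Int32(2), Int32(2)))

lower(prob, 1)      # lower-bound matrix of the first marginal
upper(prob, 2)      # upper-bound matrix of the second marginal
gap(prob, 1, 2, 1)  # presumably the gap for target 2, source/action pair 1, in marginal 1
num_source(prob)    # number of source/action pairs, here 4
```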

### Mixtures of marginal interval ambiguity sets
```@docs
MixtureIntervalProbabilities
lower
upper
gap
sum_lower
num_source
num_target
axes_source
num_source(p::MixtureIntervalProbabilities)
num_target(p::MixtureIntervalProbabilities)
axes_source(p::MixtureIntervalProbabilities)
mixture_probs
weighting_probs
```
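
Similarly, a brief sketch of the mixture-level accessors; the construction mirrors the `MixtureIntervalProbabilities` example further down in this commit, and the comments on return values are assumptions.

```julia
# Two mixture components, each a one-marginal orthogonal ambiguity set over two states,
# plus an interval ambiguity set over the component weights.
p1 = OrthogonalIntervalProbabilities(
    (IntervalProbabilities(; lower = [0.0 0.5; 0.1 0.3], upper = [0.5 0.7; 0.7 0.4]),),
    (Int32(2),),
)
p2 = OrthogonalIntervalProbabilities(
    (IntervalProbabilities(; lower = [0.1 0.4; 0.3 0.0], upper = [0.4 0.6; 0.7 0.5]),),
    (Int32(2),),
)
w = IntervalProbabilities(; lower = [0.3 0.5; 0.4 0.3], upper = [0.8 0.7; 0.7 0.5])

mixture_prob = MixtureIntervalProbabilities((p1, p2), w)

mixture_probs(mixture_prob)               # presumably the tuple of the K = 2 components
weighting_probs(mixture_prob)             # presumably the weighting interval ambiguity set
lower(mixture_probs(mixture_prob)[1], 1)  # lower bounds of marginal 1 of the first component
num_source(mixture_prob)                  # number of source/action pairs, here 2
```
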
8 changes: 4 additions & 4 deletions docs/src/theory.md
@@ -13,7 +13,7 @@ where ``s = (s_1, \ldots, s_n)\in S``. We will denote the product ambiguity set

## IMDPs
Interval Markov Decision Processes (IMDPs), also called bounded-parameter MDPs [1], are a generalization of MDPs, where the transition probabilities, given source state and action, are not known exactly, but they are constrained to be in some probability interval.
Formally, an IMDP ``M`` is a tuple ``M = (S, S_0, A, \overline{P}, \underline{P})``, where
Formally, an IMDP ``M`` is a tuple ``M = (S, S_0, A, \Gamma)``, where

- ``S`` is a finite set of states,
- ``S_0 \subseteq S`` is a set of initial states,
@@ -38,14 +38,14 @@ Paths, strategies, and adversaries are defined similarly to IMDPs. See [3] for m
## Mixtures of OD-IMDPs
Mixtures of OD-IMDPs are included to address the issue that OD-IMDPs may not be able to represent all uncertainty in the transition probabilities. The mixture model is a convex combination of OD-IMDPs, where each OD-IMDP has its own set of ambiguity sets. Furthermore, the weights of the mixture are also interval-valued.

Formally, a mixture of OD-IMDPs ``M`` with ``K`` OD-IMDPs and ``n`` marginals is a tuple ``M = (S, S_0, A, \Gamma, \Gamma_\alpha)``, where
Formally, a mixture of OD-IMDPs ``M`` with ``K`` OD-IMDPs and ``n`` marginals is a tuple ``M = (S, S_0, A, \Gamma, \Gamma^\alpha)``, where
- ``S = S_1 \times \cdots \times S_n`` is a finite set of joint states with ``S_i`` being a finite set of states for the ``i``-th marginal,
- ``S_0 \subseteq S`` is a set of initial states,
- ``A`` is a finite set of actions,
- ``\Gamma = \{\Gamma_{r,s,a}\}_{r \in K, s \in S, a \in A}`` is a set of ambiguity sets, one for each source-action pair ``(s, a)`` and OD-IMDP ``r``, where each ``\Gamma_{r,s,a} = \bigotimes_{i=1}^n \Gamma^i_{r,s,a}`` and ``\Gamma^i_{r,s,a}`` is an interval ambiguity set over the ``i``-th marginal, i.e. over ``S_i``.
- ``\Gamma_\alpha`` is a set of interval ambiguity sets for the weights of the mixture, i.e. over ``\{1, \ldots, K\}``.
- ``\Gamma^\alpha = \{\Gamma^\alpha_{s,a}\}_{s \in S, a \in A}`` is a set of interval ambiguity sets for the weights of the mixture, i.e. over ``\{1, \ldots, K\}``.

A feasible distribution for a mixture of OD-IMDPs is ``\sum_{r \in K} \alpha_r \prod_{i = 1}^n \gamma_{r,s,a}`` where ``\alpha \in \Gamma_\alpha`` and ``\gamma_{r,s,a} \in \Gamma_{r,s,a}``. See [3] for more details on mixtures of OD-IMDPs.
A feasible distribution for a mixture of OD-IMDPs is ``\sum_{r \in K} \alpha_{s,a}(r) \prod_{i = 1}^n \gamma^i_{r,s,a}`` where ``\alpha_{s,a} \in \Gamma^\alpha_{s,a}`` and ``\gamma^i_{r,s,a} \in \Gamma^i_{r,s,a}`` for each source-action pair ``(s, a)``. See [3] for more details on mixtures of OD-IMDPs.
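
For concreteness (an illustrative expansion, not part of the definition above), with ``K = 2`` OD-IMDPs and ``n = 2`` marginals, a feasible distribution at a source-action pair ``(s, a)`` assigns to a target state ``s' = (s'_1, s'_2)`` the probability

```math
\alpha_{s,a}(1)\,\gamma^1_{1,s,a}(s'_1)\,\gamma^2_{1,s,a}(s'_2) + \alpha_{s,a}(2)\,\gamma^1_{2,s,a}(s'_1)\,\gamma^2_{2,s,a}(s'_2),
```

where ``\alpha_{s,a} \in \Gamma^\alpha_{s,a}`` and ``\gamma^i_{r,s,a} \in \Gamma^i_{r,s,a}``.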

### Reachability
In this formal framework, we can describe computing reachability given a target set ``G`` and a horizon ``K \in \mathbb{N} \cup \{\infty\}`` as the following objective
17 changes: 11 additions & 6 deletions src/models/IntervalMarkovDecisionProcess.jl
@@ -8,12 +8,17 @@
A type representing (stationary) Interval Markov Decision Processes (IMDP), which are Markov Decision Processes with uncertainty in the form of intervals on
the transition probabilities.
Formally, let ``(S, S_0, A, \\bar{P}, \\underbar{P})`` be an interval Markov decision process, where ``S`` is the set of states, ``S_0 \\subset S`` is the set of initial states,
``A`` is the set of actions, and ``\\bar{P} : A \\to \\mathbb{R}^{|S| \\times |S|}`` and ``\\underbar{P} : A \\to \\mathbb{R}^{|S| \\times |S|}`` are functions
representing the upper and lower bound transition probability matrices respectively for each action. Then the ```IntervalMarkovDecisionProcess``` type is
defined as follows: indices `1:num_states` are the states in ``S``, `transition_prob` represents ``\\bar{P}`` and ``\\underbar{P}``, actions are
implicitly defined by `stateptr` (e.g. if `stateptr[3] == 4` and `stateptr[4] == 7` then the actions available to state 3 are `[4, 5, 6]`),
and `initial_states` is the set of initial states ``S_0``. If no initial states are specified, then the initial states are assumed to be all states in ``S``.
Formally, let ``(S, S_0, A, \\Gamma)`` be an interval Markov decision process, where
- ``S`` is the set of states,
- ``S_0 \\subseteq S`` is the set of initial states,
- ``A`` is the set of actions, and
- ``\\Gamma = \\{\\Gamma_{s,a}\\}_{s \\in S, a \\in A}`` is a set of interval ambiguity sets on the transition probabilities, for each source-action pair.
Then the ```IntervalMarkovDecisionProcess``` type is defined as follows: indices `1:num_states` are the states in ``S``,
`transition_prob` represents ``\\Gamma``, actions are implicitly defined by `stateptr` (e.g. if `stateptr[3] == 4` and `stateptr[4] == 7` then
the actions available to state 3 are `[1, 2, 3]`), and `initial_states` is the set of initial states ``S_0``. If no initial states are specified,
then the initial states are assumed to be all states in ``S`` represented by `AllStates`. See [IntervalProbabilities](@ref) and [Theory](@ref) for more information
on the structure of the transition probability ambiguity sets.
### Fields
- `transition_prob::P`: interval on transition probabilities where columns represent source/action pairs and rows represent target states.
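
To make the `stateptr` convention in this docstring concrete, here is a small sketch mirroring the mixture example later in this commit; the two-argument `IntervalMarkovDecisionProcess(prob, stateptr)` constructor is an assumption by analogy.

```julia
# Two states; state 1 has two actions and state 2 has one, giving three
# source/action columns. stateptr marks where each state's columns begin.
prob = IntervalProbabilities(;
    lower = [
        0.0 0.5 0.1
        0.1 0.3 0.2
    ],
    upper = [
        0.5 0.7 0.6
        0.7 0.4 0.8
    ],
)
stateptr = Int32[1, 3, 4]  # state 1 -> columns 1:2, state 2 -> column 3
mdp = IntervalMarkovDecisionProcess(prob, stateptr)
```
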
88 changes: 64 additions & 24 deletions src/models/MixtureIntervalMarkovDecisionProcess.jl
@@ -8,37 +8,76 @@
A type representing (stationary) Mixture Interval Markov Decision Processes (mixtures of OD-IMDPs), which are IMDPs where the transition
probabilities for each state are represented as a mixture of products of the transition probabilities of individual processes.
# TODO: Update theory section
Formally, let ``(S, S_0, A, \\bar{P}, \\underbar{P})`` be an interval Markov decision process, where ``S`` is the set of states, ``S_0 \\subset S`` is the set of initial states,
``A`` is the set of actions, and ``\\bar{P} : A \\to \\mathbb{R}^{|S| \\times |S|}`` and ``\\underbar{P} : A \\to \\mathbb{R}^{|S| \\times |S|}`` are functions
representing the upper and lower bound transition probability matrices respectively for each action. Then the ```IntervalMarkovDecisionProcess``` type is
defined as follows: indices `1:num_states` are the states in ``S``, `transition_prob` represents ``\\bar{P}`` and ``\\underbar{P}``, actions are
implicitly defined by `stateptr` (e.g. if `stateptr[3] == 4` and `stateptr[4] == 7` then the actions available to state 3 are `[4, 5, 6]`),
and `initial_states` is the set of initial states ``S_0``. If no initial states are specified, then the initial states are assumed to be all states in ``S``.
Formally, let ``(S, S_0, A, \\Gamma, \\Gamma^\\alpha)`` be a mixture of orthogonal interval Markov decision processes, where
- ``S = S_1 \\times \\cdots \\times S_n`` is the set of joint states with ``S_i`` the set of states for the `i`-th marginal,
- ``S_0 \\subseteq S`` is the set of initial states,
- ``A`` is the set of actions,
- ``\\Gamma = \\{\\Gamma_{s,a}\\}_{s \\in S, a \\in A}`` is a set of interval ambiguity sets on the transition probabilities,
for each source-action pair, where ``\\Gamma_{s,a} = \\bigotimes_{i=1}^n \\Gamma_{s,a}^i`` and each ``\\Gamma_{s,a}^i`` is a marginal interval ambiguity set
on the ``i``-th marginal, and
- ``\\Gamma^\\alpha = \\{\\Gamma^\\alpha_{s,a}\\}_{s \\in S, a \\in A}`` is the interval ambiguity set for the mixture.
Then the ```MixtureIntervalMarkovDecisionProcess``` type is defined as follows: indices `1:num_states` are the states in ``S`` and
`transition_prob` represents ``\\Gamma`` and ``\\Gamma^\\alpha``. Actions are implicitly defined by `stateptr` (e.g. if `source_dims` in `transition_prob`
is `(2, 3, 2)`, and `stateptr[3] == 4` and `stateptr[4] == 7` then the actions available to state `CartesianIndex(1, 2, 1)` are `[1, 2, 3]`), and `initial_states`
is the set of initial states ``S_0``. If no initial states are specified, then the initial states are assumed to be all states in ``S``
represented by `AllStates`. See [MixtureIntervalProbabilities](@ref) and [Theory](@ref) for more information on the structure
of the transition probability ambiguity sets.
### Fields
# TODO: Update fields
- `transition_prob::P`: interval on transition probabilities where columns represent source/action pairs and rows represent target states.
- `stateptr::VT`: pointer to the start of each source state in `transition_prob` (i.e. `transition_prob[:, stateptr[j]:stateptr[j + 1] - 1]` is the transition
probability matrix for source state `j`) in the style of colptr for sparse matrices in CSC format.
- `transition_prob::P`: ambiguity set on transition probabilities (see [MixtureIntervalProbabilities](@ref) for the structure).
- `stateptr::VT`: pointer to the start of each source state in `transition_prob` (i.e. `transition_prob[k][l][:, stateptr[j]:stateptr[j + 1] - 1]` is the transition
probability matrix for source state `j` for each model `k` and axis `l`) in the style of colptr for sparse matrices in CSC format.
- `initial_states::VI`: initial states.
- `num_states::Int32`: number of states.
### Examples
# TODO: Update examples
```jldoctest
```
There is also a constructor for `MixtureIntervalMarkovDecisionProcess` where the transition probabilities are given as a list of
transition probabilities for each source state.
The following example is a simple mixture of two `OrthogonalIntervalProbabilities` with one dimension and the same source/action pairs.
The first state has two actions and the second state has one action. The weighting ambiguity set is also specified for the same three source-action pairs.
```jldoctest
prob1 = OrthogonalIntervalProbabilities(
    (
        IntervalProbabilities(;
            lower = [
                0.0 0.5 0.1
                0.1 0.3 0.2
            ],
            upper = [
                0.5 0.7 0.6
                0.7 0.4 0.8
            ],
        ),
    ),
    (Int32(2),),
)
prob2 = OrthogonalIntervalProbabilities(
    (
        IntervalProbabilities(;
            lower = [
                0.1 0.4 0.2
                0.3 0.0 0.1
            ],
            upper = [
                0.4 0.6 0.5
                0.7 0.5 0.7
            ],
        ),
    ),
    (Int32(2),),
)
weighting_probs = IntervalProbabilities(;
    lower = [
        0.3 0.5 0.4
        0.4 0.3 0.2
    ],
    upper = [
        0.8 0.7 0.7
        0.7 0.5 0.4
    ],
)
mixture_prob = MixtureIntervalProbabilities((prob1, prob2), weighting_probs)
stateptr = Int32[1, 3, 4]
mdp = MixtureIntervalMarkovDecisionProcess(mixture_prob, stateptr)
```
"""
struct MixtureIntervalMarkovDecisionProcess{
P <: MixtureIntervalProbabilities,
@@ -82,7 +82,7 @@ end
Construct a Mixture Interval Markov Chain from mixture interval transition probabilities. The initial states are optional and if not specified,
all states are assumed to be initial states. The number of states is inferred from the size of the transition probability matrix.
The returned type is an `OrthogonalIntervalMarkovDecisionProcess` with only one action per state (i.e. `stateptr[j + 1] - stateptr[j] == 1` for all `j`).
The returned type is a `MixtureIntervalMarkovDecisionProcess` with only one action per state (i.e. `stateptr[j + 1] - stateptr[j] == 1` for all `j`).
This is done to unify the interface for value iteration.
"""
function MixtureIntervalMarkovChain(
@@ -117,7 +156,8 @@ end
Return the state pointer of the Interval Markov Decision Process. The state pointer is a vector of integers where the `i`-th element
is the index of the first element of the `i`-th state in the transition probability matrix.
I.e. `transition_prob[:, stateptr[j]:stateptr[j + 1] - 1]` is the transition probability matrix for source state `j`.
I.e. `mixture_probs(transition_prob)[k][l][:, stateptr[j]:stateptr[j + 1] - 1]` is the independent transition probability matrix for (flattened) source state `j`
for axis `l` and model `k`, and `weighting_probs(transition_prob)[:, stateptr[j]:stateptr[j + 1] - 1]` is the weighting matrix for `j`.
"""
stateptr(mdp::MixtureIntervalMarkovDecisionProcess) = mdp.stateptr

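
A hedged sketch of the slicing described in the `stateptr` docstring above, continuing from the `mdp`, `mixture_prob`, and `stateptr` variables of the docstring example; the `transition_prob(mdp)` accessor is the one documented for `IntervalMarkovProcess`, and the matrix-style indexing follows the docstring's own notation rather than a verified API.

```julia
prob = transition_prob(mdp)  # the underlying MixtureIntervalProbabilities; stateptr(mdp) returns the pointer vector

j = 1                                     # source state 1
cols = stateptr[j]:(stateptr[j + 1] - 1)  # its source/action columns, here 1:2

mixture_probs(prob)[2][1][:, cols]  # marginal 1 of component 2, restricted to state 1
weighting_probs(prob)[:, cols]      # weighting intervals for state 1
```
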
45 changes: 25 additions & 20 deletions src/models/OrthogonalIntervalMarkovDecisionProcess.jl
@@ -8,35 +8,40 @@
A type representing (stationary) Orthogonal Interval Markov Decision Processes (OIMDP), which are IMDPs where the transition
probabilities for each state can be represented as the product of the transition probabilities of individual processes.
# TODO: Update theory section
Formally, let ``(S, S_0, A, \\bar{P}, \\underbar{P})`` be an interval Markov decision process, where ``S`` is the set of states, ``S_0 \\subset S`` is the set of initial states,
``A`` is the set of actions, and ``\\bar{P} : A \\to \\mathbb{R}^{|S| \\times |S|}`` and ``\\underbar{P} : A \\to \\mathbb{R}^{|S| \\times |S|}`` are functions
representing the upper and lower bound transition probability matrices respectively for each action. Then the ```IntervalMarkovDecisionProcess``` type is
defined as follows: indices `1:num_states` are the states in ``S``, `transition_prob` represents ``\\bar{P}`` and ``\\underbar{P}``, actions are
implicitly defined by `stateptr` (e.g. if `stateptr[3] == 4` and `stateptr[4] == 7` then the actions available to state 3 are `[4, 5, 6]`),
and `initial_states` is the set of initial states ``S_0``. If no initial states are specified, then the initial states are assumed to be all states in ``S``.
Formally, let ``(S, S_0, A, \\Gamma)`` be an orthogonal interval Markov decision process [1], where
- ``S = S_1 \\times \\cdots \\times S_n`` is the set of joint states with ``S_i`` the set of states for the `i`-th marginal,
- ``S_0 \\subseteq S`` is the set of initial states,
- ``A`` is the set of actions, and
- ``\\Gamma = \\{\\Gamma_{s,a}\\}_{s \\in S, a \\in A}`` is a set of interval ambiguity sets on the transition probabilities,
for each source-action pair, where ``\\Gamma_{s,a} = \\bigotimes_{i=1}^n \\Gamma_{s,a}^i`` and each ``\\Gamma_{s,a}^i`` is a marginal interval ambiguity set
on the ``i``-th marginal.
Then the ```OrthogonalIntervalMarkovDecisionProcess``` type is defined as follows: indices `1:num_states` are the states in ``S`` and
`transition_prob` represents ``\\Gamma``. Actions are implicitly defined by `stateptr` (e.g. if `source_dims` in `transition_prob` is `(2, 3, 2)`, and
`stateptr[3] == 4` and `stateptr[4] == 7` then the actions available to state `CartesianIndex(1, 2, 1)` are `[1, 2, 3]`), and `initial_states`
is the set of initial states ``S_0``. If no initial states are specified, then the initial states are assumed to be all states in ``S``
represented by `AllStates`. See [OrthogonalIntervalProbabilities](@ref) and [Theory](@ref) for more information on the structure
of the transition probability ambiguity sets.
### Fields
# TODO: Update fields
- `transition_prob::P`: interval on transition probabilities where columns represent source/action pairs and rows represent target states.
- `stateptr::VT`: pointer to the start of each source state in `transition_prob` (i.e. `transition_prob[:, stateptr[j]:stateptr[j + 1] - 1]` is the transition
probability matrix for source state `j`) in the style of colptr for sparse matrices in CSC format.
- `transition_prob::P`: interval on transition probabilities where columns represent source/action pairs and rows represent target states along each marginal.
- `stateptr::VT`: pointer to the start of each source state in `transition_prob` (i.e. `transition_prob[l][:, stateptr[j]:stateptr[j + 1] - 1]` is the transition
probability matrix for source state `j` for each axis `l`) in the style of colptr for sparse matrices in CSC format.
- `initial_states::VI`: initial states.
- `num_states::Int32`: number of states.
### Examples
Assume that `prob1`, `prob2`, and `prob3` are `IntervalProbabilities` for the first, second, and third axis, respectively, defined as the example
in [OrthogonalIntervalProbabilities](@ref). Then the following code constructs an `OrthogonalIntervalMarkovDecisionProcess` with three axes of three states each.
The number of actions per state is one, i.e. the model is a Markov chain. Therefore, the `stateptr` is a unit range `1:num_states + 1` and we can call
the convenience constructor `OrthogonalIntervalMarkovChain` instead.
# TODO: Update example
```jldoctest
prob = OrthogonalIntervalProbabilities((prob1, prob2, prob3), (Int32(3), Int32(3), Int32(3)))
mc = OrthogonalIntervalMarkovChain(prob)
```
There is also a constructor for `OrthogonalIntervalMarkovDecisionProcess` where the transition probabilities are given as a list of
transition probabilities for each source state.
```jldoctest
```
[1] Mathiesen, F. B., Haesaert, S., & Laurenti, L. (2024). Scalable control synthesis for stochastic systems via structural IMDP abstractions. arXiv preprint arXiv:2411.11803.
"""
struct OrthogonalIntervalMarkovDecisionProcess{
@@ -127,7 +132,7 @@ end
Return the state pointer of the Interval Markov Decision Process. The state pointer is a vector of integers where the `i`-th element
is the index of the first element of the `i`-th state in the transition probability matrix.
I.e. `transition_prob[:, stateptr[j]:stateptr[j + 1] - 1]` is the transition probability matrix for source state `j`.
I.e. `transition_prob[l][:, stateptr[j]:stateptr[j + 1] - 1]` is the transition probability matrix for (flattened) source state `j` for axis `l`.
"""
stateptr(mdp::OrthogonalIntervalMarkovDecisionProcess) = mdp.stateptr

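
Finally, a hedged sketch of an `OrthogonalIntervalMarkovDecisionProcess` with more than one action, following the `source_dims`/`stateptr` description in the docstring above; the two-argument constructor is an assumption by analogy with the mixture example in this commit.

```julia
# Two marginals with two states each (four joint states). The first joint state has
# two actions and the rest have one, giving five source/action columns per marginal.
p1 = IntervalProbabilities(;
    lower = [0.0 0.5 0.1 0.3 0.2; 0.1 0.3 0.2 0.4 0.1],
    upper = [0.5 0.7 0.6 0.5 0.6; 0.7 0.4 0.8 0.7 0.9],
)
p2 = IntervalProbabilities(;
    lower = [0.1 0.4 0.2 0.0 0.3; 0.3 0.0 0.1 0.2 0.1],
    upper = [0.4 0.6 0.5 0.6 0.7; 0.7 0.5 0.7 0.8 0.6],
)
prob = OrthogonalIntervalProbabilities((p1, p2), (Int32(2), Int32(2)))
stateptr = Int32[1, 3, 4, 5, 6]  # joint state 1 -> columns 1:2, the rest one column each
mdp = OrthogonalIntervalMarkovDecisionProcess(prob, stateptr)
```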
