WIP: Equinox.Cosmos Storage + Programming Model description #50
Comments
This was referenced Nov 26, 2018
Maybe this should go in the wiki with a WIP label, it looks quite useful.
Good point - in fact, the thought crossed my mind just this morning (reason I made it an Issue is that some form of this needs to go in the README too). But the perfect shouldn't be the enemy of the good, so I'm on it...
Thank you!
NB this is long and needs lots of editing.
Storage model (see source)
Batches
Events are stored in immutable batches consisting of:
- `p`artitionKey: `string` // stream identifier
- `i`ndex: `int64` // base index of this batch (`0` for first event in a stream)
- `id`: `string` // same as `i` (CosmosDb forces every doc to have one, and it must be a `string`)
- `e`vents: `Event[]` // (see next section) typically there is one item in the array (can be many if events are small, for RU and perf-efficiency reasons)
- `ts` // CosmosDb-intrinsic last updated date for this record (changes when replicated etc, hence see `t` below)
Events
Per `Event`, we have the following:
- `c`ase - the case of this union in the Discriminated Union of events this stream bears (aka Event Type)
- `d`ata - json data (CosmosDb maintains it as actual json; it can be indexed and queried if desired)
- `m`etadata - carries ancillary information for an event
- `t` - creation timestamp
Tip Batch
The tip is readable via a point-read, as the `id` has a fixed known value (`-1`). It uses the same base layout as an Event-Batch, but adds the following:
- `_etag`: CosmosDb-managed field updated per-touch (facilitates `NotModified` result, see below)
- `id`: always `-1` so one can reference it in a point-read GET request and not pay the cost and latency associated with a full query
- `u`: Array of `unfold`ed events based on a point-in-time state (see State, Snapshots, Events and Unfolds, Unfolded Events and `unfold` in the programming model section)
State, Snapshots, Events and Unfolds
In an Event Sourced system, we typically distinguish between the following basic elements:
Events - Domain Events representing actual real world events that have occurred, reflecting the domain as understood by domain experts - see Event Storming. The customer favorited the item, the customer saved SKU Y for later, $200 got charged with transaction id X.
State - derived representations established from Events. A given set of code in an environment will, in service of some decision making process, interpret those events as implying a particular state in a model. If we change the code slightly or add a field, you wouldn't necessarily expect a version of your code from a year ago to generate an equivalent state that you can simply blast into your object model and go. (But you could easily hold a copy in memory as long as your process runs.)
Snapshots - A snapshot is an intentionally roundtrippable version of a State, which can be saved and restored. Typically one would do this to save the cost of loading all the Events in a long running sequence of Events to re-establish the State. The EventStore folks have a great walkthrough on Rolling Snapshots.
Projections - the term projection is heavily overloaded, meaning anything from the proceeds of a SELECT statement, the result of a `map` operation, an EventStore projection, an event being propagated via Kafka (no, further examples are not required)... and:
`unfold` is based on the FP function of that name, bearing the signature `'state -> 'event seq`. When using `Equinox.Cosmos`, the `unfold` produces projections, represented as events to snapshot the state at a position in the stream.
Generating and saving `unfold`ed events
Periodically, along with writing the events that a decision function yields to represent the implications of a command given the present state, we also `unfold` the resulting `state'` and supply those to the `sync` function too. The `unfold` function takes the `state` and projects one or more snapshot-events which can be used to reestablish the same state we have thus far derived from watching the events on the stream. Unlike normal events, `unfold`ed events do not get replicated to other systems, and can also be thrown away at will (we also compress them rather than storing them as fully expanded json).
Reading from the Storage Model
Most reads request tip with an `IfNoneMatch` precondition citing the `etag` it bore when we last saw it, which, when combined with a cache means one of the following happens when a reader is trying to establish the state of a stream prior to processing a Command:
- `NotModified` (depending on workload, can be the dominant case) - for `1` RU, minimal latency and close-to-`0` network bandwidth, we know the present state
- `NotFound` (there's nothing in the stream) - for equivalently low cost, we know the state is `initial`
- `Found` - (if there are multiple writers and/or we don't have a cached version) - for the minimal possible cost (a point read, not a query), we have all we need to establish the state:
  - `i`: a version number
  - `e`: events since that version number
  - `u`: unfolded auxiliary events computed at the same time as the batch of events was sent (aka projections/snapshots) - (these enable us to establish the `state` without further queries or roundtrips to load and fold all preceding events)
Building a state from the Storage Model and/or the Cache
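As a rough illustration of how the three read outcomes from the previous section turn into a state, the sketch below maps a tip read (plus an optional cached state) onto a result. Every name here (`TipRead`, `Unfold`, `establish`) is hypothetical and not the Equinox.Cosmos API. In particular it assumes each unfolded event records how many stream events it summarizes, and that the cached state is tagged with the number of events folded so far. It returns `None` where a query for earlier batches would be needed.

```fsharp
// Hypothetical shapes, loosely following the tip layout described above
type Event = { c: string; d: string }                 // case name + json payload
type Unfold = { version: int64; snapshot: Event }     // assumption: count of events summarized
type Tip = { i: int64; e: Event list; u: Unfold list; etag: string }

type TipRead =
    | NotModified        // etag matched; the cached state is current
    | NotFound           // nothing in the stream
    | Found of Tip       // tip document returned by the point read

let establish
        (fold: 'state -> Event seq -> 'state)
        (initial: 'state)
        (isStart: Event -> bool)
        (cached: (int64 * 'state) option)   // (number of events folded so far, state)
        (read: TipRead) : 'state option =
    // True when `version` falls within the span of events held in the tip
    let inTip (tip: Tip) version =
        version >= tip.i && version <= tip.i + int64 tip.e.Length
    // Events in the tip at or after index `version`
    let eventsFrom (tip: Tip) version : Event seq =
        Seq.skip (int (version - tip.i)) tip.e
    match read, cached with
    | NotModified, Some (_, state) -> Some state
    | NotModified, None -> None                       // no cached state to fall back on
    | NotFound, _ -> Some initial
    | Found tip, Some (version, state) when inTip tip version ->
        // Cheapest route: fold only the events we have not yet seen
        Some (fold state (eventsFrom tip version))
    | Found tip, _ ->
        // Otherwise start from an unfolded event that is a valid starting point
        tip.u
        |> List.tryFind (fun x -> isStart x.snapshot && inTip tip x.version)
        |> Option.map (fun x -> fold initial (Seq.append [ x.snapshot ] (eventsFrom tip x.version)))
        |> Option.orElse (if tip.i = 0L then Some (fold initial (Seq.ofList tip.e)) else None)
```

The ordering of the match arms mirrors the cost ordering in the text: cached state first, then unfolds, and only then anything that would require loading earlier batches.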
Given a stream with:
If we have `state4` based on the events up to `{i:3, c:c1, d: d4}` and the index document, we can produce the `state` by folding in a variety of ways:
- `fold initial [ C1 d1; C2 d2; C3 d3; C1 d4; C3 d5 ]` (but would need a query to load the first 2 batches, with associated RUs and roundtrips)
- `fold state4 [ C3 d5 ]` (only need to pay to transport the tip document as a point read)
- (if `isStart (S1 s5)` = `true`): `fold initial [S1 s5]` (point read + transport + decompress `s5`)
- (if `isStart (S2 s4)` = `true`): `fold initial [S2 s4; C3 d5]` (only need to pay to transport the tip document as a point read and decompress `s4` and `s5`)
If we have
`state3` based on the events up to `{i:2, c:c3, d: d3}`, we can produce the `state` by folding in a variety of ways:
- `fold initial [ C1 d1; C2 d2; C3 d3; C1 d4; C3 d5 ]` (but query, roundtrips)
- `fold state3 [C1 d4; C3 d5]` (only pay for point read+transport)
- `fold initial [S2 s4; C3 d5]` (only pay for point read+transport)
- (if `isStart (S1 s5)` = `true`): `fold initial [S1 s5]` (point read + transport + decompress `s5`)
- (if `isStart (S2 s4)` = `true`): `fold initial [S2 s4; C3 d5]` (only need to pay to transport the tip document as a point read and decompress `s4` and `s5`)
If we have
`state5` based on the events up to `C3 d5`, and (being the writer, or a recent reader), have the etag: `etagXYZ`, we can do an `HTTP GET` with `etag: IfNoneMatch etagXYZ`, which will return `304 Not Modified` with < 1K of data, and a charge of `1.00` RU allowing us to derive the state as: `state5`
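To make the equivalence of these folding routes concrete, here is a small sketch. The meaning given to each case (`C1` adds, `C2` subtracts, `C3` multiplies, `S1`/`S2` replace the state) and the payload values are invented purely for illustration; only the case names and the shape of the `fold` calls come from the lists above.

```fsharp
// Hypothetical event semantics, chosen only so that ordering matters
type Event =
    | C1 of int   // add to the running value
    | C2 of int   // subtract from the running value
    | C3 of int   // multiply the running value
    | S1 of int   // unfolded snapshot of the whole state
    | S2 of int   // a second snapshot case with the same meaning

let initial = 0

let evolve state ev =
    match ev with
    | C1 n -> state + n
    | C2 n -> state - n
    | C3 n -> state * n
    | S1 s | S2 s -> s

let fold (state: int) (events: Event seq) = Seq.fold evolve state events

let isStart = function S1 _ | S2 _ -> true | _ -> false

// Arbitrary payloads standing in for d1..d5
let d1, d2, d3, d4, d5 = 10, 3, 2, 5, 3

let state3 = fold initial [ C1 d1; C2 d2; C3 d3 ]
let state4 = fold state3 [ C1 d4 ]
let state5 = fold state4 [ C3 d5 ]
let s4, s5 = state4, state5   // what unfold would have captured at each point

// Every route described above should arrive at the same state
let routes =
    [ fold initial [ C1 d1; C2 d2; C3 d3; C1 d4; C3 d5 ]   // full replay (needs a query)
      fold state4 [ C3 d5 ]                                // cached state4 + tip events
      fold state3 [ C1 d4; C3 d5 ]                         // cached state3 + tip events
      fold initial [ S1 s5 ]                               // latest snapshot alone
      fold initial [ S2 s4; C3 d5 ] ]                      // older snapshot + later event

let allAgree = routes |> List.forall ((=) state5)
```

The routes differ only in what has to be fetched and decompressed, which is why the cheapest applicable one is preferred.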
Programming model
In F#, the Equinox programming model involves, per aggregation of events on a given category of stream:
- `'state`: the state required to support the decision or query being supported (not serializable or stored; can be held in a .NET `MemoryCache`)
- `initial: 'state`: the implied state of an empty stream
- `'event`: a discriminated union representing all the possible Events from which a state can be `evolve`d (see `e` and `u` in the data model). Typically the mapping of the json to an `'event` `c`ase is driven by a `UnionContractEncoder`
- `fold : 'state -> 'event seq -> 'state`: function used to fold events (real ones and/or unfolded ones) into the running `'state`
- `evolve: 'state -> 'event -> 'state` - the `folder` function from which `fold` is built, representing the application of the delta the `'event` implies for the model to the `state`
- `decide: 'state -> 'command -> 'event list`: responsible for (in an idempotent manner) interpreting a `command` in the context of a `state` as the `events` that should be written to the stream to record the decision
When using the
`Equinox.Cosmos` adapter, one will typically implement two further functions in order to avoid every `'event` in the stream having to be loaded and processed in order to build the `'state` (versus a single cheap point read from CosmosDb to read the tip):
- `unfold: 'state -> 'event seq`: function used to render events representing the state which facilitate quickly re-establishing a `state` without needing to go back to the first event that occurred on a stream
- `isStart: 'event -> bool`: predicate indicating whether a given `'event` is sufficient as a starting point e.g.
High level Command Processing flow
When running a decision process, we thus have the following stages:
1. establish the `'state` (based on a given position in the stream of Events)
2. let the `decide` function look at the request/command and yield a set of events (or none) that represent the effect of that decision in terms of events
3a. if there is no conflict (nobody else decided anything since we decided what we'd do), append the events to the stream (record the new position and etag)
3b. if there is a conflict, take the conflicting events that other writers have produced since step 1, `fold` them into our state, and go back to 2 (the CosmosDb stored procedure sends them back immediately, at no additional cost or latency)
Sync stored procedure high level flow (see source)
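Seen from the client side, the stored procedure is the final step of the retry loop in the command processing stages above. A minimal sketch of that loop follows. The `SyncResult` type, the `run` function and the shape of the `sync` parameter are assumptions made for illustration and are not the Equinox.Cosmos API; etag bookkeeping is left out.

```fsharp
// Hypothetical result of a sync attempt
type SyncResult<'ev> =
    | Written of version: int64                        // our events were appended
    | Conflict of version: int64 * missed: 'ev list    // others wrote first; here is what we missed

let rec run
        (fold: 'state -> 'ev seq -> 'state)
        (decide: 'state -> 'command -> 'ev list)
        (sync: int64 -> 'ev list -> SyncResult<'ev>)   // expected version -> events -> outcome
        (version: int64, state: 'state)                // stage 1: state as of a known position
        (command: 'command) : int64 * 'state =
    // Stage 2: decide which events (if any) the command implies for this state
    match decide state command with
    | [] -> version, state                             // nothing to write
    | events ->
        match sync version events with
        // Stage 3a: no conflict, so record the new position and state
        | Written version' -> version', fold state (List.toSeq events)
        // Stage 3b: fold in what other writers produced, then go back to stage 2
        | Conflict (version', missed) ->
            run fold decide sync (version', fold state (List.toSeq missed)) command
```

Because `decide` is re-run against the refreshed state, a command that has become redundant after the conflicting events yields no events and the loop ends without a write.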
The `sync` stored procedure takes a document as input which is almost identical to the format of the tip batch (in fact, if the stream is found to be empty, it forms the template for the first document created in the stream). The request includes the following elements:
- `expectedVersion`: the position the requestor is basing their proposed batch of events on (no, an `etag` would not be relevant)
- `e`: array of Events (see Event, above) to append if the expectedVersion check is fulfilled
- `u`: array of `unfold`ed events which supersede items with equivalent `c`ase values (aka snapshots, projections)
- `maxEvents`: the maximum number of events to record in an individual batch. For example:
  - if `e` contains 2 events, the tip document's `e` has 2 events and the `maxEvents` is `5`, the events get merged into the tip
  - if `maxEvents` is `1`, the tip gets frozen as a `Batch`, and the new request becomes the tip (as an atomic transaction on the server side)
- `thirdPartyUnfoldRetention`: how many events to keep before the base (`i`) of the batch if required by lagging `u`nfolds which would otherwise fall out of scope as a result of the appends in this batch (this will default to `0`, so for example if a writer says maxEvents `10` and there is an `u`nfold based on an event more than `10` old it will be removed as part of the appending process)
Example
The following example is a minimal version of the Favorites model, with shortcuts for brevity (yes, and imperfect performance characteristics):
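A sketch along those lines is given below, under assumptions: the case names (`Favorited`, `Unfavorited`, `Snapshotted`), the command type and the use of a `Set<string>` as state are hypothetical choices and may differ from the Equinox samples. It only shows how the functions from the programming model section fit together, without any storage wiring.

```fsharp
// Hypothetical Favorites model; names are illustrative only
type Event =
    | Favorited of sku: string
    | Unfavorited of sku: string
    | Snapshotted of skus: string[]     // unfolded event carrying the whole state

type Command =
    | Favorite of sku: string
    | Unfavorite of sku: string

type State = Set<string>

// The implied state of an empty stream
let initial : State = Set.empty

// Apply the delta a single event implies to the state
let evolve (state: State) ev : State =
    match ev with
    | Favorited sku -> Set.add sku state
    | Unfavorited sku -> Set.remove sku state
    | Snapshotted skus -> Set.ofArray skus

let fold (state: State) (events: Event seq) : State = Seq.fold evolve state events

// Idempotent: a command that would not change the state yields no events
let decide (state: State) command : Event list =
    match command with
    | Favorite sku when not (Set.contains sku state) -> [ Favorited sku ]
    | Unfavorite sku when Set.contains sku state -> [ Unfavorited sku ]
    | _ -> []

// Render the state as a snapshot event that can be stored alongside the tip
let unfold (state: State) : Event seq = Seq.singleton (Snapshotted (Set.toArray state))

// A snapshot is sufficient as a starting point; ordinary events are not
let isStart = function Snapshotted _ -> true | _ -> false
```

Copying the full set into every snapshot is one of the shortcuts that trades performance for brevity.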