estore
Short for event-store, estore is a pure Erlang store for mostly ordered, time-indexed events or logs.
It is built around the concept of sequential reads and writes and allows for distributed storage (like riak_core). estore achieves this through a few mechanisms.
First, writes are allowed to be out of order within a certain grace period. This makes it possible to use a simple append even if events don't arrive in perfect order. It also means that a read has to scan a bit beyond the requested time range.
For example, with a grace period of 30s, a read between 21:01 and 21:02 would read up to 21:02:30, discarding every event that has a timestamp after 21:02.
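The widened read described above can be sketched as follows. This is only an illustration, not estore's actual code: the module name `grace_window`, both function names, and the `{Timestamp, Id, Payload}` event shape are assumptions made for the example.

```erlang
-module(grace_window).
-export([scan_range/3, in_range/3]).

%% All values are plain integers in one unit (seconds in the example below).
%% Only the upper bound is extended: an event that arrives late is appended
%% after the position it would have had in a perfectly ordered file.
scan_range(From, To, Grace) ->
    {From, To + Grace}.

%% Keep only the events whose own timestamp lies inside [From, To].
%% Events are assumed to be {Timestamp, Id, Payload} tuples.
in_range(Events, From, To) ->
    [E || {Ts, _Id, _Payload} = E <- Events, Ts >= From, Ts =< To].
```

With 21:01 and 21:02 expressed as seconds since midnight (75660 and 75720) and a grace period of 30, `scan_range/3` yields an upper bound of 75750, which is 21:02:30.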
While the grace period covers most events that arrive out of order, sometimes it is not enough. In these cases the events are put in the reconciliation (recon) file. This file does not guarantee any order.
This has the downside that during a read all elements in the recon file have to be checked. However, tests so far have shown that the recon file remains fairly small under real-life conditions, even with many events arriving from multiple servers.
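One way to picture the decision between the store file and the recon file is the sketch below. The exact rule estore applies is not spelled out here, so the comparison against the newest appended timestamp, the module name `write_route`, and the returned atoms are all assumptions.

```erlang
-module(write_route).
-export([route/3]).

%% Newest: highest timestamp appended so far (assumed bookkeeping).
%% Grace:  how far behind Newest an event may be and still be appended.
%% Returns the hypothetical atoms 'append' or 'recon'.
route(Ts, Newest, Grace) when Ts >= Newest - Grace ->
    append;
route(_Ts, _Newest, _Grace) ->
    recon.
```

Under this rule an event that is at most one grace period older than the newest appended event still goes to the ordered file; anything older ends up in the unordered recon file.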
estore keeps an index file that holds index entries for 'most' events written to the event store file. The index is not guaranteed or required to contain an entry for every event. However, having the index avoids scanning the whole store file.
Because indexing is not required to be complete, it can be tuned and optimized later.
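A partial index of this kind can still narrow down where a scan starts. The sketch below is a generic sparse-index lookup, not estore's index format: the `{Timestamp, Offset}` entries, the module name `sparse_index`, and the function name are assumptions, and how the grace period affects the chosen target is left out.

```erlang
-module(sparse_index).
-export([start_offset/2]).

%% Index: list of {Timestamp, Offset}, sorted ascending by timestamp.
%% Returns the offset of the last entry at or before Target,
%% or 0 (start of file) when no such entry exists.
start_offset(Target, Index) ->
    start_offset(Target, Index, 0).

start_offset(Target, [{Ts, Offset} | Rest], _Best) when Ts =< Target ->
    start_offset(Target, Rest, Offset);
start_offset(_Target, _Index, Best) ->
    Best.
```

A missing entry only makes the scan start earlier than strictly necessary; it never causes events to be skipped, which is why the index does not have to be complete.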
Each event is required to carry a 32 bit ID. In a distributed setup these IDs allow events to be repaired without accidentally duplicating or deduplicating them.
This makes it possible to have two equal events with the same timestamp and content and still repair them.
The small size was chosen because the ID only needs to be unique among events with the same timestamp and the same content.
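The role of the ID during repair can be illustrated with a small merge of two replicas. This is a sketch under assumptions: events are modelled as `{Timestamp, Id, Payload}` tuples, and the module name `repair_merge` is made up for the example.

```erlang
-module(repair_merge).
-export([merge/2]).

%% Events are assumed to be {Timestamp, Id, Payload} tuples.
%% The whole tuple acts as the identity, so lists:usort/1 removes
%% exact copies only and keeps events that differ just in their ID.
merge(Local, Remote) ->
    lists:usort(Local ++ Remote).
```

If both replicas hold `{100, 1, <<"x">>}` and only one holds `{100, 2, <<"x">>}`, the merged result contains each of the two events exactly once: the shared copy is not duplicated, and the second, otherwise identical event is not dropped.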
estore has two interfaces. One is estore itself, which keeps files small by sharding events across multiple efiles based on time range; that way the recon files stay as small as possible. It also works well when distributing and sharding by time range.
The other, efile, is the 'simple' implementation of the concepts mentioned above.
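Sharding by time range can be pictured as mapping each timestamp to the start of a fixed-size window. How estore actually names or selects its files is not described here, so the module name `time_shard`, the function, and the idea of identifying a shard by its window start are assumptions.

```erlang
-module(time_shard).
-export([shard_start/2]).

%% Ts and FileSize must use the same unit.
%% Returns the start of the window [Start, Start + FileSize)
%% that contains Ts.
shard_start(Ts, FileSize) when FileSize > 0, Ts >= 0 ->
    (Ts div FileSize) * FileSize.
```

With a file size of one day in seconds (86400), every timestamp from 86400 up to 172799 maps to the same shard start, 86400.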
Building is rather simple:
$ rebar3 compile
The normal interface is estore, so this section discusses only that; the efile interface is nearly identical.
%% Create an estore with a file size of 1 day and a grace period of 1 minute.
T0 = erlang:system_time(nano_seconds).
{ok, E1} = estore:new("examplestore", [{file_size, {1, d}}, {grace, {1, m}}]).
%% Append a single event; the call returns an updated store handle.
{ok, E2} = estore:append([estore:event(<<"my event">>)], E1).
%% Read all events written between T0 and now.
{ok, Es, E3} = estore:read(T0, erlang:system_time(nano_seconds), E2).