Update general documentation and examples
whitfin committed Sep 13, 2024
1 parent e93df9c commit c9d3961
Showing 6 changed files with 131 additions and 106 deletions.
58 changes: 0 additions & 58 deletions docs/general/action-blocks.md

This file was deleted.

88 changes: 88 additions & 0 deletions docs/general/batching-actions.md
@@ -0,0 +1,88 @@
# Batching Actions

It's sometimes the case that you need to execute several cache actions in a row. Although you can do this in the normal way, it's somewhat inefficient, as each call has to perform various management tasks (such as looking up the cache state). For this reason, Cachex offers several mechanisms for making multiple calls in sequence.

## Submitting Batches

The simplest way to make several cache calls together is `Cachex.execute/3`. This API accepts a function which receives a pre-validated cache state; using this state (instead of the cache name) to execute cache actions skips all of the cache management overhead you'd typically see:

```elixir
# standard way to execute several actions
r1 = Cachex.get!(:my_cache, "key1")
r2 = Cachex.get!(:my_cache, "key2")
r3 = Cachex.get!(:my_cache, "key3")

# using Cachex.execute/3 to optimize the batch of calls
{r1, r2, r3} =
  Cachex.execute!(:my_cache, fn cache ->
    # execute our batch of actions
    r1 = Cachex.get!(cache, "key1")
    r2 = Cachex.get!(cache, "key2")
    r3 = Cachex.get!(cache, "key3")

    # pass back all results as a tuple
    {r1, r2, r3}
  end)
```

Although this syntax might look a little more complicated at first glance, it's fairly straightforward to get used to, and this small change in approach gives a fairly large boost to cache throughput. To compare the two examples above, we can use a tool like [Benchee](https://github.com/PragTob/benchee) for a rough comparison:

```
Name ips average deviation median 99th %
grouped 1.72 M 580.68 ns ±3649.68% 500 ns 750 ns
individually 1.31 M 764.02 ns ±2335.25% 625 ns 958 ns
```

We can clearly see the time saved by the batched approach, even with the large deviation in the numbers above. Intuitively, the saving scales with the number of actions in your batch, even if it's unlikely that anyone is doing more than a few calls at once.
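For reference, numbers like these can be gathered with a Benchee harness along the following lines. This is only a sketch: the package versions, the shortened run times, and the pre-populated keys are all assumptions, not part of the original benchmark.

```elixir
# sketch of the comparison above; package versions are assumptions
Mix.install([{:cachex, "~> 3.6"}, {:benchee, "~> 1.3"}])

{:ok, _pid} = Cachex.start_link(:my_cache)
for key <- ["key1", "key2", "key3"], do: Cachex.put!(:my_cache, key, 1)

suite =
  Benchee.run(
    %{
      "individually" => fn ->
        Cachex.get!(:my_cache, "key1")
        Cachex.get!(:my_cache, "key2")
        Cachex.get!(:my_cache, "key3")
      end,
      "grouped" => fn ->
        Cachex.execute!(:my_cache, fn cache ->
          Cachex.get!(cache, "key1")
          Cachex.get!(cache, "key2")
          Cachex.get!(cache, "key3")
        end)
      end
    },
    # keep run times short for demonstration purposes
    time: 1,
    warmup: 0.5
  )
```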

It's important to note that even though you're executing a batch of actions, other processes can still access and modify keys at any time during your `Cachex.execute/3` call. These calls still occur in your calling process; they're not routed through any kind of arbitration process. To demonstrate this, here's a quick example:

```elixir
# start our execution block
Cachex.execute!(:my_cache, fn cache ->
  # set a base value in the cache
  Cachex.put!(cache, "key", "value")

  # we're paused, but other changes can happen
  :timer.sleep(5000)

  # this may have been modified elsewhere
  Cachex.get!(cache, "key")
end)
```

As we wait 5 seconds before reading the value back, the value may have been modified or even removed by other processes using the cache (such as TTL cleanup, or other parts of your application). If you want to guarantee that nothing is modified between your interactions, you should consider a transactional block instead.

## Transactional Batches

A transactional block will guarantee that your actions against a cache key will happen with zero interaction from other processes. Transactions look almost exactly the same as `Cachex.execute/3`, except that they require a list of keys to lock for the duration of their execution.

The entry point to a Cachex transaction is (unsurprisingly) `Cachex.transaction/4`. If we take the example from the previous section, let's look at how we can guarantee consistency between our cache calls:

```elixir
# start our transaction block
Cachex.transaction!(:my_cache, ["key"], fn cache ->
  # set a base value in the cache
  Cachex.put!(cache, "key", "value")

  # we're paused, but other changes will not happen
  :timer.sleep(5000)

  # this is guaranteed to return "value"
  Cachex.get!(cache, "key")
end)
```

It's critical to provide the keys you wish to lock when calling `Cachex.transaction/4`, as any keys not specified can still be written by other processes during your function's execution. If you're making a simple cache call, the transactional flow is only taken if there's a simultaneous transaction happening against the same key. This enables caches to stay lightweight whilst still allowing these batches when they really matter.
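As a sketch of this granularity (the cache setup, package version, and key names here are purely illustrative), a write to a key outside the lock list completes immediately, even while the transaction holds its lock:

```elixir
Mix.install([{:cachex, "~> 3.6"}])

{:ok, _pid} = Cachex.start_link(:my_cache)

result =
  Cachex.transaction!(:my_cache, ["locked"], fn cache ->
    Cachex.put!(cache, "locked", 1)

    # "unlocked" is not in the lock list, so this concurrent write
    # completes immediately, even mid-transaction
    true = Task.await(Task.async(fn -> Cachex.put!(:my_cache, "unlocked", 2) end))

    Cachex.get!(cache, "locked")
  end)
```

A concurrent write to `"locked"` itself would instead wait until the transaction returns.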

Another pattern which may prove useful is providing an empty list of keys, which will guarantee that your transaction runs at a time when no keys in the cache are currently locked. For example, the following code will guarantee that no keys are locked when purging expired records:

```elixir
Cachex.transaction!(:my_cache, [], fn cache ->
  Cachex.purge!(cache)
end)
```

Transactional flows are only enabled the first time you call `Cachex.transaction/4`, so you shouldn't see any performance penalty if you're not actively using transactions. This also has the benefit of not requiring transaction support to be configured inside the cache options, as was the case in earlier versions of Cachex.

The last major difference between `Cachex.execute/3` and `Cachex.transaction/4` is where they run; transactions are executed inside a secondary worker process, so each transaction will run only after the previous one has completed. As such, there's a minor performance overhead when working with transactions, so use them only when you need them.
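Because transactions serialize through this worker, a read-modify-write inside a transaction cannot lose updates to a concurrent transaction. A sketch of this (the cache setup and package version are assumptions):

```elixir
Mix.install([{:cachex, "~> 3.6"}])

{:ok, _pid} = Cachex.start_link(:my_cache)
Cachex.put!(:my_cache, "counter", 0)

# two concurrent read-modify-write transactions against the same key
1..2
|> Enum.map(fn n ->
  Task.async(fn ->
    Cachex.transaction!(:my_cache, ["counter"], fn cache ->
      current = Cachex.get!(cache, "counter")
      Cachex.put!(cache, "counter", current + n)
    end)
  end)
end)
|> Task.await_many()

# both updates applied: the transactions ran back to back, not interleaved
total = Cachex.get!(:my_cache, "counter")
```

Running the same read-modify-write via `Cachex.execute/3` would instead be a race, as nothing prevents the two reads from interleaving.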
34 changes: 0 additions & 34 deletions docs/general/disk-interaction.md

This file was deleted.

29 changes: 29 additions & 0 deletions docs/general/local-persistence.md
@@ -0,0 +1,29 @@
# Local Persistence

Cachex ships with basic support for dumping a cache to a local file using the [Erlang Term Format](https://www.erlang.org/doc/apps/erts/erl_ext_dist). These files can then be used to seed data into a new instance of a cache to persist values between cache instances.

As it stands, all persistence must be handled manually via the Cachex API, although convenience features may be added around this in future. Note that the use of the term "dump" over "backup" is intentional, as these files are simply extracted datasets from a cache, rather than a serialization of the cache itself.

## Writing to Disk

To dump a cache to a file on disk, you can use the `Cachex.dump/3` function. This function supports an optional `:compression` option (from `0` to `9`) to reduce the required disk space. By default this value is set to `1`, balancing performance against disk usage. Another common approach is to dump with `compression: 0` and compress the file outside the Erlang VM.

```elixir
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache.dump")
```

The above demonstrates how simple it is to dump your cache to a location on disk (in this case `/tmp/my_cache.dump`). Any options can be provided as a `Keyword` list as an optional third parameter.
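For example, the `:compression` option mentioned above can be passed in that third parameter. The setup lines and file paths below are purely illustrative:

```elixir
# setup sketched for completeness; in practice the cache already exists
Mix.install([{:cachex, "~> 3.6"}])
{:ok, _pid} = Cachex.start_link(:my_cache)
Cachex.put!(:my_cache, "key", "value")

# trade extra CPU time for a smaller file on disk
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache_9.dump", compression: 9)

# or skip compression entirely, e.g. to compress outside the VM afterwards
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache_0.dump", compression: 0)
```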

## Loading from Disk

To seed a cache from an existing dump, you can use `Cachex.load/3`. This will *merge* the dump into your cache, overwriting any clashing keys while maintaining keys which existed in the cache beforehand. If you want your cache to exactly match the dump, you should use `Cachex.clear/2` before loading your data.

```elixir
# optionally clean your cache first
{ :ok, _amt } = Cachex.clear(:my_cache)

# then you can load the existing dump into your cache
{ :ok, true } = Cachex.load(:my_cache, "/tmp/my_cache.dump")
```

Please note that loading from an existing dump will maintain all existing expirations, and records which have already expired will *not* be added to the cache table. This should not be surprising, but it is worth calling out.
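This behavior can be sketched as follows; the setup, package version, and the 50ms TTL here are all illustrative:

```elixir
Mix.install([{:cachex, "~> 3.6"}])
{:ok, _pid} = Cachex.start_link(:my_cache)

# write a record which expires almost immediately, then dump the cache
Cachex.put!(:my_cache, "short_lived", "value", ttl: 50)
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache.dump")

# wait out the expiration, clear the cache, and load the dump back in
:timer.sleep(100)
Cachex.clear!(:my_cache)
{ :ok, true } = Cachex.load(:my_cache, "/tmp/my_cache.dump")

# the record had already expired, so the load skipped it
loaded = Cachex.get!(:my_cache, "short_lived")
```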
@@ -1,8 +1,8 @@
-# Streaming Caches
+# Streaming Records

Cachex provides the ability to create an Elixir `Stream` seeded by the contents of a cache, using an ETS table continuation and `Stream.resource/3`. This then allows the developer to use any of the `Enum` or `Stream` module functions against the entries in cache, which can be a very powerful and flexible tool.

-## Simple Streams
+## Basic Streams

By default, `Cachex.stream/3` will return a `Stream` over all entries in a cache which are yet to expire (at the time of stream creation). These cache entries will be streamed as `Cachex.Spec.entry` records, so you can use pattern matching to pull any of the entry fields assuming you have `Cachex.Spec` imported:

@@ -18,13 +18,13 @@ Cachex.put(:my_cache, "three", 3)

# == 6
:my_cache
-|> Cachex.stream!
-|> Enum.reduce(0, fn(entry(value: value), total) ->
-  total + value
-end)
+|> Cachex.stream!()
+|> Enum.reduce(0, fn entry(value: value), total ->
+  total + value
+end)
```

-## Efficient Streams
+## Efficient Querying

While the `Enum` module provides the ability to filter records easily, we can do better by pre-filtering using a match specification. Under the hood these matches are as defined by the Erlang documentation, and can be passed as the second argument to `Cachex.stream/3`.

@@ -41,15 +41,15 @@ Cachex.put(:my_cache, "two", 2)
Cachex.put(:my_cache, "three", 3)

# generate our filter to find odd values
-filter = { :==, { :rem, :value, 2 }, 1 }
+filter = {:==, {:rem, :value, 2}, 1}

# generate the query using the filter, only return `:value`
query = Cachex.Query.create(where: filter, output: :value)

# == 4
:my_cache
|> Cachex.stream!(query)
-|> Enum.sum
+|> Enum.sum()
```

Rather than retrieve and handle the whole cache entry, here we're using `:output` to choose only the `:value` column from each entry. This lets us skip out on `Enum.reduce/3` and go directly to `Enum.sum/1`, much easier!
@@ -67,7 +67,7 @@ Cachex.put(:my_cache, "two", 2)
Cachex.put(:my_cache, "three", 3)

# generate our filter to find odd values
-filter = { :==, { :rem, :value, 2 }, 1 }
+filter = {:==, {:rem, :value, 2}, 1}

# wrap our filter to filter expired values
filter = Cachex.Query.expired(filter)
@@ -78,7 +78,7 @@ query = Cachex.Query.create(where: filter, output: :value)
# == 4
:my_cache
|> Cachex.stream!(query)
-|> Enum.sum
+|> Enum.sum()
```

This function accepts a query guard and wraps it with clauses to filter out expired records. The returned guard can then be passed to `Cachex.Query.create/1` to return only the expired records which match your query. This is all fairly simple, but it's definitely something to keep in mind when working with `Cachex.Query`!
6 changes: 3 additions & 3 deletions mix.exs
@@ -34,9 +34,9 @@ defmodule Cachex.Mixfile do
extras: [
"docs/extensions/custom-commands.md",
"docs/extensions/execution-lifecycle.md",
-"docs/general/action-blocks.md",
-"docs/general/disk-interaction.md",
-"docs/general/streaming-caches.md",
+"docs/general/batching-actions.md",
+"docs/general/local-persistence.md",
+"docs/general/streaming-records.md",
"docs/management/limiting-caches.md",
"docs/management/expiring-records.md",
"docs/migrations/migrating-to-v3.md",
