Update general documentation and examples
whitfin committed Sep 13, 2024
1 parent e93df9c commit c9d3961
Showing 6 changed files with 131 additions and 106 deletions.
58 changes: 0 additions & 58 deletions docs/general/action-blocks.md

This file was deleted.

88 changes: 88 additions & 0 deletions docs/general/batching-actions.md
@@ -0,0 +1,88 @@
# Batching Actions

It's sometimes the case that you need to execute several cache actions in a row. Although you can do this in the normal way, it's somewhat inefficient, as each call has to perform various management tasks (such as looking up the cache state). For this reason, Cachex offers several mechanisms for making multiple calls in sequence.

## Submitting Batches

The simplest way to make several cache calls together is `Cachex.execute/3`. This API accepts a function which receives a pre-validated cache state; using this state (instead of the cache name) to execute cache actions skips all of the cache management overhead you'd typically see:

```elixir
# standard way to execute several actions
r1 = Cachex.get!(:my_cache, "key1")
r2 = Cachex.get!(:my_cache, "key2")
r3 = Cachex.get!(:my_cache, "key3")

# using Cachex.execute/3 to optimize the batch of calls
{r1, r2, r3} =
  Cachex.execute!(:my_cache, fn cache ->
    # execute our batch of actions
    r1 = Cachex.get!(cache, "key1")
    r2 = Cachex.get!(cache, "key2")
    r3 = Cachex.get!(cache, "key3")

    # pass back all results as a tuple
    {r1, r2, r3}
  end)
```

Although this syntax might look a little more complicated at first glance, it's fairly straightforward to get used to, and this small change in approach gives a fairly large boost to cache throughput. To compare the two examples above, we can use a tool like [Benchee](https://github.com/PragTob/benchee) for a rough comparison:

```
Name ips average deviation median 99th %
grouped 1.72 M 580.68 ns ±3649.68% 500 ns 750 ns
individually 1.31 M 764.02 ns ±2335.25% 625 ns 958 ns
```

We can clearly see the time saved by the batched approach, even with the large deviation in the numbers above. Intuitively, the saving scales with the number of actions in your batch, even if it's unlikely that anyone is doing more than a few calls at once.
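For reference, numbers like these can be gathered with a Benchee harness along the following lines. This is only a sketch: the package versions, the shortened run times, and the pre-populated keys are all assumptions, not part of the original benchmark.

```elixir
# sketch of the comparison above; package versions are assumptions
Mix.install([{:cachex, "~> 3.6"}, {:benchee, "~> 1.3"}])

{:ok, _pid} = Cachex.start_link(:my_cache)
for key <- ["key1", "key2", "key3"], do: Cachex.put!(:my_cache, key, 1)

suite =
  Benchee.run(
    %{
      "individually" => fn ->
        Cachex.get!(:my_cache, "key1")
        Cachex.get!(:my_cache, "key2")
        Cachex.get!(:my_cache, "key3")
      end,
      "grouped" => fn ->
        Cachex.execute!(:my_cache, fn cache ->
          Cachex.get!(cache, "key1")
          Cachex.get!(cache, "key2")
          Cachex.get!(cache, "key3")
        end)
      end
    },
    # keep run times short for demonstration purposes
    time: 1,
    warmup: 0.5
  )
```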

It's important to note that even though you're executing a batch of actions, other processes can still access and modify keys at any time during your `Cachex.execute/3` call. These calls still occur in your calling process; they're not routed through any kind of arbitration process. To demonstrate this, here's a quick example:

```elixir
# start our execution block
Cachex.execute!(:my_cache, fn cache ->
  # set a base value in the cache
  Cachex.put!(cache, "key", "value")

  # we're paused, but other changes can happen
  :timer.sleep(5000)

  # this may have been modified elsewhere
  Cachex.get!(cache, "key")
end)
```

As we wait 5 seconds before reading the value back, the value may have been modified or even removed by other processes using the cache (such as TTL cleanup, or other parts of your application). If you want to guarantee that nothing is modified between your interactions, you should consider a transactional block instead.

## Transactional Batches

A transactional block will guarantee that your actions against a cache key will happen with zero interaction from other processes. Transactions look almost exactly the same as `Cachex.execute/3`, except that they require a list of keys to lock for the duration of their execution.

The entry point to a Cachex transaction is (unsurprisingly) `Cachex.transaction/4`. If we take the example from the previous section, let's look at how we can guarantee consistency between our cache calls:

```elixir
# start our transaction block
Cachex.transaction!(:my_cache, ["key"], fn cache ->
  # set a base value in the cache
  Cachex.put!(cache, "key", "value")

  # we're paused, but other changes will not happen
  :timer.sleep(5000)

  # this is guaranteed to return "value"
  Cachex.get!(cache, "key")
end)
```

It's critical to provide the keys you wish to lock when calling `Cachex.transaction/4`, as any keys not specified can still be written by other processes during your function's execution. If you're making a simple cache call, the transactional flow is only taken if there's a simultaneous transaction happening against the same key. This enables caches to stay lightweight whilst still allowing these batches when they really matter.
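As a sketch of this granularity (the cache setup, package version, and key names here are purely illustrative), a write to a key outside the lock list completes immediately, even while the transaction holds its lock:

```elixir
Mix.install([{:cachex, "~> 3.6"}])

{:ok, _pid} = Cachex.start_link(:my_cache)

result =
  Cachex.transaction!(:my_cache, ["locked"], fn cache ->
    Cachex.put!(cache, "locked", 1)

    # "unlocked" is not in the lock list, so this concurrent write
    # completes immediately, even mid-transaction
    true = Task.await(Task.async(fn -> Cachex.put!(:my_cache, "unlocked", 2) end))

    Cachex.get!(cache, "locked")
  end)
```

A concurrent write to `"locked"` itself would instead wait until the transaction returns.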

Another pattern which may prove useful is providing an empty list of keys, which will guarantee that your transaction runs at a time when no keys in the cache are currently locked. For example, the following code will guarantee that no keys are locked when purging expired records:

```elixir
Cachex.transaction!(:my_cache, [], fn cache ->
  Cachex.purge!(cache)
end)
```

Transactional flows are only enabled the first time you call `Cachex.transaction/4`, so you shouldn't see any performance penalty if you're not actively using transactions. This also has the benefit of not requiring transaction support to be configured inside the cache options, as was the case in earlier versions of Cachex.

The last major difference between `Cachex.execute/3` and `Cachex.transaction/4` is where they run; transactions are executed inside a secondary worker process, so each transaction will run only after the previous one has completed. As such, there's a minor performance overhead when working with transactions, so use them only when you need them.
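Because transactions serialize through this worker, a read-modify-write inside a transaction cannot lose updates to a concurrent transaction. A sketch of this (the cache setup and package version are assumptions):

```elixir
Mix.install([{:cachex, "~> 3.6"}])

{:ok, _pid} = Cachex.start_link(:my_cache)
Cachex.put!(:my_cache, "counter", 0)

# two concurrent read-modify-write transactions against the same key
1..2
|> Enum.map(fn n ->
  Task.async(fn ->
    Cachex.transaction!(:my_cache, ["counter"], fn cache ->
      current = Cachex.get!(cache, "counter")
      Cachex.put!(cache, "counter", current + n)
    end)
  end)
end)
|> Task.await_many()

# both updates applied: the transactions ran back to back, not interleaved
total = Cachex.get!(:my_cache, "counter")
```

Running the same read-modify-write via `Cachex.execute/3` would instead be a race, as nothing prevents the two reads from interleaving.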
34 changes: 0 additions & 34 deletions docs/general/disk-interaction.md

This file was deleted.

29 changes: 29 additions & 0 deletions docs/general/local-persistence.md
@@ -0,0 +1,29 @@
# Local Persistence

Cachex ships with basic support for dumping a cache to a local file using the [Erlang Term Format](https://www.erlang.org/doc/apps/erts/erl_ext_dist). These files can then be used to seed data into a new instance of a cache to persist values between cache instances.

As it stands, all persistence must be handled manually via the Cachex API, although convenience features may be added around this in future. Note that the use of the term "dump" over "backup" is intentional, as these files are simply extracted datasets from a cache, rather than a serialization of the cache itself.

## Writing to Disk

To dump a cache to a file on disk, you can use the `Cachex.dump/3` function. This function supports an optional `:compression` option (from `0` to `9`) to reduce the required disk space. By default this value is set to `1`, balancing performance against disk usage. Another common approach is to dump with `compression: 0` and compress the file outside the Erlang VM.

```elixir
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache.dump")
```

The above demonstrates how simple it is to dump your cache to a location on disk (in this case `/tmp/my_cache.dump`). Any options can be provided as a `Keyword` list as an optional third parameter.
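For example, the `:compression` option mentioned above can be passed in that third parameter. The setup lines and file paths below are purely illustrative:

```elixir
# setup sketched for completeness; in practice the cache already exists
Mix.install([{:cachex, "~> 3.6"}])
{:ok, _pid} = Cachex.start_link(:my_cache)
Cachex.put!(:my_cache, "key", "value")

# trade extra CPU time for a smaller file on disk
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache_9.dump", compression: 9)

# or skip compression entirely, e.g. to compress outside the VM afterwards
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache_0.dump", compression: 0)
```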

## Loading from Disk

To seed a cache from an existing dump, you can use `Cachex.load/3`. This will *merge* the dump into your cache, overwriting any clashing keys while maintaining keys which existed in the cache beforehand. If you want your cache to exactly match the dump, you should use `Cachex.clear/2` before loading your data.

```elixir
# optionally clean your cache first
{ :ok, _amt } = Cachex.clear(:my_cache)

# then you can load the existing dump into your cache
{ :ok, true } = Cachex.load(:my_cache, "/tmp/my_cache.dump")
```

Please note that loading from an existing dump will maintain all existing expirations, and records which have already expired will *not* be added to the cache table. This should not be surprising, but it is worth calling out.
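This behavior can be sketched as follows; the setup, package version, and the 50ms TTL here are all illustrative:

```elixir
Mix.install([{:cachex, "~> 3.6"}])
{:ok, _pid} = Cachex.start_link(:my_cache)

# write a record which expires almost immediately, then dump the cache
Cachex.put!(:my_cache, "short_lived", "value", ttl: 50)
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache.dump")

# wait out the expiration, clear the cache, and load the dump back in
:timer.sleep(100)
Cachex.clear!(:my_cache)
{ :ok, true } = Cachex.load(:my_cache, "/tmp/my_cache.dump")

# the record had already expired, so the load skipped it
loaded = Cachex.get!(:my_cache, "short_lived")
```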
@@ -1,8 +1,8 @@
-# Streaming Caches
+# Streaming Records

Cachex provides the ability to create an Elixir `Stream` seeded by the contents of a cache, using an ETS table continuation and `Stream.resource/3`. This then allows the developer to use any of the `Enum` or `Stream` module functions against the entries in cache, which can be a very powerful and flexible tool.

-## Simple Streams
+## Basic Streams

By default, `Cachex.stream/3` will return a `Stream` over all entries in a cache which are yet to expire (at the time of stream creation). These cache entries will be streamed as `Cachex.Spec.entry` records, so you can use pattern matching to pull any of the entry fields assuming you have `Cachex.Spec` imported:

@@ -18,13 +18,13 @@ Cachex.put(:my_cache, "three", 3)

# == 6
:my_cache
-|> Cachex.stream!
-|> Enum.reduce(0, fn(entry(value: value), total) ->
-  total + value
-end)
+|> Cachex.stream!()
+|> Enum.reduce(0, fn entry(value: value), total ->
+  total + value
+end)
```

-## Efficient Streams
+## Efficient Querying

While the `Enum` module provides the ability to filter records easily, we can do better by pre-filtering using a match specification. Under the hood these matches are as defined by the Erlang documentation, and can be passed as the second argument to `Cachex.stream/3`.

@@ -41,15 +41,15 @@ Cachex.put(:my_cache, "two", 2)
Cachex.put(:my_cache, "three", 3)

# generate our filter to find odd values
-filter = { :==, { :rem, :value, 2 }, 1 }
+filter = {:==, {:rem, :value, 2}, 1}

# generate the query using the filter, only return `:value`
query = Cachex.Query.create(where: filter, output: :value)

# == 4
:my_cache
|> Cachex.stream!(query)
-|> Enum.sum
+|> Enum.sum()
```

Rather than retrieve and handle the whole cache entry, here we're using `:output` to choose only the `:value` column from each entry. This lets us skip out on `Enum.reduce/3` and go directly to `Enum.sum/1`, much easier!
@@ -67,7 +67,7 @@ Cachex.put(:my_cache, "two", 2)
Cachex.put(:my_cache, "three", 3)

# generate our filter to find odd values
-filter = { :==, { :rem, :value, 2 }, 1 }
+filter = {:==, {:rem, :value, 2}, 1}

# wrap our filter to filter expired values
filter = Cachex.Query.expired(filter)
@@ -78,7 +78,7 @@ query = Cachex.Query.create(where: filter, output: :value)
# == 4
:my_cache
|> Cachex.stream!(query)
-|> Enum.sum
+|> Enum.sum()
```

This function accepts a query guard and wraps it with clauses to filter out expired records. The returned guard can then be passed to `Cachex.Query.create/1` to return only the expired records which match your query. This is all fairly simple, but it's definitely something to keep in mind when working with `Cachex.Query`!
6 changes: 3 additions & 3 deletions mix.exs
@@ -34,9 +34,9 @@ defmodule Cachex.Mixfile do
extras: [
"docs/extensions/custom-commands.md",
"docs/extensions/execution-lifecycle.md",
-"docs/general/action-blocks.md",
-"docs/general/disk-interaction.md",
-"docs/general/streaming-caches.md",
+"docs/general/batching-actions.md",
+"docs/general/local-persistence.md",
+"docs/general/streaming-records.md",
"docs/management/limiting-caches.md",
"docs/management/expiring-records.md",
"docs/migrations/migrating-to-v3.md",
