Provides durable Clojure reference types where the values are not always local to the program.
Places an emphasis on the ability to share references to values between programs.
Heavily inspired by Rich Hickey's talk The Language of the System.
Available via clojars:
[riverford/durable-ref "0.1.6"]
Begin with the tutorial
- value: an immutable, weakly interned, caching reference.
- atomic: a clojure.lang.Atom style coordinated reference, for safe concurrent updates.
- volatile: a basic mutable reference; concurrent updates are unsafe.
Available by requiring the appropriate riverford.durable-ref.scheme namespace; see each section for details.
- Amazon S3, supporting value, volatile
- Amazon DynamoDB, supporting value, volatile, atomic
- Redis, supporting value, volatile, atomic
Available by requiring the appropriate riverford.durable-ref.format namespace; see each section for details.
It is often useful to be able to refer to a value across machines, or to preserve a reference to a value across restarts, particularly when the value is large and impractical to convey directly.
Often you will see this:
(put-value! storage k v)
And later:
(get-value storage k)
k is then a reference in your program to the value in storage.
This is sufficient for many programs. It is close to typical object-oriented style, and allows for different storage implementations.
However, there are some problems with using this style to reference values:
- k does not reflect any properties of the reference, e.g. can I rely on it being immutable?
- Storage itself and storage format are often complected together.
- The referencing scheme needs to be known by code that wants to do lookups (e.g. what is k? Do I need an equivalent storage instance?)
This library defines an extensible URI based set of conventions that allows different reference mechanisms to be implemented on top of the same storages and formats.
Reference a value like this:
value:s3://my-bucket/my-path/a5e744d0164540d33b1d7ea616c28f2fa97e754a.edn
Or like this:
value:file:///Users/me/obj/a5e744d0164540d33b1d7ea616c28f2fa97e754a.json
Or a volatile (mutable) reference like this:
volatile:file:///Users/me/people/fred.edn.zip
This approach conveys several benefits:
- All information required to deref a value is encoded in the reference itself (e.g location, format, path).
- Storage and format are separate components, independent of reference semantics, and can be changed (and extended) independently.
- Semantics of the reference are encoded in the scheme, allowing one to, for example, leverage the persistence/immutability of the reference to cache values pervasively.
- URIs are themselves values; they can be exchanged freely over the wire in different formats, whilst preserving the meaning of the reference.
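To make the first point concrete, here is a rough sketch of pulling the pieces out of a reference URI using only java.net.URI. parse-ref is a hypothetical helper written for this README, not part of the library, and it only handles the simple two-layer URIs shown above (a nested scheme such as dynamodb:http://... would need one more layer of unwrapping).

```clojure
(import 'java.net.URI)
(require '[clojure.string :as str])

(defn parse-ref
  "Hypothetical helper: splits a reference URI string into its parts.
  Not the library's own parsing code."
  [s]
  (let [outer (URI. s)                               ; e.g. volatile:file:///...
        inner (URI. (.getSchemeSpecificPart outer))  ; e.g. file:///...
        path  (.getPath inner)
        file  (last (str/split path #"/"))
        ;; everything after the first dot is treated as the format
        [id ext] (str/split file #"\." 2)]
    {:ref-type (.getScheme outer)
     :storage  (.getScheme inner)
     :path     path
     :id       id
     :format   ext}))

(parse-ref "volatile:file:///Users/me/people/fred.edn.zip")
;; => {:ref-type "volatile", :storage "file", :path "/Users/me/people/fred.edn.zip", :id "fred", :format "edn.zip"}
```

Nothing outside the string itself is consulted, which is the property the list above is describing.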
I hope this library starts the conversation on how durable reference types should be implemented in a consistent way. The goal is that references can be shared between programs without them having knowledge of one another's internals.
Because value references require immutability, programs are able to cache the value of the reference across all instances of that reference.
This is done internally by weak-interning results of (reference uri) calls where the uri denotes a value (as opposed to a mutable or ambiguous reference).
This means you can maintain an arbitrary number of aliases of the reference and only pay the cost of a dereference once (or until all instances of the reference are GC'd).
There is no caching of deref results on mutable references.
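As an illustration of the idea (not the library's actual implementation), weak interning can be sketched with a java.util.WeakHashMap whose values are WeakReferences. All names here (CachingRef, intern-ref, reference*, value*, slow-fetch) are made up for the sketch, and the cache treats nil as "not yet fetched" to keep it short.

```clojure
(import '(java.util WeakHashMap)
        '(java.lang.ref WeakReference))

;; uri string -> WeakReference to the canonical ref object.
;; The ref holds its key strongly, so the entry lives as long as the ref does.
(defonce ^WeakHashMap interned (WeakHashMap.))

(defrecord CachingRef [uri cache])

(defn intern-ref
  "Returns the canonical instance for k, creating it with make if none is live."
  [k make]
  (locking interned
    (let [^WeakReference wr (.get interned k)
          existing (when wr (.get wr))]
      (or existing
          (let [r (make k)]
            (.put interned k (WeakReference. r))
            r)))))

(defn reference* [uri-str]
  (intern-ref uri-str (fn [k] (->CachingRef k (atom nil)))))

(defn value*
  "Derefs r, calling fetch only if nothing is cached yet."
  [r fetch]
  (let [cache (:cache r)]
    (or @cache (reset! cache (fetch (:uri r))))))

;; Stand-in for an expensive storage read.
(def fetch-count (atom 0))
(defn slow-fetch [uri] (swap! fetch-count inc) {:name "fred"})

(let [a (reference* "value:mem://tmp/abc.edn")
      b (reference* (str "value:mem://tmp/" "abc.edn"))]
  [(identical? a b) (value* a slow-fetch) (value* b slow-fetch) @fetch-count])
;; => [true {:name "fred"} {:name "fred"} 1]
```

Both aliases resolve to the same object while one of them is still reachable, so the fetch happens once.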
The API for references is in the riverford.durable-ref.core namespace.
(require '[riverford.durable-ref.core :as dref])
Pick a suitable directory on your machine for storing values. I am going to use /Users/danielstone/objects
Obtain a durable reference to a value with persist, passing a base-uri (directory), the object, and optional opts (e.g. {:as "edn.zip"}).
(def fred-ref
  (dref/persist "file:///users/danielstone/objects"
                {:name "fred"
                 :age 42}))
fred-ref
;; =>
#object[riverford.durable_ref.core.DurableValueRef
"value:file:///users/danielstone/objects/7664124773263ad3bda79e9267e1793915c09e2d.edn"]
Notice the reference URI you get back includes a sha1 identity hash of the object.
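For intuition, a content-derived name of this shape can be produced with java.security.MessageDigest. This is only a sketch: sha1-hex and content-name are hypothetical helpers, and hashing the pr-str output is my assumption, so the digest it prints will not necessarily match the one the library generates above.

```clojure
(import 'java.security.MessageDigest)

(defn sha1-hex
  "Hex-encoded SHA-1 digest of a byte array."
  [^bytes bs]
  (->> (.digest (MessageDigest/getInstance "SHA-1") bs)
       (map #(format "%02x" %))
       (apply str)))

(defn content-name
  "Hypothetical: names a value after the hash of its printed form."
  [obj ext]
  (str (sha1-hex (.getBytes (pr-str obj) "UTF-8")) "." ext))

(content-name {:name "fred" :age 42} "edn")
;; => a 40-character hex digest followed by ".edn"
```

Because the name is a function of the content, equal values map to the same name, which is what makes the reference safe to treat as immutable.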
References implement clojure.lang.IDeref:
@fred-ref
;; =>
{:name "fred", :age 42}
Alternatively, references can be dereferenced with value, perhaps to signal the fact that a deref could fail (due to unavailability of storage).
(dref/value fred-ref)
;; =>
{:name "fred", :age 42}
value also supports additional options (e.g. to forward to storage and format implementations).
You can obtain a URI of the reference
(dref/uri fred-ref)
;; =>
#object[java.net.URI 0x437f6d9e "value:file:///users/danielstone/objects/7664124773263ad3bda79e9267e1793915c09e2d.edn"]
Most ref operations, such as value, support using a URI or string directly.
(dref/value "value:file:///users/danielstone/objects/7664124773263ad3bda79e9267e1793915c09e2d.edn")
;; =>
{:name "fred", :age 42}
Values are cached for value references, and reference instances themselves are weak-interned via a WeakHashMap. Repeated value/persist calls on the same value will be very cheap while reference instances are on the heap.
If you want to evict a cached value in a ref, use evict!.
If storage changes, value references will throw on deref.
reference reacquires a reference object of the correct type from a URI or string.
(dref/reference "value:file:///users/danielstone/objects/7664124773263ad3bda79e9267e1793915c09e2d.edn")
;; =>
#object[riverford.durable_ref.core.DurableValueRef
"value:file:///users/danielstone/objects/7664124773263ad3bda79e9267e1793915c09e2d.edn"]
Even if your storage does not immediately reflect your write, it's OK as long as you retain the reference returned by persist, because the value is pre-cached. Due to reference weak-interning, you can alias it as a URI or string, and as long as the reference hasn't been GC'd you will continue to see the value.
First, decide on a global URI for your mutable reference. I will use a temporary in-memory reference to keep the tutorial simple.
atomic:mem://tmp/fred.edn
You can call value on it (even if it has never been written to).
(dref/value "atomic:mem://tmp/fred.edn")
;; =>
nil
You can mutate the ref by applying a function with atomic-swap!. The function will be applied atomically, and the ref will assume the result as the new value.
Like swap! on an atom, atomic-swap! returns the result of applying the function.
(dref/atomic-swap! "atomic:mem://tmp/fred.edn" (fnil inc 0))
;; =>
1
(dref/atomic-swap! "atomic:mem://tmp/fred.edn" (fnil inc 0))
;; =>
2
You can mutate (ignoring any existing value) the ref with overwrite!
(dref/overwrite! "atomic:mem://tmp/fred.edn" {:name "fred"})
;; =>
nil
You can call reference on a URI or string to acquire an atomic reference object.
(def fred-atom-ref (dref/reference "atomic:mem://tmp/fred.edn"))
fred-atom-ref
;; =>
#object[riverford.durable_ref.core.DurableAtomicRef "atomic:mem://tmp/fred.edn"]
All reference objects implement clojure.lang.IDeref
@fred-atom-ref
;; =>
{:name "fred"}
Atomic references implement clojure.lang.IAtom
(swap! fred-atom-ref assoc :age 42)
;; =>
{:name "fred", :age 42}
Finally the ref can be deleted (when the storage supports it) with delete!
(dref/delete! fred-atom-ref)
;; =>
nil
@fred-atom-ref
;; =>
nil
As with atomic references, first decide on an appropriate name for your volatile reference.
volatile:file:///users/danielstone/objects/fred.edn
You can call value on it (even if it has never been written to).
(dref/value "volatile:file:///users/danielstone/objects/fred.edn")
;; =>
nil
You can mutate the ref with overwrite!
(dref/overwrite! "volatile:file:///users/danielstone/objects/fred.edn" {:name "fred"})
;; =>
nil
;; be aware that the ability to read immediately
;; is determined by the consistency properties of your storage
;; (always assume the possibility of stale values)
(dref/value "volatile:file:///users/danielstone/objects/fred.edn")
;; =>
{:name "fred"}
You can call reference on it to acquire a reference object.
(def fred-mut-ref (dref/reference "volatile:file:///users/danielstone/objects/fred.edn"))
fred-mut-ref
;; =>
#object[riverford.durable_ref.core.DurableVolatileRef "volatile:file:///users/danielstone/objects/fred.edn"]
The reference object implements clojure.lang.IDeref
@fred-mut-ref
;; =>
{:name "fred"}
Finally mutable refs can be deleted (when the storage supports it) with delete!
(dref/delete! fred-mut-ref)
;; =>
nil
@fred-mut-ref
;; =>
nil
Scheme: mem
URI convention: mem://{path-a}/{path-b ...}/{id}.{ext}
e.g mem://testing/fred.edn
Supported refs: value, volatile, atomic
Transient in-memory storage. Useful for testing. I would not recommend using it in production.
Scheme: file
URI convention: file:///{folder-a}/{folder-b ...}/{id}.{extension}
e.g file:///Users/me/foo/fred.edn
Supported refs: value, volatile
Local disk backed storage.
Scheme: s3
URI convention: s3://{bucket}/{folder-a}/{folder-b ...}/{id}.{extension}
e.g s3://my-bucket/foo/fred.edn
Supported refs: value, volatile
S3 is good for value refs, particularly if they are large and accessed infrequently. Be aware of its eventual consistency, however.
using amazonica
:dependencies [amazonica "0.3.77"]
(require '[riverford.durable-ref.scheme.s3.amazonica])
;; Storage options (optionally provide in an options map to persist, value, overwrite!, delete!)
;; see amazonica documentation for more information
{:scheme {:s3 {:amazonica {:shared-opts {} ;; spliced into all amazonica requests
                           :read-opts {}   ;; spliced into get-object requests
                           :write-opts {}  ;; spliced into put-object requests
                           :delete-opts {} ;; spliced into delete-object requests
                           }}}}
Scheme: dynamodb
URI convention: dynamodb:http://dynamodb.{region}.amazonaws.com/{table}/{id}.{extension}
e.g dynamodb:http://dynamodb.eu-west-1.amazonaws.com/my-table/fred.edn
Supported refs: value, volatile, atomic
Does not work with arbitrary tables; it requires a table with a single string hash-key, id.
The column data is used to store serialized objects. The column version is used to implement conditional puts.
using amazonica
:dependencies [amazonica "0.3.77"]
(require '[riverford.durable-ref.scheme.dynamodb.amazonica])
;; Storage options (optionally provide in an options map to persist, value, overwrite!, delete!, atomic-swap!)
;; see amazonica documentation for more information
{:scheme {:dynamodb {:amazonica {:shared-opts {} ;; spliced into all amazonica requests
                                 :read-opts {}   ;; spliced into get-item requests
                                 :write-opts {}  ;; spliced into put-item requests
                                 :delete-opts {} ;; spliced into delete-item requests
                                 :creds {}       ;; use if you want to override your access credentials
                                 :cas-back-off-fn (fn [uri n])
                                 ;; a callback function called on conditional put failure.
                                 ;; Receives the uri and current number of CAS iterations performed.
                                 ;; Use to implement things like back-off.
                                 }}}}
Scheme: redis
URI convention: redis:tcp://{host}:{port}/{database-number}/{id}.{extension}
e.g. redis:tcp://localhost:6379/0/fred.edn
Supported refs: value, volatile, atomic
To be able to use Clojure's reference type interfaces you can add the Redis credentials using riverford.durable-ref.scheme.redis.carmine/add-credentials!, and remove them again when no longer needed with riverford.durable-ref.scheme.redis.carmine/remove-credentials!.
This will allow you to do things like this:
(require '[riverford.durable-ref.core :as dref])
(require '[riverford.durable-ref.scheme.redis.carmine :refer [add-credentials!]])
(add-credentials! "localhost" 6379 "foobar")
(def ref-1 (dref/reference "atomic:redis:tcp://localhost:6379/0/ref-1.edn"))
@ref-1 ;=> nil
(reset! ref-1 {:foo #{:a :b}})
@ref-1 ;=> {:foo #{:a :b}}
(swap! ref-1 assoc :bar 42)
@ref-1 ;=> {:foo #{:a :b} :bar 42}
By default the CAS implementation spins forever with no backoff. If you are in a scenario where you think a maximum number of retries is useful, or want to provide a back-off scheme, you can use the dref API:
(require '[riverford.durable-ref.core :as dref])
(require '[riverford.durable-ref.scheme.redis.carmine])
(def ref-1 (dref/reference "atomic:redis:tcp://localhost:6379/0/ref-1.edn"))
(dref/value ref-1 {:scheme {:redis {:carmine {:creds {:password "foobar"}}}}})
(dref/atomic-swap! ref-1 inc
  {:scheme {:redis {:carmine {:creds {:password "foobar"}
                              ;; fully qualified hint, so no separate import of URI is needed
                              :cas-back-off-fn (fn [^java.net.URI uri idx]
                                                 (Thread/sleep (* 10 idx)))}}}})
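If you also want a maximum number of retries, one option is a callback that throws once a limit is passed and otherwise sleeps with a capped exponential delay. This is a sketch only: bounded-back-off is a hypothetical helper, and it assumes that an exception thrown from :cas-back-off-fn propagates out of the swap and aborts it, which you should verify against the implementation you are using.

```clojure
(defn bounded-back-off
  "Returns a (fn [uri n]) callback that throws once n exceeds max-retries,
  otherwise sleeps for base-ms * 2^n milliseconds, capped at cap-ms."
  [max-retries base-ms cap-ms]
  (fn [uri n]
    (when (> n max-retries)
      (throw (ex-info "CAS retry limit exceeded"
                      {:uri (str uri) :attempts n})))
    ;; clamp the exponent so the shift cannot overflow
    (Thread/sleep (long (min cap-ms (* base-ms (bit-shift-left 1 (min n 20))))))))

;; Usage (option path as documented above):
;; {:scheme {:redis {:carmine {:cas-back-off-fn (bounded-back-off 10 5 1000)}}}}
```

The same callback shape is documented for the DynamoDB scheme, so the helper could be passed there as well.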
using carmine
:dependencies [com.taoensso/carmine "2.16.0"]
(require '[riverford.durable-ref.scheme.redis.carmine])
;; Storage options (optionally provide in an options map to persist, value, overwrite!, delete!, atomic-swap!)
{:scheme {:redis {:carmine {:creds {}
                            ;; use if you want to override your access credentials
                            ;; key is `:password`
                            :cas-back-off-fn (fn [uri n])
                            ;; a callback function called on conditional put failure.
                            ;; Receives the uri and current number of CAS iterations performed.
                            ;; Use to implement things like back-off.
                            }}}}
Extensions (edn, edn.zip)
Serialization using clojure.edn and pr
Extensions (fressian, fressian.zip)
Serialization via data.fressian
:dependencies [org.clojure/data.fressian "0.2.1"]
(require '[riverford.durable-ref.format.fressian])
;; Format options (optionally provide in an options map to persist, value, overwrite!, delete!)
;; see fressian docs for more details
{:format {:fressian {:read-opts {}  ;; spliced into create-reader calls
                     :write-opts {} ;; spliced into create-writer calls
                     }}}
Extensions (json, json.zip)
Using cheshire
:dependencies [cheshire "5.6.3"]
(require '[riverford.durable-ref.format.json.cheshire])
;; Format options (optionally provide in an options map to persist, value, overwrite!, delete!)
;; see cheshire docs for more details
{:format {:json {:cheshire {:write-opts {} ;; passed as options to generate-stream calls
                            :read-opts {:key-fn f1          ;; passed as the `key-fn` arg to parse-stream calls
                                        :array-coerce-fn f2 ;; passed as the `array-coerce-fn` to parse-stream calls
                                        }}}}}
Extensions (nippy)
Using nippy
:dependencies [com.taoensso/nippy "2.12.2"]
(require '[riverford.durable-ref.format.nippy])
;; Format options (optionally provide in an options map to persist, value, overwrite!, delete!)
;; see nippy docs for more details
{:format {:nippy {:write-opts {} ;; passed as options to freeze calls
                  :read-opts {}  ;; passed as options to thaw calls
                  }}}
These are the storage multimethods you can implement currently (dispatching on the scheme):
- read-bytes, receives the uri and options passed to value. Returns a byte array or nil.
- write-bytes!, receives the uri, the serialized byte array and options passed to persist, overwrite!.
- (optional) delete-bytes!, receives the uri and options passed to delete!.
- (optional) do-atomic-swap!, receives the uri, function and options passed to atomic-swap!.
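To show the dispatch-on-scheme pattern without guessing at the library's exact signatures, here is a self-contained sketch. The starred multimethods (read-bytes*, write-bytes!*) and the "demo" scheme are local stand-ins defined only for this example; they are NOT the library's own multimethods, so check the source for the real namespaces and arities before extending it.

```clojure
(import 'java.net.URI)

;; Local stand-ins that dispatch on the URI scheme, as described above.
(defmulti read-bytes*   (fn [^URI uri opts] (.getScheme uri)))
(defmulti write-bytes!* (fn [^URI uri bs opts] (.getScheme uri)))

;; A toy backing store for the made-up "demo" scheme.
(def store (atom {}))

(defmethod write-bytes!* "demo" [^URI uri bs opts]
  (swap! store assoc (str uri) bs)
  nil)

(defmethod read-bytes* "demo" [^URI uri opts]
  ;; returns nil when nothing has been written yet
  (get @store (str uri)))

(let [uri (URI. "demo://tmp/fred.edn")]
  (write-bytes!* uri (.getBytes "{:name \"fred\"}" "UTF-8") {})
  (String. ^bytes (read-bytes* uri {}) "UTF-8"))
;; => "{:name \"fred\"}"
```

Storage only ever sees bytes; turning them into values is left to the format multimethods.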
These are the format multimethods you can implement currently (dispatching on the extension):
- serialize, receives the object, the format string, and options passed to persist, overwrite!.
- deserialize, receives the serialized byte array, the format string and options passed to value.
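The format side can be sketched the same way. Again, serialize* and deserialize* are local stand-ins written for this example rather than the library's multimethods; the argument order simply follows the description above, and the body uses pr-str and clojure.edn as the edn format is described as doing.

```clojure
(require '[clojure.edn :as edn])

;; Local stand-ins that dispatch on the format string (the extension).
(defmulti serialize*   (fn [obj format opts] format))
(defmulti deserialize* (fn [bs format opts] format))

(defmethod serialize* "edn" [obj _ _]
  (.getBytes (pr-str obj) "UTF-8"))

(defmethod deserialize* "edn" [bs _ _]
  (edn/read-string (String. ^bytes bs "UTF-8")))

;; Round trip: value -> bytes -> value
(deserialize* (serialize* {:name "fred" :age 42} "edn" {}) "edn" {})
;; => {:name "fred", :age 42}
```

A new format would be one more pair of defmethods keyed on its extension.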
- more formats and storages
Pull requests welcome!
Copyright © 2016 Riverford Organic Farmers Ltd
Distributed under the 3-clause license ("New BSD License" or "Modified BSD License").