getting started

Daniel Higginbotham edited this page Nov 2, 2022 · 7 revisions

This guide will show you how to use datapotato to generate and insert records that have simple relationships. We'll be working with a "database" for a dream journal application, because dreams are neat.

Create a schema to generate data

To get started, we'll need a way to generate example records. Libraries like clojure.spec and malli allow you to create schemas that specify the shape of your data, which you can then use to generate examples. We're going to use malli. Let's create a schema for user records:

(ns dream-journal
  (:require
   [donut.datapotato.atom :as da]
   [donut.datapotato.core :as dc]
   [malli.generator :as mg]))

(def User
  [:map
   [:id pos-int?]
   [:username string?]])
   
;; use mg/generate to generate examples
(mg/generate User)

;; =>
{:id 37550, :username "EiaB5V3xqYDa11x7rZ"}
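
Generated values are random, so every call returns something different. When you want a reproducible example while experimenting, malli's generator accepts an options map with a :seed. The following is a minimal sketch, assuming malli is on the classpath; the seed value 42 is arbitrary:

```clojure
(require '[malli.generator :as mg])

(def User
  [:map
   [:id pos-int?]
   [:username string?]])

;; The same :seed yields the same example on every call.
(= (mg/generate User {:seed 42})
   (mg/generate User {:seed 42}))
;; => true
```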

Inserting fixtures

Now we need to tell datapotato how to generate these records and insert them into a database. In lieu of an actual database, we're going to use an atom as the data store so that you can inspect it and so that you don't have to set up a real db.

NOTE: See Database Integration for instructions on how to get running with an actual database.

Here's the code:

(def fixture-atom (atom []))

(def potato-schema
  {:user {:prefix   :u
          :generate {:schema User}}})

(def potato-db
  {:schema   potato-schema
   :generate {:generator mg/generate}
   :fixtures {:insert da/insert
              :setup  (fn [_] (reset! fixture-atom []))
              :atom   fixture-atom}})

;; populate fixture-atom
(dc/with-fixtures potato-db
  (dc/insert-fixtures {:user [{:count 3}]}))

Starting at the bottom, the last two lines populate the "database" with user records. Afterwards, fixture-atom contains something like:

[[:user {:id 82, :username "14u0S8q1l6"}]
 [:user {:id 19591, :username "8Bs2709B0Xqw0qa7oT4VL2u751"}]
 [:user {:id 269, :username "VKhJ44eSI325np"}]]

dc/with-fixtures is a macro that sets some dynamic bindings and does some useful bookkeeping, including calling the :setup function before evaluating the rest of its body. In this case, the :setup function calls reset! on the fixture-atom, removing previously-inserted data.

dc/insert-fixtures generates examples and inserts them. Here it takes a single argument, {:user [{:count 3}]}, which is a query that describes the type of records to insert and how many of them. We'll look at the query syntax in a bit more detail later in this guide.

Next we have potato-db, a map that includes configuration necessary to generate data and insert it.

Its :schema key references potato-schema, a map that describes the ent types in your system (in this case, :user). You can think of ent types as corresponding to database tables. :generate {:schema User} shows how to specify what schema to use to generate examples.

Ent types need a :prefix, specified above with :prefix :u. This is used by datapotato to assign a name to every entity it generates. For example, if you generate three entities, they'll be named :u0, :u1, and :u2. These names are used internally, but you'll also soon see how you can use them to retrieve the values datapotato generates.
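
The naming convention can be sketched in plain Clojure. This only illustrates how a prefix and an index combine into a name; ent-names is a hypothetical helper and not part of datapotato's API:

```clojure
;; Hypothetical helper: illustrates the naming convention only,
;; not how datapotato implements it.
(defn ent-names
  "Returns the first n ent names for a prefix, e.g. :u0, :u1, ..."
  [prefix n]
  (mapv #(keyword (str (name prefix) %)) (range n)))

(ent-names :u 3)
;; => [:u0 :u1 :u2]
```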

potato-db's :generate key configures the generator function to use with {:generator mg/generate}. For clojure.spec, this would probably be {:generator (comp clojure.spec.gen.alpha/generate clojure.spec.alpha/gen)}.

The :fixtures key configures insertion behavior. The :insert function is used to insert each example record individually. This function is specific to your datastore; in this case, we're using a function that datapotato ships with, da/insert. This function expects your atom to be specified under the :atom key.
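
To make the idea of an insert function concrete, here is a rough sketch of an atom-backed insert. The function name and its argument list are assumptions made for illustration; da/insert's real signature may differ. The stored shape matches the [ent-type record] pairs shown earlier:

```clojure
;; Stand-in data store with the same shape as fixture-atom.
(def example-store (atom []))

;; Hypothetical insert function; da/insert's real signature may differ.
(defn insert-into-atom!
  "Appends [ent-type record] to the store and returns the record."
  [store ent-type record]
  (swap! store conj [ent-type record])
  record)

(insert-into-atom! example-store :user {:id 1 :username "val"})
@example-store
;; => [[:user {:id 1, :username "val"}]]
```

An insert function for a real database would do the same job with a different side effect, such as issuing an INSERT statement.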

insert-fixtures return value

Because malli and clojure.spec generate random data, it can be unclear how to use this data in tests. For example, say you're testing an API endpoint for /users/{:user-id}. You want to first insert a user record, and then you need the user's ID for the input. How do you get it?

One way is to rely on the return value from insert-fixtures:

(dc/with-fixtures potato-db
  (dc/insert-fixtures {:user [{:count 1}]}))
  
;; =>
{:u0 {:id 2, :username "72KNO1fX0fi79wk1XqPC"}}

insert-fixtures returns a map where the keys are ent-ids and the values are the inserted records. ent-ids are automatically generated using the pattern :{prefix}{int}. The :prefix for :user is :u, so when datapotato generates users it names them :u0, :u1, :u2, etc. So, while records are generated randomly, the identifiers for those records are generated deterministically. Therefore, you can write code like this:

(dc/with-fixtures potato-db
  (let [{:keys [u0]} (dc/insert-fixtures {:user [{:count 1}]})]
    (test-api-call {:method :get
                    :uri    (str "/users/" (:id u0))})))
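
The same lookup works when several records are inserted. The sketch below uses a literal map as a stand-in for the return value of dc/insert-fixtures (the values are made up), and user-uri is a hypothetical helper:

```clojure
;; Stand-in for the map returned by dc/insert-fixtures; the values are made up.
(def inserted
  {:u0 {:id 2  :username "72KNO1fX0fi79wk1XqPC"}
   :u1 {:id 15 :username "a4SjZtyJk58LJ0g8W"}})

;; Hypothetical helper for building request paths in tests.
(defn user-uri
  "Builds the request path for the user stored under ent-name."
  [inserted ent-name]
  (str "/users/" (get-in inserted [ent-name :id])))

(user-uri inserted :u1)
;; => "/users/15"
```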

Hierarchical Records

Let's expand this to insert some dream journal entries. We'll need to add DreamJournal and Entry malli schemas:

(def DreamJournal
  [:map
   [:id pos-int?]
   [:owner-id pos-int?]
   [:dream-journal-name string?]])

(def Entry
  [:map
   [:id pos-int?]
   [:dream-journal-id pos-int?]
   [:content string?]])

We also need to update our potato-schema to include these new ent types:

(def potato-schema
  {:user          {:prefix   :u
                   :generate {:schema User}}
   :dream-journal {:prefix    :dj
                   :generate  {:schema DreamJournal}
                   :relations {:owner-id [:user :id]}}
   :entry         {:prefix    :e
                   :generate  {:schema Entry}
                   :relations {:dream-journal-id [:dream-journal :id]}}})

New here is the :relations key. This is used to set the correct values for generated examples, and to ensure that records are inserted in the correct order.

For example, when a :dream-journal record gets generated, its :owner-id is initially a random integer. However, you need its value to be the :id of the :user that it belongs to. The config :relations {:owner-id [:user :id]} is how you tell datapotato about this relationship. It says, "When you generate a :dream-journal, make sure you first generate a :user, and that you set the :dream-journal's :owner-id to the :user's :id".
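
The effect of a relation on a generated record can be sketched in plain Clojure. apply-relations is a hypothetical helper written for this illustration; it is not datapotato's implementation, and the parents map is an assumed shape holding one already-generated record per ent type:

```clojure
;; Hypothetical helper, not datapotato's implementation.
(defn apply-relations
  "Overwrites each relation attribute in record with the referenced
  attribute of the already-generated parent record of that ent type."
  [record relations parents]
  (reduce-kv (fn [rec attr [parent-type parent-attr]]
               (assoc rec attr (get-in parents [parent-type parent-attr])))
             record
             relations))

(apply-relations
 {:id 10 :owner-id 98765 :dream-journal-name "dreams"} ; :owner-id starts out random
 {:owner-id [:user :id]}
 {:user {:id 2 :username "xl2gQGY2lW"}})
;; => {:id 10, :owner-id 2, :dream-journal-name "dreams"}
```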

The same logic applies for :entry. You can see this when inserting entries:

(dc/with-fixtures potato-db
  (dc/insert-fixtures {:entry [{:count 2}]}))

@fixture-atom
;; =>
[[:user {:id 2, :username "xl2gQGY2lW"}]
 [:dream-journal {:id 10, :owner-id 2, :dream-journal-name "80i1bP5a203qBjd0ODlaIzKZ5U"}]
 [:entry {:id 43075646, :dream-journal-id 10, :content "1bSkEEu1s2An2"}]
 [:entry {:id 10710, :dream-journal-id 10, :content "PGqa4C10"}]]

A :user is inserted first, then a :dream-journal, then two :entry records. You only specified that you wanted two :entry records, but because you specified :relations, datapotato knew that it also had to create a :dream-journal and a :user. Note that only one :dream-journal was created; datapotato will only generate and insert the minimum records needed to satisfy your request.
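
The ordering can be pictured as a walk over the :relations config, visiting parents before children. The sketch below is an illustration under simplifying assumptions (no cycles, and a hypothetical insertion-order helper), not a description of datapotato's internals:

```clojure
;; The :relations entries from potato-schema, keyed by ent type.
(def relations-by-type
  {:user          {}
   :dream-journal {:owner-id [:user :id]}
   :entry         {:dream-journal-id [:dream-journal :id]}})

;; Hypothetical helper; assumes the relations contain no cycles.
(defn insertion-order
  "Returns the ent types needed to insert ent-type, dependencies first."
  [relations ent-type]
  (let [parents (map first (vals (get relations ent-type)))]
    (vec (distinct (concat (mapcat #(insertion-order relations %) parents)
                           [ent-type])))))

(insertion-order relations-by-type :entry)
;; => [:user :dream-journal :entry]
```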

Queries and overwriting generated data

Let's take a closer look at how we specified what to generate and insert:

(dc/with-fixtures potato-db
  (dc/insert-fixtures {:entry [{:count 2}]}))

The map {:entry [{:count 2}]} is a query. The basic structure of a query is

{ent-type [query-term-1 query-term-2]}

ent-type is a keyword like :user or :dream-journal that you've included in your potato schema. Query terms are maps that let you configure the behavior of data generation and insertion:

  • The :count key specifies how many records to generate and insert
  • The :set key lets you specify constant values to use when generating and inserting records, overriding the generated values

For example:

(dc/with-fixtures potato-db
  (dc/insert-fixtures {:entry [{:set {:content "hotdogs again."}}]}))

This creates a single entry, with "hotdogs again." as the value of :content instead of an auto-generated value. Note that the :count key is missing. Its default value is 1.
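
Conceptually, :set behaves like merging your constants over the generated record, so your values win. A minimal sketch, with apply-set as a hypothetical helper and made-up generated values:

```clojure
;; Hypothetical helper: values from :set take precedence over generated ones.
(defn apply-set
  [generated set-values]
  (merge generated set-values))

(apply-set {:id 43075646 :dream-journal-id 10 :content "1bSkEEu1s2An2"}
           {:content "hotdogs again."})
;; => {:id 43075646, :dream-journal-id 10, :content "hotdogs again."}
```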

You can include more than one entity type in a query:

(dc/with-fixtures potato-db
  (dc/insert-fixtures {:user  [{:count 3}]
                       :entry [{:count 1}]}))

You can also include more than one query term for an entity type:

{:user [{:set {:username "val"}} {:set {:username "kilmer"}}]}

This will create two users, one with the username "val" and the other with the username "kilmer".
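
Since each query term defaults to a :count of 1, the number of records a query asks for directly is the sum of the counts of its terms. The sketch below computes that per ent type; requested-counts is a hypothetical helper and deliberately ignores the extra records that :relations may cause to be created:

```clojure
;; Hypothetical helper; counts only what the query asks for directly.
(defn requested-counts
  "Maps each ent type in query to the number of records its terms ask for.
  A term without :count contributes 1."
  [query]
  (into {}
        (map (fn [[ent-type terms]]
               [ent-type (reduce + (map #(get % :count 1) terms))]))
        query))

(requested-counts {:user  [{:set {:username "val"}} {:set {:username "kilmer"}}]
                   :entry [{:count 2}]})
;; => {:user 2, :entry 2}
```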

Predictable generated primary keys

When you're generating records to be inserted in a database, you want to avoid generating two records with the same primary key. You can accomplish this by creating a custom generator and using that in your schemas:

(require '[clojure.test.check.generators :as gen :include-macros true])

(def id-atom (atom 0))
(def monotonic-id-gen
  (gen/fmap (fn [_] (swap! id-atom inc)) (gen/return nil)))
  
(def ID
  [:and {:gen/gen monotonic-id-gen} pos-int?])

(def User
  [:map
   [:id ID]
   [:username string?]])


(def potato-schema
  {:user          {:prefix   :u
                   :generate {:schema User}}})

(def potato-db
  {:schema   potato-schema
   :generate {:generator mg/generate}
   :fixtures {:insert da/insert
              :setup  (fn [_]
                        (reset! fixture-atom [])
                        (reset! id-atom 0))
              :atom   fixture-atom}})

We use the generator monotonic-id-gen to increment an integer stored in id-atom and use the return value. In potato-db, we've updated :setup to reset id-atom to 0. Now when you insert data, the ids are guaranteed to be unique:

(dc/with-fixtures potato-db
  (dc/insert-fixtures {:user [{:count 3}]}))
 
;; =>
{:u0 {:id 2, :username "RS6IQtPY1FNM247"},
 :u1 {:id 1, :username "A47yjO"},
 :u2 {:id 3, :username "a4SjZtyJk58LJ0g8W"}}
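
The uniqueness comes entirely from the atom-backed counter, which you can see without any generator machinery. A small sketch in plain Clojure; example-id-atom and next-id! are names made up for this illustration:

```clojure
(def example-id-atom (atom 0))

(defn next-id!
  "Returns the next id; swap! makes the increment atomic."
  []
  (swap! example-id-atom inc))

(def first-run (vec (repeatedly 3 next-id!)))
first-run
;; => [1 2 3]

;; Resetting the counter, as :setup does, restarts the sequence.
(reset! example-id-atom 0)
(next-id!)
;; => 1
```

Resetting the counter in :setup keeps ids small and repeatable from one with-fixtures call to the next, which only makes sense because the data store is cleared at the same time.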

Testing Random Data

Nubank's matcher combinators library is useful for testing random data where you care more about the properties of the data than the exact values.
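
As a rough sketch of what that looks like, the following checks the shape of a record instead of its exact values. It assumes the matcher-combinators dependency is on the classpath and uses its standalone namespace; the record is a made-up example of generated data:

```clojure
(require '[matcher-combinators.standalone :as mc])

;; A map matcher checks only the listed keys; functions act as predicates.
(mc/match? {:id pos-int? :username string?}
           {:id 82 :username "14u0S8q1l6"})
```

If the record had a non-string :username or a missing :id, the same call would be expected to report a mismatch.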

Using dbxray to generate schemas

dbxray is a library that can generate malli schemas, clojure.spec specs, and potato schemas by inspecting a database if you can connect to it with next-jdbc. It could save you some time getting started!

Next steps

  • See database integration for instructions on working with datomic, next-jdbc, xtdb, or the database of your choice
  • Intro is the entrypoint for a tutorial that explains datapotato from the ground up and covers less-frequent use cases
  • Visiting Functions explains how datapotato at its core is actually a tool for generating and traversing graphs, and how the tools for working with test fixtures are built on top of that