
Improve keyset docs
ddnexus committed Dec 15, 2024
1 parent bcc41c4 commit 7d44f8d
Showing 3 changed files with 62 additions and 39 deletions.
85 changes: 54 additions & 31 deletions docs/api/keyset.md
If you want the best of the two worlds, check out the [keyset_for_ui extra](/doc
| `set` | The `uniquely ordered` `ActiveRecord::Relation` or `Sequel::Dataset` collection to paginate. |
| `keyset` | The hash of column/direction pairs. Pagy extracts it from the order of the `set`. |
| `keyset attributes` | The hash of keyset-column/record-value pairs of a record. |
| `keyset values` | The array of the `values` of the `keyset attributes`. |
| `cutoff` | The value that identifies where the `page` ends, and the `next` one begins. It is encoded as a `Base64` URL-safe string. |
| `page` | The current `page`, i.e. the page of records beginning after the `cutoff` of the previous page. Also the `:page` variable, which is set to the `cutoff` of the previous page. |
| `next` | The next `page`, i.e. the page of records beginning after the `cutoff`. Also the `cutoff` value returned by the `next` method. |

### Choose the right pagination type

+++ Keyset

Only useful when you don't need any frontend (e.g. infinite pagination)
* The same performance as Keyset
* Most of the Frontend features

!!!warning Advanced usage; no use in APIs

- It requires more effort and resources to set up
- It does not make sense in APIs that don't need a UI
!!!

+++ Offset
!!!success Use Offset pagination with UIs and small DBs

* You will get all the frontend features
* You can reduce the slowness by limiting the number of pages with `:max_pages`

!!!warning Limited use for APIs; not guaranteed to be accurate in UI

* Your server will suffer on big data and your API will be slower for no good reason
* Not accurate: it can skip or double-show records after insertions and deletions.
If you need a specific order:

#### Understanding the Cutoffs

The `cutoff` of a `page` is the **value** that identifies where the `page` _has ended_, and the `next` one begins.

Let's consider an example of a simple `set` of 29 records, with an `id` column populated by unique character keys, ordered by
`order(:id)`.
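
In ActiveRecord terms, such a `set` could look like the following sketch (`Record` is a hypothetical model, used here only for
illustration):

```ruby
# A uniquely ordered collection, as keyset pagination requires
set = Record.order(:id) # 29 records, ids populated by unique character keys
```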

Assuming a LIMIT of 10, the _"first page"_ will just include the first 10 records in the `set`: no `cutoff` value is known so
far...

```
                  │ first page (10)  >│             rest (19)              >│
beginning of set >[· · · · · · · · · ·]· · · · · · · · · · · · · · · · · · ·]< end of set
```

At this point it's exactly the same first page that OFFSET pagination would pull. However, we don't want to use OFFSET to get the
records after the first 10: that would be slow in a big table, so we need a way to identify the beginning of the next page without
counting the records.

Notice that this is not like saying _"up to the record `F`"_. It's important to understand that a `cutoff` refers just to a value
At this point we need to get the next 10 records... WITHOUT using OFFSET (i.e. the very thing we want to avoid for performance),
so we read the `id` of the last one, which is `X`... and that is the `cutoff` value of the first page. It can be described like:
_"the point up to the value `X` in the `id` column"_.

Notice that this is not like saying _"up to the record `X`"_. It's important to understand that a `cutoff` refers just to a value
in a column (or a combination of multiple columns, in the case of multi-column keysets).

Indeed, that very record could be deleted right after we read it, and our `cutoff-X` will still hold the truth that we
paginated the `set` up to the `X` value, cutting any further records off the `page`...

```
                  │ first page (10)  >│ second page (10) >│    rest (9)    >│
beginning of set >[· · · · · · · · · X]· · · · · · · · · ·]· · · · · · · · ·]< end of set
                                     ▲
                                 cutoff-X
```

To get the `next` page of records (i.e. the _"second page"_), we pull the `next` 10 records AFTER the `cutoff-X`. Again, we
read the `id` of the last one, which is `Y`: so we have our new `cutoff-Y`, which is the end of the current `page`, and the `next`
will go AFTER it...

```
                  │ first page (10)  >│ second page (10) >│  last page (9) >│
beginning of set >[· · · · · · · · · X]· · · · · · · · · Y]· · · · · · · · ·]< end of set
                                     ▲                   ▲
                                 cutoff-X            cutoff-Y
```

Pagy encodes the values of the `cutoffs` in a `Base64` URL-safe string that is sent as a param in the `request`.
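
For illustration only, the encoding could be sketched like this (assuming a single-column keyset and `X` as the last `id` of the
page; the actual internals may differ):

```ruby
require 'base64'
require 'json'

# Encode the keyset values of the last record of the page as a URL-safe cutoff...
cutoff = Base64.urlsafe_encode64(['X'].to_json)  #=> "WyJYIl0="

# ...and decode it back when the next page is requested
JSON.parse(Base64.urlsafe_decode64(cutoff))      #=> ["X"]
```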

When we pull the `next` page from the `cutoff-Y`, we find only the remaining 9 records, which means that it's the _"last page"_:
it naturally ends with the end of the `set`, so it doesn't need any `cutoff` value to separate it from further records.

#### Keynotes

- A `cutoff` identifies a "cutoff value" for a `page` in the `set`. It is not a record, nor a reference to one.
- Its value is derived from the `keyset attributes values` array of the last record of the `page`, converted to JSON, and encoded
  as a Base64 URL-safe string, for easy use in URLs.
- `Pagy::Keyset` embeds it in the request URL; `Pagy::KeysetForUI` caches it on the server.
- All the `page`s but the last end with the `cutoff`.
- All the `page`s but the first begin AFTER the `cutoff` of the previous `page`.

## ORMs

Pagy::Keyset.new(sequel_set)
==- `Pagy::Keyset.new(set, **vars)`

The constructor takes the `set`, and an optional hash of [variables](#variables). It returns a `Pagy::Keyset::ActiveRecord` or
`Pagy::Keyset::Sequel` object (depending on the `set` class).
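
A minimal usage sketch (assuming a hypothetical `Product` model with a unique `:id` order; `:page` and `:limit` are the
documented variables):

```ruby
pagy    = Pagy::Keyset.new(Product.order(:id), limit: 10, page: params[:page])
records = pagy.records # the records of the requested page
pagy.next              # the cutoff to use as the :page of the next request
```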

==- `next`

Default `nil`.

==- `:jsonify_keyset_attributes`

A lambda to override the generic JSON encoding of the `keyset attributes`. It receives the `keyset attributes` as an argument, and
it should return a JSON string of the `attributes.values` array. Use it when the generic `to_json` method would lose some
information when decoded.

For example: `Time` objects may lose or round the fractional seconds through the encoding/decoding cycle, causing the ordering to
fail and thus creating all sorts of unexpected behaviors (e.g. skipping or repeating the same page, missing or duplicated records,
etc.). Here is what you can do:
jsonify_keyset_attributes = lambda do |attributes|
  # Convert it to a string matching the stored value/format in SQLite DB
  attributes[:created_at] = attributes[:created_at].strftime('%F %T.%6N')
  attributes.values.to_json # remember to return the array of values, not the attribute hash
end

Pagy::Keyset.new(set, jsonify_keyset_attributes:)
12 changes: 6 additions & 6 deletions docs/api/keyset_for_ui.md
Querying with the LIMIT again might cause records to get skipped or to appear t

While the accuracy is guaranteed, in the case of insertions or deletions of records falling within the range of the visited page, the page will obviously have a different number of records than expected.

That is neither a logical nor a common problem; however, in extreme cases a page of records might change its size so noticeably and unexpectedly that it may look somehow "broken" to the users.

!!!success We plan to implement page-rebalancing:

- Automatic compacting of empty (or almost empty) visited pages.
- Automatic splitting of excessively grown visited pages.
!!!

## Setup
internally:
Pagy::KeysetForUI.new(active_record_set)
#=> #<Pagy::KeysetForUI::ActiveRecord:0x00000001066215e0>

Pagy::KeysetForUI.new(sequel_set)
#=> #<Pagy::KeysetForUI::Sequel:0x00000001066545e0>
```

Paginate only `:max_pages`, ignoring the rest.

==- `:reset_overflow`

Resets the pagination in case of overflow, instead of raising a `Pagy::OverflowError`. Use it when you don't need to `rescue` and handle the event in any particular way. Notice: it reuses the current `cache_key`.
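
For example (a sketch, assuming the variable is passed at construction time like the others):

```ruby
pagy = Pagy::KeysetForUI.new(set, reset_overflow: true) # restarts the pagination instead of raising
```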

===

4 changes: 2 additions & 2 deletions docs/extras/keyset_for_ui.md
def pagy_cache_new_key = my_custom_cache.generate_key
## Understanding the cache

This extra uses the `session` object as the cache for the `cutoffs` (not for the records!) by default, because it's simple and
works out of the box in any app, at least for prototyping.

Notice that the `cutoffs` array can potentially grow big if you don't use `:max_pages`, especially if your `keyset` contains
multiple ordered columns, and even more so if their values are big. You must be aware of it.
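
For example, capping the pagination also caps the cached `cutoffs` (a sketch using the documented `:max_pages` variable, shown
here with the bare constructor for brevity):

```ruby
pagy = Pagy::KeysetForUI.new(set, max_pages: 50) # no more than 50 cutoffs should ever be cached
```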
The key used to locate the `cutoffs` in the cache storage.

==- `:cache_key_param`

The name of the cache key param. It is `:cache_key` by default. Pass a different symbol to change/shorten it.
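
For example (a sketch, assuming the variable is passed like any other):

```ruby
pagy = Pagy::KeysetForUI.new(set, cache_key_param: :ck) # requests carry ?ck=... instead of ?cache_key=...
```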

===

