Commit

docs: edits to new tablespaces and samples content (cloudnative-pg#3500)
Signed-off-by: Betsy Gitelman <[email protected]>
Signed-off-by: Gabriele Bartolini <[email protected]>
Co-authored-by: Gabriele Bartolini <[email protected]>
Co-authored-by: Jaime Silvela <[email protected]>
3 people authored Dec 10, 2023
1 parent 7287d9e commit 485f65c
Showing 2 changed files with 75 additions and 75 deletions.
2 changes: 1 addition & 1 deletion docs/src/samples.md
@@ -119,4 +119,4 @@ your PostgreSQL Cluster.
Remember to update `bootstrap.recovery.backup.name` with the backup name.
: [`cluster-restore-with-tablespaces.yaml`](samples/cluster-restore-with-tablespaces.yaml)

For a list of available options, please refer to the ["API Reference" page](cloudnative-pg.v1.md).
For a list of available options, see the ["API Reference" page](cloudnative-pg.v1.md).
148 changes: 74 additions & 74 deletions docs/src/tablespaces.md
@@ -1,24 +1,24 @@
# Tablespaces

A tablespace stands as a robust and widely embraced feature within database
management systems, offering a powerful means to enhance the vertical
A tablespace is a robust and widely embraced feature in database
management systems. It offers a powerful means to enhance the vertical
scalability of a database by decoupling the physical and logical modeling of
data. Essentially, it serves as a technique for physical database modeling,
enabling the efficient distribution of I/O operations across multiple volumes
on distinct storage, thereby optimizing performance through parallel on-disk
on distinct storage. It thereby optimizes performance through parallel on-disk
read/write operations.

In the context of the database industry, tablespaces play a strategic role,
particularly when paired with table partitioning - a logical database modeling
particularly when paired with table partitioning, a logical database modeling
technique. They prove instrumental in managing large-scale databases and are
also employed for tasks such as separating tables from indexes or executing
also used for tasks such as separating tables from indexes or executing
temporary operations.

Tablespaces in PostgreSQL have been playing a pivotal role since 2005 (version
8.0), while declarative partitioning was introduced in 2017 (version 10).
Consequently, tablespaces are seamlessly integrated into all supported releases
of PostgreSQL. Quoting from the
[PostgreSQL documentation on Tablespaces](https://www.postgresql.org/docs/current/manage-ag-tablespaces.html):
[PostgreSQL documentation on tablespaces](https://www.postgresql.org/docs/current/manage-ag-tablespaces.html):

> By using tablespaces, an administrator can control the disk layout of a
> PostgreSQL installation. This is useful in at least two ways.
@@ -31,33 +31,33 @@ of PostgreSQL. Quoting from the
## Declarative tablespaces

CloudNativePG provides support for PostgreSQL tablespaces through **declarative
tablespaces**, operating at two distinct levels:
CloudNativePG provides support for PostgreSQL tablespaces through *declarative
tablespaces*, operating at two distinct levels:

- Kubernetes: managing persistent volume claims, identically to how PGDATA and
- Kubernetes, managing persistent volume claims, identically to how PGDATA and
WAL volumes are handled
- PostgreSQL: managing the `TABLESPACE` global objects within the PostgreSQL
- PostgreSQL, managing the `TABLESPACE` global objects in the PostgreSQL
instance

Being a part of the Kubernetes ecosystem, CloudNativePG's declarative
tablespaces are implemented leveraging Persistent Volume Claims (and Persistent
Volumes). Each tablespace defined in the cluster is housed in its own
persistent volume. CloudNativePG takes care of generating the PVCs, mounting
the required volumes in the instance Pods in normalized locations, and ensuring
tablespaces are implemented by leveraging persistent volume claims (and persistent
volumes). Each tablespace defined in the cluster is housed in its own
persistent volume. CloudNativePG takes care of generating the PVCs. It mounts
the required volumes in the instance pods in normalized locations and ensures
replicas are ready to support tablespaces before activating them in the
primary.

Tablespaces can be setup when the cluster is created, or added at a later time
provided the storage is available when requested. Currently, they cannot be
removed, but this limitation will be addressed in a future minor/patch version
You can set up tablespaces when creating the cluster or add them later,
provided the storage is available when requested. Currently, you can't
remove them. However, this limitation will be addressed in a future minor or patch version
of CloudNativePG.

## Using declarative tablespaces

Using declarative tablespaces is easy. You can find a full example in
Using declarative tablespaces is straightforward. You can find a full example in
[`cluster-example-with-tablespaces.yaml`](samples/cluster-example-with-tablespaces.yaml).

Simply use the new `tablespaces` stanza on a new or existing `Cluster` resource:
To use them, use the new `tablespaces` stanza on a new or existing `Cluster` resource:

``` yaml
spec:
@@ -77,16 +77,16 @@ spec:
size: 2Gi
```
Note that each tablespace has its own storage section where the size and the
storage class of the generated PVC can be configured. The administrator can thus
Each tablespace has its own storage section where you can configure the size and the
storage class of the generated PVC. The administrator can thus
plan to use different storage classes for different kinds of workloads, as
explained in the next section.
explained in [Storage classes and tablespaces](#storage-classes-and-tablespaces).
CloudNativePG will create the above persistent volume claims for each instance
in the high availability Postgres cluster, and mount them in each pod when they
have been provisioned. Then, it will ensure that the `tbs1`, `tbs2`, and `tbs3`
CloudNativePG creates the persistent volume claims for each instance
in the high-availability Postgres cluster. It mounts them in each pod when they
have been provisioned. Then, it ensures that the `tbs1`, `tbs2`, and `tbs3`
tablespaces are created on the primary PostgreSQL instance using the `CREATE
TABLESPACE` command. This process is quick, and you will see this reflected in
TABLESPACE` command. This process is quick, and you see this reflected in
Postgres:

``` txt
Expand Down Expand Up @@ -125,12 +125,12 @@ status:

## Storage classes and tablespaces

As for PGDATA and WAL volumes, you can use different storage classes for your
tablespaces too. This is a very convenient way of optimizing your resources,
You can use different storage classes for your tablespaces, just as you can for PGDATA and
WAL volumes. This is a convenient way of optimizing your resources,
balancing performance and costs of your storage based on data access usage and
expectations.

Let's use the following example to explain the feature:
This example helps to explain the feature:

```yaml
apiVersion: postgresql.cnpg.io/v1
@@ -153,18 +153,18 @@ spec:
storageClass: balanced
```

The `yardbirds` cluster example above requests 4 persistent volume claims using
The `yardbirds` cluster example requests 4 persistent volume claims using
3 different storage classes:

- default storage class: used by the `PGDATA` and WALs
- `fastest`: used by the `current` tablespace to store the most active and
demanding set of data in the database
- `balanced`: used by the `this_year` tablespace to store older partitions of
- Default storage class – Used by the `PGDATA` and WAL volumes.
- `fastest` – Used by the `current` tablespace to store the most active and
demanding set of data in the database.
- `balanced` – Used by the `this_year` tablespace to store older partitions of
data that are rarely accessed by users and where performance expectations
are not the highest
aren't the highest.

You can then take advantage of horizontal table partitioning and create
the current month's table (e.g. facts for December 2023) in the `current`
the current month's table (for example, facts for December 2023) in the `current`
tablespace:

``` sql
@@ -174,18 +174,18 @@ CREATE TABLE facts_202312 PARTITION OF facts
```

!!! Important
The above example assumes you are familiar with
This example assumes you're familiar with
[PostgreSQL declarative partitioning](https://www.postgresql.org/docs/current/ddl-partitioning.html).

## Tablespace ownership

By default, unless differently specified, tablespaces are owned by the `app`
application user (as defined in `.spec.bootstrap.initdb.owner`) — see
["Bootstrap a new cluster](bootstrap.md#bootstrap-an-empty-cluster-initdb) for
By default, unless otherwise specified, tablespaces are owned by the `app`
application user, as defined in `.spec.bootstrap.initdb.owner`. See
[Bootstrap a new cluster](bootstrap.md#bootstrap-an-empty-cluster-initdb) for
details.
This default behavior should work in most microservice database use cases.
This default behavior works in most microservice database use cases.

You can set the owner of a tablespace through the `owner` stanza, for example
You can set the owner of a tablespace in the `owner` stanza, for example
the `postgres` user, like in the following excerpt:

```yaml
@@ -199,13 +199,13 @@ the `postgres` user, like in the following excerpt:
```

!!! Important
Make sure that, if you change the ownership of a tablespace, you are using
an existing role. Otherwise, the status of the cluster will report the
issue and stop reconciling tablespaces until fixed. It is your responsibility
to monitor the status and the log, and promptly intervene by fixing the issue.
If you change the ownership of a tablespace, make sure that you're using
an existing role. Otherwise, the status of the cluster reports the
issue and stops reconciling tablespaces until fixed. It's your responsibility
to monitor the status and the log and to promptly intervene by fixing the issue.

If you define a tablespace with an owner that doesn't exist, CloudNativePG will
be unable to create the tablespace, and will reflect this in the cluster status:
If you define a tablespace with an owner that doesn't exist, CloudNativePG can't
create the tablespace and reflects this in the cluster status:

``` yaml
spec:
@@ -239,31 +239,31 @@ spec:
status: pending
```
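
One way to reduce the risk of referencing a missing role is to declare the role in the same `Cluster` resource. The following is a minimal sketch, not taken from the sample files. It assumes a hypothetical `archivist` role declared in the `.spec.managed.roles` stanza and a hypothetical `tbs_archive` tablespace; the cluster name and sizes are placeholders. The order in which roles and tablespaces are reconciled isn't covered here, so check the cluster status after applying a manifest like this one.

```yaml
# Hypothetical example: all names and sizes are placeholders
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example-owner
spec:
  instances: 3
  storage:
    size: 1Gi
  managed:
    roles:
      # Declare the role so that it exists in PostgreSQL
      - name: archivist
        ensure: present
        login: false
  tablespaces:
    - name: tbs_archive
      # The owner must be an existing role
      owner:
        name: archivist
      storage:
        size: 1Gi
```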

## Backup and Recovery
## Backup and recovery

CloudNativePG automatically handles backup of tablespaces (and the relative
CloudNativePG handles backup of tablespaces (and the related
tablespace map) both on object stores and volume snapshots.

!!! Warning
By default, backups are taken from replica nodes. A backup taken immediately
after the creation of tablespaces in a cluster could result in an
incomplete view of the tablespaces from the replica, and thus an incomplete
after creating tablespaces in a cluster can result in an
incomplete view of the tablespaces from the replica and thus an incomplete
backup. The lag will be resolved in a maximum of 5 minutes, with the next
reconciliation.

Once a cluster with tablespaces has a base backup, it is possible to restore a
new cluster from it. When it comes to the recovery side, it is your
Once a cluster with tablespaces has a base backup, you can restore a
new cluster from it. When it comes to the recovery side, it's your
responsibility to ensure that the `Cluster` definition of the recovered
database contains the exact list of tablespaces.
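
As a minimal sketch of this requirement, the following hypothetical manifest restores from a backup and repeats the tablespace list of the original cluster. It assumes the original cluster defined the `tbs1`, `tbs2`, and `tbs3` tablespaces. The cluster name, the backup name, and the sizes are placeholders rather than values from the sample files.

```yaml
# Hypothetical example: cluster name, backup name, and sizes are placeholders
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-restore
spec:
  instances: 3
  storage:
    size: 1Gi
  bootstrap:
    recovery:
      backup:
        # Name of an existing Backup resource of the original cluster
        name: backup-example
  # Repeat the full tablespace list of the cluster that was backed up
  tablespaces:
    - name: tbs1
      storage:
        size: 1Gi
    - name: tbs2
      storage:
        size: 2Gi
    - name: tbs3
      storage:
        size: 2Gi
```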

## Replica clusters

Replica clusters must have the same tablespace definition as their origin.
The reason is that tablespace management commands like `CREATE TABLESPACE`
are WAL logged and will be replayed by any physical replication client (streaming and/or via WAL shipping).
are WAL logged and are replayed by any physical replication client (streaming or by way of WAL shipping).

It is your responsibility to ensure that replica cluster have the same list of
tablespaces, with the same name (storage class and size might vary).
It's your responsibility to ensure that replica clusters have the same list of
tablespaces, with the same name. Storage class and size might vary.

For example:

Expand Down Expand Up @@ -291,21 +291,21 @@ spec:

PostgreSQL allows you to define one or more temporary tablespaces to create
temporary objects (temporary tables and indexes on temporary tables) when a
`CREATE` command does not explicitly specify a tablespace, as well as temporary
`CREATE` command doesn't explicitly specify a tablespace, and to create temporary
files for purposes such as sorting large data sets. When no temporary
tablespace is specified, PostgreSQL uses the default tablespace of a database -
tablespace is specified, PostgreSQL uses the default tablespace of a database, which is
currently the main `PGDATA` volume.

When you specify more than one temporary tablespace, PostgreSQL randomly picks
one the first time a temporary object needs to be created in a transaction,
then sequentially iterates through the list.
one the first time a temporary object needs to be created in a transaction.
Then it sequentially iterates through the list.

Temporary tablespaces work like regular tablespaces, also regarding backups.
Temporary tablespaces also work like regular tablespaces with regard to backups.

CloudNativePG provides the `.spec.tablespaces[*].temporary` option to
determine whether a tablespace should be added to the `temp_tablespaces`
PostgreSQL parameter, and thus become eligible to store temporary data that
does not have an explicit tablespace assignment.
determine whether to add a tablespace to the `temp_tablespaces`
PostgreSQL parameter, making it eligible to store temporary data that
doesn't have an explicit tablespace assignment.

```yaml
spec:
@@ -318,22 +318,22 @@ spec:
```

They can be created at initialization time or added later, requiring a
rolling update. The `temporary: true/false` option simply adds/removes the
tablespace name to/from the list of tablespaces in the `temp_tablespaces`
option (which doesn't require a restart of PostgreSQL to be changed).
rolling update. The `temporary: true/false` option adds or removes the
tablespace name to or from the list of tablespaces in the `temp_tablespaces`
option. This change doesn't require a restart of PostgreSQL.

Although temporary tablespaces can also work as regular tablespaces (meaning
that users can also host regular data on them while also using them for
temporary operations), we recommend not to mix the two workloads.
that users can also host regular data on them while using them for
temporary operations), we recommend that you don't mix the two workloads.
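
A minimal sketch of that separation, using hypothetical tablespace names and sizes: one tablespace is reserved for temporary objects and files, and another holds regular data only.

```yaml
spec:
  # ... rest of the Cluster spec omitted
  tablespaces:
    # Reserved for temporary objects and files: added to temp_tablespaces
    - name: scratch
      temporary: true
      storage:
        size: 1Gi
    # Regular data only: not added to temp_tablespaces
    - name: facts_data
      storage:
        size: 10Gi
```

With a layout like this, regular tables and indexes land in `facts_data` only when a command names it explicitly, for example with a `TABLESPACE facts_data` clause.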

See [PostgreSQL documentation on `temp_tablespaces`](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-TEMP-TABLESPACES)
See the [PostgreSQL documentation on `temp_tablespaces`](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-TEMP-TABLESPACES)
for details.

## kubectl plugin support

The [kubectl status](kubectl-plugin.md#status) plugin includes a section
dedicated to tablespaces which offers a convenient overview, including
tablespace status, owner, temporary flag, or any errors:
dedicated to tablespaces that offers a convenient overview, including
tablespace status, owner, temporary flag, and any errors:

``` yaml
[...]
@@ -351,5 +351,5 @@ Instances status

## Limitations

Currently, tablespaces cannot be removed from an existing CloudNativePG
Currently, you can't remove tablespaces from an existing CloudNativePG
cluster.
