TimescaleDB: Add psql to timescaledb migration scripts #1364

Merged
merged 22 commits into from
Jan 8, 2021
1 change: 1 addition & 0 deletions .github/ct-install.yaml
@@ -3,6 +3,7 @@ chart-repos:
- fluxcd=https://charts.fluxcd.io
- loki=https://grafana.github.io/helm-charts
- prometheus=https://prometheus-community.github.io/helm-charts
- timescaledb=https://raw.githubusercontent.com/timescale/timescaledb-kubernetes/master/charts/repo/
- traefik=https://helm.traefik.io/traefik
check-version-increment: false
charts:
1 change: 1 addition & 0 deletions .github/ct-lint.yaml
@@ -3,5 +3,6 @@ chart-repos:
- fluxcd=https://charts.fluxcd.io
- loki=https://grafana.github.io/helm-charts
- prometheus=https://prometheus-community.github.io/helm-charts
- timescaledb=https://raw.githubusercontent.com/timescale/timescaledb-kubernetes/master/charts/repo/
- traefik=https://helm.traefik.io/traefik
check-version-increment: false
8 changes: 4 additions & 4 deletions charts/hedera-mirror/requirements.lock
@@ -10,15 +10,15 @@ dependencies:
version: 0.12.0-rc1
- name: postgresql-ha
repository: https://charts.bitnami.com/bitnami
version: 6.3.2
version: 6.3.4
- name: redis
repository: https://charts.bitnami.com/bitnami
version: 12.2.3
version: 12.2.4
- name: hedera-mirror-rest
repository: file://../hedera-mirror-rest
version: 0.12.0-rc1
- name: timescaledb-multinode
repository: https://raw.githubusercontent.com/timescale/timescaledb-kubernetes/master/charts/repo/
version: 0.7.0
digest: sha256:b261bda6d871b9ddc44e06ad21b51429a14028ca8db2ff330d54a11e5109e630
generated: "2020-12-21T15:48:49.454221-06:00"
digest: sha256:5b4f3a8e19c559e3507fce9339b86f91a82da8cbe57eee076c5d67f085c5b3f8
generated: "2021-01-04T17:28:47.546372-06:00"
33 changes: 21 additions & 12 deletions charts/hedera-mirror/templates/job-timescaledb-init.yaml
@@ -2,41 +2,50 @@
apiVersion: batch/v1
kind: Job
metadata:
labels:
{{- include "hedera-mirror.labels" . | nindent 4 }}
labels: {{- include "hedera-mirror.labels" . | nindent 4 }}
name: {{ include "hedera-mirror.dbHost" . }}-init-job
namespace: {{ include "hedera-mirror.namespace" . }}
annotations:
"helm.sh/hook": post-install
"helm.sh/hook-delete-policy": hook-succeeded
spec:
backoffLimit: 4
backoffLimit: 1
template:
spec:
containers:
- name: init-mirrornode-db
image: timescaledev/timescaledb-ha:pg12-ts2.0.0-rc3
image: "{{ .Values.timescaledb.image.repository }}:{{ .Values.timescaledb.image.tag }}"
command:
- sh
- -c
- |
set -e

while ! pg_isready -U postgres -h {{ include "hedera-mirror.dbHost" . }}-data; do sleep 1; done;
psql --echo-queries -d "${ACCESS_SVC_CONNSTR_POSTGRES}" --set ON_ERROR_STOP=1 -f ${DB_INIT_FILE}
psql --echo-queries -d "${ACCESS_SVC_CONNSTR_POSTGRES}" --set ON_ERROR_STOP=1 -f ${DB_INIT_DIR}/users_v2.sql
echo 'Completed db initialization and user creation for mirror node'

psql --echo-queries -d "${ACCESS_SVC_CONNSTR_MIRRORNODE}" --set ON_ERROR_STOP=1 -f ${DB_INIT_DIR}/schema_v2.sql
echo 'Completed db schema initialization'

psql --echo-queries -d "${ACCESS_SVC_CONNSTR_POSTGRES}" --set ON_ERROR_STOP=1 -f ${DB_INIT_DIR}/path_v2.sql
echo 'Completed db search path setting'

echo 'Completed db initialization for mirror node'
env:
- name: ACCESS_SVC_CONNSTR_POSTGRES
value: host={{ include "hedera-mirror.dbHost" . }} user=postgres connect_timeout=3 sslmode=disable password={{ .Values.timescaledb.credentials.accessNode.superuser }}
- name: DB_INIT_FILE
value: /usr/etc/db-init/init.sql
- name: ACCESS_SVC_CONNSTR_MIRRORNODE
value: host={{ include "hedera-mirror.dbHost" . }} dbname={{ .Values.importer.config.hedera.mirror.importer.db.name }} user={{ .Values.importer.config.hedera.mirror.importer.db.owner }} connect_timeout=3 sslmode=disable password={{ .Values.importer.config.hedera.mirror.importer.db.ownerPassword }}
- name: DB_INIT_DIR
value: /usr/etc/db-init
volumeMounts:
- name: timescale-db-init-volume
- name: timescaledb-init-volume
mountPath: /usr/etc/db-init
volumes:
- name: timescale-db-init-volume
- name: timescaledb-init-volume
secret:
defaultMode: 420
secretName: {{ include "hedera-mirror.dbHost" . }}-init
restartPolicy: OnFailure
ttlSecondsAfterFinished: 600
{{- end -}}
restartPolicy: Never
{{- end -}}
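
The init job above polls `pg_isready` in a loop with no upper bound, so a database that never becomes ready keeps the pod waiting indefinitely. A minimal sketch of a bounded variant follows. The `wait_for` helper and the host name in the usage comment are hypothetical and are not part of this chart.

```shell
#!/bin/sh
# Hypothetical helper, not part of the chart: retry a command a limited
# number of times instead of looping forever.
set -eu

# wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_for() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (assumed host name): allow roughly 60 seconds for the data node.
# wait_for 60 1 pg_isready -U postgres -h my-release-timescaledb-data
```

Because the job script runs under `set -e`, a failed `wait_for` would end the script with a non-zero status and count against `backoffLimit`.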
2 changes: 1 addition & 1 deletion charts/hedera-mirror/templates/secret-postgresql.yaml
@@ -8,7 +8,7 @@ metadata:
namespace: {{ include "hedera-mirror.namespace" . }}
type: Opaque
stringData:
init.sql: |-
init_v1.sql: |-
{{- $dbname := .Values.importer.config.hedera.mirror.importer.db.name }}
{{- $password := .Values.importer.config.hedera.mirror.importer.db.password }}
{{- $username := .Values.importer.config.hedera.mirror.importer.db.username }}
74 changes: 61 additions & 13 deletions charts/hedera-mirror/templates/secret-timescaledb.yaml
@@ -4,27 +4,75 @@ kind: Secret
metadata:
labels:
{{- include "hedera-mirror.labels" . | nindent 4 }}
name: {{ printf "%s-timescaledb-init" .Release.Name }}
name: {{ include "hedera-mirror.dbHost" . }}-init
namespace: {{ include "hedera-mirror.namespace" . }}
type: Opaque
stringData:
init.sql: |-
users_v2.sql: |-
{{- $dbName := .Values.importer.config.hedera.mirror.importer.db.name }}
{{- $dbOwner := .Values.importer.config.hedera.mirror.importer.db.owner }}
{{- $dbOwnerPassword := .Values.importer.config.hedera.mirror.importer.db.ownerPassword }}
{{- $importerUser := .Values.importer.config.hedera.mirror.importer.db.username }}
{{- $importerPassword := .Values.importer.config.hedera.mirror.importer.db.password }}
{{- $grpcUsername := .Values.grpc.config.hedera.mirror.grpc.db.username }}
{{- $grpcPassword := .Values.grpc.config.hedera.mirror.grpc.db.password }}
{{- $restUser := .Values.global.rest.username }}
{{- $restPassword := .Values.global.rest.password }}
create database {{ $dbName }};
{{- $dbSchema := .Values.importer.config.hedera.mirror.importer.db.schema }}

-- create owner user
create user {{ $dbOwner }} with login password '{{ $dbOwnerPassword }}';

-- create primary user and db
create database {{ $dbName }} with owner {{ $dbOwner }};

-- create roles
create role readonly;
create role readwrite in role readonly;

-- create users
create user {{ $grpcUsername }} with login password '{{ $grpcPassword }}' in role readonly;
create user {{ $restUser }} with login password '{{ $restPassword }}' in role readonly;
create user {{ $importerUser }} with login password '{{ $importerPassword }}' in role readwrite;

-- drop timescaledb extension for future install to ensure availability in custom schema
drop extension if exists timescaledb cascade;
schema_v2.sql: |-
-- create schema and set schema user permissions
create schema if not exists {{ $dbSchema }} authorization {{ $dbOwner }};
grant usage on schema {{ $dbSchema }} to public;

-- revoke default public permissions on schema
revoke create on schema {{ $dbSchema }} from public;

-- grant connect and schema access to readonly role
grant connect on database {{ $dbName }} to readonly;
grant usage on schema {{ $dbSchema }} to readonly;

-- grant select privileges on tables to readonly
-- grant all privileges on all tables in schema {{ $dbSchema }} to {{ $dbOwner }};
grant select on all tables in schema {{ $dbSchema }} to readonly;
alter default privileges in schema {{ $dbSchema }} grant select on tables to readonly;

-- grant select privileges on sequences to readonly
-- grant all privileges on all sequences in schema {{ $dbSchema }} to {{ $dbOwner }};
grant select on all sequences in schema {{ $dbSchema }} to readonly;
alter default privileges in schema {{ $dbSchema }} grant select on sequences to readonly;

-- grant write privileges on tables and sequences to readwrite
grant insert, update, delete on all tables in schema {{ $dbSchema }} to readwrite;
alter default privileges in schema {{ $dbSchema }} grant insert, update, delete on tables to readwrite;
grant usage on all sequences in schema {{ $dbSchema }} to readwrite;
alter default privileges in schema {{ $dbSchema }} grant usage on sequences to readwrite;
path_v2.sql: |-
\c {{ $dbName }};
create extension if not exists timescaledb cascade;
create user {{ $importerUser }} with login password '{{ $importerPassword }}';
create role viewer;
create user {{ $grpcUsername }} with login password '{{ $grpcPassword }}' in role viewer;
create user {{ $restUser }} with login password '{{ $restPassword }}' in role viewer;
grant select on all tables in schema public to {{ $importerUser }};
grant select on all tables in schema public to viewer;
alter default privileges for role {{ $importerUser }} in schema public grant select on tables to viewer;
create extension pg_stat_statements;
{{- end -}}
-- alter search path for given schema
alter user {{ $dbOwner }} set search_path = {{ $dbSchema }}, public;
alter user {{ $importerUser }} set search_path = {{ $dbSchema }}, public;
alter user {{ $grpcUsername }} set search_path = {{ $dbSchema }}, public;
alter user {{ $restUser }} set search_path = {{ $dbSchema }}, public;

-- add extensions, ensuring they're available to new schema
create extension if not exists timescaledb cascade schema {{ $dbSchema }};
create extension if not exists pg_stat_statements cascade schema {{ $dbSchema }};
{{- end -}}
7 changes: 4 additions & 3 deletions charts/hedera-mirror/values.yaml
@@ -185,6 +185,9 @@ timescaledb:
dataNode:
superuser: mirror_node_pass
enabled: false
image:
repository: timescaledev/timescaledb-ha
tag: pg12.5-ts2.0.0-p0
persistentVolume:
size: 500Gi
resources:
@@ -200,7 +203,5 @@ timescaledb:
parameters:
max_wal_size: 8GB # recommended to be 80% of the Volume Size
min_wal_size: 2GB # 80% of the WAL Volume Size
shared_buffers: 1GB # recommended to be 25% of available instance memory
shared_buffers: 1GB # recommended to be 25% of available instance memory
work_mem: 50MB


16 changes: 16 additions & 0 deletions docker-compose.yml
@@ -67,3 +67,19 @@ services:
tty: true
ports:
- 5551:5551

timescaledb:
deploy:
replicas: 0
image: timescaledev/timescaledb-ha:pg12.5-ts2.0.0-p0
restart: unless-stopped
stop_grace_period: 2m
stop_signal: SIGTERM
tty: true
environment:
POSTGRES_PASSWORD: mirror_node_pass
volumes:
- ./timescaledb:/var/lib/postgresql/data
- ./hedera-mirror-importer/src/main/resources/db/scripts/init_v2.sql:/docker-entrypoint-initdb.d/init_v2.sql
ports:
- 5432:5432
7 changes: 5 additions & 2 deletions docs/configuration.md
@@ -26,9 +26,12 @@ value, it is recommended to only populate overridden properties in the custom `a
| `hedera.mirror.importer.db.host` | 127.0.0.1 | The IP or hostname used to connect to the database |
| `hedera.mirror.importer.db.loadBalance` | true | Whether to enable pgpool load balancing. If false, it sends all reads to the primary db backend instead of load balancing them across the primary and replicas. |
| `hedera.mirror.importer.db.name` | mirror_node | The name of the database |
| `hedera.mirror.importer.db.password` | mirror_node_pass | The database password the processor uses to connect. |
| `hedera.mirror.importer.db.owner` | mirror_node | The username of the db user with owner permissions to create and modify the schema |
| `hedera.mirror.importer.db.ownerPassword` | mirror_node_pass | The password for the owner user the processor uses to connect. |
| `hedera.mirror.importer.db.password` | mirror_node_pass | The database password for the Importer user the processor uses to connect. |
| `hedera.mirror.importer.db.port` | 5432 | The port used to connect to the database |
| `hedera.mirror.importer.db.username` | mirror_node | The username the processor uses to connect to the database |
| `hedera.mirror.importer.db.schema` | public | The name of the custom schema in which database objects will be created. Applicable from v2 of the data schema |
| `hedera.mirror.importer.db.username` | mirror_node | The Importer username the processor uses to connect to the database |
| `hedera.mirror.importer.downloader.accessKey` | "" | The cloud storage access key |
| `hedera.mirror.importer.downloader.allowAnonymousAccess` | | Whether the cloud storage bucket allows for anonymous access. |
| `hedera.mirror.importer.downloader.balance.batchSize` | 15 | The number of signature files to download per node before downloading the signed files |
56 changes: 45 additions & 11 deletions docs/installation.md
@@ -19,16 +19,31 @@ runnable Mirror Node JAR file in the `target` directory.
## Running Locally

### Database Setup
In addition to OpenJDK 11, you will need to install a database and initialize it.
The Mirror Node utilizes [PostgreSQL](https://postgresql.org) v9.6 or [TimescaleDB](https://docs.timescale.com/latest/main) v2 depending on the version of its database schema.

In addition to OpenJDK 11, you will need to install [PostgreSQL](https://postgresql.org) 9.6 and initialize it. The only
setup required is to create the initial database and owner since [Flyway](https://flywaydb.org) manages the database
schema. The SQL script located at `hedera-mirror-importer/src/main/resources/db/scripts/init.sql` can be used to
accomplish this. Edit the file and change the `db_name`, `db_user`, `db_password` `db_owner`, `grpc_user`, or
`grpc_password` as appropriate. Make sure the application [configuration](configuration.md) matches the values in the
script. Run the script as a DB admin user and check the output carefully to ensure no errors occurred.
For both databases, [Flyway](https://flywaydb.org) manages the database schema, so the only required setup steps are:
* creating the database, users, schema, and extensions.
* ensuring all permissions are set.

Scripts for v1 and v2 are provided to accomplish this.
Make sure the application [configuration](configuration.md) matches the values in the script.
Run the script as a superuser and check the output carefully to ensure no errors occurred.

#### PostgreSQL (V1)
Edit the SQL script located at `hedera-mirror-importer/src/main/resources/db/scripts/init_v1.sql` and change the `db_name`, `db_user`, `db_password`, `db_owner`, `grpc_user`, or `grpc_password` as appropriate.
Then run the script:

```console
psql postgres -f hedera-mirror-importer/src/main/resources/db/scripts/init_v1.sql
```

#### TimescaleDB (V2)
Edit the SQL script located at `hedera-mirror-importer/src/main/resources/db/scripts/init_v2.sql` and change the db user names, passwords and schema as appropriate.
Then run the script:

```console
psql postgres -f hedera-mirror-importer/src/main/resources/db/scripts/init.sql
psql postgres -f hedera-mirror-importer/src/main/resources/db/scripts/init_v2.sql
```

### Importer
@@ -86,7 +101,7 @@ npm test

Docker Compose scripts are provided and run all the mirror node components:

- PostgreSQL database
- PostgreSQL/TimescaleDB database
- GRPC API
- Importer
- Monitor
@@ -95,14 +110,33 @@ Docker Compose scripts are provided and run all the mirror node components:
Containers use the following persisted volumes:

- `./db` on your local machine maps to `/var/lib/postgresql/data` in the containers. This contains the files for the
PostgreSQL database. If the database container fails to initialise properly and the database fails to run, you will
have to delete this folder prior to attempting a restart otherwise the database initialisation scripts will not be
run.
PostgreSQL/TimescaleDB database. If the database container fails to initialise properly and the database fails to run,
you will have to delete this folder prior to attempting a restart otherwise the database initialisation scripts will
not be run.

- `./data` on your local machine maps to `/var/lib/hedera-mirror-importer` in the container. This contains files
downloaded from S3 or GCP. These are necessary not only for the database data to be persisted, but also so that the
parsing containers can access files obtained via the downloading containers.

### Configuration

#### TimescaleDB vs PostgreSQL
To use TimescaleDB instead of the default PostgreSQL database, disable the PostgreSQL container and enable the TimescaleDB container.

To do this, update `docker-compose.yml` to set the `replicas` of the PostgreSQL `db` service to 0 and remove the same setting from the `timescaledb` service, as follows:
```yaml
...
services:
db:
deploy:
replicas: 0
...
timescaledb:
# deploy:
# replicas: 0
...
```

### Starting

Before starting, [configure](configuration.md) the application by updating the [application.yml](../application.yml)
43 changes: 43 additions & 0 deletions docs/operations.md
@@ -93,6 +93,49 @@ systemctl status hedera-mirror-importer.service
sudo journalctl -fu hedera-mirror-importer.service
```

### v1 to v2 Data Migration

To support time series logic, the Mirror Node DB schema shifted from PostgreSQL (v1) to TimescaleDB (v2).
[Migrating from a Different PostgreSQL Database](https://docs.timescale.com/latest/getting-started/migrating-data#different-db) highlights the general recommended data migration steps when moving to TimescaleDB.

Mirror node operators running the v1 db schema can take the following steps to upgrade to v2.

1. Set up a new TimescaleDB database

A new TimescaleDB server must be spun up.

Refer to Mirror Node [DB Installation](installation.md#database-setup) for manual instructions.

To use the Mirror Node configured docker container, simply run:

```shell script
$ docker-compose up timescaledb
```

Refer to [TimescaleDB Installation Instructions](https://docs.timescale.com/latest/getting-started/installation) for other installation options.

> **_NOTE:_** The following steps assume the database, users and schema have been created as detailed above

2. Configure migration properties

The configuration file `hedera-mirror-importer/src/main/resources/db/scripts/timescaledb/migration.config` contains the database variables used by the migration script.
These include the database names, users, passwords and hosts for both the existing database and the new database.

Update these values to match your database setup.

3. Run migration script

From the `hedera-mirror-importer/src/main/resources/db` directory, run the `migration.sh` script:
```shell script
$ ./scripts/timescaledb/migration.sh
```

The script uses successive `psql` connections to back up, configure and restore data on the new database nodes.
First, it copies over the `flyway_schema_history` table to maintain the migration history.
It then uses the migration SQL script from normal Flyway operations to create the new tables, and creates the Timescale hypertables based on them.
Next, the tables from the old database are backed up as CSV files using `\COPY`, and the data is inserted into the new database, also using `\COPY`.
Finally, the schema of the `flyway_schema_history` table is updated and the sequence values are updated to ensure continuation.
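
The CSV backup and restore step described above can be illustrated with a short dry-run sketch. It only prints the `psql` commands for each table rather than running them, so they can be reviewed first. The connection strings, backup directory, table list and the `copy_table_commands` function are illustrative assumptions. They are not taken from `migration.sh` or `migration.config`.

```shell
#!/bin/sh
# Dry-run sketch of a per-table CSV copy between two databases.
# All names below (connection strings, backup directory, table list) are
# illustrative assumptions, not values read from migration.config.
set -eu

OLD_DB_CONN="${OLD_DB_CONN:-host=localhost port=5432 dbname=mirror_node user=mirror_node}"
NEW_DB_CONN="${NEW_DB_CONN:-host=localhost port=6432 dbname=mirror_node user=mirror_node}"
BACKUP_DIR="${BACKUP_DIR:-./backup}"

# Print the export and import commands for one table without running them.
copy_table_commands() {
  table="$1"
  csv="${BACKUP_DIR}/${table}.csv"
  # Export the table from the old database to a CSV file.
  printf '%s\n' "psql -d \"${OLD_DB_CONN}\" --set ON_ERROR_STOP=1 -c \"\\copy ${table} to '${csv}' with (format csv)\""
  # Load the same CSV file into the matching table in the new database.
  printf '%s\n' "psql -d \"${NEW_DB_CONN}\" --set ON_ERROR_STOP=1 -c \"\\copy ${table} from '${csv}' with (format csv)\""
}

# Hypothetical table list; the real set of tables comes from the schema.
for t in flyway_schema_history transaction; do
  copy_table_commands "$t"
done
```

`\copy` runs on the client side, so the CSV files are written to and read from the machine running `psql`, not the database server.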

## Monitor

The monitor is a Java-based application and should be able to run on any platform that Java supports. That said, we