In this guide we will step through setting up pgEdge Platform. We’ll focus for now on setting this up on Rocky Linux 8, however very similar steps are used on Debian based systems or on MacOS. Our goal in this guide is to create a multi-active pgEdge cluster and then use pgbench to create some representative tables and read/write activity on the cluster.
Two software components from pgEdge will be used in this guide. First, we’ll install nodectl
which is used to configure Postgres and install additional extensions. Second is Spock
, the Postgres extension providing logical, multi-active replication.
You’ll need root permissions on these systems in order to install any OS
packages needed as prerequisites.
- Enterprise Linux (EL) 8+, Ubuntu 20.04+, Debian 10+, OSX M1/M2
- A non-root user with
sudo
privileges systemctl
availablePython 3.6+
& an updated pip3- Python packages install via updated
pip3
as described below - Two servers (vm's are fine) networked with traffic on port 5432 allowed
- SSH access into the servers
On EL8+ systems, we’ll make sure these packages are installed:
yum install -y python39 python39-pip curl sudo
On Debian based systems, we'll make sure the native packages are installed:
apt install -y python3 python3-devel python3-psutil
On both EL8+ & Debian based systems, we'll make sure the latest pip3
is installed and
then install psycopg3 as follows:
pip3 install psycopg psycopg-binary --user
In any directory owned by your user, use the following command to install nodectl
:
python3 -c "$(curl -fsSL https://pgedge-download.s3.amazonaws.com/REPO/install.py)"
cd into the pgedge
directory created and install the pgEdge Platform with the nc
command. Specify a superuser name, password, and a database name. Note that the names cannot be pgEdge and cannot be any postgreSQL reserved words. For the examples given in this documentation, I will be using a database named demo.
./nc spock install -U <superuser-name> -P <superuser-password> -d <database-name>
For this demo I will be using the following command:
./nc spock install -U admin -P mypassword1 -d demo
If you encounter an error running this command, you may need to update your SELINUX mode to permissive, reboot, and retry the operation.
Using nodectl on each node, create the spock components needed for replication. First you will create a spock node by providing the name of the node, network address, and database name. You will provide the IP address of each node and the name of the pgedge user which has been created for replication, not the super user you created. Next you will make replication sets by providing the replication set name and the database name. For both the node name (n1) and the replication set name (demo_replication_set), these can be whatever you want but you will have to reference them in future commands.
Node n1
(IP address 10.1.2.5):
./nc spock create-node n1 "host=10.1.2.5 user=pgedge dbname=demo" demo ./nc spock create-repset demo_replication_set demo
Node n2
(IP address 10.2.2.5):
./nc spock create-node n2 'host=10.2.2.5 user=pgedge dbname=demo' demo ./nc spock create-repset demo_replication_set demo
Next, use nodectl to create the subscriptions. For these commands you will provide the subscription name, the network address for the node this one is subscribing to, and the database name.
Node n1
:
./nc spock create-sub sub_n1n2 'host=10.2.2.5 port=5432 user=pgedge dbname=demo' demo
Node n2
:
./nc spock create-sub sub_n2n1 'host=10.1.2.5 port=5432 user=pgedge dbname=demo' demo
At this point, you will have a two node cluster with cross subscriptions on n1
to n2
and n2
to n1
. For replication to begin, you will need to add tables to the replication sets and then add those replications to the subscriptions. For this demo, I will be using pgBench to set up a very simple four table database.
You can source the postgres environment variables and connect to your database with:
source pg15/pg15.env
This also adds pgbench and psql to your PATH. When using either command, you will still need to specify your database name, for example:
psql demo
Initialize with the pgBench command, psql into the postgreSQL database:
pgbench -i demo
Once connected to the database, alter the numeric columns to have LOG_OLD_VALUE
equal to true. This will make these numeric fields Conflict-Free Delta-Apply columns.
ALTER TABLE pgbench_accounts ALTER COLUMN abalance SET (LOG_OLD_VALUE=true); ALTER TABLE pgbench_branches ALTER COLUMN bbalance SET (LOG_OLD_VALUE=true); ALTER TABLE pgbench_tellers ALTER COLUMN tbalance SET (LOG_OLD_VALUE=true);
Dump and restore the database on to node 2 so that you are left with two databases with the same schema and data.
n1
:
pg_dump -v -d demo > dump-1.sql
Copy file to n2
, and run on n2
:
psql demo -f dump-1.sql
Run the following on both nodes to add these tables to the replication set. The fourth table, pgbench_history, cannot be added because it does not have a primary key.
./nc spock repset-add-table demo_replication_set pgbench_branches demo ./nc spock repset-add-table demo_replication_set pgbench_tellers demo ./nc spock repset-add-table demo_replication_set pgbench_accounts demo
Finish the set up by adding the replication sets to the subscriptions you had created.
n1
:
./nc spock sub-add-repset sub_n1n2 demo_replication_set demo
n2
:
./nc spock sub-add-repset sub_n2n1 demo_replication_set demo
Check the configuration with the following sql statements
demo=# SELECT * FROM spock.node; node_id | node_name ---------+---------- 673694252 | n1 560818415 | n2 (2 rows)
demo=# SELECT sub_id, sub_name, sub_slot_name, sub_replication_sets FROM spock.subscription; sub_id | sub_name | sub_slot_name | sub_replication_sets ------------+----------+----------------------+-------------------------------------------------------- 3293941396 | sub_n1n2 | spk_demo_n2_sub_n1n2 | {default,default_insert_only,ddl_sql,demo_replication_set} (1 row)
Run an update on n1
to see the update on n2
.
n1
:
demo=# SELECT * FROM pgbench_tellers WHERE tid = 1; tid | bid | tbalance | filler -----+-----+----------+-------- 1 | 1 | 0 | (1 row)
demo=# UPDATE pgbench_tellers SET filler = 'test' WHERE tid = 1; UPDATE 1
Check n2
:
demo=# SELECT * FROM pgbench_tellers WHERE tid = 1; tid | bid | tbalance | filler -----+-----+----------+-------------------------------------------------- 1 | 1 | 0 | test (1 row)
Run the following command on both nodes at the same time to run pgBench for one minute.
pgbench -R 100 -T 60 -n demo
Check the results on both nodes and see that the sum of the tbalance columns match on both pgbench_tellers tables. Without the Conflict-Free Delta-Apply columns, each conflict would have resulted in accepting the first in, potentially leading to sums that do not match between nodes.
n1
:
demo=# SELECT SUM(tbalance) FROM pgbench_tellers; sum | ------+ -84803 (1 row)
n2
:
demo=# SELECT SUM(tbalance) FROM pgbench_tellers; sum | ------+ -84803 (1 row)