-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Logging and Setting Status/BlockHeight for new Tables #299
Comments
(Option A) Should we provision one indexer_log_entries table per user DB or (Option B) provision a new table for every new user indexer or Option A:Pros
Cons
Option B:Pros
Cons
Seems like Option B will be a more performant solution here when it comes to querying and writing to the database as well as maintenance. |
@pkudinov I do not think we need to |
|
Hi @Kevin101Zhang here are some docs that we can talk in today's meeting:
|
@Kevin101Zhang please let's figure out how many records we have per indexer and then per indexer / date to validate the real need for partitions on logs table: something like: select indexer_name, count(1) from logs
group by 1
order 2 desc select user, count(1) from logs
group by 1
order 2 desc select user, indexer_name, count(1) from logs
group by 1, 2
order 3 desc select date(logs_date) , indexer_name, count(1) from logs
group by 1,2
order 3 desc |
Separate table per indexer, one partition per day. CREATE TABLE logs (
log_id INT PRIMARY KEY,
log_date DATE,
log_type TEXT NOT NULL CHECK (log_type IN ('system', 'indexer')),
timestamp TIMESTAMP,
block_height INT NOT NULL,
message TEXT,
-- add tsvector for message fulltext search
level TEXT NOT NULL CHECK (level IN ('DEBUG', 'INFO', 'WARN', 'ERROR')),
); |
-- 1) Let's create the logs table
CREATE TYPE log_type AS ENUM (
'system',
'indexer'
);
CREATE TYPE log_level AS ENUM (
'DEBUG',
'INFO',
'WARN',
'ERROR'
);
CREATE TABLE logs (
id BIGSERIAL NOT NULL,
block_height NUMERIC(20) NOT NULL,
log_date DATE NOT NULL,
log_timestamp TIMESTAMP NOT NULL,
log_type log_type NOT NULL,
log_level log_level NOT NULL,
message TEXT NOT NULL,
PRIMARY KEY (log_date, id)
) PARTITION BY RANGE (log_date);
CREATE INDEX logs_log_timestamp_idx ON logs USING btree (log_timestamp);
CREATE INDEX logs_log_type_idx ON logs USING btree (log_type);
CREATE INDEX logs_log_level_idx ON logs USING btree (log_level);
CREATE INDEX logs_block_height_idx ON logs USING btree (block_height);
-- 2) Create the tsvector index for the message column
-- Query example: SELECT * FROM logs WHERE message @@ to_tsquery('english', 'lag & block');
CREATE INDEX logs_search_vector_idx ON logs USING GIN (to_tsvector('english', message));
-- 3) Define a function to create new partitions. Examples of usage:
-- select fn_create_partition('myschema', 'mytable', current_date, '0 day', '1 day'); today
-- select fn_create_partition('myschema', 'mytable', current_date, '1 day', '2 day'); tomorrow
CREATE OR REPLACE FUNCTION fn_create_partition(_tbl text, _date date, _interval_start text, _interval_end text)
RETURNS void
LANGUAGE plpgsql AS
$func$
DECLARE
_start text;
_end text;
_partition_name text;
BEGIN
_start := TO_CHAR(date_trunc('day', _date + (_interval_start)::interval), 'YYYY-MM-DD');
_end := TO_CHAR(date_trunc('day', _date + (_interval_end)::interval), 'YYYY-MM-DD');
_partition_name := TO_CHAR(date_trunc('day', _date + (_interval_start)::interval), 'YYYYMMDD');
-- Create partition
EXECUTE 'CREATE TABLE IF NOT EXISTS ' || _tbl || '_p' || _partition_name || ' PARTITION OF ' || _tbl || ' FOR VALUES FROM (''' || _start || ''') TO (''' || _end || ''')';
END
$func$;
SELECT fn_create_partition('eduohe_near_access_keys_v1.logs', CURRENT_DATE, '0 day', '1 day');
SELECT fn_create_partition('eduohe_near_access_keys_v1.logs', CURRENT_DATE, '1 day', '2 day');
-- 4) Define a function to delete partitions. Examples of usage:
-- select fn_delete_partition('myschema', 'mytable', current_date, '-15 day', '-14 day'); 2 weeks retention
CREATE OR REPLACE FUNCTION fn_delete_partition(_tbl text, _date date, _interval_start text, _interval_end text)
RETURNS void
LANGUAGE plpgsql AS
$func$
DECLARE
_start text;
_end text;
_partition_name text;
BEGIN
_start := TO_CHAR(date_trunc('day', _date + (_interval_start)::interval), 'YYYY-MM-DD');
_end := TO_CHAR(date_trunc('day', _date + (_interval_end)::interval), 'YYYY-MM-DD');
_partition_name := TO_CHAR(date_trunc('day', _date + (_interval_start)::interval), 'YYYYMMDD');
-- Detach partition
EXECUTE 'ALTER TABLE ' || _tbl || ' DETACH PARTITION ' || _tbl || '_p' || _partition_name;
EXECUTE 'DROP TABLE ' || _tbl || '_p' || _partition_name;
END
$func$;
-- 5) To create new partitions automatically and delete old ones we can use pg_cron
-- GCP Cloud SQL requires these flags:
-- cloudsql.enable_pg_cron = On
-- cron.database_name = indexer_balances_mainnet
-- IMPORTANT: Needs a SUPERUSER (e.g. potsgres or admin user) to create the extension
-- and grant access to the indexer user
CREATE EXTENSION pg_cron;
GRANT CREATE, USAGE ON SCHEMA cron TO eduohe_near;
-- Every day at 1am creates a new partition for the next day
SELECT cron.schedule('eduohe_near_access_keys_v1_logs_delete_partition', '0 1 * * *', $$SELECT fn_create_partition('eduohe_near_access_keys_v1.logs', CURRENT_DATE, '1 day', '2 day')$$);
-- Every day at 2am deletes a partition of 14 days ago
SELECT cron.schedule('0 2 * * *', $$SELECT fn_delete_partition('eduohe_near_access_keys_v1.logs', CURRENT_DATE, '-15 day', '-14 day')$$);
-- to unschedule we can use these cmd: SELECT cron.unschedule('eduohe_near_access_keys_v1_logs_create_partition');
SELECT * FROM cron.job;
-- END
SELECT * FROM logs WHERE message @@ to_tsquery('english', 'lag & block');
@Kevin101Zhang here is the script and how we should see the components for the indexers DB. |
Is this one redundant. Can we close this @pkudinov |
This is a parent ticket. Let's mark it as done when all sub tickets are in production |
@morgsmccauley please close this ticket when everything is in prod |
block_height
Tasks
Blocked
The text was updated successfully, but these errors were encountered: