Database performance tweaks #226

frol · 2022-01-17T04:38:59Z

I want to collect the knowledge we gain over running Indexer for Explorer in production here.

frol · 2022-01-17T04:53:22Z

We hit hight IO wait on our database instance that we only use to WRITE indexed data, so one of the entry-points was this post: https://serverfault.com/a/284824

There I learned about https://wiki.postgresql.org/wiki/Performance_Optimization and https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Here are the discussed settings:

max_connections - we don't need to tweak that since we maintain 15 connections per Indexer for Explorer instance and the upper bound is currently 500 connections

shared_buffers - it seems that GCP has auto-tweak for this as it is already set to 5GB

effective_cache_size - it seems that GCP has auto-tweak for this as it is already set to 6GB

checkpoint_segments - deprecated, so we should look into min_wal_size and max_wal_size tweaking instead

checkpoint_completion_target - already set to 0.9

default_statistics_target - we currently have the default 100 and it is not clear if we need to tweak it at this point

work_mem - we already set it to 64MB and I recently bumped it to 128MB without any noticeable outcomes

maintenance_work_mem - we don't seem to need to tweak it now

wal_sync_method, wal_buffers - I have not made any decision about those yet

constraint_exclusion, max_prepared_transactions - already has good defaults

synchronous_commit - this was on, and I am going to try to switch it off

random_page_cost - the default is 4 and it is suggested to lower it to 2, so I will try it

UPD: I have not seen any improvement from synchronous_commit and random_page_cost tweaks, so we can revert them next time we need to upgrade anything.

frol · 2022-01-17T05:14:56Z

I bumped the hardware of the main instance (and all the replicas) to 8 vCPU and 52GB RAM

UPD: That helped a bit, but not enough to resolve the lag we have

frol · 2022-01-17T06:03:47Z

I have disabled storing NFT and account_changes while keeping transaction and action-receipt actions on the beta nodes (started from block 57526420), and that helped a bit (it feels that it can keep up with the network, but still won't be able to catch up)

frol · 2022-01-17T06:20:45Z

UPD: the SELECT query should be completely removed; I am addressing it now.

It revealed unnecessary code in indexer-for-explorer, so there is no need to optimize it

Currently I see the following stats:

The top dominating queries are:

SELECT
  "execution_outcome_receipts"."produced_receipt_id",
  "receipts"."originated_from_transaction_hash"
FROM ("execution_outcome_receipts"
  INNER JOIN
    "receipts"
  ON
    "execution_outcome_receipts"."executed_receipt_id" = "receipts"."receipt_id")
WHERE
  "execution_outcome_receipts"."produced_receipt_id" = ANY($1)

INSERT INTO
  "execution_outcomes" ("receipt_id",
    "executed_in_block_hash",
    "executed_in_block_timestamp",
    "index_in_chunk",
    "gas_burnt",
    "tokens_burnt",
    "executor_account_id",
    "status",
    "shard_id")
VALUES
  ($1, $2, $3, $4, $5, $6, $7, $8, $9),
  ($10, $11, $12, $13, $14, $15, $16, $17, $18),
  ($19, $20, $21, $22, $23, $24, $25, $26, $27),
  ($28, $29, $30, $31, $32, $33, $34, $35, $36),
  ($37, $38, $39, $40, $41, $42, $43, $44, $45),
  ($46, $47, $48, $49, $50, $51, $52, $53, $54),
  ($55, $56, $57, $58, $59, $60, $61, $62, $63),
  ($64, $65, $66, $67, $68, $69, $70, $71, $72),
  ($73, $74, $75, $76, $77, $78, $79, $80, $81),
  ($82, $83, $84, $85, $86, $87, $88, $89, $90),
  ($91, $92, $93, $94, $95, $96, $97, $98, $99),
  ($100, $101, $102, $103, $104, $105, $106, $107, $108),
  ($109, $110, $111, $112, $113, $114, $115, $116, $117),
  ($118, $119, $120, $121, $122, $123, $124, $125, $126),
  ($127, $128, $129, $130, $131, $132, $133, $134, $135),
  ($136, $137, $138, $139, $140, $141, $142, $143, $144),
  ($145, $146, $147, $148, $149, ...

@telezhnaya ~~Could you dive into the performance of those? I think it is easier to start with the SELECT query.~~

frol · 2022-01-17T06:42:42Z

@telezhnaya I have seen various posts regarding IN vs ANY(ARRAY[]) vs ANY(VALUES()), e.g. https://www.datadoghq.com/blog/100x-faster-postgres-performance-by-changing-1-line/. Yet, it seems that there is no "correct" answer to the problem, and we should consider if we can do any better

We don't need to deal with that query at all, as it should be removed.

luixo mentioned this issue Jan 17, 2022

Explorer indexer far behind current block near/near-explorer#855

Closed

exalate-issue-sync bot assigned khorolets Sep 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Database performance tweaks #226

Database performance tweaks #226

frol commented Jan 17, 2022

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading

Database performance tweaks #226

Database performance tweaks #226

Comments

frol commented Jan 17, 2022

frol commented Jan 17, 2022 • edited Loading

frol commented Jan 17, 2022 • edited Loading

frol commented Jan 17, 2022 • edited Loading

frol commented Jan 17, 2022 • edited Loading

frol commented Jan 17, 2022 • edited Loading

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading

frol commented Jan 17, 2022 •

edited

Loading