perf(core-database-postgres): faster count method #2832

faustbrian · 2019-07-18T09:26:07Z

Use a faster query that performs better against larger datasets like on mainnet while the benefit on devnet is still visible but a lot smaller as the database is 1/3 the size. This is obviously not a permanent solution but a nice little improvement for now.

EXPLAIN ANALYZE SELECT COUNT(id) FROM blocks;

 Finalize Aggregate  (cost=298772.77..298772.78 rows=1 width=8) (actual time=2079.334..2079.334 rows=1 loops=1)
   ->  Gather  (cost=298772.55..298772.76 rows=2 width=8) (actual time=2079.232..2087.596 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=297772.55..297772.56 rows=1 width=8) (actual time=2074.550..2074.550 rows=1 loops=3)
               ->  Parallel Index Only Scan using blocks_pkey on blocks  (cost=0.56..288371.32 rows=3760492 width=24) (actual time=0.068..1678.611 rows=3009111 loops=3)
                     Heap Fetches: 311
 Planning time: 0.149 ms
 Execution time: 2087.720 ms
(9 rows)

EXPLAIN ANALYZE SELECT COUNT(*) FROM (SELECT id FROM blocks) AS count;

 Finalize Aggregate  (cost=192153.71..192153.72 rows=1 width=8) (actual time=1095.040..1095.041 rows=1 loops=1)
   ->  Gather  (cost=192153.49..192153.70 rows=2 width=8) (actual time=1094.962..1101.561 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=191153.49..191153.50 rows=1 width=8) (actual time=1086.913..1086.914 rows=1 loops=3)
               ->  Parallel Index Only Scan using blocks_timestamp_key on blocks  (cost=0.43..181752.26 rows=3760492 width=0) (actual time=0.085..719.661 rows=3009114 loops=3)
                     Heap Fetches: 366
 Planning time: 0.135 ms
 Execution time: 1101.631 ms
(9 rows)

Tested on a 4 CPU / 16 GB RAM / 160GB NVME server so results will vary between servers.

What kind of change does this PR introduce?

Does this PR introduce a breaking change?

Yes
No

Does this PR release a new version?

Yes
No

If yes, please describe the impact and migration path for existing applications:

The PR fulfills these requirements:

It's submitted to the develop branch, not the master branch
All tests are passing
New/updated tests are included

If adding a new feature, the PR's description includes:

A convincing reason for adding this feature (to avoid wasting your time, it's best to open a suggestion issue first and wait for approval before working on it)

vasild · 2019-07-18T12:54:58Z

I am afraid that there is no silver bullet for the problem with the slow COUNT(*).

SELECT COUNT(*) FROM (SELECT * FROM table); is not faster than SELECT COUNT(*) FROM table;. The former is either slower, or with a smart enough optimizer, equivalent to the later.

vasild · 2019-07-18T13:09:17Z

packages/core-database-postgres/src/repositories/repository.ts

@@ -103,8 +103,8 @@ export abstract class Repository implements Database.IRepository {
            return { rows, count: Math.max(count, rows.length), countIsEstimate: true };
        }

-        const countRow = await this.find(selectQueryCount);
+        const { count } = await this.db.one(`SELECT COUNT(*) FROM (${selectQueryCount.toString()}) AS count;`);


If you are going to do SELECT COUNT(*) FROM (... subquery ...) then you do not need a separate selectQueryCount variable and can use selectQuery for this. But I specifically did not do that when I introduced selectQueryCount because the straight-forward SELECT COUNT(*) FROM table WHERE ... is to be preferred over the convoluted SELECT COUNT(*) FROM (SELECT ... FROM table WHERE ...) AS count; because with the latter you are asking the database to do more stuff and depend on it being smart enough to detect that both queries are the same and actually do the first one under the hood. In other words, the variant with the subquery could also be slower!

codecov-io · 2019-07-23T00:38:04Z

Codecov Report

Merging #2832 into master will not change coverage.
The diff coverage is 0%.

@@           Coverage Diff           @@
##           master    #2832   +/-   ##
=======================================
  Coverage   65.33%   65.33%           
=======================================
  Files         363      363           
  Lines        8193     8193           
  Branches      393      393           
=======================================
  Hits         5353     5353           
  Misses       2806     2806           
  Partials       34       34

Impacted Files	Coverage Δ
...database-postgres/src/repositories/transactions.ts	`27.08% <0%> (ø)`	⬆️
...e-database-postgres/src/repositories/repository.ts	`16.12% <0%> (ø)`	⬆️
.../core-database-postgres/src/repositories/blocks.ts	`35.29% <0%> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1f5c1d4...19734cb. Read the comment docs.

spkjp · 2019-07-26T02:18:11Z

Given that the impact varies greatly between nodes, I'll close this PR.

perf(core-database-postgres): faster count method

c789a82

ghost added Complexity: Low labels Jul 18, 2019

vasild reviewed Jul 18, 2019

View reviewed changes

ghost mentioned this pull request Jul 21, 2019

[Weekly Digest] Jul 15, 2019 - Jul 21, 2019 #2833

Closed

Merge branch 'master' into nfsmw

19734cb

Merge branch 'master' into nfsmw

7cf7a81

faustbrian marked this pull request as ready for review July 25, 2019 01:46

faustbrian requested review from air1one, kristjank and spkjp as code owners July 25, 2019 01:46

Merge branch 'master' into nfsmw

20e39d6

spkjp closed this Jul 26, 2019

ghost removed the Status: In Progress label Jul 26, 2019

spkjp deleted the nfsmw branch July 26, 2019 02:23

ghost mentioned this pull request Jul 28, 2019

[Weekly Digest] Jul 22, 2019 - Jul 28, 2019 #2846

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(core-database-postgres): faster count method #2832

perf(core-database-postgres): faster count method #2832

faustbrian commented Jul 18, 2019 •

edited

Loading

vasild commented Jul 18, 2019

vasild Jul 18, 2019

codecov-io commented Jul 23, 2019

spkjp commented Jul 26, 2019

perf(core-database-postgres): faster count method #2832

perf(core-database-postgres): faster count method #2832

Conversation

faustbrian commented Jul 18, 2019 • edited Loading

What kind of change does this PR introduce?

Does this PR introduce a breaking change?

Does this PR release a new version?

vasild commented Jul 18, 2019

vasild Jul 18, 2019

Choose a reason for hiding this comment

codecov-io commented Jul 23, 2019

Codecov Report

spkjp commented Jul 26, 2019

faustbrian commented Jul 18, 2019 •

edited

Loading