Fix performance on previewing large tables #168

Closed
trengrj opened this issue Jul 19, 2016 · 5 comments


trengrj commented Jul 19, 2016

When a table is viewed, the function TableRowsCount is called to get the size of the table for pagination. Unfortunately it does this by running SELECT COUNT(1) FROM <table>, which is a slow operation in Postgres on large tables and will cause the application to hang on tables in the many-gigabytes range.

I've got a working commit here which uses the estimated table size (when it is available). I removed the "Page X of Y" at top right and replaced it with just "Page X", because sometimes the estimated size is not available (e.g. when the relation is a view).

trengrj@b8cc628
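
For reference, the estimate can be read straight from Postgres's own statistics. A minimal sketch (the package, helper name, and signature here are illustrative, not what the commit above uses):

```go
package pgstats

import "database/sql"

// estimatedRowCount returns the planner's row-count estimate for a table,
// read from pg_class.reltuples. The value is maintained by VACUUM, ANALYZE
// and autovacuum, so it can lag behind the true count, and it is not
// meaningful for views (which have no storage).
func estimatedRowCount(db *sql.DB, schema, table string) (int64, error) {
	var estimate int64
	err := db.QueryRow(`
		SELECT reltuples::bigint
		FROM pg_class c
		JOIN pg_namespace n ON n.oid = c.relnamespace
		WHERE n.nspname = $1 AND c.relname = $2`,
		schema, table).Scan(&estimate)
	return estimate, err
}
```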

I think it is important to be able to preview any table quickly, but I agree the pagination is useful. What are your views on fixing this issue? Maybe some sort of hybrid approach?


sosedoff commented Sep 3, 2016

Thanks for bringing this up, and sorry for the late response. Indeed, COUNT(*) is very slow on large data sets, so a hybrid approach would (at least partially) solve the issue. Ideally we would first check the estimated row count, and if it's below 100k rows (just an example) run the real COUNT, otherwise use the estimate for pagination. I would also prefer keeping the pagination where possible instead of removing it altogether; I know this could be a tricky feature, but still.
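
A rough sketch of that hybrid check, building on the estimatedRowCount helper above (the threshold and all names are illustrative):

```go
package pgstats

import (
	"database/sql"

	"github.com/lib/pq"
)

// countThreshold is the estimate below which an exact COUNT is still cheap.
const countThreshold = 100000

// tableRowsCount uses the cheap planner estimate to decide whether an exact
// COUNT is affordable. Small tables (and anything without a usable estimate,
// such as a view or a never-analyzed table) get an exact count; large tables
// get the estimate, flagged as inexact so the UI can render "Page X"
// instead of "Page X of Y".
func tableRowsCount(db *sql.DB, schema, table string) (count int64, exact bool, err error) {
	estimate, err := estimatedRowCount(db, schema, table)
	if err != nil {
		return 0, false, err
	}
	if estimate > countThreshold {
		return estimate, false, nil
	}
	// Identifiers cannot be bound as query parameters, so quote them;
	// pq.QuoteIdentifier guards against odd table names.
	q := "SELECT COUNT(1) FROM " + pq.QuoteIdentifier(schema) + "." + pq.QuoteIdentifier(table)
	err = db.QueryRow(q).Scan(&count)
	return count, true, err
}
```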

felixbuenemann commented

It's also worth noting that OFFSET pagination gets slower the further you paginate, because Postgres has to fetch all rows preceding the offset and throw them away. A faster way is keyset pagination: something like WHERE id > (last row id from previous page) LIMIT n. Not sure if that's practical to implement here, because it requires knowledge of the underlying table.
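
In query form, the idea looks roughly like this (a sketch; the items table and its indexed, unique id column are assumptions):

```go
package pgstats

import "database/sql"

// keysetPage fetches the next page by resuming after the last key seen,
// so Postgres can seek directly into the index instead of scanning and
// discarding all the rows an OFFSET would skip over.
func keysetPage(db *sql.DB, lastID int64, pageSize int) (*sql.Rows, error) {
	return db.Query(`
		SELECT id, name
		FROM items
		WHERE id > $1
		ORDER BY id
		LIMIT $2`, lastID, pageSize)
}
```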

Also see "We need tool support for keyset pagination" for more info.

sosedoff commented

Does anyone have any interest in tackling the problem?

sosedoff commented

Coming back to this. I'm thinking of having a map (per established connection) that will hold information about tables that are somewhat large (> 100k rows). For table browse requests we would first check whether that map contains the table, and if it does, fetch the row stats using count estimates. On database switch the map would be cleared out.
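
Something along these lines, as a sketch (all names are illustrative):

```go
package pgstats

import "sync"

// largeTables tracks, per established connection, which tables are known
// to be large, so browse requests can skip the exact COUNT for them and
// use the planner estimate instead.
type largeTables struct {
	mu     sync.Mutex
	tables map[string]bool // "schema.table" -> known to exceed the threshold
}

func newLargeTables() *largeTables {
	return &largeTables{tables: make(map[string]bool)}
}

func (l *largeTables) isLarge(name string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.tables[name]
}

func (l *largeTables) mark(name string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.tables[name] = true
}

// reset drops everything, e.g. when the user switches databases.
func (l *largeTables) reset() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.tables = make(map[string]bool)
}
```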

Any thoughts?

felixbuenemann commented

The DZone article "Faster PostgreSQL Counting" has some useful tips on getting good estimate counts on large tables.

Instead of keeping a map of large tables in memory, you could just look at the table stats in Postgres on the fly; I don't think caching that info is worth the trouble.
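
For example, the stats collector's live-tuple count is available per request with a single cheap catalog lookup (a sketch; like reltuples, n_live_tup is only an approximation):

```go
package pgstats

import "database/sql"

// liveTupleCount reads the stats collector's running estimate of live rows.
// The lookup is a cheap catalog query, so it can run on every browse
// request, avoiding the need to cache table sizes in application memory.
func liveTupleCount(db *sql.DB, schema, table string) (int64, error) {
	var n int64
	err := db.QueryRow(`
		SELECT n_live_tup
		FROM pg_stat_user_tables
		WHERE schemaname = $1 AND relname = $2`,
		schema, table).Scan(&n)
	return n, err
}
```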
