Change device_inbox stream index to include user #1793

erikjohnston · 2017-01-10T15:06:10Z

This makes fetching the most recently changed users much quicker, and
brings it in line with e.g. presence_stream indices.

This makes fetching the nost recently changed users much tricker, and brings it in line with e.g. presence_stream indices.

NegativeMjark · 2017-01-10T15:27:38Z

Does this index actually help? What does the explain look like for the queries before/after?

Is this trying to fix #1768 ?

erikjohnston · 2017-01-10T15:32:46Z

It does help.

New query:

matrix=# explain analyze SELECT user_id, MAX(stream_id) FROM device_inbox WHERE stream_id > 291509 - 100000 GROUP BY user_id;
                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=118140.66..118147.50 rows=684 width=27) (actual time=620.251..621.149 rows=5270 loops=1)
   Group Key: user_id
   ->  Index Only Scan using device_cache_test_2 on device_inbox  (cost=0.56..106708.76 rows=2286381 width=27) (actual time=0.050..230.960 rows=2311840 loops=1)
         Index Cond: (stream_id > 191509)
         Heap Fetches: 24067
 Planning time: 0.464 ms
 Execution time: 621.318 ms

Old query:

matrix=# explain analyze SELECT user_id, MAX(stream_id) FROM device_inbox WHERE stream_id > 291509 - 100000 GROUP BY user_id;
                                                                          QUERY PLAN                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=292078.26..292085.10 rows=684 width=27) (actual time=1251.428..1252.534 rows=5270 loops=1)
   Group Key: user_id
   ->  Index Scan using device_inbox_stream_id on device_inbox  (cost=0.43..280646.35 rows=2286381 width=27) (actual time=0.032..760.074 rows=2311840 loops=1)
         Index Cond: (stream_id > 191509)
 Planning time: 0.085 ms
 Execution time: 1252.684 ms

NegativeMjark · 2017-01-10T15:42:20Z

Hmm the only change I see is that it's using an index only scan now, which looks like it's removed a constant factor from the query.

NegativeMjark · 2017-01-10T15:44:39Z

I don't think it's worth it. I suspect the change proposed in #1768 would be a better fix to the query performance.

erikjohnston · 2017-01-10T15:46:07Z

Hmm the only change I see is that it's using an index only scan now, which looks like it's removed a constant factor from the query.

Well, its impossible to extrapolate that from two data points, but even so a constant factor of two isn't necessarily something to be sniffed at. Both of those two readings are going to have come from the disk cache, and I've certainly seen that query take a lot longer than a couple of seconds when its not in the disk cache.

erikjohnston · 2017-01-10T15:49:12Z

I don't think it's worth it. I suspect the change proposed in #1768 would be a better fix to the query performance.

Worth what? The work is done. I'm also concerned how this is going to scale as we roll out e2e.

NegativeMjark · 2017-01-10T15:50:29Z

Yes, but the fix in #1768 will probably knock off a factor ~20 or so from the query and will make it scale better with more e2e.

erikjohnston · 2017-01-10T15:54:47Z

Oh, sorry, I misread the PR/issue number and thought you were referring to the limit.

I find the query in there is a bit dubious, given that it silently relies on there being a unique stream_id per user_id, which is not true for quite a few streams.

erikjohnston · 2017-01-10T15:57:19Z

Yes, but the fix in #1768 will probably knock off a factor ~20 or so from the query and will make it scale better with e2e.

That appears to be because it loads only 100000 rows, rather than e.g. 2313402 rows. (Due to the fact that there are multiple rows per stream id).

Is there a reason from the EXPLAIN you think that it would be quicker to do a backwards scan than an index only scan?

NegativeMjark · 2017-01-10T16:00:48Z

Yes because it cheats and only loads 100000 rows rather than 2313402, which i'd have thought is entirely reasonable for something which is just trying to prefill a cache to smooth the startup process.

I'm also concerned that adding the user_id to the stream index will makes some of the other queries less efficient, and will result in synapse chewing more diskspace for little benefit.

NegativeMjark · 2017-01-10T16:01:51Z

(obviously you'd need to fiddle the result slightly to get the correct minimum stream_id if you were using the query from #1768)

erikjohnston · 2017-01-10T16:06:31Z

Yes because it cheats and only loads 100000 rows rather than 2313402, which i'd have thought is entirely reasonable for something which is just trying to prefill a cache to smooth the startup process.

Right, but we can achieve the exact same affect by reducing the limit. Which we have. The index only scan has the benefit it doesn't need to pull the rows out of the DB, thus reducing overall IO. So actually I expect adding the user_id to the index is faster for the same number of cache entries.

I'm also concerned that adding the user_id to the stream index will makes some of the other queries less efficient, and will result in synapse chewing more diskspace for little benefit.

Other than reducing the number of index rows in a page, it won't make a difference since the user_id is at the end. I have no reason to believe that it makes much of a difference and other─much busier streams─use that style of index.

(obviously you'd need to fiddle the result slightly to get the correct minimum stream_id if you were using the query from #1768)

Keeping the stream tables consistent seems nicer.

NegativeMjark · 2017-01-10T19:35:36Z

Dropping the limit in #1792 doesn't change the fact that the number of rows it needs to scan is proportional to "average number of devices in room" * limit. The change proposed in #1768 puts a hard limit on the number of rows scanned which makes the performance of the query a lot more predictable.

Personally I'd rather have predictable performance when running the startup query rather than dropping a constant factor off of it.

NegativeMjark · 2017-01-12T14:43:37Z

synapse/storage/schema/delta/40/device_inbox.sql

+ */
+
+INSERT into background_updates (update_name, progress_json)
+    VALUES ('device_inbox_stream_index', '{}');


Add a comment to explain that this is to turn the pre-fill startup query into a index-only scan on postgresql.

Change device_inbox stream index to include user

caddadf

This makes fetching the nost recently changed users much tricker, and brings it in line with e.g. presence_stream indices.

erikjohnston assigned NegativeMjark Jan 10, 2017

erikjohnston added 2 commits January 10, 2017 15:11

Don't disable autocommit

5a32e92

Explicitly close the cursor

ab655dc

NegativeMjark reviewed Jan 12, 2017

View reviewed changes

Add comment

c050f49

erikjohnston merged commit f0325a9 into develop Jan 13, 2017

erikjohnston deleted the erikj/change_device_inbox_index branch March 29, 2017 10:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change device_inbox stream index to include user #1793

Change device_inbox stream index to include user #1793

erikjohnston commented Jan 10, 2017 •

edited

Loading

NegativeMjark commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017 •

edited

Loading

erikjohnston commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

NegativeMjark Jan 12, 2017

Change device_inbox stream index to include user #1793

Change device_inbox stream index to include user #1793

Conversation

erikjohnston commented Jan 10, 2017 • edited Loading

NegativeMjark commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017 • edited Loading

erikjohnston commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

erikjohnston commented Jan 10, 2017

NegativeMjark commented Jan 10, 2017

NegativeMjark Jan 12, 2017

Choose a reason for hiding this comment

erikjohnston commented Jan 10, 2017 •

edited

Loading

NegativeMjark commented Jan 10, 2017 •

edited

Loading