This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Background update user_ips_remove_dupes query on user_ips takes a while #4430

Closed
erikjohnston opened this issue Jan 22, 2019 · 4 comments
Labels
A-Background-Updates (Filling in database columns, making the database eventually up-to-date), z-bug (Deprecated Label)

Comments

erikjohnston (Member) commented Jan 22, 2019

c.f. https://matrix.to/#/!HsxjoYRFsDtWBgDQPh:matrix.org/$15481524910FNbFt:intelfx.name?via=jki.re&via=matrix.org&via=chat.weho.st

It appears that, for some reason, the query is taking ages (>24h, 100% CPU), despite the fact that it should be happily using the index:

SELECT user_id, access_token, ip,
       MAX(device_id), MAX(user_agent), MAX(last_seen)
FROM (
    SELECT user_id, access_token, ip
    FROM user_ips
    WHERE ? <= last_seen AND last_seen < ?
    ORDER BY last_seen
) c
INNER JOIN user_ips USING (user_id, access_token, ip)
GROUP BY user_id, access_token, ip
HAVING count(*) > 1

An EXPLAIN of the query, after the table has been purged, is:

synapse=# EXPLAIN SELECT user_id, access_token, ip, MAX(device_id), MAX(user_agent), MAX(last_seen) FROM ( SELECT user_id, access_token, ip FROM user_ips WHERE 1547946285426 <= last_seen AND last_seen < 3095921913384 ORDER BY last_seen ) c INNER JOIN user_ips USING (user_id, access_token, ip) GROUP BY user_id
synapse-# , access_token, ip HAVING count(*) > 1;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=2.71..2.76 rows=1 width=378)
   Group Key: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.ip
   Filter: (count(*) > 1)
   ->  Hash Join  (cost=2.46..2.64 rows=4 width=367)
         Hash Cond: ((user_ips_1.user_id = user_ips.user_id) AND (user_ips_1.access_token = user_ips.access_token) AND (user_ips_1.ip = user_ips.ip))
         ->  Sort  (cost=1.24..1.26 rows=8 width=314)
               Sort Key: user_ips_1.last_seen
               ->  Seq Scan on user_ips user_ips_1  (cost=0.00..1.12 rows=8 width=314)
                     Filter: (('1547946285426'::bigint <= last_seen) AND (last_seen < '3095921913384'::bigint))
         ->  Hash  (cost=1.08..1.08 rows=8 width=367)
               ->  Seq Scan on user_ips  (cost=0.00..1.08 rows=8 width=367)
(11 rows)

However, the seq scan is probably used simply because there are so few rows left in the table.
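
One way to sanity-check that (not from the original report, just a sketch) is to look at the statistics the planner has for the table; a tiny reltuples estimate makes a sequential scan the expected choice:

ANALYZE user_ips;
-- reltuples/relpages are the planner's row and page estimates for the table
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'user_ips';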

erikjohnston added the z-bug (Deprecated Label) label on Jan 22, 2019
erikjohnston (Member Author) commented:

A couple of potential causes:

  1. This was reported at the same time as the reporter saying that their user_ips table was growing quickly.
  2. Fixup removal of duplicate user_ips rows #4432 removes the use of a large upper bound (in this case 3095921913384), just in case that was confusing the Postgres query planner; the resulting query is sketched below.
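
For reference, without the upper bound the query presumably looks roughly like this (a sketch inferred from the EXPLAIN in the next comment, not taken verbatim from #4432):

SELECT user_id, access_token, ip,
       MAX(device_id), MAX(user_agent), MAX(last_seen)
FROM (
    SELECT user_id, access_token, ip
    FROM user_ips
    WHERE ? <= last_seen
) c
INNER JOIN user_ips USING (user_id, access_token, ip)
GROUP BY user_id, access_token, ip
HAVING count(*) > 1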

richvdh (Member) commented Feb 5, 2019

Another report of this:

synapse=# EXPLAIN VERBOSE SELECT user_id, access_token, ip, MAX(device_id), MAX(user_agent), MAX(last_seen) FROM ( SELECT user_id, access_token, ip FROM user_ips WHERE 1549403093407 <= last_seen ) c INNER JOIN user_ips USING (user_id, access_token, ip) GROUP BY user_id, access_token, ip HAVING count(*) > 1 ;
                                                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=1001.64..39504001.54 rows=247 width=372)
   Output: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.ip, max(user_ips.device_id), max(user_ips.user_agent), max(user_ips.last_seen)
   Group Key: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.ip
   Filter: (count(*) > 1)
   ->  Gather Merge  (cost=1001.64..39503966.39 rows=1480 width=380)
         Output: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.ip, (PARTIAL max(user_ips.device_id)), (PARTIAL max(user_ips.user_agent)), (PARTIAL max(user_ips.last_seen)), (PARTIAL count(*))
         Workers Planned: 2
         ->  Partial GroupAggregate  (cost=1.61..39502795.54 rows=740 width=380)
               Output: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.ip, PARTIAL max(user_ips.device_id), PARTIAL max(user_ips.user_agent), PARTIAL max(user_ips.last_seen), PARTIAL count(*)
               Group Key: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.ip
               ->  Nested Loop  (cost=1.61..7016163.51 rows=1856378550 width=339)
                     Output: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.ip, user_ips.device_id, user_ips.user_agent, user_ips.last_seen
                     ->  Parallel Index Scan using user_ips_user_ip on public.user_ips user_ips_1  (cost=0.81..6847033.34 rows=314 width=300)
                           Output: user_ips_1.user_id, user_ips_1.access_token, user_ips_1.device_id, user_ips_1.ip, user_ips_1.user_agent, user_ips_1.last_seen
                           Filter: ('1549403093407'::bigint <= user_ips_1.last_seen)
                     ->  Index Scan using user_ips_user_ip on public.user_ips  (cost=0.81..535.28 rows=335 width=339)
                           Output: user_ips.user_id, user_ips.access_token, user_ips.device_id, user_ips.ip, user_ips.user_agent, user_ips.last_seen
                           Index Cond: ((user_ips.user_id = user_ips_1.user_id) AND (user_ips.access_token = user_ips_1.access_token) AND (user_ips.ip = user_ips_1.ip))
(18 rows)

At the same time:

 tablename  |        indexname         |  num_rows   | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
------------+--------------------------+-------------+------------+------------+--------+-----------------+-------------+----------------
 user_ips   | user_ips_last_seen_only  |      187276 | 1578 MB    | 122 MB     | N      |             156 |     4600554 |        2761182
 user_ips   | user_ips_user_ip         |      187276 | 1578 MB    | 1684 MB    | N      |              59 |     3608174 |         188349
 user_ips   | user_ips_device_id       |      187276 | 1578 MB    | 296 MB     | N      |               0 |           0 |              0
 user_ips   | user_ips_last_seen       |      187276 | 1578 MB    | 254 MB     | N      |               0 |           0 |              0
 event_json | event_json_event_id_key  | 2.35147e+06 | 1496 MB    | 195 MB     | Y      |           23570 |       23096 |          23096
 event_json | event_json_room_id       | 2.35147e+06 | 1496 MB    | 137 MB     | N      |               0 |           0 |              0
 events     | events_pkey              | 2.34991e+06 | 705 MB     | 78 MB      | Y      |            1750 |        1585 |           1585
 events     | events_room_stream       | 2.34991e+06 | 705 MB     | 215 MB     | N      |              20 |          34 |             34
 events     | event_contains_url_index | 2.34991e+06 | 705 MB     | 496 kB     | N      |               0 |           0 |              0
 events     | events_ts                | 2.34991e+06 | 705 MB     | 106 MB     | N      |               0 |           0 |              0
(10 rows)

(and user_ips was apparently growing by tens of MB every minute)
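
For context, a table like the one above can be produced with something along these lines; this is a sketch against the standard statistics views, not necessarily the exact query that was run:

SELECT s.relname                                      AS tablename,
       s.indexrelname                                 AS indexname,
       c.reltuples                                    AS num_rows,
       pg_size_pretty(pg_relation_size(s.relid))      AS table_size,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
       CASE WHEN i.indisunique THEN 'Y' ELSE 'N' END  AS "unique",
       s.idx_scan                                     AS number_of_scans,
       s.idx_tup_read                                 AS tuples_read,
       s.idx_tup_fetch                                AS tuples_fetched
FROM pg_stat_user_indexes s
JOIN pg_class c ON c.oid = s.relid
JOIN pg_index i ON i.indexrelid = s.indexrelid
ORDER BY pg_relation_size(s.relid) DESC;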

richvdh (Member) commented Feb 5, 2019

user_ips growing rapidly seems to be a common theme. Most odd.

MadLittleMods added the A-Background-Updates label on Jun 3, 2022
erikjohnston (Member Author) commented:

Haven't noticed this in a while, so going to close.

erikjohnston closed this as not planned on Jun 14, 2022