This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fix last_active_ts Presence bug #15859

Closed

Conversation

realtyem
Contributor

@realtyem realtyem commented Jul 1, 2023

Essentially, what was happening was that during an incremental sync, online is set repeatedly by the system (which is correct behavior, roughly once per timeout). The fix is therefore as simple as comparing the 'before' and 'after' state of online: if they match, don't bump last_active_ts. I spent hours checking permutations and I'm pretty sure I caught them all, but I need more eyes to check me.
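To illustrate the idea, here is a minimal, hypothetical sketch (the helper name is made up; the actual change is to the condition inside set_state(), shown in the diff further down this conversation):

# Illustrative only: not Synapse code.
def should_bump_last_active(prev_presence: str, new_presence: str) -> bool:
    # A genuine transition to ONLINE counts as activity; an incremental sync
    # that merely re-asserts ONLINE should not refresh last_active_ts.
    return new_presence == "online" and prev_presence != "online"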

I did write a basic test to check the behavior (it fails on develop), but I am open to ideas about what else to check for regression.

Consequences:

  • By not updating the timestamp every sync:
    • this avoids setting a staggered set of wheel timers. It appears they deduplicate into the same time bucket (5 seconds), as was probably intended; see the toy sketch after this list. Less processing of unnecessary timers is a small win.
    • this allows the last_active_change_online and last_active_change_not_online metrics to actually update correctly (it will be interesting to see how that plays out)
    • this may cause slightly more presence activity, as users will now be marked UNAVAILABLE (idle) correctly where they weren't before (it will probably be a wash, though).
  • One fewer call into the _update_states() machinery, which does the heavy lifting of sending notifications and persisting changes (and setting more timers).
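A toy illustration of the "same time bucket" point above (this is not Synapse's WheelTimer API, just the general idea):

BUCKET_MS = 5_000  # assumed 5-second bucket, per the note above

# bucket index -> set of user_ids whose timer fires in that bucket
buckets = {}

def schedule(user_id, fire_at_ms):
    # Re-arming a timer that lands in the same 5-second bucket is effectively
    # a no-op: bumping the timer on every sync just re-adds the same entry
    # instead of creating a staggered pile of timers.
    buckets.setdefault(fire_at_ms // BUCKET_MS, set()).add(user_id)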
A quick, hackish (and incomplete) primer on how presence does timeouts (a simplified code sketch follows this list):
  • A local user that is online and currently syncing:
    • if they are currently sending messages, they have LAST_ACTIVE_GRANULARITY (60 seconds) before they are marked not 'currently_active'
    • if they are just sitting there and watching, they will be marked UNAVAILABLE (idle) after IDLE_TIMER (5 minutes) if they don't send a message or create a new 'pro-active' event (this was often not happening correctly; however, typing, sending read receipts, and sending read markers will bump last_active_ts, even though I don't believe they are expressly 'pro-active')
    • when they stop syncing:
      • after UPDATE_SYNCING_USERS_MS (10 seconds) we notify any workers that the user has stopped syncing
      • after SYNC_ONLINE_TIMEOUT (30 seconds) we assume the client has gone OFFLINE and mark them as such (unless they somehow create a 'pro-active' event). This ignores currently_active
  • if the user is remote:
    • if we haven't heard from their server in FEDERATION_TIMEOUT (30 minutes), we mark them OFFLINE
  • if the user is local but in a room with remote users:
    • we send presence every FEDERATION_PING_INTERVAL (25 minutes) so they don't mark us OFFLINE incorrectly.
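A much-simplified sketch of those rules (illustrative only, not the real handle_timeout(); the constants and behaviour are taken from the list above, and the worker notification step is omitted):

from dataclasses import dataclass
from typing import Optional

LAST_ACTIVE_GRANULARITY = 60 * 1000        # 60 seconds
IDLE_TIMER = 5 * 60 * 1000                 # 5 minutes
SYNC_ONLINE_TIMEOUT = 30 * 1000            # 30 seconds
FEDERATION_TIMEOUT = 30 * 60 * 1000        # 30 minutes
FEDERATION_PING_INTERVAL = 25 * 60 * 1000  # 25 minutes (outbound pings, not checked here)


@dataclass
class SketchState:
    presence: str                   # "online", "unavailable", "offline", ...
    currently_active: bool
    last_active_ts: int             # ms
    last_user_sync_ts: int          # ms
    last_federation_update_ts: int  # ms


def sketch_handle_timeout(
    s: SketchState, is_mine: bool, syncing: bool, now: int
) -> Optional[SketchState]:
    """Return an updated state if a timer has elapsed, else None."""
    if not is_mine:
        # Remote user: if their server has gone quiet for too long, mark OFFLINE.
        if now - s.last_federation_update_ts > FEDERATION_TIMEOUT:
            s.presence = "offline"
            return s
        return None

    if s.presence == "online":
        if s.currently_active and now - s.last_active_ts > LAST_ACTIVE_GRANULARITY:
            # No pro-active events for a minute: drop currently_active.
            s.currently_active = False
            return s
        if now - s.last_active_ts > IDLE_TIMER:
            # Still syncing but idle for five minutes: mark UNAVAILABLE (idle).
            s.presence = "unavailable"
            return s
        if not syncing and now - s.last_user_sync_ts > SYNC_ONLINE_TIMEOUT:
            # Stopped syncing half a minute ago: assume the client is gone.
            s.presence = "offline"
            return s
    return None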

Pull Request Checklist

Signed-off-by: Jason Little [email protected]

Comment on lines -1020 to -1031
# To keep the single process behaviour consistent with worker mode, run the
# same logic as `update_external_syncs_row`, even though it looks weird.
if prev_state.state == PresenceState.OFFLINE:
    await self._update_states(
        [
            prev_state.copy_and_replace(
                state=PresenceState.ONLINE,
                last_active_ts=self.clock.time_msec(),
                last_user_sync_ts=self.clock.time_msec(),
            )
        ]
    )
Contributor Author

@realtyem realtyem Jul 1, 2023


This code block performed two functions: updating last_active_ts and last_user_sync_ts. last_active_ts is now handled by set_state(), and last_user_sync_ts will be set when the context manager finishes, so it is unneeded here. (As is actually commented on just below this.)
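A hypothetical sketch of the pattern being described (not the actual Synapse code): the syncing context manager already stamps last_user_sync_ts on the way out, which is why the removed block did not need to set it as well.

import time
from contextlib import asynccontextmanager


@asynccontextmanager
async def user_syncing(user_id, last_user_sync_ts):
    # Hypothetical shape only: the body runs while the user's sync is open.
    try:
        yield
    finally:
        # The "_end" of the context manager records when we last saw a sync
        # from this user; handle_timeout() later compares against this value.
        last_user_sync_ts[user_id] = int(time.time() * 1000)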

@@ -1866,7 +1862,7 @@ def handle_timeout(
"""Checks the presence of the user to see if any of the timers have elapsed

Args:
state
state: the current state of the user, not the state at the setting of the timer
Contributor Author


This was just a drive-by fix.

Comment on lines -1258 to +1256
- if presence == PresenceState.ONLINE or (
-     presence == PresenceState.BUSY and self._busy_presence_enabled
- ):
+ if (
+     prev_state.state != PresenceState.ONLINE
+     and presence == PresenceState.ONLINE
+ ) or (presence == PresenceState.BUSY and self._busy_presence_enabled):
Contributor Author


This is the meat and potatoes of the fix: if we are already online, we don't have to bump last_active_ts.

    self.presence_handler.get_state(UserID.from_string(user_id))
)
# last_active_ts should not have changed as no pro-active event has occurred
self.assertEqual(prev_state.last_active_ts, state.last_active_ts)
Contributor Author


FYI: on develop this fails with

twisted.trial.unittest.FailTest: 2000 != 4000
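For reference, here is a rough sketch of the shape such a regression test can take (the user ID, clock advance, and set_state call shape are assumptions for illustration, not the exact test added here):

def test_last_active_ts_not_bumped_by_repeated_online(self) -> None:
    user_id = "@apple:test"

    # Put the user online and capture their initial presence state.
    self.get_success(
        self.presence_handler.set_state(
            UserID.from_string(user_id), {"presence": PresenceState.ONLINE}
        )
    )
    prev_state = self.get_success(
        self.presence_handler.get_state(UserID.from_string(user_id))
    )

    # Advance the clock, then re-assert ONLINE as an incremental sync would.
    self.reactor.advance(2)
    self.get_success(
        self.presence_handler.set_state(
            UserID.from_string(user_id), {"presence": PresenceState.ONLINE}
        )
    )

    state = self.get_success(
        self.presence_handler.get_state(UserID.from_string(user_id))
    )
    # No pro-active event occurred, so last_active_ts must not have moved.
    self.assertEqual(prev_state.last_active_ts, state.last_active_ts)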

@realtyem realtyem marked this pull request as ready for review July 2, 2023 21:04
@realtyem realtyem requested a review from a team as a code owner July 2, 2023 21:04
Member

@erikjohnston erikjohnston left a comment


Otherwise seems good

# TODO: This will be removed in future work as it's
# literally done again at the '_end' of this context manager and the
# only time last_user_sync_ts matters is when processing timeouts in
# handle_timeout().
Member


Then can't we just remove it now rather than leaving a comment?

Contributor Author


Not yet. And I'm on the fence, based on further experiments, about putting the other code block back in. Let's put this on hold for the moment. I think I have something to show you (out-of-band, though).

@realtyem
Contributor Author

I'm going to close this for now, as there were unforeseen side effects I'm not happy about. I will roll this into another pull request with more in it to abate those side effects.

(Mainly, the side effects were that federation sending would ramp up and stay up for over an hour. I couldn't pin down exactly what was causing this, but fixing 'presence spam' seemed to make it stop, and that requires something broader than this one PR was scoped for.)

@realtyem realtyem closed this Jul 22, 2023
@realtyem
Contributor Author

Replaced with #15989

Development

Successfully merging this pull request may close these issues.

last_active_ago is updated on sync when it shouldn't be