-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Old events can't be loaded after joining a room; reconnecting fixes #148
Comments
I just tested it, and it works for me. The error message you're getting from the server:
suggests that it's specific to a certain room. So does it happen on every room you're in? |
It doesn't happen all the time but also not specific to a room, it happens to all rooms sometimes. |
I just tried it on four rooms that I'm in, and they all work fine. Can you reproduce the problem if you try it in public, unencrypted rooms without connecting through Pantalaimon? What homeserver do you use? Does it happen on a matrix.org account? |
Ill test everything tomorrow, for the stuff i can now, I use synapse, almost latest but it has been happening for a long time. |
So you're running your own homeserver? If so, I'd suggest checking some details of the rooms that are exhibiting this problem. For example, are they an old room version? Do they have any abnormalities, any corrupt data in their state? If you make a new room, does the problem still happen (after adding enough new events to the timeline to allow scrolling back, of course)? What if you join a room hosted on a different homeserver, one that is decently sized and busy, and try to scroll back to old events (e.g. AFAIK Ement correctly handles the from/to tokens returned by the server with event chunks. This error would suggest that the server is missing history for that room, or something like that. But I haven't run a homeserver before, and I don't know much about the server-to-server protocol. |
from
parameter is invalid)
From talking with another user just now, it seems that the issue is that, after joining a room, the |
from
parameter is invalid)
I just hit this testing against the staging instance of our local Matrix server, which is running version 1.85.2. {"errcode":"M_UNKNOWN", "error":"'from' parameter is invalid"} The logged request from ement was passing
My test cases are very limited (there were only a couple of rooms on this server, both with short histories), but my impression is this only happens when I've already reached the beginning of the room history and there are no more messages to be fetched, as I was able to fetch what history existed before encountering this error. Thus far it seems to be happening consistently for me on this server, including when I tried creating a new room and then using page-up to scroll back in that. I've not yet tried disconnecting and reconnecting; I figure if it's in a testable state now, I shouldn't mess with it. I can't confirm that the server version is relevant, but I don't recall ever seeing this happen with our production instance (which is currently running a much older v1.47.1 of the server software). In the rooms I've tested there in the same state (all history already fetched) just now, I just get these messages with no error:
As opposed to:
|
@phil-s Ok, I guess we need to ensure that the Lines 1368 to 1397 in 909abd4
|
Aha, https://spec.matrix.org/latest/client-server-api/#get_matrixclientv3roomsroomidmessages says
|
To me that sounds like ement.el could (in the absence of |
Oh; no I'm missing that "paginating from the first visible event" in this case is moving backwards in history to earlier events. So making the request without |
I've tested this simple change, and I believe it's working as desired in my case. The first change avoids errors in notification/mention buffers. modified ement-room.el
@@ -1376,16 +1376,17 @@ see."
(list (read-number "Number of messages: "))
(number current-prefix-arg))
:buffer (current-buffer)))
- (unless ement-room-retro-loading
+ (unless (or ement-room-retro-loading (not room))
(pcase-let* (((cl-struct ement-room id prev-batch) room)
(endpoint (format "rooms/%s/messages" (url-hexify-string id))))
;; We use a timeout of 30, because sometimes the server can take a while to
;; respond, especially if loading, e.g. hundreds or thousands of events.
(ement-api session endpoint :timeout 30
- :params (list (list "from" prev-batch)
- (list "dir" "b")
- (list "limit" (number-to-string number))
- (list "filter" (json-encode ement-room-messages-filter)))
+ :params (nconc (and prev-batch
+ (list (list "from" prev-batch)))
+ (list (list "dir" "b")
+ (list "limit" (number-to-string number))
+ (list "filter" (json-encode ement-room-messages-filter))))
:then then
:else (lambda (plz-error)
(when buffer |
Ah, there's still an issue here. Once at the beginning of the room history, repeated calls to Clearly the room event data is also stored somewhere outside of the room buffer, as killing the buffer and re-visiting the room gets me all the same duplicates. (Is there way to purge the data for the current buffer's room?) |
Purging events shouldn't be necessary. I think we don't fully understand the problem here and the intended solution. |
Perhaps, but it would seem useful for debugging. I'd certainly like to be able to do this and then view the room again as if it was the first time I'd done it this session.
(setf (ement-session-rooms session)
(cl-remove room (ement-session-rooms session))) |
I'm afraid it's not quite that simple. If you removed the room struct from the session's rooms slot, you'd also need to replay the room's creation and metadata events in order for it to be properly recreated. You can "view the room again as if it was the first time I'd done it this session" by simply killing the room's buffer and using Maybe you could try something like this:
Although that may cause errors unless you avoid processing the events normally, which would try to add them to the appropriate structs in the session. Basically, I guess you just want to issue API calls, which you could do by just calling But again, I don't think we (or at least, I) fully understand the problem here. Looking again at my comment here: #148 (comment), I'm guessing that the problem is that we aren't recording the token correctly when we receive the event indicating that the user joined the room, but that the same token is correctly recorded when received in an initial sync request. Another possibility is that this is related to rooms' history visibility settings, that we aren't recognizing when a room doesn't allow the user to view history from before the user joined. Anyway, it would be best if we could define a reproducible test case (e.g. involving creating a test room, maybe on a second account, and then joining on the main account and reproducing the problem). That would probably make it easy to understand and solve. |
That wasn't my experience when I was seeing duplicate events. The Matrix server didn't have duplicates of the events, and the first time I visited the room there were no duplicates. But Emacs had ended up with dups, and killing the buffer and re-visiting the room from the room list created a new room buffer with all the duplicates, rather than recreating what I'd seen when initially visiting the room. (After disconnecting and reconnecting, things went back to normal.) |
Ok, I must have misunderstood what you meant. We really do need a set of steps to reproduce the problem. :) |
I'm planning to spend some time on ement on Friday, so I'll take another look at this then and hopefully other things as well. I don't know how much I'll get done, but I'm mentioning it in case you have some hot tips on debugging. One thing is that some absolutely huge data structures are thrown around, so I'm curious as to whether you use any particular techniques for managing these. Another thing I was wondering about was general Matrix event logging. I guess Any tips you may have on these points or anything else are welcomed :) |
This can indeed be a problem. Incidentally, it used to be much worse, before I changed the data structure to not have circular references (rooms pointed to their events which pointed back to their rooms). I had to unfreeze or Having made that change, with recent Emacs versions I don't often have this problem anymore. But if it does happen, it can be mitigated by setting Another tip might be to use the Other than that, I sometimes use
You could certainly use those to do that, but I haven't needed to yet.
You might take a look at the If you have any other questions, please feel free to ask. I appreciate your help. |
That all sounds really helpful, thank you! I'm not familiar with |
I just encountered this problem myself. As mentioned, this change does indeed fix the error: diff --git a/ement-room.el b/ement-room.el
index a7eb288..a8444f7 100644
--- a/ement-room.el
+++ b/ement-room.el
@@ -1382,10 +1382,12 @@ (cl-defun ement-room-retro
;; We use a timeout of 30, because sometimes the server can take a while to
;; respond, especially if loading, e.g. hundreds or thousands of events.
(ement-api session endpoint :timeout 30
- :params (list (list "from" prev-batch)
- (list "dir" "b")
- (list "limit" (number-to-string number))
- (list "filter" (json-encode ement-room-messages-filter)))
+ :params (remq nil
+ (list (when prev-batch
+ (list "from" prev-batch))
+ (list "dir" "b")
+ (list "limit" (number-to-string number))
+ (list "filter" (json-encode ement-room-messages-filter))))
:then then
:else (lambda (plz-error)
(when buffer But as Phil said, some of the retrieved messages end up being duplicates. Shouldn't be too hard to dedupe them... |
For future reference, and FWIW, I went down the rabbit hole and arrived at this: https://github.com/matrix-org/matrix-js-sdk/blob/687d08dc9d9c618c628ef59031fee2f8f81554cd/src/models/event-timeline-set.ts#L418 That shows how the Matrix JS SDK handles room timelines and retrieving old events and adding the ones it receives to the timelines it maintains for a room. I was hoping there was some trick that I just didn't know about, since I can't find any special behavior for getting the first |
Since we now fetch old events without a "from" token immediately after joining (i.e. without a full initial sync), we must deduplicate the received events. See #148.
I think this branch will fix the problem: https://github.com/alphapapa/ement.el/compare/wip/148-retro-after-join If anyone would like to test it, please do. |
I've applied this patch and done some initial testing. I'm not seeing the "'from' parameter is invalid" error any more. It's not quite working right yet, though. I now get the following error, and the room buffer is not updated.
However the retro messages do in fact make it into the session -- if I kill the buffer and revisit the room, I see them. The error only happens the first time I try. Subsequent use of the command has no obvious effect. I've not tried to debug this, but I suspect the vector argument to
|
Hi Phil, Apologies, then; I wrote and reviewed the code carefully, but apparently not carefully enough. (Maybe it was too late at night to be coding, haha.) I'll let you know when I have another update. |
finally do
(setf (ement-room-timeline room) (append (ement-room-timeline room)
(seq-remove #'null chunk)))) Looks like up there we're putting a (when buffer
;; Insert events into the room's buffer.
(with-current-buffer buffer
(save-window-excursion
;; NOTE: See note in `ement--update-room-buffers'.
(when-let ((buffer-window (get-buffer-window buffer)))
(select-window buffer-window))
;; FIXME: Use retro-loading in event handlers, or in --handle-events, anyway.
(ement-room--process-events chunk)
I'm familiar with that phenomenon :/ |
I'll test with this: modified ement-room.el
@@ -2520,7 +2520,8 @@ ement-room-retro-callback
;; likely very few compared to the number of timeline events, which is
;; what the user is interested in (e.g. when loading 1000 earlier
;; messages in #emacs:matrix.org, only 31 state events were received).
- (progress-max-value (* 3 num-events)))
+ (progress-max-value (* 3 num-events))
+ (chunk-unseen nil))
;; NOTE: Put the newly retrieved events at the end of the slots, because they should be
;; older events. But reverse them first, because we're using "dir=b", which the
;; spec says causes the events to be returned in reverse-chronological order, and we
@@ -2554,8 +2555,9 @@ ement-room-retro-callback
(ement--put-event event nil session))
(ement-progress-update)
finally do
- (setf (ement-room-timeline room) (append (ement-room-timeline room)
- (seq-remove #'null chunk))))
+ (setf chunk-unseen (seq-remove #'null chunk)
+ (ement-room-timeline room) (append (ement-room-timeline room)
+ chunk-unseen)))
(when buffer
;; Insert events into the room's buffer.
(with-current-buffer buffer
@@ -2564,7 +2566,7 @@ ement-room-retro-callback
(when-let ((buffer-window (get-buffer-window buffer)))
(select-window buffer-window))
;; FIXME: Use retro-loading in event handlers, or in --handle-events, anyway.
- (ement-room--process-events chunk)
+ (ement-room--process-events chunk-unseen)
(when set-prev-batch
;; This feels a little hacky, but maybe not too bad.
(setf (ement-room-prev-batch room) end)) |
Thanks, Phil, good eye. |
Since we now fetch old events without a "from" token immediately after joining (i.e. without a full initial sync), we must deduplicate the received events. See alphapapa#148.
Since we now fetch old events without a "from" token immediately after joining (i.e. without a full initial sync), we must deduplicate the received events. See alphapapa#148.
I've been running with these changes (including the diff above) for a while now, and I've just done some more purposeful testing, and I believe this is behaving correctly. I haven't seen an error when trying to fetch more history, and when I try to do so when already at the beginning of the history I don't get any duplicates. I do get some inconsistent messages which I don't understand, and it would probably be nice if Ement could say something like "No prior history available", but my impression is that this could be merged as-is to improve things. This example was from repeatedly calling
|
Since we now fetch old events without a "from" token immediately after joining (i.e. without a full initial sync), we must deduplicate the received events. See alphapapa#148.
Since we now fetch old events without a "from" token immediately after joining (i.e. without a full initial sync), we must deduplicate the received events. See alphapapa#148.
Since we now fetch old events without a "from" token immediately after joining (i.e. without a full initial sync), we must deduplicate the received events. See #148.
I pushed a couple more fixes to the branch: https://github.com/alphapapa/ement.el/compare/wip/148-retro-after-join Including one for the behavior noted in #148 (comment) Now the room's token is not cleared if the response doesn't include one, so it doesn't go through that looping pattern. Please let me know if this branch still seems to work correctly. If it does, I'll merge it. |
It works well so far as I can see -- and once I've obtained all the history I now consistently see:
So that seems good. |
Since we now fetch old events without a "from" token immediately after joining (i.e. without a full initial sync), we must deduplicate the received events. See #148.
After joining a room, but without having received an initial sync response including its events, we have no "from" token, in which case we should omit it rather than sending an empty value for it. See #148.
If the response doesn't include an end token, don't clear the room's end token (which would make future "retro" requests load events from the end of the room's timeline, as if we hadn't received any). See #148.
When loading old events, we may sometimes receive duplicates (e.g. after joining a room but not having received its events in an initial sync response, we have no "from" token, so the first "retro" request will receive events from the end of its timeline, which may include ones we already have). Now we ignore events we've already received rather than inserting them into the timeline (and buffer). See #148.
I wish I could have helped more, i ended up switching off ement.el due to a relience on E2EE and pantalaimon being just bad... but i do plan to come back in the near future |
ement version: 9.0-pre
emacs version: some master revision, recent
pantalaimon: in use, custom fork but same behavior on the upstream version
If i attempt to scroll beyond the available messages, I get this error, which also causes minor Emacs freezes.
The text was updated successfully, but these errors were encountered: