Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Element-R: hang during login #25779

Open
richvdh opened this issue Jul 12, 2023 · 8 comments · Fixed by matrix-org/matrix-js-sdk#3610
Open

Element-R: hang during login #25779

richvdh opened this issue Jul 12, 2023 · 8 comments · Fixed by matrix-org/matrix-js-sdk#3610
Labels
A-Element-R Issues affecting the port of Element's crypto layer to Rust O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Impairs non-critical functionality or suitable workarounds exist T-Defect

Comments

@richvdh
Copy link
Member

richvdh commented Jul 12, 2023

Sometimes:

  • Log out
  • Attempt to log back in
  • See a spinner of doom:
    image

I think it's something indexedDB related. From the logs:

16:00:02.449 Removing indexeddb instance: matrix-js-sdk:crypto [rageshake.ts:74:27](webpack:///src/rageshake/rageshake.ts)
16:00:02.450 Removing IndexedDB instance matrix-js-sdk::matrix-sdk-crypto [rageshake.ts:74:27](webpack:///src/rageshake/rageshake.ts)
16:00:02.451 Removed indexeddb instance: matrix-js-sdk:crypto [rageshake.ts:74:27](webpack:///src/rageshake/rageshake.ts)
16:00:02.452 cannot yet remove IndexedDB instance matrix-js-sdk::matrix-sdk-crypto [rageshake.ts:74:27](webpack:///src/rageshake/rageshake.ts)
16:00:02.551 IndexedDB worker is ready [rageshake.ts:74:27](webpack:///src/rageshake/rageshake.ts)
16:00:02.551 Removing indexeddb instance: matrix-js-sdk:riot-web-sync [959eaaf02cce2e15d078.worker.js:5127:33](https://pr10080--matrix-react-sdk.netlify.app/959eaaf02cce2e15d078.worker.js)
16:00:02.552 Removed indexeddb instance: matrix-js-sdk:riot-web-sync [959eaaf02cce2e15d078.worker.js:5127:33](https://pr10080--matrix-react-sdk.netlify.app/959eaaf02cce2e15d078.worker.js)
16:00:02.553 Deleted indexeddb data. [rageshake.ts:74:27](webpack:///src/rageshake/rageshake.ts)


note cannot yet remove IndexedDB instance in particular.

It's fine after a reload.

@richvdh richvdh added T-Defect S-Minor Impairs non-critical functionality or suitable workarounds exist O-Uncommon Most users are unlikely to come across this or unexpected workflow A-Element-R Issues affecting the port of Element's crypto layer to Rust labels Jul 12, 2023
@richvdh richvdh self-assigned this Jul 19, 2023
@richvdh
Copy link
Member Author

richvdh commented Oct 4, 2023

This is back

@richvdh richvdh reopened this Oct 4, 2023
@richvdh
Copy link
Member Author

richvdh commented Oct 5, 2023

There is a test that is supposed to catch this (MatrixClient.clearStores > should clear the indexeddbs in matrix-js-sdk/spec/integ/crypto/rust-crypto.spec.ts). Once we find the leak, we should extend the test to reproduce the problem area.

@richvdh
Copy link
Member Author

richvdh commented Oct 20, 2023

I now can't reproduce this.

@richvdh richvdh closed this as completed Oct 20, 2023
@richvdh
Copy link
Member Author

richvdh commented Jun 20, 2024

We're still seeing this intermittently

@richvdh richvdh reopened this Jun 20, 2024
@richvdh
Copy link
Member Author

richvdh commented Jun 20, 2024

The root cause here is failures to clean up objects on the rust side during logout, which means we end up holding onto a connection to the IndexedDB. That means that we can't delete the IndexedDB, and when we log in again, attempts to open the IndexedDB block indefinitely.

The reason that the Rust objects aren't being cleaned up is that we're relying on javascript's FinalizationRegistry to call the rust-side destructors. As the MDN page makes clear, that mechanism is unreliable.

We could attempt to explicitly call .free on every reference to a Rust object that is ever passed to the JS layer (as we did for a specific case with matrix-org/matrix-js-sdk#3610), but that is likely to add a lot of boilerplate, and be very brittle.

A more plausible solution is to update the matrix-rust-sdk IndexeddbCryptoStore so that it doesn't maintain a connection to the Indexeddb throughout its lifetime, but rather opens a new one each time an operation is performed. Opening the Indexeddb is quick enough that this shouldn't have a significant performance impact.

@gmanskibiditoilet
Copy link

@richvdh @t3chguy please fix this problem or maybe u can find someone to make a new pull request as i often clean up cached files including indexeddb and when i opened out element and sign in again it blocked sign in and lead to this problem.

@t3chguy
Copy link
Member

t3chguy commented Feb 14, 2025

@gmanskibiditoilet what does this issue have to do with me? If you continue to randomly tag me I'll just block you so I don't get notifications unrelated to anything I'm involved with while I'm on holiday.

@gmanskibiditoilet
Copy link

gmanskibiditoilet commented Feb 14, 2025

@gmanskibiditoilet what does this issue have to do with me? If you continue to randomly tag me I'll just block you so I don't get notifications unrelated to anything I'm involved with while I'm on holiday.

ok sorry

richvdh added a commit to matrix-org/matrix-rust-sdk-crypto-wasm that referenced this issue Feb 17, 2025
This reverts commit eea167e.

Unfortunately, most of this change is unsound.

With an explicit `future_to_promise`, we are able to clone the underlying
`matrix_sdk_crypto::OlmMachine` (and its constituent `Arc`) *before* returning
a Promise to the javascript, meaning even if the
wrapping `matrix_sdk_crypto_wasm::OlmMachine` is dropped, the underlying `OlmMachine`
hangs around.

An `async` function does nothing until it is polled, meaning there is plenty of
time for the Javascript to get hold of the Promise and drop its reference to
the OlmMachine, before we try and clone it. I think it's fine if we let the
Javascript finalizer manage cleanup; but we deliberately drop OlmMachine, in an
attempt to get it to drop its indexeddb conneciton.

I'd go so far as to say that wasm-bindgen's support for `#[wasm_bindgen]` on
`async fn` (rustwasm/wasm-bindgen#1754) is
fundamentally incompatible with an explicit call to `.free()` (or, in this
case, `.close()`, which is the same thing.)

Probably, the correct fix here is to improve the way we handle indexeddb
connections (cf element-hq/element-web#25779), at
which point we can remove the `.close()`/`.free()` calls, and reintroduce
this. For now though, I'm going to back it out.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Element-R Issues affecting the port of Element's crypto layer to Rust O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Impairs non-critical functionality or suitable workarounds exist T-Defect
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants