Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Query missing cross-signing keys on local sig upload #7289

Merged

Conversation

anoadragon453
Copy link
Member

@anoadragon453 anoadragon453 commented Apr 16, 2020

Mitigates #7276

So we have a race condition that appears in the following scenario: You want to cross-sign with a remote user who doesn't share a room with any users on your homeserver. When you start cross-signing, a DM is created between you and the remote user.

When the remote user sends you their keys, the DM room may not have completed creation yet. In this case, your homeserver ignores the new keys. When you then upload your signatures of the remote user, your homeserver rejects them as it doesn't have the remote user's keys.

This seems to only occur with workerised Synapse setups. A possible solution to this is to fix the race condition, but that still leaves a number of people with broken setups. Another solution, and one we should probably do anyways, is to query the remote user's keys when a local user signs them, if the homeserver doesn't have them already.

So that's what this PR does. At least I think it's what it does. This is untested and I used MSC1756 as a reference for all of this so it may or may not be right. Review appreciated!

@anoadragon453 anoadragon453 self-assigned this Apr 16, 2020
Copy link
Member

@dbkr dbkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this save the key in the cache after fetching it and if not, should it? I can see above in the /keys/query path that it claims to be storing keys in the cache although failing to find where it actually does, and also not sure if this applies to cross-signing keys.

user = UserID.from_string(user_id)
try:
remote_result = yield self.federation.query_client_keys(
user.domain, {"device_keys": {user_id: []}}, timeout=10 * 1000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the user ID we're querying be in here?

Copy link
Member Author

@anoadragon453 anoadragon453 Apr 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume so? That's what the user_id in {"device_keys": {user_id: []}} is. I left the value as [] as according to the MSC that should query all of their keys?

Does this save the key in the cache after fetching it and if not, should it?

It's not, currently, but I suspect we want to. However atm I don't see any cache involved here. It looks like it's just pulling them directly from the db. I can save them to the db instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right yes, I was completely failing to read the object literal correctly.

and yeah, I think cache === database according to this code (

def get_cross_signing_keys_from_cache(self, query, from_user_id):
). Either way if we are just intentionally not updating our local copy here then that's fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if there's other code that relies on these keys being available then we'd also need the same fix there, and in that case it'd be more efficient to update the database in each instance.

But, changing the state of the db here may break other assumptions, so there's an argument against it. Unfortunately I don't have a grasp on everything Synapse is doing related to cross-signing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, looking at what code calls get_e2e_cross_signing_key, it's only the sig upload servlet and federation key queries, both of which shouldn't be harmed by having the extra keys in there.

Let me see what the code for saving them after retrieving them looks like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks plausible, but yes, unsure if you would need to notify clients.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to notify clients.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we seem to have a very similar situation in E2eKeysHandler.query_devices, which I don't think does any notification in this situation. After all, we're assuming that the user does not share any rooms with the remote user, so why would we notify clients of the new key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I wonder if we should not be sharing code with E2eKeysHandler.query_devices

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally pulled do_remote_query out of that method to try and share it between query_devices and _get_e2e_cross_signing_verify_key, but decided it was doing too much and just plucked self.federation.query_client_keys out of it instead.

Now we've transitioned to self.federation.query_user_devices which is a lot simpler. I'm just now worried/wondering whether we should be taking the device_keys returned by this method and doing anything with them (which query_devices seems to do, but it does so much else as well).

dbkr
dbkr previously approved these changes Apr 16, 2020
@anoadragon453 anoadragon453 requested a review from a team April 16, 2020 12:55
uhoreg
uhoreg previously requested changes Apr 16, 2020
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that a client could easily cause a DoS here by uploading a bunch of signatures which it claims are for a user on some remote server. AFAICT we would then retrieve the key for every single signature?

changelog.d/7289.bugfix Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Show resolved Hide resolved
synapse/handlers/e2e_keys.py Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
user = UserID.from_string(user_id)
try:
remote_result = yield self.federation.query_client_keys(
user.domain, {"device_keys": {user_id: []}}, timeout=10 * 1000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we seem to have a very similar situation in E2eKeysHandler.query_devices, which I don't think does any notification in this situation. After all, we're assuming that the user does not share any rooms with the remote user, so why would we notify clients of the new key.

user = UserID.from_string(user_id)
try:
remote_result = yield self.federation.query_client_keys(
user.domain, {"device_keys": {user_id: []}}, timeout=10 * 1000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I wonder if we should not be sharing code with E2eKeysHandler.query_devices

@anoadragon453
Copy link
Member Author

I'm a bit concerned that a client could easily cause a DoS here by uploading a bunch of signatures which it claims are for a user on some remote server. AFAICT we would then retrieve the key for every single signature?

Correct, which is a problem that kind of invalidates this whole approach. Hm.

@anoadragon453
Copy link
Member Author

@dbkr Is the problem that the race condition occurs between the client creating a DM with the remote user and trying to upload key signatures from that user?

How can you upload signatures if your homeserver has ignored device updates from that user? How did you have their public keys to sign?

Is the client waiting until the room is created before uploading signatures? If so, then there's definitely an issue with us querying for the rooms a remote user is in on the server.

@dbkr
Copy link
Member

dbkr commented Apr 16, 2020

@dbkr Is the problem that the race condition occurs between the client creating a DM with the remote user and trying to upload key signatures from that user?

I don't think so: I've been creating a direct chat with the user, accepting, then starting a verification, so I don't think any of this could race when it's happening in the amount of time it takes me to click on stuff.

How can you upload signatures if your homeserver has ignored device updates from that user? How did you have their public keys to sign?

So in /keys/query it looks like synapse will still fetch remote keys for users and return them, it just won't cache them unless it thinks we share a room, so my suspicion is that my client is querying the keys, and getting them, but synapse is not saving the resulting key to the db because it doesn't think we share a room yet.

Is the client waiting until the room is created before uploading signatures? If so, then there's definitely an issue with us querying for the rooms a remote user is in on the server.

It definitely is doing this: I'm not starting the verification process until the other user has joined (although this is also a good point - we've not done anything in particular to prevent you from trying to verify a user you aren't in a room with, but that's not the problem in this case).

@anoadragon453
Copy link
Member Author

@dbkr Thanks. My working theory on this is that get_rooms_for_user_with_stream_ordering may be getting it's cache invalidated on a different worker than the one handling key signature uploads, but we can look at that later.

@anoadragon453 anoadragon453 requested review from richvdh and uhoreg April 16, 2020 17:43
uhoreg
uhoreg previously approved these changes Apr 16, 2020
Copy link
Member

@uhoreg uhoreg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks roughly fine, other than a little nit. Please do try to test it in some way before merging, though.

synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few more thoughts and questions from me

changelog.d/7289.bugfix Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
@anoadragon453
Copy link
Member Author

anoadragon453 commented Apr 17, 2020

Looking at the signature update handling code, we may want to update clients about new devices as well:

yield device_handler.notify_device_update(user_id, device_ids)

Edit: Added some code to handle this.

changelog.d/7289.bugfix Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
@anoadragon453 anoadragon453 requested a review from richvdh April 20, 2020 17:43
@richvdh richvdh dismissed stale reviews from uhoreg and dbkr April 21, 2020 09:13

updated

changelog.d/7289.bugfix Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Show resolved Hide resolved
Comment on lines 1016 to 1018
try:
key_id, verify_key = get_verify_key_from_cross_signing_key(key)
except ValueError as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid this still feels like spaghetti. Why are we returning key_id and verify_key from _retrieve_cross_signing_keys_for_remote_user if we immediately throw them away?

how about:

        key = yield self.store.get_e2e_cross_signing_key(
            user_id, key_type, from_user_id
        )

        if key:
            # we found a copy of this key in our database. decode it and return it.
            # XXX is a try/except needed here? If so, why? we didn't have one before.
            key_id, verify_key = get_verify_key_from_cross_signing_key(key)
            return key, key_id, verify_key

        # some words about the edge case
        if not self.is_mine(user) and key_type in ["master", "self_signing"]:
            (
                key,
                key_id,
                verify_key,
            ) = yield self._retrieve_cross_signing_keys_for_remote_user(user, key_type)

        if key is None:
            logger.debug("No %s key found for %s", key_type, user_id)   # XXX or wrning?
            raise NotFoundError("No %s key found for %s" % (key_type, user_id))

        return key, key_id, verify_key

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XXX is a try/except needed here? If so, why? we didn't have one before.

I suppose one wouldn't be if we trust the data in our database, which we probably should be.

Comment on lines 1070 to 1074
# At the same time, store this key in the db for
# subsequent queries
yield self.store.set_e2e_cross_signing_key(
user.to_string(), key_type, key_content
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it correct that we store it before we check that it can be parsed? (In short, is the data in the table supposed to have been validated or not? If not, we need to be careful when we extract it). What does the existing code do?

Copy link
Member Author

@anoadragon453 anoadragon453 Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the above call from the database did not have a try/except, I believe we're operating on the assumption that the data in the database is valid.

Yes, we should validate this data in some way. Some level of such is done in do_remote_query:

for user_id, keys in remote_result["device_keys"].items():
if user_id in destination_query:
results[user_id] = keys
if "master_keys" in remote_result:
for user_id, key in remote_result["master_keys"].items():
if user_id in destination_query:
cross_signing_keys["master_keys"][user_id] = key
if "self_signing_keys" in remote_result:
for user_id, key in remote_result["self_signing_keys"].items():
if user_id in destination_query:
cross_signing_keys["self_signing_keys"][user_id] = key

I'd hate to duplicate but that could be lifted and modified.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of verification, one thing that's confusing me is that in the loop of _retrieve_cross_signing_keys_for_remote_user, remote_result should look something like this:

Response:
{
"device_keys": {
"<user_id>": {
"<device_id>": {...}
} }
"master_keys": {
"<user_id>": {...}
} }
"self_signing_keys": {
"<user_id>": {...}
} } }

Thus key_content should look something like:

"<user_id>": {
"<device_id>": {...}

And thus we should need to extract the key_contents from that dictionary before we pass it to get_verify_key_from_cross_signing_key.

And yet this PR works, so something isn't lining up here.

Copy link
Member Author

@anoadragon453 anoadragon453 Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already, so my docstring update is incorrect, as the thing I actually get back from query_client_keys is in the form:

{
  "user_id": "@test18:workerstest.amorgan.xyz",
  "stream_id": 213,
  "devices": [
    {
      "device_id": "MLYKYOAKEZ"
    },
    {
      "device_id": "NBICDLRVHV",
      "keys": {
        "algorithms": [
          "m.olm.v1.curve25519-aes-sha2",
          "m.megolm.v1.aes-sha2"
        ],
        "device_id": "NBICDLRVHV",
        "keys": {
          "curve25519:NBICDLRVHV": "E9ryKjylPorQIxyRA4FsJwi9d2bPECLvWKjgcuqmamU",
          "ed25519:NBICDLRVHV": "yk5AMkCyhogBCQt1CrVbFL9weRb/5iExmIDYd7JmBhE"
        },
        "signatures": {
          "@test18:workerstest.amorgan.xyz": {
            "ed25519:NBICDLRVHV": "JTAL+uIduDDJjdQneocTbjqjIDMljqDOq8bJ6ZgAYOQ01wMHrX4qRmQufQT9KAiYoXQ+b6UY+ZNEIFjyi1rjCg",
            "ed25519:cEYsFkRyHrfj1fLLgq4rA+JyvOwHda2mO092N4scL/E": "RQSm3aYjvWMWHx0GHZPRRKvj90TEdlqjOGXulXBUSDEuI2P64A7N/7/uSdx//BhqOgcmZSS/9o1Awlpb+KiiCg"
          }
        },
        "user_id": "@test18:workerstest.amorgan.xyz"
      },
      "device_display_name": "https://riot.im/develop/ via Firefox on Linux"
    }
  ],
  "master_key": {
    "user_id": "@test18:workerstest.amorgan.xyz",
    "usage": [
      "master"
    ],
    "keys": {
      "ed25519:RPg3wmdeZOvjMCmi3/pP9+4ndSW9W2dhE/RkLyCdr+E": "RPg3wmdeZOvjMCmi3/pP9+4ndSW9W2dhE/RkLyCdr+E"
    },
    "signatures": {
      "@test18:workerstest.amorgan.xyz": {
        "ed25519:NBICDLRVHV": "YLdKUD4aDmJx/vqIWjbqXTOZXv+cU1ba8Z9hw44vVn32uhLdJAiGC+L/6HeBTz7+Wh+NdWGBiy+usea3408bDA"
      }
    }
  },
  "self_signing_key": {
    "user_id": "@test18:workerstest.amorgan.xyz",
    "usage": [
      "self_signing"
    ],
    "keys": {
      "ed25519:cEYsFkRyHrfj1fLLgq4rA+JyvOwHda2mO092N4scL/E": "cEYsFkRyHrfj1fLLgq4rA+JyvOwHda2mO092N4scL/E"
    },
    "signatures": {
      "@test18:workerstest.amorgan.xyz": {
        "ed25519:RPg3wmdeZOvjMCmi3/pP9+4ndSW9W2dhE/RkLyCdr+E": "1H9aaL9623L764uGAezZM5jH3Gp5DLwto4s9rz+IP3bGQqLo0Cyi2bWK54aJbVrJ6akwxahZRs/n9vLDn3nIDQ"
      }
    }
  }
}

Lemme update that.

In terms of verification, I think we just need to check the user_id matches the user we're querying for, and that there's a keys field under each *_keys key.

Copy link
Member Author

@anoadragon453 anoadragon453 Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaand I was looking at the output of query_user_devices, not query_client_keys. Have updated the right docstring now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to conclude, we now validate the key by checking it belongs to the right user, and has a valid keys key dict in it with get_verify_key_from_cross_signing_key.

richvdh added a commit that referenced this pull request Apr 21, 2020
This was incorrectly merged before it was ready.

This reverts commit aead826, reversing
changes made to 4cd2a4a.

It also reverts commits 9b8212d, fb3f1fb and 2fdfa96.
anoadragon453 and others added 6 commits April 21, 2020 12:23
Co-Authored-By: Richard van der Hoff <[email protected]>
…:matrix-org/synapse into anoa/query_cross_signing_keys_key_upload

* 'anoa/query_cross_signing_keys_key_upload' of github.com:matrix-org/synapse:
  Update changelog.d/7289.bugfix
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few nits left. looking pretty good though.

synapse/federation/transport/client.py Outdated Show resolved Hide resolved
synapse/federation/transport/client.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Outdated Show resolved Hide resolved
synapse/handlers/e2e_keys.py Show resolved Hide resolved
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm otherwise. please fix up and if it passes CI, merge.

synapse/federation/transport/client.py Outdated Show resolved Hide resolved
@anoadragon453 anoadragon453 merged commit f89ad3b into release-v1.12.4 Apr 22, 2020
@anoadragon453 anoadragon453 deleted the anoa/query_cross_signing_keys_key_upload branch April 22, 2020 11:29
richvdh added a commit that referenced this pull request Apr 22, 2020
Synapse 1.12.4rc1 (2020-04-22)
==============================

Features
--------

- Always send users their own device updates. ([\#7160](#7160))
- Add support for handling GET requests for `account_data` on a worker. ([\#7311](#7311))

Bugfixes
--------

- Fix a bug that prevented cross-signing with users on worker-mode synapses. ([\#7255](#7255))
- Do not treat display names as globs in push rules. ([\#7271](#7271))
- Fix a bug with cross-signing devices belonging to remote users who did not share a room with any user on the local homeserver. ([\#7289](#7289))
clokep added a commit that referenced this pull request Apr 23, 2020
Synapse v1.12.4

Features:

* Always send users their own device updates. (#7160)
* Add support for handling GET requests for account_data on a worker. (#7311)

Bugfixes:

* Fix a bug that prevented cross-signing with users on worker-mode synapses. (#7255)
* Do not treat display names as globs in push rules. (#7271)
* Fix a bug with cross-signing devices belonging to remote users who did not share a
  room with any user on the local homeserver. (#7289)
anoadragon453 added a commit that referenced this pull request Apr 24, 2020
…anoa/temp_working_cache_config

* 'release-v1.12.4' of github.com:matrix-org/synapse: (123 commits)
  1.12.4
  formatting for the changelog
  1.12.4rc1
  1.12.4rc1
  Do not treat display names as globs for push rules. (#7271)
  Query missing cross-signing keys on local sig upload (#7289)
  Fix changelog file
  Support GET account_data requests on a worker (#7311)
  Revert "Query missing cross-signing keys on local sig upload"
  Always send the user updates to their own device list (#7160)
  Query missing cross-signing keys on local sig upload
  Only register devices edu handler on the master process (#7255)
  tweak changelog
  1.12.3
  Fix the debian build in a better way. (#7212)
  Fix changelog wording
  1.12.2
  Pin Pillow>=4.3.0,<7.1.0 to fix dep issue
  1.12.1
  Note where bugs were introduced
  ...
anoadragon453 added a commit that referenced this pull request Apr 24, 2020
…cache-config-without-synctl

* 'develop' of github.com:matrix-org/synapse: (76 commits)
  1.12.4
  Revert "Revert "Merge pull request #7315 from matrix-org/babolivier/request_token""
  Revert "Merge pull request #7315 from matrix-org/babolivier/request_token"
  Stop the master relaying USER_SYNC for other workers (#7318)
  Config option to inhibit 3PID errors on /requestToken
  Fix replication metrics when using redis (#7325)
  formatting for the changelog
  Another go at fixing one-word commands (#7326)
  1.12.4rc1
  1.12.4rc1
  fix changelog name
  Extend StreamChangeCache to support multiple entities per stream ID (#7303)
  Extend room admin api with additional attributes (#7225)
  Add ability to run replication protocol over redis. (#7040)
  Do not treat display names as globs for push rules. (#7271)
  Reduce logging verbosity of URL cache cleanup. (#7295)
  Query missing cross-signing keys on local sig upload (#7289)
  import urllib.parse when using urllib.parse.quote (#7319)
  Reduce federation logging on success (#7321)
  Fix changelog file
  ...
phil-flex pushed a commit to phil-flex/synapse that referenced this pull request May 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants