Add end-to-end crypto? #1813
Comments
I think piggybacking on ZRTP could work nicely. My main concerns would be bandwidth requirements for group conversations, where, if we pairwise ZRTP between all parties, it could quickly grow into something unreasonable. But, if it's how Silent Circle does it, and it's expected that E2E VoIP group chat sends voice streams to each individual in the conversation, then I suppose it's not too bad... I'm curious how you expect this to work from a user's and UI perspective. Would E2E mode be something you go into explicitly (I'm assuming it has to be, because of SAS auth), and your current channel is then in E2E mode? Also, what happens when new people join? ZRTP is obviously more suited for a phone-like structure, where you explicitly make a call to a person or group of people. So maybe entering E2E mode is equivalent to selecting people currently connected to the server, and launching a separate "E2E" channel for those people. Then it'd be completely separate from the channel-based communication that Mumble currently uses, and more like a "call". For example, it could open the E2E-enabled conversation in a new tab, so that it is obvious to users that it's a separate "call". But then again, it also feels weird to have this be something you opt into explicitly... How to implement in a usable way into the current scheme Mumble uses seems to be one of the bigger hurdles to overcome. I'm curious if you guys have any ideas in mind already... |
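To put rough numbers on the bandwidth concern above, here is a small back-of-the-envelope sketch. It assumes the worst case where every participant is talking at once, and the 40 kbit/s per-stream figure is just a placeholder, not a measured Mumble bitrate. The function names are made up for illustration.

```python
def pairwise_streams(n: int) -> int:
    """Total one-way streams if each of n participants sends a separately
    encrypted copy of their voice to each of the other n - 1 participants."""
    return n * (n - 1)


def uplink_kbps_per_client(n: int, stream_kbps: float, pairwise: bool) -> float:
    """Upload bandwidth one talking client needs.

    pairwise=True  -> one encrypted copy per recipient (pairwise ZRTP style)
    pairwise=False -> a single copy that the server fans out (current relay model)
    """
    copies = (n - 1) if pairwise else 1
    return copies * stream_kbps


STREAM_KBPS = 40  # assumed placeholder bitrate per voice stream

for n in (2, 5, 10, 25):
    print(
        f"{n:>2} users: {pairwise_streams(n):>3} pairwise streams, "
        f"uplink {uplink_kbps_per_client(n, STREAM_KBPS, True):>5.0f} kbit/s pairwise "
        f"vs {uplink_kbps_per_client(n, STREAM_KBPS, False):>3.0f} kbit/s relayed"
    )
```

The per-client upload grows linearly with group size and the total stream count grows quadratically, which is why a shared group key (one encrypted copy, relayed by the server) comes up later in this thread as an alternative.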
I really dislike having to introduce "calls" semantics into what is by nature a group-chat software just so we can fit ZRTP's underlying assumptions. Here's how I see it: User-to-User:
Groups:
General:
In any case, implementing such a system is a lot of work. Making sure it is safe and provides all the guarantees we think it does is even more work. Such features will also have an impact on client complexity, backwards compatibility, and what server capabilities we can provide in the future (e.g. what about positional audio?). IMHO: there's already specialized software you can use if you want e2e encrypted communication, whose design can focus solely on that purpose. Bolting it onto Mumble, even though not quite a Frankenstein, might really be a bit out of scope for us. However interesting it is to think about this stuff... |
This is a very important feature for some of my team chats, +1 |
In 2018 I think this is a major concern and should be implemented. |
Over time it became necessary to get this feature. A trade-off may be using more bandwidth but nowadays this is a minor issue. |
Just to add something to the list to consider: Signal messenger added a new protocol called RingRTC. I think it's not yet open source, but I'm sure it will be soon. |
I have been toying with the idea of making mumble into a prototype Matrix client. Ephemeral Matrix rooms are already created (that I know of) upon placing a call, so channels could just be hidden as well, and made to contain the necessary metadata. The idea would be to piggy-back on the E2E ecosystem that Matrix already creates (including verification, extensible profiles, support for multiple devices -- soon with cross-signing), while adding everything that's nice about voice on Mumble. I bet a proof-of-concept wouldn't be that hard, since the Matrix Client-server protocol is quite simple. I have been thinking about this for a while, but haven't found enough time to start implementing this. If you want, I could elaborate a bit more. |
I think it could easily be a step in the wrong direction, assuming you mean Matrix.org. While 1:1 Riot calls are end-to-end encrypted, conference calls depend on Jitsi Meet. Matrix.org is currently also bad for privacy: while Mumble currently doesn't store chat logs, Synapse, the reference homeserver implementation for Matrix, stores everything forever (https://github.com/matrix-org/matrix-doc/issues/447), including deleted messages. Multiple other privacy issues are listed at https://github.com/privacytoolsIO/privacytools.io/issues/1049. I don't know what you would do to Murmur, but I have understood Synapse to be very heavy, especially if you have bigger rooms, while Murmur seems to run anywhere and there is µMurmur for even more limited systems. |
That's indeed the protocol documented on matrix.org I am talking about (though not matrix.org's Matrix server instance, to disambiguate).
This is indeed something that has to be cleared up. And the privacy points you mentioned are indeed concerning. Synapse should get better at this itself, but this is a lesser concern if the participants are on the same homeserver, more so if E2E is enabled. I am not sure we really should be discussing the merits of Matrix here, as I wouldn't want to completely derail the thread. I am not advocating for completely replacing Murmur here, but for being able to connect to Matrix servers with Matrix identifiers, and optionally use that as a backend for what Murmur is currently used for. Actually, I was thinking more of a proof-of-concept where a Matrix room would just contain a Mumble server address, with an authentication mechanism for connecting (publish the public key in the room, for instance). As I see it, it could be something completely bolted onto the Matrix protocol, with little modification (for starters, at least) to either Mumble or Murmur. Benefits:
Drawbacks:
Maybe the proper way to build what I describe here is a "hard" (could always be merged back) fork, after all, if all of this doesn't fall in the scope for Mumble. But it could then also be argued the same about E2E ... |
I just wanted to add that clients need to be able to verify their conference partners' fingerprints, in case the server hands out wrong public keys. |
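As a rough illustration of what such a check could look like, the sketch below hashes each party's public key into a fingerprint and derives a short code that two users could read to each other over another channel. This is not how Mumble handles certificates today; the function names, the SHA-256 choice, and the 6-digit code length are all assumptions made for the example.

```python
import hashlib


def fingerprint(public_key: bytes) -> str:
    """Human-readable SHA-256 fingerprint of a public key, grouped in 4 hex digits."""
    digest = hashlib.sha256(public_key).hexdigest()
    return ":".join(digest[i:i + 4] for i in range(0, len(digest), 4))


def short_auth_string(key_a: bytes, key_b: bytes, digits: int = 6) -> str:
    """Short decimal code both sides can compare out of band.

    The two key hashes are sorted first, so both parties compute the same
    code regardless of who is 'a' and who is 'b'.
    """
    first, second = sorted(
        (hashlib.sha256(key_a).digest(), hashlib.sha256(key_b).digest())
    )
    digest = hashlib.sha256(first + second).digest()
    number = int.from_bytes(digest[:4], "big") % 10 ** digits
    return f"{number:0{digits}d}"
```

If the server swapped in its own key for one of the parties, the two users would compute their codes over different key pairs, so the codes would almost certainly not match when read aloud. A 6-digit code gives an attacker roughly a one-in-a-million chance per attempt, which is a trade-off between usability and strength.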
This is stating the obvious, but just in case newcomers forget: Mumble is already encrypted, so it's not a problem at all if you completely trust your host or your host is one of the parties to the conversation. https://wiki.mumble.info/wiki/FAQ/English#Is_Mumble_encrypted.3F Please don't misinterpret my comment as saying this idea is already in Mumble; I'm just stating for the record that it already has encryption, but not E2E encryption, which absolutely has its place and should be considered. On another note, I think the extra bandwidth could be offset by server owners getting a separate config option for the bandwidth of E2E calls, so they can set it lower than normal. |
I would like to add a concept for this (also considering the discussion about chat logs matrix-org/matrix-spec-proposals#2560). This would be a per-channel solution, because Mumble is a channel- and server-based software. We create a group-key (details below) that is used for encryption for all participants in the channel. So the client encrypts voice & chat with the group-key and sends it to the server; the server then sends it to the other clients and they decrypt it. For chat logs (in case of server-side chat logs), the server will add the encrypted chat messages to a file (specific to that group). group-key creation: Now the only remaining problem is the distribution of the key, because that would normally be sent via the server, so a potential man-in-the-middle attack by the server is possible.
For 1 (encryption) we have the following solution: For 2 (verification of sender/receivers) we have these solutions:
Details:
|
I understand that doing this "nicely", with proper, seamless, key exchange would be a lot of work. However, I think there would be a lot of value in doing it with key exchange out of band:
This could be slightly improved at very little cost by having some UI like "Yanmaani would like to enable encryption; key ID = Lots of other small UX improvements you could make like that, without actually implementing any "big" protocol. And then if this is implemented and works, someone could maybe start looking at key exchange more properly, or going the Unix way and having that done by a separate, external daemon. But for me, just having this simple though ugly system of pre-sharing a key would be extremely useful for my personal needs. |
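A minimal sketch of the pre-shared-key idea, using only the Python standard library: both sides turn an agreed passphrase into a key, and a short key ID lets them confirm they ended up with the same key without revealing it. The key ID format, the salt choice, and the iteration count are all made up for this example and are not taken from the comment above.

```python
import hashlib


def derive_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a passphrase shared out of band into a 32-byte symmetric key.

    The salt could be something both sides already know, such as the channel
    name (an assumption for this sketch).
    """
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)


def key_id(key: bytes) -> str:
    """Short, non-secret identifier for a key (hypothetical format).

    A UI could show this so users can confirm they typed the same passphrase.
    """
    return hashlib.sha256(b"key-id" + key).hexdigest()[:8]
```

Both users would call `derive_key` with the same passphrase and salt and compare the `key_id` output shown in their clients. The server only ever sees traffic encrypted under the derived key, never the passphrase.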
For this feature, the flow as I understand it would be:
Items 2 and 7 are going to create a delay, and it will largely depend on the algorithm, key size, and speed of the two PCs. In a room with multiple people you would most likely have to multicast to attempt to reduce that delay. The overhead would be noticeable to those in the gaming community who use Mumble for real-time communication. The similar scenario with chat will be negligible by comparison. Since there is no storage of the audio, it's probably just better to encrypt in transit (which Mumble's TLS transport already does). |
I think this is addressed by https://security.stackexchange.com/a/127331 |
Moreover, using a stream cipher like ChaCha20 (as used by WireGuard) after negotiating a symmetric encryption key, the performance cost is very small: one just needs to generate the cipher stream, which can be accelerated with SSE, and XOR the cipher stream with the data stream. Assuming cipher stream generation is faster than the sampling rate, this would add one instruction (XOR) of latency per byte, more or less (one might decide to wait for enough data to fill a packet and perform a vectored XOR on that before sending the packet, which would effectively produce less added latency on a per-byte basis, but could produce more total latency). We're talking about nanoseconds in any case. |
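To make the "generate a cipher stream, then XOR" point concrete, here is a pure-Python sketch of the ChaCha20 stream cipher following the layout in RFC 8439 (32-byte key, 12-byte nonce, 32-bit block counter). It is written for readability, not speed, so it says nothing about real-world latency. It also provides no authentication, and the nonce must never repeat under the same key; a real design would use a vetted library and an AEAD such as ChaCha20-Poly1305.

```python
import struct

MASK = 0xFFFFFFFF


def rotl32(v: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((v << n) & MASK) | (v >> (32 - n))


def quarter_round(s: list, a: int, b: int, c: int, d: int) -> None:
    """ChaCha quarter round on four words of the state, in place."""
    s[a] = (s[a] + s[b]) & MASK; s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK; s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK; s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK; s[b] = rotl32(s[b] ^ s[c], 7)


def chacha20_block(key: bytes, counter: int, nonce: bytes) -> bytes:
    """Produce one 64-byte keystream block."""
    init = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]  # "expand 32-byte k"
    init += struct.unpack("<8I", key)
    init += [counter & MASK]
    init += struct.unpack("<3I", nonce)
    s = list(init)
    for _ in range(10):  # 10 double rounds = 20 rounds
        quarter_round(s, 0, 4, 8, 12)   # column rounds
        quarter_round(s, 1, 5, 9, 13)
        quarter_round(s, 2, 6, 10, 14)
        quarter_round(s, 3, 7, 11, 15)
        quarter_round(s, 0, 5, 10, 15)  # diagonal rounds
        quarter_round(s, 1, 6, 11, 12)
        quarter_round(s, 2, 7, 8, 13)
        quarter_round(s, 3, 4, 9, 14)
    return struct.pack("<16I", *((x + y) & MASK for x, y in zip(s, init)))


def chacha20_xor(key: bytes, nonce: bytes, data: bytes, counter: int = 1) -> bytes:
    """Encrypt or decrypt by XORing data with the keystream (same operation)."""
    out = bytearray()
    for offset in range(0, len(data), 64):
        keystream = chacha20_block(key, counter + offset // 64, nonce)
        chunk = data[offset:offset + 64]
        out += bytes(x ^ k for x, k in zip(chunk, keystream))
    return bytes(out)
```

Because encryption and decryption are the same XOR, a receiver calls `chacha20_xor` with the same key and nonce to recover the audio frame. The only per-byte work on top of keystream generation is that XOR, which is the point being made above.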
Linked channels and shouts are not an issue. At least if the goal is merely to prevent the server from sniffing into conversations. We only have to treat the encryption the same as if all clients were in a single room. Thus, all clients on a server generate some sort of shared secret in a way the server doesn't know. Then the clients can all encrypt and decrypt packets in a way that the server can't. Thus, it wouldn't matter which channel a given user is in. |
Oh, I didn't see anyone mentioned per server instead of per channel. That is interesting. |
Not sure anyone did actually mention that, but it immediately came to mind as a way to solve the problems you mentioned :) But then again: a knowledgeable server admin may change the server's code to send the audio stream to a locally running client that can then decrypt the conversation. I guess true security would indeed only be provided in a call-like scenario (which could always be an additional feature - e.g. "secure channels" for which links and shouts won't work). |
I 100% agree with @Krzmbrzl on the above. I see no reason to scope key handling depending on the channel the user is in. The only issue I can find is that "invisible" users wouldn't be able to snoop on conversations, but that's probably a feature. Now, indeed, the server could add any number of fake users it sees fit. That's a bit more obvious than silent eavesdropping, and users could theoretically make use of the "friends" mechanism to only send data to trusted users, or choose to ignore users and never give them keys. Secure channels where every participant must be verified are an option as well. MITM by the server is a risk, of course, but that can be mitigated, especially with the "friend" feature: connect to another random server and verify fingerprints there, or out-of-band. |
I don't think undermining E2E is a good or even acceptable idea. |
Nobody is undermining anything. E2E is by definition only a technique that prevents anyone in the middle from decrypting stuff... It doesn't require separate encryption for all recipients. |
I don't see a solution where the server is able to eavesdrop as useful. But server wide encryption where everyone has to be friends or where you need a secret (that is not shared with the server) to enter would work. The users need to verify each other somehow. |
I'm not an advocate of encryption, but I can see a potential option:
* The client generates a public/private key pair (existing client
certificate);
* When the client adds someone as a friend, it asks to verify the
friend's public key signature;
* When the client enters a channel, it generates a channel session key;
* The client sends everyone who can hear (including whispers and
related channels) the channel session key, encrypted with the
recipient's public key, and requests the recipient's channel session
key;
* The client sends voice and data encrypted with its channel session
key to everyone through the server;
* Only the one with the correct key can decrypt the voice and data.
The client can enable the sending of the session key to everyone, or
only to friends, with the appropriate notification.
The client can enable/disable encryption altogether.
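The steps above can be sketched roughly as follows. This is a toy model using only the Python standard library: `seal` and `open_packet` are a stand-in built from SHA-256 and HMAC and are not a vetted cipher, and `share_key_with` hands the session key over directly, where the proposal would send it encrypted under the recipient's public key (the standard library has no public-key encryption, so that step is omitted). All names here are hypothetical.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and MAC keys from one session key."""
    return hashlib.sha256(label + key).digest()


def _keystream(key: bytes, header: bytes, length: int) -> bytes:
    blocks = (
        hashlib.sha256(_subkey(key, b"enc") + header + i.to_bytes(4, "big")).digest()
        for i in range(length // 32 + 1)
    )
    return b"".join(blocks)[:length]


def seal(key: bytes, seq: int, frame: bytes) -> bytes:
    """Toy encrypt-then-MAC: 8-byte sequence number, XORed body, 32-byte tag."""
    header = seq.to_bytes(8, "big")
    body = bytes(f ^ s for f, s in zip(frame, _keystream(key, header, len(frame))))
    tag = hmac.new(_subkey(key, b"mac"), header + body, hashlib.sha256).digest()
    return header + body + tag


def open_packet(key: bytes, packet: bytes):
    """Return the plaintext frame, or None if the tag does not verify."""
    header, body, tag = packet[:8], packet[8:-32], packet[-32:]
    expected = hmac.new(_subkey(key, b"mac"), header + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(b ^ s for b, s in zip(body, _keystream(key, header, len(body))))


class Client:
    def __init__(self, name: str):
        self.name = name
        self.session_key = secrets.token_bytes(32)  # generated on entering a channel
        self.peer_keys = {}                         # sender name -> their session key
        self.seq = 0

    def share_key_with(self, peer: "Client") -> None:
        # Assumption: in the real flow this travels wrapped in peer's public key.
        peer.peer_keys[self.name] = self.session_key

    def send(self, frame: bytes) -> bytes:
        self.seq += 1
        return seal(self.session_key, self.seq, frame)

    def receive(self, sender: str, packet: bytes):
        key = self.peer_keys.get(sender)
        return open_packet(key, packet) if key else None
```

In this model the server only ever forwards the bytes returned by `send`. A listener that was never given the sender's session key (for example, a user who is not on the sender's friends list) gets `None` from `receive`.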
|
That is not the point, I proposed using a "(sessioned) group key" (or whatever it is called in correct terms), myself. I think this is also a standard method, but I am not an expert. Even if limited to a channel, it is problematic, as users would expect their communication to be limited to participants that are known (aka visible and maybe verified) and restricted (e.g. password-protected channel and maybe forward secrecy (see point 1 below)) and/or that take/took part in this session only.
|
[Originally from https://github.com/mumble-voip/mumble-iphoneos/issues/87]
@ioerror says:
A mumble server is able to wiretap every mumble client. It would be nice if the content of audio (and chat text per channel) was not available to a passive collection process. It should be possible for every client to do at least a pairwise ZRTP, if not a group ZRTP (pairwise, each pair?) as I think Silent Circle does for group calls.
SilentCircle has this group calling feature but it isn't Free Software. I've heard (but not used) that Jitsi has this with ZRTP and it is Free Software.
This would be an amazing feature and it would mean that a server would essentially just be a relay for encrypted data with minimal metadata (eg: user name, ip address, channel joined).
@Tea23 says:
As a pretty heavy Mumble user I'd just like to express some support for this idea. End-to-end crypto in Mumble would bring it up to speed with OTR, adding vital private voice comms to the landscape which at the moment are pretty scarce and certainly none of them have the feature set of Mumble.
At the moment verifying identities w/ Mumble is pretty difficult since the certificate handling is pretty shoddy, and if there really is no end-to-end encryption then its usefulness as a safe tool for private communication is severely limited.
Mumble's probably in the best position of any VoIP stack to provide an effective privacy suite.