Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shorten subdirectory IDs #386

Merged
merged 3 commits into from
Nov 18, 2024
Merged

Shorten subdirectory IDs #386

merged 3 commits into from
Nov 18, 2024

Conversation

nothingmuch
Copy link
Collaborator

WIP since this should probably be refactored a bit before merging (see TODO comments).

The message for the last commit contains a rationale for 64 bit values and hashing.

@nothingmuch
Copy link
Collaborator Author

nothingmuch commented Nov 12, 2024

Akshually, I just realized the last commit message is (edit: was) incorrect, not only was the interception scenario possible before due to the truncated values used for the redis key, since this was 8 characters of Base64 and the first encoded byte was always 0x02 or 0x03 for the compressed y coordinate, the redis level IDs only had 41 bits of entropy, so in fact the last commit improves the robustness against DoS attacks and v1 fallback proposal PSBT leakage

@nothingmuch nothingmuch force-pushed the short-ids branch 2 times, most recently from b8d3332 to a6b64ed Compare November 12, 2024 01:32
payjoin/Cargo.toml Outdated Show resolved Hide resolved
payjoin/src/lib.rs Outdated Show resolved Hide resolved
@DanGould
Copy link
Contributor

DanGould commented Nov 14, 2024

Concept ACK in response to the "Shorten subdirectory IDs to 64 pseudorandom bits" commit message.

✅ utxo set size ≈ 2^28
✅ daily tx count limit ≈ 2^21

the probability for an individual session to
experience a random collision is << 1e-10

10 12 9s is way more than enough reliability of this particular component, and the utxo set size and even daily tx counts are already way overestimations of the number of sessions a single directory will ever have and many directories may be run.

Concept ACK also on your note that this already improves collision resistance on the naive truncated shortIDs of not-completely-pseudorandom values.

payjoin/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@DanGould DanGould left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to do what it says it does to me. There are quite a few TODOs that are OK to do in follow ups. I find it a bit weird to change the API in the first commit and only to use it in later commits but that's a nit of nits

Hoping @spacebear21 weighs in as well

Instead of the compressed public key, subdirectory IDs are now a
truncated SHA256 of the compressed public key.

These are should only be treated unique identifiers, not hashes: the use
of SHA256 is only an implementation detail, and should not be specified
by BIP77, nor verified/enforced by clients. This is because 64 bit
hashes are insufficiently binding: finding a pair of colliding keys is
almost trivial and finding a 2nd preimage for a given ID is tractable.
For this reason no tagging is used to derive the IDs: public keys are
ephemeral and have sufficient entropy to be unguessable. Random IDs
could also have been used, but hashing seems simpler and reduces the
receiver's statefulness requirements.

ID collisions are only a liveness concern, and do not affect safety.
With BIP77, HPKE will fail due to the wrong key (and/or domain
separation if future protocols also use short IDs). With BIP 78 fallback
the PSBT will not contain the intended receiver's outputs. The intended
receiver can still poll the same subdirectory and respond, eventually,
but only one sender will succeed.

64 bits are sufficient to make the probability of experiencing a random
collisions negligible. As of writing, the UTXO set has ~2^28 elements.
This is a very loose upper bound for the number of concurrent (non-spam)
sessions, for which the probability of a random collision with will be
<1%. The actual number of sessions will of course be (orders of
magnitudes) lower given they are short lived. With ~2^21 sessions (loose
bound on number of transactions that can be confirmed in 24H) the
probability is less than 1e-6. These figures are for the existence of a
collision in the set, the probability for an individual session to
experience a random collision is << 1e-10 in either case.

Since no rate limiting or access control mechanism exists for the
directory yet, it's notable that this change changes the nature of a
hypothetical DoS attack. With long IDs the adversary could only cause
operational errors in theory (e.g. by filling the directory's storage).

Note that by polling a large number of IDs an adversary can succeed in
randomly *intercepting* v2 clients' sessions, and POST garbage data to
the session causing HPKE to fail.

For v1 sessions this can leak PSBT proposals, since those are not
encrypted, which can leak input ownership information to the adversary.

As implemented this change is not a regression but an actually hardens
against DoS/malice in practice, because although in theory subdirectory
IDs contained more entropy, the underlying redis keys prior to this
change only contained 41 bits of entropy (8 characters of base64 encoded
data, with 0x02 or 0x03 for the first encoded byte).

Both the random collision and abuse scenarios can be mitigated by
restricting the number of concurrent sessions in the directory to more
reasonable values (less than 2^20). This is not done in this change.
@nothingmuch
Copy link
Collaborator Author

I find it a bit weird to change the API in the first commit and only to use it in later commits but that's a nit of nits

Squashed first two commits

@DanGould DanGould mentioned this pull request Nov 17, 2024
17 tasks
Copy link
Collaborator

@spacebear21 spacebear21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK a95a2ba. Noted one more nit that can be addressed in a follow-up.

@@ -99,6 +117,6 @@ impl DbPool {
}
}

fn channel_name(pubkey_id: &str, channel_type: &str) -> String {
format!("{}:{}", pubkey_id, channel_type)
fn channel_name(pubkey_id: &ShortId, channel_type: &str) -> Vec<u8> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper is no longer that helpful IMO and could be replaced by direct calls to column_key() in later commits.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed... do you think the TODO comments for ShortId are missing anything or could be simplified?

@DanGould DanGould merged commit b013874 into payjoin:master Nov 18, 2024
6 checks passed
@nothingmuch nothingmuch deleted the short-ids branch November 18, 2024 21:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants