Work around '#' escaping bug in bip21 crate #373

Merged
merged 1 commit into payjoin:master from escape-fragment-separator
Dec 3, 2024

Conversation

nothingmuch
Collaborator

@nothingmuch nothingmuch commented Oct 21, 2024

The pj parameter of the BIP 21 URL is itself a URL which contains a
fragment.

The # character was not escaped by bip21 during serialization. According
to RFC 3986, this terminates the query string, effectively truncating the pj
parameter.

The buggy deserialization likewise ignored #, parsing it as part of the
value, which survived a round trip with the incorrect serialization
behavior.

Upstream fix PR: payjoin/bitcoin_uri#3
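
To illustrate the truncation described above, here is a minimal sketch using only the standard library. The helper names `split_fragment` and `query_param`, the address, and the endpoint are hypothetical; this is not the bip21 crate's parser, only the RFC 3986 rule that the first `#` starts the fragment.

```rust
/// Split a URI reference at the first '#', which per RFC 3986 starts the fragment.
fn split_fragment(uri: &str) -> (&str, Option<&str>) {
    match uri.split_once('#') {
        Some((before, fragment)) => (before, Some(fragment)),
        None => (uri, None),
    }
}

/// Return the raw (still percent-encoded) value of query parameter `key`, if present.
fn query_param<'a>(uri: &'a str, key: &str) -> Option<&'a str> {
    let (before_fragment, _) = split_fragment(uri);
    let (_, query) = before_fragment.split_once('?')?;
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(k, _)| *k == key)
        .map(|(_, v)| v)
}

fn main() {
    // Hypothetical address and endpoint, for illustration only.
    let unescaped = "bitcoin:ADDRESS?pj=https://example.com/SUBDIR#ohttp=KEY";
    let escaped = "bitcoin:ADDRESS?pj=https://example.com/SUBDIR%23ohttp=KEY";

    // The unescaped '#' ends the query, so the pj value loses its fragment.
    assert_eq!(query_param(unescaped, "pj"), Some("https://example.com/SUBDIR"));
    // With '#' escaped as %23, the whole pj value stays inside the query.
    assert_eq!(
        query_param(escaped, "pj"),
        Some("https://example.com/SUBDIR%23ohttp=KEY")
    );
}
```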

@DanGould
Contributor

Hmm. It looks like the sender's parsed subdirectory evaluates to "Ak-eJMIFs9fSuMoULsRwGhVnrvNhLc4IuMGmq9rjt8er%23ohttp=AQJPzw3sw5nHUaLaeVCqmGguYzH5VfjZBYHkQhS3h6ZQVQ&exp=1729708517"

You can see the %23ohttp isn't decoded to #ohttp

@DanGould
Contributor

DanGould commented Oct 22, 2024

It seems like this PR is missing the corresponding percent-decode in deserialize_temp. I thought we wrote this together when pair programming

Something like the following seems to address the issue in my local environment.

            "pj" if self.pj.is_none() => {
                let encoded = Cow::try_from(value).map_err(|_| InternalPjParseError::NotUtf8)?;
                let decoded = percent_decode_str(&encoded)
                    .map_err(|_| InternalPjParseError::BadEndpoint)?
                    .decode_utf8()
                    .map_err(|_| InternalPjParseError::NotUtf8)?;
                let url = Url::parse(&decoded).map_err(|_| InternalPjParseError::BadEndpoint)?;
                self.pj = Some(url);

                Ok(bip21::de::ParamKind::Known)
            }

These commits should be squashed if we agree that they're the right ones to make

@nothingmuch
Collaborator Author

I don't understand the CI error. I can't replicate it locally with act, and none of cargo 1.77.2 (e52e36006 2024-03-26), 1.80.0 (376290515 2024-07-16), or 1.81.0 (2dbb1af80 2024-08-20) changes the lockfile. I'd appreciate some guidance.

@DanGould
Contributor

DanGould commented Oct 22, 2024

I believe the long-running ohttp relay and directory tasks in integration tests started becoming flaky around a week ago. I'm not sure why they're causing the problem, but the problem seems to have to do with the GitHub CI environment

@nothingmuch
Collaborator Author

> It seems like this PR is missing the corresponding percent-decode in deserialize_temp. I thought we wrote this together when pair programming

My fix is actually buggy: the '#'-only encoding pass should happen after the incomplete encoding, but because it runs before it, the '%' in the resulting '%23' is then escaped as well.

The bip21 crate is correct here, so unfortunately I don't think there's a simple workaround without the upstream fix:

https://github.com/Kixunil/bip21/blob/eae72026cc5838bb169949641948b8c1cef99cbe/src/ser.rs#L139

https://github.com/Kixunil/bip21/blob/eae72026cc5838bb169949641948b8c1cef99cbe/src/ser.rs#L57
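
The ordering problem can be sketched with two simplified passes. Both functions below are hypothetical stand-ins (the real serializer escapes more characters than `%`); the point is only that a pass which escapes `%` must not run after the `#` pass.

```rust
/// Stand-in for the workaround pass: escape only '#'.
fn escape_hash(s: &str) -> String {
    s.replace('#', "%23")
}

/// Simplified stand-in for a general value encoder that escapes '%' but not '#'.
fn escape_percent(s: &str) -> String {
    s.replace('%', "%25")
}

fn main() {
    // Hypothetical endpoint, for illustration only.
    let value = "https://example.com/SUBDIR#ohttp=KEY";

    // '#' pass first, general pass second: the '%' of "%23" is escaped again.
    assert_eq!(
        escape_percent(&escape_hash(value)),
        "https://example.com/SUBDIR%2523ohttp=KEY"
    );

    // General pass first, '#' pass second: '#' ends up escaped exactly once.
    assert_eq!(
        escape_hash(&escape_percent(value)),
        "https://example.com/SUBDIR%23ohttp=KEY"
    );
}
```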

> Something like the following seems to address the issue in my local environment.
>
>             "pj" if self.pj.is_none() => {
>                 let encoded = Cow::try_from(value).map_err(|_| InternalPjParseError::NotUtf8)?;
>                 let decoded = percent_decode_str(&encoded)
>                     .map_err(|_| InternalPjParseError::BadEndpoint)?
>                     .decode_utf8()
>                     .map_err(|_| InternalPjParseError::NotUtf8)?;
>                 let url = Url::parse(&decoded).map_err(|_| InternalPjParseError::BadEndpoint)?;
>                 self.pj = Some(url);
>
>                 Ok(bip21::de::ParamKind::Known)
>             }

This is not correct as it would decode escaped parameter values, e.g. if the pj URI has a query string in it, and that has an escaped # in it, it'd break the pj URI parsing.

Decoding is already done here:

https://github.com/Kixunil/bip21/blob/eae72026cc5838bb169949641948b8c1cef99cbe/src/de.rs#L65

As confirmed by this test:

https://github.com/payjoin/rust-payjoin/pull/373/files#diff-0a6cb81cf3412ef9ec45c2a520b79c91435b057e6557dc466ecb88faf7ac6df9R133
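
A rough sketch of why a second decoding pass is harmful. The `percent_decode` helper is a hypothetical, minimal decoder written for this example, not the one bip21 uses, and the `note` parameter is invented: a pj URL whose own query contains an escaped `#` is changed by decoding twice.

```rust
/// Minimal percent-decoder (illustrative only; not the bip21 implementation).
/// Returns None on a truncated or non-hex escape, or on invalid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hex = input.get(i + 1..i + 3)?;
            if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            out.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

fn main() {
    // Hypothetical pj value as it appears inside the BIP 21 query: the inner
    // URL's own "%23" was encoded to "%2523", its fragment '#' to "%23".
    let in_query = "https://example.com/SUBDIR?note=a%2523b%23ohttp=KEY";

    // One decoding pass recovers the intended pj URL.
    let once = percent_decode(in_query).unwrap();
    assert_eq!(once, "https://example.com/SUBDIR?note=a%23b#ohttp=KEY");

    // A second pass also decodes the inner "%23", so the fragment now
    // appears to start after "note=a" instead of before "ohttp".
    let twice = percent_decode(&once).unwrap();
    assert_eq!(twice, "https://example.com/SUBDIR?note=a#b#ohttp=KEY");
}
```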

@nothingmuch
Collaborator Author

I don't see any other way of fixing serialization

Something like the following approach could work if PjUri weren't an alias to an external type, which makes the trait implementation impossible (well, apart from the infinite recursion):

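impl<'a> fmt::Display for PjUri<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Formatting `self` here re-enters this impl (the infinite recursion noted above).
        let malformed_uri = format!("{}", self);
        // Escape only the first '#', the fragment separator of the pj URL.
        let escaped = malformed_uri.replacen("#", "%23", 1);
        write!(f, "{}", escaped)
    }
}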

The only solutions I can see are waiting for upstream fix to be merged or making PjUri (which is pub) a wrapper type instead of an alias so overriding fmt can make PjUri stringify with the correct escaping (making it a wrapper also fixes the recursion).

I pushed a version of the latter with a Deref trait, but that still breaks the API even if it can maintain syntactic compatibility for accessing the bip21::Uri fields, so this doesn't seem advisable.
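
For reference, the wrapper approach might look roughly like the sketch below. `InnerUri` is a hypothetical stand-in for the external URI type and `PjUriWrapper` is an invented name; this is not the code that was pushed, only an illustration of how a newtype makes the `Display` override legal and avoids the recursion by formatting the inner value.

```rust
use std::fmt;
use std::ops::Deref;

/// Hypothetical stand-in for the external URI type whose Display leaves '#' unescaped.
struct InnerUri {
    address: String,
    pj: String,
}

impl fmt::Display for InnerUri {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "bitcoin:{}?pj={}", self.address, self.pj)
    }
}

/// Newtype wrapper: a local type, so Display can be implemented for it.
struct PjUriWrapper(InnerUri);

impl Deref for PjUriWrapper {
    type Target = InnerUri;
    fn deref(&self) -> &InnerUri {
        &self.0
    }
}

impl fmt::Display for PjUriWrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Format the inner value, not `self`, so there is no recursion.
        let malformed = self.0.to_string();
        write!(f, "{}", malformed.replacen('#', "%23", 1))
    }
}

fn main() {
    let uri = PjUriWrapper(InnerUri {
        address: "ADDRESS".into(),
        pj: "https://example.com/SUBDIR#ohttp=KEY".into(),
    });
    // Field access keeps working through Deref.
    assert_eq!(uri.address, "ADDRESS");
    // The wrapper's Display escapes the fragment separator.
    assert_eq!(
        uri.to_string(),
        "bitcoin:ADDRESS?pj=https://example.com/SUBDIR%23ohttp=KEY"
    );
}
```

Field access stays source-compatible through `Deref`, but any caller that names the type or passes it where the inner type is expected would still need changes, which is the API break mentioned above.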

@nothingmuch
Collaborator Author

Dropped the Deref hack, just froze the dep to use the bugfix PR branch as per our chat

@nothingmuch nothingmuch force-pushed the escape-fragment-separator branch from f2b9dfa to ba0378c Compare October 23, 2024 18:20
payjoin-cli/Cargo.toml (outdated review thread, resolved)
@DanGould DanGould mentioned this pull request Nov 17, 2024
17 tasks
@DanGould
Contributor

DanGould commented Nov 25, 2024

Looks like it makes the most sense to link bech32 = "*" since it's not exported from rust-bitcoin

rust-bitcoin/rust-bitcoin#3650

edit: Crates.io does NOT allow wildcard version dependencies

@nothingmuch nothingmuch mentioned this pull request Nov 28, 2024
@DanGould
Contributor

Since this is deadlocked by kixunil I propose we start a new crate with hopes to implement the new unmerged bip spec anyhow. @spacebear21 is this something you'll have capacity for?

@spacebear21
Collaborator

> Since this is deadlocked by kixunil I propose we start a new crate with hopes to implement the new unmerged bip spec anyhow. @spacebear21 is this something you'll have capacity for?

Sure, should we start by just forking Kixunil's repo into the payjoin project and merge Kixunil/bip21#26 there? I'm hesitant to publish a crate for an unfinalized/unmerged bip, which wouldn't even be compliant with said bip, but I guess we could put a big disclaimer up front.

DanGould added a commit that referenced this pull request Dec 2, 2024
These commits:

1. encode fragment parameters as bech32 strings (with no checksum), with
  2 character param names in the human readable part, and `+` as a
  delimiter
  - expiry time parameter encoded as a little endian u32, in line with
    bitcoin header timestamp and unix time nLocktime encoding. `EX` hrp
  - ohttp has `OH` hrp, same byte representation as previously encoded in
    base64
  - receiver key has `RK` hrp, same byte representation as previously
    encoded in base64
2. use bech32 with no human readable part for the subdirectory IDs
3. uppercase the scheme and host of the URI
4. move the pj so it follows the pjos parameter

Once the `#` %-escaping bug is fixed (see #373), and as long as there are no path
elements between the host and the subdirectory ID, the entire `pj`
parameter of the bip21 URI can be encoded in a QR using the alphanumeric
mode. Closes #389.

Manually verified, with manual % escaping of `#`, using the `qrencode`
CLI encoder alphanumeric mode is indeed used for everything following
`pj=`.
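
As a rough illustration of the alphanumeric-mode claim in the commit message above, the sketch below checks a string against the QR alphanumeric character set (digits, uppercase A-Z, space, and `$ % * + - . / :`). The function names and the sample `pj` value are hypothetical, and the payloads are placeholders rather than real bech32 data.

```rust
/// True if every character belongs to the QR alphanumeric-mode character set.
fn is_qr_alphanumeric(s: &str) -> bool {
    s.chars()
        .all(|c| c.is_ascii_digit() || c.is_ascii_uppercase() || " $%*+-./:".contains(c))
}

/// Expiry as a little-endian u32, as described for the `EX` parameter.
fn expiry_bytes(unix_time: u32) -> [u8; 4] {
    unix_time.to_le_bytes()
}

fn main() {
    // Hypothetical pj value with placeholder payloads, '#' escaped as %23.
    let escaped = "HTTPS://EXAMPLE.COM/SUBDIRID%23EX1PAYLOAD+OH1PAYLOAD+RK1PAYLOAD";
    assert!(is_qr_alphanumeric(escaped));

    // A literal '#' is outside the set, as are lowercase letters and '='.
    assert!(!is_qr_alphanumeric("HTTPS://EXAMPLE.COM/SUBDIRID#EX1PAYLOAD"));
    assert!(!is_qr_alphanumeric("https://example.com/subdir"));
    assert!(!is_qr_alphanumeric("EXP=1"));

    // The little-endian encoding round-trips.
    let exp = 1_729_708_517u32;
    assert_eq!(u32::from_le_bytes(expiry_bytes(exp)), exp);
}
```

Because `=` and lowercase letters are outside this set, uppercase bech32 strings joined by `+` fit where `key=value` pairs in base64 would not.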
@DanGould
Contributor

DanGould commented Dec 2, 2024

The Kixunil/bip21#26 issue is now fixed in bitcoin_uri 0.1.0 fork

@nothingmuch nothingmuch force-pushed the escape-fragment-separator branch 2 times, most recently from 18dd004 to a7affe7 Compare December 2, 2024 22:39
@nothingmuch nothingmuch marked this pull request as draft December 2, 2024 22:40
@nothingmuch nothingmuch force-pushed the escape-fragment-separator branch from a7affe7 to d46d704 Compare December 2, 2024 22:41
@nothingmuch nothingmuch marked this pull request as ready for review December 2, 2024 22:41
@coveralls
Collaborator

coveralls commented Dec 2, 2024

Pull Request Test Coverage Report for Build 12131928990

Details

  • 23 of 25 (92.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 61.869%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| payjoin/src/uri/mod.rs | 10 | 12 | 83.33% |

| Totals (Coverage Status) | |
|---|---|
| Change from base Build 12131589868 | 0.1% |
| Covered Lines | 2867 |
| Relevant Lines | 4634 |

💛 - Coveralls

@nothingmuch nothingmuch force-pushed the escape-fragment-separator branch 2 times, most recently from 80be48e to d822f02 Compare December 2, 2024 22:57
DanGould added a commit that referenced this pull request Dec 3, 2024
These commits:

1. encode fragment parameters as bech32 strings (with no checksum), with
  2 character param names in the human readable part, and `+` as a
  delimiter
  - expiry time parameter encoded as a little endian u32, in line with
    bitcoin header timestamp and unix time nLocktime encoding. `EX` hrp
  - ohttp has `OH` hrp, same byte representation as previously encoded in
    base64
  - receiver key has `RK` hrp, same byte representation as previously
    encoded in base64
2. use bech32 with no human readable part for the subdirectory IDs
3. uppercase the scheme and host of the URI
4. move the pj so it follows the pjos parameter

Once the `#` %-escaping bug is fixed (see #373), and as long as there are no path
elements between the host and the subdirectory ID, the entire `pj`
parameter of the bip21 URI can be encoded in a QR using the alphanumeric
mode. Closes #389.

The last commit seems sufficiently motivated. Since there is no purpose
to clearing these values it doesn't actually simplify the code that much.

Manually verified, with manual % escaping of `#`, using the `qrencode`
CLI encoder alphanumeric mode is indeed used for everything following
`pj=`.
Contributor

@DanGould DanGould left a comment

ACK 66e243d

edit: and 1aaa7a7 which edited the commit message

The `pj` parameter of the BIP 21 URL is itself a URL which contains a
fragment.

The # character was not escaped by bip21 during serialization. According
to RFC 3986, this terminates the query string, effectively truncating the `pj`
parameter.

The buggy deserialization likewise ignored #, parsing it as part of the
value, which survived a round trip with the incorrect serialization
behavior.
@nothingmuch nothingmuch force-pushed the escape-fragment-separator branch from 66e243d to 1aaa7a7 Compare December 3, 2024 02:42
@DanGould DanGould merged commit a1fbac5 into payjoin:master Dec 3, 2024
6 checks passed
4 participants