Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkcs7+der: mixed BER/DER encoding with undefined length #779

Open
smndtrl opened this issue Nov 30, 2022 · 18 comments
Open

pkcs7+der: mixed BER/DER encoding with undefined length #779

smndtrl opened this issue Nov 30, 2022 · 18 comments

Comments

@smndtrl
Copy link
Contributor

smndtrl commented Nov 30, 2022

Hi,

while looking into the pkcs7 crate for a CMS usecase around the Apple world I discovered that their detached signatures use the BER indefinite length encoding for some of the SEQUENCE which of course is not supported by the der crate.

Has there been any thoughts/discussions around if that's something the formats repo should address or if that is out of scope (for now) and left to other crates.

I started with adding the signed-data content to pkcs7 validating against a DER encoded signature and discovered the problem afterwards with the ones Apple generates :( Link to example

Simon

@tarcieri
Copy link
Member

We've talked about introducing a lax mode to the der crate which supports a limited number of BER productions necessary to interop with real-world use cases like this. So far there hasn't been one until now.

@smndtrl
Copy link
Contributor Author

smndtrl commented Dec 19, 2022

I gave it a shot with my limited experience with Rust and also this codebase. It is meant as a starting point.

I used it successfully with changes to pkcs7/CMS crate that are not yet ready/

@tarcieri
Copy link
Member

tarcieri commented Dec 19, 2022

@smndtrl rather than trying to change how the existing parsing works in the der crate, you can write your own "lax" implementation of Decode for a specific type in the pkcs7 crate

@smndtrl
Copy link
Contributor Author

smndtrl commented Dec 20, 2022

@tarcieri Wouldn't that require me changing how DecodeValue is implemented on Sequence? My PKCS7 data starts with 30 80 and that already makes the Length::decode fail without reaching the first DecodeValue code for ContentInfo in pkcs7.

For making indefinite length work a solution for Length and Readers is required. The ASN1 I encountered had indefinite lengths for SEQUENCE and OCTET STRING but also can be other fields that can be constructed in BER. For OctetStringRef I'm not sure how to handle those cases as it then is non-contiguous memory (or at least broken up with Tags/Lengths values. For Sequence it via the Reader.

I'm fine with carrying this patch on my own as it works well for my use case. I might revisit the suggested approach later on.

@tarcieri
Copy link
Member

@smndtrl you'll need to write all the impls by hand if you go that route, and you will have to avoid impl'ing one of DecodeValue or FixedTag (I would suggest DecodeValue unless it needs to be in an IMPLICIT and context-specific), as otherwise it will receive a blanket impl of Decode.

If you can't figure it out I can take a look at some point based on your lax branch. It should be possible to impl the "lax" logic for just a single type.

@bkstein
Copy link
Contributor

bkstein commented Aug 10, 2023

We've talked about introducing a lax mode to the der crate which supports a limited number of BER productions necessary to interop with real-world use cases like this. So far there hasn't been one until now.

We've found a real-world use case: EJBCA sends (degenerate certificates-only) CMS messages encoded with BER. Unfortunately, that's in line with RFC 5652:

   ...
   The CMS values are generated using ASN.1 [X.208-88], using BER-
   encoding (Basic Encoding Rules) [X.209-88].

@carl-wallace
Copy link
Contributor

BER was an issue working on PKCS #12 as well. Exports from Firefox are BER. In a cert collection I use for testing, there are several several AIAs hosting P7s that are BER encoded.

@tarcieri
Copy link
Member

We're aware this is a real issue. It's something I've been meaning to address.

@tarcieri
Copy link
Member

I think we can probably use an approach like #810 but we need some way to control the granularity so enabling it doesn't cause der to revert to BER parsing globally

@bkstein
Copy link
Contributor

bkstein commented Nov 13, 2023

FYI, I have implemented a simple parser/converter (Berder), that replaces all indefinite length specifications in a BER message by definite lengths. The parser recurses through the complete message (though there is a limit for recursion depth 🙂). It doesn't fix ordering, but as RustCrypto's der crate fixes it, I think it should work reliably. I tested it against EJBCA messages. The usage is simple. Just prepare your BER data before you try to parse it, e.g.:

    ...
    let der = Berder::ber_to_der(ber.as_slice())?; // Indefinite -> definite lengths
    let ci = ContentInfo::from_der(der.as_slice())?;
    let pki_message = SignedData::from_der(ci.content.to_der()?;

This could be implemented as Decode::from_ber(), but could also be integrated into Decode::from_der(). However, the latter would introduce some performance loss.
What do you think?

@tarcieri
Copy link
Member

tarcieri commented Nov 13, 2023

I think we can probably implement Decode::from_ber in a way that doesn't require transcoding the entire document to DER first, using an approach similar to #810 (we have since added an IndefiniteLength type), and perhaps carrying the flag for using BER-like rules on the Reader, for lack of a better option.

@bkstein
Copy link
Contributor

bkstein commented Nov 14, 2023

I think we can probably implement Decode::from_ber in a way that doesn't require transcoding the entire document to DER first, using an approach similar to #810 (we have since added an IndefiniteLength type), and perhaps carrying the flag for using BER-like rules on the Reader, for lack of a better option.

@tarcieri I have started doing that based on #810. I think I got pretty far with this implementation, but there are one or two issues left (correct handling of the end-of-content markers). The main idea was, to modify the der parser to do look-ahead evaluation of indefinite lengths. Whenever the parser encounters an indefinite length, it stores the current position and makes a "fast-forward" until the end of the current TLV is reached. Then, the length is "definite"ly known and we can resume parsing with the definite length at the stored position. This look-ahead must be recursive, even if e.g. an OctetString is parsed, because it might contain other (nested) TLVs with indefinite lengths.
I have uploaded this to my repo, if you want to have a look at it.
And as you mentioned IndefiniteLength: I changed der::Header:

pub struct Header {
    /// Tag representing the type of the encoded value
    pub tag: Tag,

    /// Length of the encoded value
    pub length: IndefiniteLength,
}

This implementation is not complete, but I think it is already more complete, than the current #810.

@tarcieri
Copy link
Member

@bkstein I'd like to avoid making changes to the existing APIs like that. The idea would be to use IndefiniteLength as an encoding type to parse lengths in "BER mode", but then convert back to a Length in the public API.

However, I'd definitely like to also preserve the way the existing implementation works so indefinite lengths are rejected when parsing DER.

@bkstein
Copy link
Contributor

bkstein commented Nov 16, 2023

Yes, the Header change would break compatibility. I saw, that signature's crates dsa and ecdsa access Header::length, so this would actually affect compatibility here.
However, I introduced a flag is_parsing_ber in Reader. Is that approximately, what you intend with "BER mode"? In "DER mode", parsing is unchanged.

@tarcieri
Copy link
Member

I was thinking an enum like Encoding::Ber which would improve clarity, I think, but yes that's the basic idea

tarcieri added a commit that referenced this issue Jan 9, 2024
Adds an enum with `Ber` and `Der` (default) variants which can be used
to selectively allow a limited number of BER productions when decoding
certain BER-based security-oriented formats, e.g. CMS, PKCS#8.

Currently this doesn't actually do anything, however the goal is to
address #779, where we can't decode CMS generated by Apple tooling.

PR #810 is an example of how the rules could be relaxed to support
`IndefiniteLength`s.
@tarcieri
Copy link
Member

tarcieri commented Jan 9, 2024

I've bumped der's version to v0.8.0-pre.0 so we're now allowed to make breaking changes, which should make this easier.

I opened #1321 which adds an EncodingRules enum with Ber and Der variants, as well as a Reader::encoding_rules method to interrogate them.

tarcieri added a commit that referenced this issue Jan 9, 2024
Adds an enum with `Ber` and `Der` (default) variants which can be used
to selectively allow a limited number of BER productions when decoding
certain BER-based security-oriented formats, e.g. CMS, PKCS#8.

Currently this doesn't actually do anything, however the goal is to
address #779, where we can't decode CMS generated by Apple tooling.

PR #810 is an example of how the rules could be relaxed to support
`IndefiniteLength`s.
@Firstyear
Copy link

@tarcieri For reference, we were interested in the BER capability as we want to unify our LDAP and Kerberos parsers on one trusted encoding/decoding stack. Currently we have a "ber only" parser, but we would like to move to this crate. I'll do some testing soon to see what we need for extending this to support BER in our cases.

@tarcieri tarcieri changed the title pkcs7, der: mixed BER/DER encoding with undefined length pkcs7+der: mixed BER/DER encoding with undefined length Aug 18, 2024
dwhjames added a commit to dwhjames/RustCrypto-formats that referenced this issue Feb 9, 2025
Issue RustCrypto#779 has had some process (PR RustCrypto#1321). However, this helper
function is intended to provide an escape hatch for limited support
of some cases of BER, while not waiting for general support or needing
API changes.

Specifically, the transcoding of occurrences of the constructed,
indefinite-length method into the constructed, definite-length method.
This is likely sufficient to address the examples in the wild, reported
in Issue RustCrypto#779 and elsewhere.

The goal is not to support all possible violations of DER that are still
following valid BER. Examples are non-canonical encodings of lengths
(which would require an alternative `Length`), or constructed string
types (which could be handled by further work on this function, but
usage may not be well motivated).
@dwhjames
Copy link
Contributor

dwhjames commented Feb 9, 2025

@tarcieri,

I hit this issue.

I did see

And I read through prior attempts of @smndtrl and @bkstein

I spent some time searching for additive-only changes, and even with, PR #1321, I could not see a path.

My impression from the codebase was that there would need to be breaking changes, and one possible path might look like,

  1. using IndefiniteLength rather than Length within Header,
  2. passing Header rather than Length, to function read_nested of trait Reader,

so that the existing API and usage doesn’t change dramatically, but so that nested readers can handle the constructed, indefinite-length method.


As I need a solution in the short term, I had to go with the ‘transcoding’ approach that was discussed above.

The following PR is me contributing that code back, in case the maintainers are open to accepting this into the crate in the near term (at least until more general solutions are developed). Or failing that, maybe it will be of use to others that find this issue while it remains open.

dwhjames added a commit to dwhjames/RustCrypto-formats that referenced this issue Feb 9, 2025
Issue RustCrypto#779 has had some process (PR RustCrypto#1321). However, this helper
function is intended to provide an escape hatch for limited support
of some cases of BER, while not waiting for general support or needing
API changes.

Specifically, the transcoding of occurrences of the constructed,
indefinite-length method into the constructed, definite-length method.
This is likely sufficient to address the examples in the wild, reported
in Issue RustCrypto#779 and elsewhere.

The goal is not to support all possible violations of DER that are still
following valid BER. Examples are non-canonical encodings of lengths
(which would require an alternative `Length`), or constructed string
types (which could be handled by further work on this function, but
usage may not be well motivated).
dwhjames added a commit to dwhjames/RustCrypto-formats that referenced this issue Feb 9, 2025
Issue RustCrypto#779 has had some progress (PR RustCrypto#1321). However, this helper
function is intended to provide an escape hatch for limited support
of some cases of BER, while not waiting for general support or needing
API changes.

Specifically, the transcoding of occurrences of the constructed,
indefinite-length method into the constructed, definite-length method.
This is likely sufficient to address the examples in the wild, reported
in Issue RustCrypto#779 and elsewhere.

The goal is not to support all possible violations of DER that are still
following valid BER. Examples are non-canonical encodings of lengths
(which would require an alternative `Length`), or constructed string
types (which could be handled by further work on this function, but
usage may not be well motivated).
dwhjames added a commit to dwhjames/RustCrypto-formats that referenced this issue Feb 9, 2025
Issue RustCrypto#779 has had some progress (PR RustCrypto#1321). However, this helper
function is intended to provide an escape hatch for limited support
of some cases of BER, while not waiting for general support or needing
API changes.

Specifically, the transcoding of occurrences of the constructed,
indefinite-length method into the constructed, definite-length method.
This is likely sufficient to address the examples in the wild, reported
in Issue RustCrypto#779 and elsewhere.

The goal is not to support all possible violations of DER that are still
following valid BER. Examples are non-canonical encodings of lengths
(which would require an alternative `Length`), or constructed string
types (which could be handled by further work on this function, but
usage may not be well motivated).
PandasAreBears added a commit to PandasAreBears/macho2 that referenced this issue Feb 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants