Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::span (and more) for Botan::RandomNumberGenerator #3195

Merged
merged 7 commits into from
Mar 22, 2023

Conversation

reneme
Copy link
Collaborator

@reneme reneme commented Jan 19, 2023

Open Things to Do in this Pull Request

  • Deprecate random_vec(std::span)?
    This replaced template<typename AllocT> random_vec(std::vector<uint8_t, AllocT>), so we can't just get rid of it. But it is equivalent to randomize(std::span) which has the better fitting name.
  • Deprecate template<typename T> random_vec(size_t) (and friends)?
    ... and rename it simply to template<typename T> random(size_t) (but that may be just a nit-pick).
  • Stateful_RNG::randomize_with_ts_input()
    Its implementation deviates from the base-class implementation (presumably to be more conservative with stateful RNGs). We have to find a way to incorporate that into the new sub-class API.

Description

This adapts the RNG's internal API to std::span<> so that subclasses do not need to provide or use ptr-length style method overloads. The class hierarchy's heavy lifting (i.e generating random bits and ingesting entropy bits) is unified into a single method fill_bytes_with_input(). All sub-classes need to implement this interface starting with Botan 3.0. Details here.

For most sub-classes this change applies nicely. Stateful_RNG needed some extra thought (and special handling) to ingest initial entropy (via add_entropy()) into its concrete implementations (ChaCha_RNG and HMAC_DRBG) in an implementation-agnostic manner. Along with that, an outstanding API inconsistency was fixed and nailed down with a test.


Old (rejected) Idea Description

This generalizes the API of Botan::RandomNumberGenerator to work with arbitrary contiguous byte-containers beyond just std::vector<> or C-style pointer-length crutches. Judging from the massive internal usage of the adapted APIs this seems to be backward-compatible change (especially regarding ::random_vec()).

Particularly, ::random_vec() should be useful within the library as well. Such as in #3150:

using Session_ID = Botan::Strong<std::vector<uint8_t>, struct Session_ID_>;

// generate a random TLS Session ID
const auto random_id = rng.random_vec<Session_ID>(32);

I opted to keep the old pointer-length APIs and add overloads using std::span<> for backward compatibility. Albeit convenient for legacy software, this forces sub-classes of RandomNumberGenerator to actively make those overloads available, when they override (say) randomize(uint8_t*, size_t) (see the adaptions to test_rng.h).

Hence, I'd like to make the argument to ditch the C-style APIs in favor of fully embracing std::span<>. Doing this consistently across the API is a huge break in compatibility. But I think it's worth considering.

For reference #752.

@codecov-commenter
Copy link

codecov-commenter commented Jan 19, 2023

Codecov Report

Patch coverage: 91.76% and project coverage change: +0.02 🎉

Comparison is base (a379e17) 88.17% compared to head (4e6c3e2) 88.20%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3195      +/-   ##
==========================================
+ Coverage   88.17%   88.20%   +0.02%     
==========================================
  Files         618      618              
  Lines       70379    70415      +36     
  Branches     6977     6985       +8     
==========================================
+ Hits        62057    62107      +50     
+ Misses       5391     5380      -11     
+ Partials     2931     2928       -3     
Impacted Files Coverage Δ
src/tests/test_rng.h 100.00% <ø> (ø)
src/lib/ffi/ffi_rng.cpp 78.12% <55.55%> (+1.56%) ⬆️
src/tests/test_rngs.cpp 74.35% <78.12%> (+1.38%) ⬆️
src/lib/rng/rng.cpp 89.47% <92.30%> (+45.02%) ⬆️
src/lib/rng/stateful_rng/stateful_rng.cpp 90.47% <93.75%> (-1.20%) ⬇️
src/tests/test_rng_behavior.cpp 82.79% <97.50%> (+1.50%) ⬆️
src/lib/prov/pkcs11/p11_randomgenerator.cpp 100.00% <100.00%> (ø)
src/lib/rng/auto_rng/auto_rng.cpp 94.44% <100.00%> (+4.97%) ⬆️
src/lib/rng/chacha_rng/chacha_rng.cpp 100.00% <100.00%> (ø)
src/lib/rng/hmac_drbg/hmac_drbg.cpp 100.00% <100.00%> (ø)
... and 4 more

... and 10 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@@ -254,7 +254,7 @@ class BOTAN_PUBLIC_API(2,0) RandomNumberGenerator
template<typename T = secure_vector<uint8_t>>
requires(concepts::contiguous_container<T> &&
concepts::resizable_container<T> &&
std::default_initializable<T> &&
concepts::default_initializable<T> &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we ll get there :-)

@randombit
Copy link
Owner

Hence, I'd like to make the argument to ditch the C-style APIs in favor of fully embracing std::span<>. Doing this consistently across the API is a huge break in compatibility. But I think it's worth considering.

I think this is a total non-starter. That would break every single application out there, probably piss a lot of people off, and anytime you're forced to modify your perfectly good application code to deal with a new library version that broke its API for no good reason, your mind will inevitably be drawn to the fact that if you're rewriting it anyway, you might as well pick a different library. I personally do not want to be responsible for any new code being written that uses OpenSSL.

I opted to keep the old pointer-length APIs and add overloads using std::span<> for backward compatibility.

This also seems to me to be not the correct approach. It only provides the benefits of span to our users (who may well not appreciate them!) while not providing us, as the library developers, any benefit at all.

Also I think there is an important difference here, wrt compatability:

  • I'm very much not ok with breaking totally typical application code that, say, instantiates a System_RNG and uses it for RNG-ish things.
  • I'm fine with us breaking code that derives a new instance of one of our base classes, because that is rare and you're probably doing something weird that is literally impossible with most other libraries, so you'll accept some breakage as part of the process.

I would sketch something like this instead (here just doing randomize but it extends naturally):

class RandomNumberGenerator
   {
   public:

   // no longer virtual
   // we could deprecate this eventually
   void randomize(uint8_t out[], size_t len)
     {
     this->fill_bytes(std::span<uint8_t>(out, len));
     }

    // This is what all subclasses of RandomNumberGenerator must implement in 3.0
    virtual void fill_bytes(std::span<uint8_t> output) = 0;
    ...

@randombit
Copy link
Owner

If we are revisiting how this base class works we could also unify randomize, add_entropy, and randomize_with_input with just a single call:

fill_bytes_with_input(span input, span output) where

randomize -> fill_bytes_with_input with an empty input buffer
add_entropy -> empty output buffer
randomize_with_input -> both spans (possibly) populated.

@reneme
Copy link
Collaborator Author

reneme commented Jan 20, 2023

I think this is a total non-starter.That would break every single application out there, [...]

It would break applications that use pointer-length-style APIs in a way that is easily and unambiguously fixable by using a modern idiom for it. Certainly far from a "rewrite" IMHO. I agree though that it might be too much of a burden on users, nevertheless. No hard feelings, I just found it worth discussing, despite being a radical proposal.

Re: compatibility of deriving classes vs. typical applications

Fair enough. That's a distinction that wasn't obvious to me, but that makes a whole lot of sense. I'll sketch out your proposal of replacing the virtual randomize*() and add_entropy() with fill_bytes_with_input(std::span).

@reneme
Copy link
Collaborator Author

reneme commented Jan 20, 2023

I converted this to a draft PR and split off #3201 that is technically unrelated from the changes to the internals of RandomNumberGenerator.

@randombit
Copy link
Owner

It would break applications that use pointer-length-style APIs in a way that is easily and unambiguously fixable by using a modern idiom for it. Certainly far from a "rewrite" IMHO.

I think now that std::span is available we should absolutely use it internally everywhere (eventually). I don't think it's that difficult to leave (as in my randomize example above) ptr/length APIs available for public compat even if we never use them internally ourselves.

I would envision a timeline for removing such APIs entirely as something like

  1. Full set of std::span based APIs introduced in 3.0, with appropriate ptr/length compat
  2. If not in 3.0, eventually all internals are rewritten to use span
  3. Some where along the way we deprecate the ptr/length versions
  4. In 4.0 the ptr/length versions are removed and everything is a span.

@reneme
Copy link
Collaborator Author

reneme commented Jan 20, 2023

I would envision a timeline for removing such APIs entirely as something like [...]

Sounds fair to me. I'll work on my patience regarding API changes, going forward. 😊

@reneme reneme force-pushed the feature/random_something branch from 1f577c0 to 2bce692 Compare February 17, 2023 16:29
@reneme
Copy link
Collaborator Author

reneme commented Feb 17, 2023

I finally came back to this and applied your suggestion across the code base. Most things fit nicely with this interface. The Stateful_RNG sub-tree needed a bit of mind-bending but the test coverage helps a lot here! Also AutoSeeded_RNG does some conservative mixing with a high-resolution timer which needs a thorough review to ensure I didn't screw something up.

The public API is backward-compatible, with all C-style methods now being accompanied py a std::span<> sibling and simply map to that one. Virtual methods only use std::span<>.

Note that the public randomize_with_input() and the virtual (protected) fill_bytes_with_input() are one and the same. Though, when adapting Buffered_Computation it came in handy that public and private-virtual interface was disjoint. So I kept this for good measure.

@reneme reneme marked this pull request as ready for review February 20, 2023 10:32
@reneme reneme force-pushed the feature/random_something branch from 89d8555 to bfa8c2a Compare February 20, 2023 13:55
@reneme reneme requested a review from lieser February 20, 2023 13:56
@reneme
Copy link
Collaborator Author

reneme commented Mar 1, 2023

@randombit

There's a gotcha with making randomize_with_ts_input() non-virtual. As @lieser pointed out Stateful_RNG used to override the default implementation of this method. In short, it uses the System_RNG additional to a high-res clock reading or, if System_RNG is not available, it incorporated another high-res clock reading and the current process ID.

To my mind, that is done to be more conservative in the stateful RNG case (particularly the process ID is likely meant to be a counter-measure to fork() duplicating the RNG state). Note that Stateful_RNG::reseed_check() has an explicit fork-detection.

Without randomize_with_ts_input() being virtual I don't see a way to keep this exact behavior, unfortunately. Except by moving it up into the base implementation of RandomNumberGenerator which might be somewhat overkill.

I see three ways forward:

  1. Move the conservative implementation up into the base implementation,
  2. Leave the randomize_with_ts_input() as virtual and override it in Stateful_RNG as before. Note that we need to keep the virtual overload dispatch issue in mind -- i.e. introduce a protected alias that gets overridden,
  3. Remove the conservative special case and rely on the reseed_check().

@reneme reneme force-pushed the feature/random_something branch from f2bb1db to 167d363 Compare March 2, 2023 07:34
@reneme reneme requested a review from lieser March 2, 2023 07:42
@reneme reneme force-pushed the feature/random_something branch from 167d363 to fff8248 Compare March 2, 2023 11:09
@reneme reneme force-pushed the feature/random_something branch from fff8248 to ea87e7e Compare March 2, 2023 15:06
@randombit
Copy link
Owner

@reneme regarding randomize_with_ts_input I definitely want to keep this as conservative as possible since userspace RNGs are prone to all kinds of fun problems and a bug here could be catastrophic. But I think just moving the conservative version from Stateful_RNG to the base class is fine.

reneme and others added 4 commits March 21, 2023 13:40
Implementations of the RandomNumberGenerator interface now implement
::fill_bytes_with_input(), only. The function can both query the RNG
and provide input, or both at the same time.

Also, internal APIs now use std::span instead of ptr+length style
methodes.

Co-Authored-By: Philippe Lieser <[email protected]>
Given a deterministically seeded HMAC_DRBG, it should generate the same output
stream regardless of requesting chunks (of size m_max_number_of_bytes_per_request)
or a bulk buffer (of multiples of m_max_number_of_bytes_per_request).

Co-Authored-By: Philippe Lieser <[email protected]>
@reneme reneme force-pushed the feature/random_something branch from ea87e7e to 91cf2a7 Compare March 21, 2023 13:47
@reneme reneme force-pushed the feature/random_something branch from 16d5978 to f351580 Compare March 21, 2023 15:25
Comment on lines +18 to 50
void RandomNumberGenerator::randomize_with_ts_input(std::span<uint8_t> output)
{
if(this->accepts_input())
{
/*
Form additional input which is provided to the PRNG implementation
to paramaterize the KDF output.
*/
uint8_t additional_input[16] = { 0 };
store_le(OS::get_system_timestamp_ns(), additional_input);
store_le(OS::get_high_resolution_clock(), additional_input + 8);
constexpr auto s_hd_clk = sizeof(decltype(OS::get_high_resolution_clock()));
constexpr auto s_sys_ts = sizeof(decltype(OS::get_system_timestamp_ns()));
constexpr auto s_pid = sizeof(decltype(OS::get_process_id()));

std::array<uint8_t, s_hd_clk + s_sys_ts + s_pid> additional_input = {0};
auto s_additional_input = std::span(additional_input.begin(), additional_input.end());

store_le(OS::get_high_resolution_clock(), s_additional_input.data());
s_additional_input = s_additional_input.subspan(s_hd_clk);

#if defined(BOTAN_HAS_SYSTEM_RNG)
System_RNG system_rng;
system_rng.randomize(s_additional_input);
#else
store_le(OS::get_system_timestamp_ns(), s_additional_input.data());
s_additional_input = s_additional_input.subspan(s_sys_ts);

this->randomize_with_input(output, output_len, additional_input, sizeof(additional_input));
store_le(OS::get_process_id(), s_additional_input.data());
#endif

this->fill_bytes_with_input(output, additional_input);
}
else
{
this->randomize(output, output_len);
this->fill_bytes_with_input(output, {});
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @randombit

The base implementation of RandomNumberGenerator::randomize_with_ts_input() now adopted the behavior of Stateful_RNG (as discussed here and here). Additionally, this method is not virtual any longer: I.e. all RNGs will inherit exactly this implementation.

There's one caveat: Naturally, also the System_RNG is a subclass of RandomNumberGenerator. In other words: if the platform's RNG reports that it can accept input, it ends up seeding itself. Technically, that shouldn't be problematic though its worth noting.

There shouldn't be a chance of this becoming an endless loop either. system_rng.randomize() will resolve to System_RNG::fill_bytes_with_input() (without actual input). By design, this is the most low-level method that will not call back into higher-level members (like randomize_with_ts_input() to actually create an endless recursion).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems OK to me. (As long as there is no infinite recursions! which seems we are safe here.) Typically one would not use randomize_with_ts_input on a system RNG but rather only a userspace RNG. Also at this point only a few systems (if any?) use the /dev/urandom based interface anymore, which is the only System_RNG instance that supports writing at all.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could consider final overriding System_RNG::accepts_input() and let it return false to be explicit about that. But that can be done in a follow-up.

@reneme reneme force-pushed the feature/random_something branch from d81d301 to 5a3bf9f Compare March 21, 2023 15:57
@reneme reneme force-pushed the feature/random_something branch from 5a3bf9f to 4e6c3e2 Compare March 21, 2023 16:01
Comment on lines +75 to +79
void initialize_with(std::span<const uint8_t> input);
void initialize_with(const uint8_t input[], size_t length)
{
this->initialize_with(std::span(input, length));
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General remark (out of scope of this pull request):

The doxygen string says:

      /**
      * Consume this input and mark the RNG as initialized regardless
      * of the length of the input or the current seeded state of
      * the RNG.
      */

Assuming that "mark the RNG as initialized" means "considering the RNG as seeded": I don't think that is the case in general!

The input will eventually flow down to the following code:

this->update(input);

if(8*input.size() >= security_level())
   {
   reset_reseed_counter();
   }

Which will reset the reseed counter only if the input was long enough. This contradicts with "regardless of the length of the input" in the doxygen comment above. I.e. stateful_rng->initialize_with("baadcafe") wouldn't result in a seeded stateful_rng object.

@randombit I think we should consider fixing this.

However: The behavior was the same before the refactoring, so I consider that out-of-scope.

@reneme reneme merged commit 88eaa00 into master Mar 22, 2023
@randombit randombit deleted the feature/random_something branch March 22, 2023 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants