refactored SHA-3 in terms of newly added keccak-FIPS; added KMAC #3525

falko-strenzke · 2023-04-26T16:19:27Z

Adding only KMAC256 for now. For adding KMAC128, a generic base should be created for KMAC and then both variants derived from it.

coveralls · 2023-04-26T16:58:49Z

Coverage: 91.714% (-0.4%) from 92.148% when pulling 1e8f985 on falko-strenzke:kmac_04_23 into b6e705f on randombit:master.

randombit · 2023-04-26T21:20:18Z

src/lib/hash/hash.cpp

@@ -251,6 +255,18 @@ std::unique_ptr<HashFunction> HashFunction::create(std::string_view algo_spec,
      }
 #endif

+
+#if defined(BOTAN_HAS_KECCAK_FIPS)
+   if(req.algo_name() == "Keccak-FIPS[512]")


As I said in the ticket, I really dislike this approach because the Keccak being exposed here is not, as I understand it, approved by NIST as a hash function, nor is it implemented by any library I am aware of.

Agreed, that makes sense. I removed it from the external interface.

reneme

I like the overall direction of this. It certainly takes the internal implementation closer to the specs and makes it easier to understand. 👍
I've added a few high-level remarks regarding the implementation strategy.

Maybe related: With several PQC algorithms using SHAKE as XOF, it might be worth investing in a more generic interface for the extendable output flavours as well.

reneme · 2023-05-03T13:05:07Z

src/lib/hash/keccak_fips/keccak_fips.h

+/**
+* KECCAK FIPS
+*/


It would be great to condense the discussion about the two variants of Keccak in #3279 into some clarifying Doxygen remarks here and in keccak.h. I'm sure future users and contributors would be grateful. 😄

I added clarifying documentation in keccak_fips.h which comes in the next commit. The problem with keccak.h is that I am not really sure what it implements. I suspect it is what is implemented in "KECCAK - Final Algorithm Package" at https://csrc.nist.gov/projects/hash-functions/sha-3-project – but I didn't verify. It might be worthwhile to do so, but currently I don't have the time for it nor does this have priority in our current project.

reneme · 2023-05-03T13:20:19Z

src/lib/hash/keccak_fips/keccak_fips.h

+* KECCAK FIPS
+*/
+
+class Keccak_FIPS_generic : public HashFunction


Generic idea to consider: Would it make sense to define this as a template with the capacity being a value template parameter? Downstream users of the static building blocks could then do Keccak_FIPS_256::absorb() without explicitly stating the bitrate every time.

Not sure whether that's at all feasible, but maybe it's worth a try in a follow-up.

I had the same thought but refrained from trying it right now because I wasn't sure how much code would then have to go into the header and what the consequences of that would be. Right now I think it wouldn't be a problem, as the header is not part of the public API. Making the capacity (and maybe other parameters) template parameters may improve the readability of the code and, at least potentially, increase performance.

However I would tend to first have the basic implementation merged and then address this later like you are proposing as well.
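
For illustration, the template idea discussed above could look roughly like the following sketch. Everything in it is an assumption made for the example: `Toy_Keccak`, `toy_permute` and `permute_calls` are hypothetical names, and the permutation is a stub that only counts how often it is called instead of running Keccak-f[1600]. The one point it tries to show is that the bitrate follows from the capacity at compile time (1600 minus the capacity), so callers no longer pass it on every call.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Call counter for the stub permutation below (illustration only).
inline std::size_t& permute_calls() {
   static std::size_t calls = 0;
   return calls;
}

// Hypothetical stand-in for the real Keccak-f[1600] permutation.
// It leaves the state untouched and only records that it was called.
inline void toy_permute(std::array<std::uint8_t, 200>& state) {
   (void)state;
   ++permute_calls();
}

// Hypothetical template: the capacity (in bits) is a compile-time value,
// so the bitrate does not have to be passed to every call.
template <std::size_t CapacityBits>
struct Toy_Keccak {
   static_assert(CapacityBits % 8 == 0 && CapacityBits > 0 && CapacityBits < 1600,
                 "capacity must be a whole number of bytes below the state size");

   static constexpr std::size_t bitrate = 1600 - CapacityBits;
   static constexpr std::size_t rate_bytes = bitrate / 8;

   // XORs the input into the rate portion of the state and permutes whenever
   // a full block has been absorbed. Returns the new position in the block.
   static std::size_t absorb(std::array<std::uint8_t, 200>& state, std::size_t pos,
                             const std::uint8_t input[], std::size_t length) {
      assert(pos < rate_bytes);
      for(std::size_t i = 0; i != length; ++i) {
         state[pos++] ^= input[i];
         if(pos == rate_bytes) {
            toy_permute(state);
            pos = 0;
         }
      }
      return pos;
   }
};

// SHAKE-128 uses capacity 256 (bitrate 1344); SHA-3-256 uses capacity 512.
using Toy_Keccak_c256 = Toy_Keccak<256>;
using Toy_Keccak_c512 = Toy_Keccak<512>;
```

With such aliases a call site would read `Toy_Keccak_c256::absorb(state, pos, input, length)`. The price is that the implementation has to live in a header, which is the concern raised above.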

reneme · 2023-05-03T13:26:05Z

src/lib/hash/sha3/sha3.h

-      std::unique_ptr<HashFunction> new_object() const override;
-      std::unique_ptr<HashFunction> copy_state() const override;


I believe those should have overloads in this class. It's somewhat hidden behind the polymorphy of HashFunction but as a user I would expect code along those lines to work:

SHA_3 sha(256);
auto other_sha = sha.new_object();
// this will fail, because `other_sha` is actually an instance of `Keccak_FIPS`
// and not of `SHA_3` (which I would have expected)
SHA_3* other_sha_ptr = dynamic_cast<SHA_3*>(other_sha.get());

If I understand correctly these overloads would be needed in all 4 base classes SHA_3_224, ... Can you confirm this?

Yes, indeed. So that was actually an issue with the existing code already, no?
Come to think of it: sha3.h is an internal header, so applications don't actually have access to the SHA_3_* types and wouldn't be able to do the dynamic_cast as I suggested.

Still it is a code smell, in my opinion. @randombit what's your take on that?

Maybe CRTP could be helpful here, if the SHA_3 base class is not meant to be used as an interface by consuming code. It could potentially even be applied to Keccak_FIPS at one point.

A sketch for reference:

template <class DerivedT>
class SHA_3 : public Keccak_FIPS_generic
   {
   public:
      SHA_3(size_t output_bits) : Keccak_FIPS_generic(...) {}

      std::unique_ptr<HashFunction> new_object() const override
         { return std::make_unique<DerivedT>(...); }

      std::unique_ptr<HashFunction> copy_state() const override
         { return std::make_unique<DerivedT>(...); }
   };

class SHA_3_224 : public SHA_3<SHA_3_224> {};
class SHA_3_256 : public SHA_3<SHA_3_256> {};
// ...

This seems enormously complicated compared to just using ownership instead of inheritance.

SHA-3 is a hash function.
SHA-3 is not "a Keccak"

This issue is unrelated to the inheritance of Keccak, though.

My mental model of ::new_object() is that it returns an instance of the very same hash function class. The fact that SHA_3 is not marked final but has sub-classes (SHA_3_224, SHA_3_256, ...) violates that mental model in the current implementation.

It's really just a nitpick, but it struck me as unusual. Internal code might do something like that:

std::unique_ptr<HashFunction> sha3_256 = std::make_unique<SHA_3_256>();
std::unique_ptr<HashFunction> bare_sha3 = sha3_256->new_object();

Now, both objects will behave exactly the same. But they won't be instances of exactly the same class. While sha3_256 is an instance of the class SHA_3_256, bare_sha3 will be an instance of SHA_3.

Applications will always go through HashFunction::create("SHA-3(256)") and therefore get an instance of SHA_3 (without the _256 suffix).

Again: just a minor code smell, if at all.
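
The mismatch described above can be reproduced with a small self-contained sketch. The classes below are toy stand-ins that only borrow the names from this thread; they are not Botan's real declarations, and `SHA_3_256_fixed` and `new_object_is` are hypothetical additions for the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Toy stand-ins that only mirror the class names from the thread.
class HashFunction {
   public:
      virtual ~HashFunction() = default;
      virtual std::unique_ptr<HashFunction> new_object() const = 0;
};

class SHA_3 : public HashFunction {
   public:
      explicit SHA_3(std::size_t output_bits) : m_output_bits(output_bits) {}

      // Always constructs the base type, even when called on a subclass.
      std::unique_ptr<HashFunction> new_object() const override {
         return std::make_unique<SHA_3>(m_output_bits);
      }

      std::size_t output_bits() const { return m_output_bits; }

   private:
      std::size_t m_output_bits;
};

// Inherits new_object() unchanged, so it hands out plain SHA_3 objects.
class SHA_3_256 final : public SHA_3 {
   public:
      SHA_3_256() : SHA_3(256) {}
};

// Hypothetical variant that overrides new_object() to preserve its own type.
class SHA_3_256_fixed final : public SHA_3 {
   public:
      SHA_3_256_fixed() : SHA_3(256) {}

      std::unique_ptr<HashFunction> new_object() const override {
         return std::make_unique<SHA_3_256_fixed>();
      }
};

// True if new_object() called on `h` yields an object whose dynamic type
// is T or derived from T.
template <typename T>
bool new_object_is(const HashFunction& h) {
   const auto fresh = h.new_object();
   assert(fresh != nullptr);
   return dynamic_cast<const T*>(fresh.get()) != nullptr;
}
```

In this sketch `new_object_is<SHA_3_256>(SHA_3_256{})` is false while `new_object_is<SHA_3_256_fixed>(SHA_3_256_fixed{})` is true. Both objects would still hash identically, which is why the point is only a code smell.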

reneme · 2023-05-03T13:30:35Z

src/lib/hash/shake/shake.cpp

@@ -45,13 +45,13 @@ void SHAKE_128::clear()

 void SHAKE_128::add_data(const uint8_t input[], size_t length)
   {
-   m_S_pos = SHA_3::absorb(SHAKE_128_BITRATE, m_S, m_S_pos, input, length);
+   m_S_pos = Keccak_FIPS_generic::absorb(SHAKE_128_BITRATE, m_S, m_S_pos, input, length);


Referring to the idea of making Keccak_FIPS_generic a template (with the capacity being a value template-parameter), this might be written as:

Keccak_FIPS_128::absorb(m_S, m_S_pos, input, length);

which I would find desirable. But as stated there: certainly okay to look at separately as a follow-up.

randombit · 2023-05-05T13:42:39Z

src/lib/hash/sha3/sha3.h

 #include <string>

 namespace Botan {

 /**
 * SHA-3
 */
-class SHA_3 : public HashFunction
+
+class SHA_3 : public Keccak_FIPS


Inheritance here seems just wrong.

TBH I do not understand at all the motivation for moving the SHA-3 permutation logic, nor for how Keccak_FIPS simplifies anything.

randombit

I really dislike adding Keccak as a HashFunction. It is at best an implementation detail. If you really feel it is helpful, define it within kmac.cpp, in terms of the functions on SHA_3.

Considering this patch is net +500 lines of code, I am not even seeing how this approach simplifies anything.

randombit · 2023-05-05T13:46:11Z

src/lib/mac/kmac/kmac.cpp

+   {
+   // KMAC supports key lengths from zero up to 2²⁰⁴⁰ (2^(2040)) bits
+   // https://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-185.pdf#page=28
+   return Key_Length_Specification(0, std::numeric_limits<size_t>::max());


There is no practical reason to support keys larger than 512 bits.

This will just cause weird confusion (and out of memory errors) when someone tries to use mac->maximum_keylength()

I see the potential problem, but I think this has some implications. KMAC may be used in arbitrary constructions. Look, for instance, at how the NIST KMAC-based key derivation functions are specified. There, a 'salt' value is used as the key. A salt value might be chosen based on very different considerations than a key. Accordingly, the Botan implementation may fail to support certain constructions if it limits the key size to 512 bits. But I completely agree that the integer maximum is a bad idea. Maybe we can find a reasonable value that accounts for other uses to a certain degree but also avoids an excessive memory allocation?

If 3.0 weren't out yet, I would have suggested making the maximum value a std::optional<> to clearly communicate "no relevant upper limit in practice".

Maybe provide a "smallish" upper bound but allow using KMAC with larger keys? This would require a way to pass Symmetric_Algorithm::valid_keylength() for larger keys than Key_Length_Spec advertises in the particular case of KMAC. That makes me feel uneasy as well. 🤔
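
To make the std::optional<> idea above concrete, here is a minimal sketch. `Toy_Key_Length_Bounds` and `allocation_hint` are hypothetical names invented for this example and are not part of Botan's Key_Length_Specification; lengths are assumed to be in bytes, so the 512-bit limit mentioned above corresponds to 64.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical type, not Botan's Key_Length_Specification: the upper bound
// is optional so that "no practical limit" is not encoded as SIZE_MAX.
class Toy_Key_Length_Bounds {
   public:
      Toy_Key_Length_Bounds(std::size_t min_len, std::optional<std::size_t> max_len) :
            m_min(min_len), m_max(max_len) {
         assert(!m_max.has_value() || m_min <= *m_max);
      }

      std::size_t minimum_keylength() const { return m_min; }

      std::optional<std::size_t> maximum_keylength() const { return m_max; }

      bool valid_keylength(std::size_t length) const {
         return length >= m_min && (!m_max.has_value() || length <= *m_max);
      }

      // Size a caller could allocate when it wants "the largest key":
      // the smaller of the real maximum (if any) and a caller-chosen cap.
      std::size_t allocation_hint(std::size_t cap) const {
         return (m_max.has_value() && *m_max < cap) ? *m_max : cap;
      }

   private:
      std::size_t m_min;
      std::optional<std::size_t> m_max;
};
```

A caller that previously wrote `std::vector<uint8_t> key(mac->maximum_keylength())` would be forced to handle the empty optional explicitly, which avoids the out-of-memory surprise without rejecting long salts.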

reneme · 2023-05-05T16:28:57Z

I really dislike adding Keccak as a HashFunction. It is at best an implementation detail.

Mhh, I think I get your point. Keccak is a construction that is "more" than just a HashFunction. Composition (instead of inheritance) better reflects the fact that Keccak is merely used as a building block for the HashFunction that SHA-3 is. Similarly, it could be used to build a SHAKE XOF, or a KMAC message authentication code.

To me Keccak_FIPS seemed equivalent to the MDx_HashFunction concept in mdx_hash.h or (with a slight stretch) similar to the Block_Cipher_Fixed_Params templated base class.

My main source of confusion is that these building blocks (absorb, permute, ...) are currently in SHA_3.

falko-strenzke · 2023-05-08T05:36:48Z

I really dislike adding Keccak as a HashFunction. It is at best an implementation detail. If you really feel it is helpful, define it within kmac.cpp, in terms of the functions on SHA_3.

I think I explained at length how the changes simplify the code and make it clearer in #3279. I am not going to repeat this here. As long as there are no arguments negating the ones I've given, I don't expect I will change my mind.

Considering this patch is net +500 lines of code, I am not even seeing how this approach simplifies anything.

The number of changed lines is a mere artifact due to the limitations of git. I moved the entire SHA-3 code to Keccak_FIPS with git mv. But in a pull request, git cannot reflect that. Further, I fail to see how the number of changed lines should be a reliable measure of the degree to which the change simplifies the code. As the initial state of the code may be arbitrary, arbitrary (-ily large) changes may be necessary to optimize it.

falko-strenzke · 2023-05-08T06:09:48Z

I really dislike adding Keccak as a HashFunction. It is at best an implementation detail.

Mhh, I think I get your point. Keccak is a construction that is "more" than just a HashFunction. Composition (instead of inheritance) better reflects the fact that Keccak is merely used as a building block for the HashFunction that SHA-3 is. Similarly, it could be used to build a SHAKE XOF, or a KMAC message authentication code.

To me Keccak_FIPS seemed equivalent to the MDx_HashFunction concept in mdx_hash.h or (with a slight stretch) similar to the Block_Cipher_Fixed_Params templated base class.

My main source of confusion is that these building blocks (absorb, permute, ...) are currently in SHA_3.

I think the idea that supports using Keccak_FIPS as a base class is pretty straightforward: Keccak[c] is the base for all Keccak[c]-derived functions such as SHA-3, KMAC, etc. Thus, if we relabeled it Keccak_FIPS_base, it would become clear enough that inheritance makes sense. However, I don't mind making it a member, if that is preferred for some reason. Since I understand that is what both of you prefer, I suggest making it a member. In that case, the discussion about turning Keccak_FIPS into a template is probably no longer relevant, as the template would no longer make the code more readable and the expected performance gain from some compile-time known values will be negligible.

falko-strenzke · 2023-05-09T08:22:15Z

I really dislike adding Keccak as a HashFunction. It is at best an implementation detail. If you really feel it is helpful, define it within kmac.cpp, in terms of the functions on SHA_3.

Looking at mdx_hash.h I find

/**
* MDx Hash Function Base Class
*/
class MDx_HashFunction : public HashFunction
[...]

Wouldn't it be straightforward to use the same approach for Keccak[c], i.e. use Keccak_FIPS as the base class for all hash functions of this type? It may very well be a purely virtual class. Inside KMAC, we would then instantiate a derived class, which creates the specific hash function used by KMAC.

If this is not desired, then I would be interested to learn the difference between the two patterns (inheritance for MDx hash but not for Keccak[c]-based hash).

reneme · 2023-05-10T06:34:32Z

If this is not desired, then I would be interested to learn the difference between the two patterns (inheritance for MDx hash but not for Keccak[c]-based hash).

Personally I can live with that. But may I suggest a hybrid approach?

1. Keccak[c] as a free-standing entity (that is not bound to the interface or inheritance hierarchy of Hash_Function)
2. A Keccak_HashFunction that condenses the common concerns of such hashes and acts as a base for SHA_3 and KMAC.

The free-standing base implementation could then be used as an implementation detail of other Keccak-based constructions (like a XOF based on SHAKE for example). As far as I can tell, the Hash_Function interface isn't necessarily fit for that use case.

What I have in mind as the "free-standing" portion are likely just the existing SHA_3::absorb, ::extend, ... functions. Just combined under a different (more generic) name.
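
As a rough sketch of how a free-standing building block could be used through ownership rather than inheritance: all names below (`Toy_Sponge`, `Toy_HashFunction`, `Toy_SHA_3`) are hypothetical, and the mixing step is an FNV-1a style stand-in that is NOT Keccak and offers no security. Only the shape matters here: the building block knows nothing about the hash interface, and the hash class owns it as a member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Free-standing building block, not tied to any hash interface.
// The mixing is a toy stand-in (FNV-1a style), NOT the Keccak permutation.
class Toy_Sponge {
   public:
      explicit Toy_Sponge(std::size_t capacity_bits) : m_capacity_bits(capacity_bits) {}

      void absorb(const std::uint8_t input[], std::size_t length) {
         for(std::size_t i = 0; i != length; ++i) {
            m_acc = (m_acc ^ input[i]) * 0x100000001B3ULL;
         }
      }

      // Produces an arbitrary number of output bytes, as a XOF would.
      std::vector<std::uint8_t> squeeze(std::size_t output_bytes) const {
         std::vector<std::uint8_t> out(output_bytes);
         std::uint64_t s = m_acc ^ m_capacity_bits;
         for(auto& b : out) {
            s = (s ^ 0xFF) * 0x100000001B3ULL;
            b = static_cast<std::uint8_t>(s >> 56);
         }
         return out;
      }

      void clear() { m_acc = 0xCBF29CE484222325ULL; }

   private:
      std::size_t m_capacity_bits;
      std::uint64_t m_acc = 0xCBF29CE484222325ULL;
};

// Interface only: inheritance expresses the API contract, nothing else.
class Toy_HashFunction {
   public:
      virtual ~Toy_HashFunction() = default;
      virtual void update(const std::uint8_t input[], std::size_t length) = 0;
      virtual std::vector<std::uint8_t> final_result() = 0;
      virtual std::size_t output_length() const = 0;
};

// The concrete hash owns the building block as a member.
class Toy_SHA_3 final : public Toy_HashFunction {
   public:
      // SHA-3 uses a capacity of twice the output length.
      explicit Toy_SHA_3(std::size_t output_bits) :
            m_output_bits(output_bits), m_sponge(2 * output_bits) {
         assert(output_bits % 8 == 0);
      }

      void update(const std::uint8_t input[], std::size_t length) override {
         m_sponge.absorb(input, length);
      }

      std::vector<std::uint8_t> final_result() override {
         auto out = m_sponge.squeeze(output_length());
         m_sponge.clear();  // ready for the next message
         return out;
      }

      std::size_t output_length() const override { return m_output_bits / 8; }

   private:
      std::size_t m_output_bits;
      Toy_Sponge m_sponge;
};
```

A SHAKE-style XOF or a MAC could hold the same `Toy_Sponge` member and call `squeeze()` with a caller-chosen length, without ever touching the hash interface.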

falko-strenzke · 2023-05-10T06:37:39Z

If this is not desired, then I would be interested to learn the difference between the two patterns (inheritance for MDx hash but not for Keccak[c]-based hash).

Personally I can live with that. But may I suggest a hybrid approach?
1. `Keccak[c]` as a free-standing entity (that is not bound to the interface or inheritance hierarchy of `Hash_Function`)

2. A `Keccak_HashFunction` that condenses the common concerns of such hashes and acts as a base for `SHA_3` and `KMAC`.
The free-standing base implementation could then be used as an implementation detail of other Keccak-based constructions (like a XOF based on SHAKE for example). As far as I can tell, the Hash_Function interface isn't necessarily fit for that use case.

That makes sense to me. @randombit Do you also agree to this approach?

randombit · 2023-05-11T11:09:15Z

Re MDx_HashFunction: don't read too much into it. That's not how it should be; it is just a refactoring in waiting. Thankfully, now that all of the implementations are hidden, removing it is even possible in a release branch. In general, at this point I'm quite strongly against C++ inheritance for implementation. IMO inheritance should only be used to express interface relationships. You can see this trend in PRs such as #3210

Keccak[c] as a free-standing entity (that is not bound to the interface or inheritance hierarchy of Hash_Function)

This seems fine. Not really sure where it should live, I suppose it could be in a new permutations subdir.

A Keccak_HashFunction that condenses the common concerns of such hashes and acts as a base for SHA_3 and KMAC.

This I do not like for aforementioned reasons. It's also not clear how much a common base helps anything. My take is that needing some base class to help implementation just means the Keccak[c] type was not properly designed.

reneme · 2023-05-22T09:31:38Z

A Keccak_HashFunction that condenses the common concerns of such hashes and acts as a base for SHA_3 and KMAC.

This I do not like for aforementioned reasons. It's also not clear how much a common base helps anything. My take is that needing some base class to help implementation just means the Keccak[c] type was not properly designed.

It helps to separate API contracts (implementation of the HashFunction interface) from permutation implementation (Keccak[c], MD_Hash (maybe?)) and from concrete algorithm parameterization (SHA-3, SHAKE, ...).

I wouldn't see the need for such an "adapter" as a weakness in the design of Keccak[c]. When abstracting things into common base implementations we will usually need to "translate" to public interfaces (e.g. HashFunction, MessageAuthenticationCode or (maybe at some point) Xof). We can do that translation repeatedly in every concrete algorithm class (SHA-3, SHAKE, ...). Or we can lift it into a common adapter base class.

I've sketched that idea based on your refactoring of MDx_Hash here: #3553.

falko-strenzke · 2023-06-01T11:34:14Z

The new PR #3570 obsoletes this one, thus closing it.

refactored SHA-3 in terms of newly added keccak-FIPS; added KMAC

ebfe66a

falko-strenzke requested a review from reneme April 26, 2023 16:19

randombit requested changes Apr 26, 2023

removed keccak-fips hash function from API

c1f278c

reneme reviewed May 3, 2023

fixes according to PR feedback

1e8f985

reneme reviewed May 5, 2023

randombit reviewed May 5, 2023

randombit requested changes May 5, 2023

falko-strenzke mentioned this pull request May 17, 2023

SHA-3 vs KMAC for KDF openpgp-pqc/draft-openpgp-pqc#46

Closed

reneme mentioned this pull request May 19, 2023

Remove MDx_HashFunction #3550

Closed

corrected spelling error in comment

bed2ad3

falko-strenzke mentioned this pull request Jun 1, 2023

KMAC, 2nd: added keccak-fips as standalone; added KMAC-256 #3570

Closed

falko-strenzke closed this Jun 1, 2023

reneme mentioned this pull request Aug 19, 2023

eXtendable Output Functions as first-class citizen #3671

Merged
