-
Notifications
You must be signed in to change notification settings - Fork 579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactored SHA-3 in terms of newly added keccak-FIPS; added KMAC #3525
Conversation
src/lib/hash/hash.cpp
Outdated
@@ -251,6 +255,18 @@ std::unique_ptr<HashFunction> HashFunction::create(std::string_view algo_spec, | |||
} | |||
#endif | |||
|
|||
|
|||
#if defined(BOTAN_HAS_KECCAK_FIPS) | |||
if(req.algo_name() == "Keccak-FIPS[512]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said in the ticket, I really dislike this approach because the Keccak being exposed here is not, as I understand it, approved by NIST as a hash function, nor is it implemented by any library I am aware of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, that makes sense. I removed it from the external interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the overall direction of this. It certainly takes the internal implementation closer to the specs and makes it easier to understand. 👍
I've added a few high-level remarks regarding the implementation strategy.
Maybe related: With several PQC algorithms using SHAKE as XOF, it might be worth investing in a more generic interface for the extendable output flavours as well.
/** | ||
* KECCAK FIPS | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great the condense the discussion about the two variants of Keccak in #3279 into some clarifying Doxygen remarks here and in keccak.h
. I'm sure future users and contributors would be grateful. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added clarifying documentation in keccak_fips.h
which comes in the next commit. The problem with keccak.h
is that I am not really sure what it implements. I suspect it is what is implemented in "KECCAK - Final Algorithm Package" at https://csrc.nist.gov/projects/hash-functions/sha-3-project – but I didn't verify. It might be worthwhile to do so, but currently I don't have the time for it nor does this have priority in our current project.
* KECCAK FIPS | ||
*/ | ||
|
||
class Keccak_FIPS_generic : public HashFunction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generic idea to consider: Would it make sense to define this as a template with the capacity being a value template parameter? Downstream users of the static building blocks could then do Keccak_FIPS_256::absorb()
without explicitly stating the bitrate every time.
Not sure whether that's at all feasible, but maybe its worth a try in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same thought but refrained from trying it right now because I wasn't sure how much code would then have to go into the header and what would be the consequences of that. Right now I think it wouldn't be a problem as the header is not part of the public API. Making the capacity (and maybe other parameters) template parameters may improve the readability of the code and - at least potentially – increase performance.
However I would tend to first have the basic implementation merged and then address this later like you are proposing as well.
std::unique_ptr<HashFunction> new_object() const override; | ||
std::unique_ptr<HashFunction> copy_state() const override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe those should have overloads in this class. It's somewhat hidden behind the polymorphy of HashFunction
but as a user I would expect code along those lines to work:
SHA_3 sha(256);
HashFunction* other_sha = sha.new_object().get();
// this will fail, because `other_sha` is actually an instance of `Keccak_FIPS`
// and not of `SHA_3` (which I would have expected)
SHA_3* other_sha_ptr = dynamic_cast<SHA_3>(other_sha);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly these overloads would be needed in all 4 base classes SHA_3_224, ... Can you confirm this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, indeed. So that was actually an issue with the existing code already, no?
Come to think of it: sha3.h
is an internal header, so applications don't actually have access to the SHA_3_*
types and wouldn't be able to do the dynamic_cast
as I suggested.
Still it is a code smell, in my opinion. @randombit what's your take on that?
Maybe CRTP could be helpful here, if the SHA_3
base class is not meant to be used as an interface by consuming code. It could potentially even be applied to Keccak_FIPS
at one point.
A sketch for reference:
template <class DerivedT>
class SHA_3 : public Keccak_FIPS_generic
{
public:
SHA_3(size_t output_bits) : Keccak_FIPS_generic(...) {}
std::unique_ptr<HashFunction> new_object() const override
{
return std::make_unique<DerivedT>(...);
}
std::unique_ptr<HashFunction> copy_state() const override
{
return std::make_unique<DerivedT>(...);
}
};
class SHA_3_224 : public SHA_3<SHA_3_224> {}
class SHA_3_256 : public SHA_3<SHA_3_256> {}
// ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like enourmously complicating vs just using ownership not inheritance.
SHA-3 is a hash function.
SHA-3 is not "a Keccak"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue is unrelated to the inheritance of Keccak
, though.
My mental model of ::new_object()
is that it returns an instance of the very same hash function class. The fact that SHA_3
is not marked final
but has sub-classes (SHA_3_224
, SHA_3_256
, ...) violates that mental model in the current implementation.
It's really just a nitpick, but it struck me as unusual. Internal code might do something like that:
std::unique_ptr<HashFunction> sha3_256 = std::make_unique<SHA_3_256>();
std::unique_ptr<HashFunction> bare_sha3 = sha3_256->new_object();
Now, both objects will behave exactly the same. But they won't be instances of exactly the same class. While sha3_256
is an instance of the class SHA_3_256
, sha3_bare
will be an instance of SHA_3
.
Applications will always go through HashFunction::create("SHA-3(256)")
and therefore get an instance of SHA_3
(without the _256
suffix).
Again: just a minor code smell, if at all.
src/lib/hash/shake/shake.cpp
Outdated
@@ -45,13 +45,13 @@ void SHAKE_128::clear() | |||
|
|||
void SHAKE_128::add_data(const uint8_t input[], size_t length) | |||
{ | |||
m_S_pos = SHA_3::absorb(SHAKE_128_BITRATE, m_S, m_S_pos, input, length); | |||
m_S_pos = Keccak_FIPS_generic::absorb(SHAKE_128_BITRATE, m_S, m_S_pos, input, length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Referring to the idea of making Keccak_FIPS_generic
a template (with the capacity being a value template-parameter), this might be written as:
Keccak_FIPS_128::absorb(m_S, m_S_pos, input, length);
which I would find desirable. But as stated there: certainly okay to look at separately as a follow-up.
#include <string> | ||
|
||
namespace Botan { | ||
|
||
/** | ||
* SHA-3 | ||
*/ | ||
class SHA_3 : public HashFunction | ||
|
||
class SHA_3 : public Keccak_FIPS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inheritance here seems just wrong.
TBH I do not understand at all the motivation for moving the SHA-3 permutation logic, nor for how Keccak_FIPS simplifies anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really dislike adding Keccak as a HashFunction
. It is at best an implementation detail. If you really feel it is helpful, define it within kmac.cpp
, in terms of the functions on SHA_3
.
Considering this patch is net +500 lines of code, I am not even seeing how this approach simplifies anything.
{ | ||
// KMAC supports key lengths from zero up to 2²⁰⁴⁰ (2^(2040)) bits | ||
// https://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-185.pdf#page=28 | ||
return Key_Length_Specification(0, std::numeric_limits<size_t>::max()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no practical reason to support keys larger than 512 bits.
This will just cause weird confusion (and out of memory errors) when someone tries to use mac->maximum_keylength()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the potential problem but I think this has some implications. KMAC may be used in arbitrary constuctions. Look for instance how the NIST KMAC-based key derivation functions are specified. Here a 'salt' value is used as the key. A salt value might be chosen based on very different consideration than a key. Accordingly, the Botan implementation may fail to support certain construction if it limits the key size to 512 bits. But I completely agree that the integer maximum is a bad idea. Maybe we find a reasonable value that accounts for other uses to a certain degree but also avoids an excessive memory allocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If 3.0 wouldn't have been out yet, I would have suggested making the maximum value a std::optional<>
to clearly communicate "no relevant upper limit in practice".
Maybe provide a "smallish" upper bound but allow to use KMAC with larger keys? This would require a way to pass Symmetric_Algorithm::valid_keylength()
for larger keys than Key_Length_Spec
advertises in the particular case of KMAC. That makes me feel uneasy as well. 🤔
Mhh, I think I get your point. Keccak is a construction that is "more" than just a To me My main source of confusion is that these building blocks ( |
I think I explained at length how the changes simplify the code and make it clearer in #3279 . I am not going to repeat this here again. As long as there are no arguments negating the one's I've given, I don't expect I will change my mind.
The number of changed lines is a mere artifact due to the limitations of git. I moved the entire SHA-3 code to Keccak_FIPS with |
I think the idea which supports using Keccak_FIPS as a base class is pretty straight forward: Keccak[c] is the base for all Keccac[c]-derived functions such as SHA-3, KMAC, etc. Thus if we would relabel it Keccak_FIPS_base, it would become clear enough that inheritance makes sense. However, I don't mind making it a member, if that is preferred for some reason. Since I understand that is what both of you prefer, I suggest making it a member. Probably then the discussion about turning Keccak_FIPS into a template is not relevant any more, as the code will not become more readable through this any more and the exepected performance gain through some compile-time known values will be negligible. |
Looking at mxd_hash.h I find
Wouldn't it be straightforward to use the same approach for Keccak[c], i.e. use Keccak_FIPS as the base class for all hash functions of this type? It may very well be a purely virtual class. Inside KMAC, we would then instantiate a derived class, which creates the specific hash function used by KMAC. If this is not desired, then I would be interested to learn the difference between the two patterns (inheritance for MDx hash but not for Keccak[c]-based hash). |
Personally I can live with that. But may I suggest a hybrid approach?
The free-standing base implementation could then be used as an implementation detail of other Keccak-based constructions (like a XOF based on SHAKE for example). As far as I can tell, the What I have in mind as the "free-standing" portion are likely just the existing |
That makes sense to me. @randombit Do you also agree to this approach? |
Re
This seems fine. Not really sure where it should live, I suppose it could be in a new
This I do not like for aforementioned reasons. It's also not clear how much a common base helps anything. My take is that needing some base class to help implementation just means the |
It helps separating API contracts (implementation of the I wouldn't see the need for such an "adapter" as a weakness in the design of Keccak[c]. When abstracting things into common base implementations we will usually need to "translate" to public interfaces (e.g. I've sketched that idea based on your refactoring of MDx_Hash here: #3553. |
The new PR #3570 obsoletes this one, thus closing it. |
Adding only KMAC256 for now. For adding KMAC128, a generic base should be created for KMAC and then both variants derived from it.