Add functionality for compile time decoding of word constants #3962
@reneme I imagine you would appreciate this and may have some suggestions since you're a lot better at this C++ compile time stuff 😅
Mhh, admittedly, I got somewhat nerd-sniped by that. Anyway: I suggest splitting this up into two functions:
That results in a bit more boilerplate but is easier to understand in my opinion. Also, it reuses existing functionality and has the potential to replace the hex-to-bytes function by a constexpr-version of

```cpp
consteval void decode_from_hex(std::span<uint8_t> out_bytes, std::span<const char> in_hex) {
   auto decode = [](char c) -> uint8_t {
      if(c >= '0' && c <= '9') {
         return static_cast<uint8_t>(c - '0');
      } else if(c >= 'a' && c <= 'f') {
         return static_cast<uint8_t>(c - 'a' + 10);
      } else if(c >= 'A' && c <= 'F') {
         return static_cast<uint8_t>(c - 'A' + 10);
      } else {
         throw std::runtime_error("Invalid hex character: " + std::to_string(c));
      }
   };

   // If the hex string has an odd length, decode the first character first,
   // implicitly adding a zero hex byte as a prefix.
   assert((in_hex.size() + 1) / 2 == out_bytes.size());
   if(in_hex.size() % 2 != 0) {
      out_bytes[0] = decode(in_hex[0]);
      out_bytes = out_bytes.subspan(1);
      in_hex = in_hex.subspan(1);
   }

   // Decode the hex string that is now guaranteed to have an even length.
   assert(out_bytes.size() * 2 == in_hex.size());
   for(auto& byte : out_bytes) {
      const uint8_t hi = decode(in_hex[0]);
      const uint8_t lo = decode(in_hex[1]);
      byte = (hi << 4) | lo;
      in_hex = in_hex.subspan(2);
   }
}

template <WordType W, size_t N>
consteval auto hex_to_words(const char (&hex_string)[N]) {
   constexpr size_t hex_len = N - 1;  // Char count includes null terminator which we ignore
   constexpr size_t byte_len = (hex_len + 1) / 2;  // hex_string might have an odd length
   constexpr size_t word_len = (byte_len + sizeof(W) - 1) / sizeof(W);
   constexpr size_t zero_prefix_len = word_len * sizeof(W) - byte_len;
   constexpr size_t padded_byte_len = byte_len + zero_prefix_len;

   std::array<uint8_t, padded_byte_len> decoded_bytes = {0};
   decode_from_hex(std::span<uint8_t>{decoded_bytes}.subspan<zero_prefix_len, byte_len>(),
                   std::span<const char>{hex_string}.first<hex_len>());

   // Decode the (potentially zero-padded) big-endian byte string into "little-
   // endian" limbs. Words are decoded from the right of the byte string.
   static_assert(decoded_bytes.size() % sizeof(W) == 0, "Padded bytes is a multiple of word size");
   static_assert(decoded_bytes.size() / sizeof(W) == word_len, "Padded bytes is the right size");

   std::array<W, word_len> words = {0};
   std::span<const uint8_t> unread_bytes(decoded_bytes);
   for(auto& word : words) {
      word = load_be(unread_bytes.last<sizeof(W)>());
      unread_bytes = unread_bytes.first(unread_bytes.size() - sizeof(W));
   }

   return words;
}
```

Another new concept: I'm using vanilla
a) b) The provision for odd-length hex strings seems relevant to me for big integers (if rarely), but for generic binary hex strings it is not helpful at all; in fact it is more of a good way to get a subtle bug. c) I can certainly see the utility of compile-time hex decoding for binary strings (I use the analogous This is imo one of those situations where two things seem similar enough that they could be solved in a combined way, but the combination proves more complicated than solving each individually because of the subtle differences that have to be handled.
Agreed. There it is quite useful though. But if BOTAN_ASSERT works just fine, there's no real argument for it.
Yep, agree on that as well. The out-param is a crutch for this specific use case. For a generic hex_decode we would want to enforce an even-length hex string. Perhaps a std::copy or the ranges-equivalent thereof would be better suited for this special case. Generally, I do "like" the out-param pattern for its ability to avoid copies and allocations when building serializations, despite the awkward usage. But at compile time that is much less of an argument.
Here's another suggestion: 1fea96e, moving the special treatment for the odd-length hex string into For the record: I'm somewhat surprised that |
🙀
Yeah this has been a problem for a while, I just poked on google/oss-fuzz#11116
Our range-based
TBH I find the new version much harder to understand than what I did originally.
Please revert to it then. No hard feelings at all! I think there's value in playing around with this to get a grasp on which abstractions work and which don't. I totally see how the buffer shuffling in my suggestion adds too much complexity for the task it's supposed to achieve.
This is immediately useful for the NIST reduction code which has precomputed tables of constant values but likely will prove useful elsewhere as use of constexpr/constinit expands in the mp layer.
Nice. I like the |