Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Tests] SHA256 + official shuffling vectors #263

Merged
merged 10 commits into from
May 10, 2019

Conversation

mratsim
Copy link
Contributor

@mratsim mratsim commented May 9, 2019

This PR switches the hash function to SHA2-256 from Keccak256 and also uses the official shuffling test vectors.

Don't merge yet, I still have to fix a typedesc issue (https://github.com/status-im/nim-beacon-chain/compare/test-official-shuffling-vectors?expand=1#diff-b16dc57c1bf32e43578965533a7d7198R114) in nim-serialization.

Here are the accompanying changes:

@mratsim mratsim requested a review from tersec May 9, 2019 15:25
@mratsim mratsim force-pushed the test-official-shuffling-vectors branch from 710e3ad to ef24e63 Compare May 9, 2019 15:28
@mratsim
Copy link
Contributor Author

mratsim commented May 9, 2019

Typedesc issue opened: status-im/nim-serialization#4

We can go ahead with this and having parseTestsType(file) instead of parseTests(file, Type)

@mratsim mratsim changed the title [Tests] [WIP] SHA256 + official shuffling vectors [Tests] SHA256 + official shuffling vectors May 9, 2019
# when hashing in loop or into a buffer
# See: https://github.com/cheatfate/nimcrypto/blob/b90ba3abd/nimcrypto/sha2.nim#L570
func eth2hash*(v: openArray[byte]): Eth2Digest {.inline.} =
result = sha256.digest(v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to reintroduce calling clear and thus burnMem from https://github.com/cheatfate/nimcrypto/blob/b90ba3abd6db9461419f33db4cbae8206d3fde18/nimcrypto/keccak.nim which has consumed up to 25-30% of CPU. While it doesn't seem like we can use the same avoid-init shortcut as with keccak because SHA2 has actual context to init, just calling sha256.digest does unnecessary work.

Copy link
Contributor

@tersec tersec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the caveats shown -- if you want, we can see how the perf around the hashing looks. I'm okay with merging it and fixing that later if it's still a problem. It's not a subtle perf drop/profiling signature if it's still one, and it might not be as relatively significant with the change of hash function. A lot of the rest is temporary around the 0.6 transition and would be medium-term inefficient to try to get perfect now.

checkPermutation(1775, 3434, 707, Eth2Digest(data: [0x92'u8, 0xf3'u8, 0x0d'u8, 0x85'u8, 0x56'u8, 0x38'u8, 0x2b'u8, 0x72'u8, 0xa5'u8, 0x79'u8, 0x7d'u8, 0xb8'u8, 0x11'u8, 0x48'u8, 0x6e'u8, 0x7a'u8, 0x21'u8, 0x3e'u8, 0x01'u8, 0x45'u8, 0xd6'u8, 0xc9'u8, 0x46'u8, 0xe5'u8, 0x12'u8, 0x1a'u8, 0xa6'u8, 0xa8'u8, 0xf7'u8, 0x61'u8, 0xd1'u8, 0x64'u8]))
checkPermutation(1109, 2010, 433, Eth2Digest(data: [0x09'u8, 0x3f'u8, 0xb9'u8, 0x76'u8, 0xf2'u8, 0x49'u8, 0x73'u8, 0x61'u8, 0x89'u8, 0x70'u8, 0x12'u8, 0xdf'u8, 0xa6'u8, 0xdc'u8, 0x01'u8, 0x90'u8, 0x09'u8, 0xed'u8, 0xa2'u8, 0xe4'u8, 0x8b'u8, 0xbe'u8, 0xb4'u8, 0xb7'u8, 0xc5'u8, 0x6d'u8, 0x4a'u8, 0xa5'u8, 0xda'u8, 0x7d'u8, 0x5f'u8, 0x87'u8]))
checkPermutation(359, 538, 115, Eth2Digest(data: [0xa7'u8, 0x9b'u8, 0x35'u8, 0xbe'u8, 0xac'u8, 0xbe'u8, 0x48'u8, 0xc6'u8, 0x62'u8, 0xd6'u8, 0x08'u8, 0x84'u8, 0xc7'u8, 0x04'u8, 0x04'u8, 0x00'u8, 0x24'u8, 0xc5'u8, 0x5a'u8, 0xb8'u8, 0x79'u8, 0xe5'u8, 0xf6'u8, 0x15'u8, 0x21'u8, 0x01'u8, 0x3c'u8, 0x5f'u8, 0x45'u8, 0xeb'u8, 0x3b'u8, 0x70'u8]))
checkPermutation(1259, 1473, 1351, Eth2Digest(data: [0x02'u8, 0xc5'u8, 0x3c'u8, 0x9c'u8, 0x6d'u8, 0xdf'u8, 0x25'u8, 0x97'u8, 0x16'u8, 0xff'u8, 0x02'u8, 0xe4'u8, 0x9a'u8, 0x29'u8, 0x4e'u8, 0xba'u8, 0x33'u8, 0xe4'u8, 0xad'u8, 0x25'u8, 0x5d'u8, 0x7e'u8, 0x90'u8, 0xdb'u8, 0xef'u8, 0xdb'u8, 0xc9'u8, 0x91'u8, 0xad'u8, 0xf6'u8, 0x03'u8, 0xe5'u8]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good time for this mass of semi-manual (albeit auto-code-generated) tests to disappear anyway, in favor of YAML or some other test harness format.


TestConstants* = object
# TODO - 0.5.1 constants
SHARD_COUNT*: int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good to mark the prevoius versoin as a kind of specific sort of TODO -- that is, when I scan for updates to perform, I (conceptually) grep for the previous version string(s). So I'd take this to mean that that set of constants should be specifically re-evaluated during a 0.6 transition.


# Empty size -> empty list.
if list_size == 0:
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully this is going to be a temporary special-case kludge. Kind of ugly to have to check (I write as the person who wrote the code around which you're working here) -- ideally, the 0.6 native solution just does this naturally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's here to stay, I have to check how Python get away with a modulo 0 in

# https://github.com/ethereum/eth2.0-specs/blob/v0.6.0/specs/core/0_beacon-chain.md#get_permuted_index
def get_shuffled_index(index: ValidatorIndex, index_count: int, seed: Bytes32) -> ValidatorIndex:
    """
    Return the shuffled validator index corresponding to ``seed`` (and ``index_count``).
    """
    assert index < index_count
    assert index_count <= 2**40

    # Swap or not (https://link.springer.com/content/pdf/10.1007%2F978-3-642-32009-5_1.pdf)
    # See the 'generalized domain' algorithm on page 3
    for round in range(SHUFFLE_ROUND_COUNT):
        pivot = bytes_to_int(hash(seed + int_to_bytes(round, length=1))[0:8]) % index_count
        flip = (pivot - index) % index_count
        position = max(index, flip)
        source = hash(seed + int_to_bytes(round, length=1) + int_to_bytes(position // 256, length=4))
        byte = source[(position % 256) // 8]
        bit = (byte >> (position % 8)) % 2
        index = flip if bit else index

    return index

@tersec tersec merged commit e5047d9 into master May 10, 2019
@delete-merged-branch delete-merged-branch bot deleted the test-official-shuffling-vectors branch May 10, 2019 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants