
lowram implementation #91

Open
wants to merge 2 commits into
base: master


dop-amin

This PR adds a third implementation variant called lowram. It trades performance for a very small memory footprint. The implementation is written in C and based on the code in pqm4, which was added in this PR.

The original ideas for this implementation are taken from the paper

Joppe W. Bos, Joost Renes, and Amber Sprenkels. 2022. Dilithium for Memory
Constrained Devices. In Progress in Cryptology - AFRICACRYPT 2022: 13th
International Conference on Cryptology in Africa, AFRICACRYPT 2022, Fes,
Morocco, July 18–20, 2022, Proceedings. Springer-Verlag, Berlin, Heidelberg,
217–235. https://doi.org/10.1007/978-3-031-17433-9_10

and parts of the implementation in this PR are written by @mkannwischer.

I tried to retain as many files as possible from the ref implementation by moving most memory-optimization related code to lowram.c. Further, smallntt_3329.c and smallpoly.c contain functions to operate on polynomials with 16-bit coefficients. api.h, config.h, Makefile, and sign.c are the only files that should be different from ref -- everything else is just symlinks.
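A minimal sketch of what a polynomial with 16-bit coefficients could look like next to a 32-bit reference-style polynomial. The type and function names (`poly32_sketch`, `smallpoly_sketch`, `smallpoly_from_poly32`) are assumptions for illustration and are not taken from smallpoly.c.

```c
#include <stdint.h>

#define N 256 /* number of coefficients per polynomial */

/* Reference-style polynomial with 32-bit coefficients (1024 bytes). */
typedef struct {
  int32_t coeffs[N];
} poly32_sketch;

/* Hypothetical compact polynomial with 16-bit coefficients (512 bytes). */
typedef struct {
  int16_t coeffs[N];
} smallpoly_sketch;

/* Copy a into the compact representation.
 * Returns 0 on success, or -1 as soon as a coefficient does not fit
 * into 16 bits (r is then only partially written). */
static int smallpoly_from_poly32(smallpoly_sketch *r, const poly32_sketch *a) {
  unsigned int i;

  for(i = 0; i < N; i++) {
    if(a->coeffs[i] < INT16_MIN || a->coeffs[i] > INT16_MAX)
      return -1;
    r->coeffs[i] = (int16_t)a->coeffs[i];
  }
  return 0;
}
```

Halving the coefficient width halves the buffer size, which is only safe for polynomials whose coefficients are known to be small.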

If there's anything that could be done to improve the quality of the implementation, feel free to let me know!

Cheers,
Amin

Contributor

@mkannwischer mkannwischer left a comment


some small suggestions.

lowram/sign.c (outdated, resolved)
lowram/sign.c (outdated, resolved)
lowram/smallpoly.h (outdated, resolved)
lowram/smallpoly.c (outdated, resolved)
lowram/smallntt_3329.c (resolved)
Co-authored-by: Matthias J. Kannwischer <[email protected]>
@gregorseiler
Member

Hi Amin,

Thanks a lot for submitting the PR. I have now had a closer look at it. It would be great if you could make a pass over the code and clean it up a bit more. Some of the things I have noticed are inconsistencies in types (unsigned char instead of uint8_t, size_t instead of unsigned int), missing const qualifiers, and missing namespace defines. Also it would be good to use the same coding style as in the rest of the Dilithium code (no spaces after for, if and so on, { on the same line, variable declarations only at function beginnings).
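A minimal sketch of the conventions listed above: fixed-width types, `const` on read-only inputs, a namespace define, no space after `for`, the brace on the same line, and declarations at the start of the function. The macro `EXAMPLE_NAMESPACE` and the function `pack_bits` are hypothetical and only serve as an illustration; the real namespace prefix would come from config.h.

```c
#include <stdint.h>

/* Hypothetical namespace macro, standing in for the one in config.h. */
#define EXAMPLE_NAMESPACE(s) example_lowram_##s
#define pack_bits EXAMPLE_NAMESPACE(pack_bits)

/* Pack len bits (the low bit of each in[i]) into bytes,
 * least significant bit first.
 * out must have room for (len + 7)/8 bytes. */
void pack_bits(uint8_t *out, const uint8_t *in, unsigned int len) {
  unsigned int i;

  for(i = 0; i < (len + 7)/8; i++)
    out[i] = 0;
  for(i = 0; i < len; i++)
    out[i/8] |= (uint8_t)((in[i] & 1u) << (i % 8));
}
```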

Regarding the approach of retaining as much of the ref implementation as possible: In principle I like the idea. But it leads to code duplication that might be harder to maintain than replacing the ref files such as packing.c and poly.c with variants that implement the new functions in the expected places. So thinking about it I would prefer that.

Cheers,
Gregor

@mkannwischer
Contributor

Hi Gregor, thanks for reviewing this!

> Regarding the approach of retaining as much of the ref implementation as possible: In principle I like the idea. But it leads to code duplication that might be harder to maintain than replacing the ref files such as packing.c and poly.c with variants that implement the new functions in the expected places. So thinking about it I would prefer that.

I'm not sure I understand this. The reason we went for this approach was to reduce code duplication. All functions specific to this implementation are moved to lowram.c and all other files are just symlinks to the ref implementation. The only downside of this is that there is quite a lot of dead code in the lowram implementation. An alternative way would be to integrate them into the original files directly using #ifdef LOWRAM (or similar) - that would lead to less code duplication and no dead code, but it would be touching the reference implementation, which is something we tried to avoid.
IMHO, copying poly.c and packing.c and modifying them is the worst option in terms of code duplication, and I'd prefer not to do that.
Let us know which approach you'd like us to take and we'll adjust it.
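For illustration, the `#ifdef LOWRAM` alternative mentioned above could look roughly like the following sketch, in which one source file carries both a buffer-based variant and a streaming variant. The functions `unpack_coeff` and `packed_sum_squares` and the 4-bit packing are assumptions made up for this example; they are not functions from the PR or from the ref implementation.

```c
#include <stdint.h>

#define N 256

/* Extract coefficient i from a buffer holding two 4-bit values per byte. */
static int32_t unpack_coeff(const uint8_t *packed, unsigned int i) {
  return (packed[i/2] >> (4*(i % 2))) & 0x0F;
}

/* Sum of squares of all N packed coefficients (hypothetical example). */
uint32_t packed_sum_squares(const uint8_t packed[N/2]) {
  unsigned int i;
  uint32_t acc = 0;
#ifdef LOWRAM
  int32_t c;

  /* Low-RAM variant: unpack each coefficient on the fly, no buffer. */
  for(i = 0; i < N; i++) {
    c = unpack_coeff(packed, i);
    acc += (uint32_t)(c*c);
  }
#else
  int32_t coeffs[N];

  /* Reference-style variant: unpack into a full 1 KiB stack buffer first. */
  for(i = 0; i < N; i++)
    coeffs[i] = unpack_coeff(packed, i);
  for(i = 0; i < N; i++)
    acc += (uint32_t)(coeffs[i]*coeffs[i]);
#endif
  return acc;
}
```

Both branches return the same value; compiling with `-DLOWRAM` only changes the stack usage. The cost is that the shared file now contains two code paths, which is the "touching the reference implementation" concern raised above.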

@gregorseiler
Member

Hi Matthias,

Your explanation makes a lot of sense. The reason why I think I prefer independent copies of the ref source files where the lowram function variants are implemented is the following. Let's assume future compilers turn the rounding implementation into variable-time code and we need to change it. Then we know we also have to look into the rounding.c files of the avx2 and lowram implementations. On the other hand, when rounding.c in lowram/ is a symlink but the same code is duplicated in lowram.c, we have to remember this, and I think there is a risk that we forget. And yes, the dead code you mentioned might also be a bit annoying if someone is only interested in the lowram implementation and would like to have a clean stand-alone implementation. On the other hand, you're of course right that copying the source files leads to much more code duplication than your approach. But hopefully the code doesn't actually have to be changed much in the future. So in summary, I guess I weigh clean stand-alone implementations higher than less code duplication.

Cheers,
Gregor

@vincentvbh

vincentvbh commented Nov 12, 2024

I suggest precomputing the constant used in the Barrett reduction modulo 3329. The constant is 20159.
Further, why don't you use 769?
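A sketch of a Barrett reduction modulo 3329 with the constant hard-coded, assuming the usual shift of 26 bits, for which round(2^26 / 3329) = 20159. The names `SMALL_Q`, `BARRETT_V` and `barrett_reduce_3329` are placeholders and not necessarily those used in smallntt_3329.c.

```c
#include <stdint.h>

#define SMALL_Q 3329
#define BARRETT_V 20159 /* round(2^26 / 3329), precomputed */

/* Return the representative of a modulo 3329 in [-1664, 1664].
 * Assumes >> on a negative int32_t is an arithmetic shift, which is
 * implementation-defined in C but holds on common compilers. */
int16_t barrett_reduce_3329(int16_t a) {
  int16_t t;

  /* t = round(a / 3329), computed without a division */
  t = (int16_t)(((int32_t)BARRETT_V*a + (1 << 25)) >> 26);
  t = (int16_t)(t*SMALL_Q);
  return (int16_t)(a - t);
}
```

Writing the constant as a literal avoids relying on the compiler to fold a division expression at compile time. A different modulus such as 769 would need its own constant.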

@vincentvbh

I agree with Gregor that a standalone implementation is preferable. The stack-optimized implementation targets different users.


4 participants