Grøstl – a SHA-3 candidate
Implementations
The reference implementation and some optimized code in C and assembly language are part of the new NIST submission package for tweaked Grøstl.
These and other software implementations have also been submitted to the eBASH benchmarking project. This tarball contains the Grøstl implementations currently included in eBASH. Among these are:
- NEW! Very fast constant-time inline assembly and intrinsics implementations containing Intel AES and AVX instructions
- NEW! Fast constant-time Grøstl implementations using Mike Hamburg's technique to compute the S-box using vector permute instructions (partially based on Çağdaş Çalık's implementations)
- Inline assembly implementations optimized for Intel Core 2 Duo and AMD Opteron processors
- C implementations optimized for 32-bit and 64-bit processors
Two 8-bit implementations for the ATmega163 microcontroller have been updated (without code size optimizations) and are available here.
Benchmarking results for Grøstl can be seen below.
Software benchmarks of Grøstl from eBASH
| Digest size | Processor | Mode | Speed (cycles/byte) |
|---|---|---|---|
| 224/256 | Intel Xeon E5620 (with AES-NI) | 64-bit | 11.3 |
| 224/256 | Intel Core i7-2600K (with AES-NI) | 64-bit | 11.5 |
| 224/256 | AMD Phenom II X6 | 64-bit | 19.4 |
| 224/256 | AMD Opteron 8354 | 64-bit | 19.8 |
| 224/256 | Intel Core 2 Duo E4600 | 64-bit | 22.4 |
| 224/256 | Intel Pentium M | 32-bit | 38.8 |
| 384/512 | Intel Xeon E5620 (with AES-NI) | 64-bit | 16.0 |
| 384/512 | Intel Core i7-2600K (with AES-NI) | 64-bit | 15.6 |
| 384/512 | AMD Phenom II X6 | 64-bit | 31.7 |
| 384/512 | AMD Opteron 8354 | 64-bit | 33.6 |
| 384/512 | Intel Core 2 Duo E4600 | 64-bit | 33.2 |
| 384/512 | Intel Pentium M | 32-bit | 76.1 |
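A cycles/byte figure can be turned into an approximate long-message throughput by dividing the processor clock rate by it. The following is a minimal sketch; the helper name `throughput_mb_per_s` and the 2.4 GHz clock rate are assumptions chosen for illustration and are not taken from the eBASH measurements.

```python
def throughput_mb_per_s(cycles_per_byte: float, clock_hz: float) -> float:
    """Approximate hashing throughput in MB/s (1 MB = 10**6 bytes).

    Only meaningful for long messages, where per-call overhead
    is negligible compared to the per-byte cost.
    """
    if cycles_per_byte <= 0 or clock_hz <= 0:
        raise ValueError("cycles_per_byte and clock_hz must be positive")
    bytes_per_second = clock_hz / cycles_per_byte
    return bytes_per_second / 1e6


if __name__ == "__main__":
    # Hypothetical clock rate; substitute the real clock of the machine.
    assumed_clock_hz = 2.4e9
    for label, cpb in [("224/256, 11.3 cycles/byte", 11.3),
                       ("384/512, 16.0 cycles/byte", 16.0)]:
        mbps = throughput_mb_per_s(cpb, assumed_clock_hz)
        print(f"{label}: about {mbps:.1f} MB/s at {assumed_clock_hz / 1e9:.1f} GHz")
```

The conversion ignores frequency scaling and short-message overhead, so it should be read as an estimate rather than a measured rate.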
8-bit software implementations of Grøstl-0
Günther A. Roland has implemented three different versions of Grøstl-0-256 for an 8-bit ATmega163 microcontroller in his bachelor's thesis. Depending on the version, 128 or 192 bytes of RAM are used to store the Grøstl-0 state. The results are given below.
| Version (state size, bytes) | RAM (bytes) | Flash (bytes) | Speed (cycles/byte) |
|---|---|---|---|
| Low memory (128) | 164 | 2336 | 738 |
| Balanced (192) | 226 | 4170 | 517 |
| High speed (192) | 994 | 4228 | 456 |
Hardware ASIC implementations of Grøstl-0
Stefan Tillich developed high-speed Grøstl-0-256 ASIC implementations in UMC 0.18 µm technology. The synthesis results are given below.
| Total area (µm²) | Total area (GE) | Throughput (Gbit/s) |
|---|---|---|
| 547,227.47 | 58,403 | 6.290 |
| 538,462.41 | 57,467 | 6.141 |
| 523,472.74 | 55,867 | 5.690 |
| 471,626.06 | 50,334 | 2.725 |
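Two derived figures help when comparing the rows above: the area of one gate equivalent (total area divided by the GE count, in the table's area unit) and the throughput per thousand gate equivalents. The sketch below computes both from the tabulated values only; the function names are hypothetical.

```python
# (total area, gate equivalents, throughput in Gbit/s), copied from the table
ASIC_RESULTS = [
    (547227.47, 58403, 6.290),
    (538462.41, 57467, 6.141),
    (523472.74, 55867, 5.690),
    (471626.06, 50334, 2.725),
]


def area_per_ge(total_area: float, gate_equivalents: int) -> float:
    """Area of a single gate equivalent, in the table's area unit."""
    return total_area / gate_equivalents


def mbit_per_s_per_kge(throughput_gbit_s: float, gate_equivalents: int) -> float:
    """Throughput efficiency in Mbit/s per 1000 gate equivalents."""
    return (throughput_gbit_s * 1000.0) / (gate_equivalents / 1000.0)


if __name__ == "__main__":
    for area, ge, tput in ASIC_RESULTS:
        print(f"{ge} GE: {area_per_ge(area, ge):.3f} area units per GE, "
              f"{mbit_per_s_per_kge(tput, ge):.1f} Mbit/s per kGE")
```

The area per gate equivalent is nearly identical across all four rows, as expected when every design is synthesized with the same cell library, while the throughput per kGE separates the smallest design from the three larger ones.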