asm_sha3 - xero/leviathan-crypto GitHub Wiki
SHA-3 WASM Reference
This low-level reference details the SHA-3 AssemblyScript source and WASM exports, intended for those auditing, contributing to, or building against the raw module. Most consumers should instead use the TypeScript wrapper.
Table of Contents
Overview
This module implements the full SHA-3 family as defined in FIPS 202 ("SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions", August 2015).
All six variants share a single core: the Keccak-f[1600] permutation operating on a 5x5 matrix of 64-bit lanes (200 bytes of state). The sponge construction wraps the permutation into two phases:
- Absorb. Input bytes are XORed into the state, rate bytes at a time, with a permutation call after each full block.
- Squeeze. Output bytes are read from the state after padding and a final permutation.
The six variants differ only in three parameters: the rate (how many bytes per sponge block), the domain separation byte (which distinguishes SHA-3 from SHAKE in the padding), and the output length.
Fixed-output hash functions (FIPS 202 SS6.1):
- SHA3-224, SHA3-256, SHA3-384, SHA3-512
Extendable-output functions / XOFs (FIPS 202 SS6.2):
- SHAKE128, SHAKE256
Security Notes
No length extension. Unlike the Merkle-Damgard family (SHA-2), SHA-3's sponge construction does not leak sufficient internal state to mount length extension attacks. The capacity portion of the state is never exposed during squeezing.
Domain separation. SHA-3 and SHAKE use different domain separation bytes in the multi-rate padding (FIPS 202 SS6.1-6.2):
0x06for SHA-3 (fixed-output)0x1ffor SHAKE (extendable-output)
This means SHA3-256(M) and SHAKE256(M, 256) produce different digests even
though both use rate=136. The domain separation byte is XORed into the state
during padding, making the two functions cryptographically independent.
Constant-time permutation. Keccak-f[1600] uses only bitwise XOR, AND, NOT, and fixed rotations on 64-bit lanes. There are no data-dependent branches, no table lookups, and no secret-dependent memory access patterns. The permutation is constant-time by construction.
SHAKE output cap. shakeFinal() performs a single squeeze block. SHAKE128
output is capped at 168 bytes and SHAKE256 at 136 bytes per call. For longer
output use shakePad() + shakeSqueezeBlock() in a loop — see the Multi-block
squeeze section below.
wipeBuffers(). Zeroes all 545 bytes of module state: the 200-byte Keccak lane
matrix, the input staging buffer, the output buffer, and the rate/absorbed/dsByte
metadata. The TypeScript wrapper must call this on dispose() to prevent key
material or intermediate hash state from persisting in WASM linear memory.
See SHA-3 implementation audit for algorithm correctness verifications.
API Reference
All exported functions from src/asm/sha3/index.ts:
Initialization
sha3_224Init(): void // rate=144, dsByte=0x06
sha3_256Init(): void // rate=136, dsByte=0x06
sha3_384Init(): void // rate=104, dsByte=0x06
sha3_512Init(): void // rate=72, dsByte=0x06
shake128Init(): void // rate=168, dsByte=0x1f
shake256Init(): void // rate=136, dsByte=0x1f
Each init function zeroes the 200-byte state and the 168-byte input buffer, then writes the variant-specific rate, absorbed count (0), and domain separation byte into the metadata slots. Call exactly one init function before absorbing data.
Absorb
keccakAbsorb(len: i32): void
Absorbs len bytes from INPUT_OFFSET into the sponge state. Write
input data to INPUT_OFFSET before calling this function. Data is XORed into the
state byte-by-byte (FIPS 202 SS4, Algorithm 8). When the absorbed byte count
reaches the rate, Keccak-f[1600] is applied and absorption continues from lane 0.
You can call this multiple times for streaming input; the ABSORBED counter tracks the
current position within the rate block across calls.
Constraint: len must not exceed 168 (the size of the input staging buffer).
For messages longer than 168 bytes, call keccakAbsorb in a loop, writing up to
168 bytes to INPUT_OFFSET each iteration.
Finalize (fixed-output)
sha3_224Final(): void // writes 28 bytes to OUT_OFFSET
sha3_256Final(): void // writes 32 bytes to OUT_OFFSET
sha3_384Final(): void // writes 48 bytes to OUT_OFFSET
sha3_512Final(): void // writes 64 bytes to OUT_OFFSET
Each final function applies multi-rate padding (FIPS 202 SS5.1, pad10*1), runs
the final Keccak-f[1600] permutation, and copies the appropriate number of output
bytes from the state to OUT_OFFSET.
Padding consists of two XOR operations:
- The domain separation byte (
0x06) is XORed at positionabsorbedin the state. 0x80is XORed at positionrate - 1in the state.
If absorbed == rate - 1, both XORs hit the same byte.
Finalize (extendable-output)
shakeFinal(outLen: i32): void
Same as the fixed-output finals, but you specify the output length.
Constraint: outLen must not exceed the rate of the initialized SHAKE
variant (168 for SHAKE128, 136 for SHAKE256). This implementation performs a
single squeeze and does not loop additional permutations for longer output.
Multi-block squeeze
shakePad(): void
shakeSqueezeBlock(): void
Low-level primitives for squeezing more than one block of SHAKE output. Call
shakePad() once after absorbing all input — it applies the domain separation
byte, the pad10*1 terminator, and runs Keccak-f. Then call shakeSqueezeBlock()
once per block needed: each call copies rate bytes from the Keccak state into
OUT_OFFSET and runs Keccak-f to advance the state for the next block.
Usage pattern:
// absorb input via keccakAbsorb() as normal, then:
shakePad()
while (moreOutputNeeded) {
shakeSqueezeBlock() // rate bytes now at OUT_OFFSET
// read from OUT_OFFSET ...
}
shakePad() must be called exactly once per hash. Calling shakeSqueezeBlock()
without a preceding shakePad() produces undefined output.
Low-level finalize
keccakFinal(outLen: i32): void
The underlying finalize used by all final functions. Applies padding using
whatever domain separation byte was set during init, permutes, and squeezes
outLen bytes. Exported for advanced use cases where you manage variant
parameters directly.
Buffer offset getters
getModuleId(): i32 // returns 3 (sha3 module identifier)
getStateOffset(): i32 // returns 0
getRateOffset(): i32 // returns 200
getAbsorbedOffset(): i32 // returns 204
getDsByteOffset(): i32 // returns 208
getInputOffset(): i32 // returns 209
getOutOffset(): i32 // returns 377
getMemoryPages(): i32 // returns current WASM memory size in pages
The TypeScript wrapper uses these to locate buffers in WASM linear memory without hardcoding offsets.
Cleanup
wipeBuffers(): void
Zeroes all state: the 200-byte lane matrix, 168-byte input buffer, 168-byte output buffer, and the rate/absorbed/dsByte metadata. Call this when the hash context is no longer needed.
Buffer Layout
All buffers occupy fixed offsets in WASM linear memory, starting at 0. There is no
dynamic allocation (memory.grow() is not used).
| Offset | Size (bytes) | Name | Description |
|---|---|---|---|
| 0 | 200 | KECCAK_STATE |
25 x u64 lane matrix (5x5, little-endian, A[x][y] at (x + 5y) * 8) |
| 200 | 4 | KECCAK_RATE |
u32 rate in bytes (variant-specific: 72-168) |
| 204 | 4 | KECCAK_ABSORBED |
u32 count of bytes absorbed into the current block |
| 208 | 1 | KECCAK_DSBYTE |
u8 domain separation byte (0x06 or 0x1f) |
| 209 | 168 | KECCAK_INPUT |
Input staging buffer (max rate = SHAKE128 at 168) |
| 377 | 168 | KECCAK_OUT |
Output buffer (one full SHAKE128 squeeze block) |
| 545 | END | Total footprint: 545 bytes (well within 3 x 64KB = 192KB) |
The input and output buffers are both sized to 168 bytes, the maximum rate across all variants (SHAKE128). For SHA3-512 (rate=72), only the first 72 bytes of the input buffer and the first 64 bytes of the output buffer are used.
Internal Architecture
buffers.ts
Defines the six buffer offset constants and the getter functions that expose them to the TypeScript layer. The layout is minimal: 545 bytes total, well under the 3-page (192KB) WASM memory allocation.
keccak.ts
Contains all cryptographic logic:
Keccak-f[1600] permutation (keccakF): 24 rounds, each consisting of five
steps (FIPS 202 SS3.2):
- theta (SS3.2.1): column parity mixing. Computes the XOR of each column, then XORs each lane with the parity of its neighboring columns.
- rho (SS3.2.2): lane rotation. Each of the 25 lanes is rotated left by a fixed offset from the rotation table (FIPS 202 Table 2).
- pi (SS3.2.3): lane permutation. Lanes are rearranged:
B[y][2x+3y] = A[x][y]. - chi (SS3.2.4): nonlinear mixing.
A[x] = B[x] XOR (NOT B[x+1] AND B[x+2]). This is the only nonlinear step and provides the cryptographic strength. - iota (SS3.2.5): round constant addition. A round-dependent constant is
XORed into lane
A[0][0]. The 24 constants are derived from an LFSR (FIPS 202 SS3.2.5).
The implementation loads all 25 lanes into local variables at the top of keccakF
and stores them back at the end. The rho and pi steps are combined into a single
operation using precomputed rotation offsets. The iota step uses an if-else chain
rather than a table load to apply round constants, avoiding any data-dependent memory
access.
Sponge functions:
keccakInit(rate, dsByte): zeroes state, sets variant parameters.
keccakAbsorb(len): XORs input into state, permuting when a full rate
block is absorbed. Tracks position via ABSORBED counter.
keccakFinal(outLen): applies pad10*1 (domain byte at absorbed, 0x80
at rate-1), permutes, squeezes outLen bytes to the output buffer.
index.ts
Pure re-export barrel. Exports all getters from buffers.ts and all sponge
functions from keccak.ts. No logic.
Variant Parameters
| Variant | Rate (bytes) | Capacity (bytes) | Output (bytes) | DS Byte | Security (bits) |
|---|---|---|---|---|---|
| SHA3-224 | 144 | 56 | 28 | 0x06 |
112 |
| SHA3-256 | 136 | 64 | 32 | 0x06 |
128 |
| SHA3-384 | 104 | 96 | 48 | 0x06 |
192 |
| SHA3-512 | 72 | 128 | 64 | 0x06 |
256 |
| SHAKE128 | 168 | 32 | variable (max 168) | 0x1f |
128 |
| SHAKE256 | 136 | 64 | variable (max 136) | 0x1f |
256 |
Rate + Capacity = 200 bytes (1600 bits) for all variants. The capacity determines
the security level: collision resistance is capacity/2 bits, preimage resistance
is min(output_bits, capacity) bits (FIPS 202 SS A.1).
Error Conditions
The WASM module itself does not throw exceptions. Constraints are enforced by the TypeScript wrapper, but callers working directly with the WASM exports must observe:
-
Input length per call:
keccakAbsorb(len)readslenbytes fromINPUT_OFFSET. Iflen > 168, the read will exceed the input buffer and access adjacent memory (the output buffer). The TypeScript wrapper must chunk input into 168-byte segments. -
Output length:
keccakFinal(outLen)copiesoutLenbytes from state toOUT_OFFSET. IfoutLenexceeds the rate, the squeeze reads past the rate-portion of the state into the capacity. Those bytes are not meaningful output and the result will be incorrect. For SHA-3 variants, the typed final functions (sha3_256Final, etc.) enforce correct output lengths. For SHAKE, ensureoutLen <= rate. -
Init before absorb: Calling
keccakAbsorbwithout a prior init will operate on stale or zeroed state with rate=0, causing a tight infinite loop (the absorb loop conditionabsorbed === rateis immediately true when rate=0, triggering permutations on every byte). Always call an init function first. -
Single squeeze: The squeeze phase copies bytes from state after one permutation. There is no multi-block squeeze loop. Requesting more than one block of SHAKE output requires re-architecting the squeeze, which is outside v1.0 scope.
Cross-References
| Document | Description |
|---|---|
| index | Project Documentation index |
| architecture | architecture overview, module relationships, buffer layouts, and build pipeline |
| sha3 | TypeScript wrapper classes (SHA3_224, SHA3_256, SHA3_384, SHA3_512, SHAKE128, SHAKE256) |
| asm_sha2 | alternative hash family (SHA-2/HMAC WASM module) |
| sha3_audit.md | SHA-3 / Keccak implementation audit |