Vulnerability Assessment Report: Nonce Leakage and Implementation Risks in (EC)DSA
A structured look at HNP-based key recovery, BIGNUM representation leaks, lazy resizing CVEs, and protocol-level subgroup failures in modern signature stacks.
Vulnerability Assessment Report: Nonce Leakage and Implementation Risks in (EC)DSA
A digital signature should be a tiny, opaque blob that proves you signed a message. In ECDSA — the signature scheme behind Bitcoin, TLS server authentication, and git commit -S — that blob hides a per-signature random number called the nonce. Leak even a few bits of that nonce, repeatedly, and a clever attacker can recover your long-term private key.
This post walks through how tiny leaks become total compromise, with the math kept at “advanced undergraduate” level. We’ll see why ECDSA is so brittle, what the Hidden Number Problem is, and how OpenSSL’s well-meaning optimizations turned into CVE-2018-0734 and CVE-2018-0735.
Background you’ll want. ECDSA signs a hashed message $h$ with a private scalar $x$ and a fresh random nonce $k$. The signature is the pair $(r, s)$ where $r = (kG)_x \bmod q$ (the x-coordinate of $kG$, the nonce times the base point) and $s = k^{-1}(h + xr) \bmod q$. The number $q$ is the order of the base point — a big prime. The verifier knows $G$, $q$, $r$, $s$, $h$, and the public key $xG$; they don’t know $x$ or $k$.
1. Why ECDSA is so unforgiving
The signing equation is $s = k^{-1}(h + xr) \bmod q$. Solve it for the long-term key: $x = (sk - h) r^{-1} \bmod q$. Knowing $k$ for any single signature lets you recover $x$ — the catastrophe. Two famous illustrations:
- Sony PlayStation 3 used a fixed nonce across firmware signatures. Attackers in 2010 immediately recovered the master ECDSA key from any two signatures (since $s_1 - s_2 = k^{-1}(h_1 - h_2)$, you solve for $k$ trivially when both signatures share the same $k$).
- Bitcoin wallets on Android in 2013 used a broken
SecureRandomthat produced colliding nonces; tens of thousands of dollars of BTC were stolen by attackers who simply scraped the blockchain for repeated $r$ values.
You don’t even need full nonce recovery. Leaking just a few bits per signature, across many signatures, suffices — via the Hidden Number Problem.
2. The Hidden Number Problem (HNP), in plain English
Suppose I’m thinking of a secret $x \bmod q$. For each signature $i$, I leak you a small clue: I tell you the top $\ell$ bits of some quantity $a_i x \bmod q$ (where $a_i$ is something you can compute). Each clue gives you a band of values $a_i x$ might live in — an inequality.
If I give you enough such clues, you can find $x$. Here’s the magic: pack each clue as a vector, build a lattice (a discrete grid in high-dimensional space), and the shortest vector in that lattice encodes $x$. Modern lattice-reduction algorithms — LLL (Lenstra–Lenstra–Lovász, 1982) and BKZ (Block Korkin–Zolotarev) — find that short vector efficiently in practice.
Mapping back to ECDSA: $a_i x \bmod q$ shows up naturally in the signing equation. Whenever an implementation leaks the top (or bottom) bits of $k$, you can express it as an HNP instance and solve.
The Nguyen–Shparlinski 2002 result says you only need about $\ell \approx \log_2 \log_2 q$ bits of leakage per signature for the lattice to give up the secret. For $q$ at 256 bits, that’s around 8 bits per signature — and modern attacks recover keys with even less. Bleichenbacher-style attacks use FFTs to handle noisy or fractional-bit leaks, trading more signatures for tolerance to measurement noise.
The DATA framework (Differential Address Trace Analysis) is the standard tool for finding which lines of crypto code leak which bits. It runs the binary with two different secret inputs, diffs the address traces, and flags every divergence.
3. Big numbers, big leaks
Keys and nonces are 256+ bits long. CPUs handle 64 bits at a time. So crypto libraries store these as arrays of 64-bit chunks called limbs. The whole 256-bit number takes 4 limbs; a 521-bit number takes 9.
The big philosophical question for a BIGNUM library: when a number becomes smaller, do you shrink the buffer?
| OpenSSL / LibreSSL (minimal representation) | BoringSSL (fixed-width) | |
|---|---|---|
| Storage metric | top field tracks “currently used” limbs; bn_fix_top strips leading zeros |
width field is fixed by the modulus |
| Memory policy | Lazy: shrinks on the fly | Pre-allocated to maximum |
| Word-boundary leakage | Yes: $L = \log_2(q) \bmod w$ bits leak when a value crosses a limb boundary | None |
| What an attacker can see | Allocation patterns reveal value-dependent state | Allocation patterns are constant |
The minimal-representation approach is faster and uses less RAM. It’s also a steady drip of side-channel information.
V1 — small-nonce information leak
If the curve order $q$ sits just above a 64-bit boundary (true for secp521r1: $q \approx 2^{521}$, which is 9 full limbs plus 9 bits of the 10th), then a nonce that happens to clear the topmost bits will shrink to one fewer limb. The library’s top field changes. An attacker watching from a co-located process or via cache side-channels sees the change.
For secp521r1, the top-limb-becomes-zero event happens roughly once every 512 signatures, leaking 9 bits at a time — which is plenty for an HNP attack. (Statistic verified in Weiser et al., “Big Numbers — Big Troubles,” USENIX Security 2020 §6.4.)
Optimized field arithmetic introduces its own bugs
Curves like secp384r1 use a 56-bit “redundant limb” encoding with Solinas reduction for speed. The trade is that carries can underflow if you don’t unroll the telescoping sums correctly. Ed448-Goldilocks uses the trinomial prime $2^{448} - 2^{224} - 1$ and Karatsuba multiplication; the limbs have to be sized so the products fit in 128-bit accumulators without overflow. Both are fast, both are subtle, and both have been the source of implementation bugs over the years.
4. Lazy resizing and “k-padding”
ECDSA implementations want $k$ to look like a uniform random number in $[1, q-1]$ — so an attacker can’t see, for example, that $k$ has 250 leading zero bits and conclude $k$ is small. The classical defense is k-padding: compute $k + q$ (or $k + 2q$) before exponentiation, ensuring the bit-length is uniform.
The defense backfired.
V2 / V5 — carry-induced resize (CVE-2018-0734, CVE-2018-0735)
Computing $k + q$ might trigger a carry that pushes the value into one more limb. OpenSSL/LibreSSL respond by calling bn_wexpand, which calls malloc or realloc. An attacker on the same machine watching malloc via Flush+Reload can detect each resize.
The carry only happens when the top bits of $k$ are large. So each “resize event” leaks information about those top bits. Stack enough signatures, run HNP, recover the long-term key.
CVE-2018-0734 is the OpenSSL DSA version; CVE-2018-0735 is the OpenSSL ECDSA version. (LibreSSL inherited the same BIGNUM pattern from its original OpenSSL fork, even though the CVE numbers are OpenSSL’s.)
V4 — accidental constant-time downgrade
OpenSSL has a per-BIGNUM flag BN_FLG_CONSTTIME that triggers constant-time code paths. The patches for V2/V5 propagated this flag incorrectly through BN_consttime_swap, so the constant-time bit got cleared on the secret. Variable-time code ran. The defense for one leak silently introduced a timing leak.
This is the cat-and-mouse pattern: each fix narrows the original hole and opens a new one. It’s why “fixed-width from the start” (BoringSSL’s choice) ends up being more robust than patching minimal-representation code.
5. The rest of the nonce life cycle
Side channels lurk at every stage where you touch the secret:
Generation: rejection vs. truncation
OpenSSL hashes the random seed with the long-term private key: $k = \mathrm{SHA512}(x \mathbin| m \mathbin| \text{rnd})$. This is defense-in-depth against a broken PRNG (à la the Android wallet attack) — even if rnd is biased, including $x$ keeps $k$ unpredictable.
BoringSSL prefers rejection sampling: generate a uniformly random number, reject and retry if it falls outside $[1, q-1]$. Mathematically clean, no bias.
Modular inversion: V8 and V9
ECDSA needs $k^{-1} \bmod q$. Two implementation options:
- Binary Extended Euclidean Algorithm (BEEA). Fast, but the loop count and conditional-negate steps both depend on $k$. Two leaks for the price of one.
- Fermat’s Little Theorem. Compute $k^{q-2} \bmod q$ via fixed-window exponentiation. Constant-time.
OpenSSL’s BEEA was the source of V8 (resize during the first iteration leaked the top bit of $k$) and V9 (conditional negation leaked the magnitude). BoringSSL — and recent OpenSSL — moved to Fermat inversion.
Point addition: V7
Inside scalar multiplication, optimized NIST curve code contains checks like if (x_equal && y_equal && !z2_is_zero) to handle the special case where the running point becomes the point at infinity. When a window of the secret nonce is all zero, that conditional fires — and an attacker doing single-stepping in SGX can detect each branch outcome instruction-by-instruction.
The fix is the same as elsewhere: rewrite as constant-time bitwise selects, never a conditional.
6. Protocol-layer pitfalls
The implementation issues above all sit inside a single signature operation. Protocol-level mistakes compound them.
| Attack | Mechanism | Consequence |
|---|---|---|
| Scuttlebutt | Low-order Curve25519 points | Authentication bypass; attacker completes handshake without responder’s secret |
| Bluetooth | y-coordinate manipulation (only x is in the auth hash) | Invalid-curve attack pins shared secret to a small set |
| Tendermint (per Jackson et al., CCS 2019) | Ed25519 + signer identity not bound into the signed payload | Universal Key Substitution-style risk under stronger threat models — analytical finding, no public exploit CVE |
Twist security only helps when the attacker can’t choose to land outside both the curve and the twist (the invalid-curve case). On a non-twist-secure curve like NIST P-224, even the twist is exploitable.
7. Defense hierarchy
In rough order of impact:
- Fixed-width BIGNUM. Eliminating the
topfield is the only structural fix for representation-based leaks. Patching minimal-representation code is whack-a-mole. - Reject low-order public keys. libsodium does this by default; the Go standard library historically didn’t. This is one library-level choice with outsized security impact.
- Cofactor clamping for Curve25519 — zero the low 3 bits of the scalar. Pair with identity-element rejection so the clamping doesn’t create confinement.
- Validate every received point on the curve equation $y^2 \equiv x^3 + ax + b \pmod p$. Required even for twist-secure curves.
The patching cycle of OpenSSL CVE-2018-0734 → V4 downgrade is the cautionary tale: side-channel resistance is a moving target. Each fix needs an audit for what new leaks it might create.
8. Conclusions
ECDSA’s brittleness to nonce leaks is mathematically fundamental — if you give up bits of $k$, you give up the key. So the entire defensive burden falls on the implementation: keep $k$ secret end-to-end, including from anyone watching CPU caches, memory allocators, branch predictors, or timing.
Three concrete recommendations:
- Use Fermat inversion, not BEEA, for $k^{-1}$.
- Use fixed-width BIGNUM, not minimal representation, for any value derived from the secret.
- Validate at the protocol entry point — curve equation, low-order rejection, identity rejection — don’t rely on implementation-level twist security to save you.
The math is solid. The implementations have been the structural problem for two decades and counting.