The Hidden Geometry of Trust: Subgroups, Cofactors, and Scalar Efficiency in ECC

This post is the geometric companion to the others in the April series. We’ll work through the actual shape of an elliptic curve group: what it looks like, how to add and multiply points, why subgroups appear, and how the wrong choice of starting point can collapse all your security into a tiny pocket of values an attacker can brute-force.

We’ll do a complete worked example on a tiny curve so the formulas don’t stay abstract — and we’ll see exactly how the Dual_EC_DRBG backdoor used subgroup math to read past random outputs.

Background. A group is a set with an associative operation, an identity, and inverses. Cyclic means the whole group is generated by repeating one element. The order of an element $P$ is the smallest $n$ with $nP = $ identity. We’re going to be working with a prime field $\mathbb{F}_p$ — integers mod a prime $p$, with regular addition and multiplication mod $p$.


1. Why ECC again? A 30-second recap

Classical cryptography (RSA, finite-field DH) lives inside the multiplicative integers mod $p$. The catch is that integers factor into smaller primes, which lets attackers build something called a “factor base” and break discrete logs in sub-exponential time using the index calculus algorithm — runtime $L_p[1/3, c]$, between polynomial and exponential.

Elliptic curve points don’t factor like integers, so index calculus doesn’t apply. The best classical attack is Pollard’s rho, a fully exponential $O(\sqrt n)$ method. That gap — sub-exponential vs. fully exponential — is why ECC keys are so much shorter for the same security:

Symmetric strength RSA / FFDH ECC Ratio
80 bits 1024 160 6.4×
112 bits 2048 224 9.1×
128 bits 3072 256 12×
192 bits 7680 384 20×
256 bits 15360 521 29.5×

For TLS, IoT sensors, smart cards — anywhere kilobytes matter — that’s a huge win.


2. The curve itself

The standard form (called short Weierstrass) is:

y2=x3+ax+by^2 = x^3 + ax + b

where the math happens in a finite field $\mathbb{F}_p$ for some large prime $p$. We need the curve to be smooth (no cusps, no self-intersections), which means the discriminant must be non-zero:

Δ=16(4a3+27b2)0\Delta = -16(4a^3 + 27b^2) \neq 0

The “group” we care about is the set of all points $(x, y)$ satisfying the equation, plus one extra symbolic point called $\mathcal{O}$ — the point at infinity. $\mathcal{O}$ is the identity element, so $P + \mathcal{O} = P$ for every point $P$.

The inverse of a point $(x, y)$ is just $(x, -y)$ — its mirror image across the x-axis. (In $\mathbb{F}_p$, that’s $(x, p - y)$.)

Adding points: chord and tangent

The picture: draw a line through two points $P$ and $Q$ on the curve. The line crosses the curve at exactly one third point. Reflect that third point across the x-axis — that’s $P + Q$.

If $P = Q$, you can’t draw a chord through the same point twice, so use the tangent line at $P$ instead. Same idea: tangent meets the curve at one other point, reflect across the x-axis.

The slopes:

  • Adding two distinct points ($P \neq Q$): $\quad s = \dfrac{y_2 - y_1}{x_2 - x_1} \pmod p$
  • Doubling a point ($P = Q$): $\quad s = \dfrac{3x_1^2 + a}{2y_1} \pmod p$

Then $x_3 = s^2 - x_1 - x_2$ and $y_3 = s(x_1 - x_3) - y_1$, all mod $p$. (When $P = Q$, just use $x_1$ for both.)

A note on “division” mod $p$: there’s no actual division — you compute the modular inverse instead, using the extended Euclidean algorithm or Fermat’s little theorem. We’ll do an example below.


3. Cyclic subgroups, generators, and Lagrange’s theorem

Pick any point $G$ on the curve and start computing $G, 2G, 3G, \dots$. Eventually you’ll cycle back to $\mathcal{O}$ (every element of a finite group has finite order). The smallest $n > 0$ with $nG = \mathcal{O}$ is the order of $G$.

The values ${G, 2G, \dots, nG = \mathcal{O}}$ form a subgroup of the curve — a self-contained mini-group inside the bigger one.

Lagrange’s theorem. If a finite group has $N$ elements total, every subgroup’s size must divide $N$. So if a curve has $N = 28$ points total, the only possible subgroup sizes are divisors of 28: ${1, 2, 4, 7, 14, 28}$. No subgroup of size 5 can exist, because 5 doesn’t divide 28.

The cofactor $h$ is just $h = N / n$ where $n$ is the order of the subgroup we’re using. For security we want $n$ to be a large prime and $h$ to be small (ideally 1, but Curve25519’s $h = 8$ is fine if you handle it correctly). The reason: the attacker’s hardness is governed by $n$ (Pollard’s rho is $O(\sqrt n)$). If the attacker can trick us into operating in a subgroup of size $h$ instead of $n$, security collapses to the tiny number $\sqrt h$.


4. Scalar multiplication: double-and-add

Computing $13P$ by adding $P$ to itself 13 times is $O(13)$. For real cryptographic scalars (256+ bits), that’s astronomical. The trick — same as classical exponentiation — is double-and-add, which is $O(\log k)$.

Take $k = 13 = 1101_2$. Process the binary from the most significant bit:

Bit Position Operation Running value
1 (MSB) $2^3$ Initialize $P$
1 $2^2$ Double, then add $P$ $3P$
0 $2^1$ Double $6P$
1 (LSB) $2^0$ Double, then add $P$ $13P$

For each bit you do a double; if the bit is 1 you also do an add. So computing $13P$ takes 3 doubles + 2 adds, not 12 adds.

This algorithm has a side-channel pitfall: if you only do an add when the bit is 1, an attacker watching power consumption or cache can tell which bits of $k$ are 1. Real implementations use the Montgomery ladder (or other constant-time variants) that does the same operations regardless of the bit value.


5. What goes wrong: small-subgroup attacks

What if an attacker hands you a public key that’s not in the prime-order subgroup, but instead in a small subgroup of order, say, 4?

Let’s call that point $P_\text{small}$. When you compute the shared secret $k \cdot P_\text{small}$ (where $k$ is your secret scalar), the result is forced into the subgroup generated by $P_\text{small}$ — only 4 possible values. The attacker just tries all 4 and matches against whatever you do next (encrypting with the resulting key, sending an authenticator, etc.).

This is the small-subgroup confinement attack. The fix is to verify $nQ = \mathcal{O}$ for any received public key $Q$ — i.e., $Q$ has the right order. Some implementations skip this check to save cycles. Don’t.

How Dual_EC_DRBG used the subgroup structure

Dual_EC_DRBG was a NIST-standardized random number generator that used elliptic curves. It relied on two public points $P$ and $Q$. The generator’s internal state $S$ produced output by computing $r = (SP)_x$ (the x-coordinate of $SP$), then advancing the state to $S^{\prime} = (rQ)_x$.

The backdoor: NSA had picked $P$ and $Q$ to satisfy a secret relationship $P = dQ$ for some scalar $d$ they alone knew. Now, given an output $r$:

  1. Recover the full point. The output is just an x-coordinate, with the top 16 bits chopped off. So there are about $2^{16}$ candidate points.
  2. For each candidate, multiply by the secret $d$. By the relationship $P = dQ$, this is the same as multiplying by $P$ — which gives you the output that would have come from advancing the state.
  3. Match. One of those $2^{16}$ candidates matches the next output you actually observed. That candidate was the right one. You now know the internal state and can predict every future output.

This was published by Shumow and Ferguson in 2007. NIST formally withdrew Dual_EC_DRBG in 2014 after the Snowden documents confirmed the backdoor. The lesson: when curve parameters are chosen by someone who could be hiding a discrete-log relationship, “trust” the parameters at your own risk.


6. Worked example: arithmetic on a tiny curve

Let’s compute everything by hand. Take the curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$ ($a = 1, b = 1$), and start from the point $P = (11, 3)$. The total number of points on this curve is 28 (you can verify by enumeration).

Step 1 — Compute $2P$

Slope:

s=3112+123(mod23)=3646(mod23)s = \frac{3 \cdot 11^2 + 1}{2 \cdot 3} \pmod{23} = \frac{364}{6} \pmod{23}

Reduce the numerator: $364 = 15 \cdot 23 + 19$, so $364 \equiv 19 \pmod{23}$.

Find $6^{-1} \pmod{23}$: try small numbers. $6 \times 4 = 24 \equiv 1$. So $6^{-1} = 4$.

s=19×4=767(mod23)s = 19 \times 4 = 76 \equiv 7 \pmod{23}

Now the new coordinates:

x3=s22x1=4922=274(mod23)x_3 = s^2 - 2x_1 = 49 - 22 = 27 \equiv 4 \pmod{23} y3=s(x1x3)y1=7(114)3=493=460(mod23)y_3 = s(x_1 - x_3) - y_1 = 7(11 - 4) - 3 = 49 - 3 = 46 \equiv 0 \pmod{23}

So $2P = (4, 0)$. (Sanity check: $0^2 = 0$, and $4^3 + 4 + 1 = 69 \equiv 0 \pmod{23}$. ✓)

Step 2 — Compute $3P = 2P + P$

We’re adding $(4, 0)$ and $(11, 3)$:

s=30114=37(mod23)s = \frac{3 - 0}{11 - 4} = \frac{3}{7} \pmod{23}

Find $7^{-1} \pmod{23}$: $7 \times 10 = 70 = 3 \cdot 23 + 1 \equiv 1$. So $7^{-1} = 10$.

s=3×10=307(mod23)s = 3 \times 10 = 30 \equiv 7 \pmod{23} x3=72411=4915=3411(mod23)x_3 = 7^2 - 4 - 11 = 49 - 15 = 34 \equiv 11 \pmod{23} y3=7(411)0=4920(mod23)y_3 = 7(4 - 11) - 0 = -49 \equiv 20 \pmod{23}

So $3P = (11, 20)$.

Step 3 — Compute $4P = 3P + P$

We need to add $(11, 20)$ and $(11, 3)$. The x-coordinates are equal! And $20 \equiv -3 \pmod{23}$, so $3P$ is the additive inverse of $P$. That means:

4P=3P+P=(P)+P=O4P = 3P + P = (-P) + P = \mathcal{O}

So the order of $P$ is 4. By Lagrange, the subgroup generated by $P$ has size 4. The cofactor for this choice of $P$ is $h = 28/4 = 7$.

If a protocol used this curve and accepted $P = (11, 3)$ as someone’s public key, the shared secret could only land in 4 possible values — wide open to small-subgroup confinement.


7. The engineering tradeoff

Modern ECC gives you mathematical hardness with small keys. In return it asks for discipline about which points you accept and which subgroups you operate in. Most ECC failures in the wild aren’t math failures — they’re protocol-level failures that didn’t enforce one of these rules:

  • Validate received points. They must be on the curve, and not in a small subgroup.
  • Multiply by the cofactor when needed. For Curve25519, the standard scalar clamping (zero the bottom 3 bits) takes care of this for you.
  • Use constant-time scalar multiplication. Don’t let timing or memory access leak which bits of $k$ are 1.

The math says ECDLP is hard. The implementation has to keep that hardness from leaking out the side. Subgroup discipline is where most of that work lives.