UP | HOME

Bit Limbs in Ed25519

Why 51-Bit Limbs in Ed25519?

In my previous post, we explored the use of 52-bit limbs in the scalar multiplication for the secp256k1 elliptic curve. The choice of 52-bit limbs aligns perfectly with the field arithmetic modulo \(p=2^{256}-2^{32}-977\), allowing for efficient operations on 64-bit processors. However, when it comes to the Ed25519 curve, the arithmertic takes a lightly different approach. Here, the limbs are 51 bits wide, which might seem unsual at first glance, especially if you're coming from the 52-bit world of secp256k1. So why 51 bits? Let's dive in.

The Prime Field of Ed25519:

The Ed25519 elliptic curve operates over a prime field defined by:

\[ p = 2^{255} - 19 \]

This prime is very close to a power of 2, but not quite, which makes the modular arithmetic slightly more complex than a pure power-of-2 modulus.

Representation of Large Integers:

In Ed25519, field elements are represented as a sum of multiple 510bit limbs:

\[ x = x_0 + 2^{51}x_1 + 2^{102}x_2 + 2^{153}x_3 + 2^{204}x_4 \]

where each \(x_i\) is a 51-bit integer. This allows the entire 255-bit field element to be stored across 5 limbs.

Why 51 Bits:

Using 51-bit limbs is a carfully considered choice:

  1. Efficient Carry Propagation: The use of 51-bit limbs leaves enough headroom in a 64-bit word for handling carries during arithmetic operations. For instance, when multiplying two 51-bit numbers, the result is at most 102 bits long. This fits comfortable within a 128-bit intermediate result, allowing for efficient carry propagation without immediate overflow.
  2. Modulo Reduction: The sum of five 51-bit values easily fits within 256 bits, and the reduction by \(p=2^{255}-19\) can be efficiently managed by leveraging the properties of the field.

Example: Scalar Multiplication in Ed25519

Here's how scalar multiplicaion might look in Ed25519, whare the large integer \(x\) is multiplied by a scalar 1:

typedef uint64_t limb:
typedef limb felem[5]; // 5 limbs of 51 bits each

void scalar_mul(felem out, const felem in, const limb scalar) {
     uint128_t a;

     // Multiply each limb by the scalar, and handle carries
     a = ((uint128_t) in[0]) * scalar;
     out[0] = ((limb)a) & 0x7ffffffffffff;

     a = ((uint128_t) in[1]) * scalar;
     out[1] = ((limb)a) & 0x7ffffffffffff;

     a = ((uint128_t) in[2]) * scalar;
     out[2] = ((limb)a) & 0x7ffffffffffff;

     a = ((uint128_t) in[3]) * scalar;
     out[3] = ((limb)a) & 0x7ffffffffffff;

     a = ((uint128_t) in[4]) * scalar;
     out[4] = ((limb)a) & 0x7ffffffffffff;

     out[5] += (a >> 51) * 19;
}

In this example, each limb of the input is multiplied by the scalar, and the result is accumulated into the output limbs. The carry from each multiplication is propagated to the next limb. Finally, the carry from the top limb is multiplied by 19 (corresponding to the reduction by \(2^{255} - 19\) and added back to the first limb.

Carry Propagation in Detail

As each limb is only 51 bits, there is ample room within 64-bit word to handle carries during multiplication. For example, multiplying two 51-bit numbers:

\begin{eqnarray} max_{limb} &=&2^{51}-1 \notag \\ max_{product} &=& (2^{51}-1)\times(2^{51}-1)=2^{102}-2^{52}+1 \notag \\ \end{eqnarray}

This product fits within 102 bits, and the carry can be propagated without overflow, making it an efficient process on modern prosessors that support 64-bit operations natively.

Conclusion

The choice of 51-bit limbs in Ed25519 is driven by the specific arithmetic requirements of its prime field and the need for efficinet, secure operations on 64-bit hardware. By balancing the limb size just under the native word size, Ed25519 ensures that its arithmetic remains manageable, avoiding overflow while making the most of available computational resources.


1

Sampled from curve25519-donna implementaion

follow me on 𝕏.