Exploring Elliptic Curve Pairing

Vitalik Buterin
2022-08-15 19:05:50
Collection

Author: Vitalik Buterin

Original Title: 《Exploring Elliptic Curve Pairings

Published on: January 14, 2017

Many foundational constructions in cryptography, such as deterministic threshold signatures, zk-SNARKs, and other simpler forms of zero-knowledge proofs, are based on one key primitive model: elliptic curve pairings. Elliptic curves have been widely used in cryptography, digital signatures, and other applications for over thirty years. "Elliptic curve pairings" (or bilinear maps) are a recent innovation based on them, introducing cryptographic multiplication and greatly expanding what can be accomplished with elliptic curve-based agreements. This article will provide a detailed introduction to elliptic curve pairings and briefly explain how they work.

Because this concept is really not easy to grasp, it is not expected that you will fully understand it on your first or even tenth reading, but hopefully, this article will at least give you some insights into the intricacies.

Elliptic curves themselves are already a challenging topic, but this article will mostly assume that you have some understanding of how they work. If you do not have a basic concept, I would recommend this article as an introduction. In general, elliptic curves deal with mathematical objects called "points," which are simply points (x, y) on a two-dimensional plane, along with some special formulas for addition and subtraction (for example, calculating R = P + Q). You can also multiply a point by an integer (for example, P * n = P + P + … + P, although there is a much faster algorithm when n is large).

image

This is what point addition looks like graphically

In addition, there is a special point called the "point at infinity" (O), which acts as the zero element in point operations, meaning that for all points, P + O = P. A curve also has a property called "order," which means there exists a positive integer n such that for all P, P * n = O (of course, P * (n + 1) = P, P * ((7*n) + 5) = P * 5, and so on). There is also a pre-agreed "generator point" G, which is chosen to serve as the additive identity, representing the role of the number 1. In theory, any point on the curve can serve as a generator, but the important thing is that G is uniformly chosen.

Pairings further allow you to verify certain more complex equations: for example, if P = G * p, Q = G * q, R = G * r, when you want to check if p * q = r holds, you only need the coordinates of the points P, Q, and R as input. This may seem like it breaks the most basic security guarantees of elliptic curves because, at first glance, knowing the coordinates of point P leaks the value of p. However, the risk of such leakage is very limited—specifically, the Decisional Diffie-Hellman problem is easy, but the computational version remains "computationally infeasible." At least the difficulty is about the same as not knowing this value.

A third way to understand "pairings," perhaps the most enlightening in most of the contexts we discuss, is that if you view points on the elliptic curve as a one-way encryption function (that is, encrypt(p) = p * G = P), then comparing to the traditional elliptic curve mathematical principles allows you to verify linear constraints among variables (for example, P = G * p, Q = G * q, R = G * r, verifying 5 * P + 7 * Q = 11 * R is actually verifying 5 * p + 7 * q = 11 * r). Elliptic curve pairings allow you to verify quadratic constraints (verifying e(P, Q) * e(G, G * 5) = 1 is actually verifying p * q + 5 = 0). Being able to do quadratic checks is sufficient for us to perform interesting applications like deterministic threshold signatures, QAP (quadratic arithmetic programs, a form of zero-knowledge proof), and more.

Now the question is, what exactly is the operator e(P, Q) that we introduced above? This is a set of "pairings." Mathematicians sometimes refer to it as a "bilinear map," where "bilinear" essentially means it satisfies the following conditions:

e(P, Q + R) = e(P, Q) * e(P, R)
e(P + S, Q) = e(P, Q) * e(S, Q)

Note that here + and * can be any operators; when you create a new class of mathematical objects, abstract algebra does not care how + and * are "defined," as long as they are consistent with the operations we are used to, such as a + b = b + a, (a * b) * c = a * (b * c), (a * c) + (b * c) = (a + b) * c.

If P, Q, R, and S are just numbers, then the pairing function is quite easy to construct: we can define e(x, y) = 2^(xy). Then we see:

e(3, 4 + 5) = 2^(3 * 9) = 2²⁷
e(3, 4) * e(3, 5) = 2^(3 * 4) * 2^(3 * 5) = 2¹² * 2¹⁵ = 2²⁷

It is indeed bilinear!

However, such a simple pairing is not suitable for use in cryptography because analyzing purely integer-based mathematical objects is trivial; the properties of integers make operations like division and logarithms easy. Integers also lack the concepts of "public keys" or "one-way functions." Furthermore, the pairing described above is reversible: knowing x and e(x, y), you can compute y through division and logarithms. The mathematical structure we want is as close to a "black box" as possible: you can perform addition, subtraction, multiplication, and division, but that is all. This is where elliptic curves and elliptic curve pairings come into play.

It has been discovered that it is indeed possible to design bilinear maps on points of elliptic curves—that is, when the inputs are two points P and Q on the elliptic curve, a function e(P, Q) maps to an element of F_p¹² (at least in this specific case, and this specification will vary based on the details of the curve, which will be mentioned later). However, the mathematics underlying this is quite complex.

First, we introduce prime fields and extension fields. The curve drawn above is beautiful, but that is under the assumption that the curve's equation is defined over the usual real numbers. If we were to actually use real numbers in cryptography, you could "reverse" using logarithms, and that would render everything useless, not to mention the space required to store real numbers could be infinitely large. Therefore, we use numbers over finite fields.

A prime field consists of the set of numbers 0, 1, 2, …, (p−1), where p is a prime number, and its operations are defined as follows:

a + b: (a + b) % p
a * b: (a * b) % p
a - b: (a - b) % p
a / b: (a * b^(p-2)) % p

Basically, all operations are performed modulo p (here is a brief introduction to modular arithmetic). Division is a special case. Generally, 3/2 is not an integer, but we want to only deal with integers, so we change it to find an integer x such that x * 2 = 3, where * refers to the modular multiplication defined above. Fortunately, Fermat's Little Theorem ensures that the exponent for division meets the requirements, but there is also a faster method using the Extended Euclidean Algorithm. Assuming p = 7, here are a few examples:

2 + 3 = 5 % 7 = 5
4 + 6 = 10 % 7 = 3
2 - 5 = -3 % 7 = 4
6 * 3 = 18 % 7 = 4
3 / 2 = (3 * 2^5) % 7 = 5
5 * 2 = 10 % 7 = 3

If you try to operate with these calculations, you will find that they are consistent and satisfy all the usual rules. The last two examples illustrate that (a / b) * b = a, and you can also find (a + b) + c = a + (b + c), (a + b) * c = a * c + b * c, and other algebraic identities you learned and loved in high school still hold. The elliptic curves used in practice typically operate over prime fields.

Now let's talk about extension fields. You may have encountered extension fields before; the most common example in math textbooks is the field of complex numbers—formed by extending the field of real numbers with the new element sqrt(-1) = i. In simple terms, an extension field is created by "inventing" a new element over an existing field and defining the relationship between this new element and the existing elements (in the previous example, i² + 1 = 0); this relationship cannot be satisfied by existing numbers, so the set formed by adding this new element to "all linear combinations of the old elements" is the constructed set.

image

We can also extend prime fields; for example, by adding the i we just mentioned to the prime field modulo 7, we have:

(2 + 3i) + (4 + 2i) = 6 + 5i
(5 + 2i) + 3 = 1 + 2i
(6 + 2i) * 2 = 5 + 4i
4i * (2 + i) = 3 + i

The last equation may be harder to understand; actually, the first step of that equation is to distribute the multiplication on the left side to become 4i * 2 + 4i * i, resulting in 8i - 4. Since we are performing calculations in the modulo 7 environment, this number becomes i + 3. As for the division part:

a / b : (a * b^(p²-2)) % p

Here, the exponent in Fermat's Little Theorem changes from p to p², and of course, the Extended Euclidean Algorithm can also be used for more efficient calculations. Because for any element x in the field, x^(p² − 1) = 1, we refer to (p² − 1) as the "order of the multiplicative group in the field."

For the field of real numbers, the Fundamental Theorem of Algebra guarantees that its quadratic extension (the field of complex numbers) is complete (note: it should be algebraically closed; the field of real numbers is also complete but can be extended)—this field cannot be further extended because all possible new elements j that should satisfy the mathematical relationships (strictly speaking, defined by polynomials) with existing complex numbers already have an element in the field that satisfies them. However, in prime fields, this is not the case; we can perform cubic extensions (where the new element w has a relationship defined by a cubic polynomial, making 1, w, w² linearly independent), higher-order extensions, and even further extensions. Elliptic curves are built on these modular arithmetic complexes.

If you are interested in how to implement these mathematical concepts in code, here are some examples of implementing prime fields and extension fields.

Now back to elliptic curve pairings. Elliptic curve pairings (the pairings we discuss here are just one type, but the logic of pairings is similar) are a mapping from G2 × G1 → Gt, where:

  • G1 is an elliptic curve, with points on the curve satisfying equations of the form y² = x³ + b, and the x and y coordinates of the points are elements of F_p (which means they are just ordinary numbers, but arithmetic operations are performed modulo some prime).
  • G2 is also an elliptic curve, similarly satisfying the curve of G1, but the x and y coordinates of the elements in G2 are elements of F_p¹² (these are the complex numbers we just mentioned; we define a magical number w that satisfies a 12th-degree polynomial w¹² − 18 w⁶ + 82 = 0).
  • Gt is the set formed by the results of elliptic curve operations. In the curves we discuss, Gt is F_p¹² (using the same complex numbers as G2).

The main property it must satisfy is bilinearity, represented in this context as:

e(P, Q + R) = e(P, Q) * e(P, R)
e(P + Q, R) = e(P, R) * e(Q, R)

(Note: the bilinearity here is technically "bilinearity over ℤ," simply because the points on the elliptic curve being discussed are integer points, and the definitions of multiplication and addition are integer operations taken modulo, naturally satisfying some linear properties.)

There are two important criteria for choosing a pairing function:

  • The operations must be efficient enough (for example, we could directly take discrete logarithms of all points and multiply them together as a simple pairing method, but the computational cost required is as difficult as breaking elliptic curve cryptography, so this does not count).
  • Non-degenerate (you could simply define e(P, Q) = 1, but it is not a particularly useful pairing).

So how do we accomplish this?

The mathematics behind making the pairing function work is quite difficult and requires some advanced algebra, beyond what we have seen so far, but I will outline it broadly. First, we need to define the concept of a "divisor," which is essentially another way to represent functions acting on points of elliptic curves. A function's divisor essentially counts how many zeros and points where the function takes infinite values it has. To clarify, let's look at a few examples. Let’s fix a point P = (Px, Py), and then consider the following function:

f(x, y) = x − P_x

Its divisor is [P] + [−P] − 2 * [O] (the brackets here denote the set of points where a certain point appears as a zero or takes infinite values, rather than the point itself; [P] + [Q] and [P + Q] are not the same). The reasons are as follows:

  • The value of this function at point P is zero because x takes the value Px, so x − Px = 0.
  • The value of this function at point −P is zero because −P and P have the same x-coordinate.
  • The function approaches infinity as x approaches infinity, so we say this function takes infinite value at point O. The calculation for this point at infinity counts twice, so O is multiplied by the factor -2 (the negative sign is because it is a point at infinity, distinct from a zero point, and 2 is because it is counted twice).

The reasoning behind the calculation is roughly: because the equation of this curve is x³ = y² + b, as x increases, for y² to reach a comparable scale, y must grow at a rate approximately 1.5 times that of x. Therefore, if a linear function only contains x, its infinite term's coefficient is 2, but if it contains y, the coefficient must be 3.

Now consider the function of a line:

ax + by + c = 0

where a, b, c are chosen so that this line passes through points P and Q. According to the way elliptic curve addition works, it must also pass through the point −P−Q. Since it depends on both x and y as it approaches infinity, the divisor is [P] + [Q] + [−P−Q] − 3 * [O].

image

We know that all "rational functions" (that is, functions defined by a finite number of additions, subtractions, multiplications, and divisions of point coordinates) correspond uniquely to a divisor, at most multiplied by a constant (that is, if two functions F and G have the same divisor, then there must exist a constant k such that F = G * k).

For any two functions F and G, the divisor of (F * G) equals the sum of the divisors of F and G (in math textbooks, you will see (F * G) = (F) + (G)), so for example, if f(x, y) = P_x − x, then (f³) = 3 * [P] + 3 * [−P] − 6 * [O]; P and −P are counted three times because in a specific mathematical sense, f³ approaches 0 "three times faster."

Note that there is a theorem stating that if you remove the "brackets" from a divisor, the result of those point operations must be O (performing [P] + [Q] + [−P−Q] − 3 * [O] will yield P + Q − P − Q − 3 * O = O), and any divisor with this property will be a divisor of that function.

Now we can start looking at the Tate pairing. Consider the following functions defined by divisors:

  • (F_P) = n * [P] − n * [O], where n is the order of G1, meaning for all P, n * P = O
  • (F_Q) = n * [Q] − n * [O]
  • (g) = [P + Q] − [P] − [Q] + [O]

Now, let’s look at the product FP * FQ * gⁿ. Its divisor is:

n * [P] − n * [O] + n * [Q] − n * [O] + n * [P + Q] − n * [P] − n * [Q] + n * [O]

Simplifying gives us a clean result:

n * [P + Q] − n * [O]

Notice that the format of this divisor matches that of FP and FQ. Therefore, we have FP * FQ * gⁿ = F_(P + Q).

Now, perform an operation called the "final exponentiation," raising the result of the previous calculations (FP, FQ, etc.) to the power z = (p¹² − 1) / n, where p¹² − 1 is the order of the multiplicative group in Fp¹² (in other words, for all x ϵ Fp¹², x^(p¹² − 1) = 1). Note that if you apply this exponent to any result that is already an n-th power, you will get some element raised to the (p¹² − 1) power, and the result will become 1. Thus, after the final exponentiation step, gⁿ will cancel out, and we will get FP^z * FQ^z = F_(P + Q)^z. Thus, we have part of the bilinear property.

Now, if you want to construct a function that is bilinear in both parameters, you need more daunting mathematics; what you need to do is not just calculate FP, but also calculate the divisor of FP, and going further will yield the complete Tate pairing. To prove more conclusions, you need to understand concepts like "linear equivalence" and Weil reciprocity. Detailed content on these concepts can be found here and here.

Here is an implementation of a modified Tate pairing—Optimal Ate Pairing. This code also implements the calculation of F_p using the Miller algorithm.

In fact, using such pairings is like wielding a double-edged sword: on one hand, it means we can use this pairing to implement different agreements, while on the other hand, it means we must be particularly careful when choosing which elliptic curve to use.

Each elliptic curve has a value called the embedding degree—the smallest integer k such that p^k − 1 is a multiple of n (where p is the prime used in the field, and n is the order of the curve). In the fields mentioned above, k = 12; in traditional elliptic curve cryptography (not considering pairings), the embedding degrees of the fields used are usually very large, making the computation of pairings computationally infeasible. However, it is quite possible to construct fields with k = 4 or even k = 1 if we are not careful.

When k = 1, the "discrete logarithm problem" on the elliptic curve (calculating the value of p when only knowing P = G * p) can degenerate into a relatively simple problem over F_p (this method is called the MOV attack); using an elliptic curve with an embedding degree of at least 12 guarantees that this degeneration is infeasible, or that the resulting problem is complex enough to be at least as difficult as calculating the private key from the public key using "normal" methods. Currently, all standard curve parameters have been carefully checked to avoid this issue.

ChainCatcher reminds readers to view blockchain rationally, enhance risk awareness, and be cautious of various virtual token issuances and speculations. All content on this site is solely market information or related party opinions, and does not constitute any form of investment advice. If you find sensitive information in the content, please click "Report", and we will handle it promptly.
ChainCatcher Building the Web3 world with innovators