We talked about Elliptic Curves, mostly the Short Weierstrass form and its operations. We also talked about commitments and how to commit to a polynomial using an elliptic curve. Finally, we talked about Pairings and how they can be used to prove evaluations of a committed polynomial.
There are several forms of elliptic curve definitions. We will describe the most notable one, that is the Short Weierstrass form. The elliptic curve is defined by the set of pairs
where
Elliptic curves form a group under the operation of point addition that is
Addition / Chord Rule: To add two points
Doubling / Tangent Rule: There is the case where
Notice that the formula is a bit different when $P = Q$ because the slope is different. Twisted Edwards curves have a simpler formula for such a case: both the chord and the tangent rule are the same!
See https://curves.xargs.org/ for a nice animation of this operation.
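The chord and tangent rules can be sketched in a few lines of Python. This is only an illustration: the toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, the point $(5, 1)$ and the names `add`, `is_on_curve` and `INF` are assumptions made for this example, not parameters from the lecture.

```python
# Toy curve y^2 = x^3 + A*x + B over F_P. These parameters are illustrative
# assumptions and far too small for real use. Requires Python 3.8+ for pow(x, -1, P).
P, A, B = 17, 2, 2
INF = None  # stands for the point at infinity (the group identity)


def is_on_curve(pt):
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P == 0


def add(p1, p2):
    """Add two affine points using the chord or the tangent rule."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF  # P + (-P) = O; also covers doubling a point with y = 0
    if p1 == p2:
        # Tangent rule: slope of the tangent line at P.
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        # Chord rule: slope of the line through P and Q.
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


G = (5, 1)
G2 = add(G, G)   # uses the tangent rule
G3 = add(G, G2)  # uses the chord rule
```

Note the field inversion (`pow(..., -1, P)`) in both branches; it is the expensive step in every affine addition.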
Given a point
We can add a point to itself multiple times, this is called scalar multiplication. Given a point
It is quite important to know how many points there are on the curve
This means that the number of points
It is generally not easy to find the number of points on the curve. In the best case, we would like the number of points on the curve to be some large prime number. However, we are still okay with large numbers that have some large prime factor.
Sometimes you have "families of curves" and there you may have a formula to calculate the number of points. See for example the BN254 curve. There, the number of points can be simply computed from a given parameter, which is much more efficient than using a more complicated algorithm such as Schoof's Algorithm to find the number of points.
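To get a feel for why point counting is hard, here is a naive counting sketch. It simply tries every $x$ in the field, so it takes time proportional to the field size and is hopeless for cryptographic sizes; that gap is what Schoof's Algorithm or a per-family formula closes. The function name `count_points` and the toy parameters are assumptions for this example.

```python
def count_points(p, a, b):
    """Count the points of y^2 = x^3 + a*x + b over F_p by brute force."""
    # For every field element, record how many y satisfy y^2 = that element.
    roots = {}
    for y in range(p):
        sq = y * y % p
        roots[sq] = roots.get(sq, 0) + 1
    total = 1  # start at 1 to account for the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        total += roots.get(rhs, 0)
    return total


# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative, not a real curve).
n = count_points(17, 2, 2)
```

Here `n` comes out as 19, a prime, so every point other than the identity generates the whole group.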
Pasta curves are quite interesting, i.e. the two curves Pallas and Vesta. Both curves are defined by the equation $y^2 = x^3 + 5$, but over different fields:

- Pallas curve is defined over the base field $\mathbb{F}_p$, and has $r$ points.
- Vesta curve is defined over the base field $\mathbb{F}_r$, and has $p$ points.
Mina Protocol uses these curves for efficient verification! Similarly, Nova folding scheme uses these curves for efficient verification.
In a prime order group, we would like to find a generator element
In groups with non-prime order but with a large prime factor, we instead go for a generator point
So, to make sure we have a safe generator point, we need to make sure that:
- The generator is within the curve
- The generator generates the large prime order subgroup, meaning that its order is equal to the large prime factor!
How many generators are there in a cyclic group of order $p$? There are $\phi(p)$ generators, where $\phi$ is Euler's totient function. Conveniently, if the order is prime, then you have $p-1$ generators: all elements except the identity!

If the order is not prime but has a large prime factor, with some small cofactors, you can do something called "cofactor clearing" to get a generator of the large prime order subgroup.
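Cofactor clearing can be sketched in any cyclic group. For brevity, the example below swaps the elliptic curve for the multiplicative group $\mathbb{Z}_P^*$ with $P = 2039$, whose order is $2 \cdot 1019$: raising an element to the cofactor $2$ lands it in the subgroup of prime order $1019$. On a curve, the same step would be the scalar multiplication $hG$. All names and parameters here are assumptions for illustration.

```python
from math import gcd

P = 2039           # toy prime modulus; Z_P^* has order P - 1 = 2 * 1019
Q = 1019           # the large prime factor of the group order
H = (P - 1) // Q   # the cofactor (here 2)


def totient(n):
    """Euler's totient by direct counting (fine for toy sizes only)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def clear_cofactor(a):
    """Map any group element into the subgroup of order Q."""
    return pow(a, H, P)


def order(a):
    """Order of a in Z_P^*, found by repeated multiplication."""
    k, acc = 1, a % P
    while acc != 1:
        acc = acc * a % P
        k += 1
    return k


g = clear_cofactor(7)
# A safe generator must not be the identity and must have order exactly Q.
is_safe = g != 1 and order(g) == Q
```

A prime-order group of size `Q` has `totient(Q) == Q - 1` generators, matching the claim that every element except the identity works.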
What happens if we pick a generator
Using the small subgroups, you can find the secret key
TODO: watch this again / explain more
This attack was used in several Capture-the-Flag events, such as ZKHACK or Lambda-Ingonyama ZK-CTF. In these challenges, there was either a faulty generator that's in the wrong subgroup, or something that leaked information about the discrete log, enabling the Chinese Remainder Theorem to take place in the attack.
When we store points in the curve, we usually store them in the Affine form. This is the form
Consider addition like
As an alternative, we can use the Projective form (Homogeneous projective coordinates) to store points. This is the form
In projective coordinates, you can add points without doing field inversions. The formulas are a bit more complex, but they are more efficient.
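As a rough sketch of inversion-free addition, the code below adds two distinct points in homogeneous projective coordinates, where $(X, Y, Z)$ represents the affine point $(X/Z, Y/Z)$, and compares the result against the affine chord rule. The formula is the standard textbook one for this coordinate system; the toy curve over $\mathbb{F}_{17}$ and the function names are assumptions, and doubling and the point at infinity are deliberately left out.

```python
P = 17  # toy field for y^2 = x^3 + 2x + 2; the chord rule does not use a or b


def affine_add(p1, p2):
    """Chord rule for two points with different x (needs one inversion)."""
    x1, y1 = p1
    x2, y2 = p2
    lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def proj_add(p1, p2):
    """Same addition on (X, Y, Z) triples, using only mul/add/sub."""
    X1, Y1, Z1 = p1
    X2, Y2, Z2 = p2
    y1z2 = Y1 * Z2 % P
    x1z2 = X1 * Z2 % P
    z1z2 = Z1 * Z2 % P
    u = (Y2 * Z1 - y1z2) % P   # numerator of the slope
    v = (X2 * Z1 - x1z2) % P   # denominator of the slope
    uu = u * u % P
    vv = v * v % P
    vvv = v * vv % P
    r = vv * x1z2 % P
    a = (uu * z1z2 - vvv - 2 * r) % P
    X3 = v * a % P
    Y3 = (u * (r - a) - vvv * y1z2) % P
    Z3 = vvv * z1z2 % P
    return (X3, Y3, Z3)


def to_affine(pt):
    """Convert back with a single inversion, done once at the very end."""
    X, Y, Z = pt
    z_inv = pow(Z, -1, P)
    return (X * z_inv % P, Y * z_inv % P)


# (15, 3, 3) and (12, 6, 2) are scaled representatives of (5, 1) and (6, 3).
result = to_affine(proj_add((15, 3, 3), (12, 6, 2)))
```

The saving shows up in long computations such as scalar multiplication: many additions are chained and only the final result is converted back to affine form.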
There is also the Jacobian form, which is a bit more efficient than projective. This is the form
There are many more representations, each with different levels of efficiency. You can see different point representations for Short Weierstrass at https://hyperelliptic.org/EFD/g1p/auto-shortw.html.
A point can be stored efficiently as well. For example, a curve point is given by the pair $(x, y)$, but you can store only $x$ if you want to, because $y$ can be derived from $x$ by taking the square root of the $x$-side of the curve equation. A single extra bit to indicate the positive / negative solution is enough to store the point.
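Point compression can be sketched as follows. The example assumes a toy curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$, chosen because $23 \equiv 3 \pmod 4$ makes the square root a single exponentiation; the parity of $y$ plays the role of the "sign" bit. The names `compress` and `decompress` are hypothetical.

```python
# Toy curve y^2 = x^3 + A*x + B over F_P with P % 4 == 3 (illustrative only).
P, A, B = 23, 1, 1


def compress(pt):
    """Keep x and one bit: the parity of y."""
    x, y = pt
    return (x, y & 1)


def decompress(x, parity):
    """Recover y from x and the parity bit."""
    rhs = (x * x * x + A * x + B) % P
    # For P % 4 == 3, a square root of rhs (if one exists) is rhs^((P+1)/4).
    y = pow(rhs, (P + 1) // 4, P)
    if y * y % P != rhs:
        raise ValueError("x is not the x-coordinate of a curve point")
    if (y & 1) != parity:
        y = (P - y) % P  # pick the other root
    return (x, y)


point = (3, 10)
restored = decompress(*compress(point))
```

For primes with $p \equiv 1 \pmod 4$ the square root needs a more general algorithm such as Tonelli-Shanks, but the storage trick is the same.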
The public key in Elliptic Curve Cryptography is derived using scalar multiplication. Given a private key
The best algorithms to solve Discrete Logarithm are Pollard's Rho and Baby-Step Giant-Step. They run in time
BN254 was initially thought to have 128 bits of security, but it was later subject to more clever attacks that reduced the security level to ~100 bits. (See https://eprint.iacr.org/2017/334)
In many cases $a = 0$ is picked for the curve, which simplifies the equation to $y^2 = x^3 + b$ and makes operations a bit more efficient. Some examples are: Secp256k1, BN254, BLS12-381.
The Diffie-Hellman key exchange is a protocol that allows two parties to agree on a shared secret key over an insecure channel. The protocol is based on the hardness of the discrete logarithm problem.
Alice and Bob would like to agree on a key, but first they have to "exchange" this key securely. They do this by exchanging public keys and then computing the shared secret key.
- Alice and Bob agree on a curve $E$ over a base field $\mathbb{F}_p$ and a generator point $G$. This curve has $r$ points, meaning that its scalar field is $\mathbb{F}_r$.
- Alice picks a private key $a \in \mathbb{F}_r$ and computes the public key $A = aG$. She sends this to Bob.
- Bob picks a private key $b \in \mathbb{F}_r$ and computes the public key $B = bG$. He sends this to Alice.
- Alice computes the shared secret key $S = aB = a(bG)$.
- Bob computes the shared secret key $S = bA = b(aG)$.
- Et voilà, the shared secret key is the same because $aB = bA = abG$. An eavesdropper who only sees $A$ and $B$ cannot break this because it is hard to find the discrete log!
- Now, they can derive the symmetric key they like from the secret $(ab)G$ using a key derivation function (KDF).
This is good and all, but it is not authenticated. This means that an attacker could intercept the public keys and replace them with their own. This is called a Man in the Middle attack.
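The exchange above can be sketched end to end on a toy curve. Everything concrete here is an assumption for illustration: the curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with generator $(5, 1)$ of order $19$, the double-and-add helper `mul`, and the use of a plain SHA-256 hash as a stand-in for a real KDF.

```python
import hashlib
import secrets

P, A = 17, 2     # toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative only)
G, R = (5, 1), 19  # generator and its (prime) order
INF = None       # point at infinity


def add(p1, p2):
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, pt):
    """Scalar multiplication k * pt by double-and-add."""
    acc = INF
    while k > 0:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def derive_key(shared_point):
    """Stand-in KDF: hash the x-coordinate of the shared point."""
    return hashlib.sha256(shared_point[0].to_bytes(32, "big")).digest()


a = secrets.randbelow(R - 1) + 1   # Alice's private key in [1, R-1]
b = secrets.randbelow(R - 1) + 1   # Bob's private key in [1, R-1]
A_pub, B_pub = mul(a, G), mul(b, G)  # exchanged over the insecure channel
alice_key = derive_key(mul(a, B_pub))
bob_key = derive_key(mul(b, A_pub))
```

Both sides end up with the same key without ever sending `a` or `b`. Nothing in the sketch authenticates `A_pub` or `B_pub`, which is exactly the gap a Man in the Middle exploits.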
ECDSA, Schnorr signatures and BLS signatures all are defined using an elliptic curve.
One of the simplest examples of signatures is the Schnorr signature. Consider a group
- Key Generation: The private key is a randomly picked $x \in \mathbb{Z}_q$ and the public key is $y = g^{-x}$.
- Signing: To sign a message $m$, the signer picks a random $k \in \mathbb{Z}_q$ and computes $r = g^k$ and $e = H(r || m)$. Then, the signer computes $s = k + xe$ and the signature is $(s, e)$.
- Verification: To verify the signature $(s, e)$ on message $m$, the verifier computes $r_v = g^s y^e$ and $e_v = H(r_v || m)$. The signature is valid if $e = e_v$.
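The three steps above translate almost directly into code. The sketch below follows the multiplicative notation of the description and assumes a toy group: the order-$1019$ subgroup of $\mathbb{Z}_{2039}^*$ generated by $4$, with SHA-256 reduced modulo $q$ as the hash $H$. These parameters and function names are illustrative assumptions, and the group is far too small to be secure.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4  # toy group: G generates the order-Q subgroup of Z_P^*


def H(r, m):
    """Hash r || m into Z_Q (r is encoded as two big-endian bytes)."""
    digest = hashlib.sha256(r.to_bytes(2, "big") + m).digest()
    return int.from_bytes(digest, "big") % Q


def keygen():
    x = secrets.randbelow(Q - 1) + 1   # private key in [1, Q-1]
    y = pow(G, Q - x, P)               # public key y = g^{-x}, since g^Q = 1
    return x, y


def sign(x, m):
    k = secrets.randbelow(Q - 1) + 1   # fresh random nonce for every signature
    r = pow(G, k, P)
    e = H(r, m)
    s = (k + x * e) % Q
    return (s, e)


def verify(y, m, sig):
    s, e = sig
    # g^s * y^e = g^(k + x*e) * g^(-x*e) = g^k, so r_v equals the signer's r.
    r_v = pow(G, s, P) * pow(y, e, P) % P
    return H(r_v, m) == e


x, y = keygen()
signature = sign(x, b"hello")
ok = verify(y, b"hello", signature)
```

Reusing `k` across two signatures would let anyone solve the two equations $s = k + xe$ for $x$, which is why the nonce is sampled fresh in `sign`.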
LambdaWorks has quite a lot of implementations for elliptic curves. See https://github.com/lambdaclass/lambdaworks/tree/main/math/src/elliptic_curve. They use the projective form for efficiency within their operations, and they allow conversion to affine form if needed.
Commitments are a way to commit to a value without revealing it; think of it like having a piece of data and putting the data inside an envelope. This is useful in many cryptographic protocols. A cryptographic commitment scheme has two important properties:
- Hiding: The commitment should hide the value $m$; one cannot know what is committed just by looking at the commitment.
- Binding: The commitment should bind the value $m$ to the commitment $C$; the committer cannot later open $C$ to a different value.
A cryptographic hash function is a one-way function, they are hard to invert! This means that given
SHA-2 is based on Merkle-Damgard construction, which uses a Compression function. Merkle-Damgard construction has a length-extension attack.
SHA-3 is based on Sponge construction, which has an "absorb" step and a "squeeze" step. It begins by absorbing the input, and then squeezing the sponge results in bits of the hash.
A hash function can be used within a commitment scheme.
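A minimal sketch of such a hash-based commitment: commit to a message by hashing it together with a random nonce, and open it later by revealing both. The choice of SHA-256, the 32-byte nonce and the function names are assumptions for this example.

```python
import hashlib
import hmac
import secrets


def commit(message):
    """Return (commitment, nonce). Publish the commitment, keep the nonce."""
    nonce = secrets.token_bytes(32)  # randomness that hides low-entropy messages
    commitment = hashlib.sha256(nonce + message).digest()
    return commitment, nonce


def open_commitment(commitment, message, nonce):
    """Check that (message, nonce) is a valid opening of the commitment."""
    expected = hashlib.sha256(nonce + message).digest()
    return hmac.compare_digest(commitment, expected)


c, nonce = commit(b"my secret bid: 42")
valid = open_commitment(c, b"my secret bid: 42", nonce)
```

The nonce gives hiding (without it, anyone could guess-and-hash small messages), while collision resistance of the hash gives binding.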
A Merkle Tree is a method of committing to a vector of values. Consider
We can use any cryptographic hash function within our Merkle Tree, but most people use SHA-2, SHA-3, Blake2, or Blake3; these are mostly based on bitwise operations. Within the zero-knowledge space, people use more "circuit-friendly" hashes such as Poseidon, Monolith, and Rescue; these are mostly based on algebraic operations.
When we create a binary tree of hashes, we can commit to a value by revealing the root of the tree. This is a commitment to the entire vector of values, also denoted as the Merkle Root.
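Here is a small sketch of computing a Merkle Root and checking that one value belongs to the committed vector. It assumes SHA-256 and a number of leaves that is a power of two; the function names are hypothetical, and real implementations usually add domain separation between leaf and inner hashes.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the root."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("number of leaves must be a power of two")
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def prove(levels, index):
    """Collect the sibling hash at every level below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling differs only in the lowest bit
        index >>= 1
    return path


def verify(root, leaf, index, path):
    """Recompute the root from a leaf and its sibling path."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index >>= 1
    return node == root


values = [b"a", b"b", b"c", b"d"]
levels = build_levels(values)
root = levels[-1][0]           # the commitment to the whole vector
proof = prove(levels, 2)       # membership proof for values[2]
```

The proof contains one hash per level, so its size grows logarithmically in the length of the vector.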
In particular, we will use the Merkle Trees as a way of committing to polynomials! Consider a polynomial with coefficients
Now, we look at a commitment scheme known as KZG (Kate-Zaverucha-Goldberg) commitment scheme. The main idea of KZG is to evaluate a polynomial
Consider an elliptic curve
If you were the one who received the commitment, you would have to solve discrete-log to find out the polynomial, but that is hard. This is a hiding commitment scheme. However, this is not binding, you could simply pick the constant polynomial
This is basically a set of points
So, no need to know what
In one CTF, the trick was to look at the SRS and see that the points were repeating from some point on! There, $s$ belonged to a small order subgroup.
Thanks to this new method, we now have a commitment scheme that is both hiding and binding. We have computed
MOV Attack and Cheon's Attack are attacks on the discrete log problem in the context of pairing-based cryptography.
So imagine I have a commitment
What I will do is to send you
This is a bilinear map, meaning that it is linear in both arguments. This is a very useful property for zero-knowledge proofs. A bilinear pairing has the following properties:
- $e(g_1, g_2) \ne 1$ (non-degenerate)
- $e(g_1 + g_3, g_2) = e(g_1, g_2) e(g_3, g_2)$
Now, notice that:
The non-degeneracy is helpful here because
See also "A taxonomy of pairing-friendly elliptic curves".
See this blog for info on KZG.
As described above with Multi-Scalar Multiplication (MSM), we have a commitment
If
We can commit to this polynomial
Since both
How did we get the $s$ within $(s-z)$ if we don't know the $s$? That's because the pairing makes use of $s g_2$, which is the second element in the SRS of the respective set of points.
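The algebra behind the opening check can be tested without any pairing. The sketch below computes the quotient of $f(x) - f(z)$ by $(x - z)$ with synthetic division and confirms that the identity holds at every point of a toy scalar field. In the real protocol the same identity is checked at the hidden point $s$, inside the pairing. The field size, the name `y` for the claimed evaluation and the function names are assumptions for illustration.

```python
R = 1019  # toy prime standing in for the scalar field order


def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients low-to-high) with Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % R
    return acc


def quotient(coeffs, z):
    """Return q such that f(x) - f(z) = q(x) * (x - z), by synthetic division."""
    q = [0] * (len(coeffs) - 1)
    carry = 0
    for i in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[i] + carry * z) % R
        q[i - 1] = carry
    return q


f = [3, 0, 2, 1]    # f(x) = x^3 + 2x^2 + 3
z = 5               # evaluation point
y = evaluate(f, z)  # claimed evaluation f(z)
q = quotient(f, z)  # the polynomial the prover commits to as the proof

# The verifier's equation, checked here in the clear at every field element.
identity_holds = all(
    (evaluate(f, s) - y) % R == evaluate(q, s) * (s - z) % R for s in range(R)
)
```

If the prover lied about `y`, then $f(x) - y$ would not be divisible by $(x - z)$ and no polynomial `q` could satisfy the identity everywhere.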
KZG commitments are additively homomorphic!
This is useful for batching, where you can commit to multiple polynomials at once. Halo protocol made use of this trick.
Say that we have
When you evaluate this polynomial at
Finally, we will do the division trick over this final polynomial:
We will commit to all the polynomials
The verifier will check the linear combination:
They will compute the evaluation point
Suppose that I want to show you that
If we had access to