Blog by Oleg Fomenko

QAP - Quadratic Arithmetic Program

Special thanks to Anton Levochko and the ZK-DL lectures (zkdl-camp.github.io), which provided the base material for this post.

In the previous post, I described an approach to characterize polynomial constraints in R1CS form: \(\langle \mathbf{a}, \mathbf{w} \rangle \cdot \langle\mathbf{b}, \mathbf{w}\rangle = \langle\mathbf{c}, \mathbf{w}\rangle\), where a single constraint is represented by the public vectors \(\mathbf{a}, \mathbf{b}, \mathbf{c}\). In the example with a more complex circuit, we generated 4 constraints, so we have 4 tuples of vectors, one tuple per constraint. Of course, it is very difficult to prove big circuits using only R1CS, which is why we want to build a more efficient system on top of it. This system is called a Quadratic Arithmetic Program (QAP).

At the beginning, we should revisit the note from the previous post, where I mentioned that we can combine all the tuples \(\mathbf{a}, \mathbf{b}, \mathbf{c}\) into the corresponding matrices \(A, B, C\) and represent our final R1CS circuit as \(A\mathbf{w} \circ B\mathbf{w} = C\mathbf{w}\), where \(\circ\) denotes the element-wise (Hadamard) product. Then, for each column \(j\) (which corresponds to the witness's \(j\)-th value) we want to build a polynomial \(p_j(X)\) such that for each constraint \(i \in \{1, 2, \dots\}\) the value \(p_j(i)\) equals the coefficient of the \(j\)-th witness element in constraint \(i\). This can be done using any appropriate interpolation method (and not only on \(X \in \{1, 2, \dots\}\): we can use any fixed set of indexed evaluation points if the corresponding algorithm requires it). Then, having obtained polynomials \(A_j(X), B_j(X), C_j(X)\) for each column of the \(A, B, C\) matrices, we can rewrite our R1CS as \((w_1A_1(X) + \dots + w_nA_n(X)) \cdot (w_1B_1(X) + \dots + w_nB_n(X)) = (w_1C_1(X) + \dots + w_nC_n(X))\), evaluated on the domain \(X \in \{1, 2, \dots\}\).
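To make the interpolation step concrete, here is a minimal Python sketch (the field modulus and the toy matrix below are made up for illustration and are not the circuit from the previous post): each column of \(A\) is interpolated into a polynomial that reproduces that column's values on the domain \(\{1, 2, \dots\}\).

```python
# Toy sketch: turning R1CS matrix columns into QAP polynomials via Lagrange
# interpolation over a small prime field (all values are illustrative).
P = 97  # toy field modulus

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def lagrange(points):
    """Interpolate {(x_i, y_i)} into a coefficient list mod P."""
    n = len(points)
    result = [0] * n
    for i, (xi, yi) in enumerate(points):
        num, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            num = poly_mul(num, [(-xj) % P, 1])  # multiply by (X - x_j)
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        result = [(r + scale * c) % P for r, c in zip(result, num)]
    return result

def eval_poly(coeffs, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P

# Toy R1CS matrix A (rows = constraints, columns = witness entries).
A = [[0, 1, 0],
     [0, 0, 1]]
# Column j becomes A_j(X) with A_j(i) = A[i-1][j] on the domain {1, 2}.
A_polys = [lagrange([(i + 1, A[i][j]) for i in range(len(A))])
           for j in range(len(A[0]))]
assert all(eval_poly(A_polys[j], i + 1) == A[i][j]
           for i in range(len(A)) for j in range(len(A[0])))
```

The same interpolation is applied to the columns of \(B\) and \(C\).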

Let’s rename the parts of this equation as \(A(X) = w_1A_1(X) + \dots + w_nA_n(X)\), \(B(X) = w_1B_1(X) + \dots + w_nB_n(X)\), \(C(X) = w_1C_1(X) + \dots + w_nC_n(X)\). Then our R1CS takes the form \(A(X) \cdot B(X) = C(X)\) on that domain.

Note that \(A(X) \cdot B(X) - C(X)\) equals some polynomial \(M(X)\) that has zeros at \(X \in \{1, 2, \dots\}\). To convince the verifier that we know such a polynomial with the corresponding zeros, we can use a popular principle: prove knowledge of the quotient polynomial \(H(X) = \frac{M(X)}{Z(X)}\) (where \(Z(X) = \prod_i (X - i)\) is known to everyone), i.e. prove knowledge of polynomials that satisfy the equation \(A(X)B(X) - C(X) = H(X)Z(X)\).
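Here is a small continuation of the same kind of sketch (toy field, made-up polynomials): we build \(Z(X)\) for the domain \(\{1, 2\}\), take an \(M(X)\) that vanishes on it, and check that polynomial long division by \(Z(X)\) leaves a zero remainder.

```python
# Toy sketch: dividing M(X) by the vanishing polynomial Z(X).
# A zero remainder means M(X) vanishes on the whole domain.
P = 97

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def poly_divmod(num, den):
    """Classic long division over F_P; coefficient lists, low degree first."""
    num = num[:]
    deg_d = len(den) - 1
    inv_lead = pow(den[-1], -1, P)
    quot = [0] * max(len(num) - deg_d, 1)
    for i in range(len(num) - 1, deg_d - 1, -1):
        q = num[i] * inv_lead % P
        quot[i - deg_d] = q
        for k, dk in enumerate(den):
            num[i - deg_d + k] = (num[i - deg_d + k] - q * dk) % P
    return quot, num[:deg_d]

# Z(X) = (X - 1)(X - 2) for the domain {1, 2}.
Z = poly_mul([(-1) % P, 1], [(-2) % P, 1])
# Pretend M(X) came out as (X - 1)(X - 2)(X + 3); it vanishes on the domain.
M = poly_mul(Z, [3, 1])
H, rem = poly_divmod(M, Z)
assert all(r == 0 for r in rem)   # divisible; H(X) = X + 3 here
```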

The proving algorithm for this relation relies on the Schwartz–Zippel lemma: for a non-zero polynomial \(P \in F[x]\) of degree \(d\) and a uniformly selected \(t \in F\), the probability that \(P(t) = 0\) is at most \(\frac{d}{|F|}\).

This lemma becomes very useful when we want to prove knowledge of a polynomial by evaluating it at a random point. If a non-zero polynomial could easily evaluate to zero at a random point, such a proving system would not make sense, but when the probability of that event is small enough (for example, about \(2^{-234}\) for a degree-\(10^6\) polynomial over a 254-bit field), the verifier can be confident that equality at the random point implies equality of the underlying polynomials.

So, we consider the following protocol (a toy sketch of the exchange follows the list):

  1. Prover shares the coefficients of \(H(X)\), denoted \(\mathbf{h} = (h_0, h_1, \dots, h_{n-1})\).
  2. Verifier selects a random \(\gamma\) and sends \(\boldsymbol{\gamma} = (\gamma^0, \gamma^1, \dots, \gamma^{n-1})\).
  3. Prover responds with the values \(\pi_A = A(\gamma)\), \(\pi_B = B(\gamma)\), \(\pi_C = C(\gamma)\), \(\pi_M = Z(\gamma) \cdot \langle \boldsymbol{\gamma}, \mathbf{h}\rangle\).
  4. Verifier accepts if \(\pi_A\pi_B - \pi_C = \pi_M\).
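A toy simulation of the four steps, assuming the same coefficient-list representation as in the sketches above. Note that the instance is built backwards (we pick \(A, B, H\) and derive \(C\)) just to get a consistent example; a real prover obtains \(H\) by dividing \(A \cdot B - C\) by \(Z\).

```python
# Toy simulation of the interactive protocol (illustrative values only).
import secrets

P = 2**61 - 1  # a larger toy prime, so a random zero hit is negligible

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def poly_sub(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x - y) % P for x, y in zip(a, b)]

def eval_poly(coeffs, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P

# Toy instance on the domain {1, 2}: Z(X) = (X - 1)(X - 2) = X^2 - 3X + 2.
Z = [2, (-3) % P, 1]
A = [5, 7]   # A(X) = 5 + 7X
B = [1, 3]   # B(X) = 1 + 3X
H = [4]      # H(X) = 4
C = poly_sub(poly_mul(A, B), poly_mul(H, Z))  # forces A*B - C = H*Z

# 1. Prover shares the coefficients of H(X).
h = H
# 2. Verifier selects a random gamma (and sends its powers gamma^0..gamma^{n-1}).
gamma = secrets.randbelow(P)
# 3. Prover responds with the evaluations; <gamma_vec, h> is simply H(gamma).
pi_A, pi_B, pi_C = eval_poly(A, gamma), eval_poly(B, gamma), eval_poly(C, gamma)
pi_M = eval_poly(Z, gamma) * eval_poly(h, gamma) % P
# 4. Verifier accepts if pi_A * pi_B - pi_C == pi_M.
assert (pi_A * pi_B - pi_C) % P == pi_M
```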

Ok, this protocol satisfies the correctness (completeness) property, but what about soundness and zero-knowledge? The values sent to the verifier reveal a lot of information about the witness: the polynomials interpolated from the witness-coefficient columns are public and can be computed by anyone, so \(\pi_A, \pi_B, \pi_C\) are known linear combinations of the witness entries. To deal with this problem we move on to the next building block, called Proof of exponent.

Proof of exponent

Imagine we have a polynomial \(p(x)\) and we want to convince the verifier that it is divisible by another public polynomial \(t(x)\). Consider the following protocol: the verifier shares the powers of tau \(\{g^{\tau^i}\}\) for some generator \(g\), i.e. securely generated powers of a randomly selected \(\tau\), such that the prover never learns \(\tau\) itself. Then the prover calculates the polynomial encryption of \(p(x)\) as \(g^{p(\tau)} = \prod_i (g^{\tau^i})^{a_i}\), where the \(a_i\) are the coefficients of \(p(x)\). The same is done for \(h(x) = \frac{p(x)}{t(x)}\). The verifier checks that \(g^{p(\tau)} = (g^{h(\tau)})^{t(\tau)}\).
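Below is a toy sketch of this check in a small multiplicative group; the group parameters and polynomials are purely illustrative (real systems use elliptic-curve groups and a multi-party powers-of-tau setup), and in this designated-verifier toy the verifier simply keeps \(\tau\) to perform the final check.

```python
# Toy sketch of the encrypted-polynomial divisibility check.
import secrets

P_GROUP = 2039   # safe prime: 2039 = 2*1019 + 1
Q = 1019         # prime order of the subgroup generated by g
g = 4            # generator of that order-Q subgroup

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out

def poly_eval(coeffs, x):
    return sum(c * pow(x, k, Q) for k, c in enumerate(coeffs)) % Q

# Setup: the verifier samples tau and publishes only the powers g^{tau^i}.
tau = secrets.randbelow(Q)
powers_of_tau = [pow(g, pow(tau, i, Q), P_GROUP) for i in range(5)]

def encrypt_poly(coeffs):
    """g^{poly(tau)} computed purely from the published powers of tau."""
    acc = 1
    for i, c in enumerate(coeffs):
        acc = acc * pow(powers_of_tau[i], c, P_GROUP) % P_GROUP
    return acc

# Prover: p(x) is divisible by the public t(x); here p = t * h by construction.
t = [1, 1]            # t(x) = x + 1
h = [3, 0, 2]         # h(x) = 2x^2 + 3
p_poly = poly_mul(t, h)
g_p = encrypt_poly(p_poly)   # g^{p(tau)}
g_h = encrypt_poly(h)        # g^{h(tau)}

# Verifier (who kept tau in this toy) checks g^{p(tau)} == (g^{h(tau)})^{t(tau)}.
assert g_p == pow(g_h, poly_eval(t, tau), P_GROUP)
```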

Unfortunately, this is not enough to reach the soundness property: the prover can still pick an arbitrary \(r\), present \(g^r\) as the encryption of \(h\) and \(g^{r \cdot t(\tau)}\) as the encryption of \(p\), and the check \(g^{r \cdot t(\tau)} = (g^r)^{t(\tau)}\) passes even though no actual polynomial was encrypted. That is why we should add a “shifted” evaluation of our polynomials by some randomly selected \(\alpha\) (kept from the prover in the same way as \(\tau\)): the verifier additionally shares \(g^{\alpha\tau^i}\) with the prover, and the prover adds to the response the shifted polynomial encryption \(g^{\alpha p(\tau)}\). Then the verifier adds the check \(g^{\alpha p(\tau)} = (g^{p(\tau)})^\alpha\). As long as the raw values of \(\alpha\) and \(\tau\) are not accessible to the prover, we reach the soundness property.
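Continuing the sketch above (it reuses \(g\), the moduli, \(\tau\), `encrypt_poly` and `p_poly` from the previous block), the \(\alpha\)-shifted powers and the extra check look roughly like this:

```python
# The verifier additionally samples alpha and publishes g^{alpha * tau^i}.
alpha = secrets.randbelow(Q)
alpha_powers = [pow(g, alpha * pow(tau, i, Q) % Q, P_GROUP) for i in range(5)]

def encrypt_poly_shifted(coeffs):
    """g^{alpha * poly(tau)} computed from the shifted powers of tau."""
    acc = 1
    for i, c in enumerate(coeffs):
        acc = acc * pow(alpha_powers[i], c, P_GROUP) % P_GROUP
    return acc

g_p_alpha = encrypt_poly_shifted(p_poly)   # g^{alpha * p(tau)}

# Verifier (who kept alpha) checks that the same polynomial was used both times.
assert g_p_alpha == pow(g_p, alpha, P_GROUP)
```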

Unfortunately (x2), we still have not reached zero-knowledge. Because \(\tau\) and \(\alpha\) are generated by the verifier, we should additionally add our “own shifts” to hide our polynomials in case the \(\tau\) and \(\alpha\) setup is compromised. It is actually quite simple: during the proving phase we select a random \(\delta\) and additionally multiply all our polynomials by it (sharing \(g^{\delta \alpha p(\tau)}\) instead of \(g^{\alpha p(\tau)}\), and doing the same for the other polynomial encryptions). The proof is still verifiable, because \(\delta\) appears on both sides of every check, yet it hides the information about the polynomials even if the raw value of \(\tau\) or \(\alpha\) is known by somebody.
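On top of the previous sketches (again reusing the same toy definitions), the \(\delta\)-blinding might look like this: every encrypted polynomial is raised to the prover's own random \(\delta\) before being sent, and both verifier checks still pass because \(\delta\) appears on both sides.

```python
# The prover samples its own blinding factor delta.
delta = secrets.randbelow(Q)

g_p_blind       = pow(encrypt_poly(p_poly), delta, P_GROUP)          # g^{delta * p(tau)}
g_h_blind       = pow(encrypt_poly(h), delta, P_GROUP)               # g^{delta * h(tau)}
g_p_alpha_blind = pow(encrypt_poly_shifted(p_poly), delta, P_GROUP)  # g^{delta * alpha * p(tau)}

# Divisibility check and alpha-shift check still hold on the blinded values.
assert g_p_blind == pow(g_h_blind, poly_eval(t, tau), P_GROUP)
assert g_p_alpha_blind == pow(g_p_blind, alpha, P_GROUP)
```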