Zero-Knowledge Knowledge

Zero-knowledge (ZK) technology is mentioned a lot in crypto these days. But all too often scant technical detail is provided. And broad claims are made alongside vague appeals to ZK somehow solving all the world's ills.

A column a few weeks ago worked through how the Privacy Pools protocol falls short of its design goals (building a regulation-compliant mixer on Ethereum) because it still requires one component of the ecosystem to break the law (the relayers). That protocol, and on-chain anonymization services in general, rely on ZK proofs.

So here we are going to explore how these things work on a semi-technical level and sketch out some limitations of what ZK can and cannot achieve.

The Basics

A ZK proof is a way of proving you know something without revealing the details. Conceptually this is like showing you know a secret without having to reveal that secret. In the real world one might prove they know the combination to a safe by retrieving the contents. That works and, if you do not reveal the combination, conveys no further information to observers.

But this does not work as well online. You would need to transmit the combination over the internet where someone could – and eventually would – steal it. What we want instead is a scheme where the secret itself never travels over the wire.

So now let us pretend we have a scheme for factoring large numbers quickly. Factoring underlies an awful lot of internet security and discovering such a scheme would dramatically change the world. So it is natural that you would not want to transmit such a scheme over the internet. But is there a way to prove you have it without handing it over?

Yes. A fairly straightforward ZK proof process presents itself. Set up a website where you factor large numbers. People can come to your site and try it. The more the site works the more people believe you can do this. What we have here is an interactive proof. Interactive because there is a prover (the website) and a verifier (the user). The more numbers are tried the higher the confidence our scheme works.
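The back-and-forth above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: trial_division stands in for the secret factoring scheme (it only copes with small numbers), and the names random_challenge and run_rounds are hypothetical, not part of any real protocol.

```python
import secrets
from typing import Callable

def trial_division(n: int) -> tuple[int, int]:
    """Stand-in for the prover's secret scheme; only practical for small n."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    return 1, n  # n is prime, so there is no non-trivial split

def random_challenge(bits: int = 16) -> int:
    """Verifier multiplies two random odd numbers, so the result is composite."""
    def odd() -> int:
        return secrets.randbits(bits) | 1 | (1 << (bits - 1))
    return odd() * odd()

def run_rounds(prover: Callable[[int], tuple[int, int]], rounds: int) -> bool:
    """Verifier side: each round is a fresh challenge and a cheap multiplication check."""
    for _ in range(rounds):
        n = random_challenge()
        p, q = prover(n)
        if p * q != n or not (1 < p < n):
            return False
    return True

print(run_rounds(trial_division, 10))
```

The verifier never sees how the factors were found. It only checks that they multiply back to the challenge, and every extra round that passes makes a bluff less plausible.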

Interactive proofs are fine. But what we really want is a non-interactive process. Some sort of proof that exists outright and does not rely on a back-and-forth conversation.

Non-Interactive ZK Proofs

A technique known as the Fiat-Shamir Heuristic provides a method for converting interactive proofs into non-interactive ones. The essential intuition is that it might be possible to bluff your way through a few rounds of verification given chosen test cases – but it will not be possible to do this if the given examples are large and random.

We can now construct a non-interactive proof fairly easily: we just show our scheme works given a single randomly chosen large example. This can be as simple as using the hash of the current time as the test case. We do not need multiple rounds of verification if the example is random and hard enough. It is sufficient to simply publish the test case and answers.
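Here is a toy version of that published test case, mirroring the simplified description above rather than the formal Fiat-Shamir transform. The assumptions: the challenge is cut down to 32 bits so the example runs instantly, factorize is a stand-in for the secret scheme, and the function names and the timestamp seed are made up for illustration.

```python
import hashlib
from math import prod

def derive_challenge(seed: bytes) -> int:
    """Anyone can recompute this: first 4 bytes of SHA-256(seed), forced to be >= 2."""
    return int.from_bytes(hashlib.sha256(seed).digest()[:4], "big") | 2

def factorize(n: int) -> list[int]:
    """Stand-in for the secret scheme: full prime factorization by trial division."""
    factors = []
    f = 2
    while f * f <= n:
        while n % f == 0:
            factors.append(f)
            n //= f
        f += 1
    if n > 1:
        factors.append(n)
    return factors

def is_prime(n: int) -> bool:
    # Toy primality check; a real verifier would use a fast primality test.
    return n >= 2 and factorize(n) == [n]

def make_proof(seed: bytes) -> dict:
    return {"seed": seed, "factors": factorize(derive_challenge(seed))}

def verify_proof(proof: dict) -> bool:
    """No conversation needed: recompute the challenge and check the answer."""
    n = derive_challenge(proof["seed"])
    factors = proof["factors"]
    return prod(factors) == n and all(is_prime(f) for f in factors)

proof = make_proof(b"2024-01-01T00:00:00Z")  # hypothetical timestamp seed
print(verify_proof(proof))
```

Because the challenge comes out of a hash, the prover does not get to pick a convenient number, which is the point of the construction.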

One other common sort of non-interactive ZK proof is a digital signature. Most crypto systems employ signatures in one way or another. And publishing the message "this is a zk proof" signed by the private keys to a given wallet is sufficient to prove you hold those keys. That serves as a non-interactive proof because there is no way to generate the signature without the private keys.

Every time you sign a transaction you are producing a (simple) type of ZK proof.
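To make that concrete, here is a minimal Schnorr-style signature, one of the classic constructions that turns an interactive proof of key ownership into a signature by hashing. It is a sketch under loud assumptions: the modulus 2**127 - 1 and base 3 are toy parameters chosen for readability and are not a secure group, real wallets use elliptic-curve signatures, and the function names are invented here.

```python
import hashlib
import secrets

P = 2**127 - 1   # a Mersenne prime; toy modulus, NOT a secure choice
G = 3            # toy base
ORDER = P - 1    # exponents can be reduced mod P - 1 (Fermat's little theorem)

def challenge(*parts) -> int:
    """Hash everything public into a number; this replaces the verifier's random question."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER

def keygen() -> tuple[int, int]:
    x = secrets.randbelow(ORDER - 1) + 1   # private key
    return x, pow(G, x, P)                 # (private, public)

def sign(x: int, message: str) -> tuple[int, int]:
    r = secrets.randbelow(ORDER - 1) + 1   # one-time secret nonce
    t = pow(G, r, P)                       # commitment
    c = challenge(G, pow(G, x, P), t, message)
    s = (r + c * x) % ORDER                # response; x stays hidden behind r
    return t, s

def verify(y: int, message: str, sig: tuple[int, int]) -> bool:
    t, s = sig
    c = challenge(G, y, t, message)
    return pow(G, s, P) == (t * pow(y, c, P)) % P

x, y = keygen()
sig = sign(x, "this is a zk proof")
print(verify(y, "this is a zk proof", sig))
```

The verifier only ever handles the public key, the message and the two signature values. The private key x never leaves the signer.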

Monero

At this point we have almost enough information to understand how Monero works. The one little bit we need to add is the concept of a ring signature. Think of a key ring where there are many keys. Ring signatures are digital signature schemes where a signature from any key in the ring looks the same to a verifier: it checks out against the whole ring without revealing which key produced it.

So imagine if there were a dozen different private keys for each Bitcoin address. Those keys would form a ring and you would not be able to tell which one was used to sign any individual transaction.
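For the curious, a ring signature can be sketched in the style of the Abe-Ohkubo-Suzuki construction. This is an assumption-laden toy, not Monero's code: Monero's production scheme is a more elaborate linkable variant over elliptic curves, the modulus and base below are insecure toy parameters, and the function names are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy modulus (a Mersenne prime), NOT a secure group
G = 3            # toy base
ORDER = P - 1    # exponents are reduced mod P - 1

def h(message: str, ring: list[int], point: int) -> int:
    data = "|".join([message, *map(str, ring), str(point)]).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER

def ring_sign(message: str, ring: list[int], k: int, x_k: int):
    """Sign with the private key x_k belonging to ring[k]; k is never published."""
    n = len(ring)
    c, s = [0] * n, [0] * n
    alpha = secrets.randbelow(ORDER)
    c[(k + 1) % n] = h(message, ring, pow(G, alpha, P))
    i = (k + 1) % n
    while i != k:                           # fake the other members with random responses
        s[i] = secrets.randbelow(ORDER)
        point = pow(G, s[i], P) * pow(ring[i], c[i], P) % P
        c[(i + 1) % n] = h(message, ring, point)
        i = (i + 1) % n
    s[k] = (alpha - c[k] * x_k) % ORDER     # only the real key can close the ring
    return c[0], s

def ring_verify(message: str, ring: list[int], sig) -> bool:
    c0, s = sig
    c = c0
    for y_i, s_i in zip(ring, s):           # walk once around the ring
        c = h(message, ring, pow(G, s_i, P) * pow(y_i, c, P) % P)
    return c == c0

keys = [secrets.randbelow(ORDER - 1) + 1 for _ in range(4)]
ring = [pow(G, x, P) for x in keys]
sig = ring_sign("pay 1 XMR", ring, 2, keys[2])
print(ring_verify("pay 1 XMR", ring, sig))
```

The verifier treats every member of the ring identically, so the signature has the same shape no matter which of the four keys actually signed.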

Monero works by relying on a process where individual transactions can have come from a large – possibly very large – set of different keys and it is not possible to tell which were used. That is the core concept. Yes each transfer can be whittled down to some set of addresses. But assume that set has 10 possibilities and we transfer tokens 10 times. Now there are 10^10 = 10 billion possibilities. If the ring has a million possibilities you can easily exceed the number of protons in the earth.
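The arithmetic is easy to check. The proton count below is a rough order-of-magnitude estimate, assuming about half of the Earth's nucleons are protons; the mass figures are the usual textbook values.

```python
ring_size = 10
hops = 10
paths = ring_size ** hops                 # 10 choices at each of 10 transfers

big_ring_paths = (10**6) ** hops          # a ring of a million, same 10 transfers

EARTH_MASS_KG = 5.97e24
NUCLEON_MASS_KG = 1.67e-27
protons_in_earth = 0.5 * EARTH_MASS_KG / NUCLEON_MASS_KG   # rough estimate only

print(f"{paths:,} paths with a ring of 10")
print(f"{big_ring_paths:.0e} paths with a ring of a million")
print(f"{protons_in_earth:.0e} protons in the earth, very roughly")
```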

Yes there are a few more technical details. But the anonymity in Monero is built around this mechanism where every transaction is equally likely to have come from a large number of addresses and it is mathematically impossible to distinguish among them.

And this presents us with a good example of the sorts of limitations that exist on what ZK can do. All the keys on the ring need to be equally likely for this thing to work. If there is some kind of asymmetry it can be attacked. So it is not going to be possible to build something on top of Monero-like schemes that offers a sort of partial privacy because the signatures must be indistinguishable. Sure someone can probably cook up a Privacy-Pools-adjacent tool for Monero. But it must also compromise the anonymity if it works at all. That is fine. But it shows that ZK cannot do everything we might want. That's how the world works and is not terribly shocking.

Zcash & Tornado Cash

These two are different from Monero but overlap a lot in the internal design. In both cases a user starts by generating some secret information. They then submit a deposit (in Zcash they initiate a transfer) alongside information derived from the secret. In Tornado Cash the user can then submit a different piece of information derived from their secret to withdraw. That information is a ZK proof that they made a corresponding deposit.

The essential feature here is that computing these pieces of information from the secret is easy – but determining which pairs share a common secret is hard. Think of this like factoring: it is easy to multiply numbers together to check a given set of factors is correct but hard to reverse the process. Possession of this second secret piece of information serves as proof you made a deposit.
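A minimal sketch of the "two pieces of information from one secret" idea, using SHA-256 with different prefixes. This is an assumption for illustration only: the real protocols use different hash functions and attach a zero-knowledge proof that the two values share a secret, and the names deposit_tag and withdrawal_tag are made up here.

```python
import hashlib
import secrets

def deposit_tag(secret: bytes) -> str:
    """Published at deposit time."""
    return hashlib.sha256(b"deposit:" + secret).hexdigest()

def withdrawal_tag(secret: bytes) -> str:
    """Published at withdrawal time."""
    return hashlib.sha256(b"withdraw:" + secret).hexdigest()

secret = secrets.token_bytes(32)   # never leaves the user's machine
print(deposit_tag(secret))
print(withdrawal_tag(secret))
```

Computing both tags from the secret takes microseconds. Given only a pile of deposit tags and a pile of withdrawal tags, there is no known shortcut for matching them up without the secrets.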

Now think of a shop that sells sandwiches. You wait in line and pay. But rather than giving you a receipt they have you pick a piece of paper from a bowl with a 100 digit random number written on it.

To pick up your sandwich later you present this random number. They can easily check if it is one of the random numbers they printed out. And the shop can easily check if this number has been redeemed already. But if the bowl contains only 100 different 100-digit random numbers there is essentially zero chance you can guess one of them outright. Possession of the number is a ZK proof you made a sandwich purchase even though it has no connection to your payment.
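The sandwich shop's bookkeeping fits in a few lines. This is a sketch of the analogy only, with a hypothetical SandwichShop class; it is not how any particular mixer is implemented.

```python
import secrets
from fractions import Fraction

class SandwichShop:
    def __init__(self) -> None:
        self.issued: set[int] = set()     # numbers handed out at the register
        self.redeemed: set[int] = set()   # numbers already swapped for a sandwich

    def sell(self) -> int:
        """Customer pays and draws a random number of up to 100 digits."""
        ticket = secrets.randbelow(10**100)
        self.issued.add(ticket)
        return ticket

    def redeem(self, ticket: int) -> bool:
        """Hand over a sandwich only for a genuine, unused number."""
        if ticket not in self.issued or ticket in self.redeemed:
            return False
        self.redeemed.add(ticket)
        return True

shop = SandwichShop()
ticket = shop.sell()
print(shop.redeem(ticket))   # genuine ticket
print(shop.redeem(ticket))   # same ticket again

# Chance of guessing one of 100 outstanding tickets in a single try.
guess_probability = Fraction(100, 10**100)
print(float(guess_probability))
```

Note that the shop records which numbers exist and which are spent, but nothing about who paid for which one.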

For Zcash, rather than submitting a withdrawal request the user, roughly, is able to compute the private key for the recipient address from their secret. And for the same reasons as Tornado, it is easy to compute that key but hard to work out which pairs of address keys are connected.

Again, immediately, we can see some limitations on what ZK can do. In Zcash fees are paid by the sender. There may well be some way to build a partial privacy scheme where you can prove one receipt did not come from a given transfer. Maybe. But in a system like Tornado Cash where the withdrawal process requires gas to be paid we have an unavoidable problem. Someone has to fund the gas.

As we previously discussed, retrofitting partial privacy on to Tornado cannot produce a compliant system because it still requires relayers (who are actively prosecuted) to anonymize gas. And in fact we can find another limitation on what ZK can do here. In a system like Ethereum where all computation is public we cannot build something like Zcash. Sure we can design a mixer where the recipient address is derivable from a secret fed in by the sender. But the deposit and withdrawal will be publicly tied together.

Why? Because everyone observing the smart contract computations can follow along as the "mixer" initiates a transfer to the destination address. That is how Ethereum works. In Zcash there is no such visible transfer – the recipient simply holds the keys. You are not going to be able to retrofit this sort of process onto Ethereum as it exists today – you will need to modify the protocol to fix it.

Scaling

At the same time this enables us to see that ZK does not enable certain sorts of L2 scaling solutions. Within a given L2 scaling may or may not be possible. But across L2s – in exactly the way a16z talks about in the first problem of their Nakamoto Challenge – this cannot scale beyond Ethereum's native L1 capacity.

This is easy to see. The only way to settle across L2s is to complete a transfer on the L1 chain. Fine. But can presenting a ZK proof from L2 X on L2 Y speed this all up? No. Let us assume we have some sort of ZK proof scheme that is portable across L2s. And now we present a proof we own a token on X to a DEX on Y.

If the DEX does not trust us and waits for the X->Y settle on mainnet clearly we did not scale. So let us assume the DEX trusts us and allows a sale of the token now while waiting for the mainnet transfer to settle. Maybe we scaled.

But, as just one example of trouble, what happens if gas on mainnet skyrockets far in excess of what we promised the DEX? What if gas exceeds the entire balance we tried to sell? Now what happens? We can wait until gas comes down. In that case every subsequent transfer of our token is stalled. Either the token must be frozen pending resolution of this problem or all subsequent transfers are at risk of being reverted.

This can go on forever where nobody ever achieves finality. Or we can have some sort of timeout. If the mainnet transfer does not complete before the timeout we give up and the DEX reverts the trade. Now consider what just happened. We may be able to scale the number of proposed transactions – but throughput is the number of transactions divided by the time taken to complete them. The timeout is our denominator. We did not scale the number of settled transactions – we scaled the number of proposed transactions.
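The distinction between proposed and settled transactions can be put into numbers. Everything in this sketch is hypothetical: the function name settle_window, the L1 capacity of 15 transactions per second, the 1,000 per second proposed across L2s and the 10 minute timeout are illustrative figures, not measurements.

```python
def settle_window(proposed_per_s: int, l1_capacity_per_s: int, timeout_s: int) -> dict:
    """Count what happens to cross-L2 trades proposed during one timeout window."""
    proposed = proposed_per_s * timeout_s
    # Only as many trades as the L1 can finalize before the timeout actually settle.
    settled = min(proposed, l1_capacity_per_s * timeout_s)
    return {
        "proposed": proposed,
        "settled": settled,
        "reverted": proposed - settled,
        "settled_per_s": settled / timeout_s,
    }

print(settle_window(proposed_per_s=1000, l1_capacity_per_s=15, timeout_s=600))
```

However large the proposed figure gets, the settled rate in this toy model never exceeds the L1 capacity. The surplus simply shows up as reverted trades.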

ZK cannot fix this because finality on the L1 requires gas and future gas can be anything. We can approximate our way around the problem with timeouts and by extending credit a bit across L2s – the DEX is extending us credit by allowing immediate trading when the L1 settlement is pending – but we cannot scale in the limit.

Again, compromises are likely possible and systems that often scale can surely be built. But the ideal of an ecosystem of L2s that reliably process a huge multiple of the L1's capacity is out of reach in a way ZK technology cannot fix.