Second design for a proof network #197
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,257 @@ | ||
Storage proof network | ||
===================== | ||
|
||
Authors: Codex Team | ||
|
||
In this document we explore a design for an off-chain network for validating | ||
[storage proofs][1]. Instead of checking each storage proof in a smart contract | ||
on-chain, we let the proof network check these proofs. Only when a proof is | ||
missing do we go on-chain to enact slashing. The main goal of this exercise is to | ||
reduce the costs of submitting and validating proofs, which have been shown to be a | ||
limiting factor for the profitability of storage providers and the scaling of | ||
the storage network, even when deploying on a [rollup][2] or [sidechain][3]. | ||
|
||
[1]: proof-erasure-coding.md | ||
[2]: ../evaluations/rollups.md | ||
[3]: ../evaluations/sidechains.md | ||
|
||
Overview | ||
-------- | ||
|
||
The main idea is that validators in the network sign off on blocks of | ||
transactions. Transactions either contain a storage proof, or indicate that a | ||
storage proof is missing. These validators deposit stake on-chain that allows | ||
them to participate in a staked consensus protocol. This consensus protocol | ||
ensures that transactions are only validated when a subset of the validators | ||
representing > 2/3 of the total network stake have signed off on them. The | ||
assumption here is that validators holding less than 1/3 of the stake are | ||
byzantine, meaning that the rest are online and following the protocol. | ||
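
The > 2/3 stake threshold described above can be sketched as follows. This is a minimal illustration, not part of the design: the function name `has_quorum`, the validator ids and the stake values are all assumptions made for the example.

```python
def has_quorum(signers, stakes):
    """Return True when `signers` hold strictly more than 2/3 of the total stake.

    `stakes` maps a validator id to its deposited stake (hypothetical names).
    """
    total = sum(stakes.values())
    # Duplicates and unknown signers contribute nothing extra.
    signed = sum(stakes.get(validator, 0) for validator in set(signers))
    # Integer comparison avoids rounding:
    # signed / total > 2/3  <=>  3 * signed > 2 * total
    return 3 * signed > 2 * total


# Example stake table (made-up values, 100 units in total).
stakes = {"v1": 40, "v2": 30, "v3": 20, "v4": 10}
print(has_quorum({"v1", "v2"}, stakes))  # 70 of 100 units signed
print(has_quorum({"v1", "v3"}, stakes))  # 60 of 100 units signed
```

The comparison is strict, so a set of signers holding exactly 2/3 of the stake does not count as a quorum.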
|
||
Roles in the network are: | ||
|
||
- storage providers: they submit storage proofs | ||
- validators: they keep track of submitted and missed proofs, and trigger | ||
slashing on-chain | ||
|
||
The validators form a consensus network that allows them to agree on blocks of | ||
transactions. Storage providers submit transactions containing a storage proof | ||
to one of the validators, which includes it in a block of transactions. | ||
Validators also monitor the on-chain marketplace to check when storage proofs | ||
are missed. | ||
|
||
                                            consensus network

                                  --------------------------------------
      storage providers           |                                    |
                                  |             validator              |
    ---------------------         |               ^    ^               |
    |                   |         |              /      \              |
    |     provider      |         |             /        \             |
    |                   |         |            v          v            |
    |     provider  <-----------------> validator <----> validator     |
    |                   |         |            ^          ^            |
    |     provider      |         |             \        /             |
    |                   |         |              \      /              |
    ---------------------         |               v    v               |
              ^                   |             validator              |
              |                   |                                    |
              |                   --------------------------------------
              |                                     ^
              |                                     |
              |              ethereum               |
              |                                     |
              |         ------------------          |
              |         |                |          |
              \-------- |  marketplace   | <--------/
                        |                |
                        ------------------
|
||
Transactions | ||
------------ | ||
|
||
The types of transactions that can be included in blocks are: | ||
|
||
- `StorageProof(slot id, period, inputs, proof)` | ||
- `MissingProof(slot id, period, inputs)` | ||
|
||
Each transaction is signed by its sender. The *slot id* parameter refers to a | ||
[slot in a storage request][4] on the marketplace. It uniquely identifies the | ||
data for which a storage proof is required. The *period* refers to a [time | ||
interval][5] in which the storage proof should be submitted. By *proof* we mean | ||
a [zero-knowledge proof][5], and by *inputs* we mean its public inputs. | ||
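
The two transaction types can be sketched as plain records. The field types below (`bytes`, `int`, tuples of integers) and the `SignedTransaction` wrapper are assumptions for illustration; the document only names the fields.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class StorageProof:
    slot_id: bytes            # identifies a slot in a storage request
    period: int               # time interval in which the proof is due
    inputs: Tuple[int, ...]   # public inputs of the zero-knowledge proof
    proof: bytes              # the zero-knowledge proof itself


@dataclass(frozen=True)
class MissingProof:
    slot_id: bytes
    period: int
    inputs: Tuple[int, ...]


Transaction = Union[StorageProof, MissingProof]


@dataclass(frozen=True)
class SignedTransaction:
    # Hypothetical envelope: every transaction is signed by its sender.
    transaction: Transaction
    sender: str
    signature: bytes


def proof_key(tx: Transaction) -> Tuple[bytes, int]:
    """Both transaction types refer to the same required proof via (slot id, period)."""
    return (tx.slot_id, tx.period)
```

A `StorageProof` and a `MissingProof` with the same slot id and period refer to the same required proof, which is what lets validators match one against the other.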
|
||
#### StorageProof #### | ||
|
||
A storage provider sends a `StorageProof` transaction to a validator to indicate | ||
to the network that it calculated a storage proof as required by the | ||
marketplace. This validator includes it in its next block. | ||
Review comment: Is the proof sent to only one validator or to the entire network of validators? Does this validator include it in its next block, or does the validator in charge of producing the next block add the transaction to it?
Reply: The proof is sent to only one validator. In the Mysticeti design, all validators produce a block each round, so the proof is included in the next block of the validator that you sent it to. If it fails to provide you with proof that it included the transaction, then you're free to go to another validator.
Review comment: How are proofs/transactions propagated across validators?
Reply: As part of the consensus protocol, blocks of transactions are exchanged between validators.
||
|
||
#### MissingProof #### | ||
|
||
A validator includes a `MissingProof` transaction in its next block when it | ||
notices that a required storage proof was not submitted. | ||
|
||
[4]: marketplace.md | ||
[5]: https://github.com/codex-storage/codex-storage-proofs-circuits#circuit | ||
|
||
Flows | ||
------- | ||
|
||
### Successful proof submission ### | ||
|
||
Storage providers monitor the on-chain marketplace to check in which periods | ||
they need to provide a storage proof. When a provider sees that a proof is | ||
required for a slot in the current *period*, it gathers public *inputs* for the | ||
slot, including the random challenge, and calculates a zero-knowledge storage | ||
*proof*. The provider then submits `StorageProof(slot id, period, inputs, | ||
proof)` to a single validator. | ||
Review comment: Why a single validator? What if this one is malicious?
Reply: Each validator produces a block every round in the Mysticeti design, and the proof only needs to end up in one of them. If the validator is malicious, it won't be able to give an inclusion proof for the transaction, and you can simply go to another validator to get the transaction included.
||
|
||
|
||
StorageProof | ||
storage provider ---------------------------------------> validator | ||
|
||
|
||
The validator will include the transaction in the next block that it proposes to | ||
Review comment: How many validators are we talking about? How often do they produce blocks?
Reply: Not sure yet how many. In the Mysticeti paper they measured less than 5 validators to get sub-second latency numbers. But we don't really care about latency so much, so we can go higher than that. But I don't think it's practical to go for more than 100 validators, unless we go for a scheme where we pick a few validators from a larger pool every epoch.
||
the consensus network. When the proposed block is sequenced by the consensus | ||
network, the validator will return an inclusion proof to the provider. This is a | ||
proof that the transaction was included in a block, and that the consensus | ||
network included the block. | ||
|
||
|
||
inclusion proof | ||
storage provider <--------------------------------------- validator | ||
|
||
|
||
This inclusion proof doesn't need to be succinct, and can for example consist of | ||
a record of the messages that were exchanged between validators as part of the | ||
consensus protocol. | ||
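
The submission flow from the provider's side, including the fallback to another validator that is discussed in the review comments, might look like the sketch below. Modelling a validator as a function that returns an inclusion proof or `None` is an assumption made purely for illustration, and checking the returned inclusion proof is left out.

```python
from typing import Callable, Optional, Sequence

# Hypothetical model: a validator takes a serialized transaction and returns
# an inclusion proof, or None when it fails to provide one.
Validator = Callable[[bytes], Optional[bytes]]


def submit_with_fallback(tx: bytes, validators: Sequence[Validator]) -> Optional[bytes]:
    """Send `tx` to one validator at a time until one returns an inclusion proof."""
    for submit in validators:
        inclusion_proof = submit(tx)
        if inclusion_proof is not None:
            # A real provider would also verify the inclusion proof here.
            return inclusion_proof
    return None  # no validator included the transaction


def censoring_validator(tx: bytes) -> Optional[bytes]:
    return None  # never includes the transaction


def honest_validator(tx: bytes) -> Optional[bytes]:
    return b"included:" + tx  # placeholder for a real inclusion proof


result = submit_with_fallback(b"proof-tx", [censoring_validator, honest_validator])
print(result)
```

The provider only needs one validator to include the transaction, so a censoring validator delays the proof but does not block it as long as an honest validator is reachable within the period.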
|
||
Notice that validators do not check the correctness of `StorageProof` | ||
transactions prior to including them in blocks. In this design sequencing and | ||
evaluation of transactions are separated. First, the validators reach consensus | ||
on a sequence of blocks of transactions. Then, each of the validators evaluates | ||
Review comment: What is the motivation for this design? (Does it come from Mysticeti?) It looks overly complicated on first sight:
Also, what if the sequence ordering is such that a missing proof is first and the actual proof comes later?
Reply: It mimics what SUI (and Mysticeti) do. It is part of the reason why they are able to scale to a very high number of transactions per second. It minimizes what is done while forming consensus, so that the rest of the work (evaluating transactions, updating state) can be done in parallel.
That is correct, for proofs we can just about get away with it, as we argued in the first design. However, for anything more consensus is necessary. If we don't want validators to race on-chain to mark a proof as missing, if we want to add payments for including a proof, if we want to punish validators for not doing their work properly, then we need consensus of some sort.
At some point, I'd like to introduce fees for including a transaction. This avoids the validators being spammed, and gives them an incentive to do their work. This is analogous to how you can submit a transaction that reverts to an Ethereum network, and even though it doesn't change the state, you still pay for the transaction fee.
Then the proof was submitted too late. And we have consensus that this is so.
||
these transactions in order. | ||
|
||
When evaluating a `StorageProof` transaction a validator checks that it was | ||
submitted within the *period* and that the *proof* is correct with respect to the | ||
*inputs*. If that is all in order then it updates its internal accounting to | ||
reflect that the proof was submitted and correct. | ||
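
A minimal sketch of this evaluation step is shown below. The `verify` callback stands in for a real zero-knowledge verifier, `sequenced_period` is assumed to be the period in which the transaction was sequenced, and the `Accounting` record is a hypothetical shape for the validator's internal bookkeeping.

```python
from dataclasses import dataclass, field
from typing import Callable, Set, Tuple


@dataclass(frozen=True)
class StorageProof:
    slot_id: str
    period: int
    inputs: Tuple[int, ...]
    proof: bytes


@dataclass
class Accounting:
    # (slot id, period, inputs) of every proof that was evaluated as correct.
    correct: Set[tuple] = field(default_factory=set)


def evaluate_storage_proof(
    tx: StorageProof,
    sequenced_period: int,
    verify: Callable[[Tuple[int, ...], bytes], bool],
    accounting: Accounting,
) -> bool:
    """Record the proof as submitted and correct, or ignore the transaction."""
    if tx.period != sequenced_period:
        return False  # not submitted within its period
    if not verify(tx.inputs, tx.proof):
        return False  # proof is incorrect with respect to its inputs
    accounting.correct.add((tx.slot_id, tx.period, tx.inputs))
    return True


def stub_verify(inputs: Tuple[int, ...], proof: bytes) -> bool:
    return proof == b"ok"  # placeholder, not a real verifier


accounting = Accounting()
tx = StorageProof("slot-1", 7, (1, 2), b"ok")
print(evaluate_storage_proof(tx, 7, stub_verify, accounting))
```

Keeping the *inputs* in the accounting key means that a later lookup with the inputs from the marketplace will not find a proof that was submitted with different inputs.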
|
||
### Missing proofs ### | ||
|
||
Validators monitor the on-chain marketplace to check which slots require a | ||
storage proof to be submitted and what the public *inputs* for the proof are. | ||
For each required proof they check at the end of its *period* whether that proof | ||
was submitted and correct. If they did not receive a correct proof then they | ||
will add a `MissingProof(slot id, period, inputs)` transaction to the next | ||
block. | ||
|
||
The `MissingProof` transactions are sequenced by the consensus algorithm. They | ||
are then evaluated by each validator. Each validator notes the first `MissingProof` | ||
transaction that correctly reports a missing proof, and the sender of that | ||
first transaction is allowed to go on-chain to mark the proof as missing. | ||
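
Because every validator evaluates the same sequence, each of them arrives at the same reporter for a missing proof. The sketch below assumes a simplified representation: `MissingProof` transactions as tuples in consensus order, `required` as the inputs that the marketplace specifies per (slot id, period), and `correct_proofs` as the set of proofs already evaluated as correct. These names are illustrative only.

```python
def select_reporters(sequenced, required, correct_proofs):
    """Map each missing (slot id, period) to the sender of the first correct report.

    sequenced:      list of (sender, slot_id, period, inputs) in consensus order
    required:       dict mapping (slot_id, period) to the marketplace inputs
    correct_proofs: set of (slot_id, period, inputs) that were proven correctly
    """
    reporters = {}
    for sender, slot_id, period, inputs in sequenced:
        key = (slot_id, period)
        if required.get(key) != inputs:
            continue  # no proof required, or inputs differ from the marketplace
        if (slot_id, period, inputs) in correct_proofs:
            continue  # a correct proof was submitted, so the report is wrong
        # Only the first correct report counts; later ones are ignored.
        reporters.setdefault(key, sender)
    return reporters


required = {("slot-1", 7): (1, 2), ("slot-2", 7): (3, 4)}
correct_proofs = {("slot-2", 7, (3, 4))}
sequenced = [
    ("validator-b", "slot-1", 7, (1, 2)),
    ("validator-a", "slot-1", 7, (1, 2)),
    ("validator-a", "slot-2", 7, (3, 4)),
]
print(select_reporters(sequenced, required, correct_proofs))
```

In this example only `validator-b` is allowed to go on-chain for `slot-1`, and the report about `slot-2` is ignored because a correct proof was recorded for it.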
|
||
The validator that sent the first `MissingProof` transaction for a missing proof | ||
can now request BLS signatures from the other validators to enact on-chain | ||
slashing of the storage provider for missing a proof. When the validator | ||
Review comment: Do we slash SP for every single proof missed? Even the very first one?
Reply: No, I do not want to change the rules around slashing with this. So I'd keep the way we slash in the Marketplace smart contract the same. Only the way proofs are marked as missing changes, but the consequences do not.
||
receives enough signatures to represent > 2/3 stake it can combine these | ||
signatures into a single combined BLS signature. The validator can then submit | ||
*slot id*, *period*, *inputs* and the combined signature to the marketplace. | ||
|
||
The marketplace then verifies the correctness of *inputs* for the *slot id* | ||
and *period*, and checks that the combined signature represents > 2/3 of the | ||
stake. If these conditions are met, it slashes the storage provider's | ||
collateral and rewards the validator. | ||
Review comment: Is only the validator who sent the 1st MissingProof transaction rewarded?
Reply: Yes, that would be the simplest I think. The Mysticeti design suggests ordering the blocks from the validators based on a round-robin scheme, so every validator is equally capable of being the first to notice a proof as missing.
||
|
||
### Faulty proofs ### | ||
|
||
The storage proofs that a storage provider submits can be faulty for a number of | ||
Review comment: Is it possible to assign different penalties to these faulty proofs?
Reply: I don't think that would make sense, a proof is either present and valid, or not. There are no "qualities" of missing/wrong proofs. A more interesting case is when storage providers also aggregate their proofs. Then it can happen that for example they submit a single proof for 100 slots, but only 99 are valid because they actually lost 1.
||
reasons: | ||
|
||
1. The zero knowledge *proof* is incorrect | ||
2. The submitted *period* is not the current time period | ||
3. The public *inputs* to the proof do not match the values from the on-chain | ||
marketplace | ||
|
||
Faults 1 and 2 are caught by the validators. Validators check the zero-knowledge | ||
*proof*, and that the *period* is the current time period when evaluating a | ||
`StorageProof` transaction. If these are incorrect then they ignore the | ||
transaction, effectively treating the proof as missing. Validators now go | ||
through the same flow that we described in the previous section. | ||
|
||
Fault 3 is caught by the validators and the on-chain marketplace. Validators | ||
look for a correct `StorageProof` transaction that has the same *inputs* as | ||
specified by the marketplace. If they don't find one because the storage provider | ||
submitted a `StorageProof` transaction with a different value for *inputs*, then | ||
they treat the proof as missing, and go through the same flow as in the | ||
previous section. | ||
|
||
Consensus | ||
--------- | ||
|
||
Our design depends on a consensus algorithm between validators. This can be any | ||
byzantine-fault-tolerant consensus algorithm for sequencing transactions. The | ||
[Mysticeti][6] algorithm seems to be particularly suited because it is highly | ||
performant. | ||
|
||
The consensus algorithm is only used to sequence transactions. Evaluation of the | ||
Review comment: Out of curiosity: why does sequencing transactions matter?
Reply: I guess, maybe to match the right/current period? If a transaction is added too late will it miss the period?
Reply: Sequencing is not strictly necessary for storage proofs, but for anything else it becomes necessary. See my reply here: https://github.com/codex-storage/codex-research/pull/197/files#r1820368545
||
transactions is done after sequencing. This means that storage proofs can be | ||
checked in parallel, which allows validators to scale up and support a large | ||
storage network that produces many proofs. | ||
Review comment: Do validators post onchain any result of these evaluations?
Reply: No, they don't.
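
The separation of sequencing and evaluation in the Consensus section can be illustrated with a small sketch: once a block of transactions is sequenced, the proofs in it can be checked concurrently and the results applied in sequence order. The function `evaluate_block` and the `verify` callback are hypothetical. A thread pool is used only to keep the example short; a CPU-bound verifier in Python would more likely run in native code or in separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_block(transactions, verify, max_workers=4):
    """Verify already-sequenced transactions concurrently, keep them in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs, so the
        # consensus ordering is preserved even if checks finish out of order.
        results = list(pool.map(verify, transactions))
    return [tx for tx, ok in zip(transactions, results) if ok]


def stub_verify(tx):
    return tx % 2 == 0  # placeholder check: even numbers count as valid


accepted = evaluate_block([4, 1, 2, 3, 6], stub_verify)
print(accepted)
```

Since verification of one proof does not depend on the outcome of another, the checks can run side by side, and only the bookkeeping afterwards needs to follow the sequence.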
||
|
||
[6]: https://arxiv.org/pdf/2310.14821 | ||
|
||
Staking | ||
------- | ||
|
||
The marketplace smart contract only slashes a storage provider when there is a | ||
combined BLS signature that represents > 2/3 stake. | ||
Review comment: So validators only make money when SPs fail proofs? Isn't this a motivation to attack SPs?
Reply: This is no different from our current implementation. I also want to add fees for including storage proofs so that slashing isn't the only source of income for validators.
||
|
||
It can be expensive to calculate the amount of stake associated with a combined | ||
Review comment: If the stake amount is constant, would it be possible to confirm the >2/3 threshold by simply counting the number of signatures?
Reply: Yes, that is possible. However, you'd still need to figure out from how many signatures the combined signature was constructed, and that they are valid signatures from staked validators.
||
BLS signature on-chain. Because any combination that represents > 2/3 stake is | ||
valid, there can be many different valid combinations. If we have to calculate | ||
the amount of stake every time that a validator submits a combined signature | ||
that signals that a proof was missed, then the gas fees would be prohibitive. | ||
|
||
So instead we expect there to be pre-calculated combined public keys that | ||
represent > 2/3 stake majorities. The gas costs for validating the stake that | ||
these combined public keys represent can be borne by the validators when they | ||
put down their stake. | ||
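
To get a feel for how many valid combinations exist, the sketch below enumerates every subset of validators that holds > 2/3 of the stake. No real BLS arithmetic is performed: a `frozenset` of validator ids stands in for a combined public key, and the function names and the idea of a simple lookup registry are assumptions for illustration only.

```python
from itertools import combinations


def stake_majorities(stakes):
    """Return every subset of validators holding strictly more than 2/3 of the stake."""
    total = sum(stakes.values())
    validators = sorted(stakes)
    majorities = []
    for size in range(1, len(validators) + 1):
        for subset in combinations(validators, size):
            if 3 * sum(stakes[v] for v in subset) > 2 * total:
                majorities.append(frozenset(subset))
    return majorities


def is_registered(signers, registry):
    """Stand-in for the on-chain check: a lookup instead of a stake calculation."""
    return frozenset(signers) in registry


# Four validators with equal stake (made-up values).
stakes = {"v1": 10, "v2": 10, "v3": 10, "v4": 10}
registry = set(stake_majorities(stakes))
print(len(registry))
print(is_registered({"v1", "v2", "v3"}, registry))
```

With four equally staked validators there are five such subsets (four of size three and one of size four). The count grows quickly with the number of validators, which this sketch makes easy to check for small sets.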
|
||
Pros and cons | ||
------------- | ||
|
||
There are a couple of advantages to this design in which we use a consensus | ||
protocol to sequence transactions. | ||
|
||
A storage provider only needs to send its proofs to a single validator, and | ||
Review comment: But any validator can decide to censor any provider (and especially if the same SPs play both roles, this can very easily happen), so sending to a single validator seems very dangerous to me.
Reply: That is correct. I thought about that, but didn't write it down. When the validator doesn't give you an inclusion proof, you can simply switch to another validator. I'll update the text to include this.
||
the consensus protocol ensures that all validators see it. | ||
|
||
There is no need for validators to sign off on individual storage proofs. The | ||
[Mysticeti paper][6] points out that signing and verifying signatures is one of | ||
the main contributors to latency in a consistent broadcast design. By only | ||
signing entire blocks of transactions this is avoided in the Mysticeti protocol. | ||
|
||
This design is suitable for supporting other kinds of transactions later on, | ||
such as payments, payment channels, and marketplace interactions. | ||
|
||
Compared to a [previous iteration][7] of this design we have one less role. | ||
There no longer is a separate role that monitors the on-chain marketplace to | ||
check which storage proofs are required. Also, there is no longer a race to go | ||
on-chain to mark a proof as missing. Because we have a consensus protocol we can | ||
select which validator goes on-chain. | ||
|
||
In exchange for these advantages we have some drawbacks as well. | ||
|
||
The number of validators is by necessity fairly small (fewer than about 100 | ||
validators) because of the communication between the validators. Measures can be | ||
taken to increase the decentralization of the validators. We could for example | ||
introduce epochs in which some validators are chosen from a larger set of | ||
potential validators, but that comes at the expense of added complexity. | ||
|
||
The latency is larger than in a [previous iteration][7] of this design, because | ||
Review comment: If SPs are required to post 1 proof/day, validators would have enough time to reach consensus, wouldn't they?
Reply: It is 1 proof per dataset, right? Not per SP.
Reply: Yes, latency is not really a problem for our use case.
Currently yes, but if we succeed in doing local proof aggregation, then it becomes one proof per storage provider.
||
the consensus protocol requires 3 rounds of communication before it has | ||
sequenced the transactions. That might be mitigated by using the fast path from | ||
the Mysticeti protocol for `StorageProof` transactions. | ||
|
||
[7]: https://github.com/codex-storage/codex-research/pull/194 | ||
Review comment: I see many other drawbacks to this approach.
Reply: I intend to add fees for including transactions.
Through having multiple staked validators. The scheme works as long as >2/3 stake is not byzantine.
Review comment: I really would like a different name, as the network is most definitely not doing proofs. It's checking proofs.
This name is so misleading that already within the Codex team I've seen it referred to as a "prover network", while it's the exact opposite of that.
Reply: I wouldn't mind renaming this, do you have a suggestion? Perhaps "proof checking network"?
Reply: I don't have a really good name in mind. "Proof checking network" is certainly one possibility, "Verification network" or "Validator network" could be some more. From these 3 maybe I prefer the last one. But open to other suggestions.