Availability and Validity
The Availability and Validity (AnV) protocol of Polkadot allows the network to be sharded efficiently among parachains while maintaining strong security guarantees.
Phases of the AnV protocol
The AnV protocol has five phases:
Parachain phase.
Relay Chain submission phase.
Availability and unavailability subprotocols.
Secondary GRANDPA approval validity checks.
Invocation of a Byzantine fault tolerant finality gadget to cement the chain.
Parachain phase
The parachain phase of AnV is when the collator of a parachain proposes a candidate block to the validators that are currently assigned to the parachain.
CANDIDATE BLOCK
A candidate block is a new block from a parachain collator that may or may not be valid and must pass validity checks before being included in the Relay Chain.
Relay Chain submission phase
The validators then check the candidate block against the verification function exposed by that parachain's registered code. If the verification succeeds, then the validators will pass the candidate block to the other validators in the gossip network. However, if the verification fails, the validators immediately reject the candidate block as invalid.
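The accept-or-reject step described above can be sketched as follows. This is a minimal illustration, not the actual validator implementation: `handle_candidate`, `validate`, and `gossip` are hypothetical names, with `validate` standing in for the verification function exposed by the parachain's registered code.

```python
from typing import Callable


def handle_candidate(
    candidate: bytes,
    validate: Callable[[bytes], bool],   # hypothetical stand-in for the parachain's verification function
    gossip: Callable[[bytes], None],     # hypothetical stand-in for sending to other validators
) -> bool:
    """Return True if the candidate passed verification and was gossiped."""
    if not validate(candidate):
        # Verification failed: reject immediately, nothing is gossiped.
        return False
    # Verification succeeded: pass the candidate on to the other validators.
    gossip(candidate)
    return True
```

A validator would call `handle_candidate` once per candidate block it receives from a collator; a rejected candidate never reaches the gossip network.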
When more than half of the parachain validators agree that a particular parachain block candidate is a valid state transition, they prepare a candidate receipt. The candidate receipt is what will eventually be included in the Relay Chain state. It includes:
The parachain ID.
The collator's ID and signature.
A hash of the parent block's candidate receipt.
A Merkle root of the block's erasure-coded pieces.
A Merkle root of any outgoing messages.
A hash of the block.
The state root of the parachain before block execution.
The state root of the parachain after block execution.
This information is constant size while the actual PoV block of the parachain can be variable length. It is enough information for anyone that obtains the full PoV block to verify the state transition contained inside of it.
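As a rough picture of the receipt and of the majority rule that precedes it, the fields listed above can be modelled as a fixed set of values. The class name, field names, and types below are assumptions made for illustration and do not reflect the actual on-chain encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateReceipt:
    """Illustrative model of the receipt fields; all names are hypothetical."""
    para_id: int                    # the parachain ID
    collator_id: bytes              # the collator's ID
    collator_signature: bytes       # the collator's signature
    parent_receipt_hash: bytes      # hash of the parent block's candidate receipt
    erasure_root: bytes             # Merkle root of the block's erasure-coded pieces
    outgoing_messages_root: bytes   # Merkle root of any outgoing messages
    block_hash: bytes               # hash of the block
    pre_state_root: bytes           # parachain state root before block execution
    post_state_root: bytes          # parachain state root after block execution


def has_backing_majority(valid_votes: int, group_size: int) -> bool:
    """True when more than half of the assigned parachain validators voted 'valid'."""
    return 2 * valid_votes > group_size
```

Every field is an ID, a signature, a hash, or a root, which is why the receipt stays the same size regardless of how large the PoV block is.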
Availability and unavailability subprotocols
During the availability and unavailability phases, the validators gossip the erasure-coded pieces across the network. At least 1/3 + 1 of the validators must report that they possess their piece of the code word. Once this threshold of validators has been reached, the network can consider the PoV block of the parachain available.
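The threshold check can be written out as a short calculation. This sketch assumes that "1/3 + 1" means the integer part of one third of the validator count, plus one; the function names are hypothetical.

```python
def availability_threshold(n_validators: int) -> int:
    """Smallest number of reporting validators that satisfies 1/3 + 1 (assumed: floor(n/3) + 1)."""
    return n_validators // 3 + 1


def is_available(reporting_validators: set, n_validators: int) -> bool:
    """True once enough distinct validators have reported holding their piece."""
    return len(reporting_validators) >= availability_threshold(n_validators)
```

With 10 validators, for example, the threshold under this assumption is 4: three reports are not enough, and the fourth makes the PoV block count as available.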
Erasure Codes
Erasure coding transforms a message into a longer code that allows the original message to be recovered from a subset of the code, even when some portion of the code is missing. A code is the original message padded with some extra data that enables the reconstruction of the message in the case of erasures.
The erasure codes used by Polkadot's availability scheme are Reed-Solomon codes, which already enjoy battle-tested application in technology outside the blockchain industry. One example is found in the compact disc industry. CDs use Reed-Solomon codes to correct any missing data due to inconsistencies on the disc face such as dust particles or scratches.
In Polkadot, the erasure codes are used to keep parachain state available to the system without requiring all validators to keep tabs on all the parachains. Instead, validators share smaller pieces of the data and can later reconstruct the entire data under the assumption that 1/3+1 of the validators can provide their pieces of the data.
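To make the reconstruction idea concrete, here is a toy Reed-Solomon-style erasure code. It is a sketch under simplifying assumptions and not Polkadot's production encoding: it works over the small prime field GF(257), treats the k message bytes as the values of a polynomial at x = 0..k-1, and hands out n evaluations as pieces. The names `encode`, `decode`, and `P` are chosen for illustration only.

```python
P = 257  # prime modulus; every byte value 0..255 is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(message: bytes, n: int):
    """Extend a k-byte message into n pieces (x, y); the first k pieces are the message itself."""
    k = len(message)
    if not 0 < k <= n <= P:
        raise ValueError("need 0 < len(message) <= n <= 257")
    data = list(enumerate(message))
    return [(x, _interpolate(data, x)) for x in range(n)]


def decode(pieces, k: int) -> bytes:
    """Recover the k-byte message from any k pieces with distinct x coordinates."""
    if len(pieces) < k:
        raise ValueError("not enough pieces to reconstruct")
    subset = pieces[:k]
    return bytes(_interpolate(subset, x) for x in range(k))
```

For instance, an 8-byte message encoded into 12 pieces can lose any 4 pieces and still be recovered with `decode(remaining_pieces, 8)`. In the validator setting, each validator would hold one piece, and k would correspond to the 1/3+1 reconstruction threshold.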
NOTE
The 1/3+1 threshold of validators that must be responsive in order to construct the full parachain state data corresponds to Polkadot's security assumption in regard to Byzantine nodes.
Fishermen: Deprecated
The idea of Fishermen is that they are full nodes of parachains, like collators, but perform a different role in relation to the Polkadot network. Instead of packaging the state transitions and producing the next parachain blocks as collators do, fishermen will watch this process and ensure no invalid state transitions are included.
Fishermen are not available on Kusama or Polkadot and are not planned for formal implementation, despite previous proposals in the AnV protocol.
The motivation behind the Fishermen design is now addressed by the secondary backing checkers, which perform a similar role in relation to the Polkadot network. Security rests on having at least one honest validator among either the parachain validators or the secondary checkers.
Further Resources
Path of a Parachain Block - Article by Parity analyst Joe Petrowski expounding on the validity checks that a parachain block must pass in order to progress the parachain.
Availability and Validity - Paper by the W3F Research Team that specifies the availability and validity protocol in detail.