refactor: align QBFT quorum with ssv-spec (2f + 1)#966
refactor: align QBFT quorum with ssv-spec (2f + 1)#966shane-moore wants to merge 5 commits intosigp:unstablefrom
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
517d643 to
890875f
Compare
| } | ||
|
|
||
| // Validate `quorum_size`: must fall in `[2f + 1, quorum_size(N)]`. | ||
| // Validate `quorum_size`: must fall in `[2f + 1, N − f]`. |
There was a problem hiding this comment.
From where is this requirement?
There was a problem hiding this comment.
To elaborate a bit: this is not just a comment issue, the code is actually accepting any quorum_size in the interval [2f + 1, N - f].
I don’t think that matches the upstream contract. Go SSV / ssv-spec define quorum as exactly 2f + 1, not as a range of valid thresholds. So this builder is preserving an Anchor-local notion that multiple quorum values are acceptable for the same committee size.
Also, there does not seem to be a public with_quorum_size(...) setter anymore, which makes this even harder to reason about: why is quorum treated as an independently validated field here instead of being derived from the committee size?
I’d expect this to enforce one exact quorum rule for the committee, not a lower/upper bound interval.
There was a problem hiding this comment.
We should also validate committee_size itself here to match the upstream Go/SSV contract.
Go SSV does not treat arbitrary committee sizes as valid QBFT input; it expects canonical N = 3f + 1 sizes. Right now this builder validates quorum against committee_size, but it does not validate that the committee size is canonical in the first place.
If we want Go/spec parity, this check should live here at the config/build boundary rather than implicitly allowing non-canonical committee sizes through.
There was a problem hiding this comment.
Good pushes - all three points here addressed together:
Range check ([2f + 1, N - f]): no documented rationale, and since there was never a public with_quorum_size(...) setter, the interval was unreachable in practice. Removed entirely.
Quorum derivation: dropped the cached quorum_size field from Config/ConfigBuilder. quorum_size() is now a derived getter from committee_members.len(), routed through the shared ssv_types::quorum_size helper (aligned with ssv-spec's 2f + 1). It can't drift from the committee size anymore.
Canonical committee size validation: added at the config/build boundary via ssv_types::is_valid_committee_size (canonical N = 3f + 1, f in 1..=4), returning a new InvalidCommitteeSize variant. Also consolidated eth::util::validate_operators onto the same helper (it had its own inline check), so canonical-size policy lives in exactly one place. Three-layer defense: SSV Network contract -> eth event ingestion -> qbft config build.
|
|
||
| // Regression targets: silent `2f + 1` substitution for `N − f`. | ||
| #[test] | ||
| fn quorum_size_is_n_minus_f() { |
There was a problem hiding this comment.
Nice to add coverage here. I think these expectations should be updated to match the upstream quorum rule instead of locking in the current behavior.
Go SSV and ssv-spec define quorum as 2f + 1, so if quorum_size() is the shared helper, the cases here should assert that directly. For example: N=3 -> 1, N=4 -> 3, N=6 -> 3, N=7 -> 5.
As written, this test suite cements the wrong contract in ssv_types.
There was a problem hiding this comment.
Done. quorum_size() is now 2f + 1, and the test asserts exactly the values you called out:
(3, 1), (4, 3), (6, 3), (7, 5), (10, 7), (13, 9)
Plus a quorum_size_composes_with_get_f property test across 0..=MAX_SIGNATURES so a future inline refactor can't silently decouple the two.
| members.saturating_sub(1) / 3 | ||
| } | ||
|
|
||
| /// Default QBFT quorum: `N − f`. Equivalent to `2f + 1` when `N = 3f + 1` |
There was a problem hiding this comment.
I think quorum_size() should match the upstream Go/spec quorum rule rather than preserve Anchor’s local N - f default.
Current Go SSV and ssv-spec model quorum as 2f + 1, and message_validator::compute_quorum_size() already does that in this repo. Defining the shared ssv_types::quorum_size() helper as N - f makes the canonical helper disagree with both upstream and existing Anchor validation logic.
I think this should be:
pub fn quorum_size(committee_size: usize) -> usize {
get_f(committee_size) * 2 + 1
}There was a problem hiding this comment.
Applied verbatim. quorum_size is now get_f(committee_size) * 2 + 1, with a doc comment citing ssv-spec's CommitteeMember.GetQuorum and Go-SSV's ComputeQuorumAndPartialQuorum.
| @@ -1059,8 +1060,7 @@ pub fn sync_committee_period( | |||
| } | |||
|
|
|||
| pub(crate) fn compute_quorum_size(committee_size: usize) -> usize { | |||
There was a problem hiding this comment.
Once ssv_types::quorum_size() is corrected to match Go/ssv-spec, I think this local helper should go away and the call sites here should use the shared helper directly.
Right now the actual quorum formula is still duplicated in message_validator, which leaves us with two places to keep in sync for the same protocol rule.
There was a problem hiding this comment.
Done. compute_quorum_size and its test are gone. Its one call site in consensus_message::validate_consensus_message_semantics now calls ssv_types::quorum_size directly. Single authoritative quorum formula in the workspace.
|
@claude, please take a look |
|
Claude finished @shane-moore's task in 1m 54s —— View job Review of PR #966 (final state, all 5 commits)
Reviewed the final state across all 10 changed files. The refactor is clean and correct. Correctness verified:
Tests are thorough:
No issues found. LGTM. |
|
Thanks for all the updates here. One meta point: I don’t think |
Problem, Evidence, and Context
Anchor's QBFT default quorum used
N − f, with no spec basis. SSV-spec defines quorum as2f + 1viaCommitteeMember.HasQuorum(ssv-spec/types/operator.go:27-33), and Go-SSV uses2f + 1strictly throughout (ssvshare.go:228-258). The two formulas coincide at canonicalN = 3f + 1sizes and diverge elsewhere, but the spec contract is2f + 1in both cases, while Anchor'sN − fwas an internal invention.This also left three places computing BFT math:
qbft::config::get_f,message_validator::get_f(flagged in-file as aTODO centralize), andssv_types::Cluster::get_f, plus a second quorum formula inmessage_validator::compute_quorum_size(which already used2f + 1, making the quorum rule internally inconsistent).N − fdefault was introduced 2025-01-14 in9389ccdwithout a comment or design note. No evidence it was a deliberate workaround for any data-model constraint (Anchor derivesffromcommittee_members.len();2f + 1andN − fare both pure functions of that and neither requires a storedFaultyNodesfield as ssv-spec has).ssvlabs/ssv-network/contracts/libraries/ValidatorLib.sol:validateOperatorsLength) rejects non-canonicaloperatorIds.lengthon-chain atregisterValidator. Operator counts are enforced to{4, 7, 10, 13}viaMIN_OPERATORS_LENGTH = 4,MAX_OPERATORS_LENGTH = 13,length % 3 == 1. So the two formulas are observationally identical in production; the divergence only matters in test harnesses and would be a latent footgun for any future flow that bypasses event ingestion.r3133635561,r3133348690,r3133751308,r3133780108,r3133768791,r3133635550) called this out in full and is the motivation for the expanded scope.Change Overview
Quorum formula alignment (
anchor/common/ssv_types/src/lib.rs)quorum_size(n)now returns2f + 1(previouslyN − f). Doc comment ties it explicitly to ssv-spec'sCommitteeMember.GetQuorumand Go-SSV'sComputeQuorumAndPartialQuorum.is_valid_committee_size(n)mirrors Go-SSV'sValidCommitteeSize:N = 3f + 1withfin1..=4, i.e.{4, 7, 10, 13}.get_funchanged ((N − 1) / 3withsaturating_subguard).QBFT config (
anchor/common/qbft/src/config.rs,error.rs)ConfigBuilder::buildnow rejects non-canonical committee sizes viais_valid_committee_size(Diego'sr3133780108).[2f + 1, N − f]range validation removed. There was nowith_quorum_sizesetter anywhere, so the range was never exercised (Diego'sr3133751308).ConfigBuilder::quorum_sizeandConfig::quorum_sizeare now derived getters (no cached field). Single source of truth; can't drift fromcommittee_members.ConfigBuilderError::InvalidQuorumSizerenamed toInvalidCommitteeSizeto reflect what's actually validated. Display:"Committee size must be of the form 3f + 1 (4, 7, 10, or 13)".Consolidation in
message_validatorandethmessage_validator::compute_quorum_sizewrapper deleted (Diego'sr3133768791). Its single production call site (consensus_message::validate_consensus_message_semantics) now usesssv_types::quorum_sizedirectly. Previously,compute_quorum_sizealready used2f + 1whilessv_types::quorum_sizeusedN − f, meaning Anchor had two different quorum rules internally; they now share one.eth::util::validate_operatorsinline canonical check replaced with a call tossv_types::is_valid_committee_size. Canonical-size policy now lives in exactly one place.Tests
#[cfg(test)] mod testsat the bottom ofssv_types::lib.rs:get_f_matches_byzantine_formula(table at{0, 3, 4, 7, 10, 13}; lockssaturating_sub, catchesN/3vs(N − 1)/3regression).quorum_size_is_two_f_plus_one(table at{0, 1, 3, 4, 6, 7, 10, 13}; catches silentN − fsubstitution for2f + 1).quorum_size_composes_with_get_f(property assertion across0..=MAX_SIGNATURES).valid_committee_size_matches_ssv_canonical_sizes.qbft::config::testsmodule:build_accepts_canonical_committee_sizes,build_rejects_empty_committee,build_rejects_non_canonical_committee_sizes,builder_quorum_size_is_derived_from_committee_size.qbft::tests) migrated from N=5 and N=3 to N=4 (smallest canonical). Test intent unchanged; only the committee size and expected consensus count shift with it.What intentionally did not change
get_fformula and itssaturating_subguard on empty committees.Cluster::get_f(BLS Lagrange threshold path viavalidator_storeremains2f + 1).Configstruct's internal layout aside from thequorum_sizefield removal.Risks, Trade-offs, and Mitigations
ConfigBuilder::buildpreviously accepted any committee size; now rejects non-canonical withInvalidCommitteeSize. Production is unreachable (on-chain + eth-event layer both reject non-canonical). The qbftConfig::quorum_sizevalue for any hypothetical non-canonical input changes fromN − fto2f + 1. Mitigation: three-layer defense (contract, eth ingestion, qbft build) makes the non-canonical path unreachable except through test harnesses, which are updated here.InvalidQuorumSize→InvalidCommitteeSize. Grep of the workspace found no externalmatches!ormatchcallers on this variant; the only matchers are in this PR's new tests.compute_quorum_sizedeletion.pub(crate)-scoped, one production caller (consensus_message.rs:80), no external consumers.TestQBFTCommitteeBuilder::defaulthad N=5). Moving to N=4 loses the weak signal that those tests gave on non-canonical behavior, but that signal was unused in any production path and the qbft tests continue to exercise the same scenarios (basic consensus, faulty operators, node recovery, round change, leader waiting).Validation
cargo test -p ssv_types -p qbft -p message_validator -p qbft_manager -p eth: all green (57 + 9 + 16 + 80 + 16).cargo clippy --all-targets -- -D warningson all touched crates: clean.make cargo-fmt-check: clean.types/operator.go:27-33) and Go-SSV (protocol/v2/types/ssvshare.go:228-267) to confirm the2f + 1formula andValidCommitteeSizesemantics match.ssvlabs/ssv-networkatstage-v2.0.0-upgrade7) to confirm the canonical-size enforcement happens on-chain.Rollback
Single focused branch. Revert cleanly with
git revert. No config, data, or on-wire impact. Downstream branches that depend oncompute_quorum_sizeorInvalidQuorumSizewould need to be rebased after revert (none known).Blockers / Dependencies
CommitteeMemberTestspec test). That branch already usesssv_types::quorum_sizeto check quorum. With this PR, it's now semantically aligned with Go'sCommitteeMember.HasQuorum(both2f + 1) rather than accidentally aligned by virtue of all fixtures being N=4.