atlas-consensus

The content below comes from the repository's technical sources and is prerendered on the site so that people, crawlers, and agents can read it directly.

Summary

atlas-consensus is the consensus coordination crate of the Rust workspace. It owns proposal and vote intake, quorum evaluation, equivocation detection, leader schedule helpers, sync recovery state, and a small audit database for proposals, votes, and results.

It is not the full runtime loop by itself. In the current architecture, atlas-node drives the sequence "receive proposal -> create vote -> evaluate quorum -> move phase -> commit", while atlas-consensus provides the stateful primitives and the Cluster facade that make that loop possible.
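
The sequence above can be pictured as a small phase state machine. The sketch below is illustrative only: `Phase`, `Step`, and `on_quorum` are hypothetical names rather than the types used by atlas-node or atlas-consensus, and the phase names simply follow the prepare, precommit, and commit phases this document mentions for votes.

```rust
/// Hypothetical phase enum; the real crate's phase type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Prepare,
    PreCommit,
    Commit,
}

/// What an outer driver would do once quorum is reached for a phase.
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    /// Move on and vote in the next phase.
    Advance(Phase),
    /// Quorum on the last phase: hand the proposal to the commit path.
    CommitProposal,
}

/// Decide the next step after quorum has been reached for `current`.
pub fn on_quorum(current: Phase) -> Step {
    match current {
        Phase::Prepare => Step::Advance(Phase::PreCommit),
        Phase::PreCommit => Step::Advance(Phase::Commit),
        Phase::Commit => Step::CommitProposal,
    }
}
```

In this picture the driver owns the loop that calls something like `on_quorum`, while the consensus crate only answers whether quorum was reached for a given phase.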

Why This Crate Exists

  • keep consensus-specific state outside the application crate
  • validate proposals and votes before they reach the ledger commit path
  • track in-flight proposals, votes, quorum results, and equivocation evidence
  • expose cluster and sync-health helpers to the node runtime
  • persist consensus audit artifacts independently from ledger storage

Current Role In The Workspace

atlas-consensus currently owns six related concerns:

  1. ConsensusEngine, the in-memory proposal pool, vote registry, and quorum evaluator
  2. Cluster, the higher-level facade around engine, auth, peers, sync status, and audit
  3. proposal lifecycle handling: submit, receive, validate, stage, evaluate, and commit
  4. vote lifecycle handling: create, sign, verify, dedup, and detect equivocation
  5. sync recovery state, peer backoff, and peer health checks used by the node sync driver
  6. audit persistence through a dedicated redb store

Keep in mind that this crate contains the pieces of the consensus state machine, but not the outer orchestrator. The orchestration lives in atlas-node, in `runtime/consensus_driver.rs`.

Public Surface

Top-Level Modules

  • cluster
  • consensus
  • env
  • storage
  • telemetry

Main Public Reexports

  • ClusterBuilder
  • Cluster
  • QuorumPolicy
  • ConsensusEngine

Important Types

  • cluster::core::Cluster
  • cluster::core::ClusterStatus
  • cluster::core::SyncRecoveryState
  • cluster::core::SyncRecoveryStatus
  • cluster::voting::HandleVoteOutcome
  • env::runtime::AtlasEnv
  • env::runtime::Callback
  • storage::audit::AuditStorage
  • consensus::evaluator::QuorumPolicy

Crate Target

  • library target only, no standalone binary target in this crate

Key Modules

  • `lib.rs`: public export surface
  • `cluster/core.rs`: Cluster, leader scheduling, sync recovery state, peer penalties, and evidence handling
  • `cluster/proposals.rs`: proposal submission, inbound proposal validation, evaluation entrypoint, and ledger commit path
  • `cluster/voting.rs`: local vote creation, remote vote verification, duplicate retry handling, and equivocation detection
  • `cluster/peers.rs`: peer registration and active-peer queries over PeerManager
  • `cluster/builder.rs`: ClusterBuilder and required construction inputs
  • `consensus/engine.rs`: in-memory proposal and vote state plus evaluation caching
  • `consensus/evaluator.rs`: BFT and stake-weighted quorum logic
  • `env/runtime.rs`: AtlasEnv, Storage, ConsensusEngine, callback, and peer-manager wiring
  • `storage/audit.rs`: redb-backed audit persistence for proposals, votes, and results
  • `telemetry.rs`: consensus-level counters

Inputs And Outputs

Inputs

  • inbound proposal bytes from P2P gossip
  • inbound vote bytes and equivocation evidence bytes
  • local validator identity and signing capability through Authenticator
  • ledger queries for validator stake, transaction existence, proposal append, and slashing
  • active peer state from atlas-p2p::PeerManager
  • static validator set used for deterministic leader scheduling

Outputs

  • AdapterCmd::Publish commands for outbound proposal gossip
  • signed VoteData values for prepare, precommit, and commit phases
  • ConsensusResult values for quorum-approved phases
  • ledger side effects such as append_proposal(...) and slash_validator(...)
  • redb audit records for proposals, votes, and final results
  • sync recovery status updates and peer penalty/backoff decisions
  • telemetry counters for rejected proposals, duplicate vote retries, and evaluation passes

Internal Dependencies

Workspace Dependencies

  • atlas-common
  • atlas-ledger
  • atlas-p2p

This crate sits between the shared type layer and the application runtime. It depends on atlas-ledger for stake and commit decisions, and on atlas-p2p for peer state and consensus-facing transport commands.

External Dependencies That Shape The Design

  • tokio for async coordination and shared state
  • serde, serde_json, and bincode for proposal, vote, and audit serialization
  • redb for local audit persistence
  • libp2p for PeerId handling during evidence and address conversion paths
  • metrics for consensus instrumentation

Used By

These packages depend directly on atlas-consensus today:

  • atlas-node

Observed usage in the workspace includes:

  • `builder.rs`: node bootstrapping builds Cluster with ClusterBuilder
  • `config.rs`: runtime config owns QuorumPolicy, AtlasEnv, ConsensusEngine, and AuditStorage
  • `env_config.rs`: older env-config path reconstructs AtlasEnv from serialized config
  • `runtime/consensus_driver.rs`: drives phase transitions, evidence broadcast, commit, and mempool cleanup
  • `runtime/block_producer.rs`: submits new proposals through Cluster::submit_proposal(...)
  • `runtime/sync_driver/fork_recovery.rs`: uses sync failure, penalty, and backoff helpers
  • `main.rs`: registers consensus telemetry

Consensus Model

Proposal Intake

The proposal path splits into two entrypoints:

  • `submit_proposal(...)`: local path used by the block producer
  • `handle_proposal(...)`: inbound path used for gossiped proposals

Before a proposal is staged, the crate currently checks:

  • proposer signature
  • metadata-based state_root
  • transaction envelope signatures
  • duplicate transaction hashes inside the proposal
  • already committed transactions through the ledger

If the proposal passes, it is inserted into the engine pool and mirrored into consensus storage for later evaluation and commit bookkeeping.
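
Two of the checks listed above, duplicate transaction hashes inside a proposal and transactions the ledger has already committed, can be sketched as follows. This is a minimal illustration under assumptions: `check_tx_hashes`, `TxCheckError`, and the `is_committed` closure (standing in for the ledger query) are hypothetical names, and hashes are modeled as plain strings.

```rust
use std::collections::HashSet;

/// Hypothetical rejection reasons; the crate's real error type is not shown here.
#[derive(Debug, PartialEq, Eq)]
pub enum TxCheckError {
    DuplicateInProposal(String),
    AlreadyCommitted(String),
}

/// Reject a proposal whose transaction hashes repeat, or whose transactions
/// the ledger already knows. `is_committed` stands in for the ledger lookup.
pub fn check_tx_hashes<F>(hashes: &[&str], is_committed: F) -> Result<(), TxCheckError>
where
    F: Fn(&str) -> bool,
{
    let mut seen = HashSet::new();
    for &hash in hashes {
        // `insert` returns false when the hash was already in the set.
        if !seen.insert(hash) {
            return Err(TxCheckError::DuplicateInProposal(hash.to_string()));
        }
        if is_committed(hash) {
            return Err(TxCheckError::AlreadyCommitted(hash.to_string()));
        }
    }
    Ok(())
}
```

Running the cheap in-memory duplicate check before the ledger lookup is a design choice of this sketch, not a statement about the order the crate uses.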

Vote Intake And Equivocation Handling

`create_vote(...)` does more than sign a vote. It:

  • loads the proposal from the local pool
  • prevents self-equivocation for the same height, view, and phase
  • verifies the proposal signature during Prepare
  • signs the vote
  • atomically pre-registers the vote in the local registry before broadcast
  • persists the vote to audit storage

`handle_vote(...)` then:

  • ignores votes for already finalized proposals
  • verifies vote signatures
  • filters idempotent retries before storage and re-evaluation
  • records accepted votes in consensus storage and audit storage
  • forwards accepted votes into ConsensusEngine::receive_vote(...)

Equivocation detection lives in `registry.rs`, where votes are indexed both by proposal and by (height, view, phase, voter) so conflicting signed votes can be turned into EquivocationEvidence.
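
The double index described above can be sketched with two maps. Everything here is a simplified assumption: `Vote`, `Intake`, and `Registry` are hypothetical stand-ins for the crate's vote, outcome, and registry types, signatures are omitted, and the phase is modeled as a `u8`.

```rust
use std::collections::HashMap;

/// A validator may vote at most once per (height, view, phase, voter) slot.
type SlotKey = (u64, u64, u8, String);

/// Simplified vote; the real type also carries a signature.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Vote {
    pub proposal_id: String,
    pub height: u64,
    pub view: u64,
    pub phase: u8,
    pub voter: String,
}

/// Hypothetical intake outcome.
#[derive(Debug, PartialEq, Eq)]
pub enum Intake {
    Accepted,
    /// Same voter, same slot, same proposal: an idempotent retry.
    DuplicateRetry,
    /// Same voter, same slot, different proposal: conflicting votes.
    Equivocation { first: Vote, second: Vote },
}

#[derive(Default)]
pub struct Registry {
    by_proposal: HashMap<String, Vec<Vote>>,
    by_slot: HashMap<SlotKey, Vote>,
}

impl Registry {
    pub fn register(&mut self, vote: Vote) -> Intake {
        let key = (vote.height, vote.view, vote.phase, vote.voter.clone());
        match self.by_slot.get(&key) {
            Some(prev) if prev.proposal_id == vote.proposal_id => Intake::DuplicateRetry,
            Some(prev) => Intake::Equivocation { first: prev.clone(), second: vote },
            None => {
                // First vote for this slot: record it in both indexes.
                self.by_slot.insert(key, vote.clone());
                self.by_proposal
                    .entry(vote.proposal_id.clone())
                    .or_default()
                    .push(vote);
                Intake::Accepted
            }
        }
    }

    /// Number of accepted votes recorded for a proposal.
    pub fn votes_for(&self, proposal_id: &str) -> usize {
        self.by_proposal.get(proposal_id).map_or(0, Vec::len)
    }
}
```

The per-slot index is what makes both the retry filter and the conflict check a single lookup; the per-proposal index only serves counting.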

Quorum Evaluation

ConsensusEngine in `engine.rs` is the core evaluation state holder. It keeps:

  • a ProposalPool
  • a VoteRegistry
  • pending evidence to slash later
  • confirmed phases so the same phase is not reported repeatedly
  • a short-lived cache of active staked nodes

The evaluation path is intentionally layered:

  1. compute a cheap tentative quorum from active peers and skip work if no new result is even possible
  2. fetch active staked nodes, cached for 500 ms
  3. run stake-weighted evaluation through `evaluate_weighted(...)`
  4. fall back to classic node-count quorum if total active stake is zero
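
Step 2, the short-lived cache, might look roughly like the sketch below. `TtlCache` and `get_or_fetch` are hypothetical names, the `fetch` closure stands in for the query that returns active staked nodes, and the current time is passed in explicitly so the behavior is easy to test; only the 500 ms lifetime comes from the description above.

```rust
use std::time::{Duration, Instant};

/// Hypothetical single-value cache with a time-to-live.
pub struct TtlCache<T> {
    ttl: Duration,
    entry: Option<(Instant, T)>,
}

impl<T: Clone> TtlCache<T> {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entry: None }
    }

    /// Return the cached value if it is younger than `ttl` at `now`;
    /// otherwise call `fetch`, store the result, and return it.
    pub fn get_or_fetch(&mut self, now: Instant, fetch: impl FnOnce() -> T) -> T {
        if let Some((stored_at, value)) = &self.entry {
            if now.duration_since(*stored_at) < self.ttl {
                return value.clone();
            }
        }
        let value = fetch();
        self.entry = Some((now, value.clone()));
        value
    }
}
```

With `TtlCache::new(Duration::from_millis(500))`, repeated evaluations inside the same half second reuse one lookup instead of querying on every vote.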

The weighted quorum rule is currently "greater than two thirds of total active stake", while the non-weighted fallback uses floor(n * fraction) + 1.
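
The two rules can be written down as a small sketch. The function names and the integer-based comparison are assumptions made for illustration; the crate's own `evaluate_weighted(...)` may represent stake and thresholds differently.

```rust
/// Stake-weighted rule: approving stake must be strictly greater than
/// two thirds of total active stake. Integer math avoids rounding issues.
pub fn weighted_quorum_reached(approving_stake: u64, total_active_stake: u64) -> bool {
    if total_active_stake == 0 {
        return false;
    }
    (approving_stake as u128) * 3 > (total_active_stake as u128) * 2
}

/// Node-count fallback threshold: floor(n * fraction) + 1.
pub fn classic_threshold(n: usize, fraction: f64) -> usize {
    (n as f64 * fraction).floor() as usize + 1
}

/// Use the weighted rule when there is stake, otherwise fall back to node count.
pub fn quorum_reached(
    approving_stake: u64,
    total_active_stake: u64,
    approving_nodes: usize,
    active_nodes: usize,
    fraction: f64,
) -> bool {
    if total_active_stake == 0 {
        approving_nodes >= classic_threshold(active_nodes, fraction)
    } else {
        weighted_quorum_reached(approving_stake, total_active_stake)
    }
}
```

For example, with 300 units of active stake, 200 approving units are exactly two thirds and do not pass, while 201 do. With zero stake and 7 active nodes at a fraction of 0.5, the fallback requires floor(3.5) + 1 = 4 votes.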

Commit, Audit, And Sync Model

Commit Path

`commit_proposal(...)` is the bridge from approved consensus result into ledger execution. It:

  • finds the proposal in the local engine pool
  • ensures the proposal is also present in consensus storage
  • calls ledger.append_proposal(...)
  • logs the final result and prunes committed proposal data from in-memory consensus storage
  • persists proposal, votes, and result into AuditStorage

The final cleanup of proposal-specific vote state is not fully internal to this crate. The node runtime clears proposal votes and handled phases in `runtime/consensus_driver.rs` after a successful commit.

Audit Storage

`AuditStorage` keeps three tables:

  • proposals keyed by height
  • votes keyed by a composite string proposal_id:phase:view:voter
  • results keyed by proposal ID

This store is built for auditing, not for indexed lookups. Reads such as "recent proposals" and "votes for proposal" still rely on in-memory sorting or table scans.
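
To make the composite vote key concrete, here is a minimal sketch of building such a key and scanning for one proposal's votes by prefix. The helper names are hypothetical, the phase is modeled as a string, and a plain slice of keys stands in for the redb table; the actual encoding in `storage/audit.rs` may differ.

```rust
/// Build a key shaped like proposal_id:phase:view:voter.
pub fn vote_key(proposal_id: &str, phase: &str, view: u64, voter: &str) -> String {
    format!("{proposal_id}:{phase}:{view}:{voter}")
}

/// Collect the keys that belong to one proposal by scanning every key.
/// This mirrors the "table scan" style of read described above.
pub fn votes_for_proposal<'a>(keys: &'a [String], proposal_id: &str) -> Vec<&'a str> {
    // The trailing ':' keeps "p1" from matching "p10:...".
    let prefix = format!("{proposal_id}:");
    keys.iter()
        .map(String::as_str)
        .filter(|key| key.starts_with(prefix.as_str()))
        .collect()
}
```

A string key like this is easy to read in an audit dump, but it stays unambiguous only as long as none of the components contain `:` themselves.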

Sync Recovery Helpers

Cluster also carries state that is adjacent to consensus, but really serves node recovery:

  • SyncRecoveryStatus
  • peer cooldown/backoff state
  • peer health checks based on last_seen
  • peer penalties after invalid state-sync responses

Those helpers are used by the node sync driver in `fork_recovery.rs` and related modules.
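
As a rough illustration of what cooldown and last_seen-based health checks can look like, consider the sketch below. It is not the crate's implementation: `PeerState` and the three functions are hypothetical, and the doubling backoff with a cap is an assumed policy chosen only to make the example concrete.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-peer recovery state.
pub struct PeerState {
    pub last_seen: Instant,
    pub failures: u32,
    pub cooldown_until: Option<Instant>,
}

/// A peer counts as healthy if it was seen within `max_silence`.
pub fn is_healthy(peer: &PeerState, now: Instant, max_silence: Duration) -> bool {
    now.duration_since(peer.last_seen) <= max_silence
}

/// Record a failed sync response and extend the cooldown.
/// Assumed policy: base * 2^(failures - 1), capped at `cap`.
pub fn record_failure(peer: &mut PeerState, now: Instant, base: Duration, cap: Duration) {
    peer.failures = peer.failures.saturating_add(1);
    let shift = (peer.failures - 1).min(16);
    let delay = base.saturating_mul(1u32 << shift).min(cap);
    peer.cooldown_until = Some(now + delay);
}

/// A peer can be asked again once its cooldown has elapsed.
pub fn is_available(peer: &PeerState, now: Instant) -> bool {
    peer.cooldown_until.map_or(true, |until| now >= until)
}
```

With a base of 1 s and a cap of 5 s, successive failures in this sketch yield cooldowns of 1 s, 2 s, 4 s, and then 5 s.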

Testing

The crate relies mostly on module-local tests instead of a large integration test suite.

Coverage includes:

  • quorum calculation and weighted vote counting
  • proposal pool and vote registry behavior
  • duplicate retry and equivocation handling
  • proposal validation and commit-path edge cases
  • audit storage roundtrips
  • cluster builder requirements
  • sync recovery and evidence verification helpers

There is also indirect coverage from atlas-node runtime tests that exercise Cluster, ConsensusEngine, and the sync driver together.

Risks Or Design Tension

Mixed Responsibility

This crate combines:

  • quorum logic
  • runtime-facing cluster APIs
  • sync recovery helpers
  • peer health management
  • audit persistence

That is practical, but it means atlas-consensus is broader than "just the algorithm".

Runtime Boundary Lives In Another Crate

The actual phase-transition loop lives in atlas-node, not here. That split is workable, but a reader can easily assume Cluster is the whole engine when it is really a facade plus state container.

Legacy Runtime Shape Still Shows Through

`AtlasEnv` still owns a Graph and an apply_if_approved(...) helper for add_edge style payloads. That looks older and narrower than the current ledger-driven proposal model.

Leader Selection Signals Are Mixed

`get_leader_for_slot(...)` currently uses a deterministic round-robin schedule over a static validator set, but the crate still contains:

  • a weighted_lottery(...) helper
  • log messages such as "Weighted Leader Elected"

That suggests the leader-election story has evolved and the naming has not fully caught up.
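
For reference, a deterministic round-robin schedule over a static validator set reduces to a modulo over the slot number. The function below is a hypothetical sketch, not the signature of `get_leader_for_slot(...)`; it assumes every node holds the validator list in the same canonical order.

```rust
/// Pick the leader for `slot` by rotating through a fixed validator list.
/// Returns `None` when the validator set is empty.
pub fn leader_for_slot(validators: &[String], slot: u64) -> Option<&str> {
    if validators.is_empty() {
        return None;
    }
    let index = (slot % validators.len() as u64) as usize;
    Some(validators[index].as_str())
}
```

Nothing in this rule depends on stake, which is why a helper like weighted_lottery(...) and a "Weighted Leader Elected" log line read as leftovers next to it.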

Quorum Defaults Are Not Fully Unified

`QuorumPolicy::default()` is 0.5, but `AtlasEnv::new(...)` hardcodes 0.7. Depending on the construction path, "default quorum" can therefore mean two different thresholds today.
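
To see how much the two values can diverge, the sketch below feeds both into the node-count rule floor(n * fraction) + 1. It assumes both values are used as that fraction, which this document does not confirm, and `votes_required` is a hypothetical helper.

```rust
/// Votes needed under the node-count rule floor(n * fraction) + 1.
pub fn votes_required(n: usize, fraction: f64) -> usize {
    (n as f64 * fraction).floor() as usize + 1
}

/// Thresholds for the two "defaults" side by side: (0.5, 0.7).
pub fn default_thresholds(n: usize) -> (usize, usize) {
    (votes_required(n, 0.5), votes_required(n, 0.7))
}
```

Under that assumption, a 7-node cluster needs 4 votes with 0.5 and 5 votes with 0.7, so the construction path alone changes when a phase is approved.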

Some Persistence Is Duplicated

Consensus data is stored in both:

  • atlas_ledger::storage::Storage inside AtlasEnv
  • storage::audit::AuditStorage

That separation is understandable, but it does mean proposal, vote, and result state exists in two different persistence layers with different purposes.