fix: improve libp2p connection limits for network discovery #17425

Merged
PhilWindle merged 1 commit into AztecProtocol:next from jorem321:fix/libp2p-connection-limits on Oct 2, 2025

@jorem321 jorem321 commented Oct 1, 2025

Context

After the Aztec network underwent its migration to v2.0.2, we noticed an inability to obtain reliable stats from the P2P network via our libp2p crawler deployed on aztec.nethermind.io.

Further investigation (by reading DEBUG logs on Aztec nodes) revealed rejections of the crawler's connection with the following type of errors:

2025-09-24 17:29:09.371 error[17:29:09.352] DEBUG: p2p:libp2p_service:libp2p_service:libp2p:tcp:listener error: inbound connection failed

This led to investigation of yarn-project/p2p/src/services/libp2p/libp2p_service.ts, which manages the libp2p connection at the TCP/Noise layer.

We identified that the ConnectionManager was configured with maxConnections: maxPeerCount. This configuration is too strict: once the node reaches maxPeerCount, it rejects every further connection of any type, including the crawler's. Moreover, it is inconsistent with the behavior of, for example, Ethereum consensus clients, which keep a buffer for discovery, as shown in the table below.

| Current Aztec Parameter | Role/Purpose | Lighthouse Equivalent | Prysm Equivalent | Teku Equivalent |
| --- | --- | --- | --- | --- |
| `connectionManager.maxConnections: maxPeerCount` | Maximum total connections: hard limit on concurrent peer connections to prevent resource exhaustion. (libp2p docs) | `max_established: target_peers * 1.3` (30% buffer) | `connManager high: max(MaxPeers + 32, 192)` (92% buffer at MaxPeers=100) | `TargetPeerRange.upperBound: 100` (fixed, no scaling) |
| `connectionManager.minConnections: 0` | Minimum connection target: ensures the node maintains baseline connectivity for network health. (libp2p docs) | No direct equivalent (discovery-based) | No direct equivalent (pruning threshold only) | `TargetPeerRange.lowerBound: 64` (application-level minimum) |
| `tcp.maxConnections: Math.ceil(maxPeerCount * 1.5)` | Transport layer limit: controls TCP connection acceptance before libp2p validation. (libp2p source code) | libp2p default: Infinity (no limit) | libp2p default: Infinity (no limit) | libp2p default: Infinity (no limit) |
| `tcp.closeAbove: maxPeerCount * 2` | Connection burst protection: temporarily stops accepting new connections during traffic spikes. (libp2p source code) | libp2p default: Infinity (no limit) | libp2p default: Infinity (no limit) | libp2p default: Infinity (no limit) |
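
To make the buffer percentages in the table concrete, here is a small TypeScript sketch that evaluates the quoted formulas at a peer target of 100. It is only an illustration: the `upperLimits` map and `bufferPercent` helper are hypothetical names, the formulas are copied from the table rather than from the clients' source code, and rounding the Lighthouse value up is an assumption. Teku is omitted because its upper bound is listed as a fixed value.

```typescript
// Upper connection limits as a function of the peer target, using the
// formulas quoted in the table above. These are illustrative only.
const upperLimits: Record<string, (peerTarget: number) => number> = {
  // Aztec before this PR: the limit equals the peer target (no buffer).
  aztecBefore: (t) => t,
  // Lighthouse: target_peers * 1.3 (rounding up is an assumption).
  lighthouse: (t) => Math.ceil(t * 1.3),
  // Prysm: max(MaxPeers + 32, 192).
  prysm: (t) => Math.max(t + 32, 192),
};

// Headroom above the peer target, as a whole percentage.
function bufferPercent(limit: number, peerTarget: number): number {
  return Math.round(((limit - peerTarget) * 100) / peerTarget);
}

const peerTarget = 100;
for (const [client, limitFor] of Object.entries(upperLimits)) {
  const limit = limitFor(peerTarget);
  console.log(`${client}: limit=${limit}, buffer=${bufferPercent(limit, peerTarget)}%`);
}
```

At a target of 100 this gives a 0% buffer for the previous Aztec setting, 30% for the Lighthouse formula, and 92% for the Prysm formula, matching the figures in the table.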

In order to keep Aztec's P2P network in line with the standards outlined above, this PR reviews and amends key parameters in the libp2p_service.

Changes Made

  • Set connectionManager.maxConnections: maxPeerCount * 2. Allows a more generous buffer for network discovery and crawling, consistent with Prysm's approach.
  • Set connectionManager.minConnections: Math.floor(maxPeerCount * 0.5). The previous value was zero, which disabled libp2p's dialing of peers from the peer book when peer counts get low. Raising this parameter helps keep peer counts up, as Teku does.
  • Set tcp.maxConnections: maxPeerCount * 2. As per the libp2p docs, tweaking this parameter "will have no effect if it is larger than the value configured for the ConnectionManager maxConnections parameter", hence we set it equal to connectionManager.maxConnections.
  • Set tcp.closeAbove: maxPeerCount * 3. This is still an improvement over libp2p's Infinity default, offering burst protection at the lowest networking level.
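
All four settings above scale from a single value, so their relationships can be checked in one place. The following is a minimal TypeScript sketch under stated assumptions: `deriveConnectionLimits` and the `ConnectionLimits` shape are hypothetical names used for illustration, not code from `libp2p_service.ts`, and the nested object only mirrors the parameter names used in this PR rather than the exact libp2p option layout.

```typescript
// Hypothetical shape mirroring the parameter names used in this PR.
// It is not the actual libp2p options type.
interface ConnectionLimits {
  connectionManager: { minConnections: number; maxConnections: number };
  tcp: { maxConnections: number; closeAbove: number };
}

// Hypothetical helper: derives every limit from maxPeerCount using the
// multipliers listed in "Changes Made".
function deriveConnectionLimits(maxPeerCount: number): ConnectionLimits {
  if (!Number.isInteger(maxPeerCount) || maxPeerCount < 0) {
    throw new RangeError(`maxPeerCount must be a non-negative integer, got ${maxPeerCount}`);
  }
  return {
    connectionManager: {
      // Dial from the peer book once connections drop below half the target.
      minConnections: Math.floor(maxPeerCount * 0.5),
      // Leave headroom above the peer target for discovery and crawlers.
      maxConnections: maxPeerCount * 2,
    },
    tcp: {
      // Kept equal to connectionManager.maxConnections, since a larger
      // value would have no effect according to the quoted libp2p docs.
      maxConnections: maxPeerCount * 2,
      // Burst protection threshold at the transport level.
      closeAbove: maxPeerCount * 3,
    },
  };
}

const limits = deriveConnectionLimits(100);
console.log(limits);
```

For maxPeerCount = 100 this yields minConnections 50, maxConnections 200, tcp.maxConnections 200, and tcp.closeAbove 300. Deriving everything from one input keeps tcp.maxConnections from drifting above the ConnectionManager limit when the peer target changes.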

@jorem321 jorem321 force-pushed the fix/libp2p-connection-limits branch 3 times, most recently from a8c59f8 to 14e298a Compare October 1, 2025 16:34
@jorem321 jorem321 closed this Oct 1, 2025
@jorem321 jorem321 reopened this Oct 1, 2025
@jorem321 jorem321 force-pushed the fix/libp2p-connection-limits branch from 14e298a to 55740b8 Compare October 1, 2025 16:48
@jorem321 jorem321 marked this pull request as ready for review October 2, 2025 07:15
@alexghr alexghr added the ci-external Allow CI to run on this external pull request. label Oct 2, 2025
- Set connectionManager.maxConnections to maxPeerCount * 2
- Set connectionManager.minConnections to Math.floor(maxPeerCount * 0.5)
- Set tcp.maxConnections to maxPeerCount * 2
- Set tcp.closeAbove to maxPeerCount * 3

This addresses crawler connection rejections by providing reasonable
buffers for network discovery while maintaining resource protection.
@jorem321 jorem321 force-pushed the fix/libp2p-connection-limits branch from 55740b8 to ebb0323 Compare October 2, 2025 07:25
@PhilWindle PhilWindle added this pull request to the merge queue Oct 2, 2025
Merged via the queue into AztecProtocol:next with commit 9f1ed43 Oct 2, 2025
13 checks passed
AztecBot commented Oct 2, 2025

💚 All backports created successfully

Status Branch Result
v2

Questions ?

Please refer to the Backport tool documentation and see the Github Action logs for details

PhilWindle pushed a commit that referenced this pull request Oct 2, 2025
…ork discovery (#17425) (#17449)

# Backport

This will backport the following commits from `next` to `v2`:
- [fix: improve libp2p connection limits for network discovery
(#17425)](#17425)

<!--- Backport version: 9.5.1 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

Co-authored-by: jorem321 <jorgearce321@gmail.com>
@jorem321 jorem321 deleted the fix/libp2p-connection-limits branch October 2, 2025 12:49
alexghr pushed a commit that referenced this pull request Nov 5, 2025
# v2.0.3..v2.1.0-rc.1 Notes

## Significant L1 Changes


### 1. **Rollup Contract Interface Changes**

- **`propose()` function signature changed**: Now requires an
additional `_attestationsAndSignersSignature` parameter
- **`validateHeaderWithAttestations()` function signature changed**:
Also requires the new signature parameter
- This affects any code that directly calls these functions on the
rollup contract

### 2. **New Required Configuration Parameters**

Several new configuration parameters are now required for deployment:

- `localEjectionThreshold`: Stricter ejection threshold local to
specific rollup (default: 196,000 tokens)
- `slashingDisableDuration`: How long slashing can be disabled in
seconds (default: 5 days)

### 3. **GSE Contract Changes**

- **New function**: `setProofOfPossessionGasLimit()` - allows
governance to adjust gas limits for BLS proof validation
- **Gas-limited proof validation**: Proof of possession validation now
has configurable gas limits (default: 200,000 gas)

### 4. **Validator Queue Management Changes**

- **`flushEntryQueue()` behavior changed**: Now has an overload
accepting a `_toAdd` parameter to limit validator additions
- **New validator flush accounting**: System now tracks available
validator flushes per epoch

## Significant Non-Breaking Changes

### 1. **Enhanced Slashing Controls**

- **Temporary slashing disable**: Vetoers can now temporarily disable
slashing for the configured duration
- **New function**: `setSlashingEnabled(bool)` for controlling
slashing state

### 2. **Improved Validator Selection**

- **Configurable lag period**: Validator sampling now uses configurable
epoch lag instead of fixed 2-epoch delay
- **Better bootstrapping**: Enhanced validator set bootstrapping with
improved flush size calculations

### 3. **Updated Default Values**

- **Coin issuer rate**: Updated to `25,000,000,000 tokens / year`
(approximately 793 tokens per second)
- **Local ejection threshold**: Set to 196,000 tokens (stricter than
global 50,000 threshold)
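
The per-second figure quoted for the coin issuer rate can be reproduced with a short calculation. This TypeScript sketch assumes a 365-day year, which the notes do not state; the constant names are illustrative only.

```typescript
// Assumption: a year is taken as 365 days (no leap-year adjustment).
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60; // 31,536,000 seconds

// Yearly issuance quoted in the release notes.
const tokensPerYear = 25000000000;

// Convert the yearly rate to a per-second rate.
const tokensPerSecond = tokensPerYear / SECONDS_PER_YEAR;

console.log(`~${Math.round(tokensPerSecond)} tokens per second`);
```

Under that assumption the rate is a little under 793 tokens per second, which rounds to the figure given above.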

## Significant Node Changes

### Fixes

- Rollback world state on failed block sync – Prevents bad state
persistence by rolling back uncommitted data if block sync fails.
[(#17158)](github.com//pull/17158)
- Early rejection of duplicate nullifiers – Detects and rejects
transactions with duplicate nullifiers before inclusion.
[(#17157)](github.com//pull/17157)
- Watcher pruning fix – Watcher now re-executes only blocks from the
relevant pruned epoch, avoiding cross-epoch slashing issues.
[(#17145)](github.com//pull/17145)
- Improved proposal validation – Fully validates proposal headers
(including archive root derivation) and blocks attempts to reuse
existing block numbers.
[(#17144)](github.com//pull/17144)
- L1 to L2 message sync reliability – Waits for rollup to reach the
inbox block before marking L1→L2 messages as synced; adds helpers to
track message readiness.
[(#17132)](github.com//pull/17132)
- Slashing round recovery – Executes pending slashing rounds skipped
during the first executable round; adds slashExecuteRoundsLookBack to
control re-check depth.
[(#17125)](github.com//pull/17125)
- Broker restart on rollup change – Ensures broker restarts when rollup
chain changes to stay synchronized.
[(#17120)](github.com//pull/17120)
- Remote signer readiness check – Verifies that a remote signer is
available before use.
[(#17119)](github.com//pull/17119)
- Orchestrator and agent retry improvements – Makes connections to the
broker more robust under transient failures.
[(#17117)](github.com//pull/17117)
- Telemetry cleanup – Fixes incorrect or spammy telemetry warnings.
[(#17155)](github.com//pull/17155)

### Features

- Network configuration support – Introduces centralized configuration
for network parameters.
[(#17113)](github.com//pull/17113)


## Full Changelog

You can generate this yourself with `./scripts/commits
v2.0.3..v2.1.0-rc.1 1000 -m -g`.

#### Fixes

- fix: use archiveAt(0) instead of getBlock to get genesis archive tree
- backport v2
([#17447](#17447)) —
spypsy, 5 days ago
- fix: add keystoreDirectory option to sequencer
([#17265](#17265)) —
spypsy, 13 days ago
- fix: testnet archival node - v2
([#17142](#17142)) —
Aztec Bot, 3 weeks ago

#### Chores

- chore: bump minor version — Mitch, 4 days ago —
[dbc243f](dbc243f)
- chore: backport dependabot deps
([#17463](#17463)) —
Aztec Bot, 5 days ago
- chore: Backport slack alerts
([#17460](#17460)) —
PhilWindle, 5 days ago
- chore(backport-to-v2): chore: New salt for staging-ignition (#17453)
([#17453](#17453)) —
Aztec Bot, 5 days ago
- chore(backport-to-v2): fix: improve libp2p connection limits for
network discovery (#17425)
([#17425](#17425)) —
Aztec Bot, 5 days ago
- chore(backport-to-v2): feat: add flushing rewarder (#17335)
([#17335](#17335)) —
Aztec Bot, 6 days ago
- chore(backport-to-v2): feat: add date gated relayer (#17323)
([#17323](#17323)) —
Aztec Bot, 6 days ago
- chore(backport-to-v2): feat: support using existing ERC20 token for
fee and staking (#17413)
([#17413](#17413)) —
Aztec Bot, 6 days ago
- chore: Delete contract addresses from chain l2 config
([#17430](#17430)) —
PhilWindle, 6 days ago
- chore: More updated staging public config
([#17364](#17364)) —
PhilWindle, 7 days ago
- chore(backport-to-V2): L1 backports
([#17365](#17365)) —
Lasse Herskind, 7 days ago
- chore: Ensure DB map sizes are configured for networks
([#17383](#17383)) —
PhilWindle, 7 days ago
- chore: Backport of fixes into v2
([#17206](#17206)) —
PhilWindle, 8 days ago
- chore: update zkpassport version
([#17339](#17339)) —
saleel, 8 days ago
- chore: Backport of workflow fix
([#17333](#17333)) —
PhilWindle, 11 days ago
- chore: Streamline staging deployments
([#17328](#17328)) —
PhilWindle, 11 days ago
- chore(backport-to-v2): fix: avm gracefully handles shifts (shl) with
huge bit sizes (#17171)
([#17171](#17171)) —
Aztec Bot, 12 days ago
- chore(backport-to-v2): chore: remove unconstrained generics from trait
impls (#17075)
([#17075](#17075)) —
Aztec Bot, 12 days ago
- chore: Backport deployment refactor
([#17280](#17280)) —
PhilWindle, 12 days ago
- chore(backport-to-v2): fix(docs): Update Counter contract tutorial
imports and remove unnecessary sections (#17241)
([#17241](#17241)) —
Aztec Bot, 13 days ago
- chore: remove ACCEPT_DISABLED_AVM_VK_TREE_ROOT
([#17238](#17238)) —
Alex Gherghisan, 13 days ago
- chore: remove bad rollup-version default
([#17223](#17223)) —
Alex Gherghisan, 2 weeks ago
- chore(docs): node docs to v2
([#17205](#17205)) —
esau, 2 weeks ago
- chore(backport-to-v2): chore(avm)!: Fix a misleading log in recursive
verifier related to public input (#17184)
([#17184](#17184)) —
Aztec Bot, 2 weeks ago
- chore: Backport of ignition fix attempt 2
([#17201](#17201)) —
PhilWindle, 2 weeks ago
- chore: turn on testnet compat test
([#17195](#17195)) —
Alex Gherghisan, 2 weeks ago
- chore: Backport fix to staging-ignition to v2
([#17159](#17159)) —
PhilWindle, 3 weeks ago
- chore: kubectl
([#17140](#17140)) —
Alex Gherghisan, 3 weeks ago

#### Other

- backport dependabots p2
([#17488](#17488)) —
mralj, 4 days ago

---------

Co-authored-by: AztecBot <tech@aztecprotocol.com>