Feat/static peer discovery #1690

Merged
Evanev7 merged 4 commits into exo-explore:main from Deepzima:feat/static-peer-discovery
Mar 25, 2026
Deepzima (Contributor) commented Mar 9, 2026

Enabling peers to be discovered in environments where mDNS is unavailable (SSH sessions, headless servers, Docker).

Motivation

Exo discovers peers exclusively via mDNS, which works great on a local network but breaks once you move beyond a single L2 broadcast domain.

Related work:
#1488 (working implementation by @AlexCheema, closed because SSH had a GUI workaround),
#1023 (Headscale WAN, closed due to merge conflicts),
#1656 (discovery cleanup, open).

This PR introduces an optional bootstrap mechanism for peer discovery while leaving the existing mDNS behavior unchanged.

Changes

Adds two new CLI flags:

  • --bootstrap-peers (env: EXO_BOOTSTRAP_PEERS): comma-separated libp2p multiaddrs to dial on startup and retry periodically
  • --libp2p-port: fixed TCP port for libp2p to listen on (default: OS-assigned). Required when using bootstrap peers, so that other nodes know which port to dial.
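
A minimal sketch of how the two flags could be wired into an argument parser, assuming the flag takes precedence over EXO_BOOTSTRAP_PEERS and that port 0 stands for "OS-assigned". The helper names parse_peers and build_parser are hypothetical and are not taken from src/exo/main.py.

```python
import argparse
import os


def parse_peers(raw: str) -> list[str]:
    """Split a comma-separated string into trimmed, non-empty multiaddrs."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--bootstrap-peers",
        type=parse_peers,
        # Assumed precedence: the flag overrides the environment variable.
        default=parse_peers(os.environ.get("EXO_BOOTSTRAP_PEERS", "")),
        help="comma-separated libp2p multiaddrs to dial on startup",
    )
    parser.add_argument(
        "--libp2p-port",
        type=int,
        default=0,  # assumption: 0 asks the OS to pick a free port
        help="fixed TCP port for libp2p to listen on",
    )
    return parser
```

With no flag and no environment variable set, this yields an empty peer list and port 0, so the mDNS-only behavior is unaffected.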

8 files:

  • rust/networking/src/discovery.rs: Store bootstrap addrs, dial in existing retry loop
  • rust/networking/src/swarm.rs: Thread bootstrap_peers parameter to Behaviour
  • rust/networking/examples/chatroom.rs: Updated call site for new create_swarm signature
  • rust/networking/tests/bootstrap_peers.rs: Integration tests
  • rust/exo_pyo3_bindings/src/networking.rs: Accept optional bootstrap_peers in PyO3 constructor
  • rust/exo_pyo3_bindings/exo_pyo3_bindings.pyi: Update type stub
  • src/exo/routing/router.py: Pass peers to NetworkingHandle
  • src/exo/main.py: --bootstrap-peers CLI arg + EXO_BOOTSTRAP_PEERS env var

Why It Works

Bootstrap peers are dialed in the existing retry loop, the same path that mDNS-discovered peers take. The swarm handles connection, Noise handshake, and gossipsub mesh joining from there.

PeerId is intentionally not required in the multiaddr; the Noise handshake discovers it.
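
The retry behavior described above can be pictured with a small Python model. This is only an illustration: the class BootstrapDialer and its methods are hypothetical, and skipping addresses that already have a live connection is an assumption, not something confirmed from rust/networking/src/discovery.rs.

```python
from dataclasses import dataclass, field


@dataclass
class BootstrapDialer:
    """Hypothetical stand-in for the bootstrap part of the retry loop."""

    bootstrap_addrs: list[str]
    connected: set[str] = field(default_factory=set)

    def on_connected(self, addr: str) -> None:
        # Record that a connection to this address is live.
        self.connected.add(addr)

    def on_disconnected(self, addr: str) -> None:
        # Forget the connection so the next tick dials the address again.
        self.connected.discard(addr)

    def addrs_to_dial(self) -> list[str]:
        """Called on every retry tick: addresses with no live connection."""
        return [a for a in self.bootstrap_addrs if a not in self.connected]
```

Because the list of bootstrap addresses is kept for the lifetime of the node, a peer that starts late or drops off is simply picked up on a later tick.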

Docker Compose example:

services:
  exo-1:
    environment:
      EXO_BOOTSTRAP_PEERS: "/dns4/exo-2/tcp/30000"
  exo-2:
    environment:
      EXO_BOOTSTRAP_PEERS: "/dns4/exo-1/tcp/30000"

Test Plan

Manual Testing

Docker Compose config
services:
  exo-node1:
    build:
      context: .
      dockerfile: Dockerfile.bootstrap-test
    container_name: exo-bootstrap-node1
    hostname: exo-node1
    command: ["-q", "--libp2p-port", "30000", "--bootstrap-peers", "/ip4/172.30.20.3/tcp/30000"]
    environment:
      - EXO_LIBP2P_NAMESPACE=bootstrap-test
    ports:
      - "52415:52415"
    networks:
      bootstrap-net:
        ipv4_address: 172.30.20.2
    deploy:
      resources:
        limits:
          memory: 4g

  exo-node2:
    build:
      context: .
      dockerfile: Dockerfile.bootstrap-test
    container_name: exo-bootstrap-node2
    hostname: exo-node2
    command: ["-q", "--libp2p-port", "30000", "--bootstrap-peers", "/ip4/172.30.20.2/tcp/30000"]
    environment:
      - EXO_LIBP2P_NAMESPACE=bootstrap-test
    ports:
      - "52416:52415"
    networks:
      bootstrap-net:
        ipv4_address: 172.30.20.3
    deploy:
      resources:
        limits:
          memory: 4g

networks:
  bootstrap-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.30.20.0/24

Two containers on a bridge network (172.30.20.0/24), fixed IPs, --libp2p-port 30000, cross-referencing --bootstrap-peers.
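
For more than two nodes, the cross-referencing --bootstrap-peers values can be generated rather than written by hand. The sketch below assumes nodes get consecutive addresses starting at host offset 2 in the subnet, as in the config above; the function bootstrap_plan is hypothetical.

```python
import ipaddress


def bootstrap_plan(
    subnet: str, node_count: int, port: int, first_host: int = 2
) -> dict[str, list[str]]:
    """Map each node's IP to the multiaddrs of every other node."""
    network = ipaddress.ip_network(subnet)
    # Consecutive host addresses, e.g. .2, .3, ... inside the subnet.
    ips = [str(network.network_address + first_host + i) for i in range(node_count)]
    return {
        ip: [f"/ip4/{other}/tcp/{port}" for other in ips if other != ip]
        for ip in ips
    }
```

Joining each list with commas gives the value to pass to --bootstrap-peers or EXO_BOOTSTRAP_PEERS for that node.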

Both nodes found each other, established a connection, and then ran the election protocol.

Automated Testing

4 Rust integration tests in rust/networking/tests/bootstrap_peers.rs (cargo test -p networking):

| Test | What it verifies | Result |
|---|---|---|
| two_nodes_connect_via_bootstrap_peers | Node B discovers Node A via bootstrap addr (real TCP connection) | PASS |
| create_swarm_with_empty_bootstrap_peers | Backward compatibility: works with no bootstrap peers | PASS |
| create_swarm_ignores_invalid_bootstrap_addrs | Invalid multiaddrs silently filtered | PASS |
| create_swarm_with_fixed_port | listen_port parameter works | PASS |

All 4 pass. The connection test takes ~6s.
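
To illustrate what "silently filtered" means for invalid multiaddrs, here is a deliberately narrow Python sketch. It only recognizes the /ip4/<address>/tcp/<port> shape used in the test setup and is not the parser the Rust code relies on; both function names are hypothetical.

```python
import ipaddress


def is_valid_tcp_multiaddr(addr: str) -> bool:
    """Accept only /ip4/<IPv4 address>/tcp/<port>; everything else is rejected."""
    parts = addr.split("/")
    if len(parts) != 5 or parts[0] != "" or parts[1] != "ip4" or parts[3] != "tcp":
        return False
    try:
        ipaddress.IPv4Address(parts[2])
    except ValueError:
        return False
    port = parts[4]
    return port.isascii() and port.isdigit() and 0 < int(port) <= 65535


def filter_bootstrap_addrs(addrs: list[str]) -> list[str]:
    """Drop invalid entries without raising, keeping the original order."""
    return [a for a in addrs if is_valid_tcp_multiaddr(a)]
```

Dropping bad entries instead of failing keeps a single typo in EXO_BOOTSTRAP_PEERS from preventing startup, at the cost of the typo going unnoticed unless it is logged.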

@Deepzima Deepzima force-pushed the feat/static-peer-discovery branch from a2177b1 to e7ca9f8 Compare March 9, 2026 17:09
@Deepzima Deepzima marked this pull request as ready for review March 9, 2026 17:58
Evanev7 (Member) left a comment

at first pass this seems fine. i expect some breaking changes here in the future as we work around some of libp2p's issues and move off gossipsub for our data plane.

src/exo/main.py Outdated
parser.add_argument(
"--libp2p-port",
type=int,
default=None,
Evanev7 (Member):

we could also default to 0 and not have the port be nullable either.

Deepzima (Contributor, Author):

Yep, makes sense. I'll make the change.


#[new]
fn py_new(identity: Bound<'_, PyKeypair>) -> PyResult<Self> {
#[pyo3(signature = (identity, bootstrap_peers=vec![], listen_port=None))]
Evanev7 (Member):

don't provide default arguments when we won't make use of them

Deepzima (Contributor, Author):

same, ack

Comment on lines +108 to +109
bootstrap_peers: list[str] | None = None,
listen_port: int | None = None,
Evanev7 (Member):

ditto. bootstrap_peers doesn't need nullability at all.

@Deepzima Deepzima requested a review from Evanev7 March 21, 2026 17:46
Evanev7 (Member) left a comment

nice! let's get her in

@Evanev7 Evanev7 force-pushed the feat/static-peer-discovery branch 2 times, most recently from e46d29f to 27d42d8 Compare March 23, 2026 16:06
Deepzima and others added 3 commits March 25, 2026 10:48
Address review feedback: use zero-values (empty sequence, port 0)
instead of Option/None throughout the bootstrap peers API. Defaults
live only in main.py's argument parser, not in the Rust PyO3 boundary.

Signed-off-by: DeepZima <deepzima@outlook.com>
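
The zero-value convention in this commit can be sketched in Python as follows. The function start_node is a hypothetical stand-in, not the NetworkingHandle constructor; it only shows that an empty sequence and port 0 remove the need for None, since binding to port 0 lets the OS choose a free port.

```python
import socket
from collections.abc import Sequence


def start_node(bootstrap_peers: Sequence[str], listen_port: int) -> tuple[list[str], int]:
    """Bind a TCP listener and report the peers to dial and the actual port.

    Neither argument is optional: pass an empty sequence for "no bootstrap
    peers" and 0 for "let the OS assign a port".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", listen_port))
        sock.listen()
        # After binding to port 0, getsockname() reports the port the OS chose.
        actual_port = sock.getsockname()[1]
    return list(bootstrap_peers), actual_port
```

Callers never branch on None, and the only place a default appears is wherever the arguments are parsed.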
@Evanev7 Evanev7 force-pushed the feat/static-peer-discovery branch from b4c4c16 to 4090a69 Compare March 25, 2026 10:48
@Evanev7 Evanev7 enabled auto-merge (squash) March 25, 2026 10:50
@Evanev7 Evanev7 merged commit 2da740c into exo-explore:main Mar 25, 2026
3 checks passed
@Deepzima Deepzima deleted the feat/static-peer-discovery branch March 25, 2026 13:31