Skip to content

Add LSQ leashing 0.30.0.1#18

Draft
tweag-ev-ak wants to merge 12 commits intoleashing-stub-branch-for-prfrom
add-leashing-0.30.0.1
Draft

Add LSQ leashing 0.30.0.1#18
tweag-ev-ak wants to merge 12 commits intoleashing-stub-branch-for-prfrom
add-leashing-0.30.0.1

Conversation

@tweag-ev-ak
Copy link

No description provided.

tweag-ev-ak and others added 6 commits February 2, 2026 16:24
* Update ouroboros-network ref

* Add support for LeashID

* Support explicit unleashing
@tweag-ev-ak tweag-ev-ak requested a review from nfrisby as a code owner February 26, 2026 13:59
Copy link

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review is probably more than you had intended to ask for! You offered to have a call to discuss this code with me and I said "looks small enough for me to look over myself first", and suddenly here I am saying things like "this code needs comments", which is a bit unfair since you just wanted to discuss the ideas with me.

But I figured a more thorough/serious review than requested will likely be more useful, so quit trying to prevent myself from settling into "review mode" and let it rip. I hope the result is helpful in the way you were hoping 🤞And we can still have that call if there's particular things you'd like to discuss synchronously.

-- is called.
ChainDB.LoEEnabled $
AF.Empty AF.AnchorGenesis
gnkaGDDArgs <- for (gcGDDConfig gcfg) $ \p -> do
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO doesn't need to be monadic anymore

leashingState <- readLeashingState
if not $ Map.null leashingState then
ChainDB.LoEEnabled <$> readLoEFragment
-- readLoEFragment >>= \case
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO commented out code

-- ^ The LoE fragment.
-> StrictTVar m (ChainDB.GetLoEFragment m blk)
-> m ()
setGetLoEFragment readLeashingState readGsmState readGenesisLoEFragment readLoEFragment varGetLoEFragment =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between readLoEFragment and readGenesisLoEFragment?

My suspicion is that readLoEFragment should have a more distinct name now. TODO return here when I have an opinion about what that name should be

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #18 (comment)

{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

{-# OPTIONS_GHC -Wno-unused-matches -Wno-unused-local-binds #-}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this about? Different versions of GHC disagree or something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover, removed.

import qualified Ouroboros.Network.AnchoredFragment as AF
import Ouroboros.Network.Protocol.LocalStateQuery.Type (LeashID)

type LeashingFingerprint blk = ([(LeashID, ChainHash (HeaderWithTime blk))], Maybe (ChainHash (HeaderWithTime blk)))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comment explaining what the two components are.

I'm guessing: one LeashID per LSQ client in Acquired state, and the Just is from GSM's Syncing mode

oldLoEFrag <- atomically $ swapTVar varLoEFrag loeFrag
-- The chain selection only depends on the LoE tip, so there
-- is no point in retriggering it if the LoE tip hasn't changed.
when (AF.headHash oldLoEFrag /= AF.headHash loeFrag) $
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was moved to the Leashing.hs module ✔️

, getDiffusionPipeliningSupport ::
DiffusionPipeliningSupport
, getBlockchainTime :: BlockchainTime m
, getLeashingStateVar :: StrictTVar m (ChainDB.LeashingState blk)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My instincts are that LeashingState (and so getLeashingStateVar etc) should be explicitly mentioning LSQ.

Recap of where the jargon comes from:

  • The Limit on Eagerness (LoE) prevents a node syncing via Ouroboros Genesis from committing to a chain that would force it to disconnect from any peer while it's syncing. The gddWatcher is already inspecting those peers' candidates, and so it is a convenient place to compute the least restrictive LoE anchor fragment that protects the current peers.
  • "Leashing" is a way an adversarial peer could abuse the LoE to prevent the syncing node from making progress. The Limit on Patience (LoP) and the Genesis Density Disconnector (GDD) cooperate to mitigate that attack vector.
  • The PR that this feature adds is to intentionally let LocalStateQuery (LSQ) peers reuse the existing LoE in a way that intentionally leashes the node (regardless of whether it's syncing).

So I think "leashing state" is too vague. But "LsqLoE" or "LsqLeashing" etc would be sufficiently indicative.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the PR should be "Add LSQ leashing". Etc.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, that would make it easier to notice/remember that getLeashingStateVar only exists so that the LSQ servers can keep that state up-to-date.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, that's a great idea 👍

-- | The LoE fragment. It starts at a (recent) immutable tip and ends at
-- the common intersection of the candidate fragments.
StrictTVar m (AnchoredFragment (HeaderWithTime blk)) ->
StrictTVar m (Maybe (AnchoredFragment (HeaderWithTime blk))) ->
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't obvious to me why this must be a Maybe. But I think the answer is: the leashingWatcher needs to know whether or not the GDD's written LoE fragment should be included. So this value is Nothing while Genesis isn't even bothering to keep it up-to-date but it's Just while Genesis is actually active.

A comment explaining that seems very useful.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO double-check that a Nothing is definitely written to this TVar the Genesis State Machine (GSM) transitions into CaughtUp (aka "turns off" the GDD etc).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't obvious to me why this must be a Maybe. But I think the answer is: the leashingWatcher needs to know whether or not the GDD's written LoE fragment should be included. So this value is Nothing while Genesis isn't even bothering to keep it up-to-date but it's Just while Genesis is actually active.

That's correct. Will add a comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO double-check that a Nothing is definitely written to this TVar the Genesis State Machine (GSM) transitions into CaughtUp (aka "turns off" the GDD etc).

Good point, added the write action 👍

(genesisArgs, setLoEinChainDbArgs) <-
mkGenesisNodeKernelArgs llrnGenesisConfig
genesisArgs <- mkGenesisNodeKernelArgs llrnGenesisConfig
varGetLoEFragment <- newTVarIO $ pure ChainDB.LoEDisabled
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO I didn't scrutinize this very much. Seems plausible as just a copy-paste from what used to be in mkGenesisNodeKernelArgs.

else
readGenesisLoEFragment >>= \case
Just glf -> do
-- leashing is disabled, run old behavior
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other big comment about "names should mention LSQ".

I think "leashing is disabled" should say "LSQ leashing is disabled", for example.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate concern: I don't think we need mutual exclusion "either the LSQ LoE or the GDD LoE". Because we can intersect them to correctly combine them, can't we?

Copy link
Author

@tweag-ev-ak tweag-ev-ak Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate concern: I don't think we need mutual exclusion "either the LSQ LoE or the GDD LoE". Because we can intersect them to correctly combine them, can't we?

Yes, that's what we do right now in the leashing watcher:

    wNotify :: (LsqLeashingWatcherState blk) -> m ()
    wNotify LsqLeashingWatcherState{..} = do
        let
          lsqLeashingCandidates = Map.toList lsqLeashingState
          prefix = maybe curChain id genesisLoEFrag
          loeFrag = fst $ sharedCandidatePrefix prefix lsqLeashingCandidates 

if there is a genesis LoE fragment, we use it as a base for sharedCandidatePrefix, otherwise it's current chain. Is it a correct approach?

Only later, in setGetLoEFragment, we decide which the LoE fragment to return, basing on both fragment variables, the lsqLeashingState and the gsm state.

readLoEFragment in setGetLoEFragment returns the resulting LoE fragment calculated by lsqLeashingWatcher.

readGenesisLoEFragment returns the genesis loe fragment calculated by gddWatcher.

Sorry that I confused you because of all these variables mess. I wanted to keep the original implementation as much as possible and extend it rather than changing the existent one. I thought it would be easier to understand the additions. Turned out it only messed up the understanding.

We can do that final calculation in the lsq leashing watcher instead of setGetLoEFragment. I just couldn't decide what's the correct approach. Either we do all the logic in the watchers and remove the getLoEFragment from setGetLoEFragment (it will become just plain readLoEFragment) or combine the whole calculation in setGetLoEFragment (what I ended up with to keep the old behaviour and make our additions more clear).

@nfrisby nfrisby marked this pull request as draft February 26, 2026 21:22
@nfrisby
Copy link

nfrisby commented Mar 1, 2026

Uh oh.... I'm anticipating a burdensome amount of conflicts with IntersectMBO#1891, since it's changing how Forkers are acquired. Maybe it won't be too bad 🤞

@tweag-ev-ak tweag-ev-ak force-pushed the add-leashing-0.30.0.1 branch from 0ecf3e1 to ce1fc45 Compare March 2, 2026 11:12
@tweag-ev-ak tweag-ev-ak changed the title Add leashing 0.30.0.1 Add LSQ leashing 0.30.0.1 Mar 2, 2026
@tweag-ev-ak tweag-ev-ak force-pushed the add-leashing-0.30.0.1 branch from ce1fc45 to 9274b49 Compare March 2, 2026 11:38
@tweag-ev-ak tweag-ev-ak force-pushed the add-leashing-0.30.0.1 branch from 9274b49 to 4210cb3 Compare March 2, 2026 11:54
@tweag-ev-ak tweag-ev-ak requested a review from nfrisby March 2, 2026 12:25
Copy link

@axman6 axman6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments while reading through

let
leashingFragment = case mpt of
ImmutableTip -> AF.Empty $ AF.anchor currentChain
SpecificPoint p -> AF.takeWhileOldest (\(HeaderWithTime h _) -> headerPoint h <= p) currentChain
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several functions for splitting an AnchoredFragment efficiently which might be better than takeWhileOldest https://ouroboros-network.cardano.intersectmbo.org/ouroboros-network/api/Ouroboros-Network-AnchoredFragment.html#g:8

ImmutableTip -> AF.Empty $ AF.anchor currentChain
SpecificPoint p -> AF.takeWhileOldest (\(HeaderWithTime h _) -> headerPoint h <= p) currentChain
VolatileTip -> currentChain
let newState = Map.insert leashId leashingFragment lsqLeashingState
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check if this leashId exists already? I think it's fine, I'm pretty sure we always want to update to this fragment, but just want to check it isn't a situation we haven't considered.

Right forker
| Just leashId <- mLeashId -> do
traceM $ "My leash id " <> show leashId
atomically $ do
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that's worth adding to this discussion is that the main way that clients use LSQ is via short lived connections, so things like the Conn type will change for every query.

The querying functions explicitly always do Acquire -> Query -> Release -> Done (which also means a new thread is spun up for every query).

I've played around with making a slightly longer lived querying client but wasn't happy with it (hopefully the overhead of the extra round-trips isn't too large compared to the query itself). We might want to look at that again if the resource overhead on the node is a problem.

- Adds leashId to MsgDone message
- Removes leashId from MsgRelease message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants