-
Notifications
You must be signed in to change notification settings - Fork 161
Mempool syncing optimizations & fixups #89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Overall this MR seems to greatly improve the initial sync and steady state of electrs alongside a slower bitcoind. some issues found during testing
|
RCasatta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK a6a796c code review
My comments are mostly styling and perf improv.
Note also I am not that familiar with the overall codebase, but all changes seems reasonable/
While tracing this with wireshark, I observed that electrs sends a request (such as getbestblockhash), and waits for a response. After 30 seconds (and a few TCP Keep-Alive interactions), bitcoind sends a FIN,ACK without any text response. Seems to be some behaviour in bitcoind. |
a6a796c to
a6c181e
Compare
I was unable to reproduce this, the "missing transaction" message appears to be returned as a JSONRPC error without panicking:
I'm seeing that behaviour on my local regtest env too. |
|
utACK a6c181e |
a6c181e to
f3a13e7
Compare
RCasatta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK f3a13e7
|
in analyzing pcap trace between local electrs and local. bitcoin rpc, we see electrs asking for getrawmempool on |
|
|
following one of the "disconnects":
electrs should be responding to the FIN,ACK and active, and reopen |
philippem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see earlier comments about connection handling.
|
I misread the comment about the disconnects
This should already be the case. In fact only errors explicitly constructed as an
Looking into this. |
bitcoind appears to close the connection after a period of inactivity, which can be adjusted with the (hidden) config option Adjusting it upwards made the disconnection issues I was experiencing in my local env go away. I also tested with electrs and bitcoind connected over the internet (local electrs connecting to a regtest bitcoind on a vps) and was able to retain open TCP connections with no disconnections for several hours (used every ~2 minutes, with I also tried investigating what's causing electrs not to close immediately in reply to bitcoind's FIN, but do not have an answer yet. |
This actually hurts performance because the batched response has to be bueffered on the bitcoind side, as @TheBlueMatt explains at romanz#373 (comment) Instead, send multiple individual RPC requests in parallel using a thread pool, with a separate RPC TCP connection for each thread. Also see romanz#374
The indexing process was adding transactions into the store so that prevouts funded & spent within the same batch could be looked up via Mempool::lookup_txos(). If the indexing process later failed for any reason, these transactions would remain in the store. With this change, we instead explicitly look for prevouts funded within the same batch, then look for the rest in the chain/mempool indexes and fail if any are missing, without keeping the transactions in the store.
Previously, if any of the mempool transactions were not available because they were evicted between getting the mempool txids and txs themselves, the mempool syncing operation would be aborted and tried again from scratch. With this change, we instead keep whatever transactions we were able to fetch, then get the updated list of mempool txids and re-try fetching missing ones continuously until we're able to get a full snapshot.
- Reduce logging level for bitcoind's JSONRPC response errors These can happen pretty often for missing mempool txs and do not warrant warn-level logging. Unexpected RPC errors will bubble up and be handled appropriately. - Add more verbose logging for mempool syncing
Keep RPC TCP connections open between sync runs and reuse them, to reduce TCP connection initialization overhead.
f3a13e7 to
7a068bf
Compare
|
Rebased and added some more verbose logging for mempool syncing. |
|
testing latest commit using local electrs and bitcoind, I set parameters in bitcoin.conf I have not seen any "disconnected from daemon while receiving" messages. |
|
utACK 7a068bf |
|
These are the guidelines for the bitcoin.conf rpc settings: D = --daemon-parallelism D for electrs, should be # of cores given to container, default is 2 E = # of electrs instances B = # of bitcoin core instances Then in bitcoin.conf set that is, we provide D + 3 threads for each electrs instead, and D*8 queue for each electrs. bitcoin.conf settings: |
Prior to this change, the local electrs mempool was only updated with new transactions once we were able to get a complete snapshot of every transaction in bitcoind's mempool, which could require multiple rounds of attempts if any transactions are replaced (or otherwise evicted) between querying for the mempool txids and querying for the transactions themselves. PR #89 made each round more efficient, but obtaining a complete snapshot can still potentially take many rounds the more RBF is used, with each round taking longer the larger the mempool gets (just listing the mempool txids to find the delta becomes costly). With this change, the local mempool is instead updated with whatever transactions we are able to get, without waiting for a fully consistent snapshot - which should improve performance and reduce latency. Making partial updates requires some extra care on the electrs side to ensure mempool consistency (i.e., no double-spends or missing input txos), since we can no longer fully rely on bitcoind validating consistency for us. Specifically: - If any transactions were RBF'd and gone by the time we queried for them, we explicitly check for transactions that depend on the missing transactions (or their descendants) and remove them too. - Mempool evictions must be processed first before any additions, even if the replacing transactions aren't available. This ensures double-spend conflicts are not possible.


Optimizations:
Send RPC requests in parallel over multiple TCP connections, without batching (38e73d9)
The number of parallel requests to send can be controlled with a new
--daemon-parallelismCLI options (defaults to 4).Continuously attempt to fetch a full mempool snapshot, keeping whatever transactions we were able to get each time (6a14f40)
Fixups (not strictly necessary for the other changes, but touches on related areas and the previous behavior was wrong):
Don't add transactions with unknown parents to ensure mempool consistency (c60207a)
Make sure the chain tip doesn't move while fetching the mempool (a6a796c)
This will require adjusting bitcoind's
rpcthreadsand possiblyrpcworkqueueconfiguration options upwards. The defaultrpcthreadsis 4, which would be exhausted by mempool fetching alone (electrskeeps at least one other open connection, and we probably want to allow some room for other clients too).Based on top of #76. (now merged)