bugfix: fix mtp prefix cache prefill starvation.#1264
bugfix: fix mtp prefix cache prefill starvation.#1264phantomlei3 wants to merge 1 commit intojd-opensource:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces an explicit reference counting mechanism for physical blocks to improve the accuracy of usage tracking, particularly when prefix caching is enabled. It also updates the prefill scheduler to bypass memory threshold checks when using the prefix cache. The review feedback identifies critical thread-safety risks due to non-atomic updates of the new reference counters and notes a style guide violation regarding the use of auto for primitive types.
|
|
||
| // const auto block_ids = allocate(num_blocks); | ||
| num_used_blocks_.fetch_add(num_blocks, std::memory_order_relaxed); | ||
| add_seq_refs(blocks); |
There was a problem hiding this comment.
what's this differs from the current implementation.
Currently, when a block is assigned, the class's assignment operator automatically increments the reference count (ref count++). In theory, there's no difference compared to the current approach.
Block& Block::operator=(const Block& other) {
if (this != &other) {
dec_ref_count();
id_ = other.id_;
size_ = other.size_;
manager_ = other.manager_;
ref_count_ = other.ref_count_;
memcpy(hash_value_, other.hash_value_, XXH3_128BITS_HASH_VALUE_LEN);
inc_ref_count(); // <--------------------- here
}
return *this;
}
Under what circumstances would current implementation cause a problem?
There was a problem hiding this comment.
ref_count and active_seq_refs are counting different things.
| Step | Holder of physical block B | ref_count(B) | Active sequences using B |
|---|---|---|---|
| 1 | Prefix cache only | 1 | 0 |
| 2 | PrefixCache::match() returns a local shared_blocks vector | 2 | 0 |
| 3 | sequence->add_kv_blocks(shared_blocks) copies B into the sequence state | 3 | 1 |
| 4 | Later allocation fails, so we call deallocate(shared_blocks) first | 2 | 1 |
| 5 | Then sequence->reset() releases the sequence-owned copy | 1 | 0 |
The problem is that ref_count includes every Block handle copy:
- the prefix-cache node
- the temporary shared_blocks vector
- the sequence-owned vector
But num_used_blocks_ is supposed to mean:
"how many physical blocks are still occupied by live sequences"
So in prefix-cache paths, ref_count is not a reliable proxy for sequence ownership. A temporary copy can increaseref_count even though no extra sequence is using the block. That is why the old ref_count() <= 2 heuristic can drift, while active_seq_refs_ tracks the intended quantity directly.
5f34a91 to
83b7791
Compare
83b7791 to
801f372
Compare
No description provided.