Optimize memory and cache performance#649
Optimize memory and cache performance#649liangzhang-keepmoving wants to merge 3 commits intosipeed:mainfrom
Conversation
nikolasdehor
left a comment
There was a problem hiding this comment.
Three separate changes bundled here: MemoryStore caching, LRU list optimization, and HTML entity decoding. The LRU change is solid. The other two have issues.
1. MemoryStore cache is not concurrency-safe
MemoryStore has no synchronization. Multiple goroutines can call ReadLongTerm/WriteLongTerm concurrently (e.g., heartbeat writing memory while a user message reads it). The new cache fields (longTermCache, todayCache, todayCacheKey) are read and written without any mutex, creating data races. Either add a sync.RWMutex or use sync.Map/atomic values.
This is a correctness bug that will trigger under -race.
2. Cache invalidation on external edits
If MEMORY.md is edited externally (by the filesystem tool, a cron job, or the user), the in-memory cache becomes stale and serves outdated content. The old code always read from disk, which was correct. Consider either: (a) checking file mtime before serving from cache, or (b) using a file watcher, or (c) adding cache invalidation when the filesystem tool writes to the memory file.
3. LRU doubly-linked list (positive)
The SearchCache LRU conversion from slice-based to doubly-linked list is correct and well-implemented. O(1) move-to-tail and removal vs O(n) is a real improvement. The eager expiration during Get is also good.
One minor issue: in Get(), you delete expired entries while iterating over the map in the similarity match loop. In Go, deleting from a map during range iteration is safe, so this is fine, but worth being explicit about in a comment.
4. HTML entity decoding is incomplete and has a regression
The entity map only covers 9 entities. HTML has hundreds of named entities and numeric entities (&#NNN; and &#xHHHH;). Consider using html.UnescapeString() from the standard library html package, which handles all of them correctly.
Also, removeHTMLTags compiles 30+ regexps on every call (one per block tag plus the catch-all). These should be compiled once at package init time or as package-level vars.
5. Chinese comments
The comments in web.go are in Chinese. The rest of the codebase uses English comments. Please translate for consistency.
Please address item 1 (data race) before merging.
bd9de7a to
b6cdcea
Compare
|
Thank you for your thorough review. I've addressed all your suggestions as follows: 1. ✅ MemoryStore cache is not concurrency-safe
2. ✅ Cache invalidation on external edits
3. ✅ LRU doubly-linked list (positive)
4. ✅ Removed unrelated changes
|
|
@liangzhang-keepmoving In web.go, I can see some comments are written in Chinese. I was wondering if you could change to English? |
|
Hi @0x5487, thanks for your feedback! I've addressed the Chinese comments issue by removing those web.go modifications entirely. The HTML entity decoding changes have been moved to a separate PR that's already been submitted. The current PR now only focuses on memory and cache performance optimizations without any web.go changes. |
nikolasdehor
left a comment
There was a problem hiding this comment.
Re-reviewed the updated diff. All concerns from the original review have been addressed:
-
Data race fixed —
sync.RWMutexprotects all cache operations with appropriateRLock/Lockgranularity. The read-check-then-write pattern inReadLongTermandReadTodaycorrectly releases the read lock before acquiring the write lock, avoiding deadlocks. -
Cache invalidation — File mtime checking catches external edits. The
todayCacheKeydate-based invalidation for daily notes is a clean design. -
LRU linked list — The doubly-linked list replaces the O(n) slice scan for
moveToEnd. TheremoveEntryLocked/addToTailLockedhelpers are correct and the pointer cleanup prevents memory leaks. -
Unrelated web.go changes removed — Confirmed the diff no longer touches
web.go. -
Expired entry cleanup during
Get()— Good addition. The inline comment about Go map deletion during range iteration being safe is helpful for future readers.
LGTM. Clean, focused PR.
|
I have successfully resolved all conflicts, including those in pkg/agent/loop.go, pkg/tools/web.go, and pkg/tools/web_test.go. All changes have been pushed to the branch, and the PR is now ready to be merged. |
|
You should rebase your own commit on main branch and fix conflicts |
… slice-based LRU to doubly-linked list for O(1) operations\n2. MemoryStore: Add concurrency safety with sync.RWMutex\n3. MemoryStore: Add cache invalidation based on file modification time\n4. Add safety comment for map deletion during iteration
b69d9b9 to
6f6de45
Compare
|
Hi maintainers, I've successfully rebased my commit on the main branch and fixed all conflicts. The branch now only contains the necessary changes for optimizing memory and cache performance:
The branch is now clean and ready for review and merge. |
- Remove problematic web_test.go file - Keep other changes from optimize-memory-cache branch
|
@liangzhang-keepmoving Hi! This PR has had no activity for over 2 weeks, so I'm closing it for now to keep things tidy. If it's still relevant, feel free to reopen it anytime and we'll pick it back up. |
Summary
This PR optimizes memory and cache performance in PicoClaw by:
Changes Made
1. MemoryStore Optimization (
pkg/agent/memory.go)longTermCachestring for caching MEMORY.md contenttodayCacheandtodayCacheKeyfor caching daily notesReadLongTerm(),WriteLongTerm(),ReadToday(), andAppendToday()to use cachegetCacheKey()method to generate cache keys based on date2. SearchCache Optimization (
pkg/skills/search_cache.go)order []stringwith doubly-linked list (head/tail pointers)prevandnextpointers tocacheEntrystructaddToTailLocked(),removeEntryLocked(), andmoveToTailLocked()Benefits
These changes make PicoClaw even more lightweight and responsive, especially on resource-constrained devices like the $10 hardware it's designed to run on.