-
Notifications
You must be signed in to change notification settings - Fork 412
Make kv.mem *fast* #9766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Make kv.mem *fast* #9766
Conversation
Current kv.mem iteration sorts a slice at every call to Next(). Instead use a sorted map. Will use this in `lakefs gc simulate`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR improves the performance of the in-memory KV store (kv.mem) by replacing the inefficient approach of sorting a slice on every Next() call with a LLRB (Left-Leaning Red-Black) tree-based sorted map from the github.com/NVIDIA/sortedmap library. Additionally, benchmark tests for the mem driver are added to measure the performance improvements.
Key changes:
- Replaces map-based storage with
sortedmap.LLRBTreefor maintaining sorted key-value pairs - Removes base64 encoding of keys (previously used to convert byte slices to map keys)
- Updates iterator implementation to use tree-based bisection instead of full slice sorting
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| pkg/kv/mem/store.go | Core implementation changes: replaces map with LLRB tree, removes base64 encoding, updates all operations (Get, Set, SetIf, Delete, Scan) to use sortedmap API |
| pkg/kv/msg_test.go | Adds benchmark tests for mem driver and improves error handling in test utilities (Errorf instead of Fatalf for better test coverage) |
| pkg/kv/kvtest/store.go | Adds test to verify that keys are compared by content rather than by reference, which is important for the mem implementation |
| go.mod, go.sum | Adds NVIDIA sortedmap and cstruct dependencies (version 1.30.0) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/kv/msg_test.go
Outdated
| setupMsgs(b, ctx, store, tt.messagesDBSize, tt.setupParallelism) | ||
|
|
||
| // If store caches locally, this benchmarks cached data. kv.Store has | ||
| // not current facility for flushing caches; indeed it is unclear how to |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spelling error in comment: "not current facility" should be "no current facility".
| // not current facility for flushing caches; indeed it is unclear how to | |
| // no current facility for flushing caches; indeed it is unclear how to |
| key, value, ok, err := m.GetByIndex(index) | ||
| if !ok { | ||
| // end of iteration | ||
| e.start = nil | ||
| e.entry = l[0] | ||
| return true | ||
| return false | ||
| } |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing error handling: When m.GetByIndex(index) returns an error (err != nil), the function should check and handle it. Currently, only ok is checked but err could be non-nil even when ok is true or false. The error should be set and return false:
key, value, ok, err := m.GetByIndex(index)
if err != nil {
e.err = err
return false
}
if !ok {
// end of iteration
e.start = nil
return false
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was found sooner by golangci-lint, and explained better, and in fewer words, and in a definitive way. Major 👎🏽 here.
pkg/kv/mem/store.go
Outdated
| ok, err := m.Put(key, value) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if ok { | ||
| // New key. | ||
| return nil | ||
| } | ||
| // Existing key: modify it. | ||
| _, err = m.PatchByKey(key, value) | ||
| return err |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential data safety issue: The implementation stores byte slices directly as keys and values in the sortedmap without making copies. If the caller retains and modifies the byte slices after calling Set/SetIf, or if the caller modifies slices returned from Get/Scan, this could lead to data corruption or comparison errors. The byte slices should be copied before storing:
keyCopy := make([]byte, len(key))
copy(keyCopy, key)
valueCopy := make([]byte, len(value))
copy(valueCopy, value)
ok, err := m.Put(keyCopy, valueCopy)Similarly, the Get operations should return copies of the stored values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done by cloning on Put (take ownership from caller) but not on Get (where it is easy for the caller not to change the returned value, and indeed we do not change it).
| Key: key.([]byte), | ||
| Value: value.([]byte), |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential data safety issue: The iterator returns byte slices directly from the sortedmap without making copies (lines 233-234). If the caller modifies the returned Key or Value slices, it could corrupt the internal data structure. The slices should be copied before being returned:
e.entry = &kv.Entry{
PartitionKey: []byte(e.partition),
Key: append([]byte(nil), key.([]byte)...),
Value: append([]byte(nil), value.([]byte)...),
}| Key: key.([]byte), | |
| Value: value.([]byte), | |
| Key: append([]byte(nil), key.([]byte)...), | |
| Value: append([]byte(nil), value.([]byte)...), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be awful for performance. In any case callers to Get methods do not modify the returned slices. So not done.
Also, curiously the code here is exactly different from the previous way to clone byte slices. It does manage to be a copy of bytes.Clone - which has a comment explaining why it's probably not right to use here.
pkg/kv/mem/store.go
Outdated
| return true | ||
| return false | ||
| } | ||
| e.start = append(key.([]byte)[:], 0) |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The slice copy operation key.([]byte)[:] creates a new slice header pointing to the same underlying array. The subsequent append could potentially modify the original key if there's sufficient capacity. A safer approach would be to create a copy with a new backing array:
e.start = append([]byte(nil), key.([]byte)...)
e.start = append(e.start, 0)or simply:
e.start = append(append([]byte(nil), key.([]byte)...), 0)| e.start = append(key.([]byte)[:], 0) | |
| e.start = append(append([]byte(nil), key.([]byte)...), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Changed)
| entry := &kv.Entry{ | ||
| PartitionKey: partitionKey, | ||
| Key: key, | ||
| Value: value.([]byte), |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential data safety issue: The Get operation returns byte slices directly from the sortedmap without making copies (lines 84-88). The returned entry contains the original byte slices stored in the map, which if modified by the caller, would corrupt the internal state. The values should be copied before being returned in the entry.
| entry := &kv.Entry{ | |
| PartitionKey: partitionKey, | |
| Key: key, | |
| Value: value.([]byte), | |
| // Make copies of the slices to avoid exposing internal state | |
| partitionKeyCopy := make([]byte, len(partitionKey)) | |
| copy(partitionKeyCopy, partitionKey) | |
| keyCopy := make([]byte, len(key)) | |
| copy(keyCopy, key) | |
| valueBytes := value.([]byte) | |
| valueCopy := make([]byte, len(valueBytes)) | |
| copy(valueCopy, valueBytes) | |
| entry := &kv.Entry{ | |
| PartitionKey: partitionKeyCopy, | |
| Key: keyCopy, | |
| Value: valueCopy, |
Most of the issues flagged by copilot are irrelevant. In the case of Set operations it might make some sense to copy slices - they live longer, and the caller might use them somewhere (it doesn't happen in practice). Other kv operations are Gets, and that just doesn't happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Key: key.([]byte), | ||
| Value: value.([]byte), |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Next() method, the returned key and value from GetByIndex are type-asserted to []byte and stored directly in the entry (lines 246-247). These slices reference the internal storage of the sorted map. If the caller modifies these slices, it could corrupt the map's internal state.
The code should copy these byte slices before storing them in the entry, similar to how internalSet copies keys and values (lines 116-119). Use:
e.entry = &kv.Entry{
PartitionKey: []byte(e.partition),
Key: bytes.Clone(key.([]byte)),
Value: bytes.Clone(value.([]byte)),
}| Key: key.([]byte), | |
| Value: value.([]byte), | |
| Key: bytes.Clone(key.([]byte)), | |
| Value: bytes.Clone(value.([]byte)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Literally explained why we do not do this here.
| if err != nil { | ||
| return err | ||
| } | ||
| if ok { | ||
| // New key. | ||
| return nil | ||
| } | ||
| // Existing key: modify it. | ||
| _, err = m.PatchByKey(keyCopy, valueCopy) |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The internalSet method calls Put and then conditionally calls PatchByKey if the key already exists (lines 120-130). However, this is inefficient: Put will insert/update the value, and then PatchByKey is called again to update it.
The typical pattern would be to check if the key exists first, then call either Put or PatchByKey. However, without understanding the sortedmap API semantics, it's unclear if Put on an existing key should be followed by PatchByKey.
Based on typical sorted map APIs, Put should handle both insert and update cases. If this is true, the conditional PatchByKey call might be unnecessary. Please verify the sortedmap API documentation and simplify this logic if possible.
| if err != nil { | |
| return err | |
| } | |
| if ok { | |
| // New key. | |
| return nil | |
| } | |
| // Existing key: modify it. | |
| _, err = m.PatchByKey(keyCopy, valueCopy) |
| // Missing godoc in sortedmap: "Returns index of matching key:value pair or, if no | ||
| // match, index is to key:value just after where this key would go" |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The comment describes BisectRight behavior but contains a grammatical error. The comment says "Returns index of matching key:value pair or, if no match, index is to key:value just after where this key would go".
This should be rephrased for clarity: "Returns index of the matching key:value pair or, if no match, the index to the key:value pair just after where this key would go".
| // Missing godoc in sortedmap: "Returns index of matching key:value pair or, if no | |
| // match, index is to key:value just after where this key would go" | |
| // Missing godoc in sortedmap: "Returns index of the matching key:value pair or, if no match, the index to the key:value pair just after where this key would go" |
| // Caller could modify key, value - copy them out. (Don't use bytes.Clone, which can | ||
| // waste capacity!) | ||
| keyCopy := make([]byte, len(key)) | ||
| copy(keyCopy, key) | ||
| valueCopy := make([]byte, len(value)) | ||
| copy(valueCopy, value) |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The comment states "Don't use bytes.Clone, which can waste capacity!" but this is misleading. bytes.Clone is designed specifically for this use case and is more efficient than the manual approach used here (lines 116-119).
bytes.Clone uses append([]byte(nil), b...) internally, which is optimized by the Go compiler and runtime. The manual approach allocates exact-size slices, but bytes.Clone does the same thing more concisely and with better compiler optimizations.
Consider simplifying to:
keyCopy := bytes.Clone(key)
valueCopy := bytes.Clone(value)| // Caller could modify key, value - copy them out. (Don't use bytes.Clone, which can | |
| // waste capacity!) | |
| keyCopy := make([]byte, len(key)) | |
| copy(keyCopy, key) | |
| valueCopy := make([]byte, len(value)) | |
| copy(valueCopy, value) | |
| // Caller could modify key, value - copy them out. Use bytes.Clone for efficient copy. | |
| keyCopy := bytes.Clone(key) | |
| valueCopy := bytes.Clone(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is literally wrong. The docs say:
[...] The result may have additional unused capacity.
Here key and value are immutable, so any extra capacity would be wasted. This could make memory performance significantly worse.
|
This also fixes a bug in mem KV - it scanned by sorting base64 representations. This is wrong, the code expects lexicographic byte ordering. |
Current kv.mem iteration sorts a slice at every call to Next(). Instead use a sorted map.
Will use this in
lakefs gc simulate.