[Store] Add ObjectDataType enum for type-aware metadata by he-yufeng · Pull Request #1719 · kvcache-ai/Mooncake

he-yufeng · 2026-03-23T00:46:09Z

Summary

Implements Phase 1 of the ObjectDataType system discussed in #1673.

This adds a data type classification for objects in Mooncake Store, so the metadata layer knows what kind of data each object holds (KV cache, weights, tensors, etc.). This lays the groundwork for type-aware eviction policies, replication strategies, and monitoring — without changing any existing behavior.

What changed

ObjectDataType enum in types.h — UNKNOWN(0), KVCACHE(1), TENSOR(2), WEIGHT(3), SAMPLE(4), ACTIVATION(5), GRADIENT(6), OPTIMIZER_STATE(7), METADATA(8)
ReplicateConfig.data_type field — clients can tag objects at PutStart time. Defaults to UNKNOWN, so existing callers are unaffected.
ObjectMetadata.data_type — propagated from PutStart config and stored as a const member.
Snapshot serialization — new format writes data_type as the 8th field. Deserialization auto-detects old (7 + replicas) vs new (8 + replicas) format, so rolling upgrades won't break.
Python bindings — ObjectDataType enum and ReplicateConfig.data_type exposed via pybind11.
Unit tests — enum values, stream operator, config defaults, PutStart propagation, uint8 roundtrip.

Backward compatibility

Fully backward compatible. Clients that don't set data_type get UNKNOWN everywhere. Old snapshots deserialize correctly (missing field defaults to UNKNOWN).

Next steps (Phase 2+)

Wire data_type into eviction scoring
Add per-type metrics/monitoring
Expose in gRPC/REST APIs

Closes phase 1 of #1673.

…cache-ai#1673) Introduce a data type classification system for objects stored in Mooncake Store, as agreed in the RFC discussion on issue kvcache-ai#1673. Changes: - Add ObjectDataType enum (UNKNOWN, KVCACHE, TENSOR, WEIGHT, etc.) in mooncake-store/include/types.h - Extend ReplicateConfig with a data_type field (default UNKNOWN) - Propagate data_type through PutStart into ObjectMetadata - Serialize/deserialize data_type in snapshot metadata, with backward compatibility for old snapshots (gracefully handles missing field) - Expose ObjectDataType enum and data_type field in Python bindings - Add unit tests for enum values, defaults, and PutStart propagation Existing clients that don't set data_type will continue to work unchanged (defaults to UNKNOWN everywhere).

gemini-code-assist · 2026-03-23T00:46:27Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a foundational ObjectDataType system to Mooncake Store, enabling type-aware metadata for stored objects. This enhancement allows the system to understand the nature of the data it holds, such as KV caches, tensors, or weights, without altering existing behavior. This lays crucial groundwork for future advanced features like intelligent eviction policies, optimized replication strategies, and granular monitoring based on data types.

Highlights

Object Data Type Enum: Introduced ObjectDataType enum in types.h to classify stored objects (e.g., KVCache, Tensor, Weight), providing a structured way to categorize data within the store.
ReplicateConfig Integration: Added a data_type field to ReplicateConfig, allowing clients to tag objects with their specific data type during the PutStart operation.
Metadata Propagation: Ensured that the ObjectMetadata structure now stores the data_type propagated from PutStart configurations, making this information persistent with the object's metadata.
Snapshot Backward Compatibility: Implemented a new snapshot serialization format that includes the data_type, with auto-detection logic to ensure seamless rolling upgrades and backward compatibility with older snapshot formats.
Python Bindings: Exposed the new ObjectDataType enum and the ReplicateConfig.data_type field through pybind11, enabling Python clients to utilize and specify object data types.
Comprehensive Unit Tests: Added new unit tests covering ObjectDataType enum values, stream operators, default configurations, PutStart propagation, and uint8 roundtrip conversions to ensure correctness and robustness.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces the ObjectDataType enum to classify objects in the Mooncake Store, which is a well-designed and forward-looking change. The implementation correctly integrates the new data type into ReplicateConfig, ObjectMetadata, and the serialization/deserialization logic. The backward compatibility handling for snapshot deserialization is robust, ensuring smooth rolling upgrades. The Python bindings are appropriately updated, and the comprehensive unit tests cover enum values, stream operators, default configurations, propagation, and roundtrip serialization. This is a solid foundation for future type-aware policies.

codecov-commenter · 2026-03-23T01:31:47Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 89.39394% with 14 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
mooncake-integration/store/store_py.cpp	0.00%	13 Missing ⚠️
mooncake-store/include/master_service.h	50.00%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

stmatengss · 2026-03-27T17:00:20Z

mooncake-integration/store/store_py.cpp

 PYBIND11_MODULE(store, m) {
+    // Object data type classification
+    py::enum_<ObjectDataType>(m, "ObjectDataType")
+        .value("UNKNOWN", ObjectDataType::UNKNOWN)


Maybe we should add a general type?

stmatengss · 2026-03-27T17:02:09Z

mooncake-store/src/master_service.cpp

Need clearer logic.

Copilot

Pull request overview

Adds a first-class ObjectDataType classification to Mooncake Store metadata so objects can be tagged (e.g., KV cache, weights, tensors) without changing default behavior (still UNKNOWN unless explicitly set).

Changes:

Introduces ObjectDataType enum + stream operator, and threads data_type through ReplicateConfig → MasterService::ObjectMetadata.
Extends snapshot msgpack serialization to include data_type while keeping new-code support for deserializing older snapshots.
Exposes ObjectDataType and ReplicateConfig.data_type via pybind11 and adds unit tests.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
mooncake-store/include/types.h	Adds `ObjectDataType` enum and `operator<<`
mooncake-store/include/replica.h	Adds `ReplicateConfig::data_type` and prints it in `operator<<`
mooncake-store/include/master_service.h	Stores `data_type` in `ObjectMetadata` and updates constructors/helpers
mooncake-store/src/master_service.cpp	Propagates `data_type` on `PutStart` and serializes/deserializes it in snapshots
mooncake-integration/store/store_py.cpp	Exposes enum + config field to Python via pybind11
mooncake-store/tests/object_data_type_test.cpp	Adds unit tests for enum/config/PutStart behavior
mooncake-store/tests/CMakeLists.txt	Registers the new test target

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-27T17:05:34Z

mooncake-store/include/types.h

+inline std::ostream& operator<<(std::ostream& os,
+                                const ObjectDataType& type) noexcept {
+    static const std::unordered_map<ObjectDataType, std::string_view>
+        type_strings{{ObjectDataType::UNKNOWN, "UNKNOWN"},
+                     {ObjectDataType::KVCACHE, "KVCACHE"},
+                     {ObjectDataType::TENSOR, "TENSOR"},
+                     {ObjectDataType::WEIGHT, "WEIGHT"},
+                     {ObjectDataType::SAMPLE, "SAMPLE"},
+                     {ObjectDataType::ACTIVATION, "ACTIVATION"},
+                     {ObjectDataType::GRADIENT, "GRADIENT"},
+                     {ObjectDataType::OPTIMIZER_STATE, "OPTIMIZER_STATE"},
+                     {ObjectDataType::METADATA, "METADATA"}};
+
+    auto it = type_strings.find(type);
+    os << (it != type_strings.end() ? it->second : "UNKNOWN");
+    return os;
+}


types.h now uses std::string_view in the ObjectDataType stream operator, but the header does not include <string_view> (and also relies on indirect includes for std::ostream). This makes the header non-self-contained and can cause build breaks for translation units that include types.h directly. Add the missing standard includes in types.h.

Copilot · 2026-03-27T17:05:35Z

mooncake-store/tests/object_data_type_test.cpp

+// PutStart with data_type propagates to ObjectMetadata
+TEST_F(ObjectDataTypeTest, PutStartWithDataType) {
+    std::unique_ptr<MasterService> service(new MasterService());
+    Segment segment = MakeSegment();
+    UUID client_id = generate_uuid();
+    auto mount_result = service->MountSegment(segment, client_id);
+    ASSERT_TRUE(mount_result.has_value());
+
+    UUID put_client = generate_uuid();
+
+    // Put with WEIGHT type
+    ReplicateConfig config;
+    config.replica_num = 1;
+    config.data_type = ObjectDataType::WEIGHT;
+
+    auto result = service->PutStart(put_client, "key_weight", 1024, config);
+    ASSERT_TRUE(result.has_value());
+    EXPECT_FALSE(result.value().empty());
+
+    auto end_result =
+        service->PutEnd(put_client, "key_weight", ReplicaType::MEMORY);
+    EXPECT_TRUE(end_result.has_value());
+}


PutStartWithDataType is described as verifying propagation to ObjectMetadata, but it never asserts that data_type was actually stored/serialized—only that PutStart/PutEnd succeed. Either add an assertion that reads back the stored metadata (e.g., via an existing test-only hook/snapshot roundtrip) or rename/re-scope the test so it matches what it verifies.

Copilot · 2026-03-27T17:05:35Z

mooncake-store/src/master_service.cpp

@@ -3543,6 +3546,9 @@ MasterService::MetadataSerializer::SerializeMetadata(
    // Serialize replicas count
    packer.pack(static_cast<uint32_t>(metadata.CountReplicas()));

+    // Serialize data_type
+    packer.pack(static_cast<uint8_t>(metadata.data_type));
+


Snapshot serialization now always writes the new array shape (8 + replicas_count) by inserting data_type. While the new deserializer can read both formats, this is only backward-compatible in the "new code reads old snapshots" direction. If any mixed-version deployment expects older binaries to load snapshots produced by newer binaries, they will likely fail on the extra field. Consider adding an explicit snapshot format/version marker or a compatibility mode to write the old format during rolling upgrades, and/or clarify the upgrade guarantees.

- Add GENERAL(9) to ObjectDataType for objects that don't fit specific categories but are known to be valid - Restructure snapshot format detection with named constants (kOldFieldCount/kNewFieldCount) and explicit bool flags - Add missing <string_view> include in types.h - Update Python bindings and tests

he-yufeng · 2026-03-28T03:12:01Z

Thanks for the review @stmatengss! Pushed a new commit addressing both points:

GENERAL type: Added GENERAL = 9 for objects that are known/valid but don't fit a specific category. Updated enum, stream operator, Python bindings, and tests.
Clearer deserialization logic: Refactored with named constants (kOldFieldCount/kNewFieldCount) and explicit is_old_format/is_new_format bools. The validation rejects early on mismatch, then the data_type read is in a separate block.

stmatengss · 2026-03-28T13:14:40Z

@he-yufeng Last thing, merge main and resolve conflicts. Thanks!

…pe-phase1 # Conflicts: # mooncake-store/include/master_service.h # mooncake-store/src/master_service.cpp

stmatengss · 2026-04-01T15:22:17Z

@he-yufeng Thank you for your contribution. It appears the CI is failing.

Resolve conflict in master_service.cpp: use AllocateAndInsertMetadata refactored helper from main while preserving config.data_type threading.

he-yufeng · 2026-04-06T07:44:11Z

@stmatengss Merged main and resolved the conflict in master_service.cpp. The conflict was between the inline allocation code (from this PR) and the refactored AllocateAndInsertMetadata helper (from #1662). Resolved by keeping the helper while threading config.data_type through it.

CI should re-run now. The previous stress test failure (Data Mismatch for stress_fixed_t6_32) looks like a flaky concurrent test unrelated to this PR's changes.

stmatengss · 2026-04-08T14:18:10Z

@stmatengss Merged main and resolved the conflict in master_service.cpp. The conflict was between the inline allocation code (from this PR) and the refactored AllocateAndInsertMetadata helper (from #1662). Resolved by keeping the helper while threading config.data_type through it.

CI should re-run now. The previous stress test failure (Data Mismatch for stress_fixed_t6_32) looks like a flaky concurrent test unrelated to this PR's changes.

Thanks. CI is triggered.

he-yufeng requested review from XucSh, YiXR, stmatengss and ykwd as code owners March 23, 2026 00:46

github-actions bot added run-ci Store labels Mar 23, 2026

gemini-code-assist bot reviewed Mar 23, 2026

View reviewed changes

he-yufeng and others added 4 commits March 23, 2026 10:59

style: fix clang-format violations

bfbdef2

ci: retrigger CI after transient infra failures

e3eb2f1

retrigger CI

96bad43

ci: retrigger sglang integration test

e68abff

stmatengss requested a review from Copilot March 27, 2026 16:59

Copilot started reviewing on behalf of stmatengss March 27, 2026 17:00 View session

stmatengss reviewed Mar 27, 2026

View reviewed changes

mooncake-store/src/master_service.cpp Outdated

Comment on lines 3607 to 3618

Copy link
Copy Markdown

Collaborator

stmatengss Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need clearer logic.

Copilot AI reviewed Mar 27, 2026

View reviewed changes

stmatengss self-assigned this Mar 29, 2026

he-yufeng added 2 commits March 30, 2026 12:48

Merge remote-tracking branch 'upstream/main' into feat/object-data-ty…

c632b5c

…pe-phase1 # Conflicts: # mooncake-store/include/master_service.h # mooncake-store/src/master_service.cpp

style: fix clang-format issues

011f3ce

Merge upstream/main into feat/object-data-type-phase1

0a85afb

Resolve conflict in master_service.cpp: use AllocateAndInsertMetadata refactored helper from main while preserving config.data_type threading.

Conversation

he-yufeng commented Mar 23, 2026

Summary

What changed

Backward compatibility

Next steps (Phase 2+)

Uh oh!

gemini-code-assist bot commented Mar 23, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

codecov-commenter commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

stmatengss Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

stmatengss Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

he-yufeng commented Mar 28, 2026

Uh oh!

stmatengss commented Mar 28, 2026

Uh oh!

stmatengss commented Apr 1, 2026

Uh oh!

he-yufeng commented Apr 6, 2026

Uh oh!

stmatengss commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Mar 23, 2026 •

edited

Loading