53 changes: 53 additions & 0 deletions docs/source/design/mooncake-store.md
@@ -104,6 +104,25 @@ struct ReplicateConfig {
};
```

### Upsert

```C++
tl::expected<void, ErrorCode> Upsert(const ObjectKey& key,
                                     std::vector<Slice>& slices,
                                     const ReplicateConfig& config);

std::vector<tl::expected<void, ErrorCode>> BatchUpsert(
    const std::vector<ObjectKey>& keys,
    std::vector<std::vector<Slice>>& batched_slices,
    const ReplicateConfig& config);
```

`Upsert` inserts `key` if it does not exist and updates the existing object if
it does. It uses the same replication configuration model as `Put`, while
allowing the store to reuse existing placement for in-place updates when the
current layout permits it. `BatchUpsert` performs the same operation for
multiple keys using a shared replication configuration.

### Remove

```C++
@@ -515,6 +534,40 @@ The Master Service handles object-related interfaces as follows:

Before writing an object, the Client calls PutStart to request storage space allocation from the Master Service. After completing data writing, the Client calls PutEnd to notify the Master Service to mark the object write as completed.

- Upsert

```C++
tl::expected<std::vector<Replica::Descriptor>, ErrorCode> UpsertStart(
    const std::string& key,
    const std::vector<size_t>& slice_lengths,
    const ReplicateConfig& config);

std::vector<tl::expected<std::vector<Replica::Descriptor>, ErrorCode>>
BatchUpsertStart(const std::vector<std::string>& keys,
                 const std::vector<std::vector<uint64_t>>& slice_lengths,
                 const ReplicateConfig& config);

tl::expected<void, ErrorCode> UpsertEnd(
    const std::string& key, ReplicaType replica_type);

std::vector<tl::expected<void, ErrorCode>> BatchUpsertEnd(
    const std::vector<std::string>& keys);

tl::expected<void, ErrorCode> UpsertRevoke(
    const std::string& key, ReplicaType replica_type);

std::vector<tl::expected<void, ErrorCode>> BatchUpsertRevoke(
    const std::vector<std::string>& keys);
```

`UpsertStart` / `UpsertEnd` / `UpsertRevoke` mirror the existing put lifecycle
but apply insert-or-update semantics. If the key does not exist, the flow
behaves like `PutStart`. If the key already exists, the Master may reuse the
current allocation for an in-place update or allocate new space when the object
layout changes. The batch variants provide the same control flow for multiple
keys and are the lower-level primitives used by the high-level `BatchUpsert`
path.
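
The three paths described above (insert, in-place update, reallocation) can be illustrated with a small in-memory toy. This is a sketch only and not Mooncake code: `ToyMaster` and its method names are hypothetical, "same slice lengths" stands in for "layout unchanged", and both the rejection of overlapping writes and the behavior of a revoke on an existing key are assumptions rather than documented behavior.

```python
from typing import Dict, List


class ToyMaster:
    """In-memory toy, not Mooncake code: tracks only slice lengths per key."""

    def __init__(self) -> None:
        self.committed: Dict[str, List[int]] = {}  # completed objects
        self.pending: Dict[str, List[int]] = {}    # writes between Start and End

    def upsert_start(self, key: str, slice_lengths: List[int]) -> str:
        """Return the path the toy takes: 'insert', 'in_place' or 'reallocate'."""
        if key in self.pending:
            # Assumption: the toy rejects overlapping writes to the same key.
            raise RuntimeError(f"write already in progress for {key!r}")
        current = self.committed.get(key)
        if current is None:
            mode = "insert"        # missing key: behaves like PutStart
        elif current == slice_lengths:
            mode = "in_place"      # assumption: same slice lengths == same layout
        else:
            mode = "reallocate"    # layout changed: new space is needed
        self.pending[key] = list(slice_lengths)
        return mode

    def upsert_end(self, key: str) -> None:
        """Mark the write as completed and make it visible."""
        self.committed[key] = self.pending.pop(key)

    def upsert_revoke(self, key: str) -> None:
        """Abandon the write. Assumption: committed state is left untouched."""
        self.pending.pop(key)


master = ToyMaster()
modes = []
modes.append(master.upsert_start("weights", [4096, 4096]))  # new key
master.upsert_end("weights")
modes.append(master.upsert_start("weights", [4096, 4096]))  # same layout
master.upsert_end("weights")
modes.append(master.upsert_start("weights", [8192]))        # layout changed
master.upsert_revoke("weights")                             # client gave up
```

In this toy, the revoked third write leaves the previously committed `[4096, 4096]` layout in place. Whether the real Master preserves the old object after a revoked update is not stated here and should be checked against the implementation.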

- GetReplicaList

```C++
Expand Down
126 changes: 126 additions & 0 deletions docs/source/python-api-reference/mooncake-store.md
@@ -629,6 +629,74 @@ result = store.put_batch(keys, values)

---

#### upsert()

Insert a new object if the key does not exist, or update the existing object in place when possible. The upsert methods in this section use the same replication configuration model as `put()`.

Upsert binary data in the distributed storage.

```python
def upsert(self, key: str, value: bytes, config: ReplicateConfig = None) -> int
```

**Parameters:**
- `key` (str): Unique object identifier
- `value` (bytes): Binary data to insert or update
- `config` (ReplicateConfig, optional): Replication configuration

**Returns:**
- `int`: Status code (0 = success, non-zero = error code)

**Example:**
```python
config = ReplicateConfig()
config.replica_num = 2

rc = store.upsert("weights", b"new-bytes", config)
if rc == 0:
    print("Upsert succeeded")
```

#### upsert_from()

Upsert object data directly from a pre-allocated buffer (zero-copy).

```python
def upsert_from(self, key: str, buffer_ptr: int, size: int, config: ReplicateConfig = None) -> int
```

**Parameters:**
- `key` (str): Object identifier
- `buffer_ptr` (int): Memory address of the source buffer
- `size` (int): Number of bytes to insert or update
- `config` (ReplicateConfig, optional): Replication configuration

**Returns:**
- `int`: Status code (0 = success, non-zero = error code)

**Note:** This is the zero-copy counterpart of `upsert()`. As with
`put_from()`, register the buffer before issuing the request.
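
A minimal sketch of the register-then-upsert pattern follows. It assumes the store exposes `register_buffer(ptr, size)` and `unregister_buffer(ptr)` returning status codes, as used with `put_from()`; those names and signatures are assumptions here. The helper `upsert_via_registered_buffer` and the `RecordingStore` stand-in are hypothetical and exist only so the example runs without a cluster.

```python
import ctypes


def upsert_via_registered_buffer(store, key: str, payload: bytes) -> int:
    """Place payload in a ctypes buffer, register it, then upsert from it."""
    size = len(payload)
    buf = (ctypes.c_char * size).from_buffer_copy(payload)
    ptr = ctypes.addressof(buf)
    rc = store.register_buffer(ptr, size)  # assumed API, see lead-in
    if rc != 0:
        return rc
    try:
        return store.upsert_from(key, ptr, size)
    finally:
        store.unregister_buffer(ptr)       # assumed API, see lead-in


class RecordingStore:
    """Hypothetical stand-in so the sketch runs without a Mooncake cluster."""

    def __init__(self) -> None:
        self.registered = set()
        self.objects = {}

    def register_buffer(self, ptr: int, size: int) -> int:
        self.registered.add(ptr)
        return 0

    def unregister_buffer(self, ptr: int) -> int:
        self.registered.discard(ptr)
        return 0

    def upsert_from(self, key: str, ptr: int, size: int) -> int:
        if ptr not in self.registered:
            return -1  # made-up error code for an unregistered buffer
        self.objects[key] = ctypes.string_at(ptr, size)
        return 0


store = RecordingStore()
rc = upsert_via_registered_buffer(store, "weights", b"new-bytes")
```

The helper copies `payload` into the buffer once for the sake of a short demo. In practice the data would normally be produced directly in a long-lived registered buffer, so registration happens once rather than per call.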

#### batch_upsert_from()

Upsert multiple objects directly from pre-allocated buffers.

```python
def batch_upsert_from(self, keys: List[str], buffer_ptrs: List[int], sizes: List[int],
config: ReplicateConfig = None) -> List[int]
```

**Parameters:**
- `keys` (List[str]): List of object identifiers
- `buffer_ptrs` (List[int]): List of source buffer addresses
- `sizes` (List[int]): List of byte lengths for each buffer
- `config` (ReplicateConfig, optional): Replication configuration shared by all objects

**Returns:**
- `List[int]`: List of status codes for each upsert
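
Because the batch call returns one status code per object, callers usually need to map failures back to keys. The helper below is a hypothetical sketch: it assumes the result list is in the same order as `keys`, and the sample codes are made up rather than real Mooncake error codes.

```python
from typing import Dict, List


def failed_upserts(keys: List[str], codes: List[int]) -> Dict[str, int]:
    """Map each key whose status code is non-zero to that code."""
    if len(keys) != len(codes):
        raise ValueError("expected one status code per key")
    return {key: code for key, code in zip(keys, codes) if code != 0}


keys = ["layer0", "layer1", "layer2"]
codes = [0, -1, 0]  # made-up result standing in for batch_upsert_from()
failures = failed_upserts(keys, codes)
```

A caller could then retry or log only the keys in `failures` instead of reissuing the whole batch.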

---

#### get_batch()
Retrieve multiple objects in a single batch operation.

@@ -1500,6 +1568,64 @@ def batch_pub_tensor(self, keys: List[str], tensors_list: List[torch.Tensor], co

---

#### upsert_tensor()

Insert a tensor if its key is missing, or update the existing tensor if the key already exists. The current tensor upsert helpers use the default `ReplicateConfig` and therefore do not take a `config` parameter.

Upsert a PyTorch tensor into the store.

```python
def upsert_tensor(self, key: str, tensor: torch.Tensor) -> int
```

**Parameters:**
- `key` (str): Object identifier
- `tensor` (torch.Tensor): The PyTorch tensor to insert or update

**Returns:**
- `int`: Status code (0 = success, non-zero = error code)

**Note:** This function requires `torch` to be installed and available in the environment.

#### upsert_tensor_from()

Upsert a tensor directly from a pre-allocated buffer. The buffer layout must be
`[TensorMetadata][tensor data]`, matching the layout used by
`get_tensor_into()`.

```python
def upsert_tensor_from(self, key: str, buffer_ptr: int, size: int) -> int
```

**Parameters:**
- `key` (str): Object identifier
- `buffer_ptr` (int): Buffer pointer containing serialized tensor metadata and payload
- `size` (int): Actual serialized byte length of the tensor buffer

**Returns:**
- `int`: Status code (0 = success, non-zero = error code)

**Note:** This function is not supported by the dummy client.

#### batch_upsert_tensor_from()

Upsert multiple tensors directly from pre-allocated buffers. Each buffer must
use the layout `[TensorMetadata][tensor data]`.

```python
def batch_upsert_tensor_from(self, keys: List[str], buffer_ptrs: List[int], sizes: List[int]) -> List[int]
```

**Parameters:**
- `keys` (List[str]): List of object identifiers
- `buffer_ptrs` (List[int]): List of serialized tensor buffer pointers
- `sizes` (List[int]): List of actual serialized byte lengths

**Returns:**
- `List[int]`: List of status codes for each tensor upsert

---

### PyTorch Tensor Operations (Zero Copy)

These methods provide direct support for storing and retrieving PyTorch tensors. They automatically handle serialization and metadata, and include built-in support for **Tensor Parallelism (TP)** by automatically splitting and reconstructing tensor shards.
Expand Down