[feature]: kv_batch_put with the mooncake-store backend uses the non-zero-copy API, which requires a large local buffer and causes significant memory overhead. #78

When we use tq with the mooncake-store backend, the data volume in text scenarios is relatively small, but in multi-modal scenarios it is relatively large. The amount of data to put grows linearly with GBS and the number of images, and may approach 10 GB. However, with Mooncake-store as the backend, kv_batch_put uses the non-zero-copy API, which needs a relatively large local buffer to copy the data into. Only the put operation has this requirement, but the get client inherits the same configuration. As a result, one machine holds nearly gpu_per_node * local buffer size of memory that is allocated but never used. The problem is most visible in multi-modal scenarios.
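To make the scale of the overhead concrete, here is a minimal sketch of the arithmetic. The function name and the example values (8 GPUs per node, a 10 GiB local buffer) are assumptions for illustration only and are not taken from tq or Mooncake-store. It also assumes one get client per GPU, which is what the "nearly gpu_per_node * local buffer size" estimate implies.

```python
GIB = 1024 ** 3


def unused_get_buffer_bytes(gpu_per_node: int, local_buffer_size: int) -> int:
    """Estimate the memory held by get clients that inherited the put-side buffer size.

    Assumes one get client per GPU (hypothetical; adjust if the deployment differs).
    """
    return gpu_per_node * local_buffer_size


# Hypothetical example: 8 GPUs per node, each get client inheriting a 10 GiB buffer.
wasted = unused_get_buffer_bytes(gpu_per_node=8, local_buffer_size=10 * GIB)
print(f"Unused local buffer per node: {wasted / GIB:.0f} GiB")
```

Under these assumed values, the buffers that only the put path needs would occupy about 80 GiB per machine on the get side.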
