[feature]: kv_batch_put with the mooncake-store backend uses the non-zero-copy API, which causes a relatively large local buffer memory overhead. #78

@pxp531

Description

When we use tq with the mooncake-store backend, the data volume is relatively small in text scenarios but relatively large in multi-modal scenarios. Because the volume grows linearly with the GBS and the number of images, the amount of data to put may approach 10G. However, when mooncake-store is the backend, kv_batch_put uses the non-zero-copy API and therefore needs a relatively large local buffer to copy the data into. Only the put operation needs this buffer, but the get client inherits the same configuration, so each machine ends up holding nearly gpu_per_node * local buffer size of unused buffer memory. The problem is most visible in multi-modal scenarios.
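To make the overhead concrete, here is a rough back-of-the-envelope sketch. It assumes one get client per GPU, each reserving the full local buffer it inherited from the put configuration. The function name, its parameters, and the 8-GPU / 10 GiB figures are illustrative assumptions, not values or APIs taken from tq or mooncake-store.

```python
GIB = 1024 ** 3  # bytes in one GiB


def wasted_buffer_bytes(gpus_per_node: int,
                        local_buffer_bytes: int,
                        put_clients_per_node: int = 0) -> int:
    """Estimate local buffer memory reserved by clients that never call put.

    Hypothetical helper: assumes one client per GPU and that every client
    reserves `local_buffer_bytes`, whether or not it ever performs a put.
    """
    get_only_clients = gpus_per_node - put_clients_per_node
    if get_only_clients < 0:
        raise ValueError("put_clients_per_node cannot exceed gpus_per_node")
    return get_only_clients * local_buffer_bytes


if __name__ == "__main__":
    # Assumed example: 8 GPUs per node, buffer sized for a ~10 GiB put.
    wasted = wasted_buffer_bytes(gpus_per_node=8, local_buffer_bytes=10 * GIB)
    print(f"Unused local buffer per node: {wasted / GIB:.0f} GiB")
```

Under these assumptions, a buffer sized for the largest multi-modal put is multiplied by the number of get clients on the node, which is why the cost is small for text workloads and large for multi-modal ones.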
