[Iceberg] Support Excluding Columns from Stats Collection on Data Write #25510

@osscm

Generating statistics for STRING or BINARY columns, or for columns that are never used in joins, can be time-consuming and increases storage usage.

For large tables, it's important to have the ability to skip stats generation for certain columns to improve performance and reduce overhead.

While the ANALYZE command allows specifying which columns to collect statistics for, there is currently no way to control this during data write or update operations.
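For context, a minimal sketch of what is possible today. The catalog, schema, table, and column names are hypothetical. The `columns` property on ANALYZE is documented for the Iceberg connector; the session property name is taken from my reading of the connector docs and should be checked against the version in use. As far as I can tell it is an all-or-nothing switch, so it does not cover the per-column case requested here.

```sql
-- Hypothetical names: example_catalog, example_schema, orders, order_id, customer_id.

-- Collect statistics only for the listed columns, after the data is written.
ANALYZE example_catalog.example_schema.orders
WITH (columns = ARRAY['order_id', 'customer_id']);

-- Assumed session property: turns statistics collection on write off for
-- every column in the session, with no way to exclude individual columns.
SET SESSION example_catalog.collect_extended_statistics_on_write = false;
```

Disabling collection on write and then running ANALYZE on a column subset is a possible workaround, but it costs an extra pass over the data, which is what this request aims to avoid.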

ref: #17057
ref: https://docs.delta.io/latest/optimizations-oss.html#data-skipping

Labels: iceberg (Iceberg connector)
