GH-68: Match language from parquet-format after merge of PARQUET-2139 #69
wgtmac merged 3 commits into apache:production
- There are three types of metadata: file metadata, column (chunk) metadata and page header metadata. All thrift structures are serialized using the TCompactProtocol.
+ There are two types of metadata: file metadata, and page header metadata. All
I recommend providing a link to precisely what these terms refer to.
I think "file metadata" refers to FileMetadata: https://github.com/apache/parquet-format/blob/ed66e87da9b2d79d6e9262fe37d5eae045c6a639/src/main/thrift/parquet.thrift#L1141
I am not sure what "page header metadata" refers to. Is it DataPageHeader (https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L580)?
If so, maybe we could update this document to use the same terms: FileMetadata rather than "file metadata" and DataPageHeader rather than "page header".
It makes more sense when viewed with the image (which has an ERD of the metadata), and this is copied verbatim from the parquet-format README.md. But I agree that the parquet-site could provide more information than the format, which is kept terse for a reason. I'll wordsmith this up some.
Closes #68