Update the website to describe the larger role of Parquet #63

@asfimport

I personally believe Parquet will be at the center of the analytics ecosystem.

https://parquet.apache.org/ currently emphasizes Parquet's role in the Hadoop ecosystem. I think this causes confusion in several ways:

  1. It implies that Parquet is only focused on Hadoop, when I think it is a critical technology across other ecosystems that are unrelated to Hadoop (e.g. Apache Iceberg, Delta Lake, etc.)
  2. It may further the perception that the Apache Parquet project only focuses on / cares about the Hadoop / Java implementation

 

I would like to update the site to focus less on the Hadoop aspects and more on the broader nature of Parquet.

 

If people like where this is headed, I would next like to expand the documentation to better explain how the various implementations are related (e.g. how parquet-mr relates to the readers in arrow-rs, arrow, etc.).

Reporter: Andrew Lamb / @alamb
Assignee: Andrew Lamb / @alamb

PRs and other links:

Note: This issue was originally created as PARQUET-2470. Please see the migration documentation for further details.
