PARQUET-2470: Update website with larger ecosystem emphasis #59
Changes from all commits
06cf679
daafc1d
bc8a832
2cb88a0
11aa0f7
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,11 +6,11 @@ description: > | |
| All about Parquet. | ||
| --- | ||
|
|
||
| Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model or programming language. | ||
| Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. | ||
| It provides high performance compression and encoding schemes to handle complex data in bulk and is supported in many programming language and analytics tools. | ||
|
Collaborator
Did we mean for this to say "high performance compression" or is it "high performance, compression"? I think it may be the latter. Or maybe "It provides performant compression and encoding schemes..." I was thinking the first versions sound too much like the compression tool rather than the format.
Collaborator
Author
I didn't mean for the comma, or lack thereof, to carry any additional semantic meaning. I am happy to put a comma there if you like.
Collaborator
No really strong feelings, was just wondering if there was a subtextual focus intended. |
||
|
|
||
| This documentation contains information about both the [parquet-mr](https://github.com/apache/parquet-mr) and [parquet-format](https://github.com/apache/parquet-format) repositories. | ||
|
|
||
|
|
||
| ### parquet-format | ||
|
|
||
| The parquet-format repository hosts the official specification of the Apache Parquet file format, defining how data is structured and stored. This specification, along with Thrift metadata definitions and other crucial components, is essential for developers to effectively read and write Parquet files. The parquet-format project specifically contains the format specifications needed to understand and properly utilize Parquet files. | ||
|
|
@@ -43,4 +43,4 @@ Here is a non-exhaustive list of Parquet implementations: | |
| * [cuDF](https://github.com/rapidsai/cudf) | ||
| * [Apache Impala](https://github.com/apache/impala) | ||
| * [DuckDB](https://github.com/duckdb/duckdb) | ||
| * [fastparquet, a Python implementation of the Apache Parquet format](https://github.com/dask/fastparquet) | ||
| * [fastparquet, a Python implementation of the Apache Parquet format](https://github.com/dask/fastparquet) | ||