Docs: Add Daft into Iceberg documentation #9836
jaychia
commented
Feb 29, 2024
- Adds installation examples
- Adds code examples for getting up and running with Daft + PyIceberg
- Adds a type conversion matrix between Daft and PyIceberg
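For readers unfamiliar with the integration, here is a minimal sketch of the kind of "getting up and running" example described above. It is not copied from the PR. It assumes `daft` and `pyiceberg` are installed and that a PyIceberg catalog is already configured. The helper name `read_iceberg_table` and its arguments are hypothetical.

```python
def read_iceberg_table(catalog_name, table_identifier):
    """Load an Iceberg table via PyIceberg and expose it as a lazy Daft DataFrame.

    Hypothetical helper for illustration; the catalog name and table
    identifier are placeholders supplied by the caller.
    """
    # Imports are kept local so this module can be imported even when the
    # optional dependencies are missing.
    from pyiceberg.catalog import load_catalog
    import daft

    # Resolve the catalog from the user's PyIceberg configuration.
    catalog = load_catalog(catalog_name)
    # Load table metadata; no data files are read at this point.
    table = catalog.load_table(table_identifier)
    # Hand the PyIceberg table to Daft, which builds a lazy DataFrame.
    return daft.read_iceberg(table)
```

With a configured catalog, something like `read_iceberg_table("default", "my_namespace.my_table").show()` would then trigger the actual read. Both names are placeholders.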
bitsondatadev
left a comment
Sorry, this is a lot, but I had a lot of thoughts. Got stuck.
Co-authored-by: Brian "bits" Olsen <bits@bitsondata.dev>
nastra
left a comment
Given that this mainly targets pyiceberg, I think this doc should live in https://github.com/apache/iceberg-python/
Hello! Daft is a fully featured distributed query engine, and we are actively working on non-PyIceberg-specific functionality that is more applicable to the wider Iceberg ecosystem (e.g. partitioned writes, compaction stored procedures, orphan file pruning procedures, etc.). This is in contrast to PyIceberg-only integrations such as Pandas/Arrow, which really just use PyIceberg to retrieve data into Python memory.
@nastra, I tend to agree with @jaychia on this one. I don't want to split up the documentation any more than necessary. For any compute engine that runs on Iceberg, I see this eventually looking like Trino's data sources or Kafka Connectors. We've discussed this a bit before here: #9681. I think this future reorder will include engines based in languages outside of Java.
 - limitations under the License.
 -->

# Daft
So it seems the site can't actually be built when serving the docs locally.
I was using https://github.com/apache/iceberg/blob/aff5b39a7dddd22790b6ba47f514860c53e33c00/site/README.md to locally serve the site. @bitsondatadev can you double-check please if the site properly renders for you when running ./dev/serve.sh?
Yeah, I used to have a "nightly" build in there, but we took it out initially to avoid confusion. I think part of the build can just be to add "local" or something. Currently, the build just grabs the latest semantic version and points latest there; we could do the same and point /site/docs/docs/local >> /docs, and maybe expose another build option to enable that.
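The idea above can be sketched roughly as follows. This is not the actual site build script. The function names, the `include_local` flag, and the alias-to-target mapping are all hypothetical, and the version folder names are assumed to look like `1.4.3`.

```python
import re

# Assumption: released docs folders are named with plain MAJOR.MINOR.PATCH versions.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def latest_version(names):
    """Return the highest semantic version among `names`, or None if there is none."""
    versions = []
    for name in names:
        match = SEMVER.match(name)
        if match:
            # Compare numerically so that 1.10.0 sorts above 1.9.2.
            versions.append((tuple(int(part) for part in match.groups()), name))
    if not versions:
        return None
    return max(versions)[1]


def resolve_aliases(names, include_local=False):
    """Map alias names to their targets, optionally adding a 'local' alias."""
    aliases = {}
    latest = latest_version(names)
    if latest is not None:
        aliases["latest"] = latest
    if include_local:
        # Hypothetical build option: point 'local' at the in-repo /docs folder.
        aliases["local"] = "/docs"
    return aliases
```

Keeping the "local" alias behind an explicit option mirrors the suggestion to expose it as a separate build option, so the default build stays unchanged.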
Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
