Conversation
skrawcz
reviewed
Apr 22, 2024
Comment on lines +15 to +21
for example:
- pandas
- polars
- dask
- vaex
- ibis
- duckdb results
Contributor
it would be nice to be stricter on types...
Contributor
e.g.
    def input_types(self) -> List[Type[Type]]:
        """Gives the applicable types to this result builder.
        This is optional for backwards compatibility, but is recommended.
        :return: A list of types that this can apply to.
        """
        _types = []
        try:
            import ...
        except ...
        return _types
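The optional-import pattern suggested above could be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper name available_types and the default candidate list (pandas and polars DataFrame classes) are assumptions chosen for the example, and the elided import/except lines in the comment above are left as they were.

```python
import importlib
from typing import List, Sequence, Tuple

# Hypothetical default candidates: (module name, attribute name) pairs.
DEFAULT_CANDIDATES: Tuple[Tuple[str, str], ...] = (
    ("pandas", "DataFrame"),
    ("polars", "DataFrame"),
)


def available_types(
    candidates: Sequence[Tuple[str, str]] = DEFAULT_CANDIDATES,
) -> List[type]:
    """Collect the types whose library is actually installed.

    Libraries that cannot be imported are skipped, so the returned list
    only contains types that exist in the current environment.
    """
    _types: List[type] = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Optional dependency is missing: skip it.
            continue
        found = getattr(module, attr_name, None)
        if isinstance(found, type):
            _types.append(found)
    return _types
```

A result builder's input_types() could then return available_types(), at the cost of keeping the candidate list up to date by hand.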
Contributor
Author
In that case, the real check is whether the object implements __dataframe__(), which happens through pyarrow.interchange.from_dataframe() in build_result(). PyarrowTableResult serves a slightly different role, as a "universal adapter" that lets us avoid maintaining an explicit list of types (which is bound to grow). I opted not to include input_types() if it would only return Any.
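A minimal sketch of that duck-typed check, assuming a standalone helper rather than the actual build_result() from the PR. The names supports_interchange and to_arrow_table, and the optional converter parameter, are hypothetical and exist only to keep the example self-contained. pyarrow.interchange.from_dataframe() is the real pyarrow entry point and is imported lazily here.

```python
from typing import Any, Callable, Optional


def supports_interchange(obj: Any) -> bool:
    """Return True if obj exposes the dataframe interchange protocol."""
    return callable(getattr(obj, "__dataframe__", None))


def to_arrow_table(obj: Any, converter: Optional[Callable[[Any], Any]] = None) -> Any:
    """Convert any interchange-compatible dataframe to a pyarrow.Table.

    The converter argument is an assumption of this sketch; by default
    the conversion is delegated to pyarrow.interchange.from_dataframe().
    """
    if not supports_interchange(obj):
        raise TypeError(
            f"{type(obj).__name__} does not implement __dataframe__()"
        )
    if converter is None:
        # Imported lazily so the protocol check works without pyarrow.
        from pyarrow.interchange import from_dataframe

        converter = from_dataframe
    return converter(obj)
```

With this approach no list of supported libraries is needed: anything that implements __dataframe__() is accepted, and everything else fails early with a TypeError.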
skrawcz
approved these changes
Apr 22, 2024
You can use to.SAVER(dependencies=["NODE_NAME"], combine=PyarrowTableResult()) to convert the specified node to a pyarrow.Table before materialization. The first motivation was to support more than pd.DataFrame and pyarrow.Table with the dlt DataSaver plugin. More generally, it can be useful for platform teams that want a "single way to store parquet files" that is independent of the specific API of a library (e.g., pandas, polars).
See #829 for more details.
Changes
h_pyarrow and tests
How I tested this
Notes
Checklist