-
Notifications
You must be signed in to change notification settings - Fork 70
Add shredded Variant reader cases with variant logical type #91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @rdblue, @zeroshade. This is the same data set but I have included parquet logical type in the schema. |
|
Do we have any python script to generate shredding variants |
We don't have right now. We generated from Iceberg test cases. |
OK. Anyway, this is very helpful for shredding variant testing in other projects. |
cbb19c1 to
560a76e
Compare
|
@aihuaxu You updated the descriptions in |
0af1223 to
0f5bf3b
Compare
0f5bf3b to
f422987
Compare
|
My understanding is that this PR would replace #90 , is that correct? |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @aihuaxu
I can't test with these values as we are still working on the shredding implementation in arrow-rs (see apache/arrow-rs#8150 and related tickets) but they look reasonable to me
I defer to @zeroshade to check with the Go implementation
cc @carpecodeum and @mprammer and @scovich
| "parquet_file" : "case-127.parquet", | ||
| "error_message" : "Unsupported shredded value type: INTEGER(32,false)" | ||
| }, { | ||
| "case_number" : 128, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error message looks clear to me
|
@zeroshade Can you check with GO implementation? I have tested with Java implementation and it works fine. |
@zeroshade I have verified from parquet-java and it works. Thanks. |
|
@zeroshade and @zeroshade can we merge this PR? Seems we have done enough cross-language check. |
|
arrow-rs is also going ahead with these files as the core test cases - apache/arrow-rs#8104 |
I think it is a good idea. However, I suspect that @zeroshade may not have the permissions to do so. I think it will require someone from this group (@emkornfield @julienledem @rdblue etc) |
|
Thanks all for validation! Let me merge this. |
|
Thank you @wgtmac and @aihuaxu and @zeroshade |
…ing` data (#8325) # Which issue does this PR close? We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. - Closes #8084 # Rationale for this change Now that we have merged the upstream parquet-variant tests: - apache/parquet-testing#91 We can test how far we are from the rust variant implementation working for all the values This PR updates the test harness added #8104 by @carpecodeum to use the final parquet files and the currnet APIs # What changes are included in this PR? 1. Update parquet-testing pin 2. Update the test harness to use the standard rust test runner (`#[test]`) rather than a custom main function 3. Added links to follow on tickets You can run this test manually like this: ```shell cargo test --all-features --test variant_integration ... running 138 tests test test_variant_integration_case_106 ... ok test test_variant_integration_case_107 ... ok test test_variant_integration_case_109 ... ok test test_variant_integration_case_110 ... ok .. test test_variant_integration_case_90 ... ok test test_variant_integration_case_91 ... ok test test_variant_integration_case_93 ... ok test test_variant_integration_case_83 - should panic ... ok test test_variant_integration_case_84 - should panic ... ok ``` # Are these changes tested? Yes this is all tests # Are there any user-facing changes? No
This is generated from apache/iceberg#13708 which includes the following changes on top of #90:
I have tested from Parquet-Java against this data (https://github.com/apache/parquet-java/pull/3258/files).
Here shows an example of the schema with variant logical type annotated.