GH-48208: [C++][Parquet] Fix Types logic to enable Parquet DB support on s390x#48209
GH-48208: [C++][Parquet] Fix Types logic to enable Parquet DB support on s390x#48209Vishwanatha-HD wants to merge 1 commit intoapache:mainfrom
Conversation
|
|
075120b to
84a9b25
Compare
|
One thing I noticed in the INT96 helpers is that the write path here keeps the limbs in host order on BE. That can definitely work, though the thing I ran into on s390x is that callers downstream (decoders, page stats, and FormatStatValue) often assume INT96 is stored in a consistent LE layout regardless of host. When the value arrives in native order instead, those helpers sometimes reconstruct days/nanos differently across architectures. On the decode side, the arithmetic around The stats formatting changes here also caught my eye. Integral paths now swap on BE, which lines up with what I’ve observed. On floats, though, FormatNumericValue tends to expose host-order bytes unless the limbs were normalized earlier in the pipeline. Just mentioning it because stats comparisons have been one of the more “quirky” areas on BE. None of this is blocking feedback — mostly sharing observations from hardware testing, in case any of it helps stress-test these code paths further. Happy to compare outputs if it’d be useful. |
84a9b25 to
242c4e5
Compare
Rationale for this change
This PR is intended to enable Parquet DB support on Big-endian (s390x) systems. The fix in this PR fixes the Types logic.
What changes are included in this PR?
The fix includes changes to following files:
cpp/src/parquet/types.cc
cpp/src/parquet/types.h
Are these changes tested?
Yes. The changes are tested on s390x arch to make sure things are working fine. The fix is also tested on x86 arch, to make sure there is no new regression introduced.
Are there any user-facing changes?
No
GitHub main Issue link: #48151