Fix multi-variable compression stats #3
Ah, I just saw you opened an issue about this in #2!
SZ3 technically also has the 4D limitation but already has a built-in workaround for 0-size or 1-size dimensions. I will be adding it to ZFP in any case, so any ZFP-specific hacks should be avoided in this repo. I think the ability to average over the ensemble dimension is useful in general, irrespective of the compressor.
I just remembered that we also have a reshape codec; see #4 for an alternative way of supporting ZFP.
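To illustrate the idea behind a reshape step, here is a minimal sketch that folds the leading axes of a 5D array into one so that a compressor limited to 4D input can accept it, then restores the original shape afterwards. It uses only NumPy; the helper names `fold_leading_axes` and `unfold` are hypothetical and are not the reshape codec referenced in #4.

```python
import math

import numpy as np


def fold_leading_axes(data, max_ndim=4):
    """Merge leading axes until `data` has at most `max_ndim` dimensions.

    Hypothetical helper for illustration only. Returns the reshaped array
    and the original shape, which is needed to undo the reshape.
    """
    if data.ndim <= max_ndim:
        return data, data.shape
    # Number of leading axes that must be merged into a single axis.
    n_fold = data.ndim - max_ndim + 1
    folded_shape = (math.prod(data.shape[:n_fold]),) + data.shape[n_fold:]
    return data.reshape(folded_shape), data.shape


def unfold(data, original_shape):
    """Restore the shape recorded by `fold_leading_axes`."""
    return data.reshape(original_shape)


# Example: a 5D array, e.g. (ensemble, time, level, lat, lon).
data = np.arange(2 * 3 * 4 * 5 * 6, dtype=np.float32).reshape(2, 3, 4, 5, 6)
folded, original_shape = fold_leading_axes(data)
restored = unfold(folded, original_shape)
```

Folding changes which values are neighbours along the merged axis, so the compression ratio may differ from compressing each ensemble member separately.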
juntyr
left a comment
I like most of the usability changes a lot and would like to merge them quickly.
However, I'd prefer to split out ZFP support (in favour of #4) and remove its specialized handling in this PR.
The approach in #4 seems a lot cleaner and avoids the workarounds in the code, so I definitely prefer that alternative! I've removed the special handling of ZFP from the PR now.
juntyr
left a comment
LGTM! Thanks @treigerm!
This PR adds ZFP and Bitround compressors with the settings from the compression-lab-notebooks. We will probably want to fine-tune the settings at some point to optimize the compression performance, but I wanted to get some code working first.

The changes to the code are mostly adjustments that I had to make to get all the compressors working on both the CMIP6 and ERA5 sample data. The two main adjustments were:

1. ZFP can only compress 1-4D data, so we currently loop over the ensemble dimension manually. I then combine all the (de)compression performance measurements with the `combine_measurements` function, which is a bit hacky. Is there a better/cleaner way to do it, @juntyr?
2. I adjusted the performance measurements data structure to account for the fact that there might be multiple variables in a given zarr array.
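The per-member loop described in the first adjustment can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `zlib` stands in for ZFP, the `Measurement` record and its fields are hypothetical, and the body of `combine_measurements` is a guess at what combining could mean (summing times and sizes over ensemble members).

```python
import time
import zlib
from dataclasses import dataclass

import numpy as np


@dataclass
class Measurement:
    # Hypothetical record; the PR's actual data structure is not shown.
    seconds: float
    raw_bytes: int
    compressed_bytes: int


def combine_measurements(measurements):
    # Assumed semantics: add up time and sizes across ensemble members.
    return Measurement(
        seconds=sum(m.seconds for m in measurements),
        raw_bytes=sum(m.raw_bytes for m in measurements),
        compressed_bytes=sum(m.compressed_bytes for m in measurements),
    )


def compress_per_member(data, ensemble_axis=0):
    """Compress each ensemble member separately and combine the stats."""
    measurements = []
    # Move the ensemble axis to the front so each slice has one fewer dimension.
    for member in np.moveaxis(data, ensemble_axis, 0):
        raw = np.ascontiguousarray(member).tobytes()
        start = time.perf_counter()
        encoded = zlib.compress(raw)  # stand-in for a 4D-limited compressor
        elapsed = time.perf_counter() - start
        measurements.append(Measurement(elapsed, len(raw), len(encoded)))
    return combine_measurements(measurements)


# Example: 3 ensemble members, each a 4D float32 array.
data = np.zeros((3, 2, 4, 5, 6), dtype=np.float32)
total = compress_per_member(data)
ratio = total.raw_bytes / total.compressed_bytes
```

Summing sizes before dividing yields the overall compression ratio, which generally differs from the mean of the per-member ratios.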