CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume#882
Conversation
There was a problem hiding this comment.
Hey @chengzhuzhang this PR is ready for review. I've performed a self-review, but good to make sure the changes align with your two PRs mentioned in this PR's description.
Summary of changes
tc_analysis_driver.py- Remove
lat/lonslice - Add function
_filter_lines_within_year_bounds()to remove excessive time points crossing year bounds for 6 hourly data
- Remove
zonal_mean_2d_driver.py- Update
_convert_g_kg_to_ppm_units()to convert to ppm by volume.
- Update
There was a problem hiding this comment.
There was a problem hiding this comment.
Changes should align with https://github.com/E3SM-Project/e3sm_diags/pull/851/files
Note, #872 was already addressed on this branch.
| def _filter_lines_within_year_bounds( | ||
| lines_orig: List[str], data_end_year: int | ||
| ) -> List[str]: | ||
| """Filters lines within the specified year bounds. | ||
|
|
||
| This function processes a list of strings, each representing a line of data. | ||
| It filters out lines based on a year extracted from each line, ensuring that | ||
| only lines with years less than or equal to `data_end_year` are retained. | ||
| Additionally, it removes excessive time points crossing year bounds from | ||
| 6-hourly data. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| lines_orig : List[str] | ||
| A list of strings where each string represents a line of data. | ||
| data_end_year : int | ||
| The end year for filtering lines. Only lines with years less than or | ||
| equal to this value will be retained. | ||
| Returns | ||
| ------- | ||
| List[str] | ||
| A list of strings filtered based on the specified year bounds. | ||
| """ | ||
| line_ind = [] | ||
| for i in range(0, np.size(lines_orig)): | ||
| if lines_orig[i][0] == "s": | ||
| year = int(lines_orig[i].split("\t")[2]) | ||
|
|
||
| if year <= data_end_year: | ||
| line_ind.append(i) | ||
|
|
||
| end_ind = line_ind[-1] | ||
|
|
||
| new_lines = lines_orig[0:end_ind] | ||
| return new_lines |
There was a problem hiding this comment.
FYI I used GitHub Co-Pilot to extract this function and generated the docstring. I've found it to be useful for these kinds of tasks.
chengzhuzhang
left a comment
There was a problem hiding this comment.
@tomvothecoder I think the changes in .cfg files for H2OLNZ units change are not brought in this PR: https://github.com/E3SM-Project/e3sm_diags/pull/874/files#diff-794c0344b0a718e616d2051d6961d02c79f24e8d5e97c7c355fa7cb0ec467e19
For that specific PR, the changes are already here. Example: I rebased the dev branch on the latest |
chengzhuzhang
left a comment
There was a problem hiding this comment.
Looking good! The PR is good to go!
|
@tomvothecoder Just a heads-up that this PR is also just merged: #879 and we need to bring to migration branch.. |
…s and convert H20LNZ units to ppm/volume (#882)
Done |
…s and convert H20LNZ units to ppm/volume (#882)
Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754) Update regression test notebook to show validation of all vars Add `subset_and_align_datasets()` to regrid.py (#776) Add template run scripts CDAT Migration Phase: Refactor `cosp_histogram` set (#748) - Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py` - `formulas_cosp.py` (new file) - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py` - `derivations.py` - Cleaned up portions of `DERIVED_VARIABLES` dictionary - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in in #716) - Remove unnecessary `convert_units()` function calls - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP` - `utils.py` - Delete deprecated, CDAT-based `cosp_histogram` functions - `dataset_xr.py` - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length, drops this variable and replaces it with the correct coordinate - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`) - `io.py` - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing) - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) Refactor 654 zonal mean xy (#752) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration - Update run script output directory to NERSC public webserver (#793) [PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788) CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794) CDAT Migration: Refactor `polar` set (#749) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> Align order of calls to `_set_param_output_attrs` CDAT Migration: Refactor `meridional_mean_2d` set (#795) CDAT Migration: Refactor `aerosol_budget` (#800) Add `acme.py` changes from PR #712 (#814) * Add `acme.py` changes from PR #712 * Replace unnecessary lambda call Refactor area_mean_time_series and add ccb slice flag feature (#750) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> [Refactor]: Validate fix in PR #750 for #759 (#815) CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819) CDAT Migration: Refactor annual_cycle_zonal_mean set (#798) * Refactor `annual_cycle_zonal_mean` set * Address PR review comments * Add lat lon regression testing * Add debugging scripts * Update `_open_climo_dataset()` to decode times as workaround to misaligned time coords - Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers * Fix unit tests * Remove old plotter * Add script to debug decode_times=True and ncclimo file * Update plotter time values to month integers * Fix slow `.load()` and multiprocessing issue - Due to incorrectly updating `keep_bnds` logic - Add `_encode_time_coords()` to workaround cftime issue `ValueError: "months since" units only allowed for "360_day" calendar` * Update `_encode_time_coords()` docstring * Add AODVIS debug script * update AODVIS obs datasets; regression test results --------- Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `qbo` set (#826) CDAT Migration Phase 2: Refactor tc_analysis set (#829) * start tc_analysis_refactor * update driver * update plotting * Clean up plotter - Remove unused variables - Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts - Reorder functions for top-down readability * Remove unused notebook --------- Co-authored-by: tomvothecoder <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `enso_diags` set (#832) CDAT Migration Phase 2: Refactor `streamflow` set (#837) [Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841) [Refactor]: CDAT Migration Phase 3: testing and documentation update (#846) CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860) CDAT Migration Phase 2: Refactor arm_diags set (#842) Add performance benchmark material (#864) Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865) CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875) CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876) Add support for time series datasets via glob and fix `enso_diags` set (#866) Add fix for checking `is_time_series()` property based on `data_type` attr (#881) CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882) CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883) CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887) CDAT Migration: Prepare branch for merge to `main` (#885) [Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891) CDAT Migration Phase 2: Refactor core utilities and `lat_lon` set (#677) Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) CDAT Migration Phase 2: Refactor `qbo` set (#826)
Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754) Update regression test notebook to show validation of all vars Add `subset_and_align_datasets()` to regrid.py (#776) Add template run scripts CDAT Migration Phase: Refactor `cosp_histogram` set (#748) - Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py` - `formulas_cosp.py` (new file) - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py` - `derivations.py` - Cleaned up portions of `DERIVED_VARIABLES` dictionary - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in in #716) - Remove unnecessary `convert_units()` function calls - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP` - `utils.py` - Delete deprecated, CDAT-based `cosp_histogram` functions - `dataset_xr.py` - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length, drops this variable and replaces it with the correct coordinate - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`) - `io.py` - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing) - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) Refactor 654 zonal mean xy (#752) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration - Update run script output directory to NERSC public webserver (#793) [PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788) CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794) CDAT Migration: Refactor `polar` set (#749) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> Align order of calls to `_set_param_output_attrs` CDAT Migration: Refactor `meridional_mean_2d` set (#795) CDAT Migration: Refactor `aerosol_budget` (#800) Add `acme.py` changes from PR #712 (#814) * Add `acme.py` changes from PR #712 * Replace unnecessary lambda call Refactor area_mean_time_series and add ccb slice flag feature (#750) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> [Refactor]: Validate fix in PR #750 for #759 (#815) CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819) CDAT Migration: Refactor annual_cycle_zonal_mean set (#798) * Refactor `annual_cycle_zonal_mean` set * Address PR review comments * Add lat lon regression testing * Add debugging scripts * Update `_open_climo_dataset()` to decode times as workaround to misaligned time coords - Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers * Fix unit tests * Remove old plotter * Add script to debug decode_times=True and ncclimo file * Update plotter time values to month integers * Fix slow `.load()` and multiprocessing issue - Due to incorrectly updating `keep_bnds` logic - Add `_encode_time_coords()` to workaround cftime issue `ValueError: "months since" units only allowed for "360_day" calendar` * Update `_encode_time_coords()` docstring * Add AODVIS debug script * update AODVIS obs datasets; regression test results --------- Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `qbo` set (#826) CDAT Migration Phase 2: Refactor tc_analysis set (#829) * start tc_analysis_refactor * update driver * update plotting * Clean up plotter - Remove unused variables - Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts - Reorder functions for top-down readability * Remove unused notebook --------- Co-authored-by: tomvothecoder <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `enso_diags` set (#832) CDAT Migration Phase 2: Refactor `streamflow` set (#837) [Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841) [Refactor]: CDAT Migration Phase 3: testing and documentation update (#846) CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860) CDAT Migration Phase 2: Refactor arm_diags set (#842) Add performance benchmark material (#864) Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865) CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875) CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876) Add support for time series datasets via glob and fix `enso_diags` set (#866) Add fix for checking `is_time_series()` property based on `data_type` attr (#881) CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882) CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883) CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887) CDAT Migration: Prepare branch for merge to `main` (#885) [Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891) CDAT Migration Phase 2: Refactor core utilities and `lat_lon` set (#677) Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) CDAT Migration Phase 2: Refactor `qbo` set (#826)
Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754) Update regression test notebook to show validation of all vars Add `subset_and_align_datasets()` to regrid.py (#776) Add template run scripts CDAT Migration Phase: Refactor `cosp_histogram` set (#748) - Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py` - `formulas_cosp.py` (new file) - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py` - `derivations.py` - Cleaned up portions of `DERIVED_VARIABLES` dictionary - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in in #716) - Remove unnecessary `convert_units()` function calls - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP` - `utils.py` - Delete deprecated, CDAT-based `cosp_histogram` functions - `dataset_xr.py` - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length, drops this variable and replaces it with the correct coordinate - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`) - `io.py` - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing) - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) Refactor 654 zonal mean xy (#752) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration - Update run script output directory to NERSC public webserver (#793) [PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788) CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794) CDAT Migration: Refactor `polar` set (#749) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> Align order of calls to `_set_param_output_attrs` CDAT Migration: Refactor `meridional_mean_2d` set (#795) CDAT Migration: Refactor `aerosol_budget` (#800) Add `acme.py` changes from PR #712 (#814) * Add `acme.py` changes from PR #712 * Replace unnecessary lambda call Refactor area_mean_time_series and add ccb slice flag feature (#750) Co-authored-by: Tom Vo <tomvothecoder@gmail.com> [Refactor]: Validate fix in PR #750 for #759 (#815) CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819) CDAT Migration: Refactor annual_cycle_zonal_mean set (#798) * Refactor `annual_cycle_zonal_mean` set * Address PR review comments * Add lat lon regression testing * Add debugging scripts * Update `_open_climo_dataset()` to decode times as workaround to misaligned time coords - Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers * Fix unit tests * Remove old plotter * Add script to debug decode_times=True and ncclimo file * Update plotter time values to month integers * Fix slow `.load()` and multiprocessing issue - Due to incorrectly updating `keep_bnds` logic - Add `_encode_time_coords()` to workaround cftime issue `ValueError: "months since" units only allowed for "360_day" calendar` * Update `_encode_time_coords()` docstring * Add AODVIS debug script * update AODVIS obs datasets; regression test results --------- Co-authored-by: Tom Vo <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `qbo` set (#826) CDAT Migration Phase 2: Refactor tc_analysis set (#829) * start tc_analysis_refactor * update driver * update plotting * Clean up plotter - Remove unused variables - Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts - Reorder functions for top-down readability * Remove unused notebook --------- Co-authored-by: tomvothecoder <tomvothecoder@gmail.com> CDAT Migration Phase 2: Refactor `enso_diags` set (#832) CDAT Migration Phase 2: Refactor `streamflow` set (#837) [Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841) [Refactor]: CDAT Migration Phase 3: testing and documentation update (#846) CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860) CDAT Migration Phase 2: Refactor arm_diags set (#842) Add performance benchmark material (#864) Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865) CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875) CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876) Add support for time series datasets via glob and fix `enso_diags` set (#866) Add fix for checking `is_time_series()` property based on `data_type` attr (#881) CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882) CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883) CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887) CDAT Migration: Prepare branch for merge to `main` (#885) [Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891) CDAT Migration Phase 2: Refactor core utilities and `lat_lon` set (#677) Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774) CDAT Migration Phase 2: Refactor `qbo` set (#826)
Description
cdat-migration-fy24but I refactored it a bit moremaintocdat-migration-fy24#871Checklist
If applicable: