
CDAT Migration: Refactor aerosol_budget #800

Merged
tomvothecoder merged 14 commits into cdat-migration-fy24 from refactor/673-aerosol-budget
May 13, 2024

Conversation

@tomvothecoder
Collaborator

@tomvothecoder tomvothecoder commented Mar 20, 2024

Description

Todo

  • Refactor aerosol_budget_driver.py
  • Refactor aerosol_budget_plot.py
    • Add formulas to formulas.py
    • Add derived variables to derivations.py
  • Hybrid to pressure functionality
    • Add "hyai" and "hybi" to HYBRID_SIGMA_KEYS and expand accepted values
    • Add p0, a_key, and b_key to _hybrid_to_pressure()
    • Add logic to _hybrid_to_pressure() to update units based on p0
  • Update call to get_name_yrs_attr() in aerosol_aeronet_driver.py
  • Add changes from "aerosol budget table values are erroneously scaled by 1e6" (#804) and "edit in-place modified derived variables" (#805)
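
For context, the hybrid-to-pressure items above rest on the standard hybrid sigma-pressure relation p = a * p0 + b * ps. The following is a minimal NumPy sketch of that relation only, not the actual `_hybrid_to_pressure()` implementation. The function name `hybrid_to_pressure_sketch`, its signature, the default `p0`, and the sample coefficients are assumptions made for illustration.

```python
import numpy as np


def hybrid_to_pressure_sketch(a, b, ps, p0=100000.0):
    """Compute pressure at hybrid levels: p = a * p0 + b * ps.

    Hypothetical helper for illustration only. `a` and `b` are the
    hybrid coefficients (e.g., "hyam"/"hybm" or "hyai"/"hybi"),
    `ps` is surface pressure, and `p0` is the reference pressure.
    The result is in the same units as `p0` and `ps`.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    ps = np.asarray(ps, dtype=float)

    # Put the level axis first and broadcast it against the axes of `ps`.
    level_shape = (-1,) + (1,) * ps.ndim
    return a.reshape(level_shape) * p0 + b.reshape(level_shape) * ps


# Made-up coefficients for three levels and two surface points (Pa).
a = [0.1, 0.05, 0.0]
b = [0.0, 0.5, 1.0]
ps = np.array([100000.0, 90000.0])
pressure = hybrid_to_pressure_sketch(a, b, ps)
```

Because the result carries the units of `p0` and `ps`, the units of the output depend on the reference pressure that is passed in, which may be why the to-do item above ties the units update to `p0`.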

5/8/24 Fixes

  • Fix Elevated Emission (Tg/yr) -- much larger on dev -- due to using xarray .sum() instead of Python sum() on a tuple of variables
  • Fix Burden -- units seem smaller on dev (unit conversion?) -- fixed, I think by fixing the derivation formulas
  • Fix Lifetime (Days) -- uses Burden for calculation
  • Fix Sulfate Elevated Emission (Tg/yr) -- due to sum(x[0]) on derived variable with wildcards, which can actually have more than 1 element -- fixed by updating derived var function to expect a list of N variables and perform sum() within the function
  • Fix large differences in burden and lifetime values -- different units still? -- the old main results did not have the fixes from PR edit in-place modified derived variables #805, while this branch did. I re-ran the 2.90 example script again which produced the latest aerosol_budget metrics that align with this branch (refer to results here)
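
To illustrate the first fix in this list, here is a small sketch of how Python's built-in `sum()` over a tuple of variables differs from the `.sum()` reduction on a single variable. This is one plausible reading of the bug rather than the exact code that was changed, and the arrays are made-up values.

```python
import numpy as np
import xarray as xr

# Two made-up emission fields on the same 2x2 grid.
a = xr.DataArray(np.array([[1.0, 2.0], [3.0, 4.0]]), dims=("lat", "lon"))
b = xr.DataArray(np.array([[10.0, 20.0], [30.0, 40.0]]), dims=("lat", "lon"))

# Python's built-in sum() adds the variables element by element,
# so the result keeps the (lat, lon) shape.
elementwise = sum((a, b))

# DataArray.sum() is a reduction: with no arguments it collapses
# every dimension of a single variable into one scalar.
reduced = a.sum()
```

The two calls answer different questions (a field of totals versus one grand total), so swapping one for the other can change a budget value by a large factor.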

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have added tests that prove my fix is effective or that my feature works
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@tomvothecoder tomvothecoder changed the title from "Refactor/673 aerosol budget" to "CDAT Migration: Refactor aerosol_budget" Mar 20, 2024
Comment on lines +235 to +331
# Perform numpy index-based arithmetic instead of xarray label-based
# arithmetic because the Z dims of `mass` and `delta_p` can have
# different names ("lev" vs. "ilev") with slightly different values. The
# CDAT version of this code uses numpy index-based arithmetic too.
with xr.set_options(keep_attrs=True):
    # mass density * mass air kg/m2
    mass_3d = mass.copy()
    mass_3d.values = mass.values * delta_p.values / 9.8
Collaborator Author

@tomvothecoder tomvothecoder Mar 29, 2024

We can also align the Z axis dim names then use xr.align() to get the Z axis coordinates aligned before performing label-based arithmetic.
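
The alternative mentioned in this comment could look roughly like the sketch below. It assumes the two Z axes have the same length, and it uses `xr.align(..., join="override")` to overwrite the slightly different coordinate values. The toy arrays, the `x` dimension, and the coordinate values are made up for illustration.

```python
import numpy as np
import xarray as xr

mass = xr.DataArray(
    np.ones((3, 2)),
    dims=("lev", "x"),
    coords={"lev": [100.0, 200.0, 300.0]},
)
delta_p = xr.DataArray(
    np.full((3, 2), 98.0),
    dims=("ilev", "x"),
    coords={"ilev": [100.1, 200.1, 300.1]},
)

# Label-based arithmetic treats "lev" and "ilev" as unrelated dimensions
# and broadcasts to a 3D result, which is not what is wanted here.
naive = mass * delta_p

# Index-based arithmetic, as in the snippet under review.
mass_3d = mass.copy()
mass_3d.values = mass.values * delta_p.values / 9.8

# Alternative: rename the Z dim, then let xr.align() replace the
# slightly different coordinate values with those of `mass`.
mass_a, delta_p_a = xr.align(
    mass, delta_p.rename({"ilev": "lev"}), join="override"
)
labeled = mass_a * delta_p_a / 9.8
```

Both approaches pair levels by position; the label-based version just makes that choice explicit in the coordinates instead of bypassing them.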

@tomvothecoder
Collaborator Author

#804 and #805 are related to this PR. We need to carry over any code changes here.

@chengzhuzhang
Contributor

chengzhuzhang commented Apr 12, 2024

Yes, this is in earlier code. It still is concerning (though I don't think it is wrong or blocking to have this behavior). We can add a fix to avoid these errors, I would have hoped the logic wouldn't reach the units check in these cases...

Agreed. I think I commented on the wrong PR; the conversation should go to #805. Sorry, Tom.

@tomvothecoder
Collaborator Author

Agreed. I think I commented on the wrong PR; the conversation should go to #805. Sorry, Tom.

No problem @chengzhuzhang. I moved the comments over to that PR and deleted them here.

@E3SM-Project E3SM-Project deleted a comment from chengzhuzhang Apr 15, 2024
@E3SM-Project E3SM-Project deleted a comment from mahf708 Apr 15, 2024
@tomvothecoder tomvothecoder force-pushed the cdat-migration-fy24 branch from f6c4fdf to 1e1ab90 Compare May 1, 2024 17:10
@tomvothecoder tomvothecoder force-pushed the refactor/673-aerosol-budget branch from 97d6cc7 to 7ec87bf Compare May 1, 2024 17:13
Fix conditionals

Add optional `a_key` and `b_key` args to  `_hybrid_to_pressure()`
- Add derived variables to `derivations.py` and `formulas.py`

Add `bc_CFLX()`
@tomvothecoder tomvothecoder force-pushed the refactor/673-aerosol-budget branch from 7ec87bf to a95c223 Compare May 1, 2024 17:27
- Fix elevated emission, burden, lifetime, and sulfate elevated emission
- Update `Dataset` class to support tracking of wildcard source variables with new attribute, `is_src_vars_wildcard`. This attribute determines whether to pass a list of DataArrays to a derived variable function, or to unpack it beforehand
- Add main and dev results CSVs
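
As a rough sketch of the wildcard behavior described in this commit message: the flag decides whether a derived variable function receives one list of matched variables or separate positional arguments. The `apply_derivation()` helper, its signature, and the variable names below are hypothetical and are not the actual `Dataset` method. Plain floats stand in for DataArrays.

```python
from fnmatch import fnmatch
from typing import Callable, Dict, List, Tuple


def apply_derivation(
    func: Callable,
    src_keys: Tuple[str, ...],
    available: Dict[str, float],
    is_src_vars_wildcard: bool,
):
    """Hypothetical dispatcher, not the actual Dataset method."""
    if is_src_vars_wildcard:
        # A wildcard key can match any number of variables, so collect
        # all matches and pass them to the function as one list.
        matched: List[float] = [
            value
            for name, value in sorted(available.items())
            if any(fnmatch(name, pattern) for pattern in src_keys)
        ]
        return func(matched)

    # Fixed keys map one-to-one onto positional arguments.
    return func(*[available[key] for key in src_keys])


# Plain floats stand in for xr.DataArray objects; the names are made up.
data = {"emis_a1": 1.0, "emis_a2": 2.0, "emis_a3": 3.0, "FSNT": 240.0, "FLNT": 238.0}

total = apply_derivation(sum, ("emis_*",), data, is_src_vars_wildcard=True)
net = apply_derivation(lambda fsnt, flnt: fsnt - flnt, ("FSNT", "FLNT"), data, False)
```

Passing the whole list lets the function decide how to combine however many variables the wildcard matched, instead of assuming exactly one.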
@tomvothecoder
Collaborator Author

Viewer

CSV File Diff Comparison (VS Code)

"ANN" comparison -- no highlighted diffs
image

"JJA" comparison -- no highlighted diffs
image

@tomvothecoder tomvothecoder self-assigned this May 9, 2024
@tomvothecoder tomvothecoder added the cdat-migration-fy24 CDAT Migration FY24 Task label May 9, 2024
@tomvothecoder tomvothecoder marked this pull request as ready for review May 9, 2024 18:19
Collaborator Author

@tomvothecoder tomvothecoder left a comment

My PR review comments for areas of interest.

Comment on lines +1308 to +1442


# Names of 2D aerosol burdens, including cloud-borne aerosols
aero_burden_list = [
"ABURDENDUST",
"ABURDENSO4",
"ABURDENSO4_STR",
"ABURDENSO4_TRO",
"ABURDENPOM",
"ABURDENMOM",
"ABURDENSOA",
"ABURDENBC",
"ABURDENSEASALT",
]

# Add burden vars to DERIVED_VARIABLES
for aero_burden_item in aero_burden_list:
    DERIVED_VARIABLES[f"_{aero_burden_item}"] = OrderedDict(
        [((aero_burden_item,), aero_burden_fxn)]
    )


# Names of 2D mass slices of aerosol species
# Also add 3D masses while at it (if available)
aero_mass_list = []
for aero_name in ["dst", "mom", "pom", "so4", "soa", "ncl", "bc"]:
    for aero_lev in ["_srf", "_200", "_330", "_500", "_850", ""]:
        # Note that the empty string (last entry) will get the 3D mass fields
        aero_mass_list.append(f"Mass_{aero_name}{aero_lev}")


# Add mass vars to DERIVED_VARIABLES
for aero_mass_item in aero_mass_list:
    DERIVED_VARIABLES[f"_{aero_mass_item}"] = OrderedDict(
        [((aero_mass_item,), aero_mass_fxn)]
    )

# Add all the output_aerocom_aie.F90 variables to aero_aerocom_list
# components/eam/src/physics/cam/output_aerocom_aie.F90
aero_aerocom_list = [
"angstrm",
"aerindex",
"cdr",
"cdnc",
"cdnum",
"icnum",
"clt",
"lcc",
"lwp",
"iwp",
"icr",
"icc",
"cod",
"ccn",
"ttop",
"htop",
"ptop",
"autoconv",
"accretn",
"icnc",
"rh700",
"rwp",
"intccn",
"colrv",
"lwp2",
"iwp2",
"lwpbf",
"iwpbf",
"cdnumbf",
"icnumbf",
"aod400",
"aod700",
"colccn.1",
"colccn.3",
"ccn.1bl",
"ccn.3bl",
]

# Add aerocom vars to DERIVED_VARIABLES
for aero_aerocom_item in aero_aerocom_list:
    DERIVED_VARIABLES[aero_aerocom_item] = OrderedDict([((aero_aerocom_item,), rename)])

# add cdnc, icnc, lwp, iwp to DERIVED_VARIABLES
DERIVED_VARIABLES.update(
{
"in_cloud_cdnc": {("cdnc", "lcc"): incldtop_cdnc},
"in_grid_cdnc": {("cdnc",): cldtop_cdnc},
"in_cloud_icnc": {("icnc", "icc"): incldtop_icnc},
"in_grid_icnc": {("icnc",): cldtop_icnc},
"in_cloud_lwp": {("lwp", "lcc"): incld_lwp},
"in_grid_lwp": {("lwp",): cld_lwp},
"in_cloud_iwp": {("iwp", "icc"): incld_iwp},
"in_grid_iwp": {("iwp",): cld_iwp},
}
)


DERIVED_VARIABLES.update(
{
"ERFtot": {("FSNT", "FLNT"): erf_tot},
"ERFari": {("FSNT", "FLNT", "FSNT_d1", "FLNT_d1"): erf_ari},
"ERFaci": {("FSNT_d1", "FLNT_d1", "FSNTC_d1", "FLNTC_d1"): erf_aci},
"ERFres": {("FSNTC_d1", "FLNTC_d1"): erf_res},
}
)

# Add more AOD terms
# Note that AODVIS and AODDUST are already added elsewhere
aero_aod_list = [
"AODBC",
"AODPOM",
"AODMOM",
"AODSO4",
"AODSO4_STR",
"AODSO4_TRO",
"AODSS",
"AODSOA",
]

# Add aod vars to DERIVED_VARIABLES
for aero_aod_item in aero_aod_list:
    DERIVED_VARIABLES[aero_aod_item] = {(aero_aod_item,): rename}

# Add 3D variables related to aerosols and chemistry
# Note that O3 is already added above
# Note that 3D mass vars are already added by the empty string above ""
# Note that it is possible to create on-the-fly slices from these variables with
# a function of the form:
# def aero_3d_slice(var, lev):
# return var[lev, :, :]
aero_chem_list = ["DMS", "H2O2", "H2SO4", "NO3", "OH", "SO2"]

# Add aero/chem vars to DERIVED_VARIABLES
for aero_chem_item in aero_chem_list:
    DERIVED_VARIABLES[aero_chem_item] = {(aero_chem_item,): rename}
Collaborator Author

These changes were pulled over from PRs #763 and #805.
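
For readers new to this file, the registration pattern above maps each target variable to a dictionary of `{tuple of source variables: derivation function}`. The sketch below mimics that shape with a toy registry and shows one way such a mapping could be consumed. The `derive()` helper and the sample values are hypothetical, and floats stand in for DataArrays.

```python
from typing import Callable, Dict, Tuple

# Toy registry with the same shape as DERIVED_VARIABLES:
# target variable -> {tuple of source variables: derivation function}.
registry: Dict[str, Dict[Tuple[str, ...], Callable]] = {}


def rename(var):
    """Pass the source variable through unchanged."""
    return var


def scale_burden(var):
    """Stand-in for aero_burden_fxn(): scale by 1e6."""
    return var * 1e6


for name in ["AODBC", "AODSO4"]:
    registry[name] = {(name,): rename}

for name in ["ABURDENBC"]:
    registry[f"_{name}"] = {(name,): scale_burden}


def derive(target: str, available: Dict[str, float]):
    """Hypothetical lookup: use the first entry whose sources all exist."""
    for src_keys, func in registry[target].items():
        if all(key in available for key in src_keys):
            return func(*[available[key] for key in src_keys])
    raise KeyError(f"No usable derivation found for {target!r}")


data = {"AODBC": 0.02, "ABURDENBC": 2.0e-7}
aod = derive("AODBC", data)
burden = derive("_ABURDENBC", data)
```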

result.attrs["units"] = "kg/m2/s"

return result

Collaborator Author

This function now expects a list of variables (xr.DataArray), instead of a single sum DataArray.
We need to pass the list of variables to the sum_vars function to maintain the "units" attribute.

Otherwise, calling sum(vars) beforehand will drop attributes.

Comment on lines +498 to +849
def aero_burden_fxn(var: xr.DataArray) -> xr.DataArray:
"""Scale the aerosol burden by 1e6.

Parameters
----------
var : xr.DataArray
The input burden in kg/m2.

Returns
-------
xr.DataArray
The output burden in 1e-6 kg/m2.
"""
with xr.set_options(keep_attrs=True):
burden = var * 1e6

burden.attrs["units"] = "1e-6 kg/m2"

return burden


def aero_mass_fxn(var: xr.DataArray) -> xr.DataArray:
"""Scale the given mass by 1e12.

Parameters
----------
var : xr.DataArray
The input mass in kg/kg.

Returns
-------
xr.DataArray
The aerosol mass concentration in 1e-12 kg/kg units.
"""
with xr.set_options(keep_attrs=True):
mass = var * 1e12

mass.attrs["units"] = "1e-12 kg/kg"

return mass


def incldtop_cdnc(cdnc: xr.DataArray, lcc: xr.DataArray) -> xr.DataArray:
"""Return the in-cloud cloud droplet number concentration at cloud top.

Parameters
----------
cdnc : xr.DataArray
Cloud droplet number concentration in 1/m3.
lcc : xr.DataArray
Liquid cloud fraction.

Returns
-------
xr.DataArray
In-cloud cdnc at cloud top in 1/cm3.
"""
with xr.set_options(keep_attrs=True):
var = cdnc * 1e-6 / lcc

var.attrs["units"] = "1/cm3"
var.attrs["long_name"] = "In-cloud-top CDNC"

return var


def cldtop_cdnc(cdnc: xr.DataArray) -> xr.DataArray:
"""Return the in-grid cloud droplet number concentration at cloud top.

Parameters
----------
cdnc : xr.DataArray
Cloud droplet number concentration in 1/m3.

Returns
-------
xr.DataArray
In-grid cdnc at cloud top in 1/cm3.
"""
with xr.set_options(keep_attrs=True):
var = cdnc * 1e-6

var.attrs["units"] = "1/cm3"
var.attrs["long_name"] = "In-grid cloud-top CDNC"

return var


def incldtop_icnc(icnc: xr.DataArray, icc: xr.DataArray) -> xr.DataArray:
"""Return the in-cloud ice crystal number concentration at cloud top.

Parameters
----------
icnc : xr.DataArray
Ice crystal number concentration in 1/m3.
icc : xr.DataArray
Ice cloud fraction.

Returns
-------
xr.DataArray
In-cloud icnc at cloud top in 1/cm3.
"""
with xr.set_options(keep_attrs=True):
var = icnc * 1e-6 / icc

var.attrs["units"] = "1/cm3"
var.attrs["long_name"] = "In-cloud-top ICNC"

return var


def cldtop_icnc(icnc: xr.DataArray) -> xr.DataArray:
"""Return the in-grid ice crystal number concentration at cloud top.

Parameters
----------
icnc : xr.DataArray
Ice crystal number concentration in 1/m3.

Returns
-------
xr.DataArray
In-grid icnc at cloud top in 1/cm3.
"""
with xr.set_options(keep_attrs=True):
var = icnc * 1e-6

var.attrs["units"] = "1/cm3"
var.attrs["long_name"] = "In-grid cloud-top ICNC"

return var


def incld_lwp(lwp: xr.DataArray, lcc: xr.DataArray) -> xr.DataArray:
"""Return the in-cloud liquid water path (LWP).

Parameters
----------
lwp : xr.DataArray
Liquid water path in kg/m2.
lcc : xr.DataArray
Liquid cloud fraction.

Returns
-------
xr.DataArray
In-cloud liquid water path in g/cm3.
"""
with xr.set_options(keep_attrs=True):
var = 1e3 * lwp / lcc

var.attrs["units"] = "g/cm3"
var.attrs["long_name"] = "In-cloud LWP"

return var


def cld_lwp(lwp: xr.DataArray) -> xr.DataArray:
"""Return the grid-mean-cloud LWP in g/cm3.

Parameters
----------
lwp : xr.DataArray
Liquid Water Path (LWP) value.

Returns
-------
xr.DataArray
Grid-mean-cloud LWP in g/cm3.
"""
with xr.set_options(keep_attrs=True):
var = 1e3 * lwp

var.attrs["units"] = "g/cm3"
var.attrs["long_name"] = "In-grid LWP"

return var


def incld_iwp(iwp: xr.DataArray, icc: xr.DataArray) -> xr.DataArray:
"""Return the in-cloud ice water path (IWP).

Parameters
----------
iwp : xr.DataArray
Ice water path in kg/m2.
icc : xr.DataArray
Ice cloud fraction.

Returns
-------
xr.DataArray
In-cloud IWP in g/cm3.
"""
with xr.set_options(keep_attrs=True):
var = 1e3 * iwp / icc

var.attrs["units"] = "g/cm3"
var.attrs["long_name"] = "In-cloud IWP"

return var


def cld_iwp(iwp: xr.DataArray) -> xr.DataArray:
"""Return the in-grid ice water path (IWP).

Parameters
----------
iwp : xr.DataArray
Ice water path in kg/m2.

Returns
-------
xr.DataArray
In-grid IWP in g/cm3.
"""
with xr.set_options(keep_attrs=True):
var = 1e3 * iwp

var.attrs["units"] = "g/cm3"
var.attrs["long_name"] = "In-grid IWP"

return var


def erf_tot(fsnt: xr.DataArray, flnt: xr.DataArray) -> xr.DataArray:
"""
Calculate the total effective radiative forcing (ERFtot).

Parameters
----------
fsnt : xr.DataArray
The incoming sw radiation at the top of the atmosphere.
flnt : xr.DataArray
The outgoing lw radiation at the top of the atmosphere.

Returns
-------
xr.DataArray
The ERFtot which represents the total erf.

See Ghan 2013 for derivation of ERF decomposition: https://doi.org/10.5194/acp-13-9971-2013
"""
with xr.set_options(keep_attrs=True):
var = fsnt - flnt

var.attrs["units"] = "W/m2"
var.attrs["long_name"] = "ERFtot: total effect"
return var


def erf_ari(
fsnt: xr.DataArray, flnt: xr.DataArray, fsnt_d1: xr.DataArray, flnt_d1: xr.DataArray
) -> xr.DataArray:
"""
Calculate aerosol--radiation interactions (ARI) part of effective radiative forcing (ERF).

Parameters
----------
fsnt : xr.DataArray
Net solar flux at the top of the atmosphere.
flnt : xr.DataArray
Net longwave flux at the top of the atmosphere.
fsnt_d1 : xr.DataArray
fsnt without aerosols.
flnt_d1 : xr.DataArray
flnt without aerosols.

Returns
-------
xr.DataArray
ERFari (aka, direct effect) in W/m2.

See Ghan 2013 for derivation of ERF decomposition: https://doi.org/10.5194/acp-13-9971-2013
"""
with xr.set_options(keep_attrs=True):
var = (fsnt - flnt) - (fsnt_d1 - flnt_d1)

var.attrs["units"] = "W/m2"
var.attrs["long_name"] = "ERFari: direct effect"

return var


def erf_aci(
fsnt_d1: xr.DataArray,
flnt_d1: xr.DataArray,
fsntc_d1: xr.DataArray,
flntc_d1: xr.DataArray,
) -> xr.DataArray:
"""
Calculate the aerosol--cloud interactions (ACI) part of effective radiative forcing (ERF).

Parameters
----------
fsnt_d1 : xr.DataArray
Downward shortwave radiation toa without aerosols.
flnt_d1 : xr.DataArray
Upward longwave radiation toa without aerosols.
fsntc_d1 : xr.DataArray
fsnt_d1 without clouds.
flntc_d1 : xr.DataArray
flnt_d1 without clouds.

Returns
-------
xr.DataArray
ERFaci (aka, indirect effect) in W/m2.

Notes
-----
See Ghan 2013 for derivation of ERF decomposition: https://doi.org/10.5194/acp-13-9971-2013
"""
with xr.set_options(keep_attrs=True):
var = (fsnt_d1 - flnt_d1) - (fsntc_d1 - flntc_d1)

var.attrs["units"] = "W/m2"
var.attrs["long_name"] = "ERFaci: indirect effect"

return var


def erf_res(fsntc_d1: xr.DataArray, flntc_d1: xr.DataArray) -> xr.DataArray:
"""
Calculate the residual effect (RES) part of effective radiative forcing.

Parameters
----------
fsntc_d1 : xr.DataArray
Downward solar radiation at the top of the atmosphere with neither
clouds nor aerosols.
flntc_d1 : xr.DataArray
Upward longwave radiation at the top of the atmosphere with neither
clouds nor aerosols.

Returns
-------
xr.DataArray
ERFres (aka, surface effect) in W/m2.

Notes
-----
See Ghan 2013 for derivation of ERF decomposition: https://doi.org/10.5194/acp-13-9971-2013
"""
with xr.set_options(keep_attrs=True):
var = fsntc_d1 - flntc_d1

var.attrs["units"] = "W/m2"
var.attrs["long_name"] = "ERFres: residual effect"

return var
Collaborator Author

These changes were pulled from PRs #763 and #805.
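
As a quick sanity check on the four ERF functions above: as written, the intermediate terms cancel, so ERFari + ERFaci + ERFres equals ERFtot. The sketch below verifies this with plain floats. The flux values are made up and are not model output.

```python
import math

# Made-up global-mean fluxes in W/m2 (not model output).
fsnt, flnt = 240.0, 238.5
fsnt_d1, flnt_d1 = 241.2, 238.9
fsntc_d1, flntc_d1 = 288.0, 262.0

# Same expressions as erf_tot(), erf_ari(), erf_aci(), and erf_res().
erf_tot = fsnt - flnt
erf_ari = (fsnt - flnt) - (fsnt_d1 - flnt_d1)
erf_aci = (fsnt_d1 - flnt_d1) - (fsntc_d1 - flntc_d1)
erf_res = fsntc_d1 - flntc_d1

# The intermediate terms cancel, so the parts add up to the total.
parts_sum = erf_ari + erf_aci + erf_res
```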

Comment on lines +852 to +874
def sum_vars(vars: List[xr.DataArray]) -> xr.DataArray:
    """Sum DataArrays using Python's built-in `sum()` and preserve attrs.

    Python's `sum()` iterates over the iterable (the list of DataArrays) and
    adds all elements, which is different from NumPy which performs a sum
    reduction over an axis/axes. This function ensures the DataArray attributes
    are preserved by invoking the `sum()` call within the context of
    `xr.set_options(keep_attrs=True)`.

    Parameters
    ----------
    vars : List[xr.DataArray]
        A list of variables.

    Returns
    -------
    xr.DataArray
        The sum of the variables.
    """
    with xr.set_options(keep_attrs=True):
        result: xr.DataArray = sum(vars)  # type: ignore

    return result
Collaborator Author

The new sum_vars() function replaces the original lambda *x: sum(x) and lambda *x: molec_convert_units(sum(x), 12) functions defined in the derived variables dictionary.

It places the call to sum(vars) inside with xr.set_options(keep_attrs=True) to maintain attributes, including "units" which is required for molec_convert_units().
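
A short usage sketch of this behavior, assuming two toy DataArrays that share the same "units" attribute. The function body repeats `sum_vars()` from the diff above so the example is self-contained. The values and the `x` dimension are made up.

```python
from typing import List

import numpy as np
import xarray as xr


def sum_vars(vars: List[xr.DataArray]) -> xr.DataArray:
    """Sum DataArrays element-wise while keeping attributes."""
    with xr.set_options(keep_attrs=True):
        result: xr.DataArray = sum(vars)  # type: ignore

    return result


# Toy inputs that share the same "units" attribute.
var_a = xr.DataArray(np.array([1.0, 2.0]), dims="x", attrs={"units": "kg/m2/s"})
var_b = xr.DataArray(np.array([3.0, 4.0]), dims="x", attrs={"units": "kg/m2/s"})

total = sum_vars([var_a, var_b])
```

The "units" attribute survives the addition here, so a downstream unit conversion that reads `total.attrs["units"]` still has what it needs.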

@@ -1,114 +1,38 @@
"""
Collaborator Author

Most of the code in aerosol_budget_driver.py is similar to the CDAT version. I renamed and re-organized the functions, updated code to reflect Xarray-based logic, and used xCDAT for regridding hybrid to pressure.

@@ -16,11 +16,16 @@

Collaborator Author

The code changes in this file involve updating the accepted hybrid sigma keys, including adding "hyai" and "hybi" which are used by the aerosol budget driver.

@@ -93,8 +93,8 @@ def rmvAnnualCycle(data, spd, fCrit):
# if fcrit_ndx > 1:
Collaborator Author

Code changes in this file are to fix lingering pre-commit issues from #732.

@@ -1,7 +1,6 @@
from __future__ import annotations
Collaborator Author

Code changes in this file are to fix lingering pre-commit issues from #732.


self.variables: List[str] = []
- self.seasons: List[CLIMO_FREQ] = ["ANN", "DJF", "MAM", "JJA", "SON"]
+ self.seasons: List[ClimoFreq] = ["ANN", "DJF", "MAM", "JJA", "SON"]
Collaborator Author

Update type annotation from CLIMO_FREQ to ClimoFreq.

z.loc[{"frequency": 0}] = np.nan

if "spec_raw" in var.name and subplot_num < 2:
var_name = str(var.name)
Collaborator Author

Extract var.name as var_name string variable for re-use.

("FISCCP1_COSP",): cosp_histogram_standardize,
("CLISCCP",): cosp_histogram_standardize,
},
"ICEFRAC": OrderedDict(
Collaborator Author

I replaced unnecessary OrderedDict with a regular dictionary, but many still remain. All OrderedDicts should eventually be replaced in #716 because they are hard to read and work with.
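
A small sketch of why the swap is safe as far as ordering goes: since Python 3.7, regular dictionaries preserve insertion order, so iterating over the source-variable tuples visits them in the same order either way. The `rename` and `scale` functions and the `icefrac_alt` key below are hypothetical stand-ins.

```python
from collections import OrderedDict


# Hypothetical stand-in functions for the derivations.
def rename(var):
    return var


def scale(var):
    return var * 2


ordered = OrderedDict([(("ICEFRAC",), rename), (("icefrac_alt",), scale)])
plain = {("ICEFRAC",): rename, ("icefrac_alt",): scale}

# Since Python 3.7, plain dicts keep insertion order, so iterating over
# either mapping visits the source-variable tuples in the same order.
same_order = list(ordered) == list(plain)
```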

@tomvothecoder tomvothecoder merged commit 2b492c9 into cdat-migration-fy24 May 13, 2024
@tomvothecoder tomvothecoder deleted the refactor/673-aerosol-budget branch May 13, 2024 20:24
tomvothecoder added a commit that referenced this pull request Jul 15, 2024
tomvothecoder added a commit that referenced this pull request Aug 21, 2024
tomvothecoder added a commit that referenced this pull request Oct 1, 2024
tomvothecoder added a commit that referenced this pull request Oct 25, 2024
tomvothecoder added a commit that referenced this pull request Oct 29, 2024
tomvothecoder added a commit that referenced this pull request Oct 29, 2024
tomvothecoder added a commit that referenced this pull request Dec 4, 2024
tomvothecoder added a commit that referenced this pull request Dec 5, 2024
Refer to the PR for more information because the changelog is massive.

Update build workflow to run on `cdat-migration-fy24` branch

CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743)

- Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files
- Fix some lingering unit test failures
- Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml`
- Add `xskillscore` to `ci.yml`
- Fix `pre-commit` issues

CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744)

- Add Makefile that simplifies common development commands (building and installing, testing, etc.)
- Write unit tests to cover all new code for utility functions
  - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py`
- Metrics comparison for  `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs
- Test run with 3D variables (`_run_3d_diags()`)
  - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still
  - Fix subsetting syntax bug using ilev
  - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests
  - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()`
- Fix failing integration tests so they pass in CI/CD
  - Refactor `test_diags.py` -- replace unittest with pytest
  - Refactor `test_all_sets.py` -- replace unittest with pytest
  - Test climatology datasets -- tested with 3d variables using `test_all_sets.py`

CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746)

- Move driver type annotations to `type_annotations.py`
- Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py`
- Update `_save_data_metrics_and_plots` args to accept `plot_func` callable
- Update `metrics.spatial_avg` to optionally return an `xr.DataArray` with `as_list=False`
- Move `parameter` arg to the top in `lat_lon_plot.plot`
- Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class

Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754)

Update regression test notebook to show validation of all vars

Add `subset_and_align_datasets()` to regrid.py (#776)

Add template run scripts

CDAT Migration Phase: Refactor `cosp_histogram` set (#748)

- Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py`
- `formulas_cosp.py` (new file)
  - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions
  - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py`
- `derivations.py`
  - Cleaned up portions of `DERIVED_VARIABLES` dictionary
  - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in #716)
  - Remove unnecessary `convert_units()` function calls
  - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP`
- `utils.py`
  - Delete deprecated, CDAT-based `cosp_histogram` functions
- `dataset_xr.py`
  - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length; it drops this variable and replaces it with the correct coordinate
  - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`)
- `io.py`
  - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing)
- Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files

CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774)

Refactor 654 zonal mean xy (#752)

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>

CDAT Migration - Update run script output directory to NERSC public webserver (#793)

[PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788)

CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794)

CDAT Migration: Refactor `polar` set (#749)

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>

Align order of calls to `_set_param_output_attrs`

CDAT Migration: Refactor `meridional_mean_2d` set (#795)

CDAT Migration: Refactor `aerosol_budget` (#800)

Add `acme.py` changes from PR #712 (#814)

* Add `acme.py` changes from PR #712

* Replace unnecessary lambda call

Refactor area_mean_time_series and add ccb slice flag feature (#750)

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>

[Refactor]: Validate fix in PR #750 for #759 (#815)

CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819)

CDAT Migration: Refactor annual_cycle_zonal_mean set (#798)

* Refactor `annual_cycle_zonal_mean` set

* Address PR review comments

* Add lat lon regression testing

* Add debugging scripts

* Update `_open_climo_dataset()` to decode times as workaround to misaligned time coords
- Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers

* Fix unit tests

* Remove old plotter

* Add script to debug decode_times=True and ncclimo file

* Update plotter time values to month integers

* Fix slow `.load()` and multiprocessing issue
- Due to incorrectly updating `keep_bnds` logic
- Add `_encode_time_coords()` to workaround cftime issue `ValueError: "months since" units only allowed for "360_day" calendar`

* Update `_encode_time_coords()` docstring

* Add AODVIS debug script

* update AODVIS obs datasets; regression test results

---------

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>

CDAT Migration Phase 2: Refactor `qbo` set (#826)

CDAT Migration Phase 2: Refactor tc_analysis set  (#829)

* start tc_analysis_refactor

* update driver

* update plotting

* Clean up plotter
- Remove unused variables
- Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts
- Reorder functions for top-down readability

* Remove unused notebook

---------

Co-authored-by: tomvothecoder <tomvothecoder@gmail.com>

CDAT Migration Phase 2: Refactor `enso_diags` set (#832)

CDAT Migration Phase 2: Refactor `streamflow` set (#837)

[Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841)

[Refactor]: CDAT Migration Phase 3: testing and documentation update (#846)

CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860)

CDAT Migration Phase 2: Refactor arm_diags set (#842)

Add performance benchmark material (#864)

Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865)

CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875)

CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876)

Add support for time series datasets via glob and fix `enso_diags` set (#866)

Add fix for checking `is_time_series()` property based on `data_type` attr (#881)

CDAT migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882)

CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883)

CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887)

CDAT Migration: Prepare branch for merge to `main` (#885)

[Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891)

CDAT Migration Phase 2: Refactor core utilities and  `lat_lon` set (#677)

Refer to the PR for more information because the changelog is massive.

tomvothecoder added a commit that referenced this pull request Dec 5, 2024
Refer to the PR for more information because the changelog is massive.

Update build workflow to run on `cdat-migration-fy24` branch

CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743)

- Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files
- Fix some lingering unit test failures
- Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml`
- Add `xskillscore` to `ci.yml`
- Fix `pre-commit` issues

CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744)

- Add Makefile that simplifies common development commands (building and installing, testing, etc.)
- Write unit tests to cover all new code for utility functions
  - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py`
- Metrics comparison between the `cdat-migration-fy24` and `main` branches of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs
- Test run with 3D variables (`_run_3d_diags()`)
  - Fix Python 3.9 bug with using the pipe operator (`|`) to represent `Union` -- it still doesn't work with `from __future__ import annotations`
  - Fix subsetting syntax bug using ilev
  - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add a conditional that skips adding new bounds for regridded output datasets, and fix related tests
  - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()`
- Fix failing integration tests so they pass in CI/CD
  - Refactor `test_diags.py` -- replace unittest with pytest
  - Refactor `test_all_sets.py` -- replace unittest with pytest
  - Test climatology datasets -- tested with 3d variables using `test_all_sets.py`
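
The Python 3.9 `Union` fix listed above can be illustrated with a minimal sketch. It assumes the failure came from a type alias evaluated at import time, which is one plausible reading of the bullet and is not confirmed by the PR. The names `Number`, `total`, and `pipe_union_supported` are hypothetical.

```python
import sys
from typing import List, Union

# `Union[...]` works on every supported Python version. A module-level alias
# such as `Number = int | float` is evaluated eagerly, so on Python 3.9 it
# raises TypeError even when `from __future__ import annotations` is present,
# because that import only defers the evaluation of annotations.
Number = Union[int, float]


def total(values: List[Number]) -> Number:
    """Sum a list of ints and floats."""
    return sum(values)


def pipe_union_supported() -> bool:
    """Return True if `int | float` can be evaluated at runtime (Python 3.10+)."""
    try:
        eval("int | float")
        return True
    except TypeError:
        return False


result = total([1, 2.5])
supported = pipe_union_supported()
```

On Python 3.9 `pipe_union_supported()` returns `False`, so falling back to `Union` keeps such aliases importable there.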

CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746)

- Move driver type annotations to `type_annotations.py`
- Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py`
- Update `_save_data_metrics_and_plots` args to accept `plot_func` callable
- Update `metrics.spatial_avg` to optionally return an `xr.DataArray` with `as_list=False`
- Move `parameter` arg to the top in `lat_lon_plot.plot`
- Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class
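
The idea behind accepting a `plot_func` callable can be sketched as follows. This is an illustration of the pattern only, not the actual signature of `_save_data_metrics_and_plots`. The function names, arguments, and return values below are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical plot function type: takes a variable key and its metrics and
# returns a label for the figure it would produce.
PlotFunc = Callable[[str, Dict[str, float]], str]


def save_metrics_and_plot(
    plot_func: PlotFunc, var_key: str, metrics: Dict[str, float]
) -> str:
    """Run the shared saving steps, then delegate plotting to `plot_func`."""
    # Steps shared by every diagnostic set (for example, writing metrics to
    # disk) would go here, so each set does not have to repeat them.
    return plot_func(var_key, metrics)


def lat_lon_plot(var_key: str, metrics: Dict[str, float]) -> str:
    return f"lat_lon:{var_key}:mean={metrics['mean']:.2f}"


def polar_plot(var_key: str, metrics: Dict[str, float]) -> str:
    return f"polar:{var_key}:mean={metrics['mean']:.2f}"


lat_lon_label = save_metrics_and_plot(lat_lon_plot, "PRECT", {"mean": 3.0})
polar_label = save_metrics_and_plot(polar_plot, "PRECT", {"mean": 3.0})
```

Passing the plotter in as an argument lets one shared helper serve several diagnostic sets, which matches the reusability goal stated in the commit title.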

Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754)

Update regression test notebook to show validation of all vars

Add `subset_and_align_datasets()` to regrid.py (#776)

Add template run scripts

CDAT Migration Phase: Refactor `cosp_histogram` set (#748)

- Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py`
- `formulas_cosp.py` (new file)
  - Includes refactored, Xarray-based `cosp_histogram_standardize()` and `cosp_bin_sum()` functions
  - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py`
- `derivations.py`
  - Cleaned up portions of `DERIVED_VARIABLES` dictionary
  - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in #716)
  - Remove unnecessary `convert_units()` function calls
  - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP`
- `utils.py`
  - Delete deprecated, CDAT-based `cosp_histogram` functions
- `dataset_xr.py`
  - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length; the method drops this variable and replaces it with the correct coordinate
  - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`)
- `io.py`
  - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF files (required for easy comparison to CDAT-based code that does the same thing)
- Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files
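
The `OrderedDict` removal noted above is safe because plain `dict` objects preserve insertion order in Python 3.7 and later. A minimal sketch follows, assuming a lookup that walks the candidates in priority order. The keys, formula names, and the `first_match` helper are hypothetical and do not mirror the real `DERIVED_VARIABLES` entries.

```python
from collections import OrderedDict
from typing import Dict, Optional, Set, Tuple

# Hypothetical entries: tuples of source variable names mapped to the name of
# the formula that derives the target variable from them.
pairs = [
    (("CLD_MISR",), "formula_a"),
    (("CLISCCP",), "formula_b"),
    (("CLMODIS",), "formula_c"),
]

ordered = OrderedDict(pairs)
plain = dict(pairs)  # keeps the same order since Python 3.7


def first_match(
    candidates: Dict[Tuple[str, ...], str], available: Set[str]
) -> Optional[str]:
    """Return the first formula whose source variables are all available."""
    for source_vars, formula in candidates.items():
        if set(source_vars) <= available:
            return formula
    return None


same_order = list(plain) == list(ordered)
chosen = first_match(plain, {"CLISCCP", "CLMODIS"})
```

Because iteration order is identical, a priority-based lookup behaves the same with either container.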

CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774)

Refactor 654 zonal mean xy (#752)

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>

CDAT Migration - Update run script output directory to NERSC public webserver (#793)

[PR]: CDAT Migration: Refactor `aerosol_aeronet` set (#788)

CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794)

CDAT Migration: Refactor `polar` set (#749)

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>

Align order of calls to `_set_param_output_attrs`

CDAT Migration: Refactor `meridional_mean_2d` set (#795)

CDAT Migration: Refactor `aerosol_budget` (#800)

Add `acme.py` changes from PR #712 (#814)

* Add `acme.py` changes from PR #712

* Replace unnecessary lambda call

Refactor area_mean_time_series and add ccb slice flag feature (#750)

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>

[Refactor]: Validate fix in PR #750 for #759 (#815)

CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819)

CDAT Migration: Refactor annual_cycle_zonal_mean set (#798)

* Refactor `annual_cycle_zonal_mean` set

* Address PR review comments

* Add lat lon regression testing

* Add debugging scripts

* Update `_open_climo_dataset()` to decode times as a workaround for misaligned time coords
- Update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers

* Fix unit tests

* Remove old plotter

* Add script to debug decode_times=True and ncclimo file

* Update plotter time values to month integers

* Fix slow `.load()` and multiprocessing issue
- Due to incorrectly updating `keep_bnds` logic
- Add `_encode_time_coords()` to work around cftime issue `ValueError: "months since" units only allowed for "360_day" calendar`

* Update `_encode_time_coords()` docstring

* Add AODVIS debug script

* update AODVIS obs datasets; regression test results

---------

Co-authored-by: Tom Vo <tomvothecoder@gmail.com>
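
The conversion of time coordinates to month integers mentioned in the `annual_cycle_zonal_mean` commits can be sketched with the standard library. This assumes a monthly climatology with one timestamp per month. The real plotter presumably works on Xarray coordinates holding `cftime` objects, and `to_month_integers` is a hypothetical name.

```python
from datetime import datetime
from typing import List, Sequence


def to_month_integers(times: Sequence[datetime]) -> List[int]:
    """Map each time coordinate to its calendar month (1 = January)."""
    return [t.month for t in times]


# Twelve mid-month timestamps standing in for a monthly climatology.
climo_times = [datetime(2000, month, 15) for month in range(1, 13)]
months = to_month_integers(climo_times)
```

Plain integers sidestep calendar-specific time encoding on the plot axis, which is likely why the plotter uses them.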

CDAT Migration Phase 2: Refactor `qbo` set (#826)

CDAT Migration Phase 2: Refactor tc_analysis set (#829)

* start tc_analysis_refactor

* update driver

* update plotting

* Clean up plotter
- Remove unused variables
- Make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts
- Reorder functions for top-down readability

* Remove unused notebook

---------

Co-authored-by: tomvothecoder <tomvothecoder@gmail.com>
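
The `PLOT_INFO` change in the plotter cleanup (a module-level constant that is a dict of dicts) can be sketched as below. The keys, fields, and the `plot_title` helper are hypothetical placeholders, not the real contents of the plotter module.

```python
from typing import Dict

# Hypothetical entries: one inner dict of settings per plot type.
PLOT_INFO: Dict[str, Dict[str, str]] = {
    "cyclone_density": {"title": "TC Tracks Density", "units": "per year"},
    "aew_density": {"title": "AEW Density", "units": "per year"},
}


def plot_title(plot_type: str) -> str:
    """Look up a setting by plot type and field name."""
    return PLOT_INFO[plot_type]["title"]


title = plot_title("aew_density")
```

Named fields make each lookup self-describing, and the uppercase name signals that the mapping is not meant to be mutated at runtime.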

CDAT Migration Phase 2: Refactor `enso_diags` set (#832)

CDAT Migration Phase 2: Refactor `streamflow` set (#837)

[Bug]: CDAT Migration Phase 2: enso_diags plot fixes (#841)

[Refactor]: CDAT Migration Phase 3: testing and documentation update (#846)

CDAT Migration Phase 3 - Port QBO Wavelet feature to Xarray/xCDAT codebase (#860)

CDAT Migration Phase 2: Refactor arm_diags set (#842)

Add performance benchmark material (#864)

Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865)

CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875)

CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876)

Add support for time series datasets via glob and fix `enso_diags` set (#866)
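
As a rough illustration of glob-based discovery of time series files, the sketch below matches files by variable name. The `<var>_<start>_<end>.nc` naming pattern and the `find_time_series_files` helper are assumptions made for this example and are not taken from the PR.

```python
import glob
import os
import tempfile
from typing import List


def find_time_series_files(root: str, var_key: str) -> List[str]:
    """Return sorted paths of files named `<var_key>_*.nc` under `root`."""
    pattern = os.path.join(root, f"{var_key}_*.nc")
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as root:
    # Create empty placeholder files so the pattern has something to match.
    for name in (
        "PRECT_200001_200512.nc",
        "PRECT_200601_201012.nc",
        "TS_200001_200512.nc",
    ):
        open(os.path.join(root, name), "w").close()

    matches = [
        os.path.basename(path) for path in find_time_series_files(root, "PRECT")
    ]
```

A list like `matches` could then be handed to a multi-file opener such as `xarray.open_mfdataset`, though whether the project does exactly that is not stated here.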

Add fix for checking `is_time_series()` property based on `data_type` attr (#881)

CDAT migration: Fix African easterly wave density plots in TC analysis and convert H2OLNZ units to ppm/volume (#882)

CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883)

CDAT Migration - Port JJB tropical subseasonal diags to Xarray/xCDAT (#887)

CDAT Migration: Prepare branch for merge to `main` (#885)

[Refactor]: CDAT Migration - Update dependencies and remove Dataset._add_cf_attrs_to_z_axes() (#891)

CDAT Migration Phase 2: Refactor core utilities and `lat_lon` set (#677)

tomvothecoder added a commit that referenced this pull request Jan 15, 2025

Labels

cdat-migration-fy24 CDAT Migration FY24 Task
