5 changes: 5 additions & 0 deletions .claude/CLAUDE.md
@@ -0,0 +1,5 @@
<!-- markdownlint-disable -->
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
@../AGENTS.md
110 changes: 110 additions & 0 deletions AGENTS.md
@@ -0,0 +1,110 @@
<!-- markdownlint-disable -->
# AGENTS.md

This file provides guidance for working with code in this repository.

## Project Overview

zppy_interfaces is a Python package providing extra functionality for E3SM climate model analysis. It processes log files, generates time series plots, and runs PCMDI diagnostics. The package is designed to be called by zppy or used standalone for climate model analysis.

## Development Environment Setup

Set up the development environment using conda:
```bash
conda clean --all --y
conda env create -f conda/dev.yml -n zppy-interfaces-dev
conda activate zppy-interfaces-dev
pip install .
pre-commit install
```

## Common Development Commands

### Installation and Setup
- `pip install .` - Install the package (add `-e` for an editable development install)
- `pip install -e .[testing]` - Install with testing dependencies
- `pip install -e .[qa]` - Install with quality assurance tools

### Code Quality and Testing
- `pytest` - Run all tests
- `pytest tests/unit/budget_analysis/` - Run specific module tests
- `pytest tests/unit/budget_analysis/test_atm_parser.py` - Run single test file
- `pre-commit run --all-files` - Run all pre-commit hooks
- `black .` - Format code with Black
- `isort .` - Sort imports
- `flake8` - Check code style
- `mypy zppy_interfaces/` - Type checking

### CLI Tools Testing
Test the main CLI applications:
- `zi-budget-analysis --help` - Budget analysis tool
- `zi-global-time-series --help` - Global time series plots
- `zi-pcmdi-link-observation --help` - PCMDI observation linking
- `zi-pcmdi-mean-climate --help` - PCMDI mean climate diagnostics
- `zi-pcmdi-variability-modes --help` - PCMDI variability modes
- `zi-pcmdi-enso --help` - PCMDI ENSO diagnostics
- `zi-pcmdi-synthetic-plots --help` - PCMDI synthetic plots

## Architecture Overview

### Main Components

**budget_analysis/** - E3SM water and energy budget analysis
- `__main__.py` - CLI entry point with legacy and whole-model modes
- `parser.py` - Core budget parsing logic for coupler logs
- `ingestion/` - Component-specific log parsers (atm, ocn, ice, lnd, cpl)
- `plotting.py` - HTML plot generation for legacy mode
- `viz.py` - Visualization for whole-model mode
- `checks.py` - Budget conservation checks
- `normalization.py` - Data normalization utilities

**global_time_series/** - Global time series plot generation
- `__main__.py` - CLI with viewer vs PDF output modes
- `coupled_global/` - Core time series generation logic
- `create_ocean_ts.py` - Ocean-specific time series processing
- `utils.py` - Parameter handling utilities

**pcmdi_diags/** - PCMDI diagnostics suite
- Multiple CLI tools for different diagnostic types
- `viewer.py` - HTML viewer generation
- `synthetic_plots/` - Synthetic plot utilities

**multi_utils/** - Shared utilities
- `logger.py` - Logging setup for child processes
- `viewer.py` - Common viewer functionality
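The actual API of `multi_utils/logger.py` is not shown in this diff; as an illustrative stdlib-only sketch of logging for child processes (the helper name `setup_child_logger` and the format string are hypothetical, not the package's implementation):

```python
import logging
import os


def setup_child_logger(name: str) -> logging.Logger:
    """Hypothetical sketch: configure a logger whose records carry the
    child's PID, so interleaved output from parallel tasks stays readable.
    The real setup lives in zppy_interfaces.multi_utils.logger."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking duplicate handlers on re-entry
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter(
                f"%(asctime)s [{os.getpid()}] %(name)s %(levelname)s: %(message)s"
            )
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger


log = setup_child_logger("budget_analysis.worker")
log.info("parsing ocn logs")
```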

### Data Flow Patterns

**Budget Analysis (whole-model mode):**
1. Ingest: Parse multiple log file types (cpl, atm, ocn, ice, lnd)
2. Normalize: Standardize data formats and units
3. Check: Run conservation checks and compute residuals
4. Visualize: Generate HTML reports with interactive plots
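The four steps above can be sketched with pandas on synthetic rows. This is an illustration of the tidy-table shape, not the package's code: the column names (`source`, `quantity`, `term`, `value`, `normalized_value`) mirror those used by the test scripts in this PR, while the `residual` helper and the example values are hypothetical stand-ins for `normalize` and `OcnClosure`:

```python
import pandas as pd

# 1. Ingest: each parser emits tidy rows per log type (synthetic data here).
rows = pd.DataFrame({
    "source": ["ocn"] * 4,
    "quantity": ["water", "water", "heat", "heat"],
    "term": ["Mass change", "SUM VOLUME FLUXES",
             "Energy change", "SUM IMP+EXP HEAT FLUXES"],
    "value": [2.0e-6, 2.1e-6, 0.5, 0.45],
})

# 2. Normalize: stand-in for normalization.normalize (units, sign conventions).
rows["normalized_value"] = rows["value"]


# 3. Check: a conservation residual = stored change minus summed fluxes.
def residual(df: pd.DataFrame, quantity: str, change_term: str, flux_term: str) -> float:
    q = df[df["quantity"] == quantity]
    change = q.loc[q["term"] == change_term, "normalized_value"].sum()
    fluxes = q.loc[q["term"] == flux_term, "normalized_value"].sum()
    return change - fluxes


water_res = residual(rows, "water", "Mass change", "SUM VOLUME FLUXES")
heat_res = residual(rows, "heat", "Energy change", "SUM IMP+EXP HEAT FLUXES")
print(f"water residual: {water_res:.2e}, heat residual: {heat_res:.2e}")
# 4. Visualize: the real pipeline renders such residuals into HTML reports.
```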

**Global Time Series:**
1. Ocean processing: Extract time series from MPAS-Analysis results (optional)
2. Coupled analysis: Generate regional and global plots
3. Output: HTML viewer (interactive) or PDF (static) based on make_viewer setting

### Key Configuration Files

- `pyproject.toml` - Package configuration, dependencies, CLI entry points
- `conda/dev.yml` - Development environment specification
- `.pre-commit-config.yaml` - Code quality hooks (black, isort, flake8, mypy)
- `.flake8` - Flake8 configuration (line length 119, specific ignores)

### Testing Strategy

- Unit tests in `tests/unit/` organized by module
- Example scripts in `examples/` showing realistic usage
- Integration with pre-commit hooks for quality assurance
- pytest with coverage reporting capabilities

## Important Notes

- The package handles both compressed (.gz) and uncompressed log files
- Budget analysis supports both "legacy" (coupler-only) and "whole-model" modes
- Time series generation can produce either interactive HTML viewers or static PDFs
- All CLI tools use argparse with comprehensive help documentation
- The codebase follows strict code quality standards with Black formatting and comprehensive linting
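The `.gz`-or-plain handling noted above can be done with a small stdlib helper. A minimal sketch for illustration only (the helper name `open_log` is hypothetical, not the package's actual implementation):

```python
import gzip
import pathlib
import tempfile

def open_log(path: str):
    """Open a log file in text mode, transparently handling gzip.

    Detection uses the gzip magic bytes rather than the filename, so a
    renamed compressed file still opens correctly.
    """
    with open(path, "rb") as f:
        magic = f.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "r")

# Example: one plain and one gzipped log, read through the same code path.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "cpl.log").write_text("NET WATER BUDGET\n")
with gzip.open(tmp / "cpl.log.gz", "wt") as f:
    f.write("NET WATER BUDGET\n")

for name in ("cpl.log", "cpl.log.gz"):
    with open_log(str(tmp / name)) as f:
        print(name, "->", f.read().strip())
```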
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -19,10 +19,12 @@ classifiers = [

dependencies = [
"beautifulsoup4",
"bokeh",
"lxml",
"matplotlib",
"netcdf4",
"numpy >=2.0,<3.0",
"pandas",
"pcmdi_metrics>=3.9.3",
"xarray >=2023.02.0",
"xcdat >=0.7.3,<1.0",
@@ -117,6 +119,7 @@ version = { attr = "zppy_interfaces.version.__version__" }

# evolution of options.entry-points
[project.scripts]
zi-budget-analysis = "zppy_interfaces.budget_analysis.__main__:main"
zi-global-time-series = "zppy_interfaces.global_time_series.__main__:main"
zi-pcmdi-link-observation = "zppy_interfaces.pcmdi_diags.link_observation:main"
zi-pcmdi-mean-climate = "zppy_interfaces.pcmdi_diags.pcmdi_mean_cimate:main"
96 changes: 96 additions & 0 deletions tests/unit/budget_analysis/diagnose_ocn_closure.py
@@ -0,0 +1,96 @@
"""Diagnose ocean closure: verify log-native term names are used correctly."""

import glob
import sys

sys.path.insert(0, ".")

from zppy_interfaces.budget_analysis.checks import OcnClosure # noqa: E402
from zppy_interfaces.budget_analysis.ingestion.ocn_parser import OcnParser # noqa: E402
from zppy_interfaces.budget_analysis.normalization import normalize # noqa: E402

LOG_PATH = "/pscratch/sd/c/chengzhu/zstash/archive/logs"
START_YEAR = 1
END_YEAR = 50

ocn = OcnParser().parse_files(
sorted(glob.glob(f"{LOG_PATH}/ocn.log.*.gz")), START_YEAR, END_YEAR
)

print(f"Total ocean rows: {len(ocn)}")
print()

# --- Water ---
water = ocn[ocn["quantity"] == "water"]
print("=== Water term counts ===")
print(water["term"].value_counts().to_string())
print()

mc = water[water["term"] == "Mass change"]
print(f"=== 'Mass change' rows: {len(mc)} ===")
if not mc.empty:
print(
mc[["time", "value", "units", "table_type", "period"]]
.head(12)
.to_string(index=False)
)
else:
print(" *** NOT FOUND ***")
print()

svf = water[water["term"] == "SUM VOLUME FLUXES"]
print(f"=== 'SUM VOLUME FLUXES' rows: {len(svf)} ===")
if not svf.empty:
print(
svf[["time", "value", "units", "table_type", "period"]]
.head(12)
.to_string(index=False)
)
else:
print(" *** NOT FOUND ***")
print()

# --- Heat ---
heat = ocn[ocn["quantity"] == "heat"]
print("=== Heat term counts ===")
print(heat["term"].value_counts().to_string())
print()

ec = heat[heat["term"] == "Energy change"]
print(f"=== 'Energy change' rows: {len(ec)} ===")
if not ec.empty:
print(
ec[["time", "value", "units", "table_type", "period"]]
.head(12)
.to_string(index=False)
)
else:
print(" *** NOT FOUND ***")
print()

shf = heat[heat["term"] == "SUM IMP+EXP HEAT FLUXES"]
print(f"=== 'SUM IMP+EXP HEAT FLUXES' rows: {len(shf)} ===")
if not shf.empty:
print(
shf[["time", "value", "units", "table_type", "period"]]
.head(12)
.to_string(index=False)
)
else:
print(" *** NOT FOUND ***")
print()

# --- Test closure checks ---
print("=== Testing OcnClosure checks ===")
df = normalize(ocn)

for q in ["water", "heat"]:
check = OcnClosure(quantity=q)
result = check.evaluate(df)
if result is not None:
print(
f" {q} closure: {len(result.years)} years,"
f" max |residual| = {abs(result.residual).max():.2e}"
)
else:
print(f" {q} closure: SKIPPED (missing data)")
43 changes: 43 additions & 0 deletions tests/unit/budget_analysis/print_monthly_interface.py
@@ -0,0 +1,43 @@
"""Print monthly normalized values for cpl and ocn interface comparison."""

import glob
import sys

import pandas as pd

sys.path.insert(0, ".")

from zppy_interfaces.budget_analysis.ingestion.cpl_parser import CplParser # noqa: E402
from zppy_interfaces.budget_analysis.ingestion.ocn_parser import OcnParser # noqa: E402
from zppy_interfaces.budget_analysis.normalization import normalize # noqa: E402

LOG_PATH = "/pscratch/sd/c/chengzhu/zstash/archive/logs"
START_YEAR = 1
END_YEAR = 50

cpl = CplParser().parse_files(
sorted(glob.glob(f"{LOG_PATH}/cpl.log.*.gz")), START_YEAR, END_YEAR
)
ocn = OcnParser().parse_files(
sorted(glob.glob(f"{LOG_PATH}/ocn.log.*.gz")), START_YEAR, END_YEAR
)
df = normalize(pd.concat([cpl, ocn], ignore_index=True))

cpl_m = df[
(df.source == "cpl")
& (df.term == "*SUM*")
& (df.component == "ocn")
& (df.period == "monthly")
].sort_values("time")

ocn_m = df[
(df.source == "ocn")
& (df.term == "SUM VOLUME FLUXES")
& (df.table_type == "flux")
& (df.period == "monthly")
].sort_values("time")

print("=== CPL monthly ocn *SUM* ===")
print(cpl_m[["time", "normalized_value"]].to_string(index=False))
print(f"\n=== OCN monthly SUM VOLUME FLUXES ({len(ocn_m)} rows) ===")
print(ocn_m[["time", "normalized_value"]].to_string(index=False))
84 changes: 84 additions & 0 deletions tests/unit/budget_analysis/test_atm_parser.py
@@ -0,0 +1,84 @@
"""Quick test script for AtmParser on a sample atm.log file."""

import sys

from zppy_interfaces.budget_analysis.ingestion.atm_parser import AtmParser

LOG_FILE = (
"/pscratch/sd/e/e3smtest/e3sm_scratch/pm-cpu/"
"SMS.ne4pg2_oQU480.F2010.pm-cpu_intel.eam-thetahy_ftype2_energy"
".C.JNextIntegration20260210_205258/run/"
"atm.log.48753126.260210-224733.gz"
)


def main():
parser = AtmParser()
log_files = [LOG_FILE]

# --- Raw per-step data ---
nstep_te, flux_diag = parser.parse_raw(log_files)

print("=== nstep_te (energy fixer) ===")
print(f"Shape: {nstep_te.shape}")
print(f"Columns: {list(nstep_te.columns)}")
print(nstep_te.head(3))
print("...")
print(nstep_te.tail(3))
print()

print("=== flux_diag (water/energy diagnostics) ===")
print(f"Shape: {flux_diag.shape}")
print(f"Columns: {list(flux_diag.columns)}")
print(flux_diag.head(3))
print("...")
print(flux_diag.tail(3))
print()

# --- Validation ---
print("=== Validation ===")
assert (
nstep_te.shape[0] == 122
), f"Expected 122 nstep_te rows, got {nstep_te.shape[0]}"
print(f" nstep_te rows: {nstep_te.shape[0]} (expected 122)")

assert (
flux_diag.shape[0] == 120
), f"Expected 120 flux_diag rows, got {flux_diag.shape[0]}"
print(f" flux_diag rows: {flux_diag.shape[0]} (expected 120)")

tw_first = flux_diag["tw"].iloc[0]
tw_last = flux_diag["tw"].iloc[-1]
print(f" W(n=1) = {tw_first:.6f} kg/m2 (expect ~25.303)")
print(f" W(n=120)= {tw_last:.6f} kg/m2 (expect ~24.949)")

e_diff_max = flux_diag["e_diff"].abs().max()
print(f" max |E difference| = {e_diff_max:.3e} W/m2")

# Check date tracking
print(
f" Date range: year {nstep_te['year'].min()}-{nstep_te['year'].max()}, "
f"month {nstep_te['month'].min()}-{nstep_te['month'].max()}, "
f"day {nstep_te['day'].min()}-{nstep_te['day'].max()}"
)
print()

# --- Tidy event table (annual) ---
events = parser.parse_files(log_files, 1, 1)
print("=== Tidy event table (annual, default) ===")
print(f"Shape: {events.shape}")
print(events.to_string())
print()

# --- Tidy event table (monthly) ---
monthly_parser = AtmParser(frequency="monthly")
events_m = monthly_parser.parse_files(log_files, 1, 1)
print("=== Tidy event table (monthly) ===")
print(f"Shape: {events_m.shape}")
print(events_m.to_string())

print("\nAll checks passed.")


if __name__ == "__main__":
sys.exit(main() or 0)