Is your feature request related to a problem?
zppy generates e3sm_diags scripts that rely on symlinks and local paths for input data. In particular, param.test_data_path is often set to a local directory (e.g., climo) without indicating how it maps to the actual data location.
This makes reproducing e3sm_diags issues difficult outside the original environment. To run the generated scripts elsewhere, we must first reverse-engineer the symlinks and then either:
- Update test_data_path to the correct absolute path, or
- Copy the files that the symlinks point to into the local directory referenced in the script (e.g., climo).
We also have to consider that e3sm_diags won't know which set of climo files to use, because zppy symlinks to the exact files needed based on the start_yr and end_yr configs.
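For reference, reverse-engineering the symlinks currently amounts to listing where each link in the local directory points, from inside the original environment. A minimal sketch of that step, using only the Python standard library; `map_symlinks` is a hypothetical helper written for illustration and is not part of zppy or e3sm_diags:

```python
import os
from pathlib import Path


def map_symlinks(directory):
    """Return {link name: resolved target} for each symlink directly inside `directory`.

    Hypothetical helper for illustration; regular files are skipped.
    """
    mapping = {}
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_symlink():
            # realpath follows the whole chain of links to the final location
            mapping[entry.name] = os.path.realpath(entry)
    return mapping
```

Calling `map_symlinks("climo")` in the run directory would show which files zppy selected, but only while the original environment is still accessible.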
Example
In this zppy-generated script:
https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/ac.forsyth2/zppy_weekly_comprehensive_v2_www/zppy-diags-1019-xc-break-test4/v2.LR.historical_0201/e3sm_diags/atm_monthly_180x360_aave/model_vs_obs_1982-1983/prov/e3sm.py
# Model
param.test_data_path = 'climo'
param.test_name = 'v2.LR.historical_0201'
param.short_test_name = short_name
The script cannot be run as-is without reconstructing the symlinks or copying the data locally.
Describe the solution you'd like
A simple way to improve reproducibility is to print the resolved test_data_path and reference_data_path configs used by all parameter objects in the console output (and therefore the log file). This would make the exact data configuration visible without needing to inspect generated scripts or reverse-engineer symlinks.
For example, extending the run header to include these paths:
E3SM Diagnostics Run
--------------------
Timestamp: 2025-12-19 11:50:03
Version Info: version v3.1.0
Results Path: model_vs_obs_1982-1983
Log Path: model_vs_obs_1982-1983/prov/e3sm_diags_run.log
Parameter Files Path: model_vs_obs_1982-1983/prov/cmd_used.txt
Python Script Path: model_vs_obs_1982-1983/prov/e3sm.py
Environment YML Path: model_vs_obs_1982-1983/prov/environment.yml
Provenance Index HTML Path: model_vs_obs_1982-1983/prov/index.html
Test Data Path: /absolute/path/to/test/data
Reference Data Path: /absolute/path/to/reference/data
This information would be preserved in the log file, making it much easier to reproduce and debug runs.
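One possible way to produce the two extra header lines is sketched below. It assumes that resolving each path with `os.path.realpath` is what "resolved" should mean here; `format_data_path_lines` is a hypothetical function for illustration, not an existing e3sm_diags API, and how it would be wired into the run header is left open:

```python
import os


def format_data_path_lines(test_data_path, reference_data_path):
    """Build header lines showing the resolved data paths.

    Hypothetical helper: the arguments stand in for param.test_data_path
    and param.reference_data_path on a parameter object.
    """
    rows = [
        ("Test Data Path", test_data_path),
        ("Reference Data Path", reference_data_path),
    ]
    # realpath makes relative paths absolute and follows any symlinks
    return [f"{label}: {os.path.realpath(path)}" for label, path in rows]
```

Note that if the directory itself is a real directory containing symlinked files, as may be the case for climo, resolving the directory alone would not reveal the original data location, so the files inside might need to be resolved as well.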
Describe alternatives you've considered
No response
Additional context
Related to: