
Adding in REV without dask support #999

Open
thomaspagano wants to merge 5 commits into nci:develop from thomaspagano:issue-632-REV-isolated-rev-without-dask

Conversation

@thomaspagano
Contributor

As requested, here is a version of REV without any support for dask whatsoever. Note that when run with dask it doesn't complain; should it?

Please work through the following checklists. Delete anything that isn't relevant.

Development for new xarray-based metrics

  • Works with n-dimensional data and includes reduce_dims, preserve_dims, and weights args.
  • Typehints added
  • Add error handling
  • Imported into the API
  • Works with both xr.DataArrays and xr.Datasets if possible

Docstrings

  • Docstrings complete and follow Napoleon (Google) style
  • Maths equation added
  • Reference to paper/webpage is in docstring. The preferred referencing style for journal articles is APA (7th edition)
  • Code example added

Testing of new xarray-based metrics

  • 100% unit test coverage
  • [NO BY DESIGN] Test that metric is compatible with dask.
  • Test that metrics work with inputs that contain NaNs
  • Test that broadcasting with xarray works
  • Test both reduce and preserve dims arguments work
  • Test that errors are raised as expected
  • Test that it works with both xr.DataArrays and xr.Datasets

Tutorial notebook

  • Short introduction to why you would use that metric and what it tells you
  • A link to a reference
  • A "things to try next" section at the end
  • Add notebook to Tutorial_Gallery.ipynb
  • Optional - a detailed discussion of how the metric works at the end of the notebook

Documentation

@thomaspagano thomaspagano marked this pull request as ready for review February 11, 2026 01:59
Collaborator

@nicholasloveday nicholasloveday left a comment


Thanks @thomaspagano . This is starting to look pretty good. It looks like it's not using the Jive test data. Once you update the tests to use the Jive test data, I'll take another look at the tests.

    if not np.all(np.isin(unique_obs, [0, 1])):
        raise ValueError("obs must contain only 0, 1, or NaN values")

    return None
Collaborator

I don't think that you need return None. If you update this, then you can simplify the docstring.

Collaborator

Also, check_binary already exists, so you can just use https://github.com/nci/scores/blob/develop/src/scores/utils.py#L435

Contributor Author

Removed all instances of return None, swapped to using check_binary and deleted redundant test.

Collaborator

The typehints on some of these validation functions still indicate a return of None.

Additionally, _validate_rev_inputs has a misleading return typehint that should be removed.


Changed as suggested

return result


def calculate_climatology(
Collaborator

What are your thoughts on this going in a general file somewhere else? This feels like a useful tool that could be used for other things. Not sure where... perhaps in scores.processing? I'm interested in @tennlee's view.


I'm happy either way. Maybe leave it for now.

Comment on lines 284 to 301
    if weights is not None:
        # Identify which dimensions weights actually span
        weight_reduce_dims = tuple(d for d in obs_reduce_dims if d in weights.dims)

        if weight_reduce_dims:
            # Use aggregate for weighted mean over dimensions that weights span
            climatology = aggregate(obs, reduce_dims=weight_reduce_dims, weights=weights, method="mean")

            # Unweighted mean over remaining dimensions
            remaining_dims = tuple(d for d in obs_reduce_dims if d not in weight_reduce_dims)
            if remaining_dims:
                climatology = aggregate(climatology, reduce_dims=remaining_dims, weights=None, method="mean")
        else:
            # Weights exist but don't span any reduction dimensions - ignore them
            climatology = aggregate(obs, reduce_dims=obs_reduce_dims, weights=None, method="mean")
    else:
        # Simple unweighted mean
        climatology = aggregate(obs, reduce_dims=obs_reduce_dims, weights=None, method="mean")
Collaborator

The logic here can be simplified, as aggregate already handles a lot of this (e.g., if reduce_dims=None, aggregate already returns the unweighted mean). Take a look at https://github.com/nci/scores/blob/develop/src/scores/processing/aggregation.py#L15

Consider if you want to simplify the tests as a result of this or just leave them in to test each case.

Contributor Author

Simplified as suggested

Collaborator

This can be simplified further:

    dims_to_reduce = gather_dimensions(
        fcst_dims=(),  # Not needed for climatology
        obs_dims=obs.dims,
        reduce_dims=reduce_dims,
        preserve_dims=preserve_dims,
        weights_dims=weights.dims if weights is not None else None,
    )
    climatology = aggregate(
        obs,
        reduce_dims=dims_to_reduce,
        weights=weights,
        method="mean",
    )
    return climatology


Accepted as suggested

    n_times = 100
    n_members = 50
    # Generate ensemble with some skill
    seed_temps = np.random.normal(1, 3, n_times)
Collaborator

I'm confused why this is called "seed_temps"

Contributor Author

Comment added for clarity

Comment on lines 27 to 28
# Module-level test data ... Preserved as is because these come from Jive and we want to demonstrate
# that they've been carried over exactly.
Collaborator

This looks different to the test data in Jive... Can you please review and update?


Hopefully it's all good now?

@nicholasloveday
Collaborator

@thomaspagano I also see that there are now a couple of conflicts that need resolving with the latest develop branch.

@thomaspagano thomaspagano force-pushed the issue-632-REV-isolated-rev-without-dask branch from 5208ad5 to f8ff2bd Compare February 23, 2026 03:21
@thomaspagano thomaspagano force-pushed the issue-632-REV-isolated-rev-without-dask branch from f8ff2bd to ec7b905 Compare February 23, 2026 03:25
Comment on lines 1517 to 1541
    # 5: get rational_user
    # Censored because scores code does not allow
    # cost loss ratios to be optional, they are required
    # (
    #     rtd.FCST_2X3X2_WITH_NAN_MISALIGNED,
    #     rtd.OBS_3X3_WITH_NAN_MISALIGNED,
    #     [0, 0.3, 0.5, 0.7, 1],
    #     None,
    #     ["lead_day"],
    #     ["maximum", "rational_user"],
    #     None,
    #     rtd.EXP_PREV_CASE5,
    # ),
    # # 6: don't supply derived_metrics, supply threshold_outputs
    # (
    #     rtd.FCST_2X3X2_WITH_NAN_MISALIGNED,
    #     rtd.OBS_3X3_WITH_NAN_MISALIGNED,
    #     [0, 0.3, 0.5, 0.7, 1],
    #     None,
    #     ["lead_day"],
    #     ["rational_user"],
    #     [0, 0.3, 0.5, 0.7, 1],
    #     rtd.EXP_PREV_CASE6,
    # ),
# issue #162 - dims accepts any iterable
Collaborator

Feel free to delete these commented out tests


Deleted as suggested

Comment on lines 335 to 336
    raise ValueError(
        f"Invalid derived_metrics value: '{mode}'. Valid options are 'maximum' and 'rational_user'"
Collaborator

This should already be checked in _validate_derived_metrics and isn't needed here


Changed as suggested

    # Handle cost-loss ratios
    if isinstance(cost_loss_ratios, (float, int)):
        cost_loss_ratios = [cost_loss_ratios]
    cost_loss_xr = xr.DataArray(cost_loss_ratios, dims=[cost_loss_dim], coords={cost_loss_dim: cost_loss_ratios})
Collaborator

This conversion to an xarray object isn't needed.


Changed as suggested.

    # False alarm rate (probability of false detection) equation 5

    try:
        check_monotonic_array(cost_loss_ratios)
Collaborator

Why is there a separate check for cost_loss_ratios here in addition to _validate_cost_loss_ratios? Can't you just have one function that checks both?


Changed as suggested

Comment on lines 19 to 45
def _validate_dimensions(
    fcst: XarrayLike,
    obs: XarrayLike,
    weights: Optional[xr.DataArray],
    threshold_dim: str,
    cost_loss_dim: str,
) -> None:
    """Check for dimension conflicts in inputs."""
    # Dimension checks are always lazy - just check dims, not values
    # Check fcst dimensions
    if threshold_dim in fcst.dims:
        raise ValueError(f"'{threshold_dim}' cannot be a dimension in fcst")
    if cost_loss_dim in fcst.dims:
        raise ValueError(f"'{cost_loss_dim}' cannot be a dimension in fcst")

    # Check obs dimensions
    if threshold_dim in obs.dims:
        raise ValueError(f"'{threshold_dim}' cannot be a dimension in obs")
    if cost_loss_dim in obs.dims:
        raise ValueError(f"'{cost_loss_dim}' cannot be a dimension in obs")

    # Check weights dimensions
    if weights is not None:
        if threshold_dim in weights.dims:
            raise ValueError(f"'{threshold_dim}' cannot be a dimension in weights")
        if cost_loss_dim in weights.dims:
            raise ValueError(f"'{cost_loss_dim}' cannot be a dimension in weights")
Collaborator

You can simplify this like:

    inputs = [("fcst", fcst), ("obs", obs)]
    if weights is not None:
        inputs.append(("weights", weights))

    for dim_name in (threshold_dim, cost_loss_dim):
        for input_name, input_data in inputs:
            if dim_name in input_data.dims:
                raise ValueError(f"'{dim_name}' cannot be a dimension in {input_name}")


Changed as suggested

    cost_loss_dim: str,
) -> None:
    """Check for dimension conflicts in inputs."""
    # Dimension checks are always lazy - just check dims, not values
Collaborator

Remove comment about "lazy"


No longer relevant

    Raises ValueError if invalid values are present.
    """

    check_binary(obs, "obs")
Collaborator

Any reason why we need a wrapper around check_binary? Why not just call check_binary directly?


Changed as suggested

binary_fcst = binary_discretise(fcst, threshold, ">=")

# Rename the threshold dimension if needed
if "threshold" in binary_fcst.dims and threshold_dim != "threshold":
Collaborator

Can this be simplified to:

Suggested change:

    - if "threshold" in binary_fcst.dims and threshold_dim != "threshold":
    + if threshold_dim != "threshold":


Accepted as suggested

fcst = xr.DataArray([0.2, 0.8, 0.6], dims=["time"])
obs = xr.DataArray([0, 1, 1], dims=["time"])

try:
Collaborator

Instead of using try/except in tests, can you use pytest.raises instead?


changed as suggested
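The pytest.raises pattern the reviewer asks for can be sketched standalone. The `validate_derived_metrics` helper below is a hypothetical stand-in for the PR's validation (so the example runs without scores installed); "pizza_oven" is the invalid option from the review thread:

```python
import pytest


def validate_derived_metrics(modes):
    """Hypothetical stand-in for the PR's derived_metrics validation."""
    for mode in modes:
        if mode not in ("maximum", "rational_user"):
            raise ValueError(f"Invalid derived_metrics value: '{mode}'")


def test_invalid_derived_metric_raises():
    # pytest.raises replaces the manual try/except/assert-on-message pattern,
    # and fails the test automatically if no exception is raised at all.
    with pytest.raises(ValueError, match="pizza_oven"):
        validate_derived_metrics(["pizza_oven"])
```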

cost_loss_ratios = [0.5]
derived_metrics = ["pizza_oven"]

try:
Collaborator

Use pytest.raises instead


changed as suggested

# The 'rational_user' DataArray should not contain the threshold_dim anymore
assert "threshold" not in result["rational_user"].dims

def test_rational_user_no_threshold_in_coords(self):
Collaborator

Is this test needed? Is there a way that threshold coord is not present?


I might leave it in. It's testing that threshold coord is not present which is a good thing. It's showing that the code is doing work.

Comment on lines 660 to 716
def test_custom_dimension_names(self):
    """Test that custom dimension names work correctly."""
    fcst = xr.DataArray([0.2, 0.8, 0.6, 0.4], dims=["time"])
    obs = xr.DataArray([0, 1, 1, 0], dims=["time"])

    # Test custom threshold_dim only
    result = relative_economic_value(
        fcst,
        obs,
        cost_loss_ratios=[0.3, 0.7],
        threshold=[0.5],
        threshold_dim="my_threshold",
    )

    assert "my_threshold" in result.dims
    assert "cost_loss_ratio" in result.dims
    assert result.dims == ("my_threshold", "cost_loss_ratio")

def test_custom_cost_loss_dim(self):
    """Test that custom cost_loss_dim works correctly."""
    fcst = xr.DataArray([0.2, 0.8, 0.6, 0.4], dims=["time"])
    obs = xr.DataArray([0, 1, 1, 0], dims=["time"])

    # Test custom cost_loss_dim only
    result = relative_economic_value(
        fcst,
        obs,
        cost_loss_ratios=[0.3, 0.7],
        threshold=[0.5],
        cost_loss_dim="alpha",
    )

    assert "threshold" in result.dims
    assert "alpha" in result.dims
    assert result.dims == ("threshold", "alpha")

def test_both_custom_dimension_names(self):
    """Test that both custom dimension names work together."""
    fcst = xr.DataArray([0.2, 0.8, 0.6, 0.4], dims=["time"])
    obs = xr.DataArray([0, 1, 1, 0], dims=["time"])

    # Test both custom dims
    result = relative_economic_value(
        fcst,
        obs,
        cost_loss_ratios=[0.3, 0.7],
        threshold=[0.5],
        threshold_dim="decision_threshold",
        cost_loss_dim="alpha",
    )

    assert "decision_threshold" in result.dims
    assert "alpha" in result.dims
    assert result.dims == ("decision_threshold", "alpha")
    assert "threshold" not in result.dims
    assert "cost_loss_ratio" not in result.dims

Collaborator

Can you consolidate these tests above using @pytest.mark.parametrize

Additionally, you can simplify the assert statements to just be assert result.dims == expected_dims
For example in test_custom_dimension_names you have:

        assert "my_threshold" in result.dims
        assert "cost_loss_ratio" in result.dims
        assert result.dims == ("my_threshold", "cost_loss_ratio")

The first two asserts aren't needed as they're covered by the third assert


Changed as suggested
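The consolidation with @pytest.mark.parametrize can be sketched as below. To keep the example self-contained, `relative_economic_value_stub` is a hypothetical stand-in that only reproduces the output dimension names; the real metric lives in scores and takes fcst/obs data:

```python
import pytest
from types import SimpleNamespace


def relative_economic_value_stub(threshold_dim="threshold", cost_loss_dim="cost_loss_ratio"):
    """Hypothetical stub: returns an object exposing only the result dims."""
    return SimpleNamespace(dims=(threshold_dim, cost_loss_dim))


@pytest.mark.parametrize(
    "kwargs,expected_dims",
    [
        ({"threshold_dim": "my_threshold"}, ("my_threshold", "cost_loss_ratio")),
        ({"cost_loss_dim": "alpha"}, ("threshold", "alpha")),
        (
            {"threshold_dim": "decision_threshold", "cost_loss_dim": "alpha"},
            ("decision_threshold", "alpha"),
        ),
    ],
)
def test_custom_dimension_names(kwargs, expected_dims):
    result = relative_economic_value_stub(**kwargs)
    # One equality assert covers the individual membership asserts too.
    assert result.dims == expected_dims
```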

class TestWeights:
    """Tests for handling of time/spatial weights in REV calculations."""

    @pytest.mark.parametrize("weight_value", [0.5, 1, 2])
Collaborator

Not sure that this needs @pytest.mark.parametrize. Uniform weights all lead to the same result, so we probably don't need to test it 3 times.


Accepted as suggested
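The reviewer's point holds because uniform weights cancel in a weighted mean, so all three parametrized values exercise the same code path. A small numpy demonstration (toy data, not the PR's fixtures):

```python
import numpy as np

# sum(x * w) / sum(w) = w * sum(x) / (w * n) = mean(x) when w is constant,
# so weight values 0.5, 1 and 2 all give the unweighted result.
data = np.array([0.2, 0.8, 0.6, 0.4])
unweighted = data.mean()
for w in (0.5, 1.0, 2.0):
    weights = np.full_like(data, w)
    weighted = (data * weights).sum() / weights.sum()
    assert np.isclose(weighted, unweighted)
```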


xr.testing.assert_allclose(rev_weighted, rev_unweighted)

def test_weights_nonuniform(self, make_contingency_data):
Collaborator

Non-uniform weights are already tested in test_spatial_weights_broadcast, so this probably isn't needed.


Accepted as suggested

)
xr.testing.assert_allclose(result, expected, atol=0.001)

def test_weights_negative(self, make_contingency_data):
Collaborator

Should this just be included in the main input validation test?


moved as suggested


assert "dimension 'cost_loss_ratio' must not be in input data" in str(excinfo.value)

def test_check_monotonic_array_non_convertible_raises_typeerror(self):
Collaborator

Combine with test_out_of_range_arrays using pytest.mark.parametrize


accepted as suggested

    with pytest.raises(TypeError, match="could not convert array into a numpy ndarray of floats"):
        check_monotonic_array(["not", "numbers"])

def test_check_monotonic_array_rejects_multidimensional_arrays(self):
Collaborator

Combine with test_out_of_range_arrays using pytest.mark.parametrize


accepted as suggested

check_args=True,
)

def test_fcst_outside_01_with_threshold(self):
Collaborator

Incorporate into test_input_validation


accepted as suggested

@nicholasloveday
Collaborator

Thanks @thomaspagano, I have made several more suggestions with a particular focus on ideas on how to simplify and clean up the code (including tests) further to make it easier to maintain.


3 participants