Conversation
nicholasloveday left a comment:

Thanks @thomaspagano. This is starting to look pretty good. It looks like it's not using the Jive test data. Once you update the tests to use the Jive test data, I'll take another look at the tests.
src/scores/probability/rev_impl.py (Outdated)

```python
    if not np.all(np.isin(unique_obs, [0, 1])):
        raise ValueError("obs must contain only 0, 1, or NaN values")

    return None
```
I don't think that you need return None. If you update this, then you can simplify the docstring.
Also, check_binary already exists, so you can just use https://github.com/nci/scores/blob/develop/src/scores/utils.py#L435
Removed all instances of return None, swapped to using check_binary, and deleted the redundant test.
The type hints on some of these validation functions still declare a return of None.
Additionally, _validate_rev_inputs has a misleading return type hint that should be removed.
```python
    return result


def calculate_climatology(
```
What are your thoughts on this going in a general file somewhere else? This feels like a useful tool that could be used for other things. Not sure where... perhaps in scores.processing? I'm interested in @tennlee's view.
I'm happy either way. Maybe leave it for now.
src/scores/probability/rev_impl.py (Outdated)

```python
if weights is not None:
    # Identify which dimensions weights actually span
    weight_reduce_dims = tuple(d for d in obs_reduce_dims if d in weights.dims)

    if weight_reduce_dims:
        # Use aggregate for weighted mean over dimensions that weights span
        climatology = aggregate(obs, reduce_dims=weight_reduce_dims, weights=weights, method="mean")

        # Unweighted mean over remaining dimensions
        remaining_dims = tuple(d for d in obs_reduce_dims if d not in weight_reduce_dims)
        if remaining_dims:
            climatology = aggregate(climatology, reduce_dims=remaining_dims, weights=None, method="mean")
    else:
        # Weights exist but don't span any reduction dimensions - ignore them
        climatology = aggregate(obs, reduce_dims=obs_reduce_dims, weights=None, method="mean")
else:
    # Simple unweighted mean
    climatology = aggregate(obs, reduce_dims=obs_reduce_dims, weights=None, method="mean")
```
The logic here can be simplified as aggregate already handles a lot of this (e.g., if reduce_dims=None, aggregate already returns the unweighted mean). Take a look at https://github.com/nci/scores/blob/develop/src/scores/processing/aggregation.py#L15
Consider whether you want to simplify the tests as a result of this or just leave them in to test each case.
Simplified as suggested.
This can be simplified further:

```python
dims_to_reduce = gather_dimensions(
    fcst_dims=(),  # Not needed for climatology
    obs_dims=obs.dims,
    reduce_dims=reduce_dims,
    preserve_dims=preserve_dims,
    weights_dims=weights.dims if weights is not None else None,
)
climatology = aggregate(
    obs,
    reduce_dims=dims_to_reduce,
    weights=weights,
    method="mean",
)
return climatology
```

From the tutorial notebook:

```python
n_times = 100
n_members = 50
# Generate ensemble with some skill
seed_temps = np.random.normal(1, 3, n_times)
```
I'm confused why this is called "seed_temps".
Comment added for clarity.
tests/probability/test_rev.py (Outdated)

```python
# Module-level test data ... Preserved as is because these come from Jive and we want to demonstrate
# that they've been carried over exactly.
```
This looks different to the test data in Jive... Can you please review and update?
@thomaspagano I also see that there are now a couple of conflicts that need resolving with the latest develop branch.
Force-pushed 5208ad5 to f8ff2bd.
Force-pushed f8ff2bd to ec7b905.
tests/probability/test_rev.py (Outdated)

```python
# 5: get rational_user
# Censored because scores code does not allow
# cost loss ratios to be optional, they are required
# (
#     rtd.FCST_2X3X2_WITH_NAN_MISALIGNED,
#     rtd.OBS_3X3_WITH_NAN_MISALIGNED,
#     [0, 0.3, 0.5, 0.7, 1],
#     None,
#     ["lead_day"],
#     ["maximum", "rational_user"],
#     None,
#     rtd.EXP_PREV_CASE5,
# ),
# # 6: don't supply derived_metrics, supply threshold_outputs
# (
#     rtd.FCST_2X3X2_WITH_NAN_MISALIGNED,
#     rtd.OBS_3X3_WITH_NAN_MISALIGNED,
#     [0, 0.3, 0.5, 0.7, 1],
#     None,
#     ["lead_day"],
#     ["rational_user"],
#     [0, 0.3, 0.5, 0.7, 1],
#     rtd.EXP_PREV_CASE6,
# ),
# issue #162 - dims accepts any iterable
```
Feel free to delete these commented-out tests.
src/scores/probability/rev_impl.py (Outdated)

```python
raise ValueError(
    f"Invalid derived_metrics value: '{mode}'. Valid options are 'maximum' and 'rational_user'"
)
```
This should already be checked in _validate_derived_metrics and isn't needed here.
src/scores/probability/rev_impl.py (Outdated)

```python
# Handle cost-loss ratios
if isinstance(cost_loss_ratios, (float, int)):
    cost_loss_ratios = [cost_loss_ratios]
cost_loss_xr = xr.DataArray(cost_loss_ratios, dims=[cost_loss_dim], coords={cost_loss_dim: cost_loss_ratios})
```
This conversion to an xarray object isn't needed.
src/scores/probability/rev_impl.py (Outdated)

```python
# False alarm rate (probability of false detection) equation 5

try:
    check_monotonic_array(cost_loss_ratios)
```
Why is there a separate check for cost_loss_ratios here in addition to _validate_cost_loss_ratios? Can't you just have one function that checks both?
src/scores/probability/rev_impl.py (Outdated)

```python
def _validate_dimensions(
    fcst: XarrayLike,
    obs: XarrayLike,
    weights: Optional[xr.DataArray],
    threshold_dim: str,
    cost_loss_dim: str,
) -> None:
    """Check for dimension conflicts in inputs."""
    # Dimension checks are always lazy - just check dims, not values
    # Check fcst dimensions
    if threshold_dim in fcst.dims:
        raise ValueError(f"'{threshold_dim}' cannot be a dimension in fcst")
    if cost_loss_dim in fcst.dims:
        raise ValueError(f"'{cost_loss_dim}' cannot be a dimension in fcst")

    # Check obs dimensions
    if threshold_dim in obs.dims:
        raise ValueError(f"'{threshold_dim}' cannot be a dimension in obs")
    if cost_loss_dim in obs.dims:
        raise ValueError(f"'{cost_loss_dim}' cannot be a dimension in obs")

    # Check weights dimensions
    if weights is not None:
        if threshold_dim in weights.dims:
            raise ValueError(f"'{threshold_dim}' cannot be a dimension in weights")
        if cost_loss_dim in weights.dims:
            raise ValueError(f"'{cost_loss_dim}' cannot be a dimension in weights")
```
You can simplify this like:

```python
inputs = [("fcst", fcst), ("obs", obs)]
if weights is not None:
    inputs.append(("weights", weights))
for dim_name in (threshold_dim, cost_loss_dim):
    for input_name, input_data in inputs:
        if dim_name in input_data.dims:
            raise ValueError(f"'{dim_name}' cannot be a dimension in {input_name}")
```
src/scores/probability/rev_impl.py (Outdated)

```python
    cost_loss_dim: str,
) -> None:
    """Check for dimension conflicts in inputs."""
    # Dimension checks are always lazy - just check dims, not values
```
Remove comment about "lazy".
src/scores/probability/rev_impl.py (Outdated)

```python
    Raises ValueError if invalid values are present.
    """

    check_binary(obs, "obs")
```
Any reason why we need a wrapper around check_binary? Why not just call check_binary directly?
src/scores/probability/rev_impl.py (Outdated)

```python
    binary_fcst = binary_discretise(fcst, threshold, ">=")

    # Rename the threshold dimension if needed
    if "threshold" in binary_fcst.dims and threshold_dim != "threshold":
```
Can this be simplified to:

```diff
- if "threshold" in binary_fcst.dims and threshold_dim != "threshold":
+ if threshold_dim != "threshold":
```
tests/probability/test_rev.py (Outdated)

```python
        fcst = xr.DataArray([0.2, 0.8, 0.6], dims=["time"])
        obs = xr.DataArray([0, 1, 1], dims=["time"])

        try:
```
Instead of using try/except in tests, can you use pytest.raises instead?
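A sketch of the pytest.raises pattern; the validate_ratios helper below is a stand-in for the actual code under test, not part of scores:

```python
import pytest


def validate_ratios(ratios):
    # Stand-in for the validation being tested (hypothetical helper).
    if not ratios:
        raise ValueError("cost_loss_ratios are required")


def test_missing_ratios_raise():
    # pytest.raises fails the test automatically if no exception (or an
    # exception of the wrong type) is raised, unlike a hand-rolled
    # try/except block which can silently pass.
    with pytest.raises(ValueError, match="required"):
        validate_ratios([])
```

The optional `match` argument also asserts on the error message, covering the string checks the try/except versions did by hand.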
tests/probability/test_rev.py (Outdated)

```python
        cost_loss_ratios = [0.5]
        derived_metrics = ["pizza_oven"]

        try:
```
Use pytest.raises instead.
```python
        # The 'rational_user' DataArray should not contain the threshold_dim anymore
        assert "threshold" not in result["rational_user"].dims

    def test_rational_user_no_threshold_in_coords(self):
```
Is this test needed? Is there a way that the threshold coord could be present?
I might leave it in. It's testing that threshold coord is not present which is a good thing. It's showing that the code is doing work.
tests/probability/test_rev.py (Outdated)

```python
    def test_custom_dimension_names(self):
        """Test that custom dimension names work correctly."""
        fcst = xr.DataArray([0.2, 0.8, 0.6, 0.4], dims=["time"])
        obs = xr.DataArray([0, 1, 1, 0], dims=["time"])

        # Test custom threshold_dim only
        result = relative_economic_value(
            fcst,
            obs,
            cost_loss_ratios=[0.3, 0.7],
            threshold=[0.5],
            threshold_dim="my_threshold",
        )

        assert "my_threshold" in result.dims
        assert "cost_loss_ratio" in result.dims
        assert result.dims == ("my_threshold", "cost_loss_ratio")

    def test_custom_cost_loss_dim(self):
        """Test that custom cost_loss_dim works correctly."""
        fcst = xr.DataArray([0.2, 0.8, 0.6, 0.4], dims=["time"])
        obs = xr.DataArray([0, 1, 1, 0], dims=["time"])

        # Test custom cost_loss_dim only
        result = relative_economic_value(
            fcst,
            obs,
            cost_loss_ratios=[0.3, 0.7],
            threshold=[0.5],
            cost_loss_dim="alpha",
        )

        assert "threshold" in result.dims
        assert "alpha" in result.dims
        assert result.dims == ("threshold", "alpha")

    def test_both_custom_dimension_names(self):
        """Test that both custom dimension names work together."""
        fcst = xr.DataArray([0.2, 0.8, 0.6, 0.4], dims=["time"])
        obs = xr.DataArray([0, 1, 1, 0], dims=["time"])

        # Test both custom dims
        result = relative_economic_value(
            fcst,
            obs,
            cost_loss_ratios=[0.3, 0.7],
            threshold=[0.5],
            threshold_dim="decision_threshold",
            cost_loss_dim="alpha",
        )

        assert "decision_threshold" in result.dims
        assert "alpha" in result.dims
        assert result.dims == ("decision_threshold", "alpha")
        assert "threshold" not in result.dims
        assert "cost_loss_ratio" not in result.dims
```
Can you consolidate these tests above using @pytest.mark.parametrize?
Additionally, you can simplify the assert statements to just be assert result.dims == expected_dims.
For example in test_custom_dimension_names you have:

```python
assert "my_threshold" in result.dims
assert "cost_loss_ratio" in result.dims
assert result.dims == ("my_threshold", "cost_loss_ratio")
```

The first two asserts aren't needed as they're covered by the third assert.
tests/probability/test_rev.py (Outdated)

```python
class TestWeights:
    """Tests for handling of time/spatial weights in REV calculations."""

    @pytest.mark.parametrize("weight_value", [0.5, 1, 2])
```
Not sure that this needs @pytest.mark.parametrize. Uniform weights all lead to the same result, so we probably don't need to test it 3 times.
tests/probability/test_rev.py (Outdated)

```python
        xr.testing.assert_allclose(rev_weighted, rev_unweighted)

    def test_weights_nonuniform(self, make_contingency_data):
```
Non-uniform weights are already tested in test_spatial_weights_broadcast, so this probably isn't needed.
tests/probability/test_rev.py (Outdated)

```python
        )
        xr.testing.assert_allclose(result, expected, atol=0.001)

    def test_weights_negative(self, make_contingency_data):
```
Should this just be included in the main input validation test?
tests/probability/test_rev.py (Outdated)

```python
        assert "dimension 'cost_loss_ratio' must not be in input data" in str(excinfo.value)

    def test_check_monotonic_array_non_convertible_raises_typeerror(self):
```
Combine with test_out_of_range_arrays using pytest.mark.parametrize.
tests/probability/test_rev.py (Outdated)

```python
        with pytest.raises(TypeError, match="could not convert array into a numpy ndarray of floats"):
            check_monotonic_array(["not", "numbers"])

    def test_check_monotonic_array_rejects_multidimensional_arrays(self):
```
Combine with test_out_of_range_arrays using pytest.mark.parametrize.
tests/probability/test_rev.py (Outdated)

```python
            check_args=True,
        )

    def test_fcst_outside_01_with_threshold(self):
```
Incorporate into test_input_validation.
Thanks @thomaspagano, I have made several more suggestions with a particular focus on ideas on how to simplify/clean up the code (including tests) further, to make it easier to maintain.
As requested, a version of REV without any support for dask whatsoever. Note that when run with dask it doesn't complain; should it?
Please work through the following checklists. Delete anything that isn't relevant.

Development for new xarray-based metrics
- Includes `reduce_dims`, `preserve_dims`, and `weights` args
- Works with both `xr.DataArrays` and `xr.Datasets` if possible
- Docstrings

Testing of new xarray-based metrics
- Tested with both `xr.DataArrays` and `xr.Datasets`

Tutorial notebook

Documentation