Conversation
Pre-commit checks not passing yet One test still failing Demonstrates approach to move cra_image to use an xarray data structure
Update some tests and some handling in the implementation
| def cra( | ||
| fcst: xr.DataArray, | ||
| obs: xr.DataArray, | ||
| fcst: XarrayLike, |
There was a problem hiding this comment.
When testing these changes, the XarrayLike part failed. Did you mean ArrayLike instead? I tried ArrayLike and then adding the importing from numpy.typing import ArrayLike and it worked.
There was a problem hiding this comment.
I don't have the full context in this PR, but just to add:
numpy's ArrayLike is a bit of a catch-all, we want to avoid using that if we have a more specific type-interface for that particular argument.
Co-authored-by: esteban-abellan <esteban.abellan@gmail.com> Signed-off-by: Tennessee Leeuwenburg <134973832+tennlee@users.noreply.github.com>
Delete unnecessary code Continue refactoring
Various test fixes
Delete test case for exception no longer generated
One failing unit test, think the code needs some more tweaking for 2D image slices
Tests passing
Update CRA method to take max_distance_approx and add notes about distance calculations
reza-armuei
left a comment
There was a problem hiding this comment.
Thanks @esteban-abellan and @tennlee
I have added several comments, but the main points are:
1- There is an issue in how the optimal shift (that minimises the MSE) is computed. The shift_range is currently hard‑coded, which can lead to cases where the shift that yields the minimum MSE lies outside the allowed maximum shift provided by the user. In such cases, the code discards the optimal shift and returns the unshifted fcst.
However, there may still exist a valid shift within the user-specified maximum range that would reduce the MSE compared to the unshifted forecast. Defining shift_range dynamically based on the user-provided maximum allowed shift would resolve this problem.
2- Several unit tests are duplicated or test essentially the same behaviour. These could be consolidated to avoid redundancy.
3- The current unit tests mostly verify code mechanics rather than validating scientific correctness. We should add a couple of tests that check whether the function returns the expected total MSE and its decomposed components.
I will work on addressing these issues in a separate PR. Depending on my availability, I may be able to get to it soon.
| obs (xr.DataArray): 2-D observation field. | ||
| threshold (float): Minimum value that a grid point must meet or exceed to be considered | ||
| minimum_intensity (float): Minimum value that a grid point must meet or exceed to be considered |
There was a problem hiding this comment.
To keep the consistency with the rest of scores, no need to include typehints in docstrings.
| # Assign a unique label to each connected component. For instance, if there are 3 separate | ||
| # blobs in our array, each blob will be assigned a different label (e.g., 1, 2, 3) | ||
| labeled_array_obs, num_features_obs = scipy.ndimage.label( | ||
| ~np.isnan(masked_obs), structure=structure | ||
| ) # labels the connected components in the masked array where the values are not NaN | ||
| if num_features_obs > 1: | ||
| # Find the largest blob | ||
| largest_blob_label_obs = np.argmax(np.bincount(labeled_array_obs.flat)[1:]) + 1 | ||
|
|
||
| # Create a new masked array with only the largest blob | ||
| obs = masked_obs.where(labeled_array_obs == largest_blob_label_obs) | ||
| else: | ||
| obs = masked_obs | ||
|
|
||
| labeled_array_fcst, num_features_fcst = scipy.ndimage.label( | ||
| ~np.isnan(masked_fcst), structure=structure | ||
| ) # labels the connected components in the masked array where the values are not NaN | ||
| if num_features_fcst > 1: | ||
| # Find the largest blob | ||
| largest_blob_label_fcst = np.argmax(np.bincount(labeled_array_fcst.flat)[1:]) + 1 | ||
|
|
||
| # Create a new masked array with only the largest blob | ||
| fcst = masked_fcst.where(labeled_array_fcst == largest_blob_label_fcst) | ||
| else: | ||
| fcst = masked_fcst |
There was a problem hiding this comment.
Since the logic for finding the largest blob is duplicated (once for observations and once for forecasts), would it make sense to extract it into a separate helper function and call it for both cases?
| fcst (xr.DataArray): Forecast field. | ||
| obs (xr.DataArray): Observation field. | ||
| x_name (str): Name of the zonal spatial dimension (e.g., 'x' or 'longitude'). | ||
| y_name (str): Name of the meridional spatial dimension (e.g., 'y' or 'latitude'). | ||
| max_distance (float) : Maximum distance in km allowed for the shifted blob. | ||
| coord_units (str) : coordinates units, 'degrees' or 'metres' |
There was a problem hiding this comment.
Same here. No need for including typehints in docstrings.
| def test_cra_image_shape_mismatch_raises_valueerror(): | ||
| """Shape mismatches should raise ValueError (parity with cra_image behavior).""" | ||
| fcst = create_array(shape=(10, 10)) | ||
| obs = create_array(shape=(8, 10)) # mismatched shape | ||
|
|
||
| with pytest.raises(ValueError): | ||
| cra_core_2d(fcst, obs, threshold=5.0, y_name="y", x_name="x") | ||
|
|
||
|
|
||
| def test_cra_core_2d_invalid_input_types_raise_typeerror(): | ||
| """Non-xarray inputs should raise TypeError (parity with cra_2d behavior).""" | ||
| obs = create_array() | ||
| with pytest.raises(TypeError): | ||
| cra_core_2d("invalid", obs, threshold=5.0, y_name="y", x_name="x") | ||
|
|
||
| fcst = create_array() | ||
| with pytest.raises(TypeError): | ||
| cra_core_2d(fcst, "invalid", threshold=5.0, y_name="y", x_name="x") | ||
| _cra_image(fcst, obs, minimum_intensity=5.0, y_name="y", x_name="x") |
There was a problem hiding this comment.
I beleive this has already been tested.
| Cover branch: | ||
| if rmse_shifted > rmse_original or corr_shifted < corr_original or mse_shifted > original_mse: | ||
| return None, None, None | ||
| by monkeypatching metric functions to force the condition true. | ||
| """ |
There was a problem hiding this comment.
I am not sure about return None, None, None as I think the _translate_forecast_region should return un-shifted forecast with 0 values for shift in x an y for a case when shifting forecats can result in an increase in error (here based on MSE).
| Cover branch: | |
| if mse_shifted > mse_original or corr_shifted < corr_original or mse_shifted > original_mse: | |
| return None, None, None | |
| by monkeypatching metric functions to force the condition true. | |
| """ |
| assert np.isnan(result.mse_total) | ||
|
|
||
|
|
||
| def test_cra_time_val_conversion_int_and_str(monkeypatch): |
There was a problem hiding this comment.
I am not sure if this test is required.
| # Monkeypatch .sel to avoid KeyError when datetime64 is passed | ||
| original_sel = xr.DataArray.sel | ||
|
|
||
| def safe_sel(self, indexers=None, drop=False): |
There was a problem hiding this comment.
Is this used anywhere in this test?
|
|
||
|
|
||
| def test_cra_core_2d_returns_none_when_shifted_fcst_is_none(monkeypatch): | ||
| def test_cra_image_returns_none_when_shifted_fcst_is_none(): |
There was a problem hiding this comment.
I think here _cra_image returns NaN because the fcst and obs fields contain fewer points than the default min_points value of 10. So I'm not sure whether this is the intended to test this behaviour?
Co-authored-by: reza-armuei <144857501+reza-armuei@users.noreply.github.com> Signed-off-by: Tennessee Leeuwenburg <134973832+tennlee@users.noreply.github.com>
Co-authored-by: reza-armuei <144857501+reza-armuei@users.noreply.github.com> Signed-off-by: Tennessee Leeuwenburg <134973832+tennlee@users.noreply.github.com>
Demonstrates a possible approach to moving CRA towards using an xarray return type
Only applied to cra_2d (now cra_image) for the time being
Feedback welcome