Better matching and time interpolation by jsmariegaard · Pull Request #281 · DHI/modelskill

jsmariegaard · 2023-11-22T10:19:18Z

This PR removes the use of dataframes as a temporary solution when interpolating, and makes other improvements to the matching between observations and model results.


ecomodeller · 2023-11-24T16:47:52Z

modelskill/timeseries/_timeseries.py

+        self.data = data

-    def interp_time(self, new_time: pd.DatetimeIndex) -> TimeSeries:
+    def interp_time(


There seems to be a flaw in our design. 😳

It makes sense for a TimeSeries to be able to be interpolated.

It doesn't make sense to interpolate in track data, e.g. TrackModelResult and TrackObservation, since these points are scattered in a 3D space (time, y, x).

Thus, track data are not TimeSeries (at least in this sense) 🤔

Hmm, maybe we should have dedicated TrackTimeseries and PointTimeseries after all?
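A minimal sketch of what such a split could look like, assuming plain NumPy arrays as storage. The class and attribute names below are hypothetical and are not the modelskill API; the only point is that the point class exposes `interp_time`, while the track class deliberately does not.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PointTimeseries:
    """Hypothetical: values at a fixed location, indexed by time only."""

    time: np.ndarray  # datetime64, sorted ascending
    values: np.ndarray

    def interp_time(self, new_time: np.ndarray) -> "PointTimeseries":
        # Linear interpolation on seconds elapsed since the first timestamp.
        # np.interp holds the end values outside the range (no extrapolation).
        t0 = self.time[0]
        old_s = (self.time - t0) / np.timedelta64(1, "s")
        new_s = (new_time - t0) / np.timedelta64(1, "s")
        return PointTimeseries(new_time, np.interp(new_s, old_s, self.values))


@dataclass
class TrackTimeseries:
    """Hypothetical: points scattered in (time, y, x); no interp_time on purpose."""

    time: np.ndarray
    x: np.ndarray
    y: np.ndarray
    values: np.ndarray


point = PointTimeseries(
    time=np.array(["2023-01-01T00:00", "2023-01-01T01:00"], dtype="datetime64[s]"),
    values=np.array([0.0, 10.0]),
)
halfway = point.interp_time(np.array(["2023-01-01T00:30"], dtype="datetime64[s]"))
```

With this layout, code that needs time interpolation can require a `PointTimeseries`, and a static type checker would flag any attempt to interpolate a track.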


jsmariegaard · 2023-11-26T22:33:53Z

modelskill/matching.py

+            assert len(observation.time) > 0
+            mri = mr.interp_time(new_time=observation.time)
+        else:
+            # It doesn't make sense to interpolate track data in time


@ecomodeller could you please help with this part? We still need matching for track data, just a different kind than for point data. For track data we should make sure t, x and y match (within a tolerance): the model and observation series will not necessarily have the same length or exactly the same data, even though the observation data was used to extract data from the dfsu model result. The observation data will typically cover more than the model domain. Also, we have had problems in the past with rounding of position and time data.

I think we should re-evaluate how much of this track matching logic belongs in the core of modelskill.

If the observation data was used to extract model data, they should already match; otherwise the problem is in the extraction process. I assume it is possible to produce a matched dataset as part of the extraction and then pass this pre-matched data to modelskill with the modelskill.from_matched(...) function.

I made an attempt at creating some matching logic, but it is difficult since I don't fully understand the requirements.

I can understand two ends of the spectrum.

1. Extracted data that already matches, i.e. an equal number of rows.
2. The intersection of observation and model data in 3D (t, y, x) with a buffer applied to each point.

In order to do something in between, I need more information on what is expected from the data.
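As a rough illustration of the buffered end of the spectrum, here is a sketch using pandas. It assumes both tracks are dataframes with `time`, `x`, `y` and `value` columns; the helper name `match_track` and the default tolerances are hypothetical, not what modelskill does. Each observation is paired with the nearest model point in time, and the pair is kept only if x and y also agree within a tolerance.

```python
import numpy as np
import pandas as pd


def match_track(
    obs: pd.DataFrame,
    mod: pd.DataFrame,
    time_tolerance: pd.Timedelta = pd.Timedelta("1s"),
    spatial_tolerance: float = 1e-3,
) -> pd.DataFrame:
    """Hypothetical helper: keep obs/model pairs that agree in t, x and y."""
    # Pair each observation with the nearest model time within the tolerance
    df = pd.merge_asof(
        obs.sort_values("time"),
        mod.sort_values("time"),
        on="time",
        direction="nearest",
        tolerance=time_tolerance,
        suffixes=("_obs", "_mod"),
    )
    # Observations without a model partner (e.g. outside the domain) get NaN
    df = df.dropna(subset=["x_mod", "y_mod"])
    keep_x = np.abs(df.x_mod - df.x_obs) < spatial_tolerance
    keep_y = np.abs(df.y_mod - df.y_obs) < spatial_tolerance
    return df[keep_x & keep_y].reset_index(drop=True)


t = pd.date_range("2023-01-01", periods=4, freq="10s")
obs = pd.DataFrame({"time": t, "x": [10.0, 10.1, 10.2, 10.3], "y": 55.0, "value": 1.0})
mod = pd.DataFrame(
    {
        "time": t[:3],  # the model covers fewer points than the observation
        "x": [10.0, 10.1 + 1e-6, 10.5],  # rounding noise, then a real mismatch
        "y": 55.0,
        "value": 1.1,
    }
)
matched = match_track(obs, mod)
```

In this toy data the first two points survive: the third is rejected on position and the fourth has no model point close enough in time. Data that already matches row for row passes through unchanged, so the same function covers the first end of the spectrum as well.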

ecomodeller · 2023-11-28T15:14:38Z

$ mypy modelskill/timeseries/_timeseries.py --strict
Success: no issues found in 1 source file

$ mypy modelskill/matching.py --strict
Success: no issues found in 1 source file


jsmariegaard · 2023-12-01T13:01:46Z

modelskill/matching.py

-    if isinstance(obs, get_args(ObsInputType)):
-        cmp = _single_obs_compare(
-            obs,
+    obs = [obs] if isinstance(obs, get_args(ObsInputType)) else obs
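The added line normalizes a single observation into a one-element list so that the rest of the function only has to handle lists. A small standalone sketch of the same pattern follows; the `ObsInputType` union here is a hypothetical stand-in (the real one in modelskill is broader), and `as_obs_list` is an illustrative name.

```python
from typing import List, Union, get_args

# Hypothetical stand-in for the library's ObsInputType union
ObsInputType = Union[str, dict]


def as_obs_list(obs: Union[ObsInputType, List[ObsInputType]]) -> List[ObsInputType]:
    # get_args(Union[str, dict]) returns (str, dict), which isinstance accepts.
    # The check must come first: a str is itself iterable.
    return [obs] if isinstance(obs, get_args(ObsInputType)) else list(obs)


single = as_obs_list("station_a.dfs0")
several = as_obs_list(["station_a.dfs0", "station_b.dfs0"])
```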


jsmariegaard · 2023-12-01T13:09:14Z

modelskill/matching.py

+            keep_x = np.abs((df.x_mod - df.x_obs)) < spatial_tolerance
+            keep_y = np.abs((df.y_mod - df.y_obs)) < spatial_tolerance
+            df = df[keep_x & keep_y]
+            mri.data[name] = df[name]


Does this keep the attrs?


jsmariegaard and others added 21 commits November 22, 2023 10:30

Round all timeseries to 0.0001s

206e84d

First steps in using Timeseries instead of df for all raw_mod_data

b9e6185

use data["time"] when rounding

a6b8a4f

raw_mod_data is TimeSeries

435d9e2

fix renaming of time

bec1301

improved interp_time

6ccb8ea

TimeSeries argument in _remove_model_gaps

c75f901

Merge branch 'main' into better-matching

039a156

Don't parse input if _is_input_validated

3a6e0e7

use timeseries interp_time

24a882e

change type of raw_mod_data

123ea57

raw_data as dataset instead of dataframes and more

a2ee3a6

sel instead of loc; re-introduce _remove_model_gaps

0d9dae1

validate/parse raw_mod_data

f5031b8

raw_mod_data as series before plotting

ee34ec8

sel method

2a163d4

Update test_comparer.py

7c264b0

values

Type hints

8fc965b

Handle persisted data

2f9d574

Don't interpolate track data

d8d7034

Don't accept non-overlapping time

c7ea363

ecomodeller reviewed Nov 24, 2023


jsmariegaard added 4 commits November 24, 2023 21:10

Merge branch 'main' into better-matching

1e70140

let remove_bias support new data structure

7a4855c

new method _remove_non_matching_positions to replace _mask_model_outside_observation_track

72ac9c5

roll back - some how not working 🤔. We still need to make sure t, x and y matches! Satellite tracks can be outside model domain and therefore contain more data.

43b6b69

jsmariegaard commented Nov 26, 2023


ecomodeller added 3 commits November 28, 2023 09:46

PointModelResult can be interpolated

c0ceee4

Ignore tests related to Connector

c2593ef

Remove dead code

56557f0

ecomodeller added 6 commits November 28, 2023 10:40

Start and end are optional

466715e

Remove commented code etc.

aae0659

Static type checking

5931044

ModelResult is still a bit loose

2c8f064

Ignore connector related tests

5228e04

Strict static types

ead4826

ecomodeller and others added 11 commits November 28, 2023 16:32

More type hints

9eba28f

new argument keep_duplicates

1bd0109

refactor inline function

ef8d5b7

keep_duplicates argument wrappes ds.drop_duplicates(dim="time", keep=keep_duplicates)

1696cea

updated warning message

37553d6

Removed 22 duplicate timestamps after new default duplicates behavior

b846ef9

Removed 22 duplicate timestamps

4c11d64

new tiny tests; one is failing due to missing check of matching x, y @ecomodeller

4048d01

DRY

8ea10a6

Passing tests

00fa1b3

Refactor (+remove unused code)

b6ce4f8

jsmariegaard commented Dec 1, 2023


jsmariegaard and others added 4 commits December 1, 2023 15:01

use ds.sel() and extract method _select_overlapping_trackdata_with_tolerance

29caa30

Fix type hints

0b4671f

Full test coverage in matching

7c293d7

skip notebooks changed name

0a8963b

jsmariegaard marked this pull request as ready for review December 1, 2023 16:20

ecomodeller approved these changes Dec 1, 2023


jsmariegaard added 2 commits December 1, 2023 17:38

remove unused

e021202

move track tests

6914ab7

jsmariegaard merged commit 8d570c8 into main Dec 1, 2023

jsmariegaard deleted the better-matching branch December 1, 2023 21:16

jsmariegaard merged 51 commits into main from better-matching
