2 changes: 1 addition & 1 deletion docs/api/comparercollection.md
@@ -3,6 +3,6 @@
The `ComparerCollection` is one of the main objects of the `modelskill` package. It is a collection of `Comparer` objects and is returned by the `match()` method of the `Model` class.


::: modelskill.comparison.ComparerCollection
::: modelskill.ComparerCollection

::: modelskill.comparison._collection_plotter.ComparerCollectionPlotter
11 changes: 5 additions & 6 deletions docs/getting-started.md
@@ -20,8 +20,7 @@ The typical ModelSkill workflow consists of these five steps:

The result of a MIKE 21/3 simulation is stored in one or more dfs files.
The most common formats are .dfsu for distributed data and .dfs0 for
time series point data. A ModelSkill
[ModelResult](api/model.md#modelskill.model.PointModelResult) is defined by the
time series point data. A ModelSkill ModelResult is defined by the
result file path and a name:

```python
@@ -38,8 +37,8 @@ created. The data will be read later.
The next step is to define the measurements to be used for the skill
assessment. Two types of observation are available:

- [PointObservation](api/observation.md#modelskill.observation.PointObservation)
- [TrackObservation](api/observation.md#modelskill.observation.TrackObservation)
- [PointObservation](api/observation/point.md)
- [TrackObservation](api/observation/track.md)

Let\'s assume that we have one PointObservation and one
TrackObservation:
@@ -65,7 +64,7 @@ cc = ms.match([hkna, c2], mr)
```

This method returns a
[ComparerCollection](api/comparer.md#modelskill.comparison.ComparerCollection)
[ComparerCollection](api/comparercollection.md#modelskill.ComparerCollection)
for further analysis and plotting.


@@ -76,7 +75,7 @@ skill assessment.

The primary comparer methods are:

- [skill()](api/comparercollection.md#modelskill.comparison.ComparerCollection.skill)
- [skill()](api/comparercollection.md#modelskill.ComparerCollection.skill)
which returns a table with the skill scores
- various plot methods of the comparer objects
* `plot.scatter()`
14 changes: 7 additions & 7 deletions docs/terminology.md
@@ -26,41 +26,41 @@ A **timeseries** is a sequence of data points in time. In `ModelSkill`, the data


### Observation
An **observation** refers to real-world data or measurements collected from the system you are modeling. Observations serve as a reference for assessing the model's performance. These data points are used to compare with the model's predictions during validation and calibration. Observations are usually based on field measurements or laboratory experiments, but for the purposes of model validation, they can also be derived from other models (e.g. a reference model). `ModelSkill` supports [point](api/observation.md#modelskill.PointObservation) and [track](api/observation.md#modelskill.TrackObservation) observation types.
An **observation** refers to real-world data or measurements collected from the system you are modeling. Observations serve as a reference for assessing the model's performance. These data points are used to compare with the model's predictions during validation and calibration. Observations are usually based on field measurements or laboratory experiments, but for the purposes of model validation, they can also be derived from other models (e.g. a reference model). `ModelSkill` supports [point](api/observation/point.md) and [track](api/observation/track.md) observation types.


### Measurement
A **measurement** is called [observation](#observation) in `ModelSkill`.


### Model result
A **model result** is the output of any type of numerical model. It is the data generated by the model during a simulation. Model results can be compared with observations to assess the model's performance. In the context of validation, the term "model result" is often used interchangeably with "model output" or "model prediction". `ModelSkill` supports [point](api/model.md#modelskill.model.PointModelResult), [track](api/model.md#modelskill.model.TrackModelResult), [dfsu](api/model.md#modelskill.model.DfsuModelResult) and [grid](api/model.md#modelskill.model.GridModelResult) model result types.
A **model result** is the output of any type of numerical model. It is the data generated by the model during a simulation. Model results can be compared with observations to assess the model's performance. In the context of validation, the term "model result" is often used interchangeably with "model output" or "model prediction". `ModelSkill` supports [point](api/model/point.md), [track](api/model/track.md), [dfsu](api/model/dfsu.md) and [grid](api/model/grid.md) model result types.


### Metric
A **metric** is a quantitative measure (a mathematical expression) used to evaluate the performance of a numerical model. Metrics provide a standardized way to assess the model's accuracy, precision, and other attributes. A metric aggregates the skill of a model into a single number. See list of [metrics](api/metrics.md#modelskill.metrics) supported by `ModelSkill`.
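As an aside, a metric such as RMSE reduces paired observed and modelled values to a single number. A minimal plain-Python sketch (illustrative only, not modelskill's implementation):

```python
import math

def rmse(obs: list[float], mod: list[float]) -> float:
    """Root-mean-square error between observed and modelled values."""
    assert len(obs) == len(mod)
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / len(obs))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # → 0.0 (perfect agreement)
```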


### Score
A **score** is a numerical value that summarizes the model's performance based on chosen metrics. Scores can be used to rank or compare different models or model configurations. In the context of validation, the "skill score" or "validation score" often quantifies the model's overall performance. The score of a model is a single number, calculated as a weighted average for all time-steps, observations and variables. If you want to perform automated calibration, you can use the score as the objective function. In `ModelSkill`, [`score`](api/compare.md/#modelskill.comparison.ComparerCollection.score) is also a specific method on [Comparer](#comparer) objects that returns a single number aggregated score using a specific [metric](#metric).
A **score** is a numerical value that summarizes the model's performance based on chosen metrics. Scores can be used to rank or compare different models or model configurations. In the context of validation, the "skill score" or "validation score" often quantifies the model's overall performance. The score of a model is a single number, calculated as a weighted average for all time-steps, observations and variables. If you want to perform automated calibration, you can use the score as the objective function. In `ModelSkill`, [`score`](api/comparercollection.md/#modelskill.ComparerCollection.score) is also a specific method on [Comparer](#comparer) objects that returns a single number aggregated score using a specific [metric](#metric).
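The weighted-average aggregation described above can be sketched in plain Python; the observation names, metric values, and weights below are hypothetical, and this is not modelskill's actual code:

```python
def weighted_score(metric_values: dict[str, float], weights: dict[str, float]) -> float:
    """Aggregate per-observation metric values into one score via a weighted average."""
    total_w = sum(weights[name] for name in metric_values)
    return sum(weights[name] * v for name, v in metric_values.items()) / total_w

rmse_per_obs = {"HKNA": 0.20, "c2": 0.30}  # hypothetical per-observation RMSE
weights = {"HKNA": 2.0, "c2": 1.0}         # weight HKNA twice as heavily
print(weighted_score(rmse_per_obs, weights))  # (2*0.20 + 1*0.30) / 3
```

A lower score (for an error metric like RMSE) means better overall model performance, which is what makes it usable as an objective function in calibration.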


## ModelSkill-specific terminology

### matched data
In ModelSkill, observations and model results are *matched* when they refer to the same positions in space and time. If the [observations](#observation) and [model results](#model-result) are already matched, the [`from_matched`](api/compare.md#modelskill.from_matched) function can be used to create a [Comparer](#comparer) directly. Otherwise, the [compare](#compare) function can be used to match the observations and model results in space and time.
In ModelSkill, observations and model results are *matched* when they refer to the same positions in space and time. If the [observations](#observation) and [model results](#model-result) are already matched, the [`from_matched`](api/matching.md/#modelskill.from_matched) function can be used to create a [Comparer](#comparer) directly. Otherwise, the [compare](#compare) function can be used to match the observations and model results in space and time.
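Conceptually, matching pairs each observation time with the model value at (or nearest to) that time. A toy nearest-time sketch with plain numeric timestamps (modelskill itself also handles interpolation and spatial selection):

```python
def match_nearest(obs_times: list[float], mod_times: list[float],
                  mod_values: list[float]) -> list[float]:
    """For each observation time, pick the model value whose time is nearest."""
    matched = []
    for t in obs_times:
        i = min(range(len(mod_times)), key=lambda j: abs(mod_times[j] - t))
        matched.append(mod_values[i])
    return matched

print(match_nearest([0.9, 2.1], [0.0, 1.0, 2.0], [10.0, 11.0, 12.0]))  # → [11.0, 12.0]
```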


### match
The function [`match`](api/compare.md#modelskill.match) is used to match a model result with observations. It returns a [`Comparer`](api/compare.md#modelskill.comparison.Comparer) object or a [`ComparerCollection`](api/compare.md#modelskill.comparison.ComparerCollection) object.
The function [`match`](api/matching.md/#modelskill.match) is used to match a model result with observations. It returns a [`Comparer`](api/comparer.md) object or a [`ComparerCollection`](api/comparercollection.md) object.


### Comparer
A [**Comparer**](api/compare.md#modelskill.comparison.Comparer) is an object that compares a model result with observations. It is used to calculate validation metrics and generate plots. A Comparer can be created using the [`compare`](api/compare.md#modelskill.match) function (will return a [ComparerCollection](api/compare.md#modelskill.comparison.ComparerCollection)).
A [**Comparer**](api/comparer.md) is an object that compares a model result with observations. It is used to calculate validation metrics and generate plots. A Comparer can be created using the [`compare`](api/matching.md/#modelskill.match) function (will return a [ComparerCollection](api/comparercollection.md)).


### ComparerCollection
A [**ComparerCollection**](api/compare.md#modelskill.comparison.ComparerCollection) is a collection of Comparers. It is used to compare multiple model results with multiple observations. A ComparerCollection can be created using the [`compare`](api/compare.md#modelskill.match) function.
A [**ComparerCollection**](api/comparercollection.md) is a collection of Comparers. It is used to compare multiple model results with multiple observations. A ComparerCollection can be created using the [`compare`](api/matching.md/#modelskill.match) function.


### Connector
2 changes: 1 addition & 1 deletion modelskill/comparison/_collection.py
@@ -234,7 +234,7 @@ def to_dataframe(self) -> pd.DataFrame:
df["variable"] = cmp.quantity.name
df["x"] = cmp.x
df["y"] = cmp.y
df["obs_val"] = cmp.obs
df["obs_val"] = cmp.data["Observation"].values
frames.append(df[cols])
if len(frames) > 0:
res = pd.concat(frames)
24 changes: 6 additions & 18 deletions modelskill/comparison/_collection_plotter.py
@@ -73,9 +73,10 @@ def scatter(
show_hist : bool, optional
show the data density as a 2d histogram, by default None
show_density: bool, optional
show the data density as a colormap of the scatter, by default None. If both `show_density` and `show_hist`
are None, then `show_density` is used by default.
for binning the data, the previous kword `bins=Float` is used
show the data density as a colormap of the scatter, by default None.
If both `show_density` and `show_hist` are None, then `show_density`
is used by default.
for binning the data, the kword `bins=Float` is used
backend : str, optional
use "plotly" (interactive) or "matplotlib" backend, by default "matplotlib"
figsize : tuple, optional
@@ -102,7 +103,8 @@ by default False
by default False
ax : matplotlib axes, optional
axes to plot on, by default None
kwargs
**kwargs :
other keyword arguments to matplotlib.pyplot.scatter()

Examples
------
@@ -446,20 +448,6 @@ def taylor(

Parameters
----------
model : (int, str), optional
name or id of model to be compared, by default all
observation : (int, str, List[str], List[int])), optional
name or ids of observations to be compared, by default all
variable : (str, int), optional
name or id of variable to be compared, by default first
start : (str, datetime), optional
start time of comparison, by default None
end : (str, datetime), optional
end time of comparison, by default None
area : list(float), optional
bbox coordinates [x0, y0, x1, y1],
or polygon coordinates[x0, y0, x1, y1, ..., xn, yn],
by default None
normalize_std : bool, optional
plot model std normalized with observation std, default False
aggregate_observations : bool, optional
26 changes: 12 additions & 14 deletions modelskill/comparison/_comparer_plotter.py
@@ -78,12 +78,12 @@ def timeseries(

ax.scatter(
cmp.time,
cmp.data[cmp.obs_name].values,
cmp.data[cmp._obs_name].values,
marker=".",
color=cmp.data[cmp.obs_name].attrs["color"],
color=cmp.data[cmp._obs_name].attrs["color"],
)
ax.set_ylabel(cmp.unit_text)
ax.legend([*cmp.mod_names, cmp.obs_name])
ax.legend([*cmp.mod_names, cmp._obs_name])
ax.set_ylim(ylim)
if self.is_directional:
_ytick_directional(ax, ylim)
@@ -111,10 +111,10 @@ def timeseries(
*mod_scatter_list,
go.Scatter(
x=cmp.time,
y=cmp.data[cmp.obs_name].values,
name=cmp.obs_name,
y=cmp.data[cmp._obs_name].values,
name=cmp._obs_name,
mode="markers",
marker=dict(color=cmp.data[cmp.obs_name].attrs["color"]),
marker=dict(color=cmp.data[cmp._obs_name].attrs["color"]),
),
]
)
@@ -227,10 +227,10 @@ def _hist_one_model(
.hist(bins=bins, color=MOD_COLORS[mod_id], **kwargs)
)

cmp.data[cmp.obs_name].to_series().hist(
bins=bins, color=cmp.data[cmp.obs_name].attrs["color"], **kwargs
cmp.data[cmp._obs_name].to_series().hist(
bins=bins, color=cmp.data[cmp._obs_name].attrs["color"], **kwargs
)
ax.legend([mod_name, cmp.obs_name])
ax.legend([mod_name, cmp._obs_name])
ax.set_title(title)
ax.set_xlabel(f"{cmp.unit_text}")
if density:
@@ -464,8 +464,6 @@ def scatter(

Parameters
----------
model : (str, int), optional, DEPRECATED
name or id of model to be plotted, by default 0
bins: (int, float, sequence), optional
bins for the 2D histogram on the background. By default 20 bins.
if int, represents the number of bins of 2D
@@ -486,12 +484,12 @@
show_hist : bool, optional
show the data density as a 2d histogram, by default None
show_density: bool, optional
show the data density as a colormap of the scatter, by default None. If both `show_density` and `show_hist`
show the data density as a colormap of the scatter, by default None.
If both `show_density` and `show_hist` are None, then `show_density`
is used by default. For binning the data, the kword `bins=Float` is used.
norm : matplotlib.colors norm
colormap normalization
If None, defaults to matplotlib.colors.PowerNorm(vmin=1,gamma=0.5)
are None, then `show_density` is used by default.
for binning the data, the previous kword `bins=Float` is used
backend : str, optional
use "plotly" (interactive) or "matplotlib" backend, by default "matplotlib"
figsize : tuple, optional