Skip to content

fix: integrate select variables in dataloader#900

Merged
mchantry merged 2 commits intomainfrom
fix/select_vars
Feb 16, 2026
Merged

fix: integrate select variables in dataloader#900
mchantry merged 2 commits intomainfrom
fix/select_vars

Conversation

@icedoom888
Copy link
Contributor

Description

Solves #899

What problem does this change solve?

What issue or task does this change relate to?

Additional notes

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@dietervdb-meteo dietervdb-meteo changed the title Integrated select variables in dataloader fix: integrate select variables in dataloader Feb 16, 2026
@JPXKQX
Copy link
Member

JPXKQX commented Feb 16, 2026

I think this needs to be discussed. These parameters are added just to the open_dataset call, should we add all anemoi dataset kwargs to this call (cutout, missing, join, ...)?

Copy link
Contributor

@dietervdb-meteo dietervdb-meteo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, lets run integration tests just to be sure:
https://github.com/ecmwf/anemoi-core/actions/runs/22057431865

@github-project-automation github-project-automation bot moved this from To be triaged to For merging in Anemoi-dev Feb 16, 2026
@anaprietonem
Copy link
Contributor

this is also related to the issue @VeraChristina opened last week - #891

@JPXKQX
Copy link
Member

JPXKQX commented Feb 16, 2026

The solution to this is to pass all anemoi-datasets specifics arguments inside the dataset: argument, which is exactly the config passed to the open_dataset function.

@dietervdb-meteo
Copy link
Contributor

@JPXKQX , so why did we have the separate drop and frequency arguments? They could be read from the open_dataset config as well?

@dietervdb-meteo
Copy link
Contributor

Ok, see you already commented on that at #899 (comment)

@MicheleCattaneo
Copy link
Contributor

Nice catch.
It feels like these bugs (there was another about beta's for the optimizers) are a bit sneaky due to the extra = "ignore" default value of pydantic's schemas.
What if extra was set to either "allow" or "forbid" everywhere and we let the object's constructor fail if the argument is not expected? Silently dropping arguments sounds too dangerous.
If "ignore" is really needed, maybe it's possible to add some kind of hook that at least throws a warning?

@icedoom888
Copy link
Contributor Author

@JPXKQX this is a big change compared to how we used to write configs so far.
Has this been documented anywhere?

@mchantry
Copy link
Member

@icedoom888 thanks for your contribution. It's true that we have an asymmetry between drop and select at present.
My suggestion is that we merge this lovely contribution (and the one from Vera that will help catch more of these).
But we also revisit in the near future moving all of these keywords to the lower level within datasets.

@mchantry mchantry merged commit d34e80c into main Feb 16, 2026
18 of 20 checks passed
@mchantry mchantry deleted the fix/select_vars branch February 16, 2026 14:08
@github-project-automation github-project-automation bot moved this from For merging to Done in Anemoi-dev Feb 16, 2026
@DeployDuck DeployDuck mentioned this pull request Feb 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ATS Approval Not Needed No approval needed by ATS training

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

6 participants