fix(models): keep_batch_sharded false logic #912

Merged
ssmmnn11 merged 4 commits into main from fix/keep_batch_sharded_false on Feb 24, 2026

Contributor

@havardhhaugen havardhhaugen commented Feb 18, 2026

Description

This PR fixes the in_out_sharded check in the models: in_out_sharded needs to be a dict now that shard_shapes is a dict. This PR fixes the issue for the following models:

  • AnemoiModelEncProcDec
  • AnemoiEnsModelEncProcDec

Still to do:

  • AnemoiModelHierarchicalAutoEncoder
  • AnemoiModelEncProcDecHierarchical
  • AnemoiModelEncProcDecInterpolator

I'm not that familiar with the remaining models, but I can look into them. However, if someone thinks they can fix them quickly, I would be happy to have some help.

What problem does this change solve?

Currently, training with model sharding and keep_batch_sharded: False fails with a shape error in the allgather because of the logic error described above.
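
To illustrate the kind of logic error described above, here is a minimal sketch in plain Python. It is not the actual Anemoi code: the function names, the dataset names, and the assumption that an unsharded dataset is represented by a `None` entry in `shard_shapes` are all hypothetical. It only shows why a single boolean check can go wrong once `shard_shapes` becomes a dict, and what a per-dataset `in_out_sharded` dict could look like.

```python
from typing import Optional

# Hypothetical type: shard sizes for one dataset, or None when not sharded.
ShardShapes = Optional[list[int]]


def in_out_sharded_single_flag(shard_shapes: dict[str, ShardShapes]) -> bool:
    # Assumed form of the old check: a dict is never None, so this returns
    # True even when every dataset entry is None (batch not kept sharded).
    return shard_shapes is not None


def in_out_sharded_per_dataset(shard_shapes: dict[str, ShardShapes]) -> dict[str, bool]:
    # Sketch of the fix: evaluate the check for each dataset separately,
    # so in_out_sharded has the same keys as shard_shapes.
    return {name: shapes is not None for name, shapes in shard_shapes.items()}


# Hypothetical situation with keep_batch_sharded: False, i.e. no shard shapes.
shard_shapes = {"dataset_a": None, "dataset_b": None}

single_flag = in_out_sharded_single_flag(shard_shapes)
per_dataset = in_out_sharded_per_dataset(shard_shapes)
```

Under these assumptions the single flag wrongly reports the inputs as sharded, while the per-dataset dict reports each dataset as unsharded. A mismatch of this kind could plausibly lead to the gather step receiving tensors whose shapes it does not expect.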

What issue or task does this change relate to?

Additional notes

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@havardhhaugen havardhhaugen requested a review from japols February 18, 2026 13:04
@havardhhaugen havardhhaugen added the bug (Something isn't working) and ATS Approval Not Needed (No approval needed by ATS) labels Feb 18, 2026
@github-project-automation github-project-automation bot moved this to To be triaged in Anemoi-dev Feb 18, 2026
@github-actions github-actions bot added models and removed models labels Feb 18, 2026
@havardhhaugen havardhhaugen requested a review from JPXKQX February 18, 2026 13:04
@havardhhaugen havardhhaugen changed the title Fix: keep_batch_sharded false logic fix (models): keep_batch_sharded false logic Feb 18, 2026
@havardhhaugen havardhhaugen changed the title fix (models): keep_batch_sharded false logic fix(models): keep_batch_sharded false logic Feb 18, 2026
@anaprietonem
Contributor

@japols, is this something you could take a look at, please? Maybe @ssmmnn11 can double-check, as you also looked quite a bit into multiple datasets.

Member

@ssmmnn11 ssmmnn11 left a comment

Thank you for the PR! LGTM! At some point we should introduce regression tests for this.

@github-project-automation github-project-automation bot moved this from To be triaged to For merging in Anemoi-dev Feb 24, 2026
@ssmmnn11 ssmmnn11 merged commit 88b5066 into main Feb 24, 2026
12 of 13 checks passed
@ssmmnn11 ssmmnn11 deleted the fix/keep_batch_sharded_false branch February 24, 2026 15:36
@github-project-automation github-project-automation bot moved this from For merging to Done in Anemoi-dev Feb 24, 2026
@DeployDuck DeployDuck mentioned this pull request Feb 24, 2026
Labels

  • ATS Approval Not Needed: No approval needed by ATS
  • bug: Something isn't working
  • models

Projects

Status: Done

3 participants