Skip to content

[BUG] Fix HidalgoSegmenter IndexError with empty sampling (Fixes #3168)#3169

Merged
TonyBagnall merged 8 commits into
aeon-toolkit:mainfrom
samay2504:fix/aeon-3150-hidalgo-empty-sampling
Mar 12, 2026
Merged

[BUG] Fix HidalgoSegmenter IndexError with empty sampling (Fixes #3168)#3169
TonyBagnall merged 8 commits into
aeon-toolkit:mainfrom
samay2504:fix/aeon-3150-hidalgo-empty-sampling

Conversation

@samay2504
Copy link
Copy Markdown
Contributor

@samay2504 samay2504 commented Dec 11, 2025

Fixes #3168
Fixes #3150

Problem

HidalgoSegmenter crashes with IndexError: too many indices for array when configured with incompatible burn_in and sampling_rate parameters that result in zero samples passing the filtering criteria.

Root Cause

The _fit() method filters samples using:

idx = [it for it in range(n_iter) if it % sampling_rate == 0 and it >= n_iter * burn_in]

Solution:

  1. Initialize bestsampling = None to detect empty filtering
  2. Skip replicas with empty idx to allow later replicas a chance to produce samples
  3. Validate non-empty result with informative error message
  4. Fix Pi normalization to use actual sample count instead of hardcoded K

Copilot AI review requested due to automatic review settings December 11, 2025 19:01
@aeon-actions-bot aeon-actions-bot Bot added bug Something isn't working segmentation Segmentation package labels Dec 11, 2025
@aeon-actions-bot
Copy link
Copy Markdown
Contributor

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ bug ].
I have added the following labels to this PR based on the changes made: [ segmentation ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Regenerate expected results for testing
  • Push an empty commit to re-run CI checks

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a critical IndexError bug in HidalgoSegmenter that occurs when incompatible burn_in and sampling_rate parameters result in zero valid samples after filtering. The fix adds proper validation and error handling to detect and report this configuration issue with a helpful error message.

Key changes:

  • Added empty sample detection with early continue for individual replicas
  • Implemented validation to ensure at least one replica produces valid samples
  • Corrected Pi normalization to use actual sample count instead of the number of manifolds (K)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
test_hidalgo_fix.py New test file with three test cases covering empty sampling error, valid configuration, and boundary cases
aeon/segmentation/_hidalgo.py Bug fix implementation including empty sample handling, validation error, and Pi normalization correction

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test_hidalgo_fix.py Outdated
Comment thread aeon/segmentation/_hidalgo.py Outdated
@samay2504 samay2504 force-pushed the fix/aeon-3150-hidalgo-empty-sampling branch from 046b5d3 to 0e7b743 Compare December 11, 2025 19:10
…-toolkit#3150)

- Handle case where no Gibbs samples survive burn-in and sampling_rate filtering
- Initialize bestsampling as None to detect empty filtering result
- Skip replicas that produce empty idx arrays to continue searching
- Validate that at least one valid sample exists before accessing array
- Raise informative ValueError with guidance on parameter adjustment
- Update Pi normalization to use actual sample count instead of idx length
@samay2504 samay2504 force-pushed the fix/aeon-3150-hidalgo-empty-sampling branch from 3dfb250 to 067cca9 Compare December 11, 2025 19:14
@MatthewMiddlehurst
Copy link
Copy Markdown
Member

Sorry for only getting around to this now. We had 3 PRs on this and 2 issues I believe, but going with this. Made some changes to speed up and get this tidied quicker, including moving the tests.

Copy link
Copy Markdown
Contributor

@TonyBagnall TonyBagnall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for this @samay2504 sorry it took so long to get in!

@TonyBagnall TonyBagnall merged commit dbe19a0 into aeon-toolkit:main Mar 12, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working segmentation Segmentation package

Projects

None yet

4 participants