Filter-context auto-aggregation across methods (#118) #142

Merged
iskandr merged 1 commit into master from filter-auto-aggregate on Apr 24, 2026

iskandr commented on Apr 24, 2026

Summary

Closes #118. Inside apply_filter, a Comparison with unqualified same-kind refs auto-aggregates across methods instead of raising the ambiguity error:

# On a frame with both mhcflurry and netmhcpan affinity rows:
apply_filter(df, Affinity <= 500)   # no raise
# Keeps groups where ANY method has value <= 500.

Aggregator is picked by the comparison direction:

Operator   Aggregator              Equivalent OR expansion
<, <=      nanmin across methods   (aff[m1] < k) | (aff[m2] < k) | ...
>, >=      nanmax across methods   (aff[m1] > k) | (aff[m2] > k) | ...
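
The equivalence in the table can be checked numerically. The following is a minimal sketch with made-up affinity values, assuming one row per group and one column per method; the array layout and the threshold `k` are illustrative and are not the library's internal representation.

```python
import numpy as np

# Hypothetical affinity values (nM): one row per group, one column per method.
aff = np.array([
    [120.0, 900.0],   # first method passes <= k
    [800.0, 300.0],   # second method passes <= k
    [700.0, 650.0],   # neither method passes <= k
    [np.nan, 450.0],  # first method missing, second passes <= k
])
k = 500.0

# "<=" direction: nanmin across methods vs. the explicit OR expansion.
le_via_nanmin = np.nanmin(aff, axis=1) <= k
le_via_or = (aff[:, 0] <= k) | (aff[:, 1] <= k)

# ">=" direction: nanmax across methods vs. the explicit OR expansion.
ge_via_nanmax = np.nanmax(aff, axis=1) >= k
ge_via_or = (aff[:, 0] >= k) | (aff[:, 1] >= k)

print(le_via_nanmin, ge_via_nanmax)
```

A NaN compares as False in the OR expansion and is skipped by nanmin/nanmax, so both forms agree as long as at least one method has a value for the group.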

Scope (narrow on purpose)

Auto-agg fires only when all of these hold:

  1. The node is a Comparison with <, <=, >, or >=.
  2. It's evaluated under a filter context (apply_filter sets EvalContext.filter_context=True).
  3. All unqualified Field refs in the comparison are the same kind. Cross-kind (affinity.rank <= processing.rank) stays strict.
  4. The DataFrame actually has >1 method for that kind. Single-method frames fast-path through strict eval.
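
The four conditions above could be expressed as a single gate function. This is a simplified sketch, not the code in this PR: the `Field`, `Const`, and `Comparison` dataclasses, the `unqualified_kinds` helper, and the `methods_by_kind` mapping are hypothetical stand-ins for the real expression nodes and DataFrame inspection.

```python
from dataclasses import dataclass
from typing import Optional, Union

DIRECTIONAL_OPS = {"<", "<=", ">", ">="}

@dataclass(frozen=True)
class Field:            # hypothetical stand-in for the real Field node
    kind: str
    method: Optional[str] = None  # None means "unqualified"

@dataclass(frozen=True)
class Const:            # hypothetical stand-in for a constant node
    value: float

@dataclass(frozen=True)
class Comparison:       # hypothetical stand-in for the real Comparison node
    op: str
    left: Union[Field, Const]
    right: Union[Field, Const]

def unqualified_kinds(node):
    """Collect the kinds of all unqualified Field refs on either side."""
    return {
        side.kind
        for side in (node.left, node.right)
        if isinstance(side, Field) and side.method is None
    }

def should_auto_aggregate(node, filter_context, methods_by_kind):
    # 1. Directional comparison only.
    if not isinstance(node, Comparison) or node.op not in DIRECTIONAL_OPS:
        return False
    # 2. Only under a filter context.
    if not filter_context:
        return False
    # 3. Exactly one kind among the unqualified refs (none means nothing to do).
    kinds = unqualified_kinds(node)
    if len(kinds) != 1:
        return False
    # 4. More than one method present for that kind.
    (kind,) = kinds
    return len(methods_by_kind.get(kind, ())) > 1
```

Returning False from any branch corresponds to falling back to the existing strict evaluation path.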

apply_sort, scalar score arithmetic (0.5 * ba.score + 0.5 * el.score), == / !=, and qualified refs (Affinity['netmhcpan'] <= 500) all keep the current strict behavior.

Changes

  • EvalContext(df, filter_context=False) — public kwarg. apply_filter sets it to True.
  • EvalContext._method_override — internal (kind_value, method) tuple the Comparison auto-agg loop sets when iterating per method.
  • Field.eval honors ctx._method_override when self.method is None and kind matches.
  • Comparison.eval gains a new _should_auto_aggregate gate and an _auto_aggregate loop that evaluates both sides once per method, then aggregates the LHS with nanmin (for <, <=) or nanmax (for >, >=) and the RHS with the opposite aggregator, giving the "any method pair" union interpretation.
  • _collect_unqualified_kinds(node) — tree walker.
  • CHANGELOG + version bump to 5.10.0.
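
To illustrate how the override and the per-method loop fit together, here is a self-contained sketch. It is an assumption-laden toy, not the PR's implementation: `Ctx`, its `columns` dict keyed by `(kind, method)`, and the free function `auto_aggregate` are hypothetical simplifications of `EvalContext`, the DataFrame lookup, and `Comparison._auto_aggregate`.

```python
import operator
import numpy as np

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

class Ctx:
    """Toy context: columns maps (kind, method) -> array of per-group values."""
    def __init__(self, columns):
        self.columns = columns
        self.method_override = None  # (kind, method) tuple or None

class Field:
    def __init__(self, kind, method=None):
        self.kind, self.method = kind, method

    def eval(self, ctx):
        method = self.method
        # Honor the override only for unqualified refs of the matching kind.
        if method is None and ctx.method_override and ctx.method_override[0] == self.kind:
            method = ctx.method_override[1]
        if method is None:
            candidates = [m for (k, m) in ctx.columns if k == self.kind]
            if len(candidates) != 1:
                raise ValueError(f"ambiguous unqualified ref to {self.kind!r}")
            method = candidates[0]
        return ctx.columns[(self.kind, method)]

class Const:
    def __init__(self, value):
        self.value = value

    def eval(self, ctx):
        return self.value

def auto_aggregate(op, left, right, ctx, kind):
    """Evaluate both sides once per method, then aggregate and compare."""
    methods = [m for (k, m) in ctx.columns if k == kind]
    lhs, rhs = [], []
    try:
        for m in methods:
            ctx.method_override = (kind, m)
            lhs.append(np.asarray(left.eval(ctx), dtype=float))
            rhs.append(np.asarray(right.eval(ctx), dtype=float))
    finally:
        ctx.method_override = None  # never leak the override to later evals
    lower = op in ("<", "<=")
    lhs_agg, rhs_agg = (np.nanmin, np.nanmax) if lower else (np.nanmax, np.nanmin)
    return OPS[op](lhs_agg(np.stack(lhs), axis=0), rhs_agg(np.stack(rhs), axis=0))

ctx = Ctx({
    ("affinity", "mhcflurry"): np.array([120.0, 800.0, 700.0]),
    ("affinity", "netmhcpan"): np.array([900.0, 300.0, 650.0]),
})
mask = auto_aggregate("<=", Field("affinity"), Const(500.0), ctx, "affinity")
mirrored = auto_aggregate(">=", Const(500.0), Field("affinity"), ctx, "affinity")
print(mask, mirrored)
```

Aggregating the RHS with the opposite function is what makes the Const-on-left form work in this sketch: for `500 >= Affinity`, the field sits on the right and is reduced with nanmin, so the result matches `Affinity <= 500`.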

Back-compat

One test in tests/test_io_lens.py previously asserted apply_filter(r.df, Affinity <= 500) raised on a multi-model LENS frame. That was the behavior being relaxed. Replaced with:

  • test_unqualified_autoaggregates_in_filter: verifies that the filter no longer raises and returns a frame of valid shape.
  • test_unqualified_ambiguous_still_raises_in_sort: verifies that the sort path stays strict.

No other callers were relying on the filter-side raise.

Test plan

  • <= / < auto-agg via nanmin; >= / > via nanmax
  • "Any method passes" — group kept; "no method passes" — group dropped
  • apply_sort still raises Ambiguous on unqualified multi-method refs
  • Bare (Affinity.value <= 200).evaluate(df) outside filter still raises
  • Cross-kind comparison (Affinity.rank <= Processing.rank) stays strict
  • == / != stay strict
  • Single-method frame: no auto-agg overhead, same result
  • Qualified side (Affinity['mhcflurry'] <= 200) bypasses auto-agg
  • Compound boolean ((aff <= 200) & (aff.score >= 0.5)) auto-aggs each comparison
  • Const >= Field (Const on left) mirrors correctly
  • Full suite: pytest tests/ → 1192 passed, 3 skipped

- Under apply_filter only, a Comparison with unqualified same-kind
  refs auto-aggregates across methods: nanmin for <, <=; nanmax for
  >, >=. "Any method passes" semantics.
- Narrow scope: directional comparisons only; not in sort, not in
  score arithmetic; cross-kind comparisons stay strict.
- EvalContext(df, filter_context=True) exposed for hand-rolled
  evaluations outside apply_filter.
- Internal ctx._method_override drives the per-method binding loop;
  Field.eval honors it when self.method is None and kind matches.
iskandr force-pushed the filter-auto-aggregate branch from 8af0431 to 2bfe59a on April 24, 2026 at 18:44
iskandr merged commit b2c285f into master on Apr 24, 2026 (8 checks passed)
iskandr deleted the filter-auto-aggregate branch on April 24, 2026 at 18:52
@coveralls

Coverage Status

coverage: 88.589% (+0.1%) from 88.468% — filter-auto-aggregate into master

Development

Successfully merging this pull request may close these issues.

Auto-aggregate unqualified kind refs in filter comparisons (narrow)
