Fix pairwise_judge to use A/B instead of col_A/col_B. col_A/col_B use 2 tokens, so cascade fails by harshitgupta412 · Pull Request #248 · lotus-data/lotus

harshitgupta412 · 2026-04-10T00:35:53Z
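
The title's claim can be illustrated with a small sketch. It assumes the cascade decides from the probability of a single output token, which the PR text does not spell out. The token splits in `TOY_TOKENS` and the helper `first_token_separates` are hypothetical and exist only for illustration; a real tokenizer may split these strings differently.

```python
# Hypothetical token splits, for illustration only.
# A real tokenizer may split these labels differently.
TOY_TOKENS = {
    "col_A": ["col", "_A"],
    "col_B": ["col", "_B"],
    "A": ["A"],
    "B": ["B"],
}


def first_token_separates(labels, tokens=TOY_TOKENS):
    """Return True if every label is one token and all first tokens differ."""
    firsts = [tokens[label][0] for label in labels]
    all_single = all(len(tokens[label]) == 1 for label in labels)
    return all_single and len(set(firsts)) == len(firsts)


# Two-token labels share the first token, so one token cannot tell them apart.
print(first_token_separates(["col_A", "col_B"]))  # False
# Single-token labels are distinguishable from the first token alone.
print(first_token_separates(["A", "B"]))  # True
```

Under that assumption, switching the labels to A/B gives each choice its own single token, which is what the fix in this PR aims for.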

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating README.md and examples for new features.

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS ABOVE HAVE BEEN CONSIDERED.

Purpose

Test Plan

Test Results

(Optional) Documentation Update

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Performance improvement
Refactoring (no functional changes)

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, updating docstrings
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

BEFORE SUBMITTING, PLEASE READ https://github.com/lotus-data/lotus/blob/main/CONTRIBUTING.md
anything written below this line will be removed by GitHub Actions

Co-authored-by: liana313 <lianapat@stanford.edu> Made-with: Cursor


liana313

Let's also update CI tests to include the pairwise judge and test for edge cases on this

liana313 · 2026-04-13T19:21:56Z

    assert list(df["_judge_1"].values) == ["A", "B"]
+
+
+@pytest.mark.parametrize("model", get_enabled("gpt-4o-mini"))


This is a start, but we need more than one CI test for cascade testing. We should cover edge cases (e.g., when there's an existing col A and col B)
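
One possible shape for the edge-case test the reviewer asks for, sketched without a model call. `stub_pairwise_judge`, the `resp1`/`resp2` column names, and the `_judge_0` output name are all hypothetical stand-ins, not the project's API; a real CI test would call the project's pairwise judge with an enabled model instead of the stub.

```python
def stub_pairwise_judge(rows, first, second, out="_judge_0"):
    """Stand-in for an LLM judge: prefers the longer answer, labelled "A" or "B"."""
    judged = []
    for row in rows:
        label = "A" if len(row[first]) >= len(row[second]) else "B"
        # Copy the row so the caller's data is never modified in place.
        judged.append({**row, out: label})
    return judged


def test_existing_a_b_columns_survive():
    # The input already has columns literally named "A" and "B".
    rows = [
        {"A": "keep-1", "B": "keep-2", "resp1": "a long answer", "resp2": "short"},
        {"A": "keep-3", "B": "keep-4", "resp1": "short", "resp2": "a longer answer"},
    ]
    judged = stub_pairwise_judge(rows, "resp1", "resp2")
    # The judge labels are the single-character choices.
    assert [r["_judge_0"] for r in judged] == ["A", "B"]
    # The pre-existing "A" and "B" columns are untouched by the judge labels.
    assert [(r["A"], r["B"]) for r in judged] == [
        ("keep-1", "keep-2"),
        ("keep-3", "keep-4"),
    ]


test_existing_a_b_columns_survive()
```

The point of the second assertion is that the label values "A" and "B" must not collide with, or overwrite, user columns that happen to share those names.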

harshitgupta412 and others added 10 commits March 25, 2026 16:53

Add AST-based lazy evaluation and semantic operator optimizations

6bffa0b

Co-authored-by: liana313 <lianapat@stanford.edu> Made-with: Cursor

fix doc, test, examples

9196e51

add auto_include_default_optimizer

7cc5806

fix tests

cb01241

fix deps

10b8209

add missing deps to test

c8476c9

fix tests

5e4ac31

fix benchmakrs

1fbf369

fix the output tokens for sem filter to be 1 token

53eb095

Merge branch 'main' of github.com:lotus-data/lotus into fix/cascade-token

0b8c7eb

harshitgupta412 requested a review from liana313 April 10, 2026 00:36

add check in sem_filter

b281720

liana313 reviewed Apr 12, 2026


add test for multi-token

57699b9

liana313 reviewed Apr 13, 2026


liana313 merged commit ce544e1 into main Apr 13, 2026
8 checks passed

liana313 merged 12 commits into main from fix/cascade-token


2 participants
