Skip to content

Add some useful planner processors#1327

Merged
HenryL27 merged 4 commits into
mainfrom
hml-planprocessors
Jun 3, 2025
Merged

Add some useful planner processors#1327
HenryL27 merged 4 commits into
mainfrom
hml-planprocessors

Conversation

@HenryL27
Copy link
Copy Markdown
Collaborator

@HenryL27 HenryL27 commented Jun 3, 2025

From upstreaming

  • AlwaysSummarize: makes sure the plan has a summarizedata or count on the end
  • OnlyRetrieval: Optimizes plan for only caring about the documents that come back
  • LogPlan: Print the plan while processing; useful debug tool
  • LLMRewriteQuestion: Use a llm + prompt to rewrite the question

HenryL27 added 2 commits June 3, 2025 12:33
…essors

Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
@HenryL27 HenryL27 requested a review from baitsguy June 3, 2025 23:06
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
# LlmExtractEntity without anything after it is just slow
# TopK can sometimes fail if the LLM returns garbage and doesn't help with getting
# the list of documents -- we just have to ignore it
assert len(n.inputs) == 1, f"Trailing {n.node_type} node must have exactly one input"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you'll need to update dependencies here right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more explicit way would be to choose which operators stay in vs out

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdym update the dependencies? I delete the bad tail node and update the plan result_node to the one before it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered letting it be 'RemoveTailNodesByType' or some such. The only use-case I could come up with for it was this one though, so I decided to go with the less general thing as it's clearer why you want it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline figured it out

Comment thread lib/sycamore/sycamore/query/strategy.py Outdated
return plan


class LogPlan(LogicalPlanProcessor):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

print plan?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it better to just add a .print() method to the plan object? then the .print can be called by planner who is orchestrating the processors

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm trying to assemble a strategy that does the right thing then my programming language is effectively the list of processors so one that prints the current state is useful.

Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
@HenryL27 HenryL27 merged commit c0e562d into main Jun 3, 2025
12 of 15 checks passed
@HenryL27 HenryL27 deleted the hml-planprocessors branch June 3, 2025 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants