feat(core): implement multi-strategy replacement pipeline for edit tool#2405
Closed
undici77 wants to merge 1 commit intoQwenLM:mainfrom
Closed
feat(core): implement multi-strategy replacement pipeline for edit tool#2405undici77 wants to merge 1 commit intoQwenLM:mainfrom
undici77 wants to merge 1 commit intoQwenLM:mainfrom
Conversation
- Add exact, flexible, regex, and fuzzy matching strategies - Fuzzy matching uses Levenshtein distance with whitespace penalty (up to 10% threshold) - Support ai_proposed_content and modified_by_user flags for user-modified edits - Handle edge cases: null content from readTextFile, empty old_string for file creation - Add comprehensive test matrix for replace_all scenarios (0, 1, multiple occurrences) - Improve error handling with READ_CONTENT_FAILURE and FILE_WRITE_FAILURE types This enhances the edit tool's robustness when matching and replacing text, especially for cases with whitespace differences or minor content variations. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Collaborator
|
@undici77 Thanks for your contribution! As as to the fuzzy match feature for edit tool, we had a similar implementation before, but it caused some trouble so we rolled if back, though I cannot recall the details now. For our vast majority installation base, they use top-tier models like qwen3.5-plus, which does not have much trouble with indentation issues lately. (If you encounter problem with mixed CJK characters and english characters, that is a known issue, checkout this PR for more detail. #2300 ) So we decide now to merge this PR which targets small models on local inference endpoint. Thanks again for your understanding! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This enhances the edit tool's robustness when matching and replacing text, especially for cases with whitespace differences or minor content variations.
TLDR
Using qwen-code with smaller models like Qwen3-Coder-30B, the
edittool fails frequently withEDIT_NO_OCCURRENCE_FOUNDeven when the intent is clear — the model producesold_stringwith slightly wrong indentation, normalised whitespace, or a minor typo and the single exact-match strategy gives up immediately.gemini-edit.tsin the same codebase already solves this with a four-strategy cascade. This PR ports that cascade intoedit.ts, fixes two broken tests exposed by the change, and tightens the tool prompt to remove redundancy and weak language.Dive Deeper
1. Multi-strategy replacement pipeline
The current
edit.tscallscountOccurrences+safeLiteralReplace— one shot, exact match only. Common failure patterns with smaller models:old_stringcopied with one extra leading space or wrong indentation → no matchEach failure costs a full round-trip: error → re-read file → retry. Ported from
gemini-edit.ts:\s*between them; handles intra-line spacing variationAdaptations to
edit.tsconventions:replace_all(notallow_multiple), BOM/encoding preservation (not CRLF tracking),StandardFileSystemServiceread/write API. Fuzzy matches append"Applied fuzzy match at line N."tollmContentso the model knows a looser match was used.2. Test fixes and new coverage (56 → 93 tests)
Two existing tests were broken or fragile:
FILE_WRITE_FAILUREusedchmod 444— silently passes when the runner is root (common in CI). Replaced withvi.spyOn(service, 'writeTextFile').mockRejectedValueOnce(...).EDIT_NO_OCCURRENCE_FOUND— the flexible strategy now succeeds, assertion updated.Two infrastructure fixes needed for the new pipeline:
logEditStrategyadded to theloggers.jsmock (was missing, would throw on any strategy call).getFileSystemServicechanged from a plain arrow function tovi.fn()to allow per-test service overriding.New suites added:
calculateReplacement(27 tests, all 4 strategies),replace_allparametrised matrix (7 cases),getModifyContext(all 4 methods),toolLocations,shouldConfirmExecute (ProceedAlways).3. Tool prompt tightening (~210 → ~95 tokens)
The
old_stringuniqueness rule was stated in three separate places."preferably unescaped"implies escaping is sometimes acceptable — it never is.replace_alldescription only explainedtrue, never whatfalseenforces. Collapsed into four numbered rules with imperative language and a single statement of consequences.