fix: prevent path corruption in qwen3.5-plus and Qwen3.5-397B-A17B file paths #2300

Closed
Deng-Xian-Sheng wants to merge 3 commits into QwenLM:main from Deng-Xian-Sheng:fix/qwen35-unicode-paths

@Deng-Xian-Sheng
Contributor

Summary

This PR fixes a path corruption issue in qwen3.5-plus and Qwen3.5-397B-A17B.

When these models output file paths that mix Chinese characters with punctuation, digits, or Latin characters, they may incorrectly insert extra spaces into the path. Those spaces break both file tool calls and shell commands.


Problem

Examples of incorrect model output (actual path → path emitted by the model):

  • 中文中文-中文.md → 中文中文 - 中文.md
  • 中文中文1.md → 中文中文 1.md
  • 中文中文-1.md → 中文中文 -1.md
  • 中文中文-1.md → 中文中文 - 1.md

These extra spaces corrupt the real path and cause file operations to fail.


Approach

Instead of relying on the model to always emit the exact visible path string correctly, this PR introduces a more stable path transport strategy for affected models:

  1. Add model-specific prompt instructions that tell affected models to prefer Unicode-escaped paths for non-ASCII file names.

  2. Add Unicode paths / Unicode names mappings to file-related tool outputs so the model can copy a stable representation.

  3. Decode Unicode-escaped tool arguments in the scheduler before validation and execution.

This makes qwen-code robust against the specific path corruption pattern caused by inserted spaces.
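
The "stable transport" idea can be illustrated with a small encoder. This is a minimal sketch, not the PR's code: the name escapeNonAscii is hypothetical, and escaping every UTF-16 code unit above 0x7F is an assumption about how the escaped form might be produced.

```typescript
// Hypothetical helper: the PR's real function names are not shown in this thread.
// Converts every non-ASCII UTF-16 code unit into a \uXXXX escape, so the
// resulting string is pure ASCII.
function escapeNonAscii(path: string): string {
  let out = "";
  for (let i = 0; i < path.length; i++) {
    const code = path.charCodeAt(i);
    out += code > 0x7f
      ? "\\u" + code.toString(16).padStart(4, "0")
      : path.charAt(i);
  }
  return out;
}

console.log(escapeNonAscii("中文中文-1.md")); // \u4e2d\u6587\u4e2d\u6587-1.md
```

Because the escaped form contains no Chinese characters, the model only has to copy an ASCII string, which is the property the approach relies on.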


Changes

Prompt layer

  • Add path handling instructions only for affected models:
    • qwen3.5-plus
    • Qwen3.5-397B-A17B
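
A rough sketch of how such model gating might look is given here. Only the two model names come from the PR; AFFECTED_MODELS, isAffectedModel, buildSystemPrompt, the instruction wording, and the case-insensitive exact match are all assumptions made for illustration.

```typescript
// Model names are taken from the PR description; everything else is hypothetical.
const AFFECTED_MODELS = ["qwen3.5-plus", "Qwen3.5-397B-A17B"];

// Illustrative wording only; the PR's actual prompt text is not shown in this thread.
const UNICODE_PATH_INSTRUCTION =
  "When a file path contains non-ASCII characters, prefer its Unicode-escaped form.";

// Assumption: model names are compared case-insensitively and must match exactly.
function isAffectedModel(model: string): boolean {
  const name = model.trim().toLowerCase();
  return AFFECTED_MODELS.some((m) => m.toLowerCase() === name);
}

// Appends the extra instruction only for affected models, leaving other
// models' prompts unchanged.
function buildSystemPrompt(basePrompt: string, model: string): string {
  return isAffectedModel(model)
    ? `${basePrompt}\n\n${UNICODE_PATH_INSTRUCTION}`
    : basePrompt;
}
```

Keeping the instruction behind a model check means unaffected models see exactly the prompt they saw before.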

Tool output layer

  • list_directory now includes Unicode names / Unicode paths
  • glob now includes Unicode paths
  • grep now includes Unicode paths
  • ripgrep now includes Unicode paths
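
One possible shape for such a mapping is sketched below. The output layout (a "Unicode paths:" section with "name => escaped" lines) and the names toUnicodeEscapes and formatListing are assumptions; the thread does not show the real format.

```typescript
// Hypothetical helper: escapes each non-ASCII UTF-16 code unit as \uXXXX.
function toUnicodeEscapes(text: string): string {
  return text.replace(/[^\x00-\x7f]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0"),
  );
}

// Assumed layout: the normal listing first, then a mapping section that is
// emitted only when at least one path contains non-ASCII characters.
function formatListing(paths: string[]): string {
  const lines = [...paths];
  const nonAscii = paths.filter((p) => /[^\x00-\x7f]/.test(p));
  if (nonAscii.length > 0) {
    lines.push("", "Unicode paths:");
    for (const p of nonAscii) {
      lines.push(`${p} => ${toUnicodeEscapes(p)}`);
    }
  }
  return lines.join("\n");
}

console.log(formatListing(["README.md", "中文1.md"]));
```

With a mapping like this in the tool result, the model can copy the escaped string instead of retyping the visible file name.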

Scheduler layer

  • Decode escaped path-related arguments before tool.build(...)
  • Handle both:
    • \u4e2d\u6587
    • \\u4e2d\\u6587

Handled fields:

  • absolute_path
  • file_path
  • path
  • command
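
The decoding step could be sketched as follows. The field list and the two escape forms come from the PR description; the names decodeUnicodeEscapes and decodeToolArgs, and the regular expression used, are assumptions rather than the PR's actual implementation.

```typescript
// Field names are taken from the PR description.
const PATH_FIELDS = ["absolute_path", "file_path", "path", "command"] as const;

// Hypothetical decoder: accepts one or more backslashes before "u", so both
// \u4e2d and \\u4e2d decode to the same character. Exactly four hex digits
// are consumed, so a digit that follows the escape stays part of the name.
function decodeUnicodeEscapes(value: string): string {
  return value.replace(/\\+u([0-9a-fA-F]{4})/g, (_match, hex: string) =>
    String.fromCharCode(parseInt(hex, 16)),
  );
}

// Returns a copy of the arguments with only the path-related string fields
// decoded; all other fields are passed through untouched.
function decodeToolArgs(args: Record<string, unknown>): Record<string, unknown> {
  const decoded: Record<string, unknown> = { ...args };
  for (const field of PATH_FIELDS) {
    const value = decoded[field];
    if (typeof value === "string") {
      decoded[field] = decodeUnicodeEscapes(value);
    }
  }
  return decoded;
}

console.log(decodeToolArgs({ file_path: "\\u4e2d\\u6587-1.md" }));
```

One trade-off of a decoder like this is that a shell command which intentionally contains a literal \uXXXX sequence would also be rewritten, so a real implementation may need to be more selective for the command field.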

Testing

Automated tests

Passed:

  • coreToolScheduler.test.ts
  • unicodeEscaping.test.ts
  • prompts.test.ts
  • ls.test.ts
  • glob.test.ts
  • grep.test.ts
  • ripGrep.test.ts

Manual smoke tests

Verified with real filenames:

  • 中文中文-中文.md
  • 中文中文1.md
  • 中文中文-1.md

Tested scenarios:

  • list + read via tools
  • edit/write with Chinese file names
  • shell reads via:
    • cat
    • sed -n
    • python -c

Notes

This PR does not try to retrain or fully correct the model's visible-text formatting behavior.
Instead, it makes qwen-code reliably handle the affected models' broken path output pattern.

Fixes #1922

…aths

- add model-specific path handling guidance for affected qwen3.5 models
- expose unicode path/name mappings in file listing and search tools
- decode unicode-escaped tool args in scheduler before validation
- add regression tests for read_file and run_shell_command

qwen3.5-plus and Qwen3.5-397B-A17B may insert extra spaces between
Chinese characters and punctuation/digits when outputting file paths,
which breaks both tool calls and shell commands.

This change uses unicode-escaped paths as a stable transport format for
affected models, exposes unicode mappings in tool outputs, and decodes
escaped args in the scheduler before validation/execution.
@Deng-Xian-Sheng
Contributor Author

Screenshots that clearly show the effect:

[six screenshots]

@Deng-Xian-Sheng
Contributor Author

Deng-Xian-Sheng commented Mar 11, 2026

To use this before it is merged:

git clone -b fix/qwen35-unicode-paths https://github.com/Deng-Xian-Sheng/qwen-code.git
cd qwen-code
npm install
npm run build
npm install -g ./packages/cli

Then edit ~/.qwen/settings.json and add "enableAutoUpdate": false under general, for example:

{
  "general": {
    "enableAutoUpdate": false
  }
}

@tanzhenxin
Collaborator

@Deng-Xian-Sheng Thanks for your contribution! We decided not to hot-patch this issue, but we can leave the PR here in case anyone would like to give it a try.

tanzhenxin self-assigned this Mar 13, 2026
@Deng-Xian-Sheng
Contributor Author

> @Deng-Xian-Sheng Thanks for your contribution! We decided not to hot-patch this issue, but we can leave the PR here in case anyone would like to give it a try.

OK, perhaps the model will be updated.

@tanzhenxin
Collaborator

@Deng-Xian-Sheng We are looking forward to the next model release too. The issue should be fixed by then, and better coding ability is promised.

@kilowu

kilowu commented Mar 14, 2026

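Why not consider this approach at the qwen-code level? I use GLM-5 and opus-4.6, and these models sometimes show similar problems too.
Handling this at the CLI level would make qwen-code more robust when dealing with file paths.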

@tanzhenxin
Collaborator

tanzhenxin commented Mar 14, 2026

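> Why not consider this approach at the qwen-code level? I use GLM-5 and opus-4.6, and these models sometimes show similar problems too. Handling this at the CLI level would make qwen-code more robust when dealing with file paths.

The GLM-5 and opus-4.6 models you mention rarely have this kind of problem, and they can correct themselves automatically, so no engineering workaround is needed at the Qwen Code level.
This PR and its related issues are about a different problem: the model is basically unable to handle mixed Chinese and English text, and that is not limited to file paths. This problem should not be hacked around at the engineering level; doing so would be treating the foot for a headache.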

@kilowu

kilowu commented Mar 15, 2026

OK, then let's look forward to Qwen's next-generation model together.

@tanzhenxin
Collaborator

Qwen 3.6 plus is online.

tanzhenxin closed this Apr 5, 2026
Labels

status/blocked Blocked by external dependency

Development

Successfully merging this pull request may close these issues.

[BUG] The edit tool is unable to edit files in the latest version

3 participants