Merged
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,10 @@ Only write entries that are worth mentioning to users.

## Unreleased

- Shell: Use `git ls-files` for `@` file mention discovery — file completer now queries `git ls-files --recurse-submodules` with a 5-second timeout as the primary discovery mechanism, falling back to `os.walk` for non-git repositories; this fixes large repositories (e.g., apache/superset with 65k+ files) where the 1000-file limit caused late-alphabetical directories to be unreachable (fixes #1375)
- Core: Add shared `file_filter` module — unifies file mention logic between shell and web UIs via `src/kimi_cli/utils/file_filter.py`, providing consistent path filtering, ignored directory exclusion, and git-aware file discovery
- Shell: Prevent path traversal in file mention scope parameter — the `scope` parameter in file completer requests is now validated to prevent directory traversal attacks
- Web: Restore unfiltered directory listing in file browser API — file browser endpoint no longer applies git-aware filtering, ensuring all files are visible in the web UI file picker
- Todo: Refactor SetTodoList to persist state and prevent tool call storms — todos are now persisted to session state (root agent) and independent state files (sub-agents); adds query mode (omit `todos` to read current state) and clear mode (pass `[]`); includes anti-storm guidance in tool description to prevent repeated calls without progress (fixes #1710)
- ReadFile: Add total line count to every read response and support negative `line_offset` for tail mode — the tool now reports `Total lines in file: N.` in its message so the model can plan subsequent reads; negative `line_offset` (e.g. `-100`) reads the last N lines using a sliding window, useful for viewing recent log output without shell commands; the absolute value is capped at 1000 (MAX_LINES)
- Shell: Fix black background on inline code and code blocks in Markdown rendering — `NEUTRAL_MARKDOWN_THEME` now overrides all Rich default `markdown.*` styles to `"none"`, preventing Rich's built-in `"cyan on black"` from leaking through on non-black terminals
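The discovery strategy in the first entry (ask git first, give up after a timeout, fall back to a directory walk) can be sketched roughly as follows. This is a minimal illustration and not the code in `file_filter.py`: the names `git_ls_files`, `walk_files`, and `discover_files` are hypothetical, and the real module also applies ignore rules and scoping that are left out here.

```python
import os
import subprocess
from pathlib import Path
from typing import Callable, Optional


def git_ls_files(root: Path, timeout: float = 5.0) -> Optional[list[str]]:
    """Return tracked files under root, or None if git cannot answer in time."""
    try:
        result = subprocess.run(
            ["git", "ls-files", "--recurse-submodules"],
            cwd=root,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # git is missing, not runnable, or too slow
    if result.returncode != 0:
        return None  # most likely not a git repository
    return [line for line in result.stdout.splitlines() if line]


def walk_files(root: Path, limit: int = 1000) -> list[str]:
    """Fallback: walk the tree in sorted order and stop at `limit` files."""
    paths: list[str] = []
    for current, dirs, files in os.walk(root):
        dirs[:] = sorted(d for d in dirs if d != ".git")  # do not descend into .git
        for name in sorted(files):
            paths.append((Path(current) / name).relative_to(root).as_posix())
            if len(paths) >= limit:
                return paths
    return paths


def discover_files(
    root: Path,
    limit: int = 1000,
    git_lister: Callable[[Path], Optional[list[str]]] = git_ls_files,
) -> list[str]:
    """Prefer the git listing; use the capped walk only when git gives no answer."""
    paths = git_lister(root)
    if paths is None:
        paths = walk_files(root, limit)
    return paths
```

The `git_lister` parameter exists only so the fallback path can be exercised without a repository; a real implementation would probably not need it.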
4 changes: 4 additions & 0 deletions docs/en/release-notes/changelog.md
@@ -4,6 +4,10 @@ This page documents the changes in each Kimi Code CLI release.

## Unreleased

- Shell: Use `git ls-files` for `@` file mention discovery — file completer now queries `git ls-files --recurse-submodules` with a 5-second timeout as the primary discovery mechanism, falling back to `os.walk` for non-git repositories; this fixes large repositories (e.g., apache/superset with 65k+ files) where the 1000-file limit caused late-alphabetical directories to be unreachable (fixes #1375)
- Core: Add shared `file_filter` module — unifies file mention logic between shell and web UIs via `src/kimi_cli/utils/file_filter.py`, providing consistent path filtering, ignored directory exclusion, and git-aware file discovery
- Shell: Prevent path traversal in file mention scope parameter — the `scope` parameter in file completer requests is now validated to prevent directory traversal attacks
- Web: Restore unfiltered directory listing in file browser API — file browser endpoint no longer applies git-aware filtering, ensuring all files are visible in the web UI file picker
- Todo: Refactor SetTodoList to persist state and prevent tool call storms — todos are now persisted to session state (root agent) and independent state files (sub-agents); adds query mode (omit `todos` to read current state) and clear mode (pass `[]`); includes anti-storm guidance in tool description to prevent repeated calls without progress (fixes #1710)
- ReadFile: Add total line count to every read response and support negative `line_offset` for tail mode — the tool now reports `Total lines in file: N.` in its message so the model can plan subsequent reads; negative `line_offset` (e.g. `-100`) reads the last N lines using a sliding window, useful for viewing recent log output without shell commands; the absolute value is capped at 1000 (MAX_LINES)
- Shell: Fix black background on inline code and code blocks in Markdown rendering — `NEUTRAL_MARKDOWN_THEME` now overrides all Rich default `markdown.*` styles to `"none"`, preventing Rich's built-in `"cyan on black"` from leaking through on non-black terminals
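The ReadFile entry describes tail mode as a sliding window over the file. One plausible way to do that, shown here as a sketch rather than the tool's actual code, is a bounded `collections.deque` that keeps only the newest lines while counting all of them. The function name `read_tail` is hypothetical; `MAX_LINES = 1000` is the cap named in the entry.

```python
from collections import deque
from typing import Iterable

MAX_LINES = 1000  # cap named in the changelog entry


def read_tail(lines: Iterable[str], line_offset: int) -> tuple[list[str], int]:
    """Return (last N lines, total line count) for a negative line_offset."""
    if line_offset >= 0:
        raise ValueError("tail mode expects a negative line_offset")
    n = min(-line_offset, MAX_LINES)
    window: deque[str] = deque(maxlen=n)  # older lines drop off the left
    total = 0
    for line in lines:
        window.append(line)
        total += 1
    return list(window), total
```

Because the deque is bounded, memory use stays proportional to N rather than to the file size, and the single pass also yields the total needed for a `Total lines in file: N.` message.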
4 changes: 4 additions & 0 deletions docs/zh/release-notes/changelog.md
@@ -4,6 +4,10 @@

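## Unreleased

- Shell: Use `git ls-files` for `@` file mention discovery. The file completer now queries the file list with `git ls-files --recurse-submodules` first (5-second timeout) and falls back to `os.walk` for non-Git repositories; this fixes large repositories (e.g., apache/superset with 65k+ files) where the 1000-file limit made late-alphabetical directories unreachable (fixes #1375)
- Core: Add a shared `file_filter` module. It unifies the file mention logic of the Shell and Web UIs via `src/kimi_cli/utils/file_filter.py`, providing consistent path filtering, ignored directory exclusion, and Git-aware file discovery
- Shell: Prevent path traversal in the file mention `scope` parameter. The `scope` parameter in file completer requests is now validated to prevent directory traversal attacks
- Web: Restore unfiltered directory listing in the file browser API. The file browser endpoint no longer applies Git-aware filtering, so all files are shown in the Web UI file picker
- Todo: Refactor the `SetTodoList` tool to persist state and prevent tool call storms. Todos are now persisted to session state (main agent) and independent state files (sub-agents); adds a query mode (omit the `todos` parameter to read the current state) and a clear mode (pass `[]` to clear); the tool description gains anti-storm guidance to prevent repeated calls without real progress (fixes #1710)
- ReadFile: Return the total line count on every read and support negative `line_offset` for tail mode. The tool now reports `Total lines in file: N.` in its message so the model can plan subsequent reads; a negative `line_offset` (e.g. `-100`) reads the last N lines of the file through a sliding window, which suits viewing the latest log output without shell commands; the absolute value is capped at 1000 (MAX_LINES)
- Shell: Fix black backgrounds on inline code and code blocks in Markdown rendering. `NEUTRAL_MARKDOWN_THEME` now overrides all Rich default `markdown.*` styles to `"none"`, preventing Rich's built-in `"cyan on black"` from leaking through on terminals with non-black backgrounds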
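The changelog says the `scope` parameter is validated against directory traversal but does not show how. A common approach, sketched here under that assumption, is to resolve the scoped path and reject anything that lands outside the workspace root. `resolve_scope` is a hypothetical name, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path
from typing import Optional


def resolve_scope(root: Path, scope: Optional[str]) -> Optional[Path]:
    """Return the directory for `scope` if it stays inside root, else None."""
    root = root.resolve()
    if not scope:
        return root
    # Joining an absolute scope replaces root entirely, so it is rejected below.
    candidate = (root / scope).resolve()
    if not candidate.is_relative_to(root):
        return None
    return candidate
```

Resolving before the comparison matters: it collapses `..` segments and follows symlinks, so a scope such as `src/../../etc` is judged by where it really points.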
156 changes: 43 additions & 113 deletions src/kimi_cli/ui/shell/prompt.py
@@ -612,82 +612,15 @@ def _render_selected_item_lines(


class LocalFileMentionCompleter(Completer):
"""Offer fuzzy `@` path completion by indexing workspace files."""
"""Offer fuzzy `@` path completion by indexing workspace files.

File discovery and ignore rules are delegated to
:mod:`kimi_cli.utils.file_filter` so that the web backend can reuse
them.
"""

_FRAGMENT_PATTERN = re.compile(r"[^\s@]+")
_TRIGGER_GUARDS = frozenset((".", "-", "_", "`", "'", '"', ":", "@", "#", "~"))
_IGNORED_NAME_GROUPS: dict[str, tuple[str, ...]] = {
"vcs_metadata": (".DS_Store", ".bzr", ".git", ".hg", ".svn"),
"tooling_caches": (
".build",
".cache",
".coverage",
".fleet",
".gradle",
".idea",
".ipynb_checkpoints",
".pnpm-store",
".pytest_cache",
".pub-cache",
".ruff_cache",
".swiftpm",
".tox",
".venv",
".vs",
".vscode",
".yarn",
".yarn-cache",
),
"js_frontend": (
".next",
".nuxt",
".parcel-cache",
".svelte-kit",
".turbo",
".vercel",
"node_modules",
),
"python_packaging": (
"__pycache__",
"build",
"coverage",
"dist",
"htmlcov",
"pip-wheel-metadata",
"venv",
),
"java_jvm": (".mvn", "out", "target"),
"dotnet_native": ("bin", "cmake-build-debug", "cmake-build-release", "obj"),
"bazel_buck": ("bazel-bin", "bazel-out", "bazel-testlogs", "buck-out"),
"misc_artifacts": (
".dart_tool",
".serverless",
".stack-work",
".terraform",
".terragrunt-cache",
"DerivedData",
"Pods",
"deps",
"tmp",
"vendor",
),
}
_IGNORED_NAMES = frozenset(name for group in _IGNORED_NAME_GROUPS.values() for name in group)
_IGNORED_PATTERN_PARTS: tuple[str, ...] = (
r".*_cache$",
r".*-cache$",
r".*\.egg-info$",
r".*\.dist-info$",
r".*\.py[co]$",
r".*\.class$",
r".*\.sw[po]$",
r".*~$",
r".*\.(?:tmp|bak)$",
)
_IGNORED_PATTERNS = re.compile(
"|".join(f"(?:{part})" for part in _IGNORED_PATTERN_PARTS),
re.IGNORECASE,
)

def __init__(
self,
@@ -701,9 +634,12 @@ def __init__(
self._limit = limit
self._cache_time: float = 0.0
self._cached_paths: list[str] = []
self._cache_scope: str | None = None
self._top_cache_time: float = 0.0
self._top_cached_paths: list[str] = []
self._fragment_hint: str | None = None
self._is_git: bool | None = None # lazily detected
self._git_index_mtime: float | None = None

self._word_completer = WordCompleter(
self._get_paths,
@@ -717,21 +653,15 @@ def __init__(
pattern=r"^[^\s@]*",
)

@classmethod
def _is_ignored(cls, name: str) -> bool:
if not name:
return True
if name in cls._IGNORED_NAMES:
return True
return bool(cls._IGNORED_PATTERNS.fullmatch(name))

def _get_paths(self) -> list[str]:
fragment = self._fragment_hint or ""
if "/" not in fragment and len(fragment) < 3:
return self._get_top_level_paths()
return self._get_deep_paths()

def _get_top_level_paths(self) -> list[str]:
from kimi_cli.utils.file_filter import is_ignored

now = time.monotonic()
if now - self._top_cache_time <= self._refresh_interval:
return self._top_cached_paths
@@ -740,7 +670,7 @@ def _get_top_level_paths(self) -> list[str]:
try:
for entry in sorted(self._root.iterdir(), key=lambda p: p.name):
name = entry.name
if self._is_ignored(name):
if is_ignored(name):
continue
entries.append(f"{name}/" if entry.is_dir() else name)
if len(entries) >= self._limit:
@@ -753,45 +683,45 @@ def _get_top_level_paths(self) -> list[str]:
return self._top_cached_paths

def _get_deep_paths(self) -> list[str]:
now = time.monotonic()
if now - self._cache_time <= self._refresh_interval:
return self._cached_paths

paths: list[str] = []
try:
for current_root, dirs, files in os.walk(self._root):
relative_root = Path(current_root).relative_to(self._root)
from kimi_cli.utils.file_filter import (
detect_git,
git_index_mtime,
list_files_git,
list_files_walk,
)

# Prevent descending into ignored directories.
dirs[:] = sorted(d for d in dirs if not self._is_ignored(d))
fragment = self._fragment_hint or ""

if relative_root.parts and any(
self._is_ignored(part) for part in relative_root.parts
):
dirs[:] = []
continue
scope: str | None = None
if "/" in fragment:
scope = fragment.rsplit("/", 1)[0]

if relative_root.parts:
paths.append(relative_root.as_posix() + "/")
if len(paths) >= self._limit:
break
now = time.monotonic()
cache_valid = (
now - self._cache_time <= self._refresh_interval and self._cache_scope == scope
)

for file_name in sorted(files):
if self._is_ignored(file_name):
continue
relative = (relative_root / file_name).as_posix()
if not relative:
continue
paths.append(relative)
if len(paths) >= self._limit:
break
# Invalidate on .git/index mtime change (like Claude Code).
if cache_valid and self._is_git:
mtime = git_index_mtime(self._root)
if mtime != self._git_index_mtime:
cache_valid = False

if len(paths) >= self._limit:
break
except OSError:
if cache_valid:
return self._cached_paths

if self._is_git is None:
self._is_git = detect_git(self._root)

paths: list[str] | None = None
if self._is_git:
paths = list_files_git(self._root, scope)
self._git_index_mtime = git_index_mtime(self._root)
Comment on lines +717 to +719

P2: Reapply completion cap for git-backed deep searches

When a repo is detected, deep `@` completion now calls `list_files_git` without applying `self._limit`, so the candidate set becomes unbounded while the `os.walk` fallback still enforces the limit. In large repos this can feed tens of thousands of paths into `FuzzyCompleter` on each completion request, causing noticeable UI latency and negating the guardrail that the `limit` constructor argument previously provided.
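One possible way to address this comment, offered as a sketch rather than a proposed patch: truncating the git listing outright would bring back the unreachable-directory problem from #1375, so the cap could instead be applied after narrowing the list by the typed fragment. `capped_candidates` is a hypothetical helper, and the substring test is a simplification that stands in for real fuzzy matching.

```python
from itertools import islice
from typing import Iterable


def capped_candidates(paths: Iterable[str], fragment: str, limit: int) -> list[str]:
    """Keep at most `limit` paths that contain the fragment (case-insensitive)."""
    needle = fragment.lower()
    matches = (p for p in paths if needle in p.lower())
    return list(islice(matches, limit))  # stops scanning once the cap is reached
```

With this ordering, a late-alphabetical path such as `zeta/main.py` still surfaces when the user types `zeta`, while the completer never receives more than `limit` candidates.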

if paths is None:
paths = list_files_walk(self._root, scope, limit=self._limit)

self._cached_paths = paths
self._cache_scope = scope
self._cache_time = now
return self._cached_paths
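
The new `_get_deep_paths` treats its cache as valid only while three things hold: the refresh interval has not elapsed, the scope is unchanged, and (for git repos) the `.git/index` mtime is unchanged. That rule can be isolated in a small sketch. `PathCache` and `get_paths` are hypothetical names, and the `now` argument is injected only to make the behavior easy to check.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PathCache:
    ttl: float
    paths: list[str] = field(default_factory=list)
    scope: Optional[str] = None
    stamp: float = float("-inf")  # forces a load on first use
    index_mtime: Optional[float] = None

    def is_valid(self, scope: Optional[str], index_mtime: Optional[float], now: float) -> bool:
        return (
            now - self.stamp <= self.ttl
            and self.scope == scope
            and self.index_mtime == index_mtime
        )


def get_paths(
    cache: PathCache,
    scope: Optional[str],
    index_mtime: Optional[float],
    loader: Callable[[Optional[str]], list[str]],
    now: Optional[float] = None,
) -> list[str]:
    """Return cached paths, reloading on TTL expiry, scope change, or index change."""
    now = time.monotonic() if now is None else now
    if cache.is_valid(scope, index_mtime, now):
        return cache.paths
    cache.paths = loader(scope)
    cache.scope, cache.index_mtime, cache.stamp = scope, index_mtime, now
    return cache.paths
```

For a non-git directory, `index_mtime` would simply stay `None`, so only the interval and the scope decide whether the cache is reused.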
