[P1] Add startup performance profiler #3219

@doudouOUC

Priority: P1
Category: performance, core
Parent: #3011
Blocks: All other startup sub-issues
PR: #3232

Problem

recordStartupPerformance() exists in packages/core/src/telemetry/metrics.ts but is never called from main(). As a result, production has no wall-clock startup measurement, and no startup optimization can be validated against data.

Implemented Solution

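Setting the QWEN_CODE_PROFILE_STARTUP=1 environment variable enables a lightweight startup profiler that writes a JSON report to ~/.qwen/startup-perf/. The telemetry pipeline is left untouched; telemetry reporting is deferred to a separate follow-up PR.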

Core files

File                                        Description
packages/cli/src/utils/startupProfiler.ts   Profiler core: initStartupProfiler(), profileCheckpoint(), finalizeStartupProfile()
packages/cli/index.ts                       Calls initStartupProfiler() at the process entry point to record T0
packages/cli/src/gemini.tsx                 Instruments main() with 7 checkpoints

Checkpoint list

  • main_entry
  • after_load_settings
  • after_parse_arguments
  • after_sandbox_check
  • after_load_cli_config
  • after_initialize_app
  • before_render
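
The issue does not show the body of startupProfiler.ts, so the following is only a minimal sketch of how the three functions named above could fit together. The function names and the environment variable come from the issue; the signatures, the module-level state, and the `env` parameter (added here so the sketch can be exercised without touching the real environment) are assumptions. Each phase is named after the checkpoint that closes it, which is inferred from the sample report rather than confirmed by the source.

```typescript
import { performance } from 'node:perf_hooks';

interface Phase { name: string; startMs: number; durationMs: number }
interface StartupReport { processUptimeAtT0Ms: number; totalMs: number; phases: Phase[] }

// Module-level state (assumed layout, not taken from the real file).
let enabled = false;
let t0 = 0;
let uptimeAtT0Ms = 0;
let marks: Array<{ name: string; at: number }> = [];

function initStartupProfiler(
  env: Record<string, string | undefined> = process.env,
): void {
  // Reset first so that a repeated call never leaks earlier state.
  marks = [];
  enabled = env['QWEN_CODE_PROFILE_STARTUP'] === '1';
  if (!enabled) return; // disabled path: one env var check, nothing else
  uptimeAtT0Ms = process.uptime() * 1000; // time spent before T0 (module loading)
  t0 = performance.now();
}

function profileCheckpoint(name: string): void {
  if (!enabled) return;
  marks.push({ name, at: performance.now() - t0 });
}

function finalizeStartupProfile(): StartupReport | null {
  if (!enabled) return null;
  let prev = 0;
  const phases = marks.map(({ name, at }) => {
    // A phase runs from the previous checkpoint (or T0) to this checkpoint.
    const phase = { name, startMs: prev, durationMs: at - prev };
    prev = at;
    return phase;
  });
  enabled = false;
  return { processUptimeAtT0Ms: uptimeAtT0Ms, totalMs: prev, phases };
}
```

Under these assumptions, main() would call profileCheckpoint('main_entry') on entry, one checkpoint after each stage, and finalizeStartupProfile() just before rendering.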

JSON report structure

{
  "processUptimeAtT0Ms": 1341.65,  // Node.js process start to T0 (module loading time)
  "totalMs": 85.31,                // total time across all phases inside main()
  "phases": [
    { "name": "main_entry", "startMs": 0, "durationMs": 0.17 },
    { "name": "after_load_settings", "startMs": 0.17, "durationMs": 7.98 },
    ...
  ],
  "nodeVersion": "v24.12.0", "platform": "darwin", "arch": "arm64"
}
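
For tooling that reads these reports, the structure above can be expressed as a TypeScript type with a runtime check. This is a hedged sketch: the field names are taken from the sample report, while the interface names and the `isStartupReport` helper are hypothetical and not part of the PR.

```typescript
interface Phase {
  name: string;
  startMs: number;
  durationMs: number;
}

interface StartupReport {
  processUptimeAtT0Ms: number;
  totalMs: number;
  phases: Phase[];
  nodeVersion: string;
  platform: string;
  arch: string;
}

function isFiniteNumber(v: unknown): v is number {
  return typeof v === 'number' && Number.isFinite(v);
}

function isPhase(v: unknown): v is Phase {
  if (typeof v !== 'object' || v === null) return false;
  const p = v as Record<string, unknown>;
  return (
    typeof p['name'] === 'string' &&
    isFiniteNumber(p['startMs']) &&
    isFiniteNumber(p['durationMs'])
  );
}

// Hypothetical helper: narrows parsed JSON to StartupReport.
function isStartupReport(v: unknown): v is StartupReport {
  if (typeof v !== 'object' || v === null) return false;
  const r = v as Record<string, unknown>;
  return (
    isFiniteNumber(r['processUptimeAtT0Ms']) &&
    isFiniteNumber(r['totalMs']) &&
    Array.isArray(r['phases']) &&
    r['phases'].every(isPhase) &&
    typeof r['nodeVersion'] === 'string' &&
    typeof r['platform'] === 'string' &&
    typeof r['arch'] === 'string'
  );
}
```

A consumer would run `isStartupReport(JSON.parse(text))` on the file contents before reading any field. Note that the `//` comments in the sample above are annotations for the reader; a file parsed with JSON.parse cannot contain them.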

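Design decisions

  • Detailed mode only, no telemetry hookup: zero coupling; the profiler depends only on Node standard-library modules (performance, fs, os)
  • Profile only inside the sandbox: avoids duplicate reports from the outer process
  • process.uptime() fills the blind spot: it covers the module-loading phase that the profiler cannot measure because ESM imports are hoisted
  • initStartupProfiler() is idempotent: it resets state on entry, so repeated calls cannot leak state
  • Write failures are non-fatal: try/catch plus a stderr warning, with no effect on normal startup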
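
The last decision (a write failure must never break startup) can be sketched as follows. The target directory ~/.qwen/startup-perf/ comes from the issue; the helper name `writeReportSafely`, the `dir` parameter, the file-name pattern, and the warning text are assumptions made for illustration only.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Directory named in the issue; everything else below is a hypothetical sketch.
const DEFAULT_DIR = path.join(os.homedir(), '.qwen', 'startup-perf');

// Returns the path of the written file, or null if writing failed.
function writeReportSafely(report: unknown, dir: string = DEFAULT_DIR): string | null {
  try {
    fs.mkdirSync(dir, { recursive: true });
    // Hypothetical file name: the issue does not specify a naming scheme.
    const file = path.join(dir, `startup-${Date.now()}-${process.pid}.json`);
    fs.writeFileSync(file, JSON.stringify(report, null, 2));
    return file;
  } catch (err) {
    // Warn on stderr and carry on: profiling must not affect normal startup.
    const reason = err instanceof Error ? err.message : String(err);
    process.stderr.write(`startup profiler: could not write report: ${reason}\n`);
    return null;
  }
}
```

Returning null instead of rethrowing keeps the caller free of error handling, which is the point of this design decision.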

Baseline Measurements

Phase                        Duration      %
──────────────────────────────────────────────────
Node.js → T0 (module load)  1341.65ms   94.0%  ████████████████████████████████████████
main_entry                     0.17ms    0.0%
load_settings                  7.98ms    0.6%
parse_arguments                7.35ms    0.5%
sandbox_check                  0.06ms    0.0%
load_cli_config                1.55ms    0.1%
initialize_app                33.50ms    2.3%  █
before_render                 34.69ms    2.4%  █
──────────────────────────────────────────────────
Total (start → ready)       1426.96ms

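Key finding: module loading accounts for 94% of startup (1342ms), while main() itself takes only 85ms. Optimization work under #3011 should therefore focus on eager full loading through barrel exports, lazy loading of heavy dependencies, and code splitting.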
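
The percentage column in the baseline table can be reproduced from a report by dividing each duration by the sum of the module-load time and all phase durations. The sketch below uses the numbers from the table; the `shareOfStartup` helper and the `module_load` row name are hypothetical.

```typescript
interface Row { name: string; ms: number; percent: number }

// Hypothetical helper: each entry's share of process start to ready, in percent.
function shareOfStartup(
  moduleLoadMs: number,
  phases: Array<{ name: string; durationMs: number }>,
): Row[] {
  const total = moduleLoadMs + phases.reduce((sum, p) => sum + p.durationMs, 0);
  const entries = [{ name: 'module_load', durationMs: moduleLoadMs }, ...phases];
  return entries.map((p) => ({
    name: p.name,
    ms: p.durationMs,
    // Round to one decimal place, as in the table.
    percent: Math.round((p.durationMs / total) * 1000) / 10,
  }));
}

// Durations copied from the baseline table.
const baselineRows = shareOfStartup(1341.65, [
  { name: 'main_entry', durationMs: 0.17 },
  { name: 'load_settings', durationMs: 7.98 },
  { name: 'parse_arguments', durationMs: 7.35 },
  { name: 'sandbox_check', durationMs: 0.06 },
  { name: 'load_cli_config', durationMs: 1.55 },
  { name: 'initialize_app', durationMs: 33.5 },
  { name: 'before_render', durationMs: 34.69 },
]);

for (const row of baselineRows) {
  console.log(`${row.name.padEnd(18)} ${row.ms.toFixed(2).padStart(9)}ms ${row.percent.toFixed(1).padStart(6)}%`);
}
```

With these inputs the module-load row comes out at 94.0%, which matches the table and supports the conclusion that work inside main() is not the bottleneck.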

Acceptance Criteria

  • QWEN_CODE_PROFILE_STARTUP=1 produces a JSON report with phase timings
  • processUptimeAtT0Ms captures module loading time
  • No measurable overhead when profiling is disabled (<0.1ms, single env var check)
  • 11 unit tests pass
  • 1% of sessions report startup duration to telemetry (deferred to follow-up PR)
