Recorder-Like ASR Benchmark Task

Objective

Create a separate sherpa-voice benchmark page for recorder-like transcription experiments so multiple ASR models can be compared without changing the existing feature demos.

Success Criteria

A new benchmark page exists under the sherpa-voice feature stack.
The page supports repeatable sample file benchmarks across multiple ASR models.
The page supports live mic benchmarks for streaming models with latency-oriented metrics.
Benchmark results are kept in-session and can be copied/exported.
The React Native (RN) ASR config surface exposes the model-specific fields needed for fair comparisons instead of relying on hardcoded Whisper/SenseVoice defaults.
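
As an illustration of the "kept in-session and can be copied/exported" criterion, here is a minimal sketch of an in-memory result store with JSON and CSV export. The names (Row, SessionResults, toCsv) and the column set are hypothetical and are not taken from the repository; how the exported string reaches the clipboard or a file is left out.

```typescript
// Hypothetical in-session store: results live only in memory and are
// serialized on demand for copy/export. None of these names come from the repo.
type Cell = string | number | null;
type Row = Record<string, Cell>;

// Quote a CSV field only when it contains a comma, quote, or line break.
function escapeCsv(value: Cell | undefined): string {
  const s = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function toCsv(rows: Row[], columns: string[]): string {
  const lines = rows.map((r) => columns.map((c) => escapeCsv(r[c])).join(","));
  return [columns.join(",")].concat(lines).join("\n");
}

class SessionResults {
  private rows: Row[] = [];

  add(row: Row): void {
    this.rows.push(row);
  }

  clear(): void {
    this.rows = [];
  }

  exportJson(): string {
    return JSON.stringify(this.rows, null, 2);
  }

  exportCsv(columns: string[]): string {
    return toCsv(this.rows, columns);
  }
}
```

A page could call add() after each run and hand exportCsv() or exportJson() to whatever copy/share mechanism the app already uses.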

Benchmark Scope

Android-first
On-device-first
English-first
Transcription-first

Initial Model Matrix

streaming-zipformer-en-20m-mobile
streaming-zipformer-en-general
streaming-zipformer-ctc-small-2024-03-18
streaming-zipformer-bilingual-zh-en-2023-02-20
streaming-paraformer-bilingual-zh-en
streaming-zipformer-en-kroko-2025-08-06
whisper-tiny-en
whisper-small-multilingual
sense-voice-zh-en-ja-ko-yue-int8-2025-09-09
nemo-canary-180m-flash-en-es-de-fr
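
The matrix above could be represented as a typed list, with the live mic benchmark restricted to streaming models. This is only a sketch: it assumes that the "streaming-" id prefix is a reliable marker for streaming models, which holds for the ids listed here but is not something the task states as a rule. The helper names are hypothetical.

```typescript
// Model ids copied from the matrix above.
const MODEL_IDS = [
  "streaming-zipformer-en-20m-mobile",
  "streaming-zipformer-en-general",
  "streaming-zipformer-ctc-small-2024-03-18",
  "streaming-zipformer-bilingual-zh-en-2023-02-20",
  "streaming-paraformer-bilingual-zh-en",
  "streaming-zipformer-en-kroko-2025-08-06",
  "whisper-tiny-en",
  "whisper-small-multilingual",
  "sense-voice-zh-en-ja-ko-yue-int8-2025-09-09",
  "nemo-canary-180m-flash-en-es-de-fr",
] as const;

type ModelId = (typeof MODEL_IDS)[number];
type BenchmarkMode = "sample" | "live";

// Assumption: the id prefix identifies streaming models.
const STREAMING_PREFIX = "streaming-";

function isStreamingModel(id: string): boolean {
  return id.indexOf(STREAMING_PREFIX) === 0;
}

// Sample-file benchmarks run every model; live mic benchmarks run only
// the streaming ones.
function modelsForMode(mode: BenchmarkMode): ModelId[] {
  const all: ModelId[] = MODEL_IDS.slice();
  return mode === "live" ? all.filter(isStreamingModel) : all;
}
```

In a real implementation an explicit per-model flag would be more robust than parsing the id.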

Metrics To Capture

model id
benchmark mode (sample or live)
init time
run duration
first partial latency
first commit latency
trailing commit latency after last audio chunk
transcript output
error state
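
One way to derive these metrics is to record raw timestamps during a run and compute the latencies afterwards. The sketch below is hypothetical throughout: the type and field names are illustrative, and the latency definitions (first partial and first commit measured from the first audio chunk, trailing commit measured from the last audio chunk to the last commit) are assumptions, since the task does not define them precisely.

```typescript
type BenchmarkMode = "sample" | "live";

// Raw timestamps in milliseconds, collected during one run (hypothetical shape).
interface RunTimestamps {
  firstAudioAt: number;          // first audio chunk fed to the recognizer
  lastAudioAt: number;           // last audio chunk fed to the recognizer
  firstPartialAt: number | null; // first partial hypothesis, if any
  firstCommitAt: number | null;  // first committed segment, if any
  lastCommitAt: number | null;   // last committed segment, if any
  finishedAt: number;            // run considered complete
}

// One record per run, mirroring the metric list above.
interface BenchmarkResult {
  modelId: string;
  mode: BenchmarkMode;
  initMs: number;
  runMs: number;
  firstPartialMs: number | null;
  firstCommitMs: number | null;
  trailingCommitMs: number | null;
  transcript: string;
  error: string | null;
}

// Difference between an optional event time and a reference time.
function elapsed(at: number | null, since: number): number | null {
  return at === null ? null : at - since;
}

function summarizeRun(
  modelId: string,
  mode: BenchmarkMode,
  initMs: number,
  ts: RunTimestamps,
  transcript: string,
  error: string | null
): BenchmarkResult {
  return {
    modelId,
    mode,
    initMs,
    runMs: ts.finishedAt - ts.firstAudioAt,
    firstPartialMs: elapsed(ts.firstPartialAt, ts.firstAudioAt),
    firstCommitMs: elapsed(ts.firstCommitAt, ts.firstAudioAt),
    trailingCommitMs: elapsed(ts.lastCommitAt, ts.lastAudioAt),
    transcript,
    error,
  };
}
```

Non-streaming models never emit partials, so their firstPartialMs would simply stay null under this scheme.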

Notes

Translation stays experimental in this PR; the first benchmark focus is Recorder-like transcription responsiveness.
How current the upstream Sherpa version is should be tracked separately from any claim that the native binaries have been rebuilt.
The benchmark matrix intentionally includes non-winner baselines so recommendations are based on tradeoffs, not only on the strongest model.
packages/sherpa-onnx.rn is part of the scope: package-level hardcoded ASR defaults were removed where needed for fair benchmarking, and remaining bridge/runtime gaps are documented in docs/ASR_BENCHMARK.md.
Current benchmark decision: streaming-zipformer-bilingual-zh-en-2023-02-20 is the best live model in the tested Sherpa set on the Pixel 6a, but it is still far from Google Recorder-like performance.
