Conversation
Pull request overview
Upgrades the @geti/smart-tools web ONNX Runtime dependency to a newer onnxruntime-web version to enable the webgpu execution provider, and updates the Segment Anything pipeline/session setup to work with the new runtime behavior (including worker-safe tensor passing).
Changes:
- Bump `onnxruntime-web` to `~1.24.3` (and update the root `package-lock.json` accordingly).
- Update WASM/runtime configuration and default execution providers to prefer `webgpu`, plus serialize `InferenceSession.run()` calls.
- Make Segment Anything encoder output structured-clone safe by returning a plain-object tensor representation and reconstructing the tensor in the decoder.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| web_ui/packages/smart-tools/src/utils/wasm-utils.ts | Adjusts session defaults (WebGPU+CPU) and switches to a new WASM asset path/type. |
| web_ui/packages/smart-tools/src/segment-anything/session.ts | Updates ORT env/session creation usage and adds run-call serialization to avoid concurrent run() on a single session. |
| web_ui/packages/smart-tools/src/segment-anything/segment-anything-encoder.ts | Converts encoder output tensor into a serializable representation suitable for worker transfers. |
| web_ui/packages/smart-tools/src/segment-anything/segment-anything-decoder.ts | Reconstructs an ort.Tensor from the serialized encoder output for decoder execution. |
| web_ui/packages/smart-tools/package.json | Bumps onnxruntime-web dependency version. |
| web_ui/package-lock.json | Locks updated onnxruntime-web/onnxruntime-common and transitive dependency versions. |
Files not reviewed (1)
- web_ui/package-lock.json: Language not supported
```diff
 numThreads: number;
 executionProviders: string[];
-wasmRoot?: string | Record<string, string>;
+wasmRoot?: string | { wasm?: string };
```
SessionParameters.wasmRoot is now typed as string | { wasm?: string }, but the rest of the code treats this as a concrete wasmPaths value (and passes it into ORT env config). Making wasm optional allows {} / undefined at the type level and tends to be incompatible with ORT’s usual Record<string, string>/string wasmPaths typing under strict mode. Consider making wasm required (e.g., { wasm: string }) or reverting to a broader Record<string, string>-style type that matches what ORT expects.
```diff
-wasmRoot?: string | { wasm?: string };
+wasmRoot?: string | { wasm: string };
```
```diff
 env.wasm.numThreads = this.params.numThreads;
 env.wasm.wasmPaths = this.params.wasmRoot;
 env.wasm.simd = true;
+// Suppress expected "some nodes not assigned to WebGPU EP" warnings —
+// ORT intentionally keeps shape-related ops on CPU for performance.
+env.logLevel = 'error';
```
env.logLevel = 'error' changes the global ONNX Runtime logging level for the entire app/package, not just this session. This can hide important warnings/errors elsewhere (and makes diagnosing WebGPU fallback issues harder). Consider scoping this (save/restore the previous logLevel) or making it configurable via the session parameters instead of forcing it in init().
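The save/restore approach suggested here can be sketched generically. The `withLogLevel` helper and `EnvLike` shape below are assumptions for illustration, not ORT API; the real code would pass onnxruntime-web's `env` object.

```typescript
// Sketch: scope a log-level change to one async operation and
// restore the previous level afterwards, even if the operation
// throws. `withLogLevel`/`EnvLike` are illustrative names.
type LogLevel = 'verbose' | 'info' | 'warning' | 'error' | 'fatal';

interface EnvLike {
    logLevel?: LogLevel;
}

async function withLogLevel<T>(
    env: EnvLike,
    level: LogLevel,
    fn: () => Promise<T>
): Promise<T> {
    const previous = env.logLevel;
    env.logLevel = level;
    try {
        return await fn();
    } finally {
        // Restore whatever level the rest of the app was using.
        env.logLevel = previous;
    }
}
```

With this shape, the warning suppression applies only around the noisy call (e.g. session creation), instead of globally for the whole package lifetime.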
```typescript
const encoderResult: SerializableTensor = {
    data: new Float32Array((await gpuTensor.getData()) as Float32Array),
```
data: new Float32Array((await gpuTensor.getData()) as Float32Array) creates an extra copy of the tensor data (and will double-copy again when structured-cloned), which can be expensive for SAM embeddings. If getData() already returns a Float32Array, consider using it directly (and/or using Comlink transfer of the underlying buffer) to avoid redundant allocations.
```diff
-const encoderResult: SerializableTensor = {
-    data: new Float32Array((await gpuTensor.getData()) as Float32Array),
+const tensorData = (await gpuTensor.getData()) as Float32Array;
+const encoderResult: SerializableTensor = {
+    data: tensorData,
```
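Beyond dropping the redundant `new Float32Array(...)`, the structured-clone copy itself can be avoided by moving the underlying buffer via a transfer list (the mechanism `Comlink.transfer()` wraps for worker messages). A hedged sketch, assuming a tensor shape like the one in this PR; `SerializableTensor` fields and `toTransferable` are illustrative:

```typescript
// Sketch: reuse the Float32Array from getData() directly and detach
// its buffer on send instead of cloning it. Illustrative names.
interface SerializableTensor {
    data: Float32Array;
    dims: number[];
    type: 'float32';
}

function toTransferable(tensorData: Float32Array, dims: number[]): SerializableTensor {
    // No defensive copy: the transfer list below will move this
    // buffer to the receiving realm instead of duplicating it.
    return { data: tensorData, dims, type: 'float32' };
}

// structuredClone with a transfer list detaches the source buffer:
// after the call, the sender's view is empty and the receiver owns
// the memory — zero copies of the embedding data.
const payload = toTransferable(new Float32Array(4), [1, 4]);
const moved = structuredClone(payload, { transfer: [payload.data.buffer] });
```

Note the trade-off: after transfer the encoder-side array is detached, so this only works if the encoder does not need to reuse the embedding locally.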
📝 Description
Enables the `webgpu` execution provider, which in early tests showed a 10-12x speed improvement.

✨ Type of Change
Select the type of change your PR introduces:
🧪 Testing Scenarios
Describe how the changes were tested and how reviewers can test them too:
✅ Checklist
Before submitting the PR, ensure the following: