feat(core,ethexe,pallet-gear): strip WASM custom sections from InstrumentedCode#5367
feat(core,ethexe,pallet-gear): strip WASM custom sections from InstrumentedCode#5367
Conversation
…mentedCode Since commit dafb483 the instrumentation pipeline kept custom sections (sails:idl, producers, etc.) inside the persisted instrumented artifact. They are never consumed at sandbox execution time — IDL readers use OriginalCode via the gear_readWasmCustomSection RPC. Keep them out of InstrumentedCode so the sandbox load path is smaller and per-byte gas charges on Vara drop accordingly. Changes: * utils/wasm-instrument: add Module::strip_custom_sections (preserves name_section so Wasmer/Wasmtime trap backtraces stay readable). * core/src/code: call the strip in Code::try_new_internal before module.serialize(), so both pallet-gear and ethexe benefit. * ethexe_runtime_common::VERSION 1 → 2 and extend the ethexe InstrumentedCode DB key from (u32, CodeId) to (runtime_id: u32, version: u32, CodeId). Fixes a latent bug where various call sites passed RUNTIME_ID instead of VERSION; now both participate in the key. RPC code_getInstrumented gains a `version` parameter (breaking RPC change). * pallet_gear Schedule InstructionWeights::version 1900 → 1910 to trigger lazy re-instrumentation of existing on-chain programs. Gas impact on Vara: program upload and re-instrumentation cost drops slightly; the stored instrumented bytes are smaller. IDL readers are unaffected (they read OriginalCode). Existing programs lazily migrate to the stripped shape on their next execution via reinstrument_code. Tests: * wasm-instrument unit tests: strip clears custom, keeps name; no-op on empty module. * gear-core end-to-end: OriginalCode keeps sails:idl, InstrumentedCode has no custom sections. * ethexe processor integration: same contract across the full process_code pipeline. * pallet-gear regression: stripping_reduces_instrumented_code_len uploads a program with a 4 KiB sails:idl section and asserts the strip + net size-drop contract end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Changed Files
|
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request optimizes the storage and execution of instrumented WASM code by stripping unnecessary custom sections before persistence. It introduces a versioned database key layout for Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to strip WASM custom sections (excluding the name section) from instrumented code while ensuring they are preserved in the original code for RPC consumers. It also updates the InstrumentedCode database key layout in ethexe to include an instrumentation version, bumping the runtime version to 2 and providing a migration (v3) to clear stale entries. A compilation error was identified in the new migration file due to a missing import and inconsistent logging macro usage.
| } | ||
| } | ||
|
|
||
| log::info!( |
There was a problem hiding this comment.
The log crate is not imported in this file, which will lead to a compilation error. Additionally, the rest of the ethexe-db crate (e.g., database.rs) uses the tracing crate for logging. For consistency and to fix the compilation error, please use tracing::info!.
| log::info!( | |
| tracing::info!( |
There was a problem hiding this comment.
This is a false positive:
-
No compilation error.
logis declared as a direct workspace dep ofethexe-dbatethexe/db/Cargo.toml:23(log.workspace = true).log::info!resolves via the crate path and does not require ause logimport. Localcargo check -p ethexe-dbpasses. -
Logging convention. The
migrationsmodule consistently useslog::*!:ethexe/db/src/migrations/v2.rs:46, 71, 115ethexe/db/src/migrations/init.rs:37, 40, 59, 63, 78, 98
Switching v3 to
tracing::info!would diverge from its siblings in the same module. Thetracingconvention lives inethexe/db/src/database.rs— not a crate-wide rule.
Keeping log::info! to stay consistent with the other migration files.
|
@claude opus review |
|
Claude finished @ukint-vs's task in 4m 20s —— View job PR Review — strip WASM custom sections from InstrumentedCodeOverall: LGTM. The change is well-scoped, correctly sequenced, and the new test coverage is solid. Two minor maintenance notes below, nothing blocking. SummaryThis PR reduces the size of What I verifiedStripping placement (
DB key layout — v3 migration ( RPC surface ( Call sites — Findingsseverity: low — hardcoded File: db.set_instrumented_code(1, 2, code_id, instrumented);The PR description already flags this as "not in scope" and the comment explains the dep-cycle constraint (
Non-blocking, but worth a follow-up issue. severity: low — migration version-pin assertions require updates on every bump Files: Each prior migration has a Informational
TestsCoverage is thorough:
Ship it after the mock-constants follow-up is at least filed as an issue. Reviewed by Claude Opus 4.7 via |
…ants
Extract `MOCK_RUNTIME_ID` and `MOCK_VERSION` from the `(1, 2, code_id, …)`
literal in `ethexe-common`'s mock. `ethexe-common` can't import
`ethexe_runtime_common::{RUNTIME_ID, VERSION}` directly — runtime-common
depends on common, so the reverse would be a dep cycle — but the drift is
guarded the other way: `ethexe-runtime-common` now has a dev-dep on
`ethexe-common` with the `mock` feature and a `#[cfg(test)]` assertion
that `RUNTIME_ID == MOCK_RUNTIME_ID` and `VERSION == MOCK_VERSION`. A
future bump of either real constant that forgets to update the mock
fails that test loudly instead of silently diverging.
Follow-up to #5367 review feedback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Addressed the hardcoded
The dep-cycle constraint (runtime-common → common, not the reverse) is unchanged, so the mock can't import the real constants directly. But now a future bump of Follow-up per @claude's review. |
- `make fmt` reflowed multi-line literals in 7 files touched by this PR (no semantic changes). - `v3::migration_from_v2` imported `KVDatabase` alongside `RawDatabase` but never referenced the trait by name — method calls on `db.kv: Box<dyn KVDatabase>` resolve through the vtable without the trait in scope. Drop the unused import. Fixes `check / fmt` and `check / clippy` CI failures on #5367. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`wasmparser::Payload::CustomSection` fires for every standard custom section including the `name` section, but we intentionally preserve `name_section` (see `Module::strip_custom_sections`'s rustdoc — it keeps trap backtraces readable). The test's broader "no custom sections at all" assertion therefore contradicted the design and failed on CI despite passing the targeted `sails:idl` check. Narrow the broader assertion to "no custom sections other than `name`" and include the offending section name(s) in the panic message for easier diagnostics next time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…m-sections # Conflicts: # ethexe/db/src/migrations/v2.rs # ethexe/db/src/migrations/v3.rs
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
grishasobol
left a comment
There was a problem hiding this comment.
update the branch, you have side changes
Use the existing ModuleBuilder::push_custom_section helper in the three tests instead of re-rolling the get_or_insert_with(Vec::new).push(...) dance. Add a size-reduction assertion in stripping_reduces_instrumented_code_len so the test's stated contract is actually guarded. Pin the Key enum's InstrumentedCode discriminant via a unit test so a future Key reorder fails loudly rather than silently mis-scanning in migrations/v4.rs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…m-sections # Conflicts: # ethexe/db/src/migrations/mod.rs # ethexe/db/src/migrations/v2.rs # ethexe/db/src/migrations/v4.rs
| fn instrumented_code( | ||
| &self, | ||
| runtime_id: u32, | ||
| version: u32, |
There was a problem hiding this comment.
What the reason of adding second version? As soon as version can be changed only when runtime is changed, then we can use only one here, runtime version or instrumentation version
| pub const VERSION: u32 = 2; | ||
| pub const RUNTIME_ID: u32 = 1; |
There was a problem hiding this comment.
what the difference between version and runtime id ?
| /// re-instrument existing codes — that relies on the normal code processing | ||
| /// pipeline observing the code again. Operators upgrading a live node should | ||
| /// expect a brief window where existing programs return | ||
| /// `MissingInstrumentedCodeForProgram` until their codes are re-observed. |
There was a problem hiding this comment.
we do not have any re-instrumentation mechanism right now
Per @grishasobol's review: `runtime_id` and `version` always change together, so carrying both in the `InstrumentedCode` DB key and in the `CodesStorage*` traits was redundant. Drop `RUNTIME_ID` (it had no use outside DB-key call sites) and key by `VERSION` only — `VERSION` is also what `Code::try_new` consumes as `instruction_weights_version`, so it carries semantic weight that `RUNTIME_ID` did not. DB key shape: `(runtime_id, version, code_id)` → `(version, code_id)`. Trait signatures lose the duplicated `u32`. Mock keeps `MOCK_VERSION` only; the runtime-common assertion test follows. The v5 migration body is unchanged in spirit but the doc no longer claims re-instrumentation happens (per @grishasobol — there is no such mechanism today). Old and new key shapes now have the same byte length, so the length-filter is dropped in favor of wiping every entry under the `InstrumentedCode` discriminant unconditionally; all of them are stale relative to the bumped `VERSION`. RPC `code_getInstrumented(runtime_id, code_id)` keeps its public signature; the `runtime_id` parameter is now ignored internally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Addressed @grishasobol's review in Both "what's the difference between version and runtime_id" / "use only one here" comments: collapsed. The The public RPC "we do not have any re-instrumentation mechanism right now": fixed the v5 doc to stop claiming the compute pipeline re-instruments on next observation. It doesn't — Tests green: 🤖 Generated with Claude Code |
Self-review pass on the previous commit:
- `VERSION` doc and v5 migration doc collapsed to the essentials.
- Drop diff-narration comments ("post-bump key shape has the same byte
length as the pre-bump one...") — readers see only the final code.
- Drop the SAFETY comment on `unsafe { db.kv.take(...) }` in v5: the
unsafety on `KVDatabase::take` is purely a data-loss risk, exactly
the intent here. `let _ = ...` makes that explicit without prose.
- Tighten `Key::to_bytes`' debug_assert upper-bound to match the
`with_capacity` hint (one `u32`, not two).
- Drop redundant comments on `instrumented_code_key_discriminant_is_stable`
and inside `test_instrumented_code` that restated the assertion.
- Collapse the now-single-constant header in `ethexe-common::mock`.
No behavior change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Strip WASM custom sections (e.g.
sails:idl,producers) fromInstrumentedCodebefore persisting. They are never consumed at sandbox execution time — IDL readers
pull from
OriginalCodevia thegear_readWasmCustomSectionRPC. Keeping them inInstrumentedCodejust inflates the bytes that are reloaded into Wasmer/Wasmtimeon every execution, gas-billed to users on Vara, and synced over libp2p on ethexe.
Since commit dafb483 the instrumentation pipeline round-tripped custom sections
so the IDL survived on-chain. This PR moves the stripping to the single choke point
Code::try_new_internal, so both pallet-gear and ethexe benefit from one call.name_sectionis preserved — Wasmer/Wasmtime trap backtraces stay readable.Changes
utils/wasm-instrument&coreModule::strip_custom_sections()— clearscustom_sections, keepsname_section. Rustdoc notes the intentional divergence fromwasm_optimizer::Optimizer::strip_custom_sections, which clears both.Code::try_new_internalcalls it right beforemodule.serialize(), so thecode_lencaptured intoCodeMetadatareflects the bytes actually stored(this is what pallet-gear charges gas against).
ethexe
ethexe_runtime_common::VERSION: 1 → 2.InstrumentedCodeextended from(u32, CodeId)to(runtime_id: u32, version: u32, CodeId). This closes a latent bug wherevarious call sites passed
RUNTIME_IDwhere production code expectedVERSION— both happened to equal1, so the mismatch was silently benignuntil this PR. All production call sites now pass
(RUNTIME_ID, VERSION).v3::migration_from_v2drops staleInstrumentedCodeentriesstored under the old 2-tuple layout.
ethexe-db::LATEST_VERSION: 2 → 3.code_getInstrumented(runtime_id, code_id)signature is preserved —the handler internally defaults to
ethexe_runtime_common::VERSION.External clients are not affected.
Vara / pallet-gear
Schedule::InstructionWeights::version: 1900 → 1910. Triggers lazyre-instrumentation of existing on-chain programs via the standard
queue.rs::needs_reinstrumentationpath — no storage migration required.Gas impact (Vara)
Program upload and re-instrumentation cost drops slightly; the stored
instrumented bytes are smaller because IDL + producers + any other custom
sections are no longer included. The drop is proportional to IDL size (typically
1–20 KiB). IDL readers are unaffected —
gear_readWasmCustomSectionreadsOriginalCode.Weight regeneration for
pallet_gearis deferred to CI / a maintainer passsince local benchmark runs produce hardware-biased numbers.
Migration story
Vara: lazy. Existing programs reinstrument on next execution via
reinstrument_codewhenSchedule::InstructionWeights::versionchanges. Nostorage migration.
ethexe: the
v3DB migration drops staleInstrumentedCodeentries storedunder the old key layout. The migration does not itself re-instrument — it
relies on the normal code observation pipeline to re-populate entries on next
touch. Operators upgrading a live node should expect a brief window where
existing programs return
MissingInstrumentedCodeForProgramuntil their codesare re-observed. Acceptable given ethexe's pre-mainnet status; if that ever
changes, add a lazy re-instrument fallback in
instrumented_code_and_metadata.Tests
utils/wasm-instrument:strip_custom_sections_clears_custom_but_keeps_name,strip_custom_sections_on_empty_module_is_noop.gear-core:instrumented_code_strips_custom_sections_but_original_keeps_them—end-to-end through
Code::try_new, assertsOriginalCoderetainssails:idland
InstrumentedCodehas no custom sections.ethexe-processor:instrumented_code_strips_custom_sections— same contractacross the full
Processor::process_codepipeline.ethexe-db:test_instrumented_codeextended with cross-runtime andcross-version isolation assertions (bumping either constant must invalidate
prior entries).
pallet-gear:stripping_reduces_instrumented_code_len— uploads a programwith a 4 KiB
sails:idlsection, assertsOriginalCodepreserves it andInstrumentedCodestrips all custom sections.Verification
Not in scope
a processor through the migration framework).
RUNTIME_ID/VERSIONconstants intoethexe-commonto eliminatethe hard-coded
(1, 2, ...)literal inmock.rs(blocked by the currentdep graph; tracked with a TODO).
pallet_gearruntime weight benchmarks (CI / maintainer task oncanonical hardware).
🤖 Generated with Claude Code