This was found while investigating a 5x-exaggerated proof_size weight for one of the pallet_contracts API functions. The bug was introduced in #11637.
## Context
Results of benchmarks are processed with `writer::process_storage_results()` in the following way. It loops through all the storage keys of all results:

```rust
// substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs, lines 549-550 @ 9c92e49
for result in results.iter().rev() {
	for (key, reads, writes, whitelisted) in &result.keys {
```
and multiplies each benchmark result for each key, adjusting the result's proof_size per key, since it depends on:
- the PoV estimation mode,
- the single-read PoV overhead,
- the number of reads.

```rust
// substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs, lines 628-632 @ 9c92e49
// Add the additional trie layer overhead for every new prefix.
if *reads > 0 {
	prefix_result.proof_size += 15 * 33 * additional_trie_layers as u32;
}
storage_per_prefix.entry(prefix.clone()).or_default().push(prefix_result);
```
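To make the fan-out concrete, here is a minimal, self-contained sketch of what this loop effectively does. The types and the `fan_out` helper are hypothetical simplifications, not the actual `writer.rs` structures; the point is that every per-key entry starts from the proof_size of the *whole* run and then gets the single-key trie-layer overhead added on top:

```rust
use std::collections::HashMap;

// Hypothetical simplified stand-in for the per-prefix result record.
#[derive(Clone, Debug, PartialEq)]
struct PrefixResult {
	proof_size: u32,
}

// Hypothetical helper mimicking the fan-out: one benchmark result touching
// N keys produces N per-prefix entries, each carrying the FULL run
// proof_size plus a per-key trie-layer overhead when the key was read.
fn fan_out(
	total_proof_size: u32,
	keys: &[(&'static str, u32)], // (prefix, reads)
	additional_trie_layers: u8,
) -> HashMap<&'static str, Vec<PrefixResult>> {
	let mut storage_per_prefix: HashMap<&'static str, Vec<PrefixResult>> = HashMap::new();
	for &(prefix, reads) in keys {
		// Each per-key entry starts from the proof_size of the whole run...
		let mut prefix_result = PrefixResult { proof_size: total_proof_size };
		// ...and the single-key overhead is added on top of it.
		if reads > 0 {
			prefix_result.proof_size += 15 * 33 * additional_trie_layers as u32;
		}
		storage_per_prefix.entry(prefix).or_default().push(prefix_result);
	}
	storage_per_prefix
}

fn main() {
	// One run touching three keys, real proof_size 10_000 bytes:
	// the 10_000 is duplicated into every per-prefix entry.
	let per_prefix = fan_out(10_000, &[("A", 1), ("B", 1), ("C", 0)], 2);
	let total: u32 = per_prefix.values().flatten().map(|r| r.proof_size).sum();
	println!("sum of per-prefix proof_sizes = {total}");
}
```

Summing across the per-prefix entries already triples the real 10_000 bytes (plus the per-key overheads), which is exactly the double counting the rest of this issue is about.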
The result is the `storage_per_prefix` map, which was originally used only to generate comments listing the storage keys touched during each benchmark.
## The Bug
In PR #11637 this data started to be used to calculate the resulting proof_size formula emitted into `weights.rs`, but in the wrong way.

Step 1 (almost right). We find the average base proof_size value and the slope for each component by running regression analysis over the benchmark results collected for that component in `storage_per_prefix` (see above):

```rust
// substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs, lines 299-303 @ 9c92e49
let proof_size_per_components = storage_per_prefix
	.iter()
	.map(|(prefix, results)| {
		let proof_size = analysis_function(results, BenchmarkSelector::ProofSize)
			.expect("analysis function should return proof sizes for valid inputs");
```
(This is only almost right: the values we run the regression on are not per-key benchmark results. They originate from benchmark results covering all the keys together and are then adjusted for a single key; see below for details.)
Step 2 (wrong). The resulting `base_calculated_proof_size` and `component_calculated_proof_size` values are computed as a simple sum of the values for all prefixes from the per-prefix benchmark results:

```rust
// substrate/utils/frame/benchmarking-cli/src/pallet/writer.rs, lines 317-323 @ 9c92e49
for (_, slope, base) in proof_size_per_components.iter() {
	base_calculated_proof_size += base;
	for component in slope.iter() {
		let mut found = false;
		for used_component in used_calculated_proof_size.iter_mut() {
			if used_component.name == component.name {
				used_component.slope += component.slope;
```
But wait a minute. To recap, the path of these proof_size values is:
- They are first computed for all keys in each benchmark run.
- Then they are duplicated for each key, adjusted by the read overhead for that key (which is weird, because we add the PoV overhead of a single key on top of the proof_size of all the keys).
- Then those duplicated values are summed up.
And this is wrong. It leads to exaggerated weights being written into `weights.rs`:

```handlebars
{{! substrate/.maintain/frame-weight-template.hbs, lines 73-75 @ 9c92e49 }}
{{#each benchmark.component_calculated_proof_size as |cp|}}
	.saturating_add(Weight::from_parts(0, {{cp.slope}}).saturating_mul({{cp.name}}.into()))
{{/each}}
```
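A small numeric example (hypothetical numbers, not taken from a real benchmark run) shows the size of the error. The `summed_base` helper below mimics what Step 2 does: it simply adds up the per-prefix regression bases, even though each of them already carries the full run's proof_size:

```rust
// Hypothetical helper mimicking Step 2's simple sum over prefixes.
fn summed_base(per_prefix_bases: &[u32]) -> u32 {
	per_prefix_bases.iter().sum()
}

fn main() {
	// A run touching 3 keys with a real measured proof_size of 10_000 bytes:
	// every per-prefix base already carries the FULL 10_000 (per-key trie
	// overhead omitted here for simplicity)...
	let per_prefix_bases = [10_000u32; 3];
	// ...so the base emitted into weights.rs comes out 3x too large.
	println!("emitted base = {}", summed_base(&per_prefix_bases));
}
```

With more storage keys touched, the multiplier grows accordingly, which matches the ~5x weight observed for the pallet_contracts function that triggered this investigation.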
## Suggested Fix
Instead of a simple sum, `base_calculated_proof_size` and `component_calculated_proof_size` should be calculated as the average or median over all keys.
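As a hedged sketch of the suggested fix (the function name and shape are hypothetical, not the actual `writer.rs` code): replace the sum with an average (a median would work similarly). Since each per-prefix base already reflects the proof_size of the whole run, summing counts it once per key, while averaging recovers a value close to the real measurement:

```rust
// Hypothetical helper: combine per-prefix regression bases by averaging
// instead of summing. Each base already reflects the whole run's
// proof_size, so a sum would double-count it once per key.
fn combine_bases(per_prefix_bases: &[u32]) -> u32 {
	if per_prefix_bases.is_empty() {
		return 0;
	}
	// Sum in u64 to avoid overflow, then take the integer average.
	let sum: u64 = per_prefix_bases.iter().map(|&b| b as u64).sum();
	(sum / per_prefix_bases.len() as u64) as u32
}

fn main() {
	// Three keys, each per-prefix base near the real run proof_size of
	// 10_000 bytes (two carry the per-key trie-layer overhead of 990).
	let bases = [10_990, 10_990, 10_000];
	println!("combined base = {}", combine_bases(&bases));
}
```

The same averaging would apply to the per-component slopes accumulated into `component_calculated_proof_size`.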