string serialization escaping optimisations #1273

Merged
dtolnay merged 2 commits into serde-rs:master from conradludgate:optimise-string-escaping
Jul 18, 2025

Contributor

@conradludgate conradludgate commented Jul 18, 2025

While serializing strings, we enter a hot loop that checks 1 byte at a time. By massaging the code and using a little bit of unsafe, we can optimise this hot loop a bit.

The benchmark below shows about a 2-5% speedup when working with string-heavy documents.

The benchmark was run with target x86_64-unknown-linux-gnu on an Intel Xeon Platinum 8375C CPU @ 2.90GHz, as well as on an Apple M4 Max.

use std::hint::black_box;

use criterion::{criterion_group, criterion_main, Criterion};
use serde_json::Value;

pub fn k8s(c: &mut Criterion) {
    // https://raw.githubusercontent.com/kubernetes/kubernetes/v1.33.3/api/openapi-spec/swagger.json
    let k8s = std::fs::read_to_string("benches/k8s-openapi.json").unwrap();
    let value: Value = serde_json::from_str(&k8s).unwrap();
    drop(k8s);

    let mut v = Vec::new();
    // Serialize once up front so `v` already has capacity for the full output.
    serde_json::to_writer_pretty(&mut v, &value).unwrap();

    c.bench_function("pretty", |b| {
        b.iter(|| {
            v.clear();
            serde_json::to_writer_pretty(&mut v, &value).unwrap();
            black_box(&v[..]);
        })
    });

    c.bench_function("compact", |b| {
        b.iter(|| {
            v.clear();
            serde_json::to_writer(&mut v, &value).unwrap();
            black_box(&v[..]);
        })
    });
}

criterion_group!(benches, k8s);
criterion_main!(benches);
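
For context on the loop being discussed, here is a minimal sketch of a table-driven, byte-at-a-time escaping loop in safe Rust. It is not the diff from this PR and makes no claim about matching serde_json's internals: the names `escape_kind`, `ESCAPE` and `escape_into` are chosen for illustration only, and the sketch leaves `/` unescaped, which JSON permits.

```rust
/// Returns the byte that follows the backslash in the escape for `b`,
/// or 0 if `b` can be written as-is. `b'u'` stands for a `\u00XX` escape.
/// (Hypothetical helper, for illustration only.)
const fn escape_kind(b: u8) -> u8 {
    match b {
        b'"' => b'"',
        b'\\' => b'\\',
        0x08 => b'b',
        0x0C => b'f',
        b'\n' => b'n',
        b'\r' => b'r',
        b'\t' => b't',
        0x00..=0x1F => b'u',
        _ => 0,
    }
}

/// 256-entry lookup table built at compile time, so the loop body
/// needs only one table load per input byte.
const ESCAPE: [u8; 256] = {
    let mut table = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = escape_kind(i as u8);
        i += 1;
    }
    table
};

/// Appends the JSON-escaped contents of `s` (without surrounding quotes) to `out`.
pub fn escape_into(out: &mut Vec<u8>, s: &str) {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let bytes = s.as_bytes();
    // Start of the pending run of bytes that need no escaping.
    let mut start = 0;

    for (i, &byte) in bytes.iter().enumerate() {
        let escape = ESCAPE[byte as usize];
        if escape == 0 {
            // Common case: nothing to escape, keep scanning.
            continue;
        }
        // Flush the unescaped run in one bulk copy, then write the escape.
        out.extend_from_slice(&bytes[start..i]);
        if escape == b'u' {
            let hi = HEX[(byte >> 4) as usize];
            let lo = HEX[(byte & 0xF) as usize];
            out.extend_from_slice(&[b'\\', b'u', b'0', b'0', hi, lo]);
        } else {
            out.extend_from_slice(&[b'\\', escape]);
        }
        start = i + 1;
    }
    // Flush whatever is left after the last escape.
    out.extend_from_slice(&bytes[start..]);
}
```

Most bytes in a typical document take the `continue` branch, which is why small changes to that branch can show up in string-heavy benchmarks like the one above.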

Member

@dtolnay dtolnay left a comment

Thanks!

Would there be any benefit to coordinating the discriminants of `CharEscape` with the values that are being converted to/from it in this code, even if `as` cannot be used?

#[repr(u8)]
pub enum CharEscape {
    Quote = b'"',
    ReverseSolidus = b'\\',
    Solidus = b'/',
    Backspace = b'b',
    FormFeed = b'f',
    LineFeed = b'n',
    CarriageReturn = b'r',
    Tab = b't',
    AsciiControl(u8) = b'u',
}

@conradludgate
Contributor Author

> Thanks!
>
> Would there be any benefit to coordinating the discriminants of `CharEscape` with the values that are being converted to/from it in this code, even if `as` cannot be used?
>
> #[repr(u8)]
> pub enum CharEscape {
>     Quote = b'"',
>     ReverseSolidus = b'\\',
>     Solidus = b'/',
>     Backspace = b'b',
>     FormFeed = b'f',
>     LineFeed = b'n',
>     CarriageReturn = b'r',
>     Tab = b't',
>     AsciiControl(u8) = b'u',
> }

I had that in a draft of this PR, but unfortunately I couldn't get any measurable benefit from it. Maybe it still makes sense from a semantic point of view.
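
For anyone following along, this is roughly what using the coordinated discriminants could look like. Since `CharEscape` has a variant with a field, an `as` cast is rejected by the compiler, but the layout rules for `#[repr(u8)]` enums allow reading the discriminant through a pointer cast. This is only a sketch: the method name `escape_byte` is made up for illustration, it is not code from this PR, and it says nothing about performance.

```rust
#[allow(dead_code)]
#[repr(u8)]
pub enum CharEscape {
    Quote = b'"',
    ReverseSolidus = b'\\',
    Solidus = b'/',
    Backspace = b'b',
    FormFeed = b'f',
    LineFeed = b'n',
    CarriageReturn = b'r',
    Tab = b't',
    AsciiControl(u8) = b'u',
}

impl CharEscape {
    /// Returns the byte written after the backslash for this escape.
    /// (Hypothetical helper name, for illustration only.)
    pub fn escape_byte(&self) -> u8 {
        // SAFETY: with `#[repr(u8)]`, the enum is laid out so that every
        // variant starts with its `u8` discriminant, so reading the first
        // byte through a pointer cast yields the discriminant.
        unsafe { *(self as *const Self).cast::<u8>() }
    }
}
```

With the discriminants chosen as above, `CharEscape::LineFeed.escape_byte()` is `b'n'` and any `AsciiControl(_)` value gives `b'u'`, so the match that maps variants to output bytes could in principle become a single read.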

@dtolnay dtolnay merged commit 623d9b4 into serde-rs:master Jul 18, 2025
16 checks passed
takumi-earth pushed a commit to earthlings-dev/json that referenced this pull request Jan 27, 2026
…scaping

string serialization escaping optimisations