Commit 2f4bb6e
Rollup merge of rust-lang#153936 - danielzgtg:perf/immediateAbortAvoidPthreadGetattrNp, r=Mark-Simulacrum
Skip stack_start_aligned for immediate-abort

This improves startup performance by 16%, shown by an optimized hello-world program. glibc's `pthread_getattr_np` performs expensive syscalls when reading `/proc/self/maps`. That is all wasted with `panic = immediate-abort` active because `init()` immediately discards the return value from `install_main_guard()`. A similar improvement can be seen in environments that don't have `/proc`.

This change is safe because the immediately succeeding comment says that we rely on Linux's "own stack-guard mechanism".

Tracking issue: rust-lang#147286

# Benchmark

Set it up with `cargo new hello-world2`, and replace these files:

```toml
# Cargo.toml
cargo-features = ["panic-immediate-abort"]

[package]
name = "hello-world"
version = "0.1.0"
edition = "2024"

[profile.release]
lto = true
panic = "immediate-abort"
codegen-units = 1
opt-level = "z"
strip = true

# .cargo/config.toml
[unstable]
build-std = ["std"]
```

## Before

```console
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     524.8 µs ±  65.1 µs    [User: 276.1 µs, System: 187.0 µs]
  Range (min … max):   446.4 µs … 975.5 µs    3996 runs

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     519.4 µs ±  65.8 µs    [User: 282.1 µs, System: 177.7 µs]
  Range (min … max):   443.2 µs … 830.5 µs    3612 runs

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     520.0 µs ±  64.3 µs    [User: 277.1 µs, System: 182.1 µs]
  Range (min … max):   447.1 µs … 1001.3 µs    3804 runs
```

For a visualization of the problem, run `cargo +stage1 build --release && perf record --call-graph dwarf -F max ./target/release/hello-world2 && perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg`:

<img width="3832" height="1216" alt="flamegraph with 17.41% __pthread_getattr_np" src="https://github.com/user-attachments/assets/acc2286e-1582-4772-9e3b-68b5c35e3e70" />

## After

```console
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     444.7 µs ±  57.3 µs    [User: 257.4 µs, System: 130.2 µs]
  Range (min … max):   379.4 µs … 1289.3 µs    3893 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     452.3 µs ±  60.7 µs    [User: 261.5 µs, System: 133.5 µs]
  Range (min … max):   374.9 µs … 1512.4 µs    4177 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     441.2 µs ±  56.1 µs    [User: 256.2 µs, System: 128.8 µs]
  Range (min … max):   375.0 µs … 760.4 µs    4032 runs
```
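As a rough cross-check of the 16% figure, the mean times reported by the three "Before" and three "After" runs can be averaged and compared. The commit message does not say how the percentage was derived, so the sketch below is only an assumption about the arithmetic; the helper names (`mean`, `time_reduction_pct`, `speedup_pct`) are hypothetical and not part of the change.

```rust
/// Mean wall-clock times in microseconds, copied from the hyperfine runs above.
pub const BEFORE_US: [f64; 3] = [524.8, 519.4, 520.0];
pub const AFTER_US: [f64; 3] = [444.7, 452.3, 441.2];

/// Arithmetic mean of a non-empty slice.
pub fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Percentage of startup time removed: (1 - after / before) * 100.
pub fn time_reduction_pct(before: &[f64], after: &[f64]) -> f64 {
    (1.0 - mean(after) / mean(before)) * 100.0
}

/// Percentage speedup expressed as a ratio: (before / after - 1) * 100.
pub fn speedup_pct(before: &[f64], after: &[f64]) -> f64 {
    (mean(before) / mean(after) - 1.0) * 100.0
}
```

With these inputs the averaged means are about 521.4 µs before and 446.1 µs after. That works out to roughly 14.4% less time per launch, or a speedup of roughly 16.9% when expressed as a ratio, so the stated 16% falls between the two ways of counting.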
2 parents 27bcf51 + 577dba9 commit 2f4bb6e

1 file changed

Lines changed: 13 additions & 0 deletions

library/std/src/sys/pal/unix/stack_overflow.rs

@@ -429,6 +429,11 @@ mod imp {
 
     #[forbid(unsafe_op_in_unsafe_fn)]
     unsafe fn install_main_guard_linux(page_size: usize) -> Option<Range<usize>> {
+        // See the corresponding conditional in init().
+        // Avoid stack_start_aligned, which makes slow syscalls to read /proc/self/maps
+        if cfg!(panic = "immediate-abort") {
+            return None;
+        }
         // Linux doesn't allocate the whole stack right away, and
         // the kernel has its own stack-guard mechanism to fault
         // when growing too close to an existing mapping. If we map
@@ -456,6 +461,10 @@ mod imp {
     #[forbid(unsafe_op_in_unsafe_fn)]
     #[cfg(target_os = "freebsd")]
     unsafe fn install_main_guard_freebsd(page_size: usize) -> Option<Range<usize>> {
+        // See the corresponding conditional in install_main_guard_linux().
+        if cfg!(panic = "immediate-abort") {
+            return None;
+        }
         // FreeBSD's stack autogrows, and optionally includes a guard page
         // at the bottom. If we try to remap the bottom of the stack
         // ourselves, FreeBSD's guard page moves upwards. So we'll just use
@@ -489,6 +498,10 @@ mod imp {
 
     #[forbid(unsafe_op_in_unsafe_fn)]
     unsafe fn install_main_guard_bsds(page_size: usize) -> Option<Range<usize>> {
+        // See the corresponding conditional in install_main_guard_linux().
+        if cfg!(panic = "immediate-abort") {
+            return None;
+        }
         // OpenBSD stack already includes a guard page, and stack is
         // immutable.
         // NetBSD stack includes the guard page.
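
The three hunks share one pattern: return `None` before any expensive work when the result would be thrown away anyway. The following is a minimal standalone sketch of that pattern, not the standard library's code. `expensive_stack_lookup`, `install_guard_sketch`, `install_guard`, and the `LOOKUPS` counter are hypothetical names, the returned address is fake, and `cfg!(panic = "abort")` stands in for the nightly-only `panic = "immediate-abort"` value so the sketch builds on a stable toolchain.

```rust
use std::ops::Range;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how often the (hypothetical) expensive lookup actually ran.
pub static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for `stack_start_aligned`: records the call and
/// returns a fixed, page-aligned fake address instead of querying the OS.
pub fn expensive_stack_lookup(page_size: usize) -> usize {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    16 * page_size
}

/// Returns the guard-page range, or `None` without doing any lookup
/// when `skip` is set.
pub fn install_guard_sketch(page_size: usize, skip: bool) -> Option<Range<usize>> {
    if skip {
        // Early return: the caller would discard the range anyway.
        return None;
    }
    let start = expensive_stack_lookup(page_size);
    Some(start - page_size..start)
}

/// Compile-time selection of the skip flag. Assumption: "abort" is used
/// here only as a stable stand-in for "immediate-abort".
pub fn install_guard(page_size: usize) -> Option<Range<usize>> {
    install_guard_sketch(page_size, cfg!(panic = "abort"))
}
```

Because `cfg!` expands to a boolean literal at compile time, the optimizer can drop the lookup entirely in builds where the condition holds, which is presumably why the change uses `cfg!` inside the function body rather than a runtime flag.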
