Use mmap for faster warn_missing_py_init by orlp · Pull Request #2950 · PyO3/maturin

orlp · 2026-01-27T14:53:47Z

One of my colleagues noticed that warn_missing_py_init is quite slow, reading the full file into memory (for context: the Polars debug .so is ~4GB).

I think it's quite likely that goblin only needs to parse a portion of the file, so I think using mmap here makes sense.

konstin · 2026-01-27T15:04:15Z

src/compile.rs

-    let mut buffer = Vec::new();
-    fd.read_to_end(&mut buffer)?;
+    let fd = File::open(artifact)?;
+    let mmap = unsafe { memmap2::Mmap::map(&fd).context("mmap failed")? };


Is that sound, does it guarantee non of the rust invariants are being violated? If so, can you write a safety comment on why it it safe to do?

@konstin Please see https://docs.rs/memmap2/0.9.9/memmap2/struct.Mmap.html#safety.

I'm not exactly sure what kind of safety comment you'd be looking for. It is technically inherently 'unsound' by the Rust abstract model if someone edits the file in the file system at the same time. But mmap is very widely used regardless, including in the Rust project itself, e.g. https://github.com/rust-lang/rust/blob/94a0cd15f5976fa35e5e6784e621c04e9f958e57/src/tools/rust-analyzer/crates/proc-macro-srv/src/dylib.rs#L110.

I'm afraid I'm not comfortable with using mmap like that in maturin. For context, the only production usage of unsafe we have in maturin currently is env::set_var and that also only because it used to be safe in a previous edition.

Applications must consider the risk and take appropriate precautions when using file-backed maps. Solutions such as file permissions, locks or process-private (e.g. unlinked) files exist but are platform specific and limited.

As long as we can't guarantee that the artifact is not actually modified by some means, we risk UB, and that's a risk that's not worth it for this performance win.

@konstin Which risks are you worried about? 'UB' is not a risk in of itself, it's a binary property of a program that states the abstract model of Rust was violated, a risk needs an actual attacker model, consequence and probability of exploitation.

Mmap is already widely used in the Rust compiler: https://github.com/search?q=repo%3Arust-lang%2Frust%20mmap&type=code. I'd really like to know which risks in particular you're worried about that are unacceptable for maturin but are apparently acceptable for the Rust compiler.

I don't know which measure or soundness considerations the rust-lang/rust repo has around mmap, but the documentation you linked at https://docs.rs/memmap2/0.9.9/memmap2/struct.Mmap.html#safety is very clear that unless we guarantee that the underlying file doesn't change, this is unsound. I assume those mmap calls ensure this, but here I don't see that we've ensured that (yet).

With rust being safe by default, using unsafe inverts the language contract: We need to proof and document at the site where we use the unsafe that we're keeping up the contract, and should also document why the unsafe is necessary in the first place, usually with a // SAFETY: comment.

@konstin "Measure of soundness" is the wrong measure. The API is fundamentally unsound. You can not prevent someone from mutating the file. There also is no alternative without re-writing the entire parser to be Read based rather than &[u8] based.

I'm perfectly aware of how Rust works with unsafe. You are however holding me to an impossible standard. There is no safety comment I could possibly write containing the contract we uphold, because we can not possibly guarantee that the contract is upheld. This is no different in the Rust compiler.

However, would you be satisfied with the following logic?

// SAFETY: technically this is unsound if the file gets modified while we hold the // immutable &[u8] slice. However, considering we're reading a dynamic library // with executable code intended to be run, should a malicious agent have write // access to it, arbitrary code execution is already trivially achieved. So there // is no reasonable attacker model where this would lead to exploits that are // otherwise impossible. let mmap = unsafe { memmap2::Mmap::map(&fd).context("mmap failed")? };

I meant more in the sense of, can we ensure that this is not a file that changes underneath the mmap if you do another cargo build while maturin is running? As I understand it, there's a lot of ecosystem usage, where generally those can either assume that the file doesn't change underneath them (they hold a lock, it's a temp dir, they are the only process in that docker container, etc.) or the file changing isn't catastrophic. I know this is unlikely with cargo being slow and how this check is timed, but the uv issue tracker has scared me that every timing and non-determinism bug that can happen, will happen.

can we ensure that this is not a file that changes underneath the mmap if you do another cargo build while maturin is running?

No, you can not in general. And should this be a problem (it's not, really), you're already in the woods because you call

maturin/src/auditwheel/repair.rs

Line 15 in e7922ab

let dep_analyzer = DependencyAnalyzer::new(sysroot).library_paths(ld_paths);

which does... you guessed it, mmap:

https://github.com/messense/lddtree-rs/blob/e3ab19edf39b4d2d9cc90058e53a7ec966c271ac/src/lib.rs#L133

I must re-iterate that I feel you're holding me to an impossibly high standard for this PR, considering your tool already does the exact same strategy I proposed elsewhere (as well as your dependencies in the Rust compiler, probably others).

messense

LGTM, thanks

konstin · 2026-01-27T16:33:18Z

for context: the Polars debug .so is ~4GB

While I agree that we should speed up maturin for large binaries, have you considered reducing the debug info, such as https://doc.rust-lang.org/cargo/guide/build-performance.html#reduce-amount-of-generated-debug-information? I found those changes to make building a lot faster, producing a 4GB file is a large task for rustc and the linker too.

orlp · 2026-01-27T18:05:00Z

The result would still be an 1.7 GB binary, so still significant overhead to fully read in warn_missing_py_init. Plus having to wait several minutes for a complete recompile with debug symbols should a debugger be needed is rather undesirable.

orlp · 2026-02-24T13:50:39Z

What's the status on this?

messense · 2026-02-24T14:00:53Z

I'm generally in favor of use mmap here, maybe to be safer we can lock the target/$profile/.cargo-lock file lock so no other cargo build can proceed at the same time?

Copilot

Pull request overview

This PR optimizes the warn_missing_py_init function by using memory-mapped I/O instead of reading entire files into memory, which is particularly beneficial for large artifacts like the 4GB Polars debug .so file.

Changes:

Replaced file reading with mmap for parsing binary artifacts
Added cargo lock file synchronization to prevent concurrent modifications during mmap operations
Added memmap2 dependency to support memory-mapped file operations

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

File	Description
src/compile.rs	Refactored `warn_missing_py_init` to use mmap instead of reading full file into buffer, added cargo lock synchronization
Cargo.toml	Added memmap2 0.9.9 dependency

src/compile.rs

konstin · 2026-02-24T16:02:56Z

We should move the artifacts to a temp file before we process them, but if we're already using mmap we have that risk already now.

messense · 2026-02-25T11:26:08Z

The artifact is a cdylib that can be multi-GB in debug builds, a 4GB ~~copy~~move would be just as slow as the original read_to_end if target path is not on the same disk. The cargo lock coordination we have now is both faster (zero-copy) and provides stronger guarantees than a temp file copy, which itself would be subject to the same race (the source could change mid-copy, producing a corrupt temp file).

messense · 2026-02-25T11:34:03Z

About the lddtree issue, its analysis does not need the same cargo lock protection, because:

lddtree runs on a staged copy — copy_artifact_for_repair() copies the artifact to a maturin-owned staging directory before get_policy_and_libs() / find_external_libs() runs. lddtree mmaps this copy, not the original cargo output.
lddtree analyzes system libraries — Beyond the main artifact, lddtree resolves and reads shared libraries from the system sysroot (e.g., /lib/, /usr/lib/). These are not modified by cargo builds.

And we don't always copy the artifact, only when doing auditwheel check/repair.

messense · 2026-02-25T11:42:27Z

so moving the artifact is a viable option if we keep it in target directory which should usually be the same disk so we get atomic mv, let me also try it.

messense · 2026-02-25T12:05:38Z

Implemented move in b607b34, but I think it will break any project that relies on the artifact in the standard cargo target location like in maturin release pipeline when packaging binaries for github release which is unfortunate.

Acquire a shared lock on cargo's .cargo-lock before mmapping the artifact to prevent concurrent cargo builds (e.g. from rust-analyzer in VS Code) from modifying the file while it's mapped. This makes the mmap sound while avoiding reading multi-GB debug artifacts fully into memory.

…cation - Rename copy_artifact_for_repair to stage_artifact and call it unconditionally before warn_missing_py_init, so the mmap is safe without needing .cargo-lock file locking. - Use fs::rename with fs::copy fallback instead of always copying. - Remove the fs4-based .cargo-lock shared locking from warn_missing_py_init since the artifact is now in a private directory. - Fix target_has_profile test helper to check deps/ subdirectory.

Update archive steps to use target/maturin/ where stage_artifact moves the binary. macOS x86_64 and aarch64 archives keep their original paths since the universal2 build rebuilds individual binaries in the cargo output directories.

Copilot

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.

src/build_context.rs

tests/common/pep517.rs

.github/workflows/release.yml

src/build_context.rs

- release.yml: Remove cd/cd - in Windows archive step (cd - is a bash-ism that fails in PowerShell) - build_context.rs: Pre-remove stale destination before fs::rename to avoid fallback to copy on Windows - pep517.rs: Check cargo's .fingerprint directory instead of the .dylib file which gets moved by stage_artifact

konstin reviewed Jan 27, 2026

View reviewed changes

messense approved these changes Jan 27, 2026

View reviewed changes

orlp had a problem deploying to PyPI January 27, 2026 15:09 — with GitHub Actions Failure

orlp temporarily deployed to crates.io January 27, 2026 15:09 — with GitHub Actions Inactive

messense force-pushed the warn-missing-mmap branch from 4baac4b to 65b0b10 Compare February 24, 2026 15:06

messense requested review from Copilot and konstin February 24, 2026 15:06

Copilot AI reviewed Feb 24, 2026

View reviewed changes

src/compile.rs Outdated Show resolved Hide resolved

src/compile.rs Outdated Show resolved Hide resolved

Copilot started reviewing on behalf of messense February 24, 2026 15:15 View session

messense force-pushed the warn-missing-mmap branch from 65b0b10 to 3816920 Compare February 24, 2026 15:19

messense temporarily deployed to PyPI February 24, 2026 15:49 — with GitHub Actions Inactive

messense temporarily deployed to crates.io February 24, 2026 15:49 — with GitHub Actions Inactive

messense force-pushed the warn-missing-mmap branch from 3816920 to c9c67ae Compare February 25, 2026 11:24

messense temporarily deployed to crates.io February 25, 2026 11:54 — with GitHub Actions Inactive

messense temporarily deployed to PyPI February 25, 2026 11:54 — with GitHub Actions Inactive

messense force-pushed the warn-missing-mmap branch from c9c67ae to b607b34 Compare February 25, 2026 12:04

orlp and others added 4 commits February 26, 2026 22:07

Use mmap for faster warn_missing_py_init

b425f56

messense force-pushed the warn-missing-mmap branch from b607b34 to 038aaad Compare February 26, 2026 14:15

messense temporarily deployed to PyPI February 28, 2026 02:32 — with GitHub Actions Inactive

messense temporarily deployed to crates.io February 28, 2026 02:32 — with GitHub Actions Inactive

messense requested a review from Copilot February 28, 2026 05:03

Copilot started reviewing on behalf of messense February 28, 2026 05:03 View session

Copilot AI reviewed Feb 28, 2026

View reviewed changes

src/build_context.rs Show resolved Hide resolved

tests/common/pep517.rs Outdated Show resolved Hide resolved

.github/workflows/release.yml Outdated Show resolved Hide resolved

src/build_context.rs Show resolved Hide resolved

src/build_context.rs Show resolved Hide resolved

messense deployed to crates.io February 28, 2026 05:47 — with GitHub Actions View deployment

messense deployed to PyPI February 28, 2026 05:47 — with GitHub Actions View deployment

messense merged commit 1bfaa8c into PyO3:main Feb 28, 2026
98 of 101 checks passed

Uh oh!

Conversation

orlp commented Jan 27, 2026

Uh oh!

konstin Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

orlp Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

konstin Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

orlp Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

konstin Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

orlp Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

konstin Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

orlp Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

messense left a comment

Choose a reason for hiding this comment

Uh oh!

konstin commented Jan 27, 2026

Uh oh!

orlp commented Jan 27, 2026

Uh oh!

orlp commented Feb 24, 2026

Uh oh!

messense commented Feb 24, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

konstin commented Feb 24, 2026

Uh oh!

messense commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

messense commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

messense commented Feb 25, 2026

Uh oh!

messense commented Feb 25, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

orlp Jan 27, 2026 •

edited

Loading

orlp Jan 27, 2026 •

edited

Loading

orlp Jan 27, 2026 •

edited

Loading

orlp Jan 27, 2026 •

edited

Loading

messense commented Feb 25, 2026 •

edited

Loading

messense commented Feb 25, 2026 •

edited

Loading