A way to verify the integrity of caches #16850

@Supermagnum

Description

I propose adding a cargo verify-sources command that verifies the integrity of locally cached crate sources against the hashes recorded in Cargo.lock, independently of the download-time checksum check that Cargo already performs.

Current Behaviour

Cargo verifies crate checksums at download time against what crates.io has recorded. This is a necessary and valuable check. However, once a crate is downloaded and cached locally, there is currently no built-in mechanism to verify that the source on disk still matches what Cargo.lock says it should be.

Cargo.lock records the authoritative hash of every dependency in the tree. The source is on disk. Comparing them is a natural and obvious integrity check that currently requires manual tooling to perform.

The Gap This Fills

The download-time check answers: does this crate match what crates.io has recorded?

It does not answer: does the code I am building right now still match what was originally downloaded? Has it been tampered with?

A sufficiently motivated attacker with access to a build environment (a compromised CI system, a developer machine, or a shared build cache) could modify crate source files on disk without touching Cargo.lock. The current tooling would not detect this: the build would proceed, no checksum comparison would fail, and the compromised code would ship.

This is a documented class of real supply chain attacks. It is not theoretical.

Proposed Solution

cargo verify-sources could:

  1. Read every crate entry in Cargo.lock including its recorded checksum.
  2. Locate the corresponding source in the local cache.
  3. Compute the hash of the local source independently.
  4. Compare against the Cargo.lock recorded value.
  5. Report any divergence with the crate name, version, expected hash, and actual hash.
  6. Exit with a nonzero code if any divergence is found.

This is a read-only operation. It modifies nothing. It is safe to run at any point in a build pipeline.

Projects implementing or depending on cryptographic primitives have a particularly strong need for this guarantee. A modified crypto crate that passes all unit tests but behaves differently under specific conditions (for example, weakened key generation or subtly incorrect authentication tag verification) is a realistic and serious attack. These projects need a simple, auditable way to answer the question: is the cryptographic code I am running actually what it claims to be?

The hashing logic already exists within Cargo for download-time verification. The proposed command could reuse that logic against the local cache rather than against a freshly downloaded archive. The surface area of the change is small.

A --locked flag could optionally refuse to run if Cargo.lock is not committed, ensuring the check is meaningful.
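One way the "is Cargo.lock committed" precondition might be checked is by asking git directly. The sketch below is an assumption about how such a guard could look, not existing Cargo behaviour; the helper names run_git and lockfile_is_committed are hypothetical. It relies on two real git commands: ls-files --error-unmatch (fails if the file is untracked) and diff --quiet HEAD (fails if the file differs from the last commit). Cargo's existing --locked flag has a different meaning (it requires the lockfile to be present and up to date), so a new flag name may be less confusing.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# A runner takes git arguments and a working directory, and returns an exit code.
Runner = Callable[[Sequence[str], Path], int]


def run_git(args: Sequence[str], cwd: Path) -> int:
    """Run git with the given arguments and return its exit code.

    Raises FileNotFoundError if git is not installed.
    """
    return subprocess.run(
        ["git", *args],
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    ).returncode


def lockfile_is_committed(repo_dir: Path, run: Runner = run_git) -> bool:
    """True if Cargo.lock is tracked by git and unchanged relative to HEAD."""
    if run(["ls-files", "--error-unmatch", "Cargo.lock"], repo_dir) != 0:
        return False  # untracked, missing, or not inside a git repository
    # Exit code 0 means no difference between the working tree and HEAD.
    return run(["diff", "--quiet", "HEAD", "--", "Cargo.lock"], repo_dir) == 0
```

The runner is injectable so the logic can be exercised without a real repository; a verification command would call lockfile_is_committed(Path(".")) and refuse to continue when it returns False.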

Some thoughts:

  • Could this be a subcommand or a flag on an existing command such as cargo check?
  • Could it verify against crates.io independently in addition to Cargo.lock, providing a second independent source of truth?
  • Could it support verifying crate release tag signatures where crate authors have signed their git tags, providing a chain of trust independent of crates.io entirely?

Notes

This could also help detect cases where an AI tool has modified cryptographic curve parameters or crate sources on disk.

Metadata

Labels

  • A-caching (Area: caching of dependencies, repositories, and build artifacts)
  • C-feature-request (Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted`)
  • S-triage (Status: This issue is waiting on initial triage.)