GC race: concurrent addTempRoot ignored during deletion (port NixOS/nix#15469) #395

@schickling-assistant

Summary

Determinate Nix contains a GC race condition in which deleteReferrersClosure can delete store paths that a concurrent evaluator is actively using and has protected via addTempRoot. The result is intermittent "error: path '/nix/store/...' is not valid" failures during Nix evaluation, particularly in CI environments with persistent Nix stores.

The fix is available upstream as NixOS/nix#15469 (by @domenkozar) but has not been ported to Determinate Nix yet.

Affected Code

Three locations need the fix (all present in nix-src as of current main):

1. src/libstore/gc.cc — deletion loop missing tempRoots re-check

The deleteReferrersClosure BFS phase checks tempRoots when visiting paths, but the deletion loop (for (auto & path : topoSortPaths(visited))) does not re-check before deleting. A concurrent addTempRoot that arrives after the BFS but before deletion is silently ignored:

// Current code (vulnerable) — around line 795:
for (auto & path : topoSortPaths(visited)) {
    if (!dead.insert(path).second)
        continue;
    if (shouldDelete) {
        try {
            invalidatePathChecked(path);     // ← no tempRoots re-check!
            deleteFromStore(path.to_string());
            // ... (excerpt ends here; the rest of the try block is omitted)

The fix adds a tempRoots re-check and pending synchronization before each deletion, so a root registered after the BFS is honoured instead of ignored.
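The re-check idea can be illustrated with a minimal, self-contained sketch. This is not the upstream patch: GcState, deleteCandidates and the string-based paths are hypothetical stand-ins, and the sketch simply holds one mutex across the check and the deletion, whereas the real fix is described above as using pending synchronization, the details of which are not shown here.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the collector's shared state (not the real Nix type).
struct GcState {
    std::mutex mutex;
    std::set<std::string> tempRoots;   // roots registered by running clients
    std::set<std::string> validPaths;  // toy model of the store database
};

// Called by a client (for example an evaluator) at any time,
// including while a collection is in progress.
void addTempRoot(GcState & gc, const std::string & path)
{
    assert(!path.empty());
    std::lock_guard<std::mutex> lock(gc.mutex);
    gc.tempRoots.insert(path);
}

// Deletes each candidate unless it has become a temp root since the
// traversal that produced `candidates`. Returns the paths actually deleted.
std::vector<std::string> deleteCandidates(
    GcState & gc, const std::vector<std::string> & candidates)
{
    std::vector<std::string> deleted;
    for (const auto & path : candidates) {
        // Check and delete under the same lock so that a root cannot be
        // registered between the check and the deletion.
        std::lock_guard<std::mutex> lock(gc.mutex);
        if (gc.tempRoots.count(path))
            continue;  // rooted after the traversal: keep it
        if (gc.validPaths.erase(path))
            deleted.push_back(path);
    }
    return deleted;
}
```

In this model, a path that was selected for deletion but rooted before its turn in the loop survives the collection, which is the behaviour the vulnerable loop lacks.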

2. src/libexpr/eval-cache.cc — missing addTempRoot before isValidPath

// Line ~577: no addTempRoot before validity check
if (!path || !root->state.store->isValidPath(*path)) {

3. src/libfetchers/fetchers.cc — missing addTempRoot before ensurePath

// Line ~380: no addTempRoot before ensurePath
store.ensurePath(*storePath);
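Locations 2 and 3 share one client-side pattern: register the temp root first, then check or ensure validity. The sketch below shows that ordering against a toy store. ToyStore, collectGarbage and rootThenCheck are hypothetical names invented for illustration; the real nix::Store interface and the upstream patch differ.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical toy store, used only to show the ordering.
struct ToyStore {
    std::set<std::string> validPaths;
    std::set<std::string> tempRoots;

    void addTempRoot(const std::string & path) { tempRoots.insert(path); }

    bool isValidPath(const std::string & path) const
    {
        return validPaths.count(path) > 0;
    }

    // Toy collector: removes every valid path that is not a temp root.
    void collectGarbage()
    {
        for (auto it = validPaths.begin(); it != validPaths.end();) {
            if (tempRoots.count(*it))
                ++it;
            else
                it = validPaths.erase(it);
        }
    }
};

// Root first, then check. If the path is valid after it has been rooted,
// a collector that honours temp roots will not remove it afterwards.
bool rootThenCheck(ToyStore & store, const std::string & path)
{
    assert(!path.empty());
    store.addTempRoot(path);
    return store.isValidPath(path);
}
```

Checking validity before rooting leaves a window in which the path can be collected between the check and its first use; rooting first closes that window, provided the collector re-checks temp roots before deleting, as described for location 1.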

Symptoms

  • error: path '/nix/store/h9lc1dpi14z7is86ffhl3ld569138595-audit-tmpdir.sh' is not valid during evaluation
  • Affects stdenv setup hooks, nixpkgs patches, and other derivation inputs
  • Flaky: ~5-15% CI failure rate on runners with persistent Nix stores
  • Path IS available on cache.nixos.org and CAN be fetched with nix-store --realise
  • Cannot be reproduced by simulating static store corruption — it's a timing-dependent race

Environment

  • Determinate Nix 3.17.1 (Nix 2.33.3)
  • Namespace.so Linux runners, GitHub-hosted Ubuntu runners, Namespace macOS runners
  • Triggered during devenv shell evaluation (derivationStrict)

Filed by an AI assistant on behalf of @schickling
