Skip to content

fix false positive cache staleness casuing package recompiling#61206

Closed
adienes wants to merge 5 commits intoJuliaLang:masterfrom
adienes:fix/precompile-cachepath-cache
Closed

fix false positive cache staleness casuing package recompiling#61206
adienes wants to merge 5 commits intoJuliaLang:masterfrom
adienes:fix/precompile-cachepath-cache

Conversation

@adienes
Copy link
Copy Markdown
Member

@adienes adienes commented Mar 1, 2026

fixes #61198

this line

cachepath_cache[pkg] = freshpaths

was populating cachepath_cache with String[] for every key. when compilecache_freshest_path accesses this cache via modpaths = get(() -> find_all_in_cache_path(modkey), cachepath_cache, modkey) it thus gets modpaths = String[] rather than the fallback find_all_in_cache_path.

in the ordinary case this isn't a problem since we push! to freshpaths inside the task for each package, and the wait here ensures packages are compiled in dependency order

wait(was_processed[(dep,config)])

however in the case of #61198, there are package extensions loaded dynamically that will not be waited upon as above since they are not declared as a dependency in the manifest, but do end up appearing in the required_modules list.

the proposal here is to not set cachepath_cache[pkg] = freshpaths until freshpaths is populated. note that this still leaves the race condition intact, kinda, but the consequence changes from "recompilation of the package" to "has to find the cachefiles on disk rather than pull from cachepath_cache" which is a much more minor penalty.

this specific issue appears to have been surfaced in #59670, but I don't know that it was correct before either

claude assisted the original analysis and wrote the test case

@adienes adienes added the compiler:precompilation Precompilation of modules label Mar 1, 2026
@giordano giordano added the backport 1.13 Change should be backported to release-1.13 label Mar 2, 2026
adienes and others added 2 commits March 2, 2026 10:05
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@KristofferC KristofferC mentioned this pull request Mar 3, 2026
56 tasks
@KristofferC
Copy link
Copy Markdown
Member

this picked up a merge conflict

@topolarity
Copy link
Copy Markdown
Member

however in the case of #61198, there are package extensions loaded dynamically that will not be waited upon as above since they are not declared as a dependency in the manifest, but do end up appearing in the required_modules list.

Can you explain a bit more the situation where this happens? Why does the Manifest end up missing this extension?

IanButterworth added a commit to IanButterworth/julia that referenced this pull request Mar 9, 2026
The precompilation dependency graph resolution did not account for
transitively-triggered extensions. When ExtAB depends on ExtA (because
ExtAB's triggers are a strict superset of ExtA's), and a package TopPkg
depends on all of ExtAB's triggers, the single-pass graph resolution
would add ExtA to TopPkg's deps but not ExtAB -- because indirect_deps
was computed before ExtA was added, so the issubset check for ExtAB's
triggers against TopPkg's indirect deps would fail.

This caused a race condition where TopPkg's precompilation task would
not wait on ExtAB, leading to false cache staleness when ExtAB appeared
in TopPkg's required_modules but its cache path was not yet populated.

The fix iterates the extension edge resolution to a fixed point, so
that adding ExtA as a dep of TopPkg causes ExtAB to be recognized as
loadable in TopPkg on the next iteration.

This is an alternative root-cause fix for the issue addressed by
JuliaLang#61206, which worked around the problem by deferring
cache path registration. This fix instead corrects the dependency
graph so that task ordering is correct.

Fixes JuliaLang#61198

Co-authored-by: Claude <noreply@anthropic.com>
@IanButterworth
Copy link
Copy Markdown
Member

My claude (opus 4.6) thinks that we can fix the core issue/race instead IanButterworth@6869086

If that looks reasonable I can open a PR

@topolarity
Copy link
Copy Markdown
Member

My claude (opus 4.6) thinks that we can fix the core issue/race instead IanButterworth@6869086

If that looks reasonable I can open a PR

Aha! Definitely better to fix our dependency modeling here.

Go ahead and open the PR, and we can collaborate on the proper fix.

@vtjnash
Copy link
Copy Markdown
Member

vtjnash commented Mar 9, 2026

That makes sense. Someday we should investigate switching this to a proper Tarjan's SCC algorithm, instead of these random extra corrections loops.

@adienes
Copy link
Copy Markdown
Member Author

adienes commented Mar 9, 2026

sg, replaced by #61272

@adienes adienes closed this Mar 9, 2026
@adienes adienes deleted the fix/precompile-cachepath-cache branch March 9, 2026 17:33
@adienes adienes removed the backport 1.13 Change should be backported to release-1.13 label Mar 12, 2026
topolarity pushed a commit to IanButterworth/julia that referenced this pull request Mar 16, 2026
The precompilation dependency graph resolution did not account for
transitively-triggered extensions. When ExtAB depends on ExtA (because
ExtAB's triggers are a strict superset of ExtA's), and a package TopPkg
depends on all of ExtAB's triggers, the single-pass graph resolution
would add ExtA to TopPkg's deps but not ExtAB -- because indirect_deps
was computed before ExtA was added, so the issubset check for ExtAB's
triggers against TopPkg's indirect deps would fail.

This caused a race condition where TopPkg's precompilation task would
not wait on ExtAB, leading to false cache staleness when ExtAB appeared
in TopPkg's required_modules but its cache path was not yet populated.

The fix iterates the extension edge resolution to a fixed point, so
that adding ExtA as a dep of TopPkg causes ExtAB to be recognized as
loadable in TopPkg on the next iteration.

This is an alternative root-cause fix for the issue addressed by
JuliaLang#61206, which worked around the problem by deferring
cache path registration. This fix instead corrects the dependency
graph so that task ordering is correct.

Fixes JuliaLang#61198

Co-authored-by: Claude <noreply@anthropic.com>
KristofferC pushed a commit that referenced this pull request Apr 14, 2026
The precompilation dependency graph resolution did not account for
transitively-triggered extensions. When ExtAB depends on ExtA (because
ExtAB's triggers are a strict superset of ExtA's), and a package TopPkg
depends on all of ExtAB's triggers, the single-pass graph resolution
would add ExtA to TopPkg's deps but not ExtAB -- because indirect_deps
was computed before ExtA was added, so the issubset check for ExtAB's
triggers against TopPkg's indirect deps would fail.

This caused a race condition where TopPkg's precompilation task would
not wait on ExtAB, leading to false cache staleness when ExtAB appeared
in TopPkg's required_modules but its cache path was not yet populated.

The fix iterates the extension edge resolution to a fixed point, so
that adding ExtA as a dep of TopPkg causes ExtAB to be recognized as
loadable in TopPkg on the next iteration.

This is an alternative root-cause fix for the issue addressed by
#61206, which worked around the problem by deferring
cache path registration. This fix instead corrects the dependency
graph so that task ordering is correct.

Fixes #61198

(cherry picked from commit 8659fb4)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NonlinearSolve constantly recompiles in 1.13 and nightly, but not in 1.12

6 participants