fix false positive cache staleness casuing package recompiling#61206
fix false positive cache staleness casuing package recompiling#61206adienes wants to merge 5 commits intoJuliaLang:masterfrom
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
this picked up a merge conflict |
Can you explain a bit more the situation where this happens? Why does the Manifest end up missing this extension? |
The precompilation dependency graph resolution did not account for transitively-triggered extensions. When ExtAB depends on ExtA (because ExtAB's triggers are a strict superset of ExtA's), and a package TopPkg depends on all of ExtAB's triggers, the single-pass graph resolution would add ExtA to TopPkg's deps but not ExtAB -- because indirect_deps was computed before ExtA was added, so the issubset check for ExtAB's triggers against TopPkg's indirect deps would fail. This caused a race condition where TopPkg's precompilation task would not wait on ExtAB, leading to false cache staleness when ExtAB appeared in TopPkg's required_modules but its cache path was not yet populated. The fix iterates the extension edge resolution to a fixed point, so that adding ExtA as a dep of TopPkg causes ExtAB to be recognized as loadable in TopPkg on the next iteration. This is an alternative root-cause fix for the issue addressed by JuliaLang#61206, which worked around the problem by deferring cache path registration. This fix instead corrects the dependency graph so that task ordering is correct. Fixes JuliaLang#61198 Co-authored-by: Claude <noreply@anthropic.com>
|
My claude (opus 4.6) thinks that we can fix the core issue/race instead IanButterworth@6869086 If that looks reasonable I can open a PR |
Aha! Definitely better to fix our dependency modeling here. Go ahead and open the PR, and we can collaborate on the proper fix. |
|
That makes sense. Someday we should investigate switching this to a proper Tarjan's SCC algorithm, instead of these random extra corrections loops. |
|
sg, replaced by #61272 |
The precompilation dependency graph resolution did not account for transitively-triggered extensions. When ExtAB depends on ExtA (because ExtAB's triggers are a strict superset of ExtA's), and a package TopPkg depends on all of ExtAB's triggers, the single-pass graph resolution would add ExtA to TopPkg's deps but not ExtAB -- because indirect_deps was computed before ExtA was added, so the issubset check for ExtAB's triggers against TopPkg's indirect deps would fail. This caused a race condition where TopPkg's precompilation task would not wait on ExtAB, leading to false cache staleness when ExtAB appeared in TopPkg's required_modules but its cache path was not yet populated. The fix iterates the extension edge resolution to a fixed point, so that adding ExtA as a dep of TopPkg causes ExtAB to be recognized as loadable in TopPkg on the next iteration. This is an alternative root-cause fix for the issue addressed by JuliaLang#61206, which worked around the problem by deferring cache path registration. This fix instead corrects the dependency graph so that task ordering is correct. Fixes JuliaLang#61198 Co-authored-by: Claude <noreply@anthropic.com>
The precompilation dependency graph resolution did not account for transitively-triggered extensions. When ExtAB depends on ExtA (because ExtAB's triggers are a strict superset of ExtA's), and a package TopPkg depends on all of ExtAB's triggers, the single-pass graph resolution would add ExtA to TopPkg's deps but not ExtAB -- because indirect_deps was computed before ExtA was added, so the issubset check for ExtAB's triggers against TopPkg's indirect deps would fail. This caused a race condition where TopPkg's precompilation task would not wait on ExtAB, leading to false cache staleness when ExtAB appeared in TopPkg's required_modules but its cache path was not yet populated. The fix iterates the extension edge resolution to a fixed point, so that adding ExtA as a dep of TopPkg causes ExtAB to be recognized as loadable in TopPkg on the next iteration. This is an alternative root-cause fix for the issue addressed by #61206, which worked around the problem by deferring cache path registration. This fix instead corrects the dependency graph so that task ordering is correct. Fixes #61198 (cherry picked from commit 8659fb4)
fixes #61198
this line
julia/base/precompilation.jl
Line 1032 in 7f1add2
was populating
cachepath_cachewithString[]for every key. whencompilecache_freshest_pathaccesses this cache viamodpaths = get(() -> find_all_in_cache_path(modkey), cachepath_cache, modkey)it thus getsmodpaths = String[]rather than the fallbackfind_all_in_cache_path.in the ordinary case this isn't a problem since we
push!tofreshpathsinside the task for each package, and thewaithere ensures packages are compiled in dependency orderjulia/base/precompilation.jl
Line 1057 in 7f1add2
however in the case of #61198, there are package extensions loaded dynamically that will not be
waited upon as above since they are not declared as a dependency in the manifest, but do end up appearing in therequired_moduleslist.the proposal here is to not set
cachepath_cache[pkg] = freshpathsuntilfreshpathsis populated. note that this still leaves the race condition intact, kinda, but the consequence changes from "recompilation of the package" to "has to find the cachefiles on disk rather than pull fromcachepath_cache" which is a much more minor penalty.this specific issue appears to have been surfaced in #59670, but I don't know that it was correct before either
claude assisted the original analysis and wrote the test case