feat: per-module inter-library dependency filtering (#4572)#14116
feat: per-module inter-library dependency filtering (#4572)#14116robinbb wants to merge 10 commits intoocaml:mainfrom
Conversation
15c3919 to
2b4ab03
Compare
1776c6e to
c6d39e7
Compare
|
Tagging folks who were interested in the now-deprecated PR 14021. This one supersedes that one, with a more elegant solution, and tests that actually pass. It is out of draft mode as of now. |
|
I need help running the so-called revdeps tests, as I have no credentials on this repo and therefore cannot initiate CI runs manually. I also would like to be instructed on how I can compare the results of the revdeps tests on this branch versus those on 'main'. |
Note: reduced package-level cycle detectionThis PR removes the This is an inherent trade-off of the approach: package cycle detection previously piggybacked on compilation-level glob deps as a proxy for declared library dependencies. More precise per-module deps mean that proxy no longer exists for modules that don't actually import their declared libraries. This only affects the case where a module declares a library dependency but never references it in source code. In practice, this is uncommon — but it does mean some package cycles will only be caught at opam-level resolution rather than at build time. |
6e9a47c to
556bd06
Compare
art-w
left a comment
There was a problem hiding this comment.
Very nice! I left some comments regarding the remaining edge-cases where module aliases might cause issues with other features :)
6bc4161 to
2195aaa
Compare
2195aaa to
a2b5350
Compare
a2b5350 to
9550cb5
Compare
Transparent module aliases (e.g., `module M = OtherLib.Foo`) create cross-library .cmi reads that ocamldep does not report. This test documents the discrepancy: ocamldep only sees the direct library reference, but the compiler follows alias chains and reads .cmi files from transitive libraries at compile time. This property is critical for any future per-module inter-library dependency optimization (ocaml#4572). Signed-off-by: Robin Bate Boerop <me@robinbb.com>
|
Note that this PR slows null builds by 30% (see the benchmark job). It is reasonable to expect some overhead from this refining step, but we should strive to reduce it. Getting it within 5% would be ideal. |
9550cb5 to
d51c71a
Compare
956c365 to
935c56c
Compare
…caml#4572) Use the already-computed ocamldep output to filter inter-library dependencies on a per-module basis. Each module now only depends on .cmi/.cmx files from libraries it actually imports, rather than glob deps on all files from all dependent libraries. This eliminates unnecessary recompilation when a module in a dependency changes but the current module doesn't reference that library. The implementation: - Add read_immediate_deps_raw_of to ocamldep, returning raw module names without filtering against the stanza's module set - Move Hidden_deps from Includes (library-wide) to per-module in build_cm - Add Lib_index mapping module names to libraries, computed from requires_compile and requires_hidden - In build_cm, use ocamldep output + Lib_index to determine which libraries each module actually needs; fall back to all-library glob deps when filtering is not possible (Melange, virtual library implementations, singleton stanzas, alias/root modules) - For local unwrapped libraries, use per-file deps on specific .cmi/.cmx files rather than directory-wide globs Signed-off-by: Robin Bate Boerop <me@robinbb.com> chore: add changelog entry for per-module dependency filtering (ocaml#4572) Signed-off-by: Robin Bate Boerop <me@robinbb.com>
935c56c to
aba965a
Compare
Remove deps_of_module, cm_kinds_for, and the Module.t option from Lib_index and deps_of_entries. All entries use glob deps (no per-file deps on individual modules within libraries), so the Some m path was dead code. Simplify the types accordingly: Lib_index maps module names to Lib.t list, and deps_of_entries takes Lib.t list directly. Per-module filtering within unwrapped libraries can be revisited in the future once the dependency cone of individual modules is tracked. Signed-off-by: Robin Bate Boerop <me@robinbb.com>
|
I thank all reviewers for their comments. I think that I have resolved all criticism from the prior round. |
src/dune_rules/module_compilation.ml
Outdated
| else ( | ||
| (* Try the current ml_kind first; fall back to the other | ||
| kind for interface-only modules that may contain aliases | ||
| to other libraries. *) |
There was a problem hiding this comment.
I believe that you need to check both the dependencies of the .ml and .mli (since the interface can reference other libs than the implementation), but you may skip the .ml deps for Intf or when -opaque is set (if the .mli exists).
|
Some things to keep in mind before I do another round of review:
|
src/dune_rules/module_compilation.ml
Outdated
| let* requires = Resolve.Memo.read (Lib.requires lib ~for_) in | ||
| let new_deps = | ||
| List.filter_map requires ~f:(fun dep -> | ||
| if Table.mem libs_set dep && not (Table.mem covered dep) |
There was a problem hiding this comment.
libs_set should contain all the transitive deps, when is it possible for dep not to be in it?
src/dune_rules/module_compilation.ml
Outdated
| Resolve.Memo.read | ||
| (let open Resolve.Memo.O in | ||
| let+ d = Compilation_context.requires_compile cctx | ||
| and+ h = Compilation_context.requires_hidden cctx in |
There was a problem hiding this comment.
Again I see a user of requires_hidden that leaves me wondering if it's correct. I'm pretty sure it's not and we should be re-constructing whatever is needed from the transitively closed deps by doing the refining step first (as you've done below).
There was a problem hiding this comment.
requires_hidden is included in libs to bound the close_over transitive closure. A directly-referenced library's Lib.requires may return libraries that are in requires_hidden but not requires_compile. Without them in the bounding set, close_over would miss those transitive deps. The lib_index itself only contains direct_requires (per your earlier comment), so hidden libraries only enter through close_over, never through the ocamldep-based filtering.
There was a problem hiding this comment.
... and since I am new to the domain of OCaml compilation, it is important that you question my assumptions/premises. Happy to discuss further.
Extract the per-module inter-library dependency filtering logic from build_cm into a shared filtered_lib_deps helper. Use it in both build_cm and ocamlc_i so that inferred .mli generation also benefits from per-module filtering. Signed-off-by: Robin Bate Boerop <me@robinbb.com>
07c34ba to
72df313
Compare
Use the existing Lib.closure for transitive library dependency closure instead of a hand-rolled close_over function. Lib.closure follows the same requires path and the libraries already passed overlap checks when the compilation context was built. Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Remove the main_module_name special case in lib_index construction. For wrapped libraries, entry_modules returns the wrapper module — the same name that main_module_name provided. Since all entries now use glob deps (no per-file Module.t), the two branches were equivalent. Signed-off-by: Robin Bate Boerop <me@robinbb.com>
- Extract all_libs helper to deduplicate requires_compile @ requires_hidden - Expand Module.kind wildcards to explicit variants - Reword fallback comment to not reference removed code - Drop trivial punctuation-only changes to unrelated test files Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Signed-off-by: Robin Bate Boerop <me@robinbb.com>
A module's .mli can reference different libraries than its .ml. Read ocamldep output from both and union them so the filtered set includes all cross-library references. This makes the filtering more precise and prepares for per-module -I flag filtering (ocaml#14186) where the .cmi compilation rule no longer serves as a safety net. Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Summary
Use the already-computed
ocamldepoutput to filter inter-librarydependencies on a per-module basis.
Previously, when
libAdepends onlibB, every module oflibAgot aglob dependency on all
.cmifiles inlibB. Now each module onlydepends on
.cmi/.cmxfiles from libraries it actually imports,eliminating unnecessary recompilation.
(was 2)
recompilations (was 2), via per-file deps on individual
.cmifilesmodule M = Mylib): correctlyhandled by transitively closing library dependencies of referenced
libraries at arbitrary depth (not just direct requires)
by falling back to the other
ml_kindwhen reading ocamldep outputFallback cases (no change from current behavior)
ocamldepis skipped (singletonoptimization), so filtering is not possible
libraries are not in
requires_compile, so filtering falls backThis PR builds on prerequisite PRs:
#14017 — baseline tests documenting current recompilation behavior
#14031 — test documenting module name shadowing between stanzas and libraries
#14100 — test verifying library file deps in compilation rules and sandboxed builds
#14101 — safety test for incremental builds with transparent aliases
#14129 — test verifying incremental builds with alias re-exported libraries
#14178 — test documenting ocamldep behavior with transparent alias chains
Fixes #4572