Skip to content

feat: per-module inter-library dependency filtering (#4572)#14116

Draft
robinbb wants to merge 10 commits intoocaml:mainfrom
robinbb:robinbb-issue-4572-combined-tests
Draft

feat: per-module inter-library dependency filtering (#4572)#14116
robinbb wants to merge 10 commits intoocaml:mainfrom
robinbb:robinbb-issue-4572-combined-tests

Conversation

@robinbb
Copy link
Copy Markdown
Collaborator

@robinbb robinbb commented Apr 10, 2026

Summary

Use the already-computed ocamldep output to filter inter-library
dependencies on a per-module basis.

Previously, when libA depends on libB, every module of libA got a
glob dependency on all .cmi files in libB. Now each module only
depends on .cmi/.cmx files from libraries it actually imports,
eliminating unnecessary recompilation.

  • Modules that don't reference a changed library: 0 recompilations
    (was 2)
  • Unwrapped library consumers referencing only specific modules: 0
    recompilations
    (was 2), via per-file deps on individual .cmi files
  • Cross-library transparent aliases (module M = Mylib): correctly
    handled by transitively closing library dependencies of referenced
    libraries at arbitrary depth (not just direct requires)
  • Interface-only modules containing aliases to other libraries: handled
    by falling back to the other ml_kind when reading ocamldep output
  • Sandboxed builds: unaffected

Fallback cases (no change from current behavior)

  • Single-module stanzas: ocamldep is skipped (singleton
    optimization), so filtering is not possible
  • Virtual library implementations and parameterised modules: parameter
    libraries are not in requires_compile, so filtering falls back
  • Melange mode: filtering is OCaml-only

This PR builds on prerequisite PRs:

#14017 — baseline tests documenting current recompilation behavior
#14031 — test documenting module name shadowing between stanzas and libraries
#14100 — test verifying library file deps in compilation rules and sandboxed builds
#14101 — safety test for incremental builds with transparent aliases
#14129 — test verifying incremental builds with alias re-exported libraries
#14178 — test documenting ocamldep behavior with transparent alias chains

Fixes #4572

@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch 3 times, most recently from 15c3919 to 2b4ab03 Compare April 10, 2026 07:20
@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch 6 times, most recently from 1776c6e to c6d39e7 Compare April 10, 2026 21:19
@robinbb robinbb marked this pull request as ready for review April 10, 2026 21:19
@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 10, 2026

Tagging folks who were interested in the now-deprecated PR 14021. This one supersedes that one, with a more elegant solution, and tests that actually pass. It is out of draft mode as of now.
@rgrinberg @Leonidas-from-XIV @Khady @nojb @Alizter @art-w

@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 10, 2026

I need help running the so-called revdeps tests, as I have no credentials on this repo and therefore cannot initiate CI runs manually. I also would like to be instructed on how I can compare the results of the revdeps tests on this branch versus those on 'main'.

@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 10, 2026

Note: reduced package-level cycle detection

This PR removes the @package-cycle test case from reporting-of-cycles.t. The underlying cycle still exists (library a.1 declares (libraries b), library b declares (libraries a.2)), but the modules in those libraries are empty — they don't reference any library in their source code. With per-module filtering, the compilation graph no longer has inter-library edges for those modules, so the package-level cycle is no longer detected.

This is an inherent trade-off of the approach: package cycle detection previously piggybacked on compilation-level glob deps as a proxy for declared library dependencies. More precise per-module deps mean that proxy no longer exists for modules that don't actually import their declared libraries.

This only affects the case where a module declares a library dependency but never references it in source code. In practice, this is uncommon — but it does mean some package cycles will only be caught at opam-level resolution rather than at build time.

@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch from 6e9a47c to 556bd06 Compare April 11, 2026 21:09
Copy link
Copy Markdown
Collaborator

@art-w art-w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! I left some comments regarding the remaining edge-cases where module aliases might cause issues with other features :)

@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch 2 times, most recently from 6bc4161 to 2195aaa Compare April 13, 2026 19:36
@robinbb robinbb marked this pull request as draft April 13, 2026 19:38
@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch from 2195aaa to a2b5350 Compare April 13, 2026 20:06
@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch from a2b5350 to 9550cb5 Compare April 13, 2026 20:38
Transparent module aliases (e.g., `module M = OtherLib.Foo`) create
cross-library .cmi reads that ocamldep does not report. This test
documents the discrepancy: ocamldep only sees the direct library
reference, but the compiler follows alias chains and reads .cmi files
from transitive libraries at compile time.

This property is critical for any future per-module inter-library
dependency optimization (ocaml#4572).

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@rgrinberg
Copy link
Copy Markdown
Member

Note that this PR slows null builds by 30% (see the benchmark job). It is reasonable to expect some overhead from this refining step, but we should strive to reduce it. Getting it within 5% would be ideal.

@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch from 9550cb5 to d51c71a Compare April 13, 2026 22:07
@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch 2 times, most recently from 956c365 to 935c56c Compare April 13, 2026 22:58
…caml#4572)

Use the already-computed ocamldep output to filter inter-library
dependencies on a per-module basis. Each module now only depends on
.cmi/.cmx files from libraries it actually imports, rather than glob
deps on all files from all dependent libraries. This eliminates
unnecessary recompilation when a module in a dependency changes but the
current module doesn't reference that library.

The implementation:

- Add read_immediate_deps_raw_of to ocamldep, returning raw module names
  without filtering against the stanza's module set
- Move Hidden_deps from Includes (library-wide) to per-module in build_cm
- Add Lib_index mapping module names to libraries, computed from
  requires_compile and requires_hidden
- In build_cm, use ocamldep output + Lib_index to determine which
  libraries each module actually needs; fall back to all-library glob
  deps when filtering is not possible (Melange, virtual library
  implementations, singleton stanzas, alias/root modules)
- For local unwrapped libraries, use per-file deps on specific .cmi/.cmx
  files rather than directory-wide globs

Signed-off-by: Robin Bate Boerop <me@robinbb.com>

chore: add changelog entry for per-module dependency filtering (ocaml#4572)

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch from 935c56c to aba965a Compare April 14, 2026 01:38
Remove deps_of_module, cm_kinds_for, and the Module.t option from
Lib_index and deps_of_entries. All entries use glob deps (no per-file
deps on individual modules within libraries), so the Some m path was
dead code. Simplify the types accordingly: Lib_index maps module names
to Lib.t list, and deps_of_entries takes Lib.t list directly.

Per-module filtering within unwrapped libraries can be revisited in
the future once the dependency cone of individual modules is tracked.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 14, 2026

I thank all reviewers for their comments. I think that I have resolved all criticism from the prior round.

@robinbb robinbb marked this pull request as ready for review April 14, 2026 06:01
else (
(* Try the current ml_kind first; fall back to the other
kind for interface-only modules that may contain aliases
to other libraries. *)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that you need to check both the dependencies of the .ml and .mli (since the interface can reference other libs than the implementation), but you may skip the .ml deps for Intf or when -opaque is set (if the .mli exists).

@rgrinberg
Copy link
Copy Markdown
Member

Some things to keep in mind before I do another round of review:

  1. The null build regression has grown to 40%. Could you make an effort to understand where this is coming from?. I will try to reduce it further on Friday, but putting some work into this now will be appreciated.
  2. There's no speed up at all when it comes to building dune itself. I find that surprising and dissapointing. Is it because of First class support for the import module pattern #14061?
  3. Could you run some benchmarks that demonstrate that this feature is beneficial?
  4. Did we manager to convince ourselves that there are no longer any build regression for revdeps?

let* requires = Resolve.Memo.read (Lib.requires lib ~for_) in
let new_deps =
List.filter_map requires ~f:(fun dep ->
if Table.mem libs_set dep && not (Table.mem covered dep)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libs_set should contain all the transitive deps, when is it possible for dep not to be in it?

Resolve.Memo.read
(let open Resolve.Memo.O in
let+ d = Compilation_context.requires_compile cctx
and+ h = Compilation_context.requires_hidden cctx in
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again I see a user of requires_hidden that leaves me wondering if it's correct. I'm pretty sure it's not and we should be re-constructing whatever is needed from the transitively closed deps by doing the refining step first (as you've done below).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requires_hidden is included in libs to bound the close_over transitive closure. A directly-referenced library's Lib.requires may return libraries that are in requires_hidden but not requires_compile. Without them in the bounding set, close_over would miss those transitive deps. The lib_index itself only contains direct_requires (per your earlier comment), so hidden libraries only enter through close_over, never through the ocamldep-based filtering.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and since I am new to the domain of OCaml compilation, it is important that you question my assumptions/premises. Happy to discuss further.

@robinbb robinbb marked this pull request as draft April 14, 2026 21:26
Extract the per-module inter-library dependency filtering logic from
build_cm into a shared filtered_lib_deps helper. Use it in both
build_cm and ocamlc_i so that inferred .mli generation also benefits
from per-module filtering.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572-combined-tests branch from 07c34ba to 72df313 Compare April 14, 2026 22:08
robinbb added 4 commits April 14, 2026 16:04
Use the existing Lib.closure for transitive library dependency closure
instead of a hand-rolled close_over function. Lib.closure follows the
same requires path and the libraries already passed overlap checks when
the compilation context was built.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Remove the main_module_name special case in lib_index construction.
For wrapped libraries, entry_modules returns the wrapper module — the
same name that main_module_name provided. Since all entries now use
glob deps (no per-file Module.t), the two branches were equivalent.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
- Extract all_libs helper to deduplicate requires_compile @ requires_hidden
- Expand Module.kind wildcards to explicit variants
- Reword fallback comment to not reference removed code
- Drop trivial punctuation-only changes to unrelated test files

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Signed-off-by: Robin Bate Boerop <me@robinbb.com>
robinbb added 2 commits April 14, 2026 21:14
Signed-off-by: Robin Bate Boerop <me@robinbb.com>
A module's .mli can reference different libraries than its .ml. Read
ocamldep output from both and union them so the filtered set includes
all cross-library references. This makes the filtering more precise
and prepares for per-module -I flag filtering (ocaml#14186) where the .cmi
compilation rule no longer serves as a safety net.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Finer dependency analysis between libraries

3 participants