perf (tsort) : avoid reading the whole input into memory and intern strings by anastygnome · Pull Request #9872 · uutils/coreutils

anastygnome · 2025-12-26T21:48:45Z

This PR is a work in progress to get more performance out of `tsort`.

The move to `usize` symbols is complete. The next step is to replace the `HashMap` with a `Vec`.
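
For readers unfamiliar with the approach in the PR title, here is a minimal sketch of streaming input and interning tokens into dense integer ids. It is not the PR's code: `Interner`, `read_edges`, and the `Vec`-based adjacency layout are hypothetical names chosen for illustration, only the standard library is used, and a dangling odd token is silently ignored for brevity.

```rust
use std::collections::HashMap;
use std::io::{BufRead, Cursor};

/// Hypothetical interner: maps each distinct token to a dense `usize` id.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, usize>,
    names: Vec<String>,
}

impl Interner {
    /// Returns the existing id for `token`, or assigns the next free one.
    fn intern(&mut self, token: &str) -> usize {
        if let Some(&id) = self.ids.get(token) {
            return id;
        }
        let id = self.names.len();
        self.ids.insert(token.to_owned(), id);
        self.names.push(token.to_owned());
        id
    }

    /// Looks up the original string for an id handed out by `intern`.
    fn resolve(&self, id: usize) -> &str {
        &self.names[id]
    }
}

/// Reads whitespace-separated tokens line by line (never the whole input at
/// once) and pairs them up into edges expressed as ids.
fn read_edges<R: BufRead>(
    reader: R,
    interner: &mut Interner,
) -> std::io::Result<Vec<(usize, usize)>> {
    let mut edges = Vec::new();
    let mut pending: Option<usize> = None;
    for line in reader.lines() {
        let line = line?;
        for token in line.split_whitespace() {
            let id = interner.intern(token);
            match pending.take() {
                Some(from) => edges.push((from, id)),
                None => pending = Some(id),
            }
        }
    }
    Ok(edges)
}

fn main() -> std::io::Result<()> {
    let mut interner = Interner::default();
    let edges = read_edges(Cursor::new("a b\nb c\n"), &mut interner)?;

    // Dense ids allow a Vec indexed by id instead of a HashMap keyed by string.
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); interner.names.len()];
    for &(from, to) in &edges {
        successors[from].push(to);
    }
    for (id, succ) in successors.iter().enumerate() {
        let names: Vec<&str> = succ.iter().map(|&s| interner.resolve(s)).collect();
        println!("{} -> {:?}", interner.resolve(id), names);
    }
    Ok(())
}
```

Because ids are handed out in order of first appearance, each distinct string is stored once and every later occurrence costs only an integer.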

codspeed-hq · 2025-12-26T22:02:24Z

CodSpeed Performance Report

Merging #9872 will degrade performance by 5.29%

Comparing `anastygnome:tsort` (4420344) with `main` (84e6f03)

Summary

⚡ 1 improvement
❌ 1 regression
✅ 128 untouched
⏩ 30 skipped¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|
| ⚡ | `tsort_input_parsing_heavy[5000]` | 84.2 ms | 71.9 ms | +17.13% |
| ❌ | `tsort_linear_chain[1000000]` | 1.5 s | 1.6 s | -5.29% |

¹ 30 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them on CodSpeed to remove them from the performance reports.

github-actions · 2025-12-26T22:07:22Z

GNU testsuite comparison:

GNU test failed: tests/misc/tsort. tests/misc/tsort is passing on 'main'. Maybe you have to rebase?

github-actions · 2025-12-27T22:04:56Z

GNU testsuite comparison:

GNU test failed: tests/cp/cp-mv-enotsup-xattr. tests/cp/cp-mv-enotsup-xattr is passing on 'main'. Maybe you have to rebase?
Note: The gnu test tests/csplit/csplit-io-err was skipped on 'main' but is now failing.

github-actions · 2025-12-27T22:38:10Z

GNU testsuite comparison:

Congrats! The gnu test tests/tail/follow-name is no longer failing!

github-actions · 2025-12-27T23:02:11Z

GNU testsuite comparison:

Congrats! The gnu test tests/tail/follow-name is no longer failing!

github-actions · 2025-12-27T23:59:47Z

GNU testsuite comparison:

GNU test failed: tests/misc/tsort. tests/misc/tsort is passing on 'main'. Maybe you have to rebase?

github-actions · 2025-12-28T01:14:42Z

GNU testsuite comparison:

GNU test failed: tests/cp/cp-mv-enotsup-xattr. tests/cp/cp-mv-enotsup-xattr is passing on 'main'. Maybe you have to rebase?
Note: The gnu test tests/csplit/csplit-io-err was skipped on 'main' but is now failing.

sylvestre · 2025-12-28T10:35:05Z

any idea why the perf regressed here?

Cargo.toml

anastygnome · 2025-12-28T11:23:59Z

Oh, @sylvestre , I just added a few changes :)
The perf regression is most likely due to using `usize` symbols. `u32` actually improves performance.
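
As a rough illustration of why symbol width can matter, the sketch below compares the size of a per-node record built from `usize` fields with one built from `u32` fields. The `NodeWide` and `NodeNarrow` structs are hypothetical and do not reflect the PR's actual data layout, and the 64-byte cache line is an assumption about the target CPU.

```rust
use std::mem::size_of;

// Hypothetical per-node records; the real layout in the PR may differ.
#[allow(dead_code)]
struct NodeWide {
    symbol: usize,
    in_degree: usize,
}

#[allow(dead_code)]
struct NodeNarrow {
    symbol: u32,
    in_degree: u32,
}

fn main() {
    // Assumed cache line size; common on x86-64 but not universal.
    const CACHE_LINE: usize = 64;

    println!("usize-based node: {} bytes", size_of::<NodeWide>());
    println!("u32-based node:   {} bytes", size_of::<NodeNarrow>());
    println!(
        "nodes per cache line: {} vs {}",
        CACHE_LINE / size_of::<NodeWide>(),
        CACHE_LINE / size_of::<NodeNarrow>()
    );
}
```

On a 64-bit target the `usize` record is twice as large, so half as many nodes fit in each cache line during graph traversal.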

github-actions · 2025-12-28T11:51:43Z

GNU testsuite comparison:

GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?

github-actions · 2025-12-28T12:13:50Z

GNU testsuite comparison:

Congrats! The gnu test tests/tail/assert is no longer failing!

anastygnome · 2025-12-28T13:29:53Z

@sylvestre I believe the regression observed is due to cache misses, because the metadata that goes with each token is slightly bigger than a `usize`.

Here's another benchmark that may be of interest:

graph.tsort, 1 Gb random graph
Massif output:
GNU tsort:

    MB (Memory)
248.6^                                                                       #
     |                                                          ::::::::::@::#
     |                                        ::::::::::::::::::: ::: ::::@: #
     |                               :::::@:::: ::::: : ::::: ::: ::: ::::@: #
     |                            @@:: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |                        ::::@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |                     @@::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |                   ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |               ::::::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |              :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |            :::::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |         :::: :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |       :::: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |      :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |    :::: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |    : :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |   :: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     |   :: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     | @@:: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
     | @ :: :: :: : :::: ::@ ::: :@ :: : :@:: : ::::: : ::::: ::: ::: ::::@: #
   0 +----------------------------------------------------------------------->Gi (Giga Instructions) 
     0                                                                   23.47

Number of snapshots: 56
 Detailed snapshots: [1, 2, 16, 21, 26, 52, 55 (peak)]
 
 
 
uutils tsort 

     MB (Memory)
200.0^                                                                    #   
     |                                @::::@:::::::::::@::::@::::@::::@:::#:  
     |              :::::::::::@::::::@::::@::: :::::: @::::@::::@::::@:::#:  
     |           ::::::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:  
     |         :::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:: 
     |        ::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:: 
     |      @:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |     :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |     :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |     :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |     :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |     :@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |   @::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |  :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |  :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |  :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     |  :@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     | ::@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     | ::@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
     | ::@::@:::::: :::::::::::@:::: :@::::@::: :::::: @::::@::::@::::@:::#:::
   0 +----------------------------------------------------------------------->Gi (Giga Instructions) 
     0                                                                   35.83

Number of snapshots: 98
 Detailed snapshots: [3, 6, 24, 31, 37, 50, 51, 52, 62, 72, 82, 91 (peak)]

Hyperfine:
 
Benchmark 1: target/release/coreutils tsort /home/bench_vm/graph.tsort
  Time (mean ± σ):      6.044 s ±  0.071 s    [User: 5.602 s, System: 0.394 s]
  Range (min … max):    5.974 s …  6.168 s    10 runs
 
Benchmark 2: tsort /home/bench_vm/graph.tsort 
  Time (mean ± σ):      7.115 s ±  0.147 s    [User: 6.738 s, System: 0.328 s]
  Range (min … max):    6.910 s …  7.390 s    10 runs
 
Summary: 'target/release/coreutils tsort /home/bench_vm/graph.tsort'
ran 1.18 ± 0.03 times faster than 'tsort /home/bench_vm/graph.tsort'
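
The "1.18 ± 0.03" figure in the summary can be reproduced from the two means and standard deviations above. The sketch below assumes standard first-order error propagation for a ratio of independent measurements; the `speedup` helper is a hypothetical name, not part of hyperfine.

```rust
/// Ratio of two mean run times with first-order error propagation.
/// Each argument is a `(mean, standard deviation)` pair in seconds.
fn speedup(fast: (f64, f64), slow: (f64, f64)) -> (f64, f64) {
    let (fast_mean, fast_sd) = fast;
    let (slow_mean, slow_sd) = slow;
    let ratio = slow_mean / fast_mean;
    // Relative uncertainties add in quadrature for a quotient.
    let relative = ((fast_sd / fast_mean).powi(2) + (slow_sd / slow_mean).powi(2)).sqrt();
    (ratio, ratio * relative)
}

fn main() {
    // uutils: 6.044 s ± 0.071 s, GNU: 7.115 s ± 0.147 s (from the runs above).
    let (ratio, sd) = speedup((6.044, 0.071), (7.115, 0.147));
    println!("{ratio:.2} ± {sd:.2}"); // prints "1.18 ± 0.03"
}
```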

github-actions · 2025-12-28T13:54:57Z

GNU testsuite comparison:

Congrats! The gnu test tests/tail/assert is no longer failing!

github-actions · 2025-12-28T18:15:10Z

GNU testsuite comparison:

Congrats! The gnu test tests/tty/tty-eof is no longer failing!

github-actions · 2026-01-01T17:18:09Z

GNU testsuite comparison:

GNU test failed: tests/shuf/shuf-reservoir. tests/shuf/shuf-reservoir is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

github-actions · 2026-01-01T18:29:32Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

sylvestre · 2026-01-01T21:11:27Z

The Windows build fails with:


error[E0308]: mismatched types
  --> src\uu\tsort\src\tsort.rs:80:24
   |
80 |             return Err(TsortError::IsDir(input.to_string_lossy().to_string()));
   |                    --- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `UIoError`, found `TsortError`
   |                    |
   |                    arguments to this enum variant are incorrect
   |
help: the type constructed contains `TsortError` due to the type of the argument passed
  --> src\uu\tsort\src\tsort.rs:80:20
   |
80 |             return Err(TsortError::IsDir(input.to_string_lossy().to_string()));
   |                    ^^^^------------------------------------------------------^
   |                        |
   |                        this argument influences the type of `Err`
note: tuple variant defined here
  --> /rustc/8d670b93d40737e1b320fd892c6f169ffa35e49e/library\core\src\result.rs:566:4
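
For context, E0308 here means the function's declared error type is not the type of the value passed to `Err`. The sketch below shows one common way such a mismatch is resolved in Rust, by converting the concrete error into the function's error type with `.into()`. It is not the fix applied in this PR: the `TsortError` below is a local stand-in, `check_input` is a hypothetical helper, the message text is made up, and `Box<dyn Error>` stands in for whatever error type the real function returns.

```rust
use std::error::Error;
use std::fmt;

// Stand-in for the PR's `TsortError`; the real type lives in tsort.rs and
// its variants and messages may differ.
#[derive(Debug)]
enum TsortError {
    IsDir(String),
}

impl fmt::Display for TsortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TsortError::IsDir(path) => write!(f, "{path}: read error: Is a directory"),
        }
    }
}

impl Error for TsortError {}

// Hypothetical helper. Its error type is a boxed trait object, so a bare
// `Err(TsortError::IsDir(..))` would be rejected with E0308, as in the log.
fn check_input(path: &str, is_dir: bool) -> Result<(), Box<dyn Error>> {
    if is_dir {
        // `.into()` uses the standard `From<E: Error>` impl for `Box<dyn Error>`.
        return Err(TsortError::IsDir(path.to_owned()).into());
    }
    Ok(())
}

fn main() {
    match check_input("some_dir", true) {
        Ok(()) => println!("ok"),
        Err(e) => eprintln!("tsort: {e}"),
    }
}
```

The other usual option is to change the function's return type so that it matches the error being constructed.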

anastygnome · 2026-01-01T21:13:02Z

@sylvestre Fixed. The rest of the work will go into another PR.


github-actions · 2026-01-01T21:31:32Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

github-actions · 2026-01-01T21:48:57Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

github-actions · 2026-01-01T22:09:22Z

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

anastygnome · 2026-01-01T22:34:28Z

@sylvestre done :)


anastygnome · 2026-01-02T11:32:49Z

The performance regression is inevitable here if we want to support the same input range (2^64 - 1 unique input tokens). There's also a slight buffering cost at the benchmark's graph size. We are still far more memory efficient, so I'd still consider it a net positive.
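
To make the range trade-off concrete, the following sketch shows where a 32-bit symbol would run out of ids. `symbol_u32` is a hypothetical helper written for this illustration and is not taken from the PR or from any interning crate.

```rust
/// Hypothetical conversion from a dense token index to a 32-bit symbol.
/// Returns `None` once the index no longer fits.
fn symbol_u32(index: u64) -> Option<u32> {
    u32::try_from(index).ok()
}

fn main() {
    let last_u32 = u64::from(u32::MAX);

    // The last index a u32 symbol can represent.
    println!("{:?}", symbol_u32(last_u32));
    // One more distinct token and a u32 symbol overflows.
    println!("{:?}", symbol_u32(last_u32 + 1));
    // A 64-bit symbol keeps going up to this index.
    println!("{}", u64::MAX);
}
```

A narrower symbol halves the per-token id size on 64-bit targets, but caps the input at roughly 4.29 billion distinct tokens.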

anastygnome force-pushed the tsort branch 3 times, most recently from 81ece9b to 0a56fe6 Compare December 27, 2025 21:48

anastygnome force-pushed the tsort branch from 0a56fe6 to bc88290 Compare December 27, 2025 22:22

anastygnome force-pushed the tsort branch from bc88290 to 4c0138c Compare December 27, 2025 22:46

anastygnome force-pushed the tsort branch 2 times, most recently from ddb2361 to b98950c Compare December 27, 2025 23:43

anastygnome force-pushed the tsort branch from b98950c to fdf5306 Compare December 28, 2025 00:59

anastygnome force-pushed the tsort branch from fdf5306 to 8908850 Compare December 28, 2025 11:21

sylvestre reviewed Dec 28, 2025


sylvestre reviewed Dec 28, 2025


anastygnome force-pushed the tsort branch from c0f1452 to 6da22e3 Compare December 28, 2025 11:55

anastygnome force-pushed the tsort branch from 6da22e3 to 17c9efe Compare December 28, 2025 13:24

anastygnome marked this pull request as ready for review December 28, 2025 13:24

anastygnome requested a review from sylvestre December 28, 2025 13:36

anastygnome force-pushed the tsort branch from 17c9efe to c1dc245 Compare December 28, 2025 17:59

anastygnome force-pushed the tsort branch 2 times, most recently from 7e7fa49 to 8c49b1a Compare January 1, 2026 17:02

anastygnome force-pushed the tsort branch 6 times, most recently from 756b865 to b9604fe Compare January 1, 2026 18:16

anastygnome force-pushed the tsort branch 2 times, most recently from 8ac2d2a to 51b542f Compare January 1, 2026 21:07

anastygnome force-pushed the tsort branch 2 times, most recently from 3cd87f8 to a8a1860 Compare January 1, 2026 21:12

anastygnome force-pushed the tsort branch from a8a1860 to 9f49d68 Compare January 1, 2026 21:14

perf(tsort): avoid reading the whole input into memory and intern strings

2c039d6

anastygnome force-pushed the tsort branch from 9f49d68 to 2c039d6 Compare January 1, 2026 21:20

anastygnome force-pushed the tsort branch from 525b3b3 to 826a15f Compare January 1, 2026 21:50

perf(tsort): avoid redundant check on input

1e20789

anastygnome force-pushed the tsort branch from 826a15f to 1e20789 Compare January 1, 2026 21:57

perf(tsort): switch to the Bucket interning Backend for better lookup performance

4420344

sylvestre merged commit 9086f43 into uutils:main Jan 2, 2026
127 of 129 checks passed

moonfruit mentioned this pull request Feb 3, 2026

uutils-selected 0.6.0 moonfruit/homebrew-tap#453

Closed
