tr: fix high memory use, possible heap exhaustion #8435
sylvestre merged 2 commits into uutils:main
Conversation
Somehow test_cp is failing with an invalid file descriptor in cp, in the l10n tests. 😕

I think we may want a larger buffer size, as it could be beneficial for performance, say 4 MB at a time, but I just went with the GNU buffer size for now.

Please rerun the l10n test; I wonder if there is a race condition with pipe_in of some sort.

GNU testsuite comparison:

Yeah, since you're at it, a larger buffer seems worth it... Mind doing a few simple benchmarks with
drinkcat left a comment
Thanks! The change looks fine and we can merge it first, but if you want to do more benchmarking/optimization, here are some ideas.
```rust
let filtered = buf[..length]
    .iter()
    .filter_map(|&c| translator.translate(c));
output_buf.extend(filtered);
```
Maybe for a follow-up, but if you want to do benchmarking (and fine-tuning) here: I wonder if we can just directly write_all the filtered bytes, instead of collecting the data into output_buf first?
(But then you might want to restore a buffered writer on stdout.)
I think we should deal with performance separately; this is a surprisingly big issue:

```
$ hyperfine 'gnutr a b < zeroes'
Benchmark 1: gnutr a b < zeroes
  Time (mean ± σ):  558.3 ms ± 80.6 ms  [User: 356.6 ms, System: 201.5 ms]
  Range (min … max): 528.3 ms … 787.5 ms  10 runs
$ hyperfine './target/release/tr a b < zeroes'
Benchmark 1: ./target/release/tr a b < zeroes
  ⠴ Current estimate: 5.681 s
```

(main is at 6.8 ish)
You should run hyperfine this way:

```
hyperfine 'gnutr a b < zeroes' './target/release/tr a b < zeroes'
```

You get much better output.
Ouch. Yes, sounds good; it would be great if you can file an issue for the performance problem.
(I didn't look at our code for tr, but we use some vector instructions in wc to find newline chars that may also be helpful here.)
Read the input into a statically sized buffer (8192 bytes, matching GNU) instead of reading until the end of the line, since reading until the end of the line in a file with no end-of-line would result in reading the entire file into memory. Confusingly, GNU tr seems to write the 8192 bytes in two chunks of 1024 and 7168 bytes, but I can't figure out why it would do that; I don't see any line buffering in GNU tr.

Bug-Ubuntu: https://launchpad.net/bugs/2119520
Our stdin that we transform is already buffered (using 8192-byte buffers in the previous commit), so avoid buffering our output needlessly.
This effectively changes the code to write complete lines immediately. For example, in

```
( echo a; sleep 1 ) | tr a b
```

we receive

```
read(0, "a\n", 8192) = 2
write(1, "b\n", 2) = 2
read(0, "", 8192) = 0
```

instead of

```
read(0, "a\n", 8192) = 2
read(0, "", 8192) = 0
write(1, "b\n", 2) = 2
```

which matches the GNU coreutils behavior.
GNU testsuite comparison:
First commit changes the code to use a fixed-size buffer to avoid heap exhaustion when reading a file with no newlines.
Second commit removes the buffering on stdout, since we already buffer our reads on stdin (we read in 8192-byte chunks now). This was needed to match the observed GNU buffering behavior. It's not entirely exact, in that GNU writes a first chunk of 1024 bytes and then the remaining 7168 bytes, when running on my file created with

truncate file -s 96G

(to simulate heap exhaustion). We could add a test case in the form of `ulimit -v 128M` and generating a file that has, say, 256M, but I'm not sure how meaningful it is.