dd: get rid of line buffered stdout #10235
Line-buffered stdout causes partial write and read operations in dd, which is an issue when writing binary data to stdout. Partial writes can lead to data loss and require passing iflag=fullblock to ensure that the exact number of bytes is read.
(cherry picked from commit 0f7c531)
I believe we have a shared library that handles this behavior: OwnedFileDescriptorOrHandle. My understanding is that treating the Fd as a file, even if it's Stdout, will also solve the issue by bypassing the LineWriter. There is the performance tradeoff of cloning the Fd, but it also means we can avoid the unsafe, and it won't close stdout on drop. I checked whether OwnedFileDescriptorOrHandle works, and with the examples you provided it seems to solve the issue you are describing.
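For readers following along, here is a minimal sketch of the idea using only the standard library. It is not the code from this PR and it does not use OwnedFileDescriptorOrHandle, whose API is not shown in this thread; `raw_stdout` and `write_blocks` are hypothetical names, and the sketch assumes a Unix target.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::fd::AsFd; // Unix-only

/// Duplicate the stdout descriptor and wrap the copy in a `File`.
/// Writes through the `File` skip the `LineWriter` inside `std::io::Stdout`.
/// Dropping the `File` closes only the duplicate, so the real stdout stays
/// open, and no `unsafe` is needed.
fn raw_stdout() -> io::Result<File> {
    let owned = io::stdout().as_fd().try_clone_to_owned()?;
    Ok(File::from(owned))
}

/// Hypothetical helper: write `count` copies of `block`, return bytes written.
fn write_blocks<W: Write>(out: &mut W, block: &[u8], count: usize) -> io::Result<usize> {
    for _ in 0..count {
        // `write_all` retries until the whole block has been accepted.
        out.write_all(block)?;
    }
    out.flush()?;
    Ok(block.len() * count)
}

fn main() -> io::Result<()> {
    let mut out = raw_stdout()?;
    let block = [b'x'; 512];
    let written = write_blocks(&mut out, &block, 4)?;
    eprintln!("wrote {written} bytes");
    Ok(())
}
```

The cost is one extra descriptor duplication at startup, which is the cloning tradeoff mentioned above.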
OwnedFileDescriptorOrHandle can be used to bypass the LineWriter that is used by default for Stdout.
Thanks for the good points. I’ve applied the comments.
Merging this PR will degrade performance by 4.77%
Those memory tests were just added today, so they can be ignored.
* dd: get rid of line-buffered stdout

  Line-buffered stdout causes partial write and read operations in dd, which is an issue when writing binary data to stdout. Partial writes can lead to data loss and require passing iflag=fullblock to ensure that the exact number of bytes is read.

* dd: Add test to check for dropped writes

  (cherry picked from commit 0f7c531)

* Fix build on Windows

* dd: use OwnedFileDescriptorOrHandle

  OwnedFileDescriptorOrHandle can be used to bypass the LineWriter that is used by default for Stdout.

* Run test_no_dropped_writes only on unix

---------

Co-authored-by: Adrian Kretz <me@akretz.com>
Line-buffered stdout causes partial write and read operations in dd, which is an issue when writing binary data to stdout. Partial writes can lead to data loss and require passing iflag=fullblock to ensure that the exact number of bytes is read.
This fixes the partial writes mentioned in PR #8840 and issue #9119.
Partial writes can be reproduced by the following calls
As you can see, each call copies a different number of bytes, and in some cases a broken pipe can occur. The broken pipe happens when the second dd closes the pipe too early.
The issue with line-buffered stdout is well-known:
https://ericswpark.com/blog/2025/2025-01-23-buffering-by-block-in-rust/
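To make the effect concrete, here is a small sketch (not taken from dd) that pushes one binary block through `std::io::LineWriter`, the wrapper `Stdout` uses internally, and records what the underlying sink receives. `Recorder` and `writes_through_line_writer` are hypothetical names for illustration only.

```rust
use std::io::{self, LineWriter, Write};

/// Hypothetical sink that records the size of every write it receives.
struct Recorder {
    sizes: Vec<usize>,
}

impl Write for Recorder {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.sizes.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Send one block through a `LineWriter` and return the sizes of the
/// writes that actually reached the sink.
fn writes_through_line_writer(block: &[u8]) -> io::Result<Vec<usize>> {
    let mut lw = LineWriter::new(Recorder { sizes: Vec::new() });
    lw.write_all(block)?;
    lw.flush()?;
    Ok(lw.get_ref().sizes.clone())
}

fn main() -> io::Result<()> {
    // A binary block that happens to contain a 0x0A (newline) byte.
    let block = b"\x00\x01\n\x02\x03";
    let sizes = writes_through_line_writer(block)?;
    eprintln!("one {}-byte block reached the sink as writes of {sizes:?} bytes", block.len());
    Ok(())
}
```

A block that contains a newline byte is split at that byte before it reaches the sink, so a reader on the other end of a pipe can see fewer bytes per read than the writer's block size.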
For the fix I used the workaround shared in rust-lang/rust#58326 (comment).
This is almost my first snippet in Rust, so please don’t judge too strictly.