Summary
When yield_on_full_buffer(true) is used with an output buffer smaller than a decoded code value, the Classic decoder returns NoProgress with consumed_in=0, consumed_out=0 while the internal buffer and bit buffer still contain decodable data. The caller interprets this as "done" and stops, silently losing the remaining output.
Reproduction
use weezl::{decode::{Configuration, TableStrategy}, encode, BitOrder, LzwStatus};

let data = vec![42u8; 8];
let encoded = encode::Encoder::new(BitOrder::Lsb, 8).encode(&data).unwrap();
let mut dec = Configuration::new(BitOrder::Lsb, 8)
    .with_yield_on_full_buffer(true)
    .with_table_strategy(TableStrategy::Classic)
    .build();

let mut result = Vec::new();
let mut inp = encoded.as_slice();
loop {
    let mut tmp = [0u8; 1]; // 1-byte output buffer
    let r = dec.decode_bytes(inp, &mut tmp);
    inp = &inp[r.consumed_in..];
    result.extend_from_slice(&tmp[..r.consumed_out]);
    match r.status {
        Ok(LzwStatus::Done) => break,
        Ok(LzwStatus::NoProgress) if r.consumed_in == 0 && r.consumed_out == 0 => break,
        Ok(_) => {}
        Err(_) => break,
    }
}
assert_eq!(result.len(), 8); // FAILS: result.len() == 1
Chunked has the same bug. KwKwK-heavy data (repeated single byte) triggers it most easily because every code is next_code, forcing the burst loop to exit after one code and route through the internal buffer.
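Why repeated-byte data is the easiest trigger can be illustrated with a minimal, pure-std, code-level LZW encoder sketch (symbols only, no bit packing — this is an illustration, not weezl's implementation): after the first literal, every emitted code is the table entry added on the immediately preceding step, which is exactly the code == next_code (KwKwK) case the decoder must special-case.

```rust
use std::collections::HashMap;

// Minimal code-level LZW encode for 8-bit data. Codes 256/257 are
// assumed reserved for clear/EOI, so the first dynamic entry is 258.
fn lzw_codes(data: &[u8]) -> Vec<u16> {
    let mut dict: HashMap<Vec<u8>, u16> =
        (0..=255u16).map(|c| (vec![c as u8], c)).collect();
    let mut next_code = 258u16; // 256 = clear, 257 = end-of-information
    let mut codes = Vec::new();
    let mut w: Vec<u8> = Vec::new();
    for &c in data {
        let mut wc = w.clone();
        wc.push(c);
        if dict.contains_key(&wc) {
            w = wc;
        } else {
            // Emit the longest known prefix, then register the extension.
            codes.push(dict[&w]);
            dict.insert(wc, next_code);
            next_code += 1;
            w = vec![c];
        }
    }
    if !w.is_empty() {
        codes.push(dict[&w]);
    }
    codes
}
```

For `[42u8; 8]` this emits `[42, 258, 259, 258]`: the decoder receives 258 and 259 before it has finished defining them, so every dynamic code takes the KwKwK path.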
Root cause
In DecodeState::advance(), when the internal buffer is empty at the start of a call, line 957 unconditionally sets status = NoProgress:
} else if remain.is_empty() {
    status = Ok(LzwStatus::NoProgress);
    have_yet_to_decode_data = true;
}
The burst loop then extracts a code from the bit buffer (which still has codes from an earlier greedy refill_bits) and writes it to the internal buffer (is_in_buffer = true). This produces consumed_out = 0 because data went to the internal buffer, not to out. At the end of advance(), the guard that corrects NoProgress:
if o_in > inp.len() {
    if let Ok(LzwStatus::NoProgress) = status {
        status = Ok(LzwStatus::Ok);
    }
}
...doesn't fire because o_in == inp.len() (no new input was consumed — the bits were already in the bit buffer from a previous call's greedy refill). So the false NoProgress propagates to the caller.
Suggested fixes
Option A: Fix the guard (correct for all buffer sizes)
At the end of advance(), also check whether data was written to the internal buffer:
let made_progress = o_in > inp.len()
    || !self.buffer.buffer().is_empty();
if made_progress {
    if let Ok(LzwStatus::NoProgress) = status {
        status = Ok(LzwStatus::Ok);
    }
}
This is safe because if the buffer has data, the next advance() call will drain it and produce consumed_out > 0.
Option B: Document minimum buffer size + debug_assert
LZW code values can be up to 3839 bytes long (KwKwK worst case with MAX_ENTRIES=4096). If the output buffer is always ≥ 4096 bytes, every code's output fits directly in out and the is_in_buffer path never triggers with yield_on_full.
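The 3839 figure follows from a short calculation, assuming 8-bit symbols with two reserved codes (clear and EOI, so the first dynamic entry is 258): each chained KwKwK entry decodes to one byte more than the entry before it, so entry 258 decodes to 2 bytes, entry 259 to 3, and in general entry k to k - 256 bytes.

```rust
// Worst-case decoded length of a single code, under the assumptions
// above: the last table entry is MAX_ENTRIES - 1 = 4095, and entry k
// decodes to at most k - 256 bytes.
fn max_decoded_len(max_entries: usize) -> usize {
    let last_entry = max_entries - 1; // 4095
    last_entry - 256                  // 4095 - 256 = 3839
}
```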
This can be documented on with_yield_on_full_buffer:
/// # Buffer size requirement
///
/// When this option is enabled, the output buffer passed to
/// `decode_bytes` should be at least 4096 bytes. LZW codes can
/// decode to up to 3839 bytes; if the output buffer is smaller
/// than a code's decoded value, the Classic decoder may return
/// a spurious `NoProgress`.
Plus a debug assertion in advance():
debug_assert!(
    !CgC::YIELD_ON_FULL || out.len() >= MAX_ENTRIES,
    "yield_on_full requires output buffer >= {} bytes, got {}",
    MAX_ENTRIES,
    out.len()
);
image-tiff always passes strip-sized buffers (typically 8 KiB–1 MiB), so this constraint is already met in practice.
Note on Streaming strategy
The TableStrategy::Streaming decoder (#69) handles this correctly for all buffer sizes via its pending buffer — no minimum buffer requirement, no data loss. This bug only affects Classic and Chunked.
Impact
- TIFF is unaffected (strips are always large buffers).
- GIF is unaffected in practice (decoders use reasonably sized buffers).
- Any caller using yield_on_full_buffer(true) with small or 1-byte output buffers will silently truncate output. This includes custom Read::read adapters that forward small caller-provided buffers without internal buffering.
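Such adapters can sidestep the bug by refilling through a code-sized internal buffer instead of handing the decoder the caller's slice. The sketch below is hypothetical and pure-std: the `step` closure stands in for `Decoder::decode_bytes` (returning consumed_in, produced_out, done) so it only demonstrates the buffering pattern, not weezl's API, and a production version would also loop when a refill produces zero bytes without finishing.

```rust
use std::io::{self, Read};

struct BufferedDecode<F> {
    step: F,          // stand-in for Decoder::decode_bytes
    input: Vec<u8>,   // remaining encoded input
    pos_in: usize,
    scratch: Vec<u8>, // decoded-but-unread bytes, refilled >= 4096 at a time
    pos_out: usize,
    done: bool,
}

impl<F: FnMut(&[u8], &mut [u8]) -> (usize, usize, bool)> BufferedDecode<F> {
    fn new(input: Vec<u8>, step: F) -> Self {
        Self { step, input, pos_in: 0, scratch: Vec::new(), pos_out: 0, done: false }
    }
}

impl<F: FnMut(&[u8], &mut [u8]) -> (usize, usize, bool)> Read for BufferedDecode<F> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.pos_out == self.scratch.len() && !self.done {
            // Refill with a code-sized buffer; the decoder never sees
            // the caller's (possibly tiny) slice directly.
            self.scratch.clear();
            self.scratch.resize(4096, 0);
            self.pos_out = 0;
            let (ci, co, done) =
                (self.step)(&self.input[self.pos_in..], &mut self.scratch);
            self.pos_in += ci;
            self.scratch.truncate(co);
            self.done = done;
        }
        // Serve the caller's small buffer from the scratch.
        let n = buf.len().min(self.scratch.len() - self.pos_out);
        buf[..n].copy_from_slice(&self.scratch[self.pos_out..self.pos_out + n]);
        self.pos_out += n;
        Ok(n)
    }
}
```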