fix: segment GRO/GSO-coalesced packets in PCAP receive path#1780

Merged
midwan merged 1 commit into BlitterStudio:master from tbdye:fix/gro-tcp-segmentation on Feb 12, 2026

Conversation


@tbdye tbdye commented Feb 12, 2026

Summary

PCAP bridged networking delivers ~3-7 KB/s throughput on hosts with GRO, GSO, or virtio NICs — roughly 100x slower than real A2065 hardware. This PR fixes the root cause and restores expected throughput.

The Problem

Linux GRO (Generic Receive Offload), GSO, and virtio mergeable receive buffers coalesce multiple TCP segments into single oversized packets before delivering them to pcap sockets. A burst of 10 standard 1460-byte TCP segments arrives as a single 14,600-byte packet.

The PCAP backend had a hard-coded 1600-byte receive limit (MAX_PSIZE):

#define MAX_PSIZE 1600

static void uaenet_queue(struct uaenet_data *ud, const uae_u8 *data, int len)
{
    if (!ud || len <= 0 || len > MAX_PSIZE)
        return;  // silently dropped
    // ...
}

Every coalesced packet was silently dropped. The only packets that survived were single-segment retransmissions (1460 bytes) triggered after the TCP sender's retransmission timeout expired — typically 300ms on LAN, 500ms+ for remote servers. This forced TCP into a degenerate one-segment-per-RTO-cycle mode:

1460 bytes / 300ms = ~4.9 KB/s (LAN)
1460 bytes / 500ms = ~2.9 KB/s (remote)

Disabling GRO/GSO/TSO via ethtool does not help on virtualized hosts (Proxmox/KVM, Hyper-V, etc.) because the hypervisor coalesces packets in the virtual NIC before the guest kernel sees them.

The Fix

Replace the hard drop with TCP segmentation. When the PCAP backend receives a packet larger than a standard Ethernet frame (1514 bytes):

  1. Fast path: Packets <= 1514 bytes pass through directly (zero overhead for normal traffic)
  2. Oversized IPv4/TCP: Parse Ethernet, IP, and TCP headers, then segment the payload into MSS-sized chunks. For each segment, build a complete Ethernet frame with updated IP header (length, identification, checksum), TCP header (sequence number, flags, checksum), and payload
  3. Non-IPv4/non-TCP oversized: Log and drop (GRO only coalesces TCP in practice)
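
The chunking in step 2 can be sketched as follows. This is a simplified illustration with hypothetical names, reduced to tracking only the per-segment fields that change; the actual patch parses real headers and emits complete Ethernet frames:

```c
#include <stdint.h>

#define MSS 1460  /* standard Ethernet TCP payload size */

/* Hypothetical struct: the per-segment fields that change when a
 * coalesced packet is split (the real code writes these into headers). */
typedef struct {
    uint32_t seq;         /* TCP sequence number, advanced by payload offset */
    uint16_t ip_id;       /* IP identification, incremented per segment */
    uint16_t payload_len; /* bytes of TCP payload in this segment */
    int      fin_psh;     /* FIN/PSH kept only on the final segment */
} seg_fields;

/* Split total_len bytes of coalesced TCP payload into MSS-sized
 * segments; returns the number of segments written to out. */
static int split_segments(uint32_t base_seq, uint16_t base_id,
                          int total_len, int had_fin_psh,
                          seg_fields *out, int max_segs)
{
    int n = 0;
    for (int off = 0; off < total_len && n < max_segs; n++) {
        int chunk = total_len - off;
        if (chunk > MSS)
            chunk = MSS;
        out[n].seq = base_seq + (uint32_t)off;
        out[n].ip_id = (uint16_t)(base_id + (uint16_t)n);
        out[n].payload_len = (uint16_t)chunk;
        out[n].fin_psh = (off + chunk == total_len) ? had_fin_psh : 0;
        off += chunk;
    }
    return n;
}
```

For the 14,600-byte example above, this yields ten 1460-byte segments, with FIN/PSH surviving only on the tenth.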

Per-segment details:

  • IP identification: incremented per segment
  • TCP sequence number: advanced by payload offset
  • TCP flags: FIN and PSH preserved only on the final segment
  • Checksums: both IP header and TCP (with pseudo-header) fully recomputed
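
The checksum recomputation is the standard one's-complement Internet checksum (RFC 1071); for TCP it covers a pseudo-header (source/destination IP, protocol 6, TCP length) followed by the TCP header and payload with the checksum field zeroed. A minimal sketch, with hypothetical function names:

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum over len bytes (RFC 1071), folded and inverted.
 * 'sum' seeds the accumulator so a pseudo-header can be pre-added. */
static uint16_t inet_csum(const uint8_t *data, size_t len, uint32_t sum)
{
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len)                  /* odd trailing byte, zero-padded */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16)         /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* TCP checksum: seed with the IPv4 pseudo-header, then sum the
 * segment (TCP header + payload, checksum field zeroed). */
static uint16_t tcp_csum(uint32_t src_ip, uint32_t dst_ip,
                         const uint8_t *tcp, size_t tcp_len)
{
    uint32_t sum = 0;
    sum += (src_ip >> 16) + (src_ip & 0xffff);
    sum += (dst_ip >> 16) + (dst_ip & 0xffff);
    sum += 6;                 /* IPPROTO_TCP */
    sum += (uint32_t)tcp_len;
    return inet_csum(tcp, tcp_len, sum);
}
```

A handy property for verifying rebuilt frames: summing a segment that already contains its correct checksum yields 0.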

The receive queue depth is also raised from 10 to 50. A single GRO-coalesced packet segments into 10+ frames, so the old limit would drop most segments from a single burst.

Testing

Tested on Debian 13 (kernel 6.12) running as a Proxmox VM with a virtio NIC. Emulated A4000/040 with A2065 NIC, Roadshow TCP/IP stack. GRO/GSO/TSO left enabled at default settings.

LAN FTP (200KB file):

  • Before: 204,800 bytes in 31 seconds (~6.6 KB/s)
  • After: 204,800 bytes in 0.226 seconds (~885 KB/s) — 134x improvement

Remote FTP (Aminet, transatlantic):

  • Before: 202,765 bytes in 63.4 seconds (~3.2 KB/s)
  • After: 202,765 bytes in 3.37 seconds (~60 KB/s) — 19x improvement

The LAN result (885 KB/s) exceeds real A4000/040 + A2065 benchmarks (400-600 KB/s), which is expected since the emulated 68040 runs at an effective clock speed higher than 25 MHz. The remote result (60 KB/s) is RTT-bound, consistent with Roadshow's 33KB TCP window over a ~170ms path.

Generated with Claude Code

Hosts with GRO, GSO, or virtio NICs deliver TCP segments coalesced
into packets far larger than the Ethernet MTU (up to 64KB). The PCAP
backend silently dropped these because of a 1600-byte hard limit,
forcing TCP into RTO-driven single-segment retransmission and reducing
throughput to ~3-7 KB/s — roughly 100x slower than real hardware.

Add TCP segmentation that splits oversized IPv4/TCP packets back into
MSS-sized Ethernet frames with correct IP/TCP headers and checksums
before queuing them for the A2065 emulation. Raise the receive queue
depth from 10 to 50 to accommodate segmentation bursts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@tbdye tbdye requested a review from midwan as a code owner February 12, 2026 05:36
@midwan midwan merged commit 616be65 into BlitterStudio:master Feb 12, 2026
18 of 22 checks passed
@tbdye tbdye deleted the fix/gro-tcp-segmentation branch February 13, 2026 22:29