
Can't read 100 MB from R2: FUSE, streaming from Worker, signed URLs, and S3 API all fail after 10-15 MB #137

@Victornovikov

Description


I checked several different ways of reading from R2:

| Method | Initial Speed | Failure Point | Error Type |
| --- | --- | --- | --- |
| S3 SDK | ~0.1 MB/s | Never completes | Exit code -1 |
| Presigned URL | 13.46 MB/s | ~14 MB | Socket closed |
| Worker Stream | 22-24 MB/s | ~10 MB | ECONNRESET |
| FUSE (tigrisfs) | N/A | ~5-10 MB | Context deadline exceeded |

Container network egress to R2 seems to be unreliable.
All four methods, including the officially documented FUSE approach, fail with the same pattern: initial success followed by connection death after ~10-15 MB.

Not sure what I am doing wrong.

## Workaround Attempts

We also tried:

- Increasing timeouts (2 min → 5 min): still fails
- Adding heartbeats: connection still dies
- Using streaming with progress logging: confirmed that data stops flowing (see the sketch below)
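
The progress logging was essentially a wrapper around the download loop that logs cumulative bytes and throughput; a minimal sketch (the helper name, 5 MB interval, and exact log format are illustrative, not the exact code we ran):

```js
// Sketch of the progress logging used to confirm the stall.
// A stall shows up as the progress log simply stopping.
async function readWithProgress(body, label) {
  const reader = body.getReader();
  const chunks = [];
  let received = 0;
  let lastLogged = 0;
  const started = Date.now();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    received += value.length;

    // Log roughly every 5 MB.
    if (received - lastLogged >= 5 * 1024 * 1024) {
      const seconds = (Date.now() - started) / 1000;
      console.log(`[${label}] progress`, {
        bytes: received,
        speed_mbps: (received / 1024 / 1024 / seconds).toFixed(2),
      });
      lastLogged = received;
    }
  }
  return Buffer.concat(chunks);
}
```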

More details:
## Environment

- Container Instance Type: standard-4
- enableInternet: true
- File Size Being Transferred: ~30-100 MB (audio file for transcription)
- Use Case: Audio transcription pipeline using FFmpeg + Whisper API

## Failure Pattern (Consistent Across All Methods)

1. Transfer starts successfully at good speed (13-24 MB/s)
2. After ~10-15 MB transferred, the connection stalls
3. After 2-3 minutes of stalling, the connection dies (see the watchdog sketch below for catching this sooner)
4. The error indicates a timeout or connection reset
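
Since the stall sits silently for 2-3 minutes before the socket finally errors, an idle watchdog (abort when no bytes arrive within a fixed window) surfaces the failure much sooner than the overall 5-minute timeout used in Method 2. A rough sketch; the 30-second window and helper name are illustrative:

```js
// Sketch of an idle watchdog: abort the fetch if no bytes arrive for 30 s,
// instead of waiting for the overall timeout or the eventual socket error.
// The abort surfaces as an AbortError from reader.read().
async function fetchWithIdleWatchdog(url, idleMs = 30_000) {
  const controller = new AbortController();
  let idleTimer = setTimeout(() => controller.abort(), idleMs);

  const response = await fetch(url, { signal: controller.signal });
  const reader = response.body.getReader();
  const chunks = [];

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    // Any received chunk resets the idle window.
    clearTimeout(idleTimer);
    idleTimer = setTimeout(() => controller.abort(), idleMs);
  }
  clearTimeout(idleTimer);
  return Buffer.concat(chunks);
}
```

This only changes how quickly the stall is reported, not whether the transfer completes.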

## Method 1: Container fetches from R2 using AWS S3 SDK

Approach: Container uses @aws-sdk/client-s3 to fetch directly from R2.

### Container Code (clients.js)

```js
const { S3Client } = require('@aws-sdk/client-s3');

const s3 = new S3Client({
  endpoint: process.env.R2_ENDPOINT,
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  },
  region: 'auto',
});
// ...
```

### Container Code (transcribe.js)
```js
const { GetObjectCommand } = require('@aws-sdk/client-s3');

const { Body } = await s3.send(new GetObjectCommand({
  Bucket: BUCKET,
  Key: r2_key,
}));

const chunks = [];
let downloadedBytes = 0;
for await (const chunk of Body) {
  chunks.push(chunk);
  downloadedBytes += chunk.length;
}
const audioBuffer = Buffer.concat(chunks);
// ...
```

### Result

- Speed: ~0.1 MB/s (extremely slow)
- Error: Container exit code -1 (platform kill)
- Log: download stalls, container terminated by the platform
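
For completeness, the SDK path has timeout knobs of its own on the request handler, analogous to the fetch timeout bump described under Workaround Attempts. A sketch of raising them (assuming @smithy/node-http-handler is available; the specific values are illustrative, not a configuration from this project):

```js
const { S3Client } = require('@aws-sdk/client-s3');
const { NodeHttpHandler } = require('@smithy/node-http-handler');

// Sketch: an S3Client with longer connection/request timeouts and more retries.
const s3 = new S3Client({
  endpoint: process.env.R2_ENDPOINT,
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  },
  region: 'auto',
  maxAttempts: 5, // retry transient failures a few more times
  requestHandler: new NodeHttpHandler({
    connectionTimeout: 10_000, // 10 s to establish the connection
    requestTimeout: 300_000,   // 5 min to receive the response
  }),
});
```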

## Method 2: Container fetches via Presigned URL

Approach: Worker generates presigned URL, container uses fetch() to download.

### Worker Code (audio.ts)

```ts
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const s3Client = new S3Client({
  endpoint: `https://${env.CLOUDFLARE_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: env.R2_ACCESS_KEY_ID,
    secretAccessKey: env.R2_SECRET_ACCESS_KEY,
  },
  region: 'auto',
});

const presignedUrl = await getSignedUrl(
  s3Client,
  new GetObjectCommand({
    Bucket: 'llm-mindmap-audio',
    Key: r2_key,
  }),
  { expiresIn: 3600 }
);

// Pass to container
const response = await container.fetch('http://container/transcribe', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ r2_key, presigned_url: presignedUrl }),
});
// ...
```

### Container Code (transcribe.js)
```js
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 300000); // 5 min

const response = await fetch(presigned_url, { signal: controller.signal });
const reader = response.body.getReader();
const chunks = [];
let downloadedBytes = 0;

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  chunks.push(value);
  downloadedBytes += value.length;
}
clearTimeout(timeout); // cancel the 5-minute abort once the download finishes
// ...
```

### Result

- Initial Speed: 13.46 MB/s (good!)
- Failure Point: ~14 MB transferred
- Error: `SocketError: other side closed`
- Log:

```
bytesRead: 14876987
remoteAddress: '172.64.66.1'
code: 'UND_ERR_SOCKET'
[cause]: SocketError: other side closed
...
```

---

## Method 3: Worker streams R2 body to Container

Approach: Worker downloads from R2 using internal binding (fast), then streams the body to container.

### Worker Code (audio.ts)

```ts
// Worker downloads from R2 using the internal binding (fast, no egress)
const audioObject = await env.AUDIO.get(r2_key);
if (!audioObject) {
  throw new Error(`Audio file not found in R2: ${r2_key}`);
}

// Stream the R2 body directly to the container
const transcribeResponse = await container.fetch('http://container/transcribe', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/octet-stream',
    'X-R2-Key': r2_key,
    'Content-Length': audioObject.size.toString(),
  },
  body: audioObject.body, // ReadableStream
});
// ...
```

### Container Code (transcribe.js)
```js
if (req.headers['content-type'] === 'application/octet-stream') {
  const expectedSize = parseInt(req.headers['content-length'] || '0', 10);
  const chunks = [];
  let receivedBytes = 0;

  await new Promise((resolve, reject) => {
    req.on('data', (chunk) => {
      chunks.push(chunk);
      receivedBytes += chunk.length;
    });
    req.on('end', resolve);
    req.on('error', reject);
  });

  audioBuffer = Buffer.concat(chunks);
}
// ...
```

### Result

- Initial Speed: 22-24 MB/s (excellent!)
- Failure Point: ~10 MB transferred
- Error: `ECONNRESET: aborted`
- Log:

```
[transcribe] worker_stream progress { bytes: 10485760, speed_mbps: '23.92' }
... 3 minutes later ...
[transcribe] error at stage 'download_progress': Error: aborted
code: 'ECONNRESET'
...
```

---

## Method 4: FUSE Mount with tigrisfs

Approach: Container mounts R2 as filesystem using tigrisfs, reads file directly.

### Container Startup Script

```bash
# tigrisfs reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment
tigrisfs --endpoint "${R2_ENDPOINT}" -f "${R2_BUCKET_NAME}" /mnt/r2 &

# Wait for the mount to come up
for i in $(seq 1 30); do
  if mountpoint -q /mnt/r2 2>/dev/null; then
    echo "R2 mounted successfully"
    break
  fi
  sleep 1
done

exec node server.js
# ...
```

### Container Code (transcribe.js)
```js
const R2_MOUNT = '/mnt/r2';
const inputPath = path.join(R2_MOUNT, r2_key);

if (!fs.existsSync(inputPath)) {
  return res.status(404).json({ error: `File not found: ${r2_key}` });
}

const stats = fs.statSync(inputPath); // This works!
console.log(`Size: ${stats.size} bytes`);

// FFmpeg reads from the FUSE mount
const probe = spawnSync('ffprobe', [..., inputPath]); // Times out
// ...
```

### Result

- Mount: Successful
- File Metadata: Readable (size: 30,926,305 bytes)
- Failure Point: Reading file content at ~5-10 MB
- Error: `context deadline exceeded (Client.Timeout or context cancellation while reading body)`
- Log:

```
[transcribe] Starting: .../file.mp3, size: 30926305 bytes
Error reading 5242880 +5242880 of .../file.mp3 (attempt 1):
context deadline exceeded (Client.Timeout or context cancellation while reading body)
Error reading 10485760 +5242880 of .../file.mp3 (attempt 1):
context deadline exceeded
```
