Add file content streaming support for OpenAI and related utilities#25450
Conversation
- Introduced `afile_content_streaming` and `file_content_streaming` functions in `litellm/files/main.py` to handle asynchronous and synchronous file content streaming.
- Added `FileContentStreamingResponse` class in `litellm/files/streaming.py` to manage streaming responses with logging capabilities.
- Updated OpenAI API integration in `litellm/llms/openai/openai.py` to support new streaming methods.
- Enhanced file content retrieval in `litellm/proxy/openai_files_endpoints/files_endpoints.py` to route requests for streaming.
- Added unit tests for the new streaming functionality in `tests/test_litellm/llms/openai/test_openai_file_content_streaming.py` and `tests/test_litellm/proxy/openai_files_endpoint/test_files_endpoint.py`.
- Refactored type hints and imports for better clarity and organization across modified files.
Greptile Summary

This PR adds file content streaming support to reduce peak memory usage (from ~3.9 GiB to ~2.6 GiB at 1,000 concurrent requests with a 65 MB payload) by avoiding full in-memory buffering. Key additions include the `file_content_streaming` helpers, the `FileContentStreamingResponse` wrapper, and proxy-side routing support. The previously-flagged P1 issues from earlier review rounds (wrong provider forwarded on routing, dead code) are resolved; the remaining findings are style-level only, in `litellm/proxy/openai_files_endpoints/file_content_streaming_handler.py`: inline imports and dict-spread ordering.

Confidence Score: 5/5 — safe to merge; all prior blocking concerns are resolved, and remaining findings are style-level only.
| Filename | Overview |
|---|---|
| `litellm/proxy/openai_files_endpoints/file_content_streaming_handler.py` | New handler class for streaming routing; two inline imports inside static methods violate the CLAUDE.md style guide, and the `**data` spread ordering in the `afile_content` call could silently override `stream=True`. |
| `litellm/files/streaming.py` | New `FileContentStreamingResponse` wrapper correctly handles sync/async iteration, `aclose` under cancellation via `anyio.CancelScope`, and SDK-level success/failure logging. |
| `litellm/files/main.py` | Adds `file_content_streaming` helper; `client` and `litellm_params_dict` are correctly forwarded. Streaming is gated on `_should_sdk_support_streaming`, which matches `OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS`. |
| `litellm/llms/openai/openai.py` | Adds `afile_content_streaming` and `file_content_streaming` methods; the context-manager pattern correctly propagates exceptions to `__aexit__`/`__exit__`. |
| `litellm/proxy/openai_files_endpoints/files_endpoints.py` | Routing logic correctly resolves the provider before the streaming gate check, so non-OpenAI routed providers fall through to the buffered path. |
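The `aclose`-under-cancellation behavior noted for `litellm/files/streaming.py` can be sketched with a small stand-in. The class and method names below are hypothetical, not the actual LiteLLM implementation; the review mentions `anyio.CancelScope`, while this sketch uses stdlib `asyncio.shield` to get the same "cleanup cannot be cancelled" effect:

```python
import asyncio


class ClosingByteStream:
    """Sketch (hypothetical name): wrap an async byte iterator so the
    underlying stream's aclose() runs even if the consumer cancels."""

    def __init__(self, iterator):
        self._iterator = iterator

    def __aiter__(self):
        return self

    async def __anext__(self):
        return await self._iterator.__anext__()

    async def aclose(self):
        close = getattr(self._iterator, "aclose", None)
        if close is not None:
            # shield so an outer cancellation cannot skip cleanup
            await asyncio.shield(close())


async def demo():
    closed = []

    async def chunks():
        try:
            yield b"a" * 4
            yield b"b" * 4
        finally:
            closed.append(True)  # records that the source was released

    stream = ClosingByteStream(chunks())
    data = b""
    async for chunk in stream:
        data += chunk
    await stream.aclose()
    return data, closed


print(asyncio.run(demo()))  # (b'aaaabbbb', [True])
```

The point of the wrapper is that cleanup is unconditional: whether the client finishes the download or disconnects, the underlying response is released.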
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Proxy as files_endpoints.py
    participant Handler as FileContentStreamingHandler
    participant LiteLLM as litellm.afile_content
    participant OpenAI as OpenAIFilesAPI
    participant OAISDK as OpenAI SDK (streaming)
    Client->>Proxy: GET /v1/files/{file_id}/content
    Proxy->>Proxy: resolve custom_llm_provider
    Proxy->>Handler: resolve_streaming_request_params()
    Handler-->>Proxy: resolved_provider, file_id, data
    Proxy->>Handler: should_stream_file_content(resolved_provider)
    alt provider in OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS
        Handler-->>Proxy: True
        Proxy->>Handler: get_streaming_file_content_response()
        Handler->>LiteLLM: afile_content(stream=True, ...)
        LiteLLM->>OpenAI: file_content_streaming(_is_async=True)
        OpenAI->>OAISDK: files.with_streaming_response.content()
        OAISDK-->>OpenAI: streaming context manager
        OpenAI-->>LiteLLM: FileContentStreamingResult(AsyncIterator)
        LiteLLM-->>Handler: FileContentStreamingResult (wrapped in FileContentStreamingResponse)
        Handler-->>Proxy: StreamingResponse
        Proxy-->>Client: HTTP 200 chunked/octet-stream
        loop Each chunk
            OAISDK-->>Client: bytes chunk
        end
        Handler->>Handler: _log_success_async() on StopAsyncIteration
        Handler->>Handler: proxy_logging_obj.update_request_status(success)
    else provider NOT in supported set
        Handler-->>Proxy: False
        Proxy->>LiteLLM: afile_content(stream=False, ...)
        LiteLLM-->>Proxy: HttpxBinaryResponseContent (buffered)
        Proxy-->>Client: HTTP 200 full body
    end
```
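The alt/else branch in the diagram reduces to a simple gate. A minimal sketch (the function name is illustrative; the provider set mirrors the `OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS = {"openai", "hosted_vllm"}` value quoted elsewhere in this review):

```python
# Illustrative sketch of the routing gate in the diagram above.
OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS = {"openai", "hosted_vllm"}


def choose_file_content_path(provider: str, is_base64_unified_file_id: bool) -> str:
    """Return which delivery path the proxy would take for a request."""
    if provider in OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS and not is_base64_unified_file_id:
        return "streaming"
    return "buffered"


print(choose_file_content_path("openai", False))  # streaming
print(choose_file_content_path("azure", False))   # buffered
print(choose_file_content_path("openai", True))   # buffered
```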
Reviews (12): Last reviewed commit: "Refactor file content streaming handling..."
```python
def _should_stream_file_content(
    *,
    custom_llm_provider: str,
    is_base64_unified_file_id: Any,
) -> bool:
    return (
        custom_llm_provider == "openai"
        and bool(is_base64_unified_file_id) is False
    )
```
Unconditional opt-out change breaks existing OpenAI users
Every proxy request where custom_llm_provider == "openai" is now silently rerouted to the streaming path — there is no feature flag or user-controlled opt-in. Callers that expected a buffered HttpxBinaryResponseContent (with Content-Length, synchronous .content, etc.) will receive a StreamingResponse after this change. This is a backwards-incompatible behavioral change for all current OpenAI file-content users.
Per the project style guide, new behavior that changes existing responses should be gated behind a flag (e.g., litellm.use_streaming_file_content = False by default), so existing users are not broken.
Suggested change:

```python
def _should_stream_file_content(
    *,
    custom_llm_provider: str,
    is_base64_unified_file_id: Any,
) -> bool:
    import litellm as _litellm

    return (
        custom_llm_provider == "openai"
        and bool(is_base64_unified_file_id) is False
        and getattr(_litellm, "use_streaming_file_content", False) is True
    )
```
Rule Used: What: avoid backwards-incompatible changes without... (source)
I don't think this is accurately characterized as a breaking change for existing OpenAI users.

For the externally supported `/v1/files/{file_id}/content` contract, the observable behavior remains the same for the validated client path:

- HTTP 200
- `application/octet-stream`
- `content-length` preserved
- `content-disposition` preserved
- `x-request-id` present
- identical response bytes

We explicitly validated this against the OpenAI SDK `client.files.content(file_id=...)` flow and confirmed LiteLLM and OpenAI return the same effective response contract. In that path, callers still receive a binary file payload and can access `.content` exactly as before.

The `StreamingResponse` vs `HttpxBinaryResponseContent` distinction is an internal server-side implementation detail, not a client-visible API contract, as long as the wire response and SDK-observable behavior remain equivalent. This change is specifically intended to avoid buffering large files fully in memory while preserving the existing external behavior.

So I agree this changes the internal delivery mechanism, but I don't agree that it "breaks all current OpenAI file-content users." The current evidence shows compatibility is preserved for the primary user path we care about.

If there is concern about unvalidated edge cases outside the tested OpenAI SDK flow, I'm happy to narrow the claim and add more coverage, but based on the current contract-level validation this does not appear to require a feature flag for backward compatibility.
Fair point on the SDK validation. If `response.content` returns identical bytes and headers in the tested path, the client-visible contract is preserved.

However, streaming and buffered responses differ in failure modes:

- Mid-stream errors: With buffered responses, errors are caught before any bytes are sent (HTTP 500). With streaming, the client may receive HTTP 200 + partial bytes, then the stream aborts. Does the OpenAI SDK handle this gracefully?
- Client disconnects: If a client disconnects mid-download, the buffered path would log completion status based on the full response. The streaming path in `_stream_file_content_with_logging` only logs success after full iteration; if the client disconnects, does the iterator stop cleanly without logging success?
- Content-Length vs Transfer-Encoding: You're preserving `content-length` in the headers, but FastAPI `StreamingResponse` typically uses chunked encoding. Does the SDK rely on `content-length` for any validation?

Suggested validation to address these without requiring a feature flag:

- Test the error path: mock `afile_content_streaming` to raise an exception after yielding one chunk. Verify the SDK sees a stream error, not HTTP 500.
- Test client disconnect: verify `_stream_file_content_with_logging` closes cleanly and doesn't log success if the iteration is incomplete.
- Verify chunked encoding doesn't break SDK clients that expect `content-length`.

If these scenarios pass, I'd agree this preserves compatibility. The load test results are compelling, and if the failure modes are equivalent, the streaming approach is sound.
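The client-disconnect scenario above can be checked with a plain asyncio test along these lines. All names here are hypothetical stand-ins, not the actual LiteLLM helpers; the sketch only demonstrates the property under test: success is recorded only on clean exhaustion, and the source is always closed.

```python
import asyncio


async def stream_with_logging(chunks, log):
    """Sketch (hypothetical helper): record success only when the
    iterator is fully consumed; always close the source stream."""
    async def gen():
        try:
            async for chunk in chunks:
                yield chunk
            log.append("success")  # reached only on clean exhaustion
        finally:
            aclose = getattr(chunks, "aclose", None)
            if aclose is not None:
                await aclose()
    return gen()


async def check_disconnect():
    log = []

    async def chunks():
        yield b"x"
        yield b"y"

    stream = await stream_with_logging(chunks(), log)
    await stream.__anext__()  # client reads one chunk...
    await stream.aclose()     # ...then disconnects
    return log


async def check_complete():
    log = []

    async def chunks():
        yield b"x"

    stream = await stream_with_logging(chunks(), log)
    async for _ in stream:
        pass
    return log


print(asyncio.run(check_disconnect()))  # [] -> success never logged
print(asyncio.run(check_complete()))    # ['success']
```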
```python
def _should_stream_file_content(
    *,
    custom_llm_provider: str,
    is_base64_unified_file_id: Any,
) -> bool:
    return (
        custom_llm_provider == "openai"
        and bool(is_base64_unified_file_id) is False
    )
```
Provider-specific logic belongs in
litellm/llms/
_should_stream_file_content hardcodes custom_llm_provider == "openai" inside the proxy layer. The project rule is that provider-specific decisions live under litellm/llms/, not in proxy endpoints. Consider exposing a capability flag from the provider config (e.g., a supports_file_content_streaming property) and checking that here instead.
Rule Used: What: Avoid writing provider-specific code outside... (source)
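The capability-flag approach suggested here could look roughly like this. All class and property names below are hypothetical, not existing LiteLLM APIs; the sketch only shows how the proxy could test a capability instead of a provider name:

```python
class BaseProviderFilesConfig:
    """Hypothetical per-provider config base class."""

    @property
    def supports_file_content_streaming(self) -> bool:
        return False


class OpenAIFilesConfig(BaseProviderFilesConfig):
    """Hypothetical OpenAI config opting in to streaming."""

    @property
    def supports_file_content_streaming(self) -> bool:
        return True


def should_stream(config: BaseProviderFilesConfig, is_base64_unified_file_id: bool) -> bool:
    # the proxy layer checks a capability, not a hardcoded provider name
    return config.supports_file_content_streaming and not is_base64_unified_file_id


print(should_stream(OpenAIFilesConfig(), False))        # True
print(should_stream(BaseProviderFilesConfig(), False))  # False
```

Adding a new streaming-capable provider would then be a one-line config change rather than an edit to the proxy endpoint.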
I don't think this is the same kind of provider-specific logic the style rule is meant to prevent.

`_should_stream_file_content()` is not implementing provider behavior or request/response transformation logic. It is deciding proxy routing policy: whether this proxy endpoint should serve the file-content response via the buffered path or the streaming path. That decision belongs naturally in the proxy layer because it is about endpoint response strategy, not provider semantics.

The provider-specific implementation still lives in `litellm/llms/openai/`:

- the OpenAI file-content streaming call is implemented in `litellm/llms/openai/openai.py`
- the iterator/headers returned by the provider are built there
- the proxy is only deciding whether to invoke that streaming path for this endpoint

So the hardcoded `custom_llm_provider == "openai"` here is closer to "this proxy optimization/reroute is currently enabled only for OpenAI" than to "the proxy is implementing OpenAI protocol logic".

If we later expand this to multiple providers, a capability flag could make sense. But for a targeted incremental rollout, a small hardcoded reroute policy in the proxy is reasonable and keeps the scope explicit. I'd view this as endpoint-level orchestration, not misplaced provider logic.
@harish876 I guess an action item here is to now do this fix for all other providers we support for file content, right? Then we can remove this condition.
…e. This provides a 1:1 behaviour mapping similar to the non streaming behaviour.
- Removed unused imports and streamlined type hints in `litellm/utils.py` and `litellm/files/main.py`.
- Moved `FileContentStreamingResult` to a new `litellm/files/types.py` for better organization.
- Updated `FileContentStreamingResponse` in `litellm/files/streaming.py` to include asynchronous close methods and improved logging capabilities.
- Enhanced tests to ensure proper closure of streaming iterators in `tests/test_litellm/llms/openai/test_openai_file_content_streaming.py` and `tests/test_litellm/proxy/openai_files_endpoint/test_files_endpoint.py`.
```python
elif hasattr(stream_to_close, "close"):
    result = cast(Iterator[bytes], stream_to_close).close()  # type: ignore[attr-defined]
    if result is not None:
        await result
```

@greptile review again
OpenAI File Content Backward Compatibility Note

Summary

This document captures a direct compatibility check for the OpenAI file content path introduced in this PR. The goal of this check is to verify that, for an OpenAI SDK caller using `client.files.content(file_id=...)`, the externally observable behavior is unchanged. Specifically, the script verifies that LiteLLM and OpenAI both return:

- HTTP 200
- `content-type: application/octet-stream`
- a matching `content-length`
- a matching `content-disposition`
- an `x-request-id` header
- identical response bytes
This is a strong compatibility signal for the tested SDK flow because the consumer-visible payload and key headers match across both implementations.

Validation Script

File: `openai_file_client.py`

```python
import asyncio
import os

from dotenv import load_dotenv
from openai import AsyncOpenAI

load_dotenv()

litellm_client = AsyncOpenAI(
    api_key=os.getenv("LITELLM_API_KEY"),
    base_url="http://34.95.44.152:4000/v1",
)

openai_client = AsyncOpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url="https://api.openai.com/v1",
)

file_id = "file-2qexFzUBybCR2BWndU3twx"


async def fetch_file_content(client, label):
    content = await client.files.content(file_id=file_id)
    response = content.response
    print(f"[{label}] Status", response.status_code)
    return response


def assert_file_response(response, label):
    content_type = response.headers.get("content-type")
    content_length = response.headers.get("content-length")
    content_disposition = response.headers.get("content-disposition", "")
    request_id = response.headers.get("x-request-id")

    assert response.status_code == 200, f"{label}: expected status 200, got {response.status_code}"
    assert content_type == "application/octet-stream", (
        f"{label}: unexpected content-type {content_type}"
    )
    assert content_length is not None, f"{label}: missing content-length header"
    assert int(content_length) == len(response.content), (
        f"{label}: content-length header {content_length} != body length {len(response.content)}"
    )
    assert 'filename="dataset.jsonl"' in content_disposition, (
        f"{label}: unexpected content-disposition {content_disposition}"
    )
    assert request_id, f"{label}: missing x-request-id header"


async def main():
    litellm_response, openai_response = await asyncio.gather(
        fetch_file_content(litellm_client, "LiteLLM"),
        fetch_file_content(openai_client, "OpenAI"),
    )

    assert_file_response(litellm_response, "LiteLLM")
    assert_file_response(openai_response, "OpenAI")

    assert (
        litellm_response.headers.get("content-type") == openai_response.headers.get("content-type")
    ), "content-type mismatch between LiteLLM and OpenAI"
    assert (
        litellm_response.headers.get("content-length") == openai_response.headers.get("content-length")
    ), "content-length mismatch between LiteLLM and OpenAI"
    assert (
        litellm_response.headers.get("content-disposition") == openai_response.headers.get("content-disposition")
    ), "content-disposition mismatch between LiteLLM and OpenAI"
    assert litellm_response.content == openai_response.content, "response body mismatch"

    print("All assertions passed.")


if __name__ == "__main__":
    asyncio.run(main())
```

Command

```shell
python3 openai_file_client.py
```

Non-Mock Header Parity Check

This check was performed against real endpoints using the OpenAI Python SDK, not mocks. The purpose of this comparison is to show that the new LiteLLM streaming implementation preserves the response contract that an OpenAI SDK caller observes. The two responses were compared at the header and payload level.
Filtered headers from the new LiteLLM streaming path:

```json
{
  "content-type": "application/octet-stream",
  "content-length": "68156820",
  "content-disposition": "attachment; filename=\"dataset.jsonl\"",
  "x-request-id": "req_5f622a75ec644d70bbd5469d2c008abf",
  "openai-version": "2020-10-01",
  "openai-project": "proj_F0P5EBggl8kfWzGtPQWRPchP",
  "x-litellm-version": "1.83.4",
  "x-litellm-key-spend": "0.0"
}
```

Filtered headers from the OpenAI baseline response:

```json
{
  "content-type": "application/octet-stream",
  "content-length": "68156820",
  "content-disposition": "attachment; filename=\"dataset.jsonl\"",
  "x-request-id": "req_18ded58aec1f4ed9a69256d82e3586d2",
  "openai-version": "2020-10-01",
  "openai-project": "proj_F0P5EBggl8kfWzGtPQWRPchP"
}
```

Some headers are expected to differ across requests, such as `x-request-id` and the LiteLLM-specific `x-litellm-*` headers.

Why This Supports Backward Compatibility

For the tested OpenAI SDK path, the behavior is backward compatible in the ways that matter to the caller: the status code, payload bytes, and contract-relevant headers are identical.
In other words, from the perspective of a client consuming `/v1/files/{file_id}/content` through the OpenAI SDK, the streaming implementation is observationally equivalent to the buffered one.

Scope Of The Claim

This validation demonstrates backward compatibility for the tested `client.files.content(...)` flow. That is the key argument for this PR.

@greptile review again
ishaan-berri left a comment
Nit - minor change requested
```python
def _should_stream_file_content(
    *,
    custom_llm_provider: str,
    is_base64_unified_file_id: Any,
) -> bool:
    return (
        custom_llm_provider == "openai"
        and bool(is_base64_unified_file_id) is False
    )
```
@harish876 I guess an action item here is to now do this fix for all other providers we support for file content, right? Then we can remove this condition.
```python
@client
def file_content_streaming(
```
This feels like a lot of duplicate code. Why can't we just add a `stream=True/False` param on `def file_content`? That way you don't need this new function.
Counterpoint here. I think keeping `file_content_streaming()` separate is the cleaner choice because this is not just a `stream=True` transport toggle on the existing API. The streaming path returns a different shape, carries headers alongside an iterator, and has iterator-specific logging and cleanup behavior like `aclose()` on disconnect. Keeping it separate preserves the existing `file_content()` contract, makes the rollout to other providers incremental, and keeps the streaming-specific behavior isolated and easier to test. The original function code can be removed once we migrate all paths to a streaming one.
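The "different shape" point can be made concrete with typing: a single entry point with a `stream` flag needs a union return type, which is what `typing.overload` would express. None of the types below exist in LiteLLM; this is a hypothetical sketch of the tradeoff being discussed:

```python
from typing import Iterator, Literal, NamedTuple, Union, overload


class BufferedContent(NamedTuple):
    """Hypothetical buffered result: the whole body is in memory."""
    body: bytes


class StreamingContent(NamedTuple):
    """Hypothetical streaming result: headers plus a chunk iterator."""
    headers: dict
    chunks: Iterator[bytes]


@overload
def file_content(stream: Literal[False] = False) -> BufferedContent: ...
@overload
def file_content(stream: Literal[True]) -> StreamingContent: ...


def file_content(stream: bool = False) -> Union[BufferedContent, StreamingContent]:
    # toy implementation: the flag changes the return shape, so every
    # caller of the unified API must branch on which shape it received
    if stream:
        return StreamingContent(
            headers={"content-type": "application/octet-stream"},
            chunks=iter([b"chunk1", b"chunk2"]),
        )
    return BufferedContent(body=b"chunk1chunk2")


print(file_content().body)                          # b'chunk1chunk2'
print(b"".join(file_content(stream=True).chunks))   # b'chunk1chunk2'
```

Separate functions give each shape its own signature; a single flagged function pushes the union into every call site. Both designs are workable, which is the crux of the thread above.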
- Static methods for streaming handler function
- Removed the `afile_content_streaming` wrapper function; enabled with a `stream` boolean in `afile_content`
- Cleaned up test cases after refactor
… routing

- Updated `FileContentStreamingHandler` to utilize `custom_llm_provider` from credentials for routing.
- Added error handling for missing `custom_llm_provider` in credentials.
- Introduced new tests to validate streaming behavior with routed providers and non-OpenAI providers.
- Cleaned up imports and ensured proper type casting for improved clarity.
…provider routing

- Added validation to ensure credentials include a custom LLM provider before routing.
- Cleaned up type casting for better readability.
- Introduced a new test to verify behavior when a non-OpenAI provider is used, ensuring proper handling of streaming responses.
- Updated imports to include necessary modules for testing.
@greptile review again
- Changed the import path for `upload_file_to_storage_backend` in test files to reflect the new module structure.
- Ensured consistency in mocking for storage backend service tests.
```python
from litellm.files.types import FileContentStreamingResult

if TYPE_CHECKING:
    from litellm.proxy._types import UserAPIKeyAuth
    from litellm.proxy.openai_files_endpoints.storage_backend_service import (
        StorageBackendFileService,
    )
    from litellm.proxy.openai_files_endpoints.file_content_streaming_handler import (
        FileContentStreamingHandler,
    )
```
```python
def should_stream_file_content(
    *,
    custom_llm_provider: str,
    is_base64_unified_file_id: Any,
) -> bool:
    return (
        custom_llm_provider == "openai"
        and bool(is_base64_unified_file_id) is False
    )
```
Streaming gate passes even when model-routing resolves to a non-OpenAI provider
`should_stream_file_content` checks only the request-level `custom_llm_provider` (`"openai"` by default), but when `should_route=True` the effective provider comes from `credentials["custom_llm_provider"]`, which can be `"azure"`, `"vertex_ai"`, or `"bedrock"`. `file_content_streaming` only handles `OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS = {"openai", "hosted_vllm"}`, so the call in `get_streaming_file_content_response` raises `BadRequestError` for any model-routing target outside that set.

Concrete failure: a user creates a file via the proxy with a model that routes to Azure → the file ID becomes model-encoded → on retrieval, `should_route=True`, `credentials["custom_llm_provider"] = "azure"`, streaming is entered, `afile_content(custom_llm_provider="azure", stream=True)` → `BadRequestError`.
Simplest fix: pass the resolved effective provider into the gate check so streaming is only entered when the routed provider is actually supported:
```python
@staticmethod
def should_stream_file_content(
    *,
    custom_llm_provider: str,
    is_base64_unified_file_id: Any,
    effective_custom_llm_provider: Optional[str] = None,
) -> bool:
    from litellm.types.utils import OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS

    resolved = effective_custom_llm_provider or custom_llm_provider
    return (
        resolved in OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS
        and bool(is_base64_unified_file_id) is False
    )
```

```python
response = await litellm.afile_content(
```
I think your architecture here is wrong.

You should always call `litellm.afile_content()`. Do not add a new branch as you did above.

Then in `litellm.afile_content()` add a `stream=True/False` param. Based on the param, handle it accordingly.
I agree. The caveat here is that `FileContentStreamingHandler.get_streaming_file_content_response` calls `acontent_file` with `stream` as `True`. The same async wrapper function is used. This static method is just in order to facilitate streaming logic within a single function, similar to `file_content`.
My idea here is to replace the async wrapper, which was redundant to afile_content with a stream boolean
will revisit this again
Only one action item, make sure that this is done @harish876
```python
check_file_id_encoding=True,
)

from litellm.proxy.openai_files_endpoints.file_content_streaming_handler import (
```
We should not have this section here. Users expect the file to go through, which your code skips today.

```python
if should_route:
    # Use model-based routing with credentials from config
    prepare_data_with_credentials(
```
- Introduced a new method in `FileContentStreamingHandler` to resolve streaming request parameters, enhancing the routing logic based on credentials.
- Updated the `should_stream_file_content` method to check against supported providers.
- Cleaned up type hints and imports across multiple files for better organization and clarity.
- Added comprehensive tests to validate the new routing behavior and ensure original data integrity during streaming requests.
```python
):
    verbose_proxy_logger.debug(
        "Using streaming file content helper for custom_llm_provider=%s, original_file_id=%s, file_id=%s, model_used=%s",
        resolved_custom_llm_provider,
        original_file_id,
        resolved_file_id,
        model_used,
    )
```
```python
from litellm.litellm_core_utils.litellm_logging import (
    Logging as LiteLLMLoggingObj,
)
from litellm.types.utils import StandardLoggingHiddenParams, StandardLoggingPayload

import litellm
from litellm.files.types import FileContentProvider, FileContentStreamingResult
from litellm.types.utils import OPENAI_COMPATIBLE_BATCH_AND_FILES_PROVIDERS

if TYPE_CHECKING:
    from litellm.proxy._types import UserAPIKeyAuth
    from litellm.proxy.utils import ProxyLogging

from litellm.proxy.openai_files_endpoints.common_utils import (
    prepare_data_with_credentials,
)
from litellm.proxy.common_request_processing import (
    ProxyBaseLLMRequestProcessing,
)
```
This is outdated. The error was resolved
```python
]
FileDeleteProvider = Literal["openai", "azure", "gemini", "manus", "anthropic"]
FileListProvider = Literal["openai", "azure", "manus", "anthropic"]
FileContentProvider = Literal[
```
why are we deleting this ?
This has been moved to `litellm/files/types.py`. This is to prevent cyclic imports, as the helper class needs to use this type as well.
Merged commit `c70a3c7` into `BerriAI:litellm_harish_april11`
- Introduced `afile_content_streaming` and `file_content_streaming` functions in `litellm/files/main.py` to handle asynchronous and synchronous file content streaming.
- Added `FileContentStreamingResponse` class in `litellm/files/streaming.py` to manage streaming responses with logging capabilities.
- Updated OpenAI API integration in `litellm/llms/openai/openai.py` to support new streaming methods.
- Enhanced file content retrieval in `litellm/proxy/openai_files_endpoints/files_endpoints.py` to route requests for streaming.
- Added unit tests in `tests/test_litellm/llms/openai/test_openai_file_content_streaming.py` and `tests/test_litellm/proxy/openai_files_endpoint/test_files_endpoint.py`.

Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- [ ] I have added testing in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement - see details)
- [ ] My PR passes all unit tests on `make test-unit`
- [ ] I have asked for a review from `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review
If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Summary
This PR enables streaming responses for OpenAI file content retrieval.
Issue
The v1 files content path buffers the full payload in memory before returning it. Under load, that causes elevated RSS and can contribute to OOM behavior when many large file requests run concurrently.
What was implemented
Load Test Memory Results (1000 concurrent requests, 65 MB payload)

Peak memory dropped from ~3.9 GiB to ~2.6 GiB.
Next Steps
The current file content retrieval path goes through `afile_content`. Making the response object an async iterable still needs to be evaluated. The current solution is a prototype, which provides memory savings when a streaming-based approach is used.
Why this helps
Streaming avoids holding the full file payload in memory per request, which materially lowers peak RSS under concurrent load and reduces the risk of OOM events.
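This can be illustrated with a toy chunked copy: the working set is bounded by the chunk size rather than the payload size. The sizes below are illustrative, not the load-test setup:

```python
import io


def stream_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst chunk by chunk, never holding the full payload."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


payload = io.BytesIO(b"x" * (5 * 1024 * 1024))  # stand-in for a 5 MB file body
out = io.BytesIO()
copied = stream_copy(payload, out, chunk_size=64 * 1024)
print(copied)  # 5242880
```

A buffered path would instead materialize the whole body per request, which is why peak RSS scales with payload size times concurrency.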