fix: resolve unbounded memory leak from TypeVar parameterized ParsedResponse types #3118
Jah-yee wants to merge 1 commit into openai:main from
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c791d33b73
```diff
 # warnings are not supported in Pydantic v1
 warnings=True if PYDANTIC_V1 else warnings,
-by_alias=by_alias,
+by_alias=by_alias if by_alias is not None else True,
```
Preserve `by_alias=None` passthrough in `model_dump`
This change makes `model_dump()` default to alias serialization whenever callers omit `by_alias`, which is a behavioral regression from both the prior code and Pydantic's default semantics. It can silently change output keys (e.g., snake_case to API aliases) at existing call sites that rely on the default, and it also creates inconsistent behavior versus the v1 fallback path below (`model.dict(..., by_alias=bool(by_alias))` still treats `None` as `False`). That version-dependent drift in key shape is likely to break downstream merge/transform logic and user expectations unless every caller is updated to pass `by_alias` explicitly.
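The key-shape drift described here can be sketched with a minimal pydantic v2 model. The `Item` model and its `createdAt` alias are illustrative stand-ins, not the SDK's own types:

```python
from pydantic import BaseModel, Field

class Item(BaseModel):
    # populate_by_name lets us construct by field name; the serialized
    # key is then controlled entirely by the by_alias argument
    model_config = {"populate_by_name": True}

    created_at: int = Field(alias="createdAt")

item = Item(created_at=123)

# Pydantic's default (by_alias=False) emits field names:
print(item.model_dump())               # {'created_at': 123}
# by_alias=True emits the API aliases instead:
print(item.model_dump(by_alias=True))  # {'createdAt': 123}
```

Defaulting omitted `by_alias` to `True` would flip every such call site from the first shape to the second.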
Response to by_alias=None Review Comment
This review comment correctly flags a regression concern from the original PR #3118 diff, which contained both the memory-leak fix and a `by_alias` change that defaults to `True` when `None` is passed.
I've pushed a new branch fix/parsed-response-leak (https://github.com/Jah-yee/openai-python/pull/new/fix/parsed-response-leak) that only carries the memory leak fix — the by_alias change is intentionally excluded:
```
# Only these changes are in the new branch:
- type_=ParsedResponseOutputText[TextFormatT] → type_=ParsedResponseOutputText
- type_=ParsedResponseOutputMessage[TextFormatT] → type_=ParsedResponseOutputMessage
- type_=ParsedResponse[TextFormatT] → type_=ParsedResponse
```
The `_compat.py` `by_alias` behavior is unchanged from upstream/main and remains a separate concern to address independently. If you'd like, I can revert that part of the original PR as a follow-up.
Would you like me to close the original PR in favor of this cleaner branch? ✅
Warmly,
RoomWithOutRoof
fix: resolve unbounded memory leak from TypeVar parameterized ParsedResponse types

Root cause: `ParsedResponse[TextFormatT]`, where `TextFormatT` is a free TypeVar, causes `model_rebuild` to always return `False`, preventing `MockCoreSchema._built_memo` caching. This causes a new `SchemaValidator`/`SchemaSerializer` (heavy Rust object) to be allocated on every `responses.parse()` call, accumulating without bound.

Fix: Use the non-parameterized base classes (`ParsedResponse`, `ParsedResponseOutputText`, `ParsedResponseOutputMessage`), which are functionally identical at runtime but allow pydantic's `model_rebuild` to succeed and cache the built schemas.

Fixes openai#3084
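The `model_rebuild` failure mode the commit message leans on can be illustrated with a plain pydantic v2 model. The names here are hypothetical, and the trigger below is an unresolved forward reference rather than the SDK's free TypeVar, but it shows the same `False`-on-failure return that defeats schema caching:

```python
from typing import Optional

from pydantic import BaseModel

class Node(BaseModel):
    # "Later" is an unresolved forward reference at class-creation time,
    # so pydantic defers schema building behind a mock schema object.
    child: Optional["Later"] = None

# While the reference is unresolved, rebuilding fails; with
# raise_errors=False the failure is reported as False.
print(Node.model_rebuild(raise_errors=False))  # False

class Later(BaseModel):
    name: str

# Once everything resolves, model_rebuild succeeds, returns True, and
# the built schema is cached for subsequent validations.
print(Node.model_rebuild())  # True
```

In the PR's scenario the rebuild never succeeds, so the heavy validator/serializer pair is rebuilt on every call instead of being cached once.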
Force-pushed c791d33 to fc95250
Root Cause

`ParsedResponse[TextFormatT]`, where `TextFormatT` is a free module-level TypeVar, causes pydantic's `model_rebuild` to always return `False`. This prevents `MockCoreSchema._built_memo` from caching, causing a new `SchemaValidator`/`SchemaSerializer` (heavy Rust object) to be allocated on every `responses.parse()` call, accumulating without bound.

Fix

Use non-parameterized base classes, which are functionally identical at runtime but allow `model_rebuild` to succeed and cache the built schemas.

Impact
Fixes #3084
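A minimal sketch of the fix's idea, using a hypothetical `ParsedThing` model rather than the SDK's internals: validate against the plain generic base class instead of re-parameterizing it with the free TypeVar on every call.

```python
from typing import Generic, TypeVar

from pydantic import BaseModel

T = TypeVar("T")

class ParsedThing(BaseModel, Generic[T]):
    text: str

def parse_payload(payload: dict) -> "ParsedThing":
    # Before the fix (leaky pattern): ParsedThing[T].model_validate(payload),
    # re-parameterized with the free TypeVar T on every call.
    # After the fix: the plain base class, which validates the same fields
    # at runtime and lets pydantic reuse one cached, built schema.
    return ParsedThing.model_validate(payload)

print(parse_payload({"text": "hi"}).text)  # hi
```

Since the TypeVar is unbound, the parameterized and plain forms accept the same data; only the schema-caching behavior differs.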