Fix OOM in BinHexDecoder, Base64Decoder, and XmlSchemaValidator when throwing on large invalid input#125930

Open
Copilot wants to merge 6 commits into main from copilot/fix-charcheckingreader-tests

Conversation

Copilot AI commented Mar 22, 2026

CharCheckingReaderTest.RunTests ("ReadBinHex runs into an Overflow") fails sporadically with OutOfMemoryException on 32-bit platforms (x86, WASM). When encountering an invalid character, BinHexDecoder and Base64Decoder both called chars.ToString() on the entire decode buffer — potentially millions of characters — to populate the XmlException message. The string allocation itself OOMs before the actual exception reaches the caller. Furthermore, XmlSchemaValidator independently re-embeds the full element/attribute value into its own error messages, so even fixing the inner decoder exceptions alone would not bound the observable message size for callers going through schema validation.
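
To make the allocation difference concrete, here is a minimal sketch (not code from this PR). The helper name `MeasureStringLengths` and the buffer size are assumptions chosen for illustration; the only library behavior relied on is that `ToString()` on a `ReadOnlySpan<char>` copies every character of the span into a new string.

```csharp
using System;

// Hypothetical helper: compares the size of the two strings that could be
// passed to the exception constructor.
static (int WholeLength, int SingleLength) MeasureStringLengths(char[] buffer, int badIndex)
{
    ReadOnlySpan<char> chars = buffer;

    // ToString() on a char span copies the entire span into a new string.
    string whole = chars.ToString();

    // Converting only the offending character allocates a one-character string.
    string single = chars[badIndex].ToString();

    return (whole.Length, single.Length);
}

// Illustrative buffer standing in for a large decode span.
char[] buffer = new char[1_000_000];
Array.Fill(buffer, 'A');
buffer[500_000] = '!'; // the one invalid character

var lengths = MeasureStringLengths(buffer, 500_000);
Console.WriteLine($"whole = {lengths.WholeLength} chars, single = {lengths.SingleLength} char");
```

On a 32-bit process the first allocation has to find room for the whole copy in a small address space, which fits the sporadic failure described above.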

Changes

Decoders (BinHexDecoder.cs, Base64Decoder.cs):

  • BinHexDecoder, invalid-character path (private Decode): chars.ToString() → ch.ToString(). The error message "'{0}' is not a valid BinHex character." now names the single bad character, not the whole buffer. Resource string updated from "text sequence" to "character" accordingly.
  • BinHexDecoder, odd-count path (public static Decode): Pass null instead of the full input span. The resource string Xml_InvalidBinHexValueOddCount was updated to drop {0}: for an odd-count error the problem is the length, not the content, so embedding the value is not useful.
  • Base64Decoder, two throw sites: chars.ToString() → ch.ToString() for both the invalid-character path and the non-whitespace-after-padding path.

Schema validator (XmlSchemaValidator.cs):

  • Introduced a TruncateValueForErrorMessage private helper (max 40 chars + "...") and applied it at all three call sites that embed stringValue into Sch_ElementValueDataTypeDetailed / Sch_AttributeValueDataTypeDetailed error messages.
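
A truncation helper of this kind could look roughly like the sketch below. This is an illustration, not the helper added in XmlSchemaValidator.cs: the name `TruncateForMessage` and its signature are assumptions, and only the 40-character limit and the three-dot suffix come from the description above.

```csharp
using System;

// Hypothetical stand-in for the private helper described above.
// Values at or under the limit are returned unchanged; longer values are cut
// to the first maxLength characters followed by three ASCII dots.
static string TruncateForMessage(string value, int maxLength = 40)
{
    if (value.Length <= maxLength)
    {
        return value;
    }

    return value.Substring(0, maxLength) + "...";
}

string huge = new string('A', 1_000_000);
Console.WriteLine(TruncateForMessage(huge).Length);   // 40 characters plus "..."
Console.WriteLine(TruncateForMessage("0A1B2C"));      // short values pass through
```

Keeping a prefix rather than dropping the value entirely preserves some context about what was being validated while bounding the message size.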

Before / after for the primary case:

// Before — materializes entire decode buffer into a string
throw new XmlException(SR.Xml_InvalidBinHexValue, chars.ToString());

// After — passes only the offending character
throw new XmlException(SR.Xml_InvalidBinHexValue, ch.ToString());
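
For readers without the source open, the two decoder error paths can be sketched in a self-contained form. Everything here is a simplified assumption: `DecodeHexSketch` and `HexValue` are hypothetical names, `FormatException` stands in for `XmlException`, the message text is made up, and the real decoder's whitespace handling and half-byte caching are omitted.

```csharp
using System;

// Maps a hex digit to its value, or -1 if the character is not a hex digit.
static int HexValue(char c) => c switch
{
    >= '0' and <= '9' => c - '0',
    >= 'a' and <= 'f' => c - 'a' + 10,
    >= 'A' and <= 'F' => c - 'A' + 10,
    _ => -1,
};

// Hypothetical decoder illustrating bounded error messages.
static byte[] DecodeHexSketch(string input)
{
    // Odd-count path: the problem is the length, so the value is not embedded.
    if (input.Length % 2 != 0)
    {
        throw new FormatException("The input has an odd number of hex characters.");
    }

    byte[] result = new byte[input.Length / 2];
    for (int i = 0; i < input.Length; i++)
    {
        char ch = input[i];
        int value = HexValue(ch);
        if (value < 0)
        {
            // Invalid-character path: report only the offending character.
            throw new FormatException($"'{ch}' is not a valid hex character.");
        }

        if ((i & 1) == 0)
        {
            result[i / 2] = (byte)(value << 4); // high nibble
        }
        else
        {
            result[i / 2] |= (byte)value;       // low nibble
        }
    }

    return result;
}

try
{
    DecodeHexSketch(new string('A', 1_000_000) + "ZZ");
}
catch (FormatException ex)
{
    // The message length does not grow with the input length.
    Console.WriteLine(ex.Message);
}
```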

Tests:

  • Tightened TestReadBinHex_105376 to assert ex.Message.Length ≤ 100.
  • Added TestReadBase64_InvalidChar_ErrorMessageBounded (same pattern, limit 100).
  • Added ValidationErrorMessageTests for element and attribute schema validation paths (limit 500, since those messages include structural overhead: element/type names and inner exception text).
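
The bounded-message assertion pattern these tests rely on can be sketched without the XML stack. This is not the test code from the PR: `IsMessageBounded` is a hypothetical helper, and the two lambdas are stand-ins that throw `FormatException` with an unbounded and a bounded message respectively.

```csharp
using System;

// Hypothetical helper: runs the action and reports whether it threw an
// exception whose message is within the limit. Returns false if nothing threw.
static bool IsMessageBounded(Action action, int limit)
{
    try
    {
        action();
        return false; // an exception was expected
    }
    catch (Exception ex)
    {
        return ex.Message.Length <= limit;
    }
}

string big = new string('A', 100_000);

// Stand-in for the old behavior: the whole value is embedded in the message.
bool before = IsMessageBounded(() => throw new FormatException($"'{big}' is not valid."), 100);

// Stand-in for the new behavior: only one character is embedded.
bool after = IsMessageBounded(() => throw new FormatException($"'{big[0]}' is not valid."), 100);

Console.WriteLine($"before fix bounded: {before}, after fix bounded: {after}");
```

A length assertion like this fails on the old behavior even on 64-bit machines where the allocation succeeds, so the regression is caught without depending on an actual OutOfMemoryException.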
Original prompt

This section details the original issue you should resolve

<issue_title>System.Xml.CharCheckingReaderTests fails with OutOfMemoryException</issue_title>
<issue_description>## Build Information
Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=1149008
Build error leg or test failing: System.Xml.CharCheckingReaderTests.CharCheckingReaderTest.RunTests
Pull request: #119599

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": ["System.Xml.CharCheckingReaderTests","OutOfMemoryException"],
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=1149008
Error message validated: [System.Xml.CharCheckingReaderTests OutOfMemoryException]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 9/16/2025 6:43:18 AM UTC

Report

| Build | Definition | Test | Pull Request |
| --- | --- | --- | --- |
| 1331798 | dotnet/runtime | System.Xml.CharCheckingReaderTests.CharCheckingReaderTest.RunTests | #123252 |
| 1324938 | dotnet/runtime | System.Xml.CharCheckingReaderTests.CharCheckingReaderTest.RunTests | #125288 |
| 1309948 | dotnet/runtime | System.Xml.CharCheckingReaderTests.CharCheckingReaderTest.RunTests | |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 0 | 0 | 3 |

Comments on the Issue (you are @copilot in this section)

@jkotas
```
Starting: System.Private.Xml.Tests (parallel test collections = on [4 threads], stop on fail = off)
System.Xml.CharCheckingReaderTests.CharCheckingReaderTest.RunTests(testCase: ReadBinHex runs into an Overflow) [FAIL]
  System.OutOfMemoryException : Exception of type 'System.OutOfMemoryException' was thrown.
  Stack Trace:
    /_/src/libraries/System.Private.CoreLib/src/System/String.cs(316,0): at System.String.Ctor(ReadOnlySpan`1 value)
    /_/src/libraries/System.Private.CoreLib/src/System/Span.cs(382,0): at System.Span`1.ToString()
    /_/src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs(544,0): at System.String.FormatHelper(IFormatProvider provider, String format, ReadOnlySpan`1 args)
    /_/src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs(492,0): at System.String.Format(String format, ReadOnlySpan`1 args)
    /_/src/libraries/System.Private.Xml/src/System/Xml/XmlException.cs(202,0): at System.Xml.XmlException.CreateMessage(String res, String[] args, Int32 lineNumber, Int32 linePosition)
    /_/src/libraries/System.Private.Xml/src/System/Xml/XmlException.cs(167,0): at System.Xml.XmlException..ctor(String res, String[] args, Exception innerException, Int32 lineNumber, Int32 linePosition, String sourceUri)
    /_/src/libraries/System.Private.Xml/src/System/Xml/XmlException.cs(122,0): at System.Xml.XmlException..ctor(String res, String arg)
    /_/src/libraries/System.Private.Xml/src/System/Xml/BinHexDecoder.cs(165,0): at System.Xml.BinHexDecoder.Decode(ReadOnlySpan`1 chars, Span`1 bytes, Boolean& hasHalfByteCached, Byte& cachedHalfByte, Int32& charsDecoded, Int32& bytesDecoded)
    /_/src/libraries/System.Private.Xml/src/System/Xml/BinHexDecoder.cs(74,0): at System.Xml.BinHexDecoder.Decode(String str, Int32 startPos, Int32 len)
    /_/src/libraries/System.Private.Xml/src/System/Xml/Core/ReadContentAsBinaryHelper.cs(367,0): at System.Xml.ReadContentAsBinaryHelper.ReadContentAsBinary(Byte[] buffer, Int32 index, In...
```

@dotnet-policy-service

Tagging subscribers to this area: @dotnet/area-system-xml
See info in area-owners.md if you want to be subscribed.

Copilot AI and others added 2 commits March 22, 2026 20:28
…essages

- In the private Decode method: pass just the invalid character `ch` instead of
  the entire `chars` span when throwing XmlException for an invalid BinHex char.
  This is also more semantically correct (the message says 'char X is not valid').
- In the public static Decode method: truncate the chars string to 40 chars when
  building the OddCount error message to avoid creating huge strings on large inputs.

Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/07551353-4ccc-440e-869f-1c38e684be2b
Copilot AI changed the title [WIP] Fix OutOfMemoryException in CharCheckingReaderTests Fix OOM in BinHexDecoder and Base64Decoder when throwing on large invalid input Mar 22, 2026
Copilot AI requested a review from danmoseley March 22, 2026 20:34
…ation

- Update resource strings to say 'character' instead of 'text sequence'
- Drop {0} from Xml_InvalidBinHexValueOddCount (pass null instead of full input)
- Truncate stringValue to 40 chars in XmlSchemaValidator error messages
- Add error message size assertions to existing BinHex/Base64 tests
- Add ValidationErrorMessageTests for schema validation paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danmoseley

Note

This comment was generated with Copilot assistance.

I pushed an additional commit that audits and fixes all the places where these decoders' unbounded error messages can surface.

The original two commits fix BinHexDecoder.Decode and Base64Decoder.Decode so they pass only the offending character into exception messages. But BinHexDecoder has a third error path (odd-count, line 121) that also embeds the full input via new string(chars). That path is reachable from public API through XSD schema validation of xs:hexBinary types. Furthermore, XmlSchemaValidator itself independently re-embeds the full element/attribute value into its own error messages at three call sites (Sch_ElementValueDataTypeDetailed / Sch_AttributeValueDataTypeDetailed), so even fixing the inner decoder exception alone wouldn't bound the observable message size.

Showing the offending character rather than the full sequence is fine diagnostically: the caller already has the input, and the character identity (or "odd count") is what they need to locate the problem. The full sequence in an error message was never useful — just a copy of data the caller already holds.

Product changes:

  • BinHexDecoder.cs (OddCount path): Pass null instead of the full input. The resource string was updated to drop {0} — for an odd-count error, knowing the value doesn't help; the problem is the length.
  • XmlSchemaValidator.cs (3 sites): Truncate stringValue to 40 chars + "..." before embedding in error messages. Here a prefix of the value is useful context (shows what was being validated), but the full value is not.
  • Base64Decoder.cs: Rename afterPad to ch for consistency with the existing loop variable.
  • Resource strings: "text sequence" to "character" (since {0} is now a single char), and dropped '{0}' from Xml_InvalidBinHexValueOddCount.

Tests:

  • Tightened the existing TestReadBinHex_105376 to assert ex.Message.Length < 1000.
  • Added TestReadBase64_InvalidChar_ErrorMessageBounded (same pattern).
  • Added ValidationErrorMessageTests for element and attribute schema validation paths.

All tests verified: fail before fix, pass after.

@danmoseley danmoseley marked this pull request as ready for review March 23, 2026 02:26
Copilot AI review requested due to automatic review settings March 23, 2026 02:26

Copilot AI left a comment

Pull request overview

Fixes sporadic OutOfMemoryException failures on 32-bit platforms by preventing very large allocations when formatting XmlException messages for invalid BinHex/Base64 input, and adds tests to ensure validation/reader error messages remain bounded.

Changes:

  • Update BinHexDecoder/Base64Decoder to report only the offending character (instead of chars.ToString() on the full decode span) when throwing XmlException.
  • Bound schema-validation error details by truncating the value included in datatype validation messages.
  • Add/extend tests to validate that error messages stay below a reasonable size for large invalid inputs.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Summary per file:

| File | Description |
| --- | --- |
| src/libraries/System.Private.Xml/src/System/Xml/BinHexDecoder.cs | Avoids allocating the entire decode buffer when throwing for invalid BinHex; adjusts odd-count error path. |
| src/libraries/System.Private.Xml/src/System/Xml/Base64Decoder.cs | Avoids allocating the entire decode buffer when throwing for invalid Base64 (including post-padding non-whitespace). |
| src/libraries/System.Private.Xml/src/System/Xml/Schema/XmlSchemaValidator.cs | Truncates large values included in schema validation event messages. |
| src/libraries/System.Private.Xml/src/Resources/Strings.resx | Updates resource strings to reflect “invalid character” phrasing and removes value insertion for odd-count BinHex. |
| src/libraries/System.Private.Xml/tests/XmlReaderLib/ReadBase64.cs | Adds a regression test ensuring Base64 invalid-char exceptions don’t include the entire input. |
| src/libraries/System.Private.Xml/tests/XmlReaderLib/TCReadElementContentAsBase64.cs | Registers new Base64 invalid-char bounded-message variations. |
| src/libraries/System.Private.Xml/tests/XmlReaderLib/ReadBinHex.cs | Tightens existing BinHex overflow test to fail if exception message is unexpectedly large. |
| src/libraries/System.Private.Xml/tests/XmlSchema/XmlSchemaValidatorApi/ValidationErrorMessageTests.cs | Adds schema-validation tests asserting error messages remain bounded for large invalid hexBinary values. |
| src/libraries/System.Private.Xml/tests/System.Private.Xml.Tests.csproj | Includes the new schema validation test file in the test project. |

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danmoseley danmoseley enabled auto-merge (squash) March 23, 2026 03:20
@danmoseley danmoseley requested a review from tarekgh March 23, 2026 19:57
@danmoseley

@tarekgh I saw you touched some Xml*.cs recently :)

@tarekgh

tarekgh commented Mar 23, 2026

CC @krwq, who I think still owns it. I'll take a look too anyway.

@stephentoub

Separately we should look at gutting Base64Decoder and replacing its implementation with use of Base64, rather than having a separate worse implementation.
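
As a rough illustration of what delegating to a framework Base64 routine could look like, here is a sketch built on `Convert.TryFromBase64Chars`. This is an assumption made for the example: the comment above names only "Base64" and does not say which API or design is intended, and `TryDecodeBase64` is a hypothetical wrapper, not a proposed replacement for Base64Decoder (which also supports incremental decoding).

```csharp
using System;

// Hypothetical wrapper: decodes a complete Base64 string with the framework
// routine and reports failure without building any message from the input.
static bool TryDecodeBase64(string text, out byte[] decoded)
{
    // Upper bound on the decoded size: 3 bytes for every 4 input characters.
    byte[] buffer = new byte[(text.Length / 4 + 1) * 3];

    if (Convert.TryFromBase64Chars(text, buffer, out int bytesWritten))
    {
        decoded = buffer.AsSpan(0, bytesWritten).ToArray();
        return true;
    }

    decoded = Array.Empty<byte>();
    return false;
}

Console.WriteLine(TryDecodeBase64("aGVsbG8=", out byte[] ok) ? $"decoded {ok.Length} bytes" : "invalid");
Console.WriteLine(TryDecodeBase64("aGV$bG8=", out _) ? "decoded" : "invalid");
```

A Try-style API reports failure as a boolean, so the caller decides what, if anything, from the input goes into an error message.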

@tarekgh tarekgh left a comment

Left a minor suggestion. LGTM otherwise.

@danmoseley

@copilot address feedback for limit let's use only 100.
Let's leave ellipsis at three separate dots for max compat for people with terminals

Per review feedback: the decoder exception messages (which contain only
the invalid character) are short, so 100 is a tighter and more
appropriate bound. Schema validator tests keep 500 since those messages
include structural overhead (element/type names, inner exception text).

Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/8b521214-f14a-4ccc-8f64-29b49c990a43
auto-merge was automatically disabled March 23, 2026 22:10

Head branch was pushed to by a user without write access

Copilot AI commented Mar 23, 2026

@copilot address feedback for limit let's use only 100.
Let's leave ellipsis at three separate dots for max compat for people with terminals

Done in adfcd66 — decoder test limits changed from 1000 to 100. Kept three separate dots for ellipsis. The schema validator tests keep 500 since those messages include structural overhead beyond just the value.

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • bla
  • foo
  • notfound.invalid.corp.microsoft.com
  • test.test

Each address was reported as a dns block with the same two triggering commands:

  • /home/REDACTED/work/runtime/runtime/artifacts/bin/testhost/net11.0-linux-Debug-x64/dotnet /home/REDACTED/work/runtime/runtime/artifacts/bin/testhost/net11.0-linux-Debug-x64/dotnet exec --runtimeconfig System.Private.Xml.Tests.runtimeconfig.json --depsfile System.Private.Xml.Tests.deps.json /home/REDACTED/.nuget/packages/microsoft.dotnet.xunitconsoleREDACTED/2.9.3-beta.26166.111/build/../tools/net/xunit.console.dll System.Private.Xml.Tests.dll -xml testResults.xml -nologo -notrait category=OuterLoop -notrait category=failing
  • /home/REDACTED/work/runtime/runtime/artifacts/bin/testhost/net11.0-linux-Debug-x64/dotnet /home/REDACTED/work/runtime/runtime/artifacts/bin/testhost/net11.0-linux-Debug-x64/dotnet exec --runtimeconfig System.Private.Xml.Tests.runtimeconfig.json --depsfile System.Private.Xml.Tests.deps.json /home/REDACTED/.nuget/packages/microsoft.dotnet.xunitconsoleREDACTED/2.9.3-beta.26166.111/build/../tools/net/xunit.console.dll System.Private.Xml.Tests.dll -xml testResults.xml -nologo -notrait category=OuterLoop -notrait category=failing tnet/sdk/11.0.10-DURTBLDENV_FRIENDLY=Debug tnet/sdk/11.0.10bash tnet/sdk/11.0.10--norc ITS=64 GET_AMD64 -DTARG-D_TIME_BITS=64

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title Fix OOM in BinHexDecoder and Base64Decoder when throwing on large invalid input Fix OOM in BinHexDecoder, Base64Decoder, and XmlSchemaValidator when throwing on large invalid input Mar 23, 2026
Copilot AI requested a review from danmoseley March 23, 2026 22:11
@danmoseley danmoseley enabled auto-merge (squash) March 23, 2026 22:32
Development

Successfully merging this pull request may close these issues.

System.Xml.CharCheckingReaderTests fails with OutOfMemoryException

5 participants