Conversation
There was a problem hiding this comment.
Pull request overview
This PR aims to prevent raw runtime exceptions from leaking out of Schema deserialization (notably a SnakeYAML-engine IndexOutOfBoundsException edge case) by wrapping runtime failures in SchemaException, and adds a regression test plus a convenience SchemaException constructor.
Changes:
- Add
SchemaException(String message, Throwable throwable)constructor. - Wrap
RuntimeExceptionthrown during input deserialization inSchema.deserialize(...)intoSchemaException. - Add a JUnit test intended to ensure YAML chunk-boundary parsing failures don’t escape as raw
IndexOutOfBoundsException.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/test/java/com/networknt/schema/SchemaTest.java | Adds a regression test for YAML chunk-boundary runtime failures. |
| src/main/java/com/networknt/schema/SchemaException.java | Adds a message+cause constructor to support clearer wrapped errors. |
| src/main/java/com/networknt/schema/Schema.java | Wraps runtime exceptions from readTree(...) into SchemaException during deserialization. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| private JsonNode deserialize(String input, InputFormat inputFormat) { | ||
| try { | ||
| return this.getSchemaContext().getSchemaRegistry().readTree(input, inputFormat); | ||
| } catch (IOException e) { | ||
| throw new UncheckedIOException("Invalid input", e); | ||
| } catch (RuntimeException e) { | ||
| throw new SchemaException("Invalid " + inputFormat + " input: " + e.getMessage(), e); | ||
| } |
There was a problem hiding this comment.
Catching and wrapping all RuntimeExceptions here will also convert IllegalArgumentException thrown for unsupported InputFormat (and potentially other programmer errors) into SchemaException, which can make debugging harder and is a behavioral change. Consider rethrowing IllegalArgumentException/SchemaException as-is and only wrapping the specific runtime failures coming from the underlying parser (e.g., the snakeyaml-engine IndexOutOfBoundsException).
| @Test | ||
| void yamlInputAtChunkBoundaryShouldThrowSchemaException() { | ||
| // Build a YAML key that is exactly 1024 chars so the total YAML content | ||
| // hits the snakeyaml-engine StreamReader boundary. | ||
| String longKey = "a".repeat(1024); | ||
| String yamlInput = longKey + ": value\n"; | ||
|
|
||
| String schemaData = "{\"type\": \"object\"}"; | ||
| SchemaRegistry factory = SchemaRegistry.withDialect(com.networknt.schema.dialect.Dialects.getOpenApi31()); | ||
| Schema schema = factory.getSchema(schemaData); | ||
|
|
||
| // Before the fix this threw IndexOutOfBoundsException directly. | ||
| // After the fix it should throw SchemaException with a clear message. | ||
| try { | ||
| schema.validate(yamlInput, InputFormat.YAML); | ||
| } catch (IndexOutOfBoundsException e) { | ||
| // Raw IOOBE must not escape - this is the bug we're fixing | ||
| throw new AssertionError("Expected SchemaException but got raw IndexOutOfBoundsException", e); | ||
| } catch (RuntimeException e) { | ||
| // Any other RuntimeException (including SchemaException) is acceptable | ||
| // as long as it's not the raw IndexOutOfBoundsException | ||
| } |
There was a problem hiding this comment.
This test will pass even if schema.validate(...) does not throw at all, and it also treats any RuntimeException as acceptable (including unrelated NullPointerException, etc.). To make this a reliable regression test for issue 1231, assert that a SchemaException is thrown (and ideally assert its message/prefix), and fail the test when no exception is thrown.
| * When the YAML input length lands exactly on the snakeyaml-engine StreamReader | ||
| * 1024-char chunk boundary, an IndexOutOfBoundsException is thrown internally. | ||
| * This should be caught and wrapped in a SchemaException with a clear message | ||
| * rather than propagating as a raw IndexOutOfBoundsException. | ||
| */ | ||
| @Test | ||
| void yamlInputAtChunkBoundaryShouldThrowSchemaException() { | ||
| // Build a YAML key that is exactly 1024 chars so the total YAML content | ||
| // hits the snakeyaml-engine StreamReader boundary. |
There was a problem hiding this comment.
The Javadoc says the YAML input length lands exactly on the 1024-char boundary, but yamlInput is longer than 1024 characters (longKey is 1024 plus the ": value\n" suffix). Consider rewording to clarify what is actually at the boundary (e.g., the colon position / token boundary) so the test rationale matches the constructed input.
| * When the YAML input length lands exactly on the snakeyaml-engine StreamReader | |
| * 1024-char chunk boundary, an IndexOutOfBoundsException is thrown internally. | |
| * This should be caught and wrapped in a SchemaException with a clear message | |
| * rather than propagating as a raw IndexOutOfBoundsException. | |
| */ | |
| @Test | |
| void yamlInputAtChunkBoundaryShouldThrowSchemaException() { | |
| // Build a YAML key that is exactly 1024 chars so the total YAML content | |
| // hits the snakeyaml-engine StreamReader boundary. | |
| * When the YAML key length (so that the colon/token position) lands exactly on | |
| * the snakeyaml-engine StreamReader 1024-char chunk boundary, an | |
| * IndexOutOfBoundsException is thrown internally. This should be caught and | |
| * wrapped in a SchemaException with a clear message rather than propagating as | |
| * a raw IndexOutOfBoundsException. | |
| */ | |
| @Test | |
| void yamlInputAtChunkBoundaryShouldThrowSchemaException() { | |
| // Build a YAML key that is exactly 1024 chars so that the colon falls on | |
| // the snakeyaml-engine StreamReader 1024-char chunk boundary (the overall | |
| // YAML input is therefore slightly longer than 1024 characters). |
| @@ -1213,6 +1215,8 @@ | |||
| } | |||
| } catch (IOException e) { | |||
| throw new UncheckedIOException("Invalid input", e); | |||
There was a problem hiding this comment.
deserialize(AbsoluteIri, ...) deliberately throws an UncheckedIOException when the resource isn't found, but the new catch (RuntimeException e) will wrap that into a SchemaException with an "Invalid ... input" message. This changes the exception type/semantics for missing resources and makes the message misleading. Handle UncheckedIOException separately (rethrow) or narrow the runtime wrapping to the specific parser exceptions you intend to convert.
| throw new UncheckedIOException("Invalid input", e); | |
| throw new UncheckedIOException("Invalid input", e); | |
| } catch (UncheckedIOException e) { | |
| // Preserve deliberately thrown UncheckedIOException (e.g., missing resource) semantics | |
| throw e; |
|
I cannot reproduce this. The test case doesn't throw at all when schema.validate is called? |
|
I think only certain version of YAML parser has this bug, and it is not easy for us to reproduce it. The key question for us is should we catch runtime exception and rethrow a schema exception. |
|
I don't really think we should catch a runtime exception just to wrap it with another runtime exception. I think its the caller's responsibility to catch and handle all the runtime exceptions. Also copilot points out Catching and wrapping all |
|
Good point. Let's discard this PR. |
No description provided.