Skip to content

Conversation

@sandwwraith
Copy link
Member

This commit reworks JSON exceptions in kotlinx.serialization:

Fixes #2590

Additionally, some exception messages were made clearer, and minor bugs (such as incorrect offsets) were fixed. This is reflected in the added tests.

This commit reworks JSON exceptions in kotlinx.serialization:

- Classes JsonException, JsonDecodingException, and JsonEncodingException are now public.
They have relevant public properties, such as `shortMessage` and `path`.
Fixes #1930
Fixes #1877

- A special property addInputsToExceptionMessages is added to JsonConfiguration to hide user input from exception messages for security/privacy reasons.

Fixes #2590

Additionally, some exception messages were made clearer and minor bugs (such as incorrect offsets) were fixed. This is reflected in added tests.
* ```
*/
@ExperimentalSerializationApi
public var addInputsToExceptionMessages: Boolean = json.configuration.addInputsToExceptionMessages
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe verboseExceptions or contextualExceptions would be less directive and more self-explanatory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect verboseExceptions = false to remove hints from exceptions. And the word contextual has to many meaning attached to it.

I don't have a better alternative, though :/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we'll take a verbosity route, then we can think of it not as a binary property (brief/verbose), but as different verbosity levels:

  • verbose/trace - include message, path, hint and input
  • debug - include message, path, hint
  • info - include message only

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose inputs because it's straightforward and highlights that we hide user's info, but not some program/source info (such as variable names).
I agree, Verbose/non-verbose may imply that we'll hide hints and paths (?) as well.

/**
* Specifies whether actual input data should be included in exception messages.
*
* When `false`, exception messages will not contain sensitive input data that could be logged
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They still can contain sensitive data, if it was encoded as a map key and an exception message includes it as a path.

We definitely should elaborate why this option is needed and when to enable/disable it, and inputs leakage into logs is a nice argument in that context. But claiming that all sensitive data will be removed from an exception message might be too strict.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we can go and remove map keys right away. We just need to decide on replacement for them

*/
@ExperimentalSerializationApi
public class JsonDecodingException @Deprecated(
"Use decodingExceptionOf() factory methods",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody will need to construct these exceptions outside of the kotlinx-serialization-json, because even if there's a custom serializer that works solely with JSON documents, it will delegate to JsonEncoder/Decoder and the actual decoding exception will come from the decoder, right?

At the same time, to deserialize it into a target type imply that I might want to throw it myself if the JsonElement could not be converted into a target type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody will need to construct these exceptions outside of the kotlinx-serialization-json

Yes. Deprecation is mostly a reminder to future me to use factory methods

path("$.boxed")
} else {
// FIXME? When we are reading Json in 'raw tree' (parseToJsonElement) mode, lexer does not provide path information on syntax errors because there are no descriptors.
// It's probably better to cut out path completely or try to build it without descriptors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How difficult the path reconstruction would be?

If it's hard, I think it's better to cut out the path completely and add a hint about its unavailability. No path is better than incorrect one, if you're trying to process it somehow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be that hard, but the path will contain input keys, not our property names. Or we can remove it here completely, too.

// A `:` instead of `"` is a good guess, but `:`/`}` is a perfectly valid token inside Json string — for example,
// it can be some kind of path `{"foo:bar:baz":"my:resource:locator:{123}"}` or even URI used as a string key/value.
// So if the closing quote is missing, there's really no way to correctly tell where the key or value is supposed to end.
// Although we may try to unify these messages for consistency.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if the key contains an escaped ", we'll throw Unexpected EOF too :)

Perhaps, we should complain about "Unterminated string started at" instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we can correctly detect that in all of the cases

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we can guess offset when we have all string available in memory (

), when we work with streams and buffers, offset can be 'lost' or modifier if buffer boundaries are shifted

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants