publicly expose tokenizer callbacks #64

Merged
MohamedRejeb merged 1 commit into MohamedRejeb:main from AndreasMattsson:feature/public-callbacks
Jun 8, 2025

@AndreasMattsson AndreasMattsson commented May 27, 2025

Motivation / Use Case

I have been using Ksoup for a while as an alternative to fromHtml on KMP iOS, but I have now come across a need to output the entire raw, unformatted representation of certain unsupported tags (rather than just emitting their content).
In this case it is important that they are represented exactly "as is", so I can't just recreate them based on e.g. the attributes parameter.
Unfortunately, to handle this case with Ksoup I need access to some internal state, hence this fork/PR.

There could be many ways Ksoup could allow for this (and similar) use-cases, e.g.:

  1. Adjusting KsoupHtmlHandler callbacks to include either the "raw" representation of a tag, or corresponding startIndex and endIndex. Though I suspect this would be seen as out of scope for this library.
  2. Making KsoupHtmlParser and its callback methods open rather than final (perhaps with some internal state as protected, to allow subclasses to change the behavior).
  3. The approach this PR goes for, which is to make the tokenizer callbacks a public interface and allow passing an instance of them to KsoupHtmlParser, which gets called in addition to the parser's own internal one.

Changes made:

  • Moves tokenizer callbacks to a separate file/class (called KsoupTokenizerCallbacks) and makes them public
  • Allows (optionally) passing tokenizer callbacks to KsoupHtmlParser in addition to existing KsoupHtmlHandler
  • Moves from direct inheritance of the tokenizer callbacks in KsoupHtmlParser to an inner (anonymous) object, as I don't see why they should be publicly exposed on the parser itself (that could invite calls that mess up its internal state)
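
To make the shape of this change easier to picture, here is a minimal, self-contained sketch of the pattern. It is not Ksoup's actual code: `TokenizerCallbacks`, `ToyParser`, `onOpenTag`, `onText` and the toy scanning loop are hypothetical stand-ins invented for illustration, and the real KsoupTokenizerCallbacks interface may have different methods and signatures. The sketch only shows the idea of a parser that keeps its own callbacks in an anonymous object and forwards each event to an optional consumer-supplied instance, which can then slice the raw tag out of the input.

```kotlin
// Hypothetical, simplified stand-in for a public tokenizer callback interface.
// Not Ksoup's actual declaration.
interface TokenizerCallbacks {
    fun onOpenTag(startIndex: Int, endIndex: Int) {}
    fun onText(startIndex: Int, endIndex: Int) {}
}

// Hypothetical toy parser: it only recognizes "<...>" spans and text runs.
class ToyParser(
    private val onTagName: (String) -> Unit,
    private val extraCallbacks: TokenizerCallbacks? = null,
) {
    private var input: String = ""

    // The parser's own callbacks live in an anonymous object, so they are
    // not callable from outside and cannot disturb internal state.
    private val internalCallbacks = object : TokenizerCallbacks {
        override fun onOpenTag(startIndex: Int, endIndex: Int) {
            // Internal handling first: strip '<' and '>' and keep the tag name.
            val name = input.substring(startIndex + 1, endIndex - 1).substringBefore(' ')
            onTagName(name)
            // Then forward the same event to the consumer, if one was supplied.
            extraCallbacks?.onOpenTag(startIndex, endIndex)
        }

        override fun onText(startIndex: Int, endIndex: Int) {
            extraCallbacks?.onText(startIndex, endIndex)
        }
    }

    fun parse(html: String) {
        input = html
        var i = 0
        while (i < html.length) {
            if (html[i] == '<') {
                val close = html.indexOf('>', i)
                if (close == -1) {
                    // Unterminated tag: treat the rest as text.
                    internalCallbacks.onText(i, html.length)
                    return
                }
                // Closing tags are skipped in this toy version.
                if (html.getOrNull(i + 1) != '/') {
                    internalCallbacks.onOpenTag(i, close + 1)
                }
                i = close + 1
            } else {
                val next = html.indexOf('<', i).let { if (it == -1) html.length else it }
                internalCallbacks.onText(i, next)
                i = next
            }
        }
    }
}

fun main() {
    val html = "<custom data-x=\"1\">hi</custom>"
    val names = mutableListOf<String>()
    val raw = mutableListOf<String>()
    val parser = ToyParser(
        onTagName = { names.add(it) },
        extraCallbacks = object : TokenizerCallbacks {
            override fun onOpenTag(startIndex: Int, endIndex: Int) {
                // The consumer recovers the tag exactly "as is" from the indices.
                raw.add(html.substring(startIndex, endIndex))
            }
        },
    )
    parser.parse(html)
    println(names) // [custom]
    println(raw)   // [<custom data-x="1">]
}
```

The point of the design is that the consumer sees index-level events without subclassing the parser, while the parser's own handling stays private.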

… forwarded to public consumer specified callback
Owner

@MohamedRejeb MohamedRejeb left a comment

LGTM, Thanks for your contribution!

@MohamedRejeb MohamedRejeb merged commit 4b4ae1e into MohamedRejeb:main Jun 8, 2025
3 checks passed