fall back to times-roman as standard 14 font when lenient #1085

Merged
BobLd merged 1 commit into master from fix-corpus-0000086-file
Jul 16, 2025
@EliotJones
Member

When parsing in lenient mode, if we encounter a malformed base font name (in this case 'helveticai'), we fall back to Times-Roman as the Adobe Font Metrics file for a Standard 14 font. This aligns with the behavior of PDFBox. In non-lenient mode we now also log a more informative error.

This fixes document 0000086.pdf from the corpus.
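
For illustration, here is a minimal sketch of the fallback described above. It is written in Python and is not the project's actual code: `resolve_standard14_metrics`, `STANDARD_14_NAMES`, and `InvalidFontError` are hypothetical names, and exact-match lookup on the base name is an assumption made to keep the example small.

```python
# Hypothetical sketch of the lenient fallback; names are illustrative only.

# The fourteen standard PDF font names.
STANDARD_14_NAMES = frozenset({
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Symbol", "ZapfDingbats",
})

FALLBACK_NAME = "Times-Roman"


class InvalidFontError(ValueError):
    """Raised in non-lenient mode when the base name is not recognised."""


def resolve_standard14_metrics(base_name: str, lenient: bool) -> str:
    """Return the name of the metrics set to use for a Standard 14 font.

    Assumption: a base name is valid only if it matches a standard name exactly.
    """
    if base_name in STANDARD_14_NAMES:
        return base_name
    if lenient:
        # Malformed base name (e.g. 'helveticai'): fall back instead of failing.
        return FALLBACK_NAME
    # Non-lenient mode: fail with a message that names the offending value.
    raise InvalidFontError(
        f"No Standard 14 font metrics found for base name '{base_name}'."
    )
```

With this sketch, `resolve_standard14_metrics("helveticai", lenient=True)` returns `"Times-Roman"`, while the same call with `lenient=False` raises `InvalidFontError` and reports the malformed name.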

@EliotJones EliotJones requested a review from BobLd July 16, 2025 01:50
@BobLd BobLd merged commit 1021729 into master Jul 16, 2025
2 checks passed
@BobLd BobLd deleted the fix-corpus-0000086-file branch July 16, 2025 06:43