Adding a proper function to add the prefixes #29

Closed
NohTow wants to merge 1 commit into main from fix_prefix

Conversation

Collaborator

@NohTow NohTow commented Aug 9, 2024

This PR introduces a dedicated function for adding the query/document prefixes. It is more robust and works with all tokenizers, since it no longer relies on ". " being tokenized as a single token (which is not the case for mGTE, for example).

This fixes #11.
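
For illustration only, here is a minimal sketch of the general idea, not the code from this PR. It assumes the prefix is a single dedicated token whose id is already known, and that every encoded sequence starts with one special start token (for example `[CLS]`). The function name `insert_prefix_token` and the token ids in the usage note are hypothetical. Instead of prepending placeholder text and hoping it maps to exactly one token, the prefix id is inserted directly into the encoded sequence:

```python
from typing import List, Tuple


def insert_prefix_token(
    input_ids: List[int],
    attention_mask: List[int],
    prefix_id: int,
) -> Tuple[List[int], List[int]]:
    """Insert a prefix token id right after the first (start) token.

    Hypothetical helper: it works on already-encoded ids, so it does not
    depend on how any placeholder string would be tokenized.
    """
    if not input_ids:
        raise ValueError("input_ids must contain at least the start token")
    if len(input_ids) != len(attention_mask):
        raise ValueError("input_ids and attention_mask must have the same length")

    # Keep the start token first, then the prefix, then the original tokens.
    new_ids = input_ids[:1] + [prefix_id] + input_ids[1:]
    # The prefix is a real token, so it is attended to (mask value 1).
    new_mask = attention_mask[:1] + [1] + attention_mask[1:]
    return new_ids, new_mask
```

With made-up ids, `insert_prefix_token([101, 7592, 102], [1, 1, 1], 999)` yields the ids `[101, 999, 7592, 102]` and a mask of four ones. Since the sequence grows by one token, a caller with a maximum length would need to reserve one position when tokenizing.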

Development

Successfully merging this pull request may close these issues.

Fix tokenization for query/doc marker

1 participant