Add XTR scoring #86
Description
WARP recently revisited and improved on XTR's modified inference scheme. That scheme avoids loading every embedding of the retrieved candidate documents: any missing query-document token score is imputed as the minimum score among those retrieved for that query token.
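For reference, here is a minimal sketch of that imputation step as I understand it. The function name `xtr_impute_scores` and the `(doc_id, score)` input layout are hypothetical and chosen only for illustration; they are not PyLate, XTR, or WARP APIs. Averaging over query tokens is also an assumption about the final aggregation, and every query token is assumed to have at least one retrieved hit.

```python
def xtr_impute_scores(retrieved):
    """Score candidate documents from token-level retrieval results only.

    `retrieved[i]` is the (hypothetical) list of (doc_id, score) pairs
    returned by token retrieval for query token i. Each list is assumed
    to be non-empty.
    """
    best_per_token = []  # per query token: best retrieved score per document
    floors = []          # per query token: minimum retrieved score (imputation value)
    for hits in retrieved:
        best = {}
        for doc_id, score in hits:
            best[doc_id] = max(score, best.get(doc_id, score))
        best_per_token.append(best)
        floors.append(min(score for _, score in hits))

    # Every document that was retrieved by at least one query token.
    candidates = set().union(*best_per_token)

    # Missing (query token, document) pairs fall back to that token's floor.
    return {
        doc_id: sum(
            best.get(doc_id, floor)
            for best, floor in zip(best_per_token, floors)
        ) / len(retrieved)
        for doc_id in candidates
    }


retrieved = [
    [("a", 0.9), ("b", 0.5)],  # hits for query token 0
    [("a", 0.8), ("a", 0.6)],  # hits for query token 1 (document "b" is missing)
]
scores = xtr_impute_scores(retrieved)
```

Document "b" was never retrieved for query token 1, so its contribution for that token is the floor 0.6 rather than a score computed from its full set of embeddings.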
This change, however, requires an analogous modification of the scoring function during training: only the query-document token scores that rank in the top k_train among all in-batch document tokens for a given query token are kept.
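A rough sketch of what that training-time score could look like for a single query follows. It uses NumPy purely for illustration (a real version would need PyTorch for gradients), and the name `xtr_train_scores`, its arguments, and the flat `doc_ids` layout are all hypothetical. Normalising by the number of query tokens that kept at least one token of the document is my reading of the XTR formulation and should be treated as an assumption; padding tokens are ignored here.

```python
import numpy as np


def xtr_train_scores(sim, doc_ids, n_docs, k_train):
    """XTR-style in-batch scores for one query (illustrative only).

    sim:     (n_query_tokens, n_batch_doc_tokens) token similarities
    doc_ids: (n_batch_doc_tokens,) index of the document owning each token
    k_train: number of in-batch document tokens kept per query token
             (assumed to be <= n_batch_doc_tokens)
    """
    n_q, _ = sim.shape

    # Mark the k_train highest-scoring in-batch document tokens per query token.
    top = np.argpartition(-sim, k_train - 1, axis=1)[:, :k_train]
    keep = np.zeros_like(sim, dtype=bool)
    np.put_along_axis(keep, top, True, axis=1)
    masked = np.where(keep, sim, -np.inf)

    # Max over each document's surviving tokens, per query token.
    per_doc = np.full((n_q, n_docs), -np.inf)
    for d in range(n_docs):
        cols = doc_ids == d
        if cols.any():
            per_doc[:, d] = masked[:, cols].max(axis=1)

    # Sum over query tokens that kept at least one token of the document,
    # divided by the number of such query tokens (assumed normaliser Z).
    hit = np.isfinite(per_doc)
    z = np.maximum(hit.sum(axis=0), 1)
    return np.where(hit, per_doc, 0.0).sum(axis=0) / z


sim = np.array([[0.9, 0.1, 0.8, 0.2],
                [0.3, 0.7, 0.4, 0.6]])
doc_ids = np.array([0, 0, 1, 1])  # two documents with two tokens each
scores_k2 = xtr_train_scores(sim, doc_ids, n_docs=2, k_train=2)
scores_k1 = xtr_train_scores(sim, doc_ids, n_docs=2, k_train=1)
```

With `k_train=1`, no token of the second document survives the top-k filter, so that document scores zero, which is the pressure that pushes the model to rank relevant tokens highly at retrieval time.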
This change should be fairly straightforward: it mainly involves adding one or more xtr_scores functions to scores/scores.py, depending on the multiplicity of documents to queries.
I'd like to take this on if that's alright with y'all @NohTow @raphaelsty . It could be a good prelude to adding WARP support after the PLAID implementation is up and running.