Add XTR scoring #86

@robro612

WARP has recently revisited and improved upon XTR's modified inference scheme. That scheme removes the need to load every embedding of the retrieved candidate documents: any missing query-document token score is imputed as the minimum score among those retrieved for that query token.
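To make the imputation concrete, here is a minimal pure-Python sketch of that inference-time scoring. The function name `xtr_impute_scores`, the `(doc_id, score)` hit format, and the final averaging over query tokens are assumptions made for illustration; they are not taken from WARP, XTR, or this repository.

```python
def xtr_impute_scores(retrieved):
    """Score candidate documents from token-level retrieval results only.

    retrieved[i] is the list of (doc_id, score) hits returned for query
    token i (hypothetical format). A document with no hit for token i
    gets that token's minimum retrieved score instead of a true score.
    """
    candidates = {doc for hits in retrieved for doc, _ in hits}
    totals = dict.fromkeys(candidates, 0.0)
    for hits in retrieved:
        if not hits:
            continue  # nothing retrieved for this query token
        floor = min(score for _, score in hits)  # imputed value
        best = {}
        for doc, score in hits:
            # Keep the best-matching retrieved token per document.
            best[doc] = max(score, best.get(doc, score))
        for doc in candidates:
            totals[doc] += best.get(doc, floor)
    n_tokens = len(retrieved)
    # Averaging over query tokens is an assumption of this sketch.
    return {doc: total / n_tokens for doc, total in totals.items()}


hits = [
    [("a", 0.9), ("b", 0.5)],  # query token 0
    [("a", 0.7), ("a", 0.6)],  # query token 1: "b" was not retrieved
]
scores = xtr_impute_scores(hits)
```

In this example document `b` has no hit for query token 1, so it receives 0.6, the lowest score retrieved for that token, in place of the missing value.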

Using this scheme, however, requires an analogous change to the scoring function during training: for a given query token, only the query-document token scores that fall within the top k_train across all in-batch document tokens are kept.
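The training-time masking could look roughly like the sketch below, written in plain Python for readability; a real implementation would presumably use batched tensor operations. The name `xtr_train_scores`, the nested-list input layout, and the normalization by the number of query tokens that kept at least one token of the document (as I recall it from the XTR paper) are assumptions to check against the paper, not this library's API.

```python
def xtr_train_scores(sim, k_train):
    """Score one query against every in-batch document, XTR style.

    sim[i][d][j] is the similarity between query token i and token j of
    in-batch document d (hypothetical layout). For each query token only
    the k_train highest scores across ALL in-batch document tokens
    survive; everything else is treated as not retrieved.
    """
    n_docs = len(sim[0])
    totals = [0.0] * n_docs
    hits = [0] * n_docs  # query tokens that kept >= 1 token of each doc
    for per_doc in sim:
        flat = [(s, d) for d, row in enumerate(per_doc) for s in row]
        top = sorted(flat, key=lambda pair: pair[0], reverse=True)[:k_train]
        best = {}
        for s, d in top:
            # Max over the surviving tokens of each document.
            best[d] = max(s, best.get(d, s))
        for d, s in best.items():
            totals[d] += s
            hits[d] += 1
    # Documents with no surviving token score 0 (assumed convention).
    return [t / h if h else 0.0 for t, h in zip(totals, hits)]


sim = [
    [[0.9, 0.1], [0.8, 0.2]],  # query token 0 vs. doc 0, doc 1
    [[0.3, 0.4], [0.7, 0.6]],  # query token 1 vs. doc 0, doc 1
]
masked = xtr_train_scores(sim, k_train=2)
full = xtr_train_scores(sim, k_train=4)
```

With `k_train=4` every token survives in this toy batch, so the result reduces to an averaged MaxSim score; with `k_train=2` document 0 loses its match for query token 1 and is scored from query token 0 alone.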

The change should be fairly straightforward: mainly one or several xtr_scores functions in scores/scores.py, with the exact number depending on the multiplicity of documents to queries.

I'd like to take this on if that's alright with y'all, @NohTow @raphaelsty. It could be a good prelude to adding WARP support once the PLAID implementation is up and running.

Metadata

Labels

enhancement: New feature or request
