Closed
Labels: bug, good first issue
Description
https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/summarization/bm25.py
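In get_score, the length-normalization term uses "len(document)", which is the length of the document passed in as the query. It should instead use the length of the corpus document at position "index", since the term frequencies self.f[index] come from that document.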
def get_score(self, document, index, average_idf):
    # in this line it should be the length of the index document in the corpus
    score += (idf * self.f[index][word] * (PARAM_K1 + 1)
              / (self.f[index][word] + PARAM_K1 * (1 - PARAM_B + PARAM_B * len(document) / self.avgdl)))
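One possible shape of the fix, as a minimal self-contained sketch rather than gensim's actual code: the class name BM25Sketch, the doc_len attribute, the parameter values, and the always-positive IDF variant are assumptions made for illustration, and the average_idf handling from the original signature is left out for brevity. The point is only that the normalization reads the stored length of the corpus document at index.

```python
import math

PARAM_K1 = 1.5   # assumed values, for illustration only
PARAM_B = 0.75


class BM25Sketch:
    """Illustrative BM25 scorer (hypothetical, not gensim's class)."""

    def __init__(self, corpus):
        self.corpus_size = len(corpus)
        # Hypothetical attribute: length of every corpus document,
        # stored once so get_score can look it up by index.
        self.doc_len = [len(doc) for doc in corpus]
        self.avgdl = sum(self.doc_len) / self.corpus_size
        self.f = []  # per-document term frequencies
        df = {}      # number of documents containing each word
        for doc in corpus:
            freqs = {}
            for word in doc:
                freqs[word] = freqs.get(word, 0) + 1
            self.f.append(freqs)
            for word in freqs:
                df[word] = df.get(word, 0) + 1
        # Assumption: a non-negative IDF variant, used here to keep
        # the sketch short (no average_idf / epsilon fallback).
        self.idf = {
            word: math.log(1 + (self.corpus_size - n + 0.5) / (n + 0.5))
            for word, n in df.items()
        }

    def get_score(self, document, index):
        """Score the query `document` against corpus document `index`."""
        score = 0.0
        doc_len = self.doc_len[index]  # length of the indexed document
        for word in document:
            if word not in self.f[index]:
                continue
            tf = self.f[index][word]
            score += (self.idf[word] * tf * (PARAM_K1 + 1)
                      / (tf + PARAM_K1 * (1 - PARAM_B + PARAM_B * doc_len / self.avgdl)))
        return score
```

With this change the score no longer depends on how long the query is: adding words that do not occur in the indexed document leaves the score unchanged, while a shorter corpus document with the same term frequency scores higher than a longer one.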