Closed
Description
Problem description
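**The query result does not seem correct: the document identical to the query is ranked below the other document, and both scores are negative. The code is self-explanatory. Thank you!**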
Steps/code/corpus to reproduce
from gensim.summarization.bm25 import BM25, get_bm25_weights
text1 = "A constellation is a group of stars that are considered to form imaginary outlines or meaningful patterns on the celestial sphere."
text2 = "The 88 modern constellations are formally defined regions of the sky together covering the entire celestial sphere."
text = [text1, text2]
corpus = [text1.split(" "), text2.split(" ")]
print(f'corpus: {corpus}')
query = text2.split(" ")
bm25 = BM25(corpus)
scores = bm25.get_scores(query)
scores = [(s, i) for i, s in enumerate(scores)]
scores.sort(key=lambda t: t[0], reverse=True)
print(f'scores: {scores}')
for s, idx in scores:
    print(f'{s}\t{idx}: {text[idx]}')
Output:
-0.3601521710456333 0: A constellation is a group of stars that are considered to form imaginary outlines or meaningful patterns on the celestial sphere.
-0.44989406787023367 1: The 88 modern constellations are formally defined regions of the sky together covering the entire celestial sphere.
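A possible explanation for the negative scores, sketched below without gensim: under the classic Okapi BM25 IDF, log((N - n + 0.5) / (n + 0.5)), a term that occurs in more than half of the documents gets a negative weight. With only two documents, every term shared by both gets log(0.2) and every other term gets exactly 0, so no score can be positive. That gensim 3.8.1 uses exactly this formula is an assumption here, and the helper name `okapi_idf` is hypothetical.

```python
import math
from collections import Counter

text1 = "A constellation is a group of stars that are considered to form imaginary outlines or meaningful patterns on the celestial sphere."
text2 = "The 88 modern constellations are formally defined regions of the sky together covering the entire celestial sphere."
corpus = [text1.split(" "), text2.split(" ")]


def okapi_idf(num_docs, doc_freq):
    """Classic Okapi BM25 IDF (assumed formula, not taken from gensim's source)."""
    return math.log((num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


# Count in how many documents each term occurs (document frequency).
doc_freq = Counter()
for doc in corpus:
    doc_freq.update(set(doc))

idf = {term: okapi_idf(len(corpus), df) for term, df in doc_freq.items()}

# Terms present in both documents are the only ones with a non-zero IDF here.
shared = sorted(term for term, df in doc_freq.items() if df == len(corpus))
print(f"shared terms: {shared}")
print(f"IDF of a shared term: {idf['the']}")
print(f"IDF of a term in one document only: {idf['sky']}")
```

If this assumption holds, the query (identical to `text2`) matches the shared, negatively weighted terms more heavily in document 1 than in document 0, which would be consistent with document 1 receiving the lower score. A larger corpus, where most terms occur in fewer than half of the documents, would not show this effect to the same degree.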
Versions
Please provide the output of:
import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import gensim; print("gensim", gensim.__version__)
from gensim.models import word2vec; print("FAST_VERSION", word2vec.FAST_VERSION)
Output:
macOS-10.14.6-x86_64-i386-64bit
Python 3.8.0 (default, Nov 6 2019, 15:49:01)
[Clang 4.0.1 (tags/RELEASE_401/final)]
NumPy 1.17.4
SciPy 1.3.3
gensim 3.8.1
FAST_VERSION 0