MmCorpus file-like object support bug #1869

@menshikh-iv

Intro
MmCorpus behaves in a "weird" way when a user passes it a file-like object, based on this mailing list thread.

Demonstration

from gensim.corpora import MmCorpus
import bz2

f = bz2.BZ2File("testcorpus.mm.bz2")
print(f.closed)  # 0 (falsy): the file is still open
corpus = MmCorpus(f)
print(f.closed)  # 1 (truthy): MmCorpus closed the caller's file ???

What happens
The file-like object is closed when MmReader is called. The problem is located here:

https://github.com/RaRe-Technologies/gensim/blob/5342153eb4f4b02bb45bfa3951eef8250ac9f6b6/gensim/matutils.py#L1274

The with statement automatically closes the file-like object when execution leaves its scope. That is fine when we opened the file ourselves, but we should not close a file-like object passed in by the user.
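
One possible way to avoid this is sketched below, using only the standard library. The helper name borrow_or_open and the function read_header are hypothetical and are not part of gensim; the sketch only illustrates the ownership rule described above (close what you opened, leave open what you were given) and is not the fix adopted in the library.

```python
import io
import os
from contextlib import contextmanager


@contextmanager
def borrow_or_open(source, mode="rb"):
    """Yield a readable file object; close it only if it was opened here.

    Hypothetical helper for illustration, not part of gensim's API.
    """
    if isinstance(source, (str, os.PathLike)):
        # We open the file, so we own the handle and may close it on exit.
        with open(source, mode) as fin:
            yield fin
    else:
        # The caller owns this object: use it, but leave it open.
        yield source


def read_header(source):
    """Return the first line of the input without closing borrowed objects."""
    with borrow_or_open(source) as fin:
        return fin.readline()


# A borrowed in-memory stream stays open after reading.
stream = io.BytesIO(b"%%MatrixMarket matrix coordinate real general\n3 3 2\n")
header = read_header(stream)
borrowed_closed = stream.closed

# For contrast: using the object directly in a with statement closes it,
# which is the behavior reported in this issue.
other = io.BytesIO(b"%%MatrixMarket matrix coordinate real general\n3 3 2\n")
with other as fin:
    fin.readline()
direct_closed = other.closed
```

With this pattern the caller can keep reading from or seeking in the stream after the call, and remains responsible for closing it.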

Related PR #1867

UPD: there is another problem here: the call to IndexedCorpus.__init__, which does not support file-like objects at all.

Labels

bug (issue describes a bug), difficulty medium (medium issue: requires good gensim understanding and Python skills)