
Commit 57eb1cd

menshikh-iv and KMarie1 authored and committed
Add tox and pytest to gensim, integration with Travis and Appveyor. Fix piskvorky#1613, 1644 (piskvorky#1721)
* remove flake8 config from setup.cfg
* create distinct test_env for win
* ignore stuff from tox
* basic tox config
* add global env vars for full test run
* force-recreate for envs
* show top20 slowest tests
* add upload/download wheels/docs
* fix E501 [1]
* fix E501 [2]
* fix E501 [3]
* fix E501 [4]
* fix E501 [5]
* fix E501 [6]
* travis + tox
* Install tox for travis
* simplify travis file
* more verbosity with tox
* Fix numpy scipy versions
* Try to avoid pip install hang
* Fix tox
* Add build_ext
* Fix dtm test
* remove install/run sh
* Fix imports & indentation
* remove flake-diff
* Add docs building to Travis
* join flake8 and docs to one job
* add re-run for failed tests (to avoid FP) + calculate code coverage
* fix WR segfault (veeeery buggy implementation)
* attempt to make multiOS configuration
* fix mistake with cython
* Try to fix appveyor wheels problem
* Remove commented parts & add cache for travis
1 parent 76983da commit 57eb1cd

Note: large commits have some file diffs hidden by default, so only a subset of the changed files appears below.

81 files changed: +1070 −601 lines
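The tox.ini added by this commit is among the hidden files, so its exact contents don't appear below. As a rough, hypothetical sketch only (not the committed file), a config consistent with the env names the CI files below reference (flake8, docs, py27-linux, py27-win, and so on) might look like:

# Hypothetical sketch, not the committed tox.ini -- env names taken from the
# TOXENV values in the .travis.yml and appveyor.yml diffs below.
[tox]
envlist = {py27,py35,py36}-{linux,win}, flake8, docs

[testenv]
# the commit message says the suite now runs under pytest
deps = pytest
commands = pytest {posargs:gensim}

[testenv:flake8]
deps = flake8
commands = flake8 gensim

[testenv:docs]
# the exact doc-build commands in the real file are not shown in this view
deps = sphinx
commands = sphinx-build docs/src docs/_build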

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -40,6 +40,8 @@ Thumbs.db
 
 # Other #
 #########
+.tox/
+.cache/
 .project
 .pydevproject
 .ropeproject

.travis.yml

Lines changed: 13 additions & 7 deletions
@@ -5,18 +5,24 @@ cache:
   directories:
   - $HOME/.cache/pip
   - $HOME/.ccache
-
+  - $HOME/.pip-cache
 dist: trusty
 language: python
 
 
 matrix:
   include:
-    - env: PYTHON_VERSION="2.7" NUMPY_VERSION="1.11.3" SCIPY_VERSION="0.18.1" ONLY_CODESTYLE="yes"
-    - env: PYTHON_VERSION="2.7" NUMPY_VERSION="1.11.3" SCIPY_VERSION="0.18.1" ONLY_CODESTYLE="no"
-    - env: PYTHON_VERSION="3.5" NUMPY_VERSION="1.11.3" SCIPY_VERSION="0.18.1" ONLY_CODESTYLE="no"
-    - env: PYTHON_VERSION="3.6" NUMPY_VERSION="1.11.3" SCIPY_VERSION="0.18.1" ONLY_CODESTYLE="no"
+    - python: '2.7'
+      env: TOXENV="flake8, docs"
+
+    - python: '2.7'
+      env: TOXENV="py27-linux"
+
+    - python: '3.5'
+      env: TOXENV="py35-linux"
 
+    - python: '3.6'
+      env: TOXENV="py36-linux"
 
-install: source continuous_integration/travis/install.sh
-script: bash continuous_integration/travis/run.sh
+install: pip install tox
+script: tox -vv
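With this layout each Travis job installs nothing but tox and delegates the rest to it; TOXENV selects which environment(s) a job runs, e.g. the first job runs both flake8 and docs (the "join flake8 and docs to one job" item in the commit message). Assuming the committed tox.ini defines these envs, the same run can be reproduced locally with `tox -e py36-linux -vv`.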

appveyor.yml

Lines changed: 5 additions & 46 deletions
@@ -13,29 +13,20 @@ environment:
     secure: qXqY3dFmLOqvxa3Om2gQi/BjotTOK+EP2IPLolBNo0c61yDtNWxbmE4wH3up72Be
 
   matrix:
-    # - PYTHON: "C:\\Python27"
-    #   PYTHON_VERSION: "2.7.12"
-    #   PYTHON_ARCH: "32"
-
     - PYTHON: "C:\\Python27-x64"
       PYTHON_VERSION: "2.7.12"
       PYTHON_ARCH: "64"
-
-    # - PYTHON: "C:\\Python35"
-    #   PYTHON_VERSION: "3.5.2"
-    #   PYTHON_ARCH: "32"
+      TOXENV: "py27-win"
 
     - PYTHON: "C:\\Python35-x64"
       PYTHON_VERSION: "3.5.2"
       PYTHON_ARCH: "64"
-
-    # - PYTHON: "C:\\Python36"
-    #   PYTHON_VERSION: "3.6.0"
-    #   PYTHON_ARCH: "32"
+      TOXENV: "py35-win"
 
     - PYTHON: "C:\\Python36-x64"
       PYTHON_VERSION: "3.6.0"
       PYTHON_ARCH: "64"
+      TOXENV: "py36-win"
 
 init:
   - "ECHO %PYTHON% %PYTHON_VERSION% %PYTHON_ARCH%"

@@ -57,48 +48,16 @@ install:
   # not already installed.
   - "powershell ./continuous_integration/appveyor/install.ps1"
   - "SET PATH=%PYTHON%;%PYTHON%\\Scripts;%PATH%"
-  - "python -m pip install -U pip"
+  - "python -m pip install -U pip tox"
 
   # Check that we have the expected version and architecture for Python
   - "python --version"
   - "python -c \"import struct; print(struct.calcsize('P') * 8)\""
 
-  # Install the build and runtime dependencies of the project.
-  - "%CMD_IN_ENV% pip install --timeout=60 --trusted-host 28daf2247a33ed269873-7b1aad3fab3cc330e1fd9d109892382a.r6.cf2.rackcdn.com -r continuous_integration/appveyor/requirements.txt"
-  - "%CMD_IN_ENV% python setup.py bdist_wheel bdist_wininst"
-  - ps: "ls dist"
-
-  # Install the genreated wheel package to test it
-  - "pip install --pre --no-index --find-links dist/ gensim"
-
-# Not a .NET project, we build scikit-learn in the install step instead
 build: false
 
 test_script:
-  # Change to a non-source folder to make sure we run the tests on the
-  # installed library.
-  - "mkdir empty_folder"
-  - "cd empty_folder"
-  - "pip install pyemd testfixtures sklearn Morfessor==2.0.2a4"
-  - "pip freeze"
-  - "python -c \"import nose; nose.main()\" -s -v gensim"
-  # Move back to the project folder
-  - "cd .."
-
-artifacts:
-  # Archive the generated wheel package in the ci.appveyor.com build report.
-  - path: dist\*
-on_success:
-  # Upload the generated wheel package to Rackspace
-  # On Windows, Apache Libcloud cannot find a standard CA cert bundle so we
-  # disable the ssl checks.
-  - "python -m wheelhouse_uploader upload --no-ssl-check --local-folder=dist gensim-windows-wheels"
-
-notifications:
-  - provider: Webhook
-    url: https://webhooks.gitter.im/e/62c44ad26933cd7ed7e8
-    on_build_success: false
-    on_build_failure: True
+  - tox -vv
 
 cache:
   # Use the appveyor cache to avoid re-downloading large archives such

continuous_integration/travis/flake8_diff.sh

Lines changed: 0 additions & 159 deletions
This file was deleted.

continuous_integration/travis/install.sh

Lines changed: 0 additions & 13 deletions
This file was deleted.

continuous_integration/travis/run.sh

Lines changed: 0 additions & 11 deletions
This file was deleted.

gensim/corpora/indexedcorpus.py

Lines changed: 5 additions & 2 deletions
@@ -56,7 +56,8 @@ def __init__(self, fname, index_fname=None):
         self.length = None
 
     @classmethod
-    def serialize(serializer, fname, corpus, id2word=None, index_fname=None, progress_cnt=None, labels=None, metadata=False):
+    def serialize(serializer, fname, corpus, id2word=None, index_fname=None,
+                  progress_cnt=None, labels=None, metadata=False):
         """
         Iterate through the document stream `corpus`, saving the documents to `fname`
         and recording byte offset of each document. Save the resulting index

@@ -93,7 +94,9 @@ def serialize(serializer, fname, corpus, id2word=None, index_fname=None, progres
         offsets = serializer.save_corpus(fname, corpus, id2word, **kwargs)
 
         if offsets is None:
-            raise NotImplementedError("Called serialize on class %s which doesn't support indexing!" % serializer.__name__)
+            raise NotImplementedError(
+                "Called serialize on class %s which doesn't support indexing!" % serializer.__name__
+            )
 
         # store offsets persistently, using pickle
         # we shouldn't have to worry about self.index being a numpy.ndarray as the serializer will return
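The rewrapped `serialize` above is the entry point gensim's indexed corpus formats inherit; a minimal usage sketch (file path hypothetical) of the documented behavior, saving documents plus a byte-offset index:

# Minimal sketch: MmCorpus inherits IndexedCorpus.serialize.
from gensim.corpora import MmCorpus

corpus = [[(0, 1.0), (1, 2.0)], [(1, 1.0)]]   # two bag-of-words documents
MmCorpus.serialize('/tmp/corpus.mm', corpus)  # also writes /tmp/corpus.mm.index

loaded = MmCorpus('/tmp/corpus.mm')
print(loaded[1])  # the stored index makes random access possible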

gensim/corpora/lowcorpus.py

Lines changed: 2 additions & 1 deletion
@@ -77,7 +77,8 @@ def __init__(self, fname, id2word=None, line2words=split_on_space):
             for doc in self:
                 all_terms.update(word for word, wordCnt in doc)
             all_terms = sorted(all_terms)  # sort the list of all words; rank in that list = word's integer id
-            self.id2word = dict(izip(xrange(len(all_terms)), all_terms))  # build a mapping of word id(int) -> word (string)
+            # build a mapping of word id(int) -> word (string)
+            self.id2word = dict(izip(xrange(len(all_terms)), all_terms))
         else:
             logger.info("using provided word mapping (%i ids)", len(id2word))
             self.id2word = id2word

gensim/corpora/sharded_corpus.py

Lines changed: 14 additions & 4 deletions
@@ -456,7 +456,10 @@ def resize_shards(self, shardsize):
             for old_shard_n, old_shard_name in enumerate(old_shard_names):
                 os.remove(old_shard_name)
         except Exception as e:
-            logger.error('Exception occurred during old shard no. %d removal: %s.\nAttempting to at least move new shards in.', old_shard_n, str(e))
+            logger.error(
+                'Exception occurred during old shard no. %d removal: %s.\nAttempting to at least move new shards in.',
+                old_shard_n, str(e)
+            )
         finally:
             # If something happens with cleaning up - try to at least get the
             # new guys in.

@@ -673,7 +676,10 @@ def __add_to_slice(self, s_result, result_start, result_stop, start, stop):
         Returns the resulting s_result.
         """
         if (result_stop - result_start) != (stop - start):
-            raise ValueError('Result start/stop range different than stop/start range (%d - %d vs. %d - %d)'.format(result_start, result_stop, start, stop))
+            raise ValueError(
+                'Result start/stop range different than stop/start range (%d - %d vs. %d - %d)'
+                .format(result_start, result_stop, start, stop)
+            )
 
         # Dense data: just copy using numpy's slice notation
         if not self.sparse_serialization:

@@ -685,7 +691,10 @@ def __add_to_slice(self, s_result, result_start, result_stop, start, stop):
         # result.
         else:
             if s_result.shape != (result_start, self.dim):
-                raise ValueError('Assuption about sparse s_result shape invalid: {0} expected rows, {1} real rows.'.format(result_start, s_result.shape[0]))
+                raise ValueError(
+                    'Assuption about sparse s_result shape invalid: {0} expected rows, {1} real rows.'
+                    .format(result_start, s_result.shape[0])
+                )
 
             tmp_matrix = self.current_shard[start:stop]
             s_result = sparse.vstack([s_result, tmp_matrix])

@@ -786,7 +795,8 @@ def save_corpus(fname, corpus, id2word=None, progress_cnt=1000, metadata=False,
         ShardedCorpus(fname, corpus, **kwargs)
 
     @classmethod
-    def serialize(serializer, fname, corpus, id2word=None, index_fname=None, progress_cnt=None, labels=None, metadata=False, **kwargs):
+    def serialize(serializer, fname, corpus, id2word=None, index_fname=None, progress_cnt=None,
+                  labels=None, metadata=False, **kwargs):
         """
         Iterate through the document stream `corpus`, saving the documents
         as a ShardedCorpus to `fname`.

gensim/corpora/svmlightcorpus.py

Lines changed: 2 additions & 1 deletion
@@ -119,7 +119,8 @@ def line2doc(self, line):
         if not parts:
             raise ValueError('invalid line format in %s' % self.fname)
         target, fields = parts[0], [part.rsplit(':', 1) for part in parts[1:]]
-        doc = [(int(p1) - 1, float(p2)) for p1, p2 in fields if p1 != 'qid']  # ignore 'qid' features, convert 1-based feature ids to 0-based
+        # ignore 'qid' features, convert 1-based feature ids to 0-based
+        doc = [(int(p1) - 1, float(p2)) for p1, p2 in fields if p1 != 'qid']
         return doc, target
 
     @staticmethod
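The `line2doc` change above only moves the trailing comment onto its own line; for reference, a self-contained sketch of the same parsing logic and what it yields on a sample SVMlight line:

# Standalone sketch of the line2doc transformation (simplified: no file
# handle, no comment stripping).
def line2doc(line):
    parts = line.split()
    target, fields = parts[0], [part.rsplit(':', 1) for part in parts[1:]]
    # ignore 'qid' features, convert 1-based feature ids to 0-based
    doc = [(int(p1) - 1, float(p2)) for p1, p2 in fields if p1 != 'qid']
    return doc, target

print(line2doc("1 qid:3 1:0.5 4:2.0"))  # ([(0, 0.5), (3, 2.0)], '1')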
