Release 4.0.0beta #2993

Merged Nov 1, 2020 (230 commits)
Commits
f89808d
Merge branch 'master' into develop
mpenkov Sep 23, 2019
2fac325
added release/check_wheels.py (#2610)
mpenkov Sep 29, 2019
26f1e81
Add hacktoberfest-related documentation (#2616)
mpenkov Oct 2, 2019
25f8a42
Fixed #2554 (#2619)
SanthoshBala18 Oct 4, 2019
2131e3a
Properly install Pattern library for documentation build (#2626)
Hiyorimi Oct 8, 2019
a7713aa
Disable Py2.7 builds under Travis, CircleCI and AppVeyor (#2601)
mpenkov Oct 10, 2019
289a6ca
Handling for iterables without 0-th element, fixes #2556 (#2629)
Hiyorimi Oct 10, 2019
3e027c2
Move Py2 deprecation warning to top of changelog (#2627)
mpenkov Oct 11, 2019
e102574
Change find_interlinks return type to list of tuples (#2636)
napsternxg Oct 19, 2019
bcee414
Improve gensim documentation (numfocus) (#2591)
mpenkov Oct 21, 2019
86ed0d8
fix setup.py to get documentation to build under CircleCI (#2650)
mpenkov Oct 24, 2019
e228a93
Fix links to documentation in README.md (#2646)
mpenkov Oct 24, 2019
1894339
Delete requirements.txt (#2648)
mpenkov Oct 24, 2019
e859c11
Remove native Python implementations of Cython extensions (#2630)
mpenkov Oct 25, 2019
34ee98b
replacing deleted notebooks with placeholders (#2654)
mpenkov Oct 29, 2019
ee61691
Document accessing model's vocabulary (#2661)
mpenkov Nov 1, 2019
44ea793
Improve explanation of top_chain_var parameter in Dynamic Topic Model…
joelowj Nov 3, 2019
3d65961
Comment out Hacktober Fest from README (#2677)
piskvorky Nov 11, 2019
f72a55d
Update word2vec2tensor.py (#2678)
kaddynator Nov 18, 2019
1052b9b
Speed up word2vec model loading (#2671)
lopusz Nov 18, 2019
e7c9f0e
Fix local import degrading the performance of word2vec model loading …
lopusz Nov 21, 2019
e391f0c
[Issue-2670] Bug fix: Initialize doc_no2 because it is not set when c…
paulrigor Nov 23, 2019
de0dcc3
Warn when BM25.average_idf < 0 (#2687)
Witiko Dec 2, 2019
36ae46f
Rerun Soft Cosine Measure tutorial notebook (#2691)
Witiko Dec 21, 2019
cc8188c
Fix simple typo: voacab -> vocab (#2719)
timgates42 Jan 1, 2020
12897cb
Fix appveyor builds (#2706)
mpenkov Jan 1, 2020
74a375d
Change similarity strategy when finding n best (#2720)
svinkapeppa Jan 5, 2020
f022028
Initialize self.cfs in Dictionary.compatify method (#2618)
SanthoshBala18 Jan 5, 2020
3d129de
Fix ValueError when instantiating SparseTermSimilarityMatrix (#2689)
ptorrestr Jan 6, 2020
3abcb9f
Refactor bm25 to include model parametrization (cont.) (#2722)
Witiko Jan 8, 2020
fbc7d09
Fix overflow error for `*Vec` corpusfile-based training (#2700)
persiyanov Jan 8, 2020
4d22327
Implement saving to Facebook format (#2712)
lopusz Jan 23, 2020
4710308
Use time.time instead of time.clock in gensim/models/hdpmodel.py (#2730)
tarohi24 Jan 23, 2020
d05259a
better replacement of deprecated .clock()
gojomo Jan 27, 2020
b8346c1
drop py35, add py38 (travis), update explicit dependency versions
gojomo Jan 27, 2020
f5e05d0
better CI logs w/ gdb after core dump
gojomo Jan 27, 2020
0e624c1
improved comments via piskvorky review
gojomo Jan 27, 2020
9352dad
Merge pull request #2715 from gojomo/py38-plus-build-tuning
gojomo Jan 28, 2020
47a0675
rm autogenerated *.cpp files that shouldn't be in source control
gojomo Jan 29, 2020
8d79794
Fix TypeError when using the -m flag (#2734)
Tenoke Jan 30, 2020
b92e087
del cython.sh
gojomo Jan 31, 2020
68ec5b8
Merge pull request #2739 from gojomo/rm-cpp-files
gojomo Feb 24, 2020
0d75f2d
Improve documentation in run_similarity_queries example (#2770)
MartinoMensio Mar 21, 2020
cb3d87c
Fix fastText word_vec() for OOV words with use_norm=True (#2764)
avidale Mar 21, 2020
493e52f
remove mention of py27 (#2751)
mattf Mar 21, 2020
30ca5b3
Fix KeyedVectors.add matrix type (#2761)
menshikh-iv Mar 21, 2020
f767e1e
use collections.abc for Mapping (#2750)
mattf Mar 21, 2020
1b3ad81
Fix out of range issue in gensim.summarization.keywords (#2738)
carterols Mar 21, 2020
a811a23
fixed get_keras_embedding, now accepts word mapping (#2676)
Hamekoded Mar 21, 2020
8a2e2a7
Add downloads badge to README
piskvorky Mar 22, 2020
de0ef26
Get rid of "wheels" badge
piskvorky Mar 22, 2020
a4894bb
link downloads badge to pepy instead of pypi
piskvorky Mar 23, 2020
d952a51
fix broken english in tests (#2773)
piskvorky Mar 23, 2020
ec222e8
fix build, use KeyedVectors class (#2774)
mpenkov Mar 24, 2020
a6247af
cElementTree has been deprecated since Python 3.3 and removed in Pyth…
tirkarthi Mar 30, 2020
a2ec4c3
Fix FastText RAM usage in tests (+ fixes for wheel building) (#2791)
menshikh-iv Apr 13, 2020
10cec93
Fix typo in comments\nThe rows of the corpus are actually documents, …
Chenxin-Guo96 Apr 17, 2020
5b5b545
Add osx+py38 case for avoid multiprocessing issue (#2800)
menshikh-iv Apr 20, 2020
7f194c9
Use nicer twitter badge
piskvorky Apr 22, 2020
db11c14
Use downloads badge from shields.io
piskvorky Apr 22, 2020
188a590
Use blue in badges
piskvorky Apr 22, 2020
63dc990
Remove conda-forge badge
piskvorky Apr 22, 2020
8791bb7
Make twitter badge blue, too
piskvorky Apr 22, 2020
2a04825
Merge branch 'develop' into piskvorky-patch-1
piskvorky Apr 22, 2020
ca726c6
Merge pull request #2772 from RaRe-Technologies/piskvorky-patch-1
piskvorky Apr 22, 2020
68bd860
Cache badges
piskvorky Apr 23, 2020
fd3537a
Use HTML comments instead of Markdown comment
piskvorky Apr 23, 2020
d70b129
Merge pull request #2806 from RaRe-Technologies/piskvorky-patch-1
piskvorky Apr 24, 2020
585b0c0
Merge branch 'develop' into fix-xml
piskvorky Apr 24, 2020
47357de
Merge pull request #2799 from Chenxin-Guo/develop
piskvorky Apr 24, 2020
996801b
Merge pull request #2777 from tirkarthi/fix-xml
piskvorky Apr 24, 2020
29d1092
[MRG] Update README instructions + clean up testing (#2814)
piskvorky May 1, 2020
ace6c34
Add basic yml file for setup pipeline (will fail)
menshikh-iv May 4, 2020
b3b844e
revert back travis
menshikh-iv May 4, 2020
93385d3
Replace AppVeyor by Azure Pipelines (#2824)
menshikh-iv May 6, 2020
d692b9d
Update CHANGELOG.md (#2829)
mpenkov May 7, 2020
0027fb5
Update CHANGELOG.md (#2831)
mpenkov May 9, 2020
ceecef3
Fix-2253: Remove docker folder since it fails to build (#2833)
FyzHsn May 14, 2020
69732eb
LdaModel documentation update -remove claim that it accepts CSC matri…
FyzHsn May 14, 2020
2360459
delete .gitattributes (#2836)
gojomo May 14, 2020
e75f6c8
Fix for Python 3.9/3.10: remove xml.etree.cElementTree (#2846)
hugovk May 24, 2020
8149035
Correct grammar in docs (#2573)
shivdhar Jun 10, 2020
374de28
Don't proxy-cache badges with Google Images (#2854)
piskvorky Jun 15, 2020
42be086
pin keras=2.3.1 because 2.4.3 causes KerasWord2VecWrappper test failu…
gojomo Jun 27, 2020
a74f8e3
Expose max_final_vocab parameter in FastText constructor (#2867)
mpenkov Jun 27, 2020
c888b7a
Replace numpy.random.RandomState with SFC64 - for speed (#2864)
zygm0nt Jun 29, 2020
fff82aa
Update CHANGELOG.md
mpenkov Jun 29, 2020
1228ebe
Clarify that license is LGPL-2.1 (#2871)
pombredanne Jul 18, 2020
78e48b7
Fix travis issues for latest keras versions. (#2869)
dsandeep0138 Jul 18, 2020
4cdf228
Put cell outputs back to the soft cosine measure benchmark notebook (…
Witiko Jul 18, 2020
c0e0169
KeyedVectors & *2Vec API streamlining, consistency (#2698)
gojomo Jul 19, 2020
30af573
Delete .gitattributes
gojomo Jul 21, 2020
5c08d3e
Merge remote-tracking branch 'upstream/develop' into develop
gojomo Jul 21, 2020
3f7047f
test showing FT failure as W2V
gojomo Jul 22, 2020
ac9126d
set .vectors even when ngrams off
gojomo Jul 22, 2020
0316084
use _save_specials/_load_specials per type
gojomo Jul 22, 2020
03c8bb9
Make docs clearer on `alpha` parameter in LDA model
xh2 Jul 24, 2020
7791b74
Merge pull request #1 from xh2/patch-1
xh2 Jul 24, 2020
4e1b09c
Update Hoffman paper link
xh2 Jul 24, 2020
25005c5
rm whitespace
gojomo Jul 26, 2020
f34956c
Update gensim/models/ldamodel.py
piskvorky Jul 26, 2020
7d0ef9e
Update gensim/models/ldamodel.py
piskvorky Jul 26, 2020
a662e8d
Merge pull request #2896 from xh2/bugfix/lda-doc-alpha
piskvorky Jul 26, 2020
78778a9
Update gensim/models/ldamodel.py
piskvorky Jul 26, 2020
344c4ab
Merge pull request #2897 from xh2/bugfix/hoffman-paper-link
piskvorky Jul 26, 2020
b70c826
re-applying changes from #2821
piskvorky Jul 26, 2020
a81e547
migrating + regenerating changed docs
piskvorky Jul 26, 2020
78fe1c4
fix forgotten iteritems
piskvorky Jul 26, 2020
a0e40ca
remove extra `model.wv`
piskvorky Jul 26, 2020
4cf4da0
split overlong doc line
piskvorky Jul 26, 2020
161ad55
get rid of six in doc2vec
piskvorky Jul 27, 2020
31d2b87
increase test timeout for Visdom server
piskvorky Jul 27, 2020
bc95bcb
add 32/64 bits report
gojomo Jul 29, 2020
c834e06
add deprecations for init_sims()
piskvorky Jul 30, 2020
172e37f
remove vectors_norm + add link to migration guide to deprecation warn…
piskvorky Jul 30, 2020
3919b68
rename vectors_norm everywhere, update tests, regen docs
piskvorky Jul 30, 2020
d40f685
put back no-op property setter of deprecated vectors_norm
piskvorky Jul 30, 2020
872c8ed
fix typo
piskvorky Jul 30, 2020
4c1b3f7
fix flake8
piskvorky Jul 30, 2020
b39eec2
disable Keras tests
piskvorky Jul 30, 2020
d5556ea
Merge pull request #2899 from RaRe-Technologies/pr2821
piskvorky Jul 30, 2020
f2fd045
test showing FT failure as W2V
gojomo Jul 22, 2020
7ab1501
set .vectors even when ngrams off
gojomo Jul 22, 2020
ce16168
Update gensim/test/test_fasttext.py
piskvorky Jul 26, 2020
779fe46
Update gensim/test/test_fasttext.py
piskvorky Jul 26, 2020
9289c3b
refresh docs for run_annoy tutorial
piskvorky Aug 3, 2020
4b7e372
Merge pull request #2910 from RaRe-Technologies/rerun_tutorial
piskvorky Aug 3, 2020
b308883
Reduce memory use of the term similarity matrix constructor, deprecat…
Witiko Aug 7, 2020
28a2110
Fix doc2vec crash for large sets of doc-vectors (#2907)
gojomo Aug 17, 2020
817cac9
Fix AttributeError in WikiCorpus (#2901)
jenishah Aug 17, 2020
e9bb3a7
Corrected info about elements of the job queue
lunastera Sep 2, 2020
320cacd
Add unused args of `_update_alpha`
lunastera Sep 2, 2020
fc4b97f
intensify cbow+hs tests; bulk testing method
gojomo Sep 2, 2020
030e650
use increment operator
gojomo Sep 2, 2020
6e0d00b
Change num_words to topn in dtm_coherence (#2926)
MeganStodel Sep 3, 2020
63f977a
Integrate what is essentially the same process
lunastera Sep 4, 2020
d524fa4
Merge branch 'develop' into 2vec_saveload_fixes
piskvorky Sep 7, 2020
49b35b7
docstirng fixes
piskvorky Sep 7, 2020
9cd72f5
Merge pull request #2931 from lunastera/w2v_fix_jobqueue-info
piskvorky Sep 8, 2020
3f972a6
get rid of python2 constructs
piskvorky Sep 8, 2020
bb947b3
Remove Keras dependency (#2937)
piskvorky Sep 10, 2020
4331ccf
code style fixes while debugging pickle model sizes
piskvorky Sep 13, 2020
34e77dc
Merge branch 'pickle_perambulations' into 2vec_saveload_fixes
piskvorky Sep 13, 2020
012d598
py2 to 3: get rid of forgotten range
piskvorky Sep 13, 2020
eefe9ab
fix docs
piskvorky Sep 13, 2020
1a9b646
get rid of numpy.str_
piskvorky Sep 14, 2020
09b7e94
Fix deprecations in SoftCosineSimilarity (#2940)
Witiko Sep 16, 2020
cddf3c1
Fix "generator" language in word2vec docs (#2935)
polm Sep 16, 2020
08a61e5
Bump minimum Python version to 3.6 (#2947)
gojomo Sep 17, 2020
c14456d
Merge remote-tracking branch 'origin/develop' into 2vec_saveload_fixes
piskvorky Sep 19, 2020
06aef75
fix index2entity, fix docs, hard-fail deprecated properties
piskvorky Sep 19, 2020
5e21560
fix typos + more doc fixes + fix failing tests
piskvorky Sep 19, 2020
51cae68
more index2word => index_to_key fixes
piskvorky Sep 19, 2020
17da21e
finish method renaming
piskvorky Sep 19, 2020
f0cade1
Update gensim/models/word2vec.py
piskvorky Sep 19, 2020
6fa5a1b
a few more style fixes
piskvorky Sep 19, 2020
e95ac0a
fix nonsensical word2vec path examples
piskvorky Sep 20, 2020
dc9c3fc
more doc fixes
piskvorky Sep 20, 2020
da8847a
`it` => `itertools`, + code style fixes
piskvorky Sep 24, 2020
c6c24ea
Merge pull request #2939 from RaRe-Technologies/2vec_saveload_fixes
piskvorky Sep 24, 2020
e210f73
Refactor ldamulticore to serialize less data (#2300)
horpto Sep 26, 2020
f0788ad
new docs theme
dvorakvaclav Sep 23, 2020
0f64151
redo copy on web index page
piskvorky Sep 26, 2020
9ddf9a2
fix docs in KeyedVectors
piskvorky Sep 27, 2020
de66bb1
clean up docs structure
piskvorky Sep 28, 2020
65294ec
hopepage header update, social panel and new favicon
dvorakvaclav Sep 29, 2020
469abd7
fix flake8
piskvorky Sep 29, 2020
17f884d
reduce space under code section
piskvorky Sep 29, 2020
e2727c6
Merge pull request #2954 from friendlystudio/new_docs_theme
piskvorky Sep 30, 2020
156c5c0
fix images in core tutorials
piskvorky Sep 30, 2020
0c0f358
Merge remote-tracking branch 'origin/develop' into migrate_tutorials
piskvorky Sep 30, 2020
502b654
WIP: migrating tutorials to 4.0
piskvorky Sep 30, 2020
fd6b408
fix doc2vec tutorial FIXMEs
piskvorky Sep 30, 2020
70d4338
add autogenerated docs
piskvorky Oct 1, 2020
0936e45
fixing flake8 errors
piskvorky Oct 1, 2020
683cebe
Merge pull request #2968 from RaRe-Technologies/migrate_tutorials
piskvorky Oct 1, 2020
2dcaaf8
remove gensim.summarization subpackage, docs and test data (#2958)
mpenkov Oct 3, 2020
8874de1
reuse from test.utils
gojomo Sep 12, 2020
baee8e7
test re-saving-native-FT after update-vocab (#2853)
gojomo Sep 12, 2020
4ca5b78
avoid buggy shared list use (#2943)
gojomo Sep 12, 2020
eab3302
pre-assert save_facebook_model anomaly
gojomo Sep 13, 2020
eba73da
unittest.skipIf instead of pytest.skipIf
gojomo Sep 13, 2020
8e9d202
refactor init/update vectors/vectors_vocab; bulk randomization
gojomo Sep 13, 2020
81b9d14
unify/correct Word2Vec & FastText corpus/train parameter checking
gojomo Sep 14, 2020
bcf4f1e
suggestions from code review
gojomo Sep 15, 2020
a51818b
improve train() corpus_iterable parameter doc-comment
gojomo Sep 16, 2020
8687e7f
disable pytest-rerunfailures due to https://github.com/pytest-dev/pyt…
gojomo Sep 28, 2020
dda970e
comment clarity from review
gojomo Oct 6, 2020
e090400
specify dtype to avoid interim float64
gojomo Oct 6, 2020
1edbb4c
use inefficient-but-all-tests-pass 'uniform' for now, w/ big FIXME co…
gojomo Oct 6, 2020
b40c601
refactor phrases
piskvorky Oct 8, 2020
02354cd
float32 random; diversified dv seed; disable bad test
gojomo Oct 8, 2020
b2a5a0d
double-backticks
gojomo Oct 10, 2020
1c59aad
inline seed diversifier; unittest.skip
gojomo Oct 10, 2020
6f4053b
fix phrases tests
piskvorky Oct 10, 2020
092b512
clean up rendered docs for phrases
piskvorky Oct 10, 2020
1acb47c
fix sklearn_api.phrases tests + docs
piskvorky Oct 10, 2020
aaa79dd
fix flake8 warnings in docstrings
piskvorky Oct 10, 2020
0596dbd
Merge pull request #2976 from RaRe-Technologies/fix_phrases
piskvorky Oct 10, 2020
8166081
rename export_phrases to find_phrases + add actual export_phrases
piskvorky Oct 10, 2020
4879e52
skip common english words by default in phrases
piskvorky Oct 10, 2020
9e503a4
sphinx doesn't allow custom section titles :(
piskvorky Oct 10, 2020
9cd75c3
use FIXME for comments/doc-comments/names that must change pre-4.0.0
gojomo Oct 10, 2020
2784599
ignore conjunctions in phrases
piskvorky Oct 11, 2020
6baaa74
make ENGLISH_COMMON_TERMS optional
piskvorky Oct 12, 2020
b00b393
fix typo
piskvorky Oct 12, 2020
7c17577
docs: use full version as the "short version"
piskvorky Oct 12, 2020
75caa93
phrases: rename common_terms => connector_words
piskvorky Oct 12, 2020
22f4bc2
fix typo
piskvorky Oct 12, 2020
15d8261
ReST does not support nested markup
piskvorky Oct 12, 2020
c3b7f97
make flake8 shut up
piskvorky Oct 12, 2020
d6fc1b1
improve HTML doc formatting for consecutive paragraphs
piskvorky Oct 14, 2020
1d0a8bd
Merge pull request #2979 from RaRe-Technologies/phrases_common_words
piskvorky Oct 14, 2020
ea87470
Merge pull request #2944 from gojomo/ft_save_after_update_vocab
piskvorky Oct 15, 2020
4a2548a
fix typos
piskvorky Oct 18, 2020
c8d59b0
add benchmark script
piskvorky Oct 18, 2020
8d7dde2
silence flake8
piskvorky Oct 18, 2020
87ad617
Merge pull request #2982 from RaRe-Technologies/fix_2887
piskvorky Oct 18, 2020
839b1d3
remove dependency on `six`
piskvorky Oct 18, 2020
86fe8ef
regen tutorials
piskvorky Oct 18, 2020
94a227b
Merge pull request #2984 from RaRe-Technologies/remove_six
piskvorky Oct 19, 2020
0d1d054
Notification at the top of page in documentation
dvorakvaclav Oct 26, 2020
b0b2e38
Update notification.html
piskvorky Oct 26, 2020
60a8f7f
Merge pull request #2992 from friendlystudio/docs_notification
piskvorky Oct 26, 2020
e4199cb
Update changelog for 4.0.0 release (#2981)
mpenkov Oct 28, 2020
329adf2
bumped version to 4.0.0beta
mpenkov Oct 30, 2020
371e2c5
remove reference to cython.sh
mpenkov Oct 30, 2020
4895c64
update link in readme
mpenkov Oct 30, 2020
18d519c
Merge branch 'master' into release-4.0.0beta
mpenkov Oct 31, 2020
3918704
clean up merge artifact
mpenkov Oct 31, 2020
10 changes: 7 additions & 3 deletions .circleci/config.yml
@@ -2,7 +2,7 @@ version: 2
 jobs:
   build:
     docker:
-      - image: circleci/python:2.7
+      - image: circleci/python:3.7.4

     working_directory: ~/gensim

@@ -18,16 +18,20 @@ jobs:
             sudo apt-get -yq update
             sudo apt-get -yq remove texlive-binaries --purge
             sudo apt-get -yq --no-install-suggests --no-install-recommends --force-yes install dvipng texlive-latex-base texlive-latex-extra texlive-latex-recommended texlive-latex-extra texlive-fonts-recommended latexmk
+            sudo apt-get -yq install build-essential python3.7-dev

       - run:
           name: Basic installation (tox)
           command: |
-            python -m virtualenv venv
+            python3.7 -m virtualenv venv
             source venv/bin/activate
-            pip install tox
+            pip install tox --progress-bar off

       - run:
           name: Build documentation
+          environment:
+            TOX_PARALLEL_NO_SPINNER: 1
+            TOX_PIP_OPTS: --progress-bar=off
           command: |
             source venv/bin/activate
             tox -e compile,docs -vv
42 changes: 28 additions & 14 deletions .travis.yml
@@ -8,32 +8,46 @@ cache:
- $HOME/.pip-cache
dist: trusty
language: python
env:
TOX_PARALLEL_NO_SPINNER: 1


matrix:
include:
- python: '2.7'
env: TOXENV="flake8,flake8-docs"

- python: '3.6'
env: TOXENV="flake8,flake8-docs"

- python: '2.7'
env: TOXENV="py27-linux"

- python: '3.5'
env: TOXENV="py35-linux"

- python: '3.6'
env: TOXENV="py36-linux"
- python: '3.8'
env:
- TOXENV="py38-linux"
dist: bionic

- python: '3.7'
env:
- TOXENV="py37-linux"
- BOTO_CONFIG="/dev/null"
# The following two lines used to be necessary because Travis left files lying around in ~/.aws/,
# messing up our tests. Now fixed since https://github.com/travis-ci/travis-ci/issues/7940
# - BOTO_CONFIG="/dev/null"
#sudo: true
dist: xenial
sudo: true

- python: '3.6'
env: TOXENV="py36-linux"


install:
- pip install tox
- sudo apt-get install -y gdb


before_script:
- ulimit -c unlimited -S # enable core dumps


install: pip install tox
script: tox -vv


after_failure:
- pwd
- COREFILE=$(find . -maxdepth 1 -name "core*" | head -n 1)
- if [[ -f "$COREFILE" ]]; then EXECFILE=$(gdb -c "$COREFILE" -batch | grep "Core was generated" | tr -d "\`" | cut -d' ' -f5); file "$COREFILE"; gdb -c "$COREFILE" "$EXECFILE" -x continuous_integration/debug.gdb -batch; fi
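
The `after_failure` step above is dense, so here is a minimal Python sketch of what its text-processing part appears to do: take the `Core was generated by ...` line that gdb prints for a core file and keep the fifth space-separated field, which is the path of the crashed executable. The function name and the sample gdb line are illustrative assumptions, not part of the CI configuration.

```python
def executable_from_gdb_line(line):
    # Mirrors the shell pipeline: tr -d '`' | cut -d' ' -f5
    # Step 1 (tr): drop every backtick from the line.
    cleaned = line.replace("`", "")
    # Step 2 (cut): split on single spaces and keep the fifth field.
    fields = cleaned.split(" ")
    # Simplification of this sketch: return "" when there are fewer than five fields.
    return fields[4] if len(fields) >= 5 else ""


# Hypothetical gdb banner line, for illustration only.
sample = "Core was generated by `/usr/bin/python3 -m pytest gensim/test'."
print(executable_from_gdb_line(sample))  # /usr/bin/python3
```

If the crashed command had no arguments, the extracted field would keep gdb's trailing `'.` characters; the shell pipeline has the same limitation.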
816 changes: 485 additions & 331 deletions CHANGELOG.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -22,8 +22,8 @@ Also, please check the [Gensim FAQ](https://github.com/RaRe-Technologies/gensim/
 6. Check that everything's OK in your branch:
     - Check it for PEP8: `tox -e flake8`
     - Build its documentation (works only for MacOS/Linux): `tox -e docs` (documentation stored in `docs/src/_build`)
-    - Run unit tests: `tox -e py{version}-{os}`, for example `tox -e py27-linux` or `tox -e py36-win` where
-        - `{version}` is one of `27`, `35`, `36`
+    - Run unit tests: `tox -e py{version}-{os}`, for example `tox -e py35-linux` or `tox -e py36-win` where
+        - `{version}` is one of `35`, `36`
         - `{os}` is either `win` or `linux`
 7. Add files, commit and push: `git add ... ; git commit -m "my commit message"; git push origin my-feature`
 8. [Create a PR](https://help.github.com/articles/creating-a-pull-request/) on Github. Write a **clear description** for your PR, including all the context and relevant information, such as:
35 changes: 35 additions & 0 deletions HACKTOBERFEST.md
@@ -0,0 +1,35 @@
# :pizza: Hacktoberfest 2019 :beer:

It's that time of the year again!
[Hacktoberfest](https://hacktoberfest.digitalocean.com) is here, and `gensim` needs **your** help.
We've prepared a list of good issues to work on: [gensim hacktoberfest issues](https://github.com/RaRe-Technologies/gensim/labels/hacktoberfest).

If the learning curve for `gensim` is a bit steep, give the [smart_open](https://github.com/RaRe-Technologies/smart_open) repository a try.
`smart_open` is an important dependency of `gensim`: it performs file I/O over a variety of protocols and formats.
There's also a list of Hacktoberfest-friendly issues to work on: [smart_open hacktoberfest issues](https://github.com/RaRe-Technologies/smart_open/labels/hacktoberfest).

Of course, we welcome contributions on any of the existing issues, not just the ones labeled `hacktoberfest`.
If the issue is simple & quick, you can just submit your PR, with a proper reference to the issue it addresses.
If the issue requires a little more work, but you have a good idea of how to proceed & know when you'll be submitting some initial work, please post a short note about your plans to the issue, or a "work-in-progress" ("[WIP]") pull-request indicating work is underway, to help avoid wasted duplicate work.

Furthermore, we also welcome contributions not connected to an existing issue.
This includes things like fixing typos in documentation, docstrings, etc.
If you make such contributions, please make the motivation behind the contribution clear.
You could start such a contribution with a new pull-request, or if you think it requires other discussion beforehand, as a separate new issue.
Please avoid making innocuous changes without sufficient motivation (e.g. changing code formatting, etc).

## Before Contributing

Check out the following:

- [First-time contributors guide](https://github.com/firstcontributions/first-contributions): if this is your first time contributing on GitHub.
- [Hacktoberfest rules](https://hacktoberfest.digitalocean.com/faq#rules): read this in full
- [Developer page](https://github.com/RaRe-Technologies/gensim/wiki/Developer-page) on our Wiki: for the git flow, code style, etc.

## Questions

If you have a general question about Gensim, please ask on the [mailing list](https://groups.google.com/forum/#!forum/gensim).
If you have a question about a specific issue or PR, just ask there directly, and we'll get back to you as soon as we can.
Otherwise, ping @mpenkov on [Twitter](https://twitter.com/mpenkov) or [Telegram](https://t.me/mpenkov).

Happy Hacking!!
1 change: 1 addition & 0 deletions ISSUE_TEMPLATE.md
@@ -22,6 +22,7 @@ Please provide the output of:
 ```python
 import platform; print(platform.platform())
 import sys; print("Python", sys.version)
+import struct; print("Bits", 8 * struct.calcsize("P"))
 import numpy; print("NumPy", numpy.__version__)
 import scipy; print("SciPy", scipy.__version__)
 import gensim; print("gensim", gensim.__version__)
2 changes: 0 additions & 2 deletions MANIFEST.in
@@ -28,8 +28,6 @@ include gensim/models/fasttext_inner.pxd
 include gensim/models/fasttext_corpusfile.cpp
 include gensim/models/fasttext_corpusfile.pyx

-include gensim/models/_utils_any2vec.c
-include gensim/models/_utils_any2vec.pyx
 include gensim/corpora/_mmreader.c
 include gensim/corpora/_mmreader.pyx
 include gensim/_matutils.c
71 changes: 39 additions & 32 deletions README.md
@@ -1,22 +1,31 @@
gensim – Topic Modelling in Python
==================================

<!--
The following image URLs are obfuscated = proxied and cached through
Google because of Github's proxying issues. See:
https://github.com/RaRe-Technologies/gensim/issues/2805
-->

[![Build Status](https://travis-ci.org/RaRe-Technologies/gensim.svg?branch=develop)](https://travis-ci.org/RaRe-Technologies/gensim)
[![GitHub release](https://img.shields.io/github/release/rare-technologies/gensim.svg?maxAge=3600)](https://github.com/RaRe-Technologies/gensim/releases)
[![Conda-forge Build](https://anaconda.org/conda-forge/gensim/badges/version.svg)](https://anaconda.org/conda-forge/gensim)
[![Wheel](https://img.shields.io/pypi/wheel/gensim.svg)](https://pypi.python.org/pypi/gensim)
[![Downloads](https://img.shields.io/pypi/dm/gensim?color=blue)](https://pepy.tech/project/gensim/month)
[![DOI](https://zenodo.org/badge/DOI/10.13140/2.1.2393.1847.svg)](https://doi.org/10.13140/2.1.2393.1847)
[![Mailing List](https://img.shields.io/badge/-Mailing%20List-brightgreen.svg)](https://groups.google.com/forum/#!forum/gensim)
[![Gitter](https://img.shields.io/badge/gitter-join%20chat%20%E2%86%92-09a3d5.svg)](https://gitter.im/RaRe-Technologies/gensim)
[![Follow](https://img.shields.io/twitter/follow/gensim_py.svg?style=social&label=Follow)](https://twitter.com/gensim_py)


[![Mailing List](https://img.shields.io/badge/-Mailing%20List-blue.svg)](https://groups.google.com/forum/#!forum/gensim)
[![Follow](https://img.shields.io/twitter/follow/gensim_py.svg?style=social&style=flat&logo=twitter&label=Follow&color=blue)](https://twitter.com/gensim_py)

Gensim is a Python library for *topic modelling*, *document indexing*
and *similarity retrieval* with large corpora. Target audience is the
*natural language processing* (NLP) and *information retrieval* (IR)
community.

<!--
## :pizza: Hacktoberfest 2019 :beer:

We are accepting PRs for Hacktoberfest!
See [here](HACKTOBERFEST.md) for details.
-->

Features
--------

@@ -40,13 +49,6 @@ If this feature list left you scratching your head, you can first read
more about the [Vector Space Model] and [unsupervised document analysis]
on Wikipedia.

Support
------------

Ask open-ended or research questions on the [Gensim Mailing List](https://groups.google.com/forum/#!forum/gensim).

Raise bugs on [Github](https://github.com/RaRe-Technologies/gensim/blob/develop/CONTRIBUTING.md) but **make sure you follow the [issue template](https://github.com/RaRe-Technologies/gensim/blob/develop/ISSUE_TEMPLATE.md)**. Issues that are not bugs or fail to follow the issue template will be closed without inspection.

Installation
------------

@@ -60,23 +62,23 @@ NumPy. This is optional, but using an optimized BLAS such as [ATLAS] or
magnitude. On OS X, NumPy picks up the BLAS that comes with it
automatically, so you don’t need to do anything special.

The simple way to install gensim is:
Install the latest version of gensim:

pip install -U gensim
```bash
pip install --upgrade gensim
```

Or, if you have instead downloaded and unzipped the [source tar.gz]
package, you’d run:
package:

python setup.py test
```bash
python setup.py install
```

For alternative modes of installation (without root privileges,
development installation, optional install features), see the
[documentation].
For alternative modes of installation, see the [documentation].

This version has been tested under Python 2.7, 3.5 and 3.6. Gensim’s github repo is hooked
against [Travis CI for automated testing] on every commit push and pull
request. Support for Python 2.6, 3.3 and 3.4 was dropped in gensim 1.0.0. Install gensim 0.13.4 if you *must* use Python 2.6, 3.3 or 3.4. Support for Python 2.5 was dropped in gensim 0.10.0; install gensim 0.9.1 if you *must* use Python 2.5).
Gensim is being [continuously tested](https://travis-ci.org/RaRe-Technologies/gensim) under Python 3.6, 3.7 and 3.8.
Support for Python 2.7 was dropped in gensim 4.0.0 – install gensim 3.8.3 if you must use Python 2.7.

How come gensim is so fast and memory efficient? Isn’t it pure Python, and isn’t Python slow and greedy?
--------------------------------------------------------------------------------------------------------
@@ -98,22 +100,27 @@ Documentation

- [QuickStart]
- [Tutorials]
- [Tutorial Videos]
- [Official API Documentation]

[QuickStart]: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/gensim%20Quick%20Start.ipynb
[Tutorials]: https://github.com/RaRe-Technologies/gensim/blob/develop/tutorials.md#tutorials
[Tutorial Videos]: https://github.com/RaRe-Technologies/gensim/blob/develop/tutorials.md#videos
[QuickStart]: https://radimrehurek.com/gensim/auto_examples/core/run_core_concepts.html
[Tutorials]: https://radimrehurek.com/gensim/auto_examples/
[Official Documentation and Walkthrough]: http://radimrehurek.com/gensim/
[Official API Documentation]: http://radimrehurek.com/gensim/apiref.html


Support
-------

Ask open-ended or research questions on the [Gensim Mailing List](https://groups.google.com/forum/#!forum/gensim).

Raise bugs on [Github](https://github.com/RaRe-Technologies/gensim/blob/develop/CONTRIBUTING.md) but **make sure you follow the [issue template](https://github.com/RaRe-Technologies/gensim/blob/develop/ISSUE_TEMPLATE.md)**. Issues that are not bugs or fail to follow the issue template will be closed without inspection.

---------

Adopters
--------

| Company | Logo | Industry | Use of Gensim |
|---------|------|----------|---------------|
|---------|------|----------|---------------|
| [RARE Technologies](http://rare-technologies.com) | ![rare](docs/src/readme_images/rare.png) | ML & NLP consulting | Creators of Gensim – this is us! |
| [Amazon](http://www.amazon.com/) | ![amazon](docs/src/readme_images/amazon.png) | Retail | Document similarity. |
| [National Institutes of Health](https://github.com/NIHOPA/pipeline_word2vec) | ![nih](docs/src/readme_images/nih.png) | Health | Processing grants and publications with word2vec. |
Expand Down Expand Up @@ -162,8 +169,8 @@ BibTeX entry:
[Talentpair]: https://avatars3.githubusercontent.com/u/8418395?v=3&s=100
[citing gensim in academic papers and theses]: https://scholar.google.cz/citations?view_op=view_citation&hl=en&user=9vG_kV0AAAAJ&citation_for_view=9vG_kV0AAAAJ:u-x6o8ySG0sC



[documentation and Jupyter Notebook tutorials]: https://github.com/RaRe-Technologies/gensim/#documentation
[Vector Space Model]: http://en.wikipedia.org/wiki/Vector_space_model
[unsupervised document analysis]: http://en.wikipedia.org/wiki/Latent_semantic_indexing
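
The README hunk above states the support policy in prose: Python 3.6, 3.7 and 3.8 are tested, and Python 2.7 users should stay on gensim 3.8.3. As a rough sketch, that policy could be encoded as a helper that picks a pip requirement string from the interpreter version. The function name is hypothetical, and both the `None` result for versions the README does not mention and the treatment of Python versions newer than 3.8 are assumptions of this sketch.

```python
import sys


def gensim_requirement(version_info=sys.version_info):
    """Return a pip requirement string following the README's support notes."""
    major_minor = tuple(version_info[:2])
    if major_minor >= (3, 6):
        # README: continuously tested under Python 3.6, 3.7 and 3.8.
        # Assumption: newer 3.x interpreters are treated the same way.
        return "gensim"
    if major_minor == (2, 7):
        # README: install gensim 3.8.3 if you must use Python 2.7.
        return "gensim==3.8.3"
    # Not covered by the README text shown in this diff.
    return None


print(gensim_requirement((3, 8)))  # gensim
print(gensim_requirement((2, 7)))  # gensim==3.8.3
```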
73 changes: 0 additions & 73 deletions appveyor.yml

This file was deleted.

29 changes: 29 additions & 0 deletions azure-pipelines.yml
@@ -0,0 +1,29 @@
pool:
  vmImage: 'vs2017-win2016'

strategy:
  matrix:
    py36:
      python.version: '3.6'
      TOXENV: "py36-win"
    py37:
      python.version: '3.7'
      TOXENV: "py37-win"
    py38:
      python.version: '3.8'
      TOXENV: "py38-win"

steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '$(python.version)'
  displayName: 'Use Python $(python.version)'

- script: |
    python -m pip install --upgrade pip
    python -m pip install tox
  displayName: 'Install tox'

- script: |
    tox -vv
  displayName: 'Testing'
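
The CI configurations in this PR all select tox environments by name, following the `py{version}-{os}` pattern described in CONTRIBUTING.md (for example `py36-win` in `azure-pipelines.yml` and `py38-linux` in `.travis.yml`). Below is a small sketch of how those names are composed. The helper name and the default version list are assumptions drawn from the configs shown in this PR, not from `tox.ini`, which is not part of the excerpt.

```python
from itertools import product


def tox_env_names(versions=("36", "37", "38"), systems=("linux", "win")):
    """Build tox environment names of the form py{version}-{os}."""
    return [f"py{version}-{system}" for version, system in product(versions, systems)]


# The Windows-only subset matches the TOXENV values in azure-pipelines.yml.
print(tox_env_names(systems=("win",)))  # ['py36-win', 'py37-win', 'py38-win']
```

Each name can then be passed to tox directly, for example `tox -e py37-linux`.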