Release 4.0.0beta #2993

Merged
merged 230 commits into from
Nov 1, 2020

mpenkov
Collaborator

@mpenkov mpenkov commented Oct 31, 2020

We normally merge release branches onto master locally, using the git CLI, but there were some conflicts, and I thought it prudent to deal with them in a separate PR, where we can comment and discuss, etc.

The conflicts are caused by the non-standard way that we handled the previous release (3.8.3). Unlike regular releases, which branch off develop, the 3.8.3 release branched off the 3.8.2 tag on master (?) - the first commit was afaf76f. This branch is still around: https://github.com/RaRe-Technologies/gensim/compare/release-3.8.3.

The sole purpose of 3.8.3 was to temporarily bring back Py2.7 support to gensim (we removed it prematurely in 3.8.2). All work done on the 3.8.3 branch was related to Py2.7, so it was never merged to develop or master. Regular work continued during the release, and we did merge develop into the release-3.8.3 branch (see c3d95ab).

Shortly after the release of 3.8.3, people pointed out that it was missing from the change log on the develop branch. While this was expected for us, because we never merged release-3.8.3 into develop (see above for the reason), it was confusing for people. So, we updated the develop changelog to mention 3.8.3. The PR for that is #2831.

I couldn't work out what the cause of the conflicts was, but there were only a small number of them:

  (use "git add <file>..." to mark resolution)
        both modified:   CHANGELOG.md
        both modified:   docs/src/conf.py
        both modified:   setup.py
        both modified:   tox.ini

I resolved them in favor of the release branch. You can check that the differences between release-4.0.0beta and develop are minimal, so there are no merge artifacts in those conflicted files:

(gensim) sergeyich:gensim misha$ git diff upstream/develop upstream/release-4.0.0beta --minimal -- CHANGELOG.md docs/src/conf.py setup.py tox.ini | cat
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 067a361c..7108bd15 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,7 +1,7 @@
 Changes
 =======

-## 4.0.0beta, FIXME 2020-10-??
+## 4.0.0beta, 2020-10-31

 **⚠️ Gensim 4.0 contains breaking API changes! See the [Migration guide](https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4) to update your existing Gensim 3.x code and models.**

diff --git a/docs/src/conf.py b/docs/src/conf.py
index 13d26e25..ba4973d0 100644
--- a/docs/src/conf.py
+++ b/docs/src/conf.py
@@ -61,9 +61,9 @@ copyright = u'2009-now'
 # built documents.
 #
 # The short X.Y version.
-version = '4.0'
+version = '4.0.0beta'
 # The full version, including alpha/beta/rc tags.
-release = '4.0.0.dev0'
+release = '4.0.0beta'

 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.
diff --git a/setup.py b/setup.py
index 6c09eaf6..b3a2b61c 100644
--- a/setup.py
+++ b/setup.py
@@ -344,7 +344,7 @@ if need_cython():

 setup(
     name='gensim',
-    version='4.0.0.dev0',
+    version='4.0.0beta',
     description='Python framework for fast Vector Space Modelling',
     long_description=LONG_DESCRIPTION,

(gensim) sergeyich:gensim misha$
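
For anyone who wants to repeat this check from a script, here is a minimal Python sketch of the same comparison. The refs and the four paths come from the command above; the helper names `diff_command` and `conflicted_diff` are made up for illustration, and the sketch assumes `git` is on the PATH and the `upstream` remote has been fetched.

```python
import subprocess

# The four files that git reported as "both modified" during the merge.
CONFLICTED = ["CHANGELOG.md", "docs/src/conf.py", "setup.py", "tox.ini"]


def diff_command(base, head, paths):
    """Build the git invocation that compares two refs, limited to `paths`."""
    return ["git", "diff", "--minimal", base, head, "--", *paths]


def conflicted_diff(base="upstream/develop",
                    head="upstream/release-4.0.0beta",
                    paths=CONFLICTED):
    """Run the diff inside the current repository and return its text."""
    result = subprocess.run(
        diff_command(base, head, paths),
        capture_output=True,  # needs Python 3.7+
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `print(conflicted_diff())` from a gensim checkout should show only the version and changelog-date hunks if the conflict resolution introduced nothing else.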

I think we're ready to proceed, so:

  1. Merge this PR into master
  2. Tag master
  3. Merge master into develop
  4. etc.

mpenkov and others added 30 commits September 23, 2019 18:19
* added release/check_wheels.py

* added preamble

* Update release/check_wheels.py

Co-Authored-By: Radim Řehůřek <[email protected]>

* respond to review comments
* git add HACKTOBERFEST.md

* clarify contributions

* respond to review comments

* add link to HACKTOBERFEST.md from README.md

* typo

* include comments from Gordon
* Probably fixes #2534

* Uppercase P

* Added comment
* Disable Py2.7 builds under Travis and AppVeyor

* use Py3.7.4 image under CircleCI

* tweak circleci config.yml

* patch tox.ini

* more fixes to get docs building under tox

* s/python3.7/python3/

* delay annoy ImportError until actual use

* bring back Pattern

* simplify invokation of pip command

* add install_numpy_scipy.py

* fixup

* use sys.executable

* adjust version in install_wheels.py

* adjust travis.yml

* adjust version in install_wheels.py back

* add logging statements

* use version_info instead of sys.version

* fixup
* Handling for iterables without 0-th element, fixes #2556

* Improved accessing the first element for the case of big datasets
It belongs at the top. People should see it immediately without having to scroll down to an older release.
* Change interlinks format to list of tuples. Fixes #2635

This commit fixes the issue in #2635

This commit changes the interlinks storage in the `segment_wiki` script from dictionary to a list of tuples.

We can process the test wikidata used in the test suite of gensim to inspect the new behavior.
```
python gensim/scripts/segment_wiki.py -i \
    -f ~/Downloads/enwiki-latest-pages-articles1.xml-p000000010p000030302-shortened.bz2 \
    -o ~/Downloads/enwiki-latest.json.gz
```

We get the following output:

```
$ cat ~/Downloads/enwiki-latest.json.gz | zcat | head -1 | jq -r '.interlinks[] | [.[0], .[1]] | @TSV' | sort | head
-ism	-ism
1848 Revolution	1848 Revolution
1917 October Revolution	1917 October Revolution
6 February 1934 crisis	February 1934 riots
A. S. Neill	A. S. Neill
AK Press	AK Press
Abu Hanifa	Abu Hanifa
Adolf Brand	Adolf Brand
Adolf Brand	Adolf Brand
Adolf Hitler	Hitler
```

All tests pass for the related test file.

```
python -m unittest gensim.test.test_scripts
/Users/smishra/miniconda3/envs/TwitterNER/lib/python3.7/bz2.py:131: ResourceWarning: unclosed file <_io.BufferedReader name='/Users/smishra/workspace/codes/python/gensim/gensim/test/test_data/enwiki-latest-pages-articles1.xml-p000000010p000030302-shortened.bz2'>
  self._buffer = None
ResourceWarning: Enable tracemalloc to get the object allocation traceback
.....
----------------------------------------------------------------------
Ran 5 tests in 6.298s

OK
```
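
As a rough illustration of why the storage format matters (this is an assumption about the motivation, based on the duplicate `Adolf Brand` rows in the output above, not a copy of the `segment_wiki` code): a dictionary keyed by article title keeps only one entry per title, while a list of tuples keeps every link. The last pair below is a hypothetical extra anchor text added for the example.

```python
# (linked article title, anchor text) pairs, in the order they appear in a page.
pairs = [
    ("Adolf Brand", "Adolf Brand"),
    ("Adolf Brand", "Adolf Brand"),
    ("Adolf Hitler", "Hitler"),
    ("Adolf Hitler", "Adolf Hitler"),  # hypothetical second anchor text
]

# Old-style storage: later pairs silently overwrite earlier ones with the same key.
as_dict = dict(pairs)

# New-style storage: every link survives, including repeats and alternate texts.
as_list = list(pairs)

print(len(as_dict), len(as_list))
```

With the dictionary, the `Hitler` anchor text is lost because the later pair for the same title replaces it.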

* Updated docstrings

* Fixed flake8 issue of long line in docsrtring

* Fixed comments and replaces assertTrue with assertEqual

* Fixed unittest comment and checks for wikicorpus
* Update makefile to point to new subdirectory

* Update layout.html to show new documentation sections

* introduce sphinx gallery

* reorganize gallery

* trim tut3.rst

* git add docs/to_python.py

* git add gallery/010_tutorials/run_doc2vec_lee.py

* minor layout tweak

* add downloader api howto

* add fasttext tutorial and howto

* use pprint in fasttext tutorial

* add summarization tutorial

* git add gallery/020_howtos/run_howto_compare_lda.py

* add fasttext thumbnails

* adding core concepts tutorial

* add summarization plot

* update notebook to use 20newsgroups

* update notebook

* improve notebook

* update howtos

* fix distance metrics tutorial

* improve distance_metrics.ipynb

* git add gallery/010_tutorials/run_distance_metrics.py

* git add gallery/020_howtos/run_news_classification.py

* move downloader API to tutorials section

* add docs/src/auto_examples so bindr can pick up the notebooks

* minor changes

* git add gallery/010_tutorials/run_lda.py

* more minor changes

* More minor changes

* git add gallery/010_tutorials/run_word2vec.py

* updated notebooks

* git add gallery/010_tutorials/run_wmd.py

* add image

* move parts of intro.rst to core concepts tutorial

* move README.txt to wiki

* get rid of fasttext wrapper tutorial

* update top-level heading

* more minor changes

* minor updates

* improve Doc2Vec tutorial, move explanations from IMDB

* git add gallery/020_howtos/run_doc2vec_imdb.py

* git st

* fix notebook paths for bindr

* rename gallery to documentation

* git add binder/requirements.txt

* git add auto_examples/000_core/requirements.txt

* adding requirements.txt for binder

* removing requirements files added in desperation

* update conf.py

* remove temporary files from git branch

* rm images

* merge "getting started" into "core concepts"

* add some clarifying text

* add Jupyter notebook

* Revert "get rid of fasttext wrapper tutorial"

This reverts commit 3ec0a46.

* get rid of fasttext wrapper guide

* git add auto_examples/

* minor fixes

* fix typo

* add listing of corpora and models

* get rid of binder

* git add gallery/020_howtos/run_doc.py

* more instructions for authorship

* improve linkage between core tutorials

* add highlighting

* move downloader to howto

* restore support and about sections

* sync toolbars

* Add installation instructions to top page

* clean up html

* add wordcloud-based thumbnails

* updated notebooks

* update script

* add sphinx-gallery to doc dependencies

* include memory_profiler in docs_testenv

* git add README.rst

* use proper temporary file

* reorganize tutorials section

* clarify version control in README.rst

* git rm 020_howtos/saved_model_wrapper

* move pivoted document normalization to tutorials section

* fix ordering in howto section

* add images

* add annoy to doc dependencies

* update gitignore

* disable tox spinner

* turn off progress bar for pip

* fix labels

* naming fixes

* git rm docs/notebooks/gensim\ Quick\ Start.ipynb

* git rm docs/notebooks/Corpora_and_Vector_Spaces.ipynb

* git rm gensim\ Quick\ Start.ipynb

* git rm docs/notebooks/Topics_and_Transformations.ipynb

* git rm docs/notebooks/Similarity_Queries.ipynb

* git rm docs/notebooks/summarization_tutorial.ipynb

* git rm docs/notebooks/distance_metrics.ipynb

* git rm docs/notebooks/word2vec.ipynb

* git rm docs/notebooks/doc2vec-lee.ipynb

* git rm docs/notebooks/gensim_news_classification.ipynb

* git rm docs/notebooks/lda_training_tips.ipynb

* git rm docs/notebooks/doc2vec-IMDB.ipynb

* git rm docs/notebooks/annoytutorial.ipynb

* git rm tutorial.rst tut1.rst tut2.rst tut3.rst

* minor update to layout.html

* git rm changes_080.rst

* minor tweaks to gallery and surrounding docs

* remove cruft from run_doc2vec_imdb.py

* update doc howto

* fixup

* git add requirements_docs.txt

* more dependencies in requirements_docs.txt

* re-enable LDA howto

* add missing images

* add built LDA howto

* port tutorials.md to gallery

* WIP: cleaning up docs

* language clean up + pin exact versions in doc requirements

* git add redirects.csv test_redirects.py

* remove gensim_numfocus namespace qualifier

* doc cleanup in Other resources

* fix redirects

* regenerated tutorials

* Added tools/check_gallery.py

* committing unsuccessful attempt to fix a tutorial before deleting it

* remove tutorials that don't work

* index page fixes

* add install anchor

* Update redirects.csv

* link fixes from local testing

* replace easy_install with pip

* renamed run_040_compare_lda.py to run_compare_lda.py

* minor fixes

* more fixes from website testing

* updating wordcloud images

* add pandas to requirements_docs.txt

* !!

* more dependency + code fixes

* update upload path to "live" website

* update test_redirects.py

* git rm redirects.csv test_redirects.py
* Fix links to documentation in README.md

* Update README.md
* Remove native Python implementations of Cython extensions

Fix #2511

* remove print statement in tox.ini

* remove print statement in tox.ini

* fix flake8 issues

* fix missing imports

* adjust exception message

* bring back FAST_VERSION variable

* fixup: missing parens

* disable progress bar for tox

* respond to review comments

* remove C/C++ sources generated from Cython files

* update setup.py

* remove duplicate line in setup.py

* fix numpy bootstrapping

* update tox.ini

* handle cython dependency in setup.py

* fixup in setup.py: lowercase c

* more cython sourcery

* fix tox.ini

* Fix merge artifact in setup.py

* fix merge artifact

* disable pip progress bar under CircleCI
* document accessing model's vocabulary

* update images
… (DTM) documentation

* improve & corrected gensim documentation (#2637)

* more descriptive explanation of top_chain_var
* Speed up word2vec binary model loading (#2642)

* Add correctness tests for optimized word2vec model loading (#2642)

* Include remarks of Radim to code speeding up vectors loading (#2671)

* Include remarks of Michael to code speeding up vectors loading (#2671)

* Refactor _load_word2vec_format into a few functions for better readability

* Clean-up _add_word_to_result function
…orpus' is empty (#2672)

* [Issue-2670] Bug fix: Initialize doc_no2 because it is not set when 'corpus' is empty

* [Issue-2670] Add: unittests should fail on invalid input (generator and empty corpus)

* [Issue-2670] Add: Fix unittest for generator

* [Issue-2670] Fix unittest tox:flake8 errors

* [Issue-2670] Fix: empty corpus def in unittest

* [Issue-2670] Fix: empty corpus and generator unittests

* [Issue-2670] Fix: empty corpus and generator unittests
* move install_wheels script

* git add continuous_integration/check_wheels.py

* bump versions for numpy and scipy

* update old requirements.txt

* add file header

* get rid of install_wheels.py hack

* fixup: update travis.yml

* Update continuous_integration/check_wheels.py

Co-Authored-By: Radim Řehůřek <[email protected]>

* Update continuous_integration/check_wheels.py

Co-Authored-By: Radim Řehůřek <[email protected]>

Co-authored-by: Radim Řehůřek <[email protected]>
* Find largest by absolute value

* Add helper function to simplify code & add unit test for it
* force python int before calling islice. islice don't accept numpy int

* add test to check islice error

* it makes test to fail

* make sure that islice receives a python int

* fix typo
* Refactor bm25 to include model parametrization

* Refactor constants back and fix typo

* Refactor parameters order and description

* Add BM25 tests
This closes #2597 and closes #2606

* Simplify asserts in BM25 tests

* Refactor BM25.get_score

Co-authored-by: Marcelo d'Almeida <[email protected]>
@mpenkov mpenkov marked this pull request as draft October 31, 2020 14:21
@mpenkov mpenkov requested a review from piskvorky October 31, 2020 15:29
@mpenkov mpenkov marked this pull request as ready for review October 31, 2020 15:29
@piskvorky
Owner

piskvorky commented Oct 31, 2020

OK, that git diff develop release-4.0.0beta makes me feel easier. LGTR.

What's the "merge artifact" in 3918704 though?

@mpenkov
Collaborator Author

mpenkov commented Nov 1, 2020

> OK, that git diff develop release-4.0.0beta makes me feel easier. LGTR.
>
> What's the "merge artifact" in 3918704 though?

I'm not 100% sure. I resolved the merge conflicts manually, and then ran a diff to check that I got them all right, and that artifact came out. The lines themselves come from master:

https://github.com/RaRe-Technologies/gensim/blame/8b1ea6a07a2d6010d4af6d2491f6cb008fa1e01a/setup.py#L348

@piskvorky
Owner

piskvorky commented Nov 1, 2020

If I understand correctly, git added two install_requires sections on merge, without reporting any conflict?

Scary, but the diff is the definitive answer. As long as the diff is fine, we're good to go. @mpenkov can you finish the release? I'll switch the website symlink right after + tweet. Thanks.

@mpenkov mpenkov merged commit 8624aa2 into master Nov 1, 2020
@piskvorky piskvorky deleted the release-4.0.0beta branch November 1, 2020 13:53
@gojomo
Collaborator

gojomo commented Nov 2, 2020

In my opinion, this was done exactly backwards, and has resulted in a seriously messed-up project develop history.

The commit labeled "Release 4.0.0beta" has an unhelpful title & gigantic commit message & reports as touching 500+ files.

develop was what was tested, develop was what had attention. Any release should be the smallest-possible 'just the release housekeeping' commit, to build the artifacts, then return the develop to dev-status.

This process was something else, more complicated, harder to review and understand, for unclear benefits. Anything that was in the weird orphaned -3.8.3 branch should have been forgotten as a one-time error, not carried forward to confuse things for 4.0.0-beta.

Cleanly releasing off develop is an important enough goal that if it were up to me I'd undo this & ensure the release is clear & minimal off of the develop trunk.

(I can't tell if the current state of develop is mid-release or post-release. It appears the tree as of the big commit titled "Release 4.0.0beta" reports its own version as "4.0.0.dev0". Then there's another commit - post-beta? Or something else? It's generically and unhelpfully titled "Merge remote-tracking branch 'upstream/master' into develop". That commit changes the self-reported version to "4.0.0beta" - which in my view should only ever appear on trunk in the arbitrarily small time it takes to mint an exact release, then be changed to dev again before any other time/changes/work could accumulate.)

@piskvorky
Owner

piskvorky commented Nov 2, 2020

Mid-release, as far as I can tell. @mpenkov is fighting with the CI wheels I guess, which is a mid-release step that fails randomly and can take several hours.

The step of changing the self-reported version in develop comes after that, near the end. Docs. I see PyPI still reports 3.8.3, not 4.0.0beta.

I already switched https://radimrehurek.com/gensim/ because I expected the release to be done by now. @mpenkov what's the ETA?

Re. branches: yes, 3.8.2 was a botched up release, which led to a hot-fix release 3.8.3 which was badly merged, and now that affects 4.0.0. Ditching that orphaned branch would have been my choice too but apparently since git diff showed no significant changes between "develop vs. develop+orphan", @mpenkov's choice of merging that branch is actually cleaner – the full history is now in git.

I saw that including those orphaned commits actually auto-closed some (orphaned!) tickets… which is a bonus.

@mpenkov
Collaborator Author

mpenkov commented Nov 3, 2020

We're still mid-release. There are a number of unforeseen issues blocking the wheel builds. It's hard for me to give an ETA because I haven't encountered these issues before, and my bandwidth during the working week is limited.

The problems I can see include:

  • numpy.testing.decorators missing in the Appveyor (Win) builds
  • x86_64-linux-gnu-gcc: error: unrecognized command line option '-std=c++14' during Linux builds
  • AttributeError: module 'numpy.random' has no attribute 'default_rng' during MacOS builds
  • ImportError: cannot import name 'FAST_VERSION' from 'gensim.models.fasttext' (/Users/travis/build/RaRe-Technologies/gensim-wheels/venv/lib/python3.8/site-packages/gensim/models/fasttext.py) during MacOS Py3.8 build

The way to proceed is, for each issue:

  1. Investigate the cause
  2. Fix the cause in gensim-wheels repo
  3. Push to gensim-wheels master, wait for rebuild

If either of you is able to handle any of the above, please go ahead. Otherwise, I'll poke at it when I have time during the week.

> Re. branches: yes, 3.8.2 was a botched up release, which led to a hot-fix release 3.8.3 which was badly merged, and now that affects 4.0.0. Ditching that orphaned branch would have been my choice too but apparently since git diff showed no significant changes between "develop vs. develop+orphan", @mpenkov's choice of merging that branch is actually cleaner – the full history is now in git.

@piskvorky Hang on, the 3.8.3 branch is still an orphan, right? The only trace of that in develop is the CHANGELOG, and we merged that intentionally in #2831. So yeah, we messed up 3.8.2, but I don't see how 3.8.3 was badly merged.

@mpenkov
Collaborator Author

mpenkov commented Nov 3, 2020

@gojomo I had a look at one of the problems, and it's being caused by this:

$ git blame gensim/utils.py | grep -C 3 default_rng
e859c11f6 gensim/utils.py     (Michael Penkov     2019-10-25 14:54:24 +0200   49) """An exception that gensim code raises when Cython extensions are unavailable."""
e859c11f6 gensim/utils.py     (Michael Penkov     2019-10-25 14:54:24 +0200   50) 
c0e016956 gensim/utils.py     (Gordon Mohr        2020-07-19 06:15:17 -0700   51) #: A default, shared numpy-Generator-based PRNG for any/all uses that don't require seeding
c0e016956 gensim/utils.py     (Gordon Mohr        2020-07-19 06:15:17 -0700   52) default_prng = np.random.default_rng()
c0e016956 gensim/utils.py     (Gordon Mohr        2020-07-19 06:15:17 -0700   53) 
d9b79e2ff gensim/utils.py     (Radim Řehůřek      2015-04-08 15:18:20 +0200   54) 
255ce2590 gensim/utils.py     (Kumar Akshay       2017-12-26 20:24:31 +0530   55) def get_random_state(seed):

At this stage it's probably too late to fix this in gensim (because we'll have to repeat the above release procedure) so should we bump the numpy version that we build the wheels against? If yes, then to what?

@gojomo
Collaborator

gojomo commented Nov 3, 2020

What I mean about merges is that if develop was deemed reviewed/tested/ready to release, the PR/commit to make a release should be really tiny. But this is appearing to change 550 files, and commit 8624aa2 seems to have a hundreds-of-lines-long commit message.

Ideally, nothing that happened wrong in 3.8.2 or 3.8.3 should have any impact now, because develop was already the proper true state of what was ready to release. (OTOH, if there were some lingering changes on other non-develop branches that needed to be in develop, they should be brought over in small, reviewable chunks – maybe, cherry-picked? – rather than an opaqueifying big-looking-at-Github (even if the real changes are tiny) merge.)

I guess I just don't understand any necessary process that could create the giant commits/PRs as shown here in the Github web interface. (Maybe it's a Github issue, but even reading the explanation atop this issue, I can't imagine why a couple commits ago, develop was straightforward, and the release-steps turned ~5 lines of changes into this giant log.)

WRT the numpy default_rng matter, I think we should just bump our numpy requirement to the one that supplies that. (I thought this had been discussed in a previous issue, but can't recall exactly where.) The fact that this has been working in the main gensim repo implies to me that it shouldn't be an issue in release-building, unless the main repo is configured wrong. I previously hadn't taken much note of the separate gensim-wheels repo. Could we just make it so the main repo builds the wheels, too? (Fixing anything "in gensim-wheels" rather than in gensim seems an anti-pattern to me.) Then releases are definitionally in-sync with whatever's recently had dev attention, and when the main repo builds/tests pass, it's unlikely (or even impossible) for new unforeseen release-specific glitches to arise.
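
For reference, `numpy.random.default_rng` first shipped in NumPy 1.17, so that would be the floor for the requirement. Below is a minimal sketch of a guard the wheel build could run before compiling; the helper names are hypothetical and the parsing is deliberately simple (it only looks at the leading numeric components of the version string).

```python
# First NumPy release that provides numpy.random.default_rng.
DEFAULT_RNG_MIN_NUMPY = (1, 17)


def release_tuple(version):
    """Turn '1.17.0' into (1, 17, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def supports_default_rng(numpy_version):
    """Return True if this NumPy version string is new enough for default_rng."""
    return release_tuple(numpy_version) >= DEFAULT_RNG_MIN_NUMPY


for candidate in ("1.11.3", "1.16.6", "1.17.0", "1.19.2"):
    print(candidate, supports_default_rng(candidate))
```

In a real build script the argument would be `numpy.__version__` from the environment the wheels are built against.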

@piskvorky
Owner

piskvorky commented Nov 3, 2020

> I don't see how 3.8.3 was badly merged.

It left master vs develop in a state that caused our standard release process to unexpectedly blow up during the next release = now.

But as long as no "bad changes" (as verified via git diff) got in, I don't think it's a big deal. The extra commit(s) don't really bother me. Although it does make me wonder why such relatively simple merge + conflict resolution wasn't done according to the release process the last time, what's the reason for all this, whether there's another gotcha waiting for us somewhere.

@mpenkov
Collaborator Author

mpenkov commented Nov 3, 2020

> It left master vs develop in a state that caused our standard release process to unexpectedly blow up during the next release = now.

Looking around, I don't think the merge did that. I'm not sure what did, but a single changelog entry wouldn't have caused this.

I've identified another issue with the wheels on Windows: pip is being stupid and somehow failing to install Cython when installing the wheel:

pip install --pre --no-index --force-reinstall --find-links dist/ gensim
Looking in links: dist/
Processing c:\projects\gensim-wheels-2x1bk\gensim\dist\gensim-4.0.0b0-cp36-cp36m-win_amd64.whl
ERROR: Could not find a version that satisfies the requirement Cython==0.29.14 (from gensim) (from versions: none)
ERROR: No matching distribution found for Cython==0.29.14 (from gensim)

This is despite it being able to cythonize the files prior to building the wheel.

The wheel itself is getting built and uploaded, but the above error is crashing the build and preventing builds for other Python versions from proceeding.

$ aws --profile smart_open s3 ls s3://gensim-wheels/ | grep 4.0.0b0
2020-11-03 16:19:39   23898451 gensim-4.0.0b0-cp36-cp36m-win_amd64.whl

@piskvorky
Owner

piskvorky commented Nov 3, 2020

Why would installing a binary wheel need Cython at all?

@mpenkov
Collaborator Author

mpenkov commented Nov 3, 2020

That's a good question. We typically only require cython if the extensions aren't built. In this case, the extensions are getting built, so cython shouldn't be required.

Yet another mystery to unravel.
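
One way to narrow this down is to look at what the built wheel itself declares as dependencies, since pip resolves `Requires-Dist` entries from the wheel's `METADATA` file even when no compilation is needed. A minimal sketch using only the standard library follows; `wheel_requirements` is a hypothetical helper, and the in-memory archive is a stand-in for the real `.whl` file, built here just so the example runs on its own.

```python
import io
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_file):
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_file) as archive:
        metadata_names = [
            name for name in archive.namelist()
            if name.endswith(".dist-info/METADATA")
        ]
        if not metadata_names:
            return []
        text = archive.read(metadata_names[0]).decode("utf-8")
    # METADATA uses an email-header style format, so the stdlib parser can read it.
    return Parser().parsestr(text).get_all("Requires-Dist") or []


# Stand-in for a real wheel: a zip archive holding only a METADATA file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "gensim-4.0.0b0.dist-info/METADATA",
        "Metadata-Version: 2.1\n"
        "Name: gensim\n"
        "Version: 4.0.0b0\n"
        "Requires-Dist: Cython==0.29.14\n",
    )
buffer.seek(0)

print(wheel_requirements(buffer))
```

If the real wheel lists Cython there, the pin is coming from the install requirements in `setup.py` rather than from the build step, which would explain why `--no-index` fails.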

@mpenkov
Collaborator Author

mpenkov commented Nov 3, 2020

This is becoming a bit too much to manage for a single PR (that has been merged anyway), so I've opened separate tickets to deal with the immediate problems:

@piskvorky
Owner

piskvorky commented Nov 3, 2020

> Yet another mystery to unravel.

Yes. Seeing how the other contributions went under the previous stewardship, I should have expected this part would be of similar quality. And focused more attention there too, prior to the release.

Kudos to @gojomo for insisting on beta.

@mpenkov
Collaborator Author

mpenkov commented Nov 14, 2020

I've solved the previous issues, but uncovered two more:

piskvorky/gensim-wheels#10
piskvorky/gensim-wheels#11

Currently, all Linux test runs of the wheels are passing (except 3.7). The Windows runs are still failing.

@piskvorky

@mpenkov
Collaborator Author

mpenkov commented Nov 15, 2020

Done: https://pypi.org/project/gensim/4.0.0b0/

@piskvorky @gojomo

@piskvorky
Owner

piskvorky commented Nov 15, 2020

@mpenkov thanks a lot!

What are the takeaways from releasing 4.0.0beta – anything to improve in our process / documentation? Do we expect the next release to go smoother?

@piskvorky
Owner

piskvorky commented Nov 15, 2020

@mpenkov I tried on Linux, and pip install --pre --upgrade gensim installs 3.8.3.

How do I install 4.0.0beta?

EDIT: OSX installs 4.0.0b0.

EDIT2: Windows installs 4.0.0b0 too.
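
On the naming: "4.0.0beta" and "4.0.0b0" are the same release. PEP 440 normalizes the spelling `beta` to `b` and treats a missing pre-release number as 0, which is why `setup.py` says `4.0.0beta` while the wheel and PyPI say `4.0.0b0`. A simplified sketch of just that rule is below; it is not the full PEP 440 algorithm, and `normalize_prerelease` is a name invented for this example.

```python
import re

# Alternate pre-release spellings that PEP 440 maps onto the short forms.
_PRE_ALIASES = {"alpha": "a", "beta": "b", "c": "rc", "pre": "rc", "preview": "rc"}

_PATTERN = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)"
    r"[-_.]?(?P<tag>alpha|beta|preview|pre|rc|a|b|c)"
    r"[-_.]?(?P<num>\d*)$",
    re.IGNORECASE,
)


def normalize_prerelease(version):
    """Normalize a pre-release suffix; return other versions unchanged."""
    match = _PATTERN.match(version.strip())
    if match is None:
        return version
    tag = match.group("tag").lower()
    tag = _PRE_ALIASES.get(tag, tag)
    number = int(match.group("num") or 0)  # implicit pre-release number is 0
    return f"{match.group('release')}{tag}{number}"


for raw in ("4.0.0beta", "4.0.0b0", "3.8.3", "4.0.0.dev0"):
    print(raw, "->", normalize_prerelease(raw))
```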

@piskvorky
Owner

piskvorky commented Nov 15, 2020

Figured it out: the Linux virtualenv had py2.7, so pip installed the latest Gensim where py2.7 worked = 3.8.3.

So, a feature. No problem here :)
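
A toy model of what pip did here, for the record. The release table is hypothetical: the 2.7 floor for 3.8.3 follows from the discussion above, while the 3.6 floor for 4.0.0b0 is an assumption based on the `cp36` wheel seen earlier. Real pip reads the `Requires-Python` metadata of each release; this sketch only mimics the outcome.

```python
# Hypothetical release table, ordered oldest to newest:
# (version, minimum supported Python as a tuple).
RELEASES = [
    ("3.8.3", (2, 7)),
    ("4.0.0b0", (3, 6)),  # assumed floor, see the note above
]


def newest_installable(python_version, releases=RELEASES):
    """Return the newest release whose Python floor the interpreter meets."""
    compatible = [
        version for version, minimum in releases
        if python_version >= minimum
    ]
    return compatible[-1] if compatible else None


print(newest_installable((2, 7)))
print(newest_installable((3, 8)))
```

So even with `--pre`, a Python 2.7 environment never considers 4.0.0b0 a candidate.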

@mpenkov
Collaborator Author

mpenkov commented Nov 15, 2020

> What are the takeaways from releasing 4.0.0beta – anything to improve in our process / documentation? Do we expect the next release to go smoother?

Yeah, I can think of two:

  1. Build wheels more often (perhaps nightly)
  2. Simplify wheel build process (review the multibuild project: is it worth continuing to use it?)

@gojomo
Collaborator

gojomo commented Nov 18, 2020

A practice I've liked on previous projects was for each auto-build to actually create the exact same distribution artifacts as a full release - and ideally, also, run the tests from an install of those artifacts. (And keep all of them around, at least for a little while.) Then, there's no separate release build or release repo: an official release is just a tiny labeling change - a version-label-bump of a few lines - then grabbing the exact same artifacts for uploading to official distribution points.
