merge, trim and update python dependencies sections
Fixes #156
egpbos committed Apr 6, 2022
1 parent 31e83b3 commit 3bf470e
Showing 1 changed file with 49 additions and 35 deletions.
84 changes: 49 additions & 35 deletions best_practices/language_guides/python.md
@@ -32,43 +32,74 @@ Building and/or using Python 2 is probably discouraged even more than, say, usin

## Dependencies and package management

Use `pip` or `conda` (note that pip and conda can be used side by side, see also [what is the difference between pip and conda?](http://stackoverflow.com/questions/20994716/what-is-the-difference-between-pip-and-conda)).
To install Python packages use `pip` or `conda` (or both, see also [what is the difference between pip and conda?](http://stackoverflow.com/questions/20994716/what-is-the-difference-between-pip-and-conda)).

If you are planning on distributing your code at a later stage, be aware that your choice of package management may affect your packaging process. See [Building and packaging](#building-and-packaging-code) for more info.

### Use virtual environments

We strongly recommend creating isolated "virtual environments" for each Python project.
These can be created with `virtualenv` or with `conda`.
Advantages over installing packages system-wide or in a single user folder:

* Installs Python modules when you are not root.
* Contains all Python dependencies so the environment keeps working after an upgrade.
* Keeps environments clean for each project, so you don't get more than you need (and can easily reproduce that minimal working situation).
* Lets you select the Python version per environment, so you can test code compatibility between Python 2.x and 3.x.

### Pip + virtualenv

Create isolated Python environments with [virtualenv](https://virtualenv.pypa.io/en/latest/). Very much recommended for all Python projects since it:
If you don't want to use `conda`, create isolated Python environments with [virtualenv](https://virtualenv.pypa.io/en/latest/), [virtualenvwrapper](https://virtualenvwrapper.readthedocs.org) or, if you are using Python 3 only, the standard library [venv](https://docs.python.org/3/library/venv.html) module.

* installs Python modules when you are not root,
* contains all Python dependencies so the environment keeps working after an upgrade, and
* lets you select the Python version per environment, so you can test code compatibility between Python 2.x and 3.x.
With virtualenv and venv, pip is used to install all dependencies. An increasing number of packages are using [`wheel`](http://pythonwheels.com), so pip downloads and installs them as binaries. This means they have no build dependencies and are much faster to install.
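
As an illustration of the `venv` route, here is a minimal sketch that creates an environment from Python itself using the standard library `venv` module. The function name `create_environment` and the example path are hypothetical; on the command line, `python3 -m venv /path/to/environment` does the same job.

```python
import os
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return the path of its interpreter."""
    env_dir = Path(env_dir)
    # with_pip=True bootstraps pip into the new environment via ensurepip.
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(env_dir)
    # Windows puts executables in "Scripts", other platforms in "bin".
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Calling `create_environment("/path/to/environment")` returns the interpreter of the new environment, which can then be used to run `-m pip install ...` inside it.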

To manage multiple virtualenv environments and reference them only by name, use [virtualenvwrapper](https://virtualenvwrapper.readthedocs.org). To create a new environment, run `mkvirtualenv environment_name`, to start using it, run `workon environment_name` and to stop working with it, run `deactivate`.
If the installation of a package fails because of its native extensions or system library dependencies and you are not root, you could switch to `conda` (see below).

To get an overview of the packages used by your package, run `pip freeze` in your environment.
This can be useful when writing the `setup.py` script for your package (see [Building and packaging](#building-and-packaging-code)).
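
If you prefer to collect this overview from within Python, the sketch below approximates `pip freeze` with the standard library `importlib.metadata` module (Python 3.8 or later). The function name `installed_packages` is hypothetical, and the output is only an approximation: unlike `pip freeze`, it does not treat editable or version-control installs specially.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings for the distributions visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)
```

Run it inside the activated environment of your project, so that only the packages of that project are listed.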

### Conda

[Conda](http://conda.pydata.org/docs/) can be used instead of virtualenv and pip, since it is both an environment manager and a package manager. It easily installs binary dependencies, like Python itself or system libraries. Installation of packages that are not using `wheel` but have a lot of native code is much faster than `pip` because Conda does not compile the package, it only downloads compiled packages. The disadvantage of Conda is that the package needs to have a Conda build recipe. Many Conda build recipes already exist, but they are less common than the `setup.py` that generally all Python packages have.

There are two main distributions of Conda: [Anaconda](https://docs.anaconda.com/anaconda/install/) and [Miniconda](https://docs.conda.io/projects/continuumio-conda/en/latest/user-guide/install/index.html). Anaconda is large and contains a lot of common packages, like numpy and matplotlib, whereas Miniconda is very lightweight and only contains Python. If you need more, the `conda` command acts as a package manager for Python packages.

For environments where you do not have admin rights (e.g. DAS-5) either Anaconda or Miniconda is highly recommended, since the install is very straightforward. The installation of packages through Conda is very robust.
A possible downside of Anaconda is the fact that this is offered by a commercial supplier, but we don't foresee any vendor lock-in issues.

If you are using Python 3 only, you can also make use of the standard library [venv](https://docs.python.org/3/library/venv.html) module. Creating a virtual environment with it is as easy as running `python3 -m venv /path/to/environment`. Run `. /path/to/environment/bin/activate` to start using it and `deactivate` to deactivate.
### Building and packaging code

With virtualenv and venv, pip is used to install all dependencies. An increasing number of packages are using [`wheel`](http://pythonwheels.com), so pip downloads and installs them as binaries. This means they have no build dependencies and are much faster to install. If the installation of a package fails because of its native extensions or system library dependencies and you are not root, you have to revert to Conda (see below).
To create an installable Python package, create a file `setup.py` and use the [`setuptools`](https://setuptools.readthedocs.io) module.

To keep a log of the packages used by your package, run `pip freeze > requirements.txt` in the root of your package. If some of the packages listed in `requirements.txt` are needed during testing only, use an editor to move those lines to `test_requirements.txt`. Now your package can be installed with
This is also the primary location where you should list your dependencies (find the currently installed packages with `pip freeze` or `conda list`).
Use the `install_requires` argument to list them.
Keep version constraints to a minimum; use, in order of descending preference: no constraints, lower bounds, lower + upper bounds, exact versions.
Use of `requirements.txt` is discouraged, unless necessary for something specific, see the [discussion here](https://github.com/NLeSC/guide/issues/156).
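
A minimal sketch of such a `setup.py` could look as follows. The package name `my_package`, its version, and the listed dependencies and version numbers are placeholders, chosen only to show the four constraint styles in order of preference. The `sys.argv` check and the import inside `main` are choices of this sketch, not something `setuptools` requires: they let the file be imported or run without a command so the dependency list can be inspected.

```python
"""setup.py: a minimal sketch; every name and version below is a placeholder."""
import sys

# Dependencies, ordered from most to least preferred style of constraint.
INSTALL_REQUIRES = [
    "requests",           # no constraint
    "numpy>=1.20",        # lower bound only
    "pandas>=1.3,<3",     # lower and upper bound
    "matplotlib==3.5.1",  # exact version, only when nothing else works
]


def main():
    # Imported here so the list above can be inspected without setuptools installed.
    from setuptools import find_packages, setup

    setup(
        name="my_package",
        version="0.1.0",
        packages=find_packages(),
        install_requires=INSTALL_REQUIRES,
    )


# setup() expects a command such as `sdist`; skip it when none is given.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```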

```shell
pip install -r requirements.txt
```

When `setup.py` is written, your package can be installed with

```shell
pip install -e .
```

The `-e` flag will install your package in editable mode, i.e. it will create a symlink to your package in the installation location instead of copying the package. This is convenient when developing, because any changes you make to the source code will immediately be available for use in the installed version.

### Conda
Set up continuous integration to test your installation script. Use `pyroma` (can be run as part of `prospector`) as a linter for your installation script.

[Conda](http://conda.pydata.org/docs/) can be used instead of virtualenv and pip. It easily installs binary dependencies, like Python itself or system libraries. Installation of packages that are not using `wheel` but have a lot of native code is much faster than `pip` because Conda does not compile the package, it only downloads compiled packages. The disadvantage of Conda is that the package needs to have a Conda build recipe. Many Conda build recipes already exist, but they are less common than the `setup.py` that generally all Python packages have.
For packaging your code, you can either use `pip` or `conda`. Neither of them is [better than the other](https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/) -- they are different; use the one which is more suitable for your project. `pip` may be more suitable for distributing pure python packages, and it provides some support for binary dependencies using [`wheels`](http://pythonwheels.com). `conda` may be more suitable when you have external dependencies which cannot be packaged in a wheel.

There are two main distributions of Conda: [Anaconda](https://docs.anaconda.com/anaconda/install/) and [Miniconda](https://docs.conda.io/projects/continuumio-conda/en/latest/user-guide/install/index.html). Anaconda is large and contains a lot of common packages, like numpy and matplotlib, whereas Miniconda is very lightweight and only contains Python. If you need more, the `conda` command acts as a package manager for Python packages.
* Upload your package to the [Python Package Index (PyPI)](https://pypi.org) so it can be installed with pip.
    * Either do this manually by using [twine](https://github.com/pypa/twine) ([tutorial](http://blog.securem.eu/tips%20and%20tricks/2016/02/29/creating-and-publishing-a-python-module/)),
    * Or configure [Travis CI](https://docs.travis-ci.com/user/deployment/pypi/) or [Circle-CI](https://circleci.com/blog/continuously-deploying-python-packages-to-pypi-with-circleci/) to do it automatically for each release.
    * Additional guidelines:
        * Packages should be uploaded to PyPI using [your own account](https://pypi.org/account/register)
        * For packages developed in a team or organization, it is recommended that you create a team or organizational account on PyPI and add that as a collaborator with the owner role. This will allow your team or organization to maintain the package even if individual contributors at some point move on to do other things. At the Netherlands eScience Center, we are a fairly small organization, so we use a single backup account (`nlesc`).
        * When distributing code through PyPI, non-Python files (such as `requirements.txt`) will not be packaged automatically; you need to [add them to](https://stackoverflow.com/questions/1612733/including-non-python-files-with-setup-py) a `MANIFEST.in` file.
        * To test whether your distribution will work correctly before uploading to PyPI, you can run `python setup.py sdist` in the root of your repository. Then try installing your package with `pip install dist/<your_package>.tar.gz`.
* [Build using conda](http://conda.pydata.org/docs/build_tutorials.html)
    * **Make use of [conda-forge](https://conda-forge.org/) whenever possible**, since it provides many automated build services that save you tons of work, compared to using your own conda repository. It also has a very active community for when you need help.
    * Use BioConda or custom channels (hosted on GitHub) as alternatives if need be.
* [Python wheels](http://pythonwheels.com/) are the new standard for [distributing](https://packaging.python.org/distributing/#wheels) Python packages. For pure python code, without C extensions, use [`bdist_wheel`](https://packaging.python.org/distributing/#pure-python-wheels) with a Python 2 and Python 3 setup, or use [`bdist_wheel --universal`](https://packaging.python.org/distributing/#universal-wheels) if the code is compatible with both Python 2 and 3. If C extensions are used, each OS needs to have its own wheel. The [manylinux](https://github.com/pypa/manylinux) docker images can be used for building wheels compatible with multiple Linux distributions. See [the manylinux demo](https://github.com/pypa/python-manylinux-demo) for an example. Wheel building can be automated using Travis (for pure python, Linux and OS X) and Appveyor (for Windows).

Use `conda install` to install new packages and `conda update` to keep your system up to date. The `conda` command can also be used to create virtual environments.

For environments where you do not have admin rights (e.g. DAS-5) either Anaconda or Miniconda is highly recommended, since the install is very straightforward. The installation of packages through Conda seems very robust. If you want to add packages to the (Ana)conda repositories, please check [the conda-build documentation](https://docs.conda.io/projects/conda-build/en/latest/index.html).
A possible downside of Anaconda is the fact that this is offered by a commercial supplier, but we don't foresee any vendor lock-in issues.

## Editors and IDEs

@@ -93,23 +124,6 @@ Make sure to set strictness to `veryhigh` for best results. `prospector` has its

Autoformatting tools like [`yapf`](https://github.com/google/yapf) and [`black`](https://black.readthedocs.io/en/stable/index.html) can automatically format code for optimal readability. `yapf` is configurable to suit your (team's) preferences, whereas `black` enforces the style chosen by the `black` authors. The [`isort`](http://timothycrosley.github.io/isort/) package automatically formats and groups all imports in a standard, readable way.

## Building and packaging code

To create an installable Python package, create a file `setup.py` and use the [`setuptools`](https://setuptools.readthedocs.io) module. Make sure you only import standard library packages in `setup.py`, directly or through importing other modules of your package, or your package will fail to install on systems that do not have the required dependencies pre-installed. Set up continuous integration to test your installation script. Use `pyroma` (can be run as part of `prospector`) as a linter for your installation script.

For packaging your code, you can either use `pip` or `conda`. Neither of them is [better than the other](https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/) -- they are different; use the one which is more suitable for your project. `pip` may be more suitable for distributing pure python packages, and it provides some support for binary dependencies using [`wheels`](http://pythonwheels.com). `conda` may be more suitable when you have external dependencies which cannot be packaged in a wheel.

* Upload your package to the [Python Package Index (PyPI)](https://pypi.org) so it can be installed with pip.
    * Either do this manually by using [twine](https://github.com/pypa/twine) ([tutorial](http://blog.securem.eu/tips%20and%20tricks/2016/02/29/creating-and-publishing-a-python-module/)),
    * Or configure [Travis CI](https://docs.travis-ci.com/user/deployment/pypi/) or [Circle-CI](https://circleci.com/blog/continuously-deploying-python-packages-to-pypi-with-circleci/) to do it automatically for each release.
    * Additional guidelines:
        * Packages should be uploaded to PyPI using [your own account](https://pypi.org/account/register)
        * For packages developed in a team or organization, it is recommended that you create a team or organizational account on PyPI and add that as a collaborator with the owner role. This will allow your team or organization to maintain the package even if individual contributors at some point move on to do other things. At the Netherlands eScience Center, we are a fairly small organization, so we use a single backup account (`nlesc`).
        * When distributing code through PyPI, non-Python files (such as `requirements.txt`) will not be packaged automatically; you need to [add them to](https://stackoverflow.com/questions/1612733/including-non-python-files-with-setup-py) a `MANIFEST.in` file.
        * To test whether your distribution will work correctly before uploading to PyPI, you can run `python setup.py sdist` in the root of your repository. Then try installing your package with `pip install dist/<your_package>.tar.gz`.
* [Build using conda](http://conda.pydata.org/docs/build_tutorials.html)
    * If desired, add packages to [conda-forge](https://conda-forge.github.io/). Use BioConda or custom channels (hosted on GitHub) as alternatives if need be.
* [Python wheels](http://pythonwheels.com/) are the new standard for [distributing](https://packaging.python.org/distributing/#wheels) Python packages. For pure python code, without C extensions, use [`bdist_wheel`](https://packaging.python.org/distributing/#pure-python-wheels) with a Python 2 and Python 3 setup, or use [`bdist_wheel --universal`](https://packaging.python.org/distributing/#universal-wheels) if the code is compatible with both Python 2 and 3. If C extensions are used, each OS needs to have its own wheel. The [manylinux](https://github.com/pypa/manylinux) docker images can be used for building wheels compatible with multiple Linux distributions. See [the manylinux demo](https://github.com/pypa/python-manylinux-demo) for an example. Wheel building can be automated using Travis (for pure python, Linux and OS X) and Appveyor (for Windows).

## Testing
