From 18d2eb707b2dac8d92df2ec71ccf3075d975cc84 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Mon, 19 Feb 2024 10:38:43 -0800 Subject: [PATCH 01/13] src/doc/en/developer/coding_basics.rst: Large data files should not be added to the Sage source tree --- src/doc/en/developer/coding_basics.rst | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 7471a9d4022..4d2e3b23b9a 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -104,7 +104,7 @@ of the directory containing the Sage sources: setup.py ... sage/ # Sage library - ext_data/ # extra Sage resources (formerly src/ext) + ext_data/ # extra Sage resources (legacy) bin/ # the scripts in local/bin that are tracked upstream/ # tarballs of upstream sources local/ # installed binaries @@ -161,6 +161,16 @@ of the following places: os.path.join(os.path.dirname(__file__), 'sage-maxima.lisp') +- Large data files should not be added to the Sage source tree. + Instead: + + - Create a separate git repository for them + - Add metadata in your repository that make it a pip-installable + package (distribution package) + - Upload it to PyPI + - Create metadata in ``SAGE_ROOT/build/pkgs`` that make your new + pip-installable package known to Sage + - In an appropriate subdirectory of ``SAGE_ROOT/src/sage/ext_data/``. (At runtime, it is then available in the directory indicated by ``SAGE_EXTCODE``). 
For example, if ``file`` is placed in From 894b667cf9763f82fdad5e767e1ddc51fd4a25d6 Mon Sep 17 00:00:00 2001 From: gmou3 <32706872+gmou3@users.noreply.github.com> Date: Tue, 20 Feb 2024 01:04:51 +0200 Subject: [PATCH 02/13] Add compression suggestion and guiding example repos (#26) --- src/doc/en/developer/coding_basics.rst | 43 +++++++++++++++++--------- 1 file changed, 28 insertions(+), 15 deletions(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 4d2e3b23b9a..7f4fe24d08f 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -89,9 +89,9 @@ In particular, Files and directory structure ============================= -Roughly, the Sage directory tree is layout like this. Note that we use -``SAGE_ROOT`` in the following as a shortcut for the (arbitrary) name -of the directory containing the Sage sources: +Roughly, the Sage directory tree is laid out like this. Note that we +use ``SAGE_ROOT`` in the following as a shortcut for the name of the +directory containing the Sage sources: .. CODE-BLOCK:: text @@ -149,8 +149,8 @@ Adding new top-level packages below :mod:`sage` should be done sparingly. It is often better to create subpackages of existing packages. -Non-Python Sage source code and supporting files can be included in one -of the following places: +Non-Python Sage source code and small supporting files can be +included in one of the following places: - In the directory of the Python code that uses that file. When the Sage library is installed, the file will be installed in the same @@ -161,16 +161,6 @@ of the following places: os.path.join(os.path.dirname(__file__), 'sage-maxima.lisp') -- Large data files should not be added to the Sage source tree. 
- Instead: - - - Create a separate git repository for them - - Add metadata in your repository that make it a pip-installable - package (distribution package) - - Upload it to PyPI - - Create metadata in ``SAGE_ROOT/build/pkgs`` that make your new - pip-installable package known to Sage - - In an appropriate subdirectory of ``SAGE_ROOT/src/sage/ext_data/``. (At runtime, it is then available in the directory indicated by ``SAGE_EXTCODE``). For example, if ``file`` is placed in @@ -184,6 +174,29 @@ the section ``options.package_data`` of the file ``SAGE_ROOT/pkgs/sagemath-standard/setup.cfg.m4`` (or the corresponding file of another distribution). +Large data files should not be added to the Sage source tree. Instead, it +is proposed to do the following: + +- create a separate git repository and upload them there [2]_, + +- add metadata to the repository that make it a pip-installable + package (distribution package), + +- upload it to PyPI, + +- create metadata in ``SAGE_ROOT/build/pkgs`` that make your new + pip-installable package known to Sage. + +For guiding examples of external repositories that host large data +files, see https://github.com/sagemath/conway-polynomials, and +https://github.com/gmou3/matroid-database. + +.. [2] + + It is also suggested that the files are compressed, e.g., through + the command ``xz -e``. They can then be read via a command such as + ``lzma.open(file, 'rt')``. 
+ Learn by copy/paste =================== From e9038c502cbc964abb7ef1ae11825a91046b70af Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Mon, 19 Feb 2024 17:24:58 -0800 Subject: [PATCH 03/13] src/doc/en/developer/coding_basics.rst: Add links to packaging instructions --- src/doc/en/developer/coding_basics.rst | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 7f4fe24d08f..43ca573b239 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -180,12 +180,13 @@ is proposed to do the following: - create a separate git repository and upload them there [2]_, - add metadata to the repository that make it a pip-installable - package (distribution package), + package (distribution package), as explained for example in the + `Python Packaging User Guide `_, -- upload it to PyPI, +- `upload it to PyPI `_, - create metadata in ``SAGE_ROOT/build/pkgs`` that make your new - pip-installable package known to Sage. + pip-installable package known to Sage; see :ref:`chapter-packaging`. For guiding examples of external repositories that host large data files, see https://github.com/sagemath/conway-polynomials, and From 0f74867d40bee3f2b61f51d9f6788264bf1aa958 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Mon, 19 Feb 2024 17:40:07 -0800 Subject: [PATCH 04/13] src/doc/en/developer/coding_basics.rst: Mention importlib.resources.files --- src/doc/en/developer/coding_basics.rst | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 43ca573b239..5774539b817 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -154,12 +154,23 @@ included in one of the following places: - In the directory of the Python code that uses that file. 
When the Sage library is installed, the file will be installed in the same - location as the Python code. For example, - ``SAGE_ROOT/src/sage/interfaces/maxima.py`` needs to use the file - ``SAGE_ROOT/src/sage/interfaces/maxima.lisp`` at runtime, so it refers - to it as :: + location as the Python code. This is referred to as "package data". - os.path.join(os.path.dirname(__file__), 'sage-maxima.lisp') + The preferred way to access the data from Python is using the + function :func:`importlib.resources.files`. + + .. NOTE:: + + You may notice that some older code in the Sage library accesses + the package data in more direct ways. For example, + ``SAGE_ROOT/src/sage/interfaces/maxima.py`` uses the file + ``SAGE_ROOT/src/sage/interfaces/maxima.lisp`` at runtime, so it + refers to it as:: + + os.path.join(os.path.dirname(__file__), 'sage-maxima.lisp') + + This is no longer recommended, and PRs that update such uses + are welcome. - In an appropriate subdirectory of ``SAGE_ROOT/src/sage/ext_data/``. (At runtime, it is then available in the directory indicated by From 59a5204a757aaf40395882e840ce5a7ed85a812d Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Mon, 19 Feb 2024 17:49:48 -0800 Subject: [PATCH 05/13] src/doc/en/developer/coding_basics.rst: Show how to import importlib.resources.files --- src/doc/en/developer/coding_basics.rst | 8 +++++++- src/doc/en/developer/coding_in_python.rst | 1 + 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 5774539b817..4511e6a2fff 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -157,7 +157,13 @@ included in one of the following places: location as the Python code. This is referred to as "package data". The preferred way to access the data from Python is using the - function :func:`importlib.resources.files`. + function :func:`importlib.resources.files`. 
It should be imported + as follows (see :ref:`section-python-language-standard`):: + + try: + from importlib_resources import files + except ImportError: + from importlib.resources import files .. NOTE:: diff --git a/src/doc/en/developer/coding_in_python.rst b/src/doc/en/developer/coding_in_python.rst index 6b3b936662d..fef7a1011f0 100644 --- a/src/doc/en/developer/coding_in_python.rst +++ b/src/doc/en/developer/coding_in_python.rst @@ -7,6 +7,7 @@ Coding in Python for Sage This chapter discusses some issues with, and advice for, coding in Sage. +.. _section-python-language-standard: Python language standard ======================== From 827372e71488a79ee32620edd055fcc1f7683e06 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Mon, 19 Feb 2024 18:09:16 -0800 Subject: [PATCH 06/13] src/doc/en/developer/coding_basics.rst: Mention importlib.resources.as_file, add links to tutorial --- src/doc/en/developer/coding_basics.rst | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 4511e6a2fff..0dcaa2d092a 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -157,14 +157,28 @@ included in one of the following places: location as the Python code. This is referred to as "package data". The preferred way to access the data from Python is using the - function :func:`importlib.resources.files`. It should be imported - as follows (see :ref:`section-python-language-standard`):: + `importlib.resources API `, + in particular the function :func:`importlib.resources.files`. 
+ It should be imported as follows + (see :ref:`section-python-language-standard`):: try: + # Use backport package providing Python 3.11 features from importlib_resources import files except ImportError: from importlib.resources import files + If the file needs to be used outside of Python, then the + preferred way is using the context manager + :func:`importlib.resources.as_file`. It should be imported as + follows:: + + try: + # Use backport package providing Python 3.11 features + from importlib_resources import as_file + except ImportError: + from importlib.resources import as_file + .. NOTE:: You may notice that some older code in the Sage library accesses From a72e2fadfb82fc3a51981dfc00e8b42832762007 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Gonzalo=20Tornar=C3=ADa?= Date: Mon, 19 Feb 2024 18:37:28 -0800 Subject: [PATCH 07/13] src/doc/en/developer/coding_basics.rst: Add importlib.resources examples --- src/doc/en/developer/coding_basics.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 0dcaa2d092a..8ed7071724d 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -168,6 +168,15 @@ included in one of the following places: except ImportError: from importlib.resources import files + After this import, you can: + + - open a resource for text reading: `fd = files(package).joinpath(resource).open('rt')` + - open a resource for binary reading: `fd = files(package).joinpath(resource).open('rb')` + - read a resource as text: `text = files(package).joinpath(resource).read_text()` + - read a resource as bytes: `bytes = files(package).joinpath(resource).read_bytes()` + - open a xz resource for text reading: `fd = lzma.open(files(package).joinpath(resource), 'rt')` + - open a xz resource for binary reading: `fd = lzma.open(files(package).joinpath(resource), 'rb')` + If the file needs to be used outside of Python, then the 
preferred way is using the context manager :func:`importlib.resources.as_file`. It should be imported as From bab3d03ad9a6bf04bea4d285efb4bee82c6f10e3 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Tue, 20 Feb 2024 08:53:29 -0800 Subject: [PATCH 08/13] src/doc/en/developer/coding_basics.rst: Use sys.version_info instead of try..except --- src/doc/en/developer/coding_basics.rst | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 8ed7071724d..8936cea7a5d 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -162,10 +162,12 @@ included in one of the following places: It should be imported as follows (see :ref:`section-python-language-standard`):: - try: + import sys + + if sys.version_info < (3, 11): # Use backport package providing Python 3.11 features from importlib_resources import files - except ImportError: + else: from importlib.resources import files After this import, you can: @@ -179,14 +181,8 @@ included in one of the following places: If the file needs to be used outside of Python, then the preferred way is using the context manager - :func:`importlib.resources.as_file`. It should be imported as - follows:: - - try: - # Use backport package providing Python 3.11 features - from importlib_resources import as_file - except ImportError: - from importlib.resources import as_file + :func:`importlib.resources.as_file`. It should be imported in the + same way as shown above. .. 
NOTE:: From 515b4b36bb2bc92768d72ea5ecb206606628ef1f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20K=C3=B6ppe?= Date: Tue, 20 Feb 2024 09:05:00 -0800 Subject: [PATCH 09/13] src/doc/en/developer/coding_basics.rst: Details on resource `open` examples MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Gonzalo Tornaría --- src/doc/en/developer/coding_basics.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 8936cea7a5d..ee39c463196 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -176,8 +176,8 @@ included in one of the following places: - open a resource for binary reading: `fd = files(package).joinpath(resource).open('rb')` - read a resource as text: `text = files(package).joinpath(resource).read_text()` - read a resource as bytes: `bytes = files(package).joinpath(resource).read_bytes()` - - open a xz resource for text reading: `fd = lzma.open(files(package).joinpath(resource), 'rt')` - - open a xz resource for binary reading: `fd = lzma.open(files(package).joinpath(resource), 'rb')` + - open a xz resource for text reading: `fd = lzma.open(files(package).joinpath(resource).open('rb'), 'rt')` + - open a xz resource for binary reading: `fd = lzma.open(files(package).joinpath(resource).open('rb'), 'rb')` If the file needs to be used outside of Python, then the preferred way is using the context manager From 421c37fc2f04446a397c7ce7a3e60ab47e719b99 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Mon, 26 Feb 2024 15:15:57 -0800 Subject: [PATCH 10/13] src/doc/en/developer/coding_basics.rst: Don't encourage PRs that update direct access to data files using __file__ for now --- src/doc/en/developer/coding_basics.rst | 17 ++++++-----------
b/src/doc/en/developer/coding_basics.rst index ee39c463196..cec72b8add5 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -184,18 +184,13 @@ included in one of the following places: :func:`importlib.resources.as_file`. It should be imported in the same way as shown above. - .. NOTE:: - - You may notice that some older code in the Sage library accesses - the package data in more direct ways. For example, - ``SAGE_ROOT/src/sage/interfaces/maxima.py`` uses the file - ``SAGE_ROOT/src/sage/interfaces/maxima.lisp`` at runtime, so it - refers to it as:: - - os.path.join(os.path.dirname(__file__), 'sage-maxima.lisp') +- Older code in the Sage library accesses + the package data in more direct ways. For example, + ``SAGE_ROOT/src/sage/interfaces/maxima.py`` uses the file + ``SAGE_ROOT/src/sage/interfaces/maxima.lisp`` at runtime, so it + refers to it as:: - This is no longer recommended, and PRs that update such uses - are welcome. + os.path.join(os.path.dirname(__file__), 'sage-maxima.lisp') - In an appropriate subdirectory of ``SAGE_ROOT/src/sage/ext_data/``. 
(At runtime, it is then available in the directory indicated by From 0c28f6f05aedb6fa110481a9a94cbc24ac350783 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Mon, 26 Feb 2024 15:16:37 -0800 Subject: [PATCH 11/13] src/doc/en/developer/coding_basics.rst: Link to https://github.com/sagemath/sage/issues/33037 for SAGE_EXTCODE deprecation --- src/doc/en/developer/coding_basics.rst | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index cec72b8add5..3bc8457b646 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -200,7 +200,9 @@ included in one of the following places: from sage.env import SAGE_EXTCODE file = os.path.join(SAGE_EXTCODE, 'directory', 'file') -In both cases, the files must be listed (explicitly or via wildcards) in + This practice is deprecated, see :issue:`33037`. + +In all cases, the files must be listed (explicitly or via wildcards) in the section ``options.package_data`` of the file ``SAGE_ROOT/pkgs/sagemath-standard/setup.cfg.m4`` (or the corresponding file of another distribution). From 81741c5c102f45d05060296b9ee52e240f6b9e22 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Sat, 16 Mar 2024 12:15:40 -0700 Subject: [PATCH 12/13] src/doc/en/developer/coding_basics.rst: Remove advice on how to import from importlib.resources, fix markup --- src/doc/en/developer/coding_basics.rst | 27 ++++++++------------------ 1 file changed, 8 insertions(+), 19 deletions(-) diff --git a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index 3bc8457b646..a1f00905869 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -159,25 +159,14 @@ included in one of the following places: The preferred way to access the data from Python is using the `importlib.resources API `, in particular the function :func:`importlib.resources.files`. 
- It should be imported as follows - (see :ref:`section-python-language-standard`):: - - import sys - - if sys.version_info < (3, 11): - # Use backport package providing Python 3.11 features - from importlib_resources import files - else: - from importlib.resources import files - - After this import, you can: - - - open a resource for text reading: `fd = files(package).joinpath(resource).open('rt')` - - open a resource for binary reading: `fd = files(package).joinpath(resource).open('rb')` - - read a resource as text: `text = files(package).joinpath(resource).read_text()` - - read a resource as bytes: `bytes = files(package).joinpath(resource).read_bytes()` - - open a xz resource for text reading: `fd = lzma.open(files(package).joinpath(resource).open('rb'), 'rt')` - - open a xz resource for binary reading: `fd = lzma.open(files(package).joinpath(resource).open('rb'), 'rb')` + Using it, you can: + + - open a resource for text reading: ``fd = files(package).joinpath(resource).open('rt')`` + - open a resource for binary reading: ``fd = files(package).joinpath(resource).open('rb')`` + - read a resource as text: ``text = files(package).joinpath(resource).read_text()`` + - read a resource as bytes: ``bytes = files(package).joinpath(resource).read_bytes()`` + - open an xz-compressed resource for text reading: ``fd = lzma.open(files(package).joinpath(resource).open('rb'), 'rt')`` + - open an xz-compressed resource for binary reading: ``fd = lzma.open(files(package).joinpath(resource).open('rb'), 'rb')`` If the file needs to be used outside of Python, then the preferred way is using the context manager From d3a4313f026985e4d4cce01db169f386b5a1e042 Mon Sep 17 00:00:00 2001 From: Matthias Koeppe Date: Fri, 22 Mar 2024 14:19:10 -0700 Subject: [PATCH 13/13] src/doc/en/developer/coding_basics.rst: Fix markup, break long lines --- src/doc/en/developer/coding_basics.rst | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git 
a/src/doc/en/developer/coding_basics.rst b/src/doc/en/developer/coding_basics.rst index a1f00905869..00844de6f4a 100644 --- a/src/doc/en/developer/coding_basics.rst +++ b/src/doc/en/developer/coding_basics.rst @@ -157,7 +157,8 @@ included in one of the following places: location as the Python code. This is referred to as "package data". The preferred way to access the data from Python is using the - `importlib.resources API `, + `importlib.resources API + `_, in particular the function :func:`importlib.resources.files`. Using it, you can: @@ -203,9 +204,11 @@ is proposed to do the following: - add metadata to the repository that make it a pip-installable package (distribution package), as explained for example in the - `Python Packaging User Guide `_, + `Python Packaging User Guide + `_, -- `upload it to PyPI `_, +- `upload it to PyPI + `_, - create metadata in ``SAGE_ROOT/build/pkgs`` that make your new pip-installable package known to Sage; see :ref:`chapter-packaging`.
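Reviewer note (illustration only, not part of the patch series): the access patterns that the final version of ``coding_basics.rst`` lists for ``importlib.resources`` can be tried end to end with the sketch below. It assumes Python 3.9 or newer, so that ``files`` and ``as_file`` come from the standard library. The package name ``demo_pkg``, the file names ``data.txt`` and ``data.txt.xz``, and the helper ``demo`` are hypothetical and exist only for this example; a throwaway package is built in a temporary directory so the snippet does not depend on Sage being installed.

```python
import importlib
import lzma
import sys
import tempfile
from importlib.resources import as_file, files
from pathlib import Path


def demo():
    """Build a throwaway package (hypothetical name) and read its package data."""
    with tempfile.TemporaryDirectory() as tmp:
        # Lay out demo_pkg/ with a plain-text and an xz-compressed resource.
        pkg_dir = Path(tmp) / "demo_pkg"
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "data.txt").write_text("hello\n")
        with lzma.open(pkg_dir / "data.txt.xz", "wt") as f:
            f.write("compressed hello\n")

        # Make the temporary package importable for the duration of the demo.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            root = files("demo_pkg")

            # Read a resource as text and as bytes.
            text = root.joinpath("data.txt").read_text()
            raw = root.joinpath("data.txt").read_bytes()

            # Open an xz-compressed resource for text reading: open the
            # resource in binary mode first, then hand the stream to lzma.
            with root.joinpath("data.txt.xz").open("rb") as binary:
                with lzma.open(binary, "rt") as fd:
                    xz_text = fd.read()

            # as_file yields a real filesystem path, for use outside Python.
            with as_file(root.joinpath("data.txt")) as path:
                on_disk = path.exists()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)
        return text, raw, xz_text, on_disk


if __name__ == "__main__":
    print(demo())
```

Opening the resource with ``.open('rb')`` before passing it to ``lzma.open`` matches the change made in PATCH 09/13: it avoids assuming that the resource is a regular file on disk, which may not hold for packages imported from a zip archive.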