Make PySPQR into an installable Python module #5
This looks very nice. I can accept the pull request. Do you think I should rename the git repository "sparseqr"? As a git repository, the name becomes ambiguous without a "py" in front. I can also rename the spqr.py modules inside. Note that this does not work for me on macOS. It seems to install the module fine, but runs into a permission error when installing the test data:
I will look into fixing this before I pull. Also, I think this should be called version 1.0, not 0.1.0. I agree that in the long run this would find a better home inside scikit-sparse. I took a different approach than they do to wrapping the C library by using cffi. |
It seems like |
I think it is fine to leave the repository name as-is. Any existing users (and their friends) expect to find the project by that name. A usage example in the README (and an included

Good point about GitHub not being Python-specific; also, to be fair, there are quite a few libraries on PyPI starting or ending with a redundant "py", so it's not strictly necessary to get rid of it. I'm not sure if "

In my opinion, for the user it only matters that the package name is somewhat sensible (to maximize the first-glance readability of the source code, following the Python spirit), and that the functions are callable directly under that. There is no need to present nested namespaces to the user, as this is a small library. Internally, the code can of course be organized in whatever way is the most logical and maintainable; we can make the package-level

To me the error sounds like setuptools on Mac OS is trying to install non-package data files (and directories) directly under

For the record, on Linux (Mint), this works differently: Python (3.4) packages install into

I was specifically using

Maybe using

I'll also change the version as requested. The

Updated PR to follow soon. |
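The point about keeping functions callable directly under the package name, while organizing the internals freely, can be illustrated with a small sketch. Everything named here (`demo_pkg`, `_core.py`, `compute`) is hypothetical and is not taken from the PR; the sketch only shows the general re-export pattern in `__init__.py`, built in a temporary directory so that it runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical internal module; its layout is an implementation detail.
CORE_SOURCE = (
    "def compute(x):\n"
    "    # Placeholder standing in for a real wrapper function.\n"
    "    return ('computed', x)\n"
)

# Package-level namespace: re-export the public name so users never see _core.
INIT_SOURCE = "from ._core import compute\n__all__ = ['compute']\n"


def build_demo_package(root):
    """Write the hypothetical package 'demo_pkg' under the directory root."""
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    (pkg / "_core.py").write_text(CORE_SOURCE)
    (pkg / "__init__.py").write_text(INIT_SOURCE)


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(tmp)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo_pkg = importlib.import_module("demo_pkg")
        # The function is reachable directly under the package name.
        result = demo_pkg.compute(3)
        public_names = list(demo_pkg.__all__)
    finally:
        # Leave the interpreter state as we found it.
        sys.path.remove(tmp)
        for name in ("demo_pkg", "demo_pkg._core"):
            sys.modules.pop(name, None)

print(result)
```

With this layout the internal module can later be renamed or split without changing what users import.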
...but before updating the PR, one more comment: About As you already said, the approaches to wrapping C code are different; specifically, Cython vs. I wonder, in the long term, what would be the best way forward:
Solution 1 is ugly, since the code then depends on two independent systems to wrap (and load) the same C library. It also introduces unnecessary duplication of the data conversion routines that convert between SuiteSparse and NumPy/SciPy formats, since each of the two wrapper libraries has its own implementation of these. Solution 2 discards a tested, rather thoughtfully written (e.g. shared allocator for the triplet vectors of the same sparse matrix!) and most importantly fully working wrapper library that users already depend on. Solution 3 similarly discards a rather simple and elegant solution, similarly already tested and fully working. |
I'll think about the name a bit more. Regarding the long term, I don't know what the best way forward is. I think it's up to whoever chooses to maintain scikit-sparse (or scipy, if they want to take this on). I don't really have a problem with there being two entirely separate libraries, so long as they can be found and used. It's somewhat suboptimal, but I already use scipy and cvxopt a lot and have to convert matrix formats. If cvxopt or scikit-sparse ever stopped working (bitrot) and I needed CHOLMOD, I would probably add it to this library using |
Naming of the internal files, good point. To support both use cases, the names of the package and the main module indeed need to match. If you want an opinion, I think

While at it, perhaps we could also get rid of the

Digging a bit, now I remember another reason why I originally steered clear of

Lies, More Lies and Python Packaging Documentation on

Searching the web for package_data setuptools turns up also: How to include package data with setuptools/distribute?

It seems that to include data files in both sdist and bdist, writing a MANIFEST.in is the way to go: Install package data with setuptools

Then there are at least pbr and Flit offering a simplified interface for Python packaging, but for this higher level of abstraction no single dominant tool seems to have emerged yet, so it's probably better for now to stay on setuptools. |
… setuptools prefix may be set to /usr/local, making the install choke on attempting to install the non-package data files. README.md and test/test.py are now only included in the sdist, using MANIFEST.in.
Ok, investigation complete, and PR updated. Version bumped to 1.0.0, and install on Mac OS hopefully fixed. Please test the updated version.

There is still the module and function renaming to do - if we agree on the names, I could do that before you pull?

One final detail is that, in order to have GitHub detect and display the license type automatically, I would like to include a LICENSE.md, if that's ok. This seems to be the official text for CC0 1.0.

About the installation issue, this SO comment confirms that with the previous setup.py,

By default, in Linux (Mint) each Python package gets its own prefix, which is why it worked for me. In the case of

On Mac OS, based on your report, it sounds like all packages share the same prefix

As the SO comment notes, non-package data files have no natural install location. So, the current solution simply includes them in the sdist (by specifying them in MANIFEST.in), but leaves them out when installing. I think from the user's perspective, this should be ok. |
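To make the prefix explanation above concrete, here is a minimal diagnostic sketch. It assumes, as described in this comment, that a relative data-file directory is resolved against the install prefix; the helper names `resolve_data_dir` and `is_writable` are hypothetical and are not part of setuptools or of this PR. It only reports where a relative directory such as `test` would land and whether the current user could write there.

```python
import os
import sys


def resolve_data_dir(prefix, relative_dir):
    """Return the absolute directory a relative data target would map to."""
    return os.path.normpath(os.path.join(prefix, relative_dir))


def is_writable(path):
    """Check write access on path, or on its nearest existing ancestor."""
    probe = path
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # reached the filesystem root
            break
        probe = parent
    return os.access(probe, os.W_OK)


# Assumption: the active prefix is sys.prefix; a real install may use another one.
target = resolve_data_dir(sys.prefix, "test")
print(target, "writable:", is_writable(target))
```

If the prefix is a shared system location, the writability check would be expected to fail for an unprivileged user, which matches the kind of permission error reported earlier in the thread.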
I think at this point we can radically simplify the
|
Sure, we can/should add a https://github.com/yig/rig_converter/blob/master/LICENSE |
Regarding renaming the functions themselves, we should probably follow the C API itself, which means:
|
… character coding declaration and future-imports)
Re The reason behind the proposed automatic detection for package

As many things, it's a tradeoff - if you think it is better to keep

I've 'pulled' your version of

Re LICENSE text, thanks for the file! Added to the PR as LICENSE.md. (.md because also README is .md, and README/LICENSE/CHANGELOG/AUTHORS usually come as a set in the same format.)

Speaking of which, are there any other contributors to name? Asking just to make sure; there weren't any on the GitHub network graph. Do you want your email address in

Re naming the functions, while transparency in naming does have its merit, there are a couple of contrasting considerations:
|
Tested to make sure:

import numpy as np
import scipy.sparse

A_dense = np.eye(5)  # can use any dense matrix
for clsname in [s for s in dir(scipy.sparse) if s.endswith('_matrix')]:
    clsobj = getattr(scipy.sparse, clsname)
    A_sparse = clsobj(A_dense)
    print(isinstance(A_sparse, scipy.sparse.spmatrix))  # True
|
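Since a single type test covers the sparse matrix classes, choosing a code path based on whether the right-hand side is dense or sparse can be sketched as follows. This is an illustration only: `solve_dispatch` and the two stub solvers are hypothetical names, and the stubs use `numpy.linalg.lstsq` as a stand-in so the sketch runs without SuiteSparse. It does not reproduce what the PR's functions do.

```python
import numpy as np
import scipy.sparse


def solve_dispatch(A, b, solve_dense_rhs, solve_sparse_rhs):
    """Route to one of two solver callables depending on the type of b."""
    if scipy.sparse.issparse(b):
        return solve_sparse_rhs(A, b)
    return solve_dense_rhs(A, np.asarray(b))


# Stand-in solvers (assumption: NOT the SPQR wrappers, just dense least squares).
def dense_stub(A, b):
    return np.linalg.lstsq(A.toarray(), b, rcond=None)[0]


def sparse_stub(A, b):
    x = np.linalg.lstsq(A.toarray(), b.toarray(), rcond=None)[0]
    return scipy.sparse.csc_matrix(x)


A = scipy.sparse.csc_matrix(2.0 * np.eye(3))
b_dense = np.ones(3)
b_sparse = scipy.sparse.csc_matrix(np.ones((3, 1)))

x_dense = solve_dispatch(A, b_dense, dense_stub, sparse_stub)    # dense in, dense out
x_sparse = solve_dispatch(A, b_sparse, dense_stub, sparse_stub)  # sparse in, sparse out
```

Passing the solvers in as callables keeps the dispatch logic separate from whichever backend actually does the factorization.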
I'd like to finish this, for which I need two decisions from you as the primary author of the project: 1: API
2: Module name
Once I have the decisions, I'll then rename the functions and the module, and update the PR to a final state that should be suitable for merging. As for the other questions I had, I'll default to "no other contributors", and leave the email address field blank, unless otherwise specified. |
Thanks for your work on this.
The current There are no other authors. I don't know what to say about the email field. I guess blank is fine. If they are wondering about the code, I'd answer the question. If they are wondering about the packaging, you'd answer the question. I don't know which is more likely. Since there is a link to the github, it is fine. |
Ok, let's keep that version of Authors and blank email, ok. I'll update the PR soon-ish (probably tomorrow). |
…sed on whether b is dense or sparse
PR updated. Overview of everything so far:
Details of the latest changes in the log as usual.

Final things

Maybe this is getting silly, but I find there is one minor thing remaining: There is a third function in the public API, namely

We are already breaking backward compatibility in terms of naming of functions and modules, so if we want to rename the function, it would be best to do this before we release 1.0.0. (And we should tag and release it when we're done, to make the version number actually mean something. :) ) After that, any name changes would have to wait until 2.0.0 (if we follow semantic versioning), and considering that PySPQR in its current state is pretty much complete, there might never be a need to release a 2.0.0.

So, what to use as a name? This is a general routine to convert a permutation vector to a permutation matrix, not specific to the
Then, a final final thing would be to convert the docstrings to follow the NumpyDoc format, for consistency with NumPy, SciPy and the scikits. The NumpyDoc format also has the bonus that the Spyder IDE pretty-renders it, making the documentation easier to read for Spyder users. Also, the docstrings should have max. 75 characters per line to make them readable in (standard default size) text terminals. Currently they wrap in an unreadable manner - some (maybe much) of this is to blame on my habit of maximizing my editor and terminal windows. But I think the docstring update can wait until a 1.0.1 dedicated to that. |
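For readers following the two points in this comment, here is a rough sketch of a permutation-vector-to-matrix routine with a NumpyDoc-style docstring wrapped at under 75 characters per line. The function name `perm_vector_to_matrix_sketch` is hypothetical, and the convention that entry `(E[k], k)` is 1 is an assumption made for this illustration, not a statement about the library's actual function.

```python
import numpy as np
import scipy.sparse


def perm_vector_to_matrix_sketch(E):
    """Convert a permutation vector to a sparse permutation matrix.

    Parameters
    ----------
    E : array_like of int, shape (n,)
        Permutation of ``0, 1, ..., n-1``.

    Returns
    -------
    P : scipy.sparse.coo_matrix, shape (n, n)
        Matrix with ``P[E[k], k] == 1`` for each ``k`` and zeros
        elsewhere, so that ``A @ P`` equals ``A[:, E]``.
    """
    E = np.asarray(E, dtype=int)
    n = E.size
    # Reject inputs that are not a permutation of 0..n-1.
    if not np.array_equal(np.sort(E), np.arange(n)):
        raise ValueError("E must be a permutation of 0..n-1")
    ones = np.ones(n)
    cols = np.arange(n)
    return scipy.sparse.coo_matrix((ones, (E, cols)), shape=(n, n))


E = [2, 0, 1]
A = np.arange(12.0).reshape(4, 3)
P = perm_vector_to_matrix_sketch(E)
# Column k of the product is column E[k] of A (under the assumed convention).
permuted = A @ P.toarray()
```

The routine is independent of any particular factorization, which is the argument made above for giving it a general, descriptive name.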
The changes look great. I agree with changing I agree with waiting to change the docstrings for a later minor version. |
Descriptivized. Maybe this is now complete? :) |
Attached is a PR to make PySPQR into an installable Python module. This is especially convenient when it is needed by several local projects, as it is then enough to install it once, avoiding the need to copy files around.
I have picked the name "sparseqr" (which can still be changed if you want), because the name "spqr" is already taken on PyPI for a module supporting Roman numerals. The project title "pyspqr" is confusingly close to the existing "spqr"; on PyPI the "py" prefix would be somewhat redundant; and in the conventional all-lowercase form it becomes much less readable.
In my opinion, in the long run, the optimal solution would be to merge PySPQR into scikit-sparse. The library already wraps CHOLMOD, and depends on SuiteSparse just like PySPQR does, but it currently has no active maintainer.
So, as an interim solution, this PR makes PySPQR installable separately. Install instructions in README.md have been updated.
This PR is sufficient to support setuptools, i.e. python setup.py install (with various options). If it is desirable to have PySPQR available also on PyPI, there are basically two options - either you can upload it as the original author, or I can take up the packaging.
I took the liberty of marking this version as v0.1.0, since with setuptools it is highly recommended to have a version number. The platform has been marked as Linux, as that is what I'm testing on, but it will probably work on any platform.