Working Full Time on NumPy
I have been hired by BIDS to work on NumPy full-time. Here I will document my work: what went well and also what didn't.
I am at EuroScipy in Trento, Italy. What a beautiful location for a conference: a valley surrounded by mountains, and the city itself has a wonderful square and castle. I presented a tutorial on wrapping C libraries with Python, as well as a 15-minute talk on NumPy and where we would like to take it over the next two years. I am meeting lots of people from all across Europe; in total there are about 250 people here.
Whoops, over a month has gone by. The past month has been busy. July began with SciPy 2018. We presented a 5-minute NumPy overview, which I am turning into a 20-minute talk for EuroScipy at the end of August. We also held a BOF session, mainly discussing the NumPy roadmap; later, back at BIDS, it was integrated with the NEP documents. And we sprinted for two days. The most productive part of the conference for me was the opportunity to meet face-to-face with the core devs, past and present. We were able to put together an impromptu dtypes brainstorming session, which it seems will become the focus of my work.
I took some time off at the end of the month as well. It seems the August doldrums have hit; some of the pull requests are languishing in the queue. On the other hand, 1.15 went out without too many complaints for so many deep changes. Now I am trying to understand what must be done to create a dtype, which is a lot of reading.
I have been staring at the code needed to reproduce the inner functions of `np.dot` as `np.matmul`. Internally, NumPy today uses `einsum` to implement matmul. When implementing `matmul` as a gufunc, it would seem more appropriate to reuse the `dot` functionality, since `einsum` is more general: `dot` has a fixed signature that does not have to be reparsed for each call.

`dot` actually calls `PyArray_MatrixProduct2` to do the work. That function quickly hands off the work to `cblas_matrixproduct` if possible (the dtype is float, double, cfloat or cdouble, and BLAS is available), and if not calculates the result itself. `PyArray_MatrixProduct2` and `cblas_matrixproduct` share little code between them; here is a comparison:
step | `PyArray_MatrixProduct2` | `cblas_matrixproduct`
---|---|---
detect non-aligned or non-contiguous data | doesn't care | uses `PyArray_NewCopy` if non-aligned or non-contiguous
create output array | uses `new_array_for_sum` | refactored lines of code to `new_array_for_sum` in PR #11432
logic to handle vector-matrix, matrix-matrix, scalar-matrix and vice versa | uses `PyArray_IterAllButAxis` | `if` blocks for each case
choose function for sum-of-loops | uses `out->dtype->vdot` | `if` blocks for each of the level 2 BLAS cases; uses `out->dtype->vdot` for vector/scalar (which will use level 1 BLAS dot functions)
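From the Python side, the two entry points also behave differently for stacked inputs, which is worth keeping in mind when reusing the `dot` machinery. A quick check of my own (not part of the table above):

```python
import numpy as np

a = np.random.rand(2, 3, 4)
b = np.random.rand(2, 4, 5)

# For 2-D inputs the two agree...
x, y = a[0], b[0]
assert np.allclose(np.dot(x, y), np.matmul(x, y))

# ...but for stacked inputs they differ: matmul broadcasts over the leading
# dimensions, while dot sums over the last axis of a and the second-to-last
# axis of b, producing a larger result.
print(np.matmul(a, b).shape)  # (2, 3, 5)
print(np.dot(a, b).shape)     # (2, 3, 2, 5)
```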
I then did some more work with the loops in the matmul PR.
Still plugging away at matmul as a ufunc, I discovered a few spots where the `out` kwarg was not handled properly in `einsum`. It seems `out` is rarely used in practice, as we apparently have no reported issues with it. Too bad, it is possibly a nice way to conserve memory. I also refactored one of my pre-BIDS pull requests, simplifying `NpyIter` deallocation, based on a last-minute review of the upcoming 1.15 release.
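For the record, this is the kind of memory-conserving call I mean (a minimal sketch of my own, not code from the PR):

```python
import numpy as np

a = np.random.rand(100, 50)
b = np.random.rand(50, 80)
buf = np.empty((100, 80))  # preallocated once, reused across calls

# Writing the result into buf avoids allocating a fresh output array on
# every call, which can matter inside a tight loop.
np.einsum('ij,jk->ik', a, b, out=buf)
assert np.allclose(buf, a @ b)
```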
We now have four draft NEPs in the pipeline; it seems NumPy still has room to grow and develop. These NEPs touch on the major pain points in NumPy: indexing, random numbers, allowing downstream packages to replace the local-memory-backed storage mechanism, and extending generalized ufuncs. Still in the works: extending the dtype mechanism, something I hope to work on in the coming months.
The matmul-as-a-ufunc pull request is starting to come together. I created loops for all the supported types, and am now trying to optimise. So before linking the implementation of `matmul` to that of `dot`, I set up a little comparison of matmul (as a ufunc), dot, and einsum.
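The comparison was roughly of this form (a sketch with made-up sizes, not the actual benchmark or its numbers):

```python
import numpy as np
from timeit import timeit

a = np.random.rand(200, 200)
b = np.random.rand(200, 200)

# Time the three code paths on the same inputs.
for name, stmt in [('matmul', 'np.matmul(a, b)'),
                   ('dot',    'np.dot(a, b)'),
                   ('einsum', "np.einsum('ij,jk->ik', a, b)")]:
    t = timeit(stmt, number=100, globals={'np': np, 'a': a, 'b': b})
    print(f'{name:7s} {t:.4f} s for 100 calls')
```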
This week there was a large group of people at BIDS sprinting on a number of packages central to the data science core. It was nice to meet them and hear how they view NumPy within their projects. Spent most of the week triaging, formalizing a few NEPs, and pushing the ufunc signature expansion.
We had a very nice two-day NumPy sprint at BIDS, with about 8 participants. Among the highlights:
- We drafted a very rough proposal for a NumPy "roadmap", which tries to sketch out the scope of the NumPy project, the enhancements we would like to see (including soft targets like "a larger dev community"), and the tasks the maintainers should be doing. We will post the draft for comment soon, and hope to discuss it at the NumPy BOF at SciPy 2018 in Austin, TX.
- We made progress toward understanding what a duck-array is, and how it might be implemented. Stay tuned for the NEP proposal and a prototype implementation.
- We hammered out some issues better solved with many eyes; for instance, the CI is now pulling in pytest 3.6, which seems to have exposed an issue in one of the refcount tests.
- We seem to have reached a consensus, based on the mailing list and GitHub responses, preferring to make `matmul` a full ufunc rather than creating a ufunc_wrapper decorator function.
- No one opposed merging the umath and multiarray modules into one, so I will continue to work on that in the coming weeks.
- Social interaction between people who rarely meet.
- Progress toward releasing v1.15.0
Thanks to all who made the effort!
I am now located at BIDS for the summer; it took a little while to transition and settle in. In the meantime, I have completed a prototype `matmul` implementation as a `ufunc`. So now there are two competing ideas: either

- wrap `matmul` with a `wrapped_ufunc` decorator that allows the `__array_ufunc__` mechanism to delegate to `ndarray` subclasses, and/or
- expand the core signature of `gufunc`s to allow `?`, break `PyUFuncObject` binary compatibility, and convert `matmul` to a true `ufunc`.

Which do you prefer?
Started digging into the history of `matmul` and friends. The `@` operator is routed through `array_matrix_multiply` to `np.matmul`, which is implemented as `array_matmul`, which uses `PyArray_EinsteinSum` for true matrix multiplication or `cblas_matrixproduct` for inner product.

However `np.dot` is routed to `array_dot`, which uses `PyArray_MatrixProduct2`; that also tries to use `cblas_matrixproduct`, but rather than using `PyArray_EinsteinSum` it loops over the dtype's dot function. It seems we should try to unify these different interfaces under the `ufunc` umbrella somehow.

Also, `PyArray_EinsteinSum` itself (exposed to Python as `np.c_einsum`, partially reimplemented in Python as `np.einsum`) seems to reimplement something resembling a ufunc. Einsum has its own signature that uses a different syntax from ufuncs. Maybe the whole einsum mechanism should be overhauled to use ufuncs?
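To make the syntax difference concrete (my own juxtaposition, using `np.vectorize` only to fake a gufunc-style signature from Python):

```python
import numpy as np

a = np.random.rand(3, 4)
b = np.random.rand(4, 5)

# einsum spells the operation with per-call subscript letters, parsed from
# a string on every call...
c1 = np.einsum('ij,jk->ik', a, b)

# ...while a gufunc carries a fixed core-dimension signature. np.vectorize
# can build a (slow, pure-Python) gufunc-like callable from such a signature.
matmat = np.vectorize(np.dot, signature='(n,k),(k,m)->(n,m)')
c2 = matmat(a, b)

assert np.allclose(c1, c2)
```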
Searching for `einsum` and `ufunc` in the issues led me to a few discussions:

- using ufuncs everywhere, which is very similar to this summary
- deprecating dot, which turned out to be controversial.

Going even further down this rabbit hole, it seems

- `einsum` is found in 28 open issues/pull requests
- `dot` is found in 71 open issues/pull requests
Some of these are about performance, some are bugs, and some concern documentation.
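One way to get counts like these programmatically is the GitHub search API, via the same PyGithub package used in the word-cloud snippet below (a rough sketch; "access_token" is a placeholder and the counts drift over time):

```python
import github  # PyGithub

gh = github.Github("access_token")  # a personal access token
for term in ('einsum', 'dot'):
    hits = gh.search_issues(f'{term} repo:numpy/numpy state:open')
    print(term, hits.totalCount)  # open issues and PRs mentioning the term
```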
An idea - maybe we can triage issues by using a word cloud to find out what are the major numpy pain points?
    $ pip install wordcloud pygithub
    $ python
    >>> import github, matplotlib.pyplot as plt
    >>> from wordcloud import WordCloud, STOPWORDS
    >>> gh = github.Github("access_token")
    >>> repo = gh.get_organization('numpy').get_repo('numpy')
    >>> issues = repo.get_issues(state='open')
    >>> txts = [xx.body for xx in issues]  # slow, actually queries GH for each issue
    >>> stopwords = set(STOPWORDS)
    >>> stopwords.update(('Oj', '-Oj', 'return', 'python', 'numpy', 'py', 'line', 'package', 'use', 'using', 'array', 'np'))
    >>> wc = WordCloud(background_color="white", stopwords=stopwords).generate(' '.join([xx for xx in txts if xx]))
    >>> plt.imshow(wc, interpolation='bilinear'); plt.show()
Hmm, not very helpful. Maybe someone can improve on my word cloud skills?
Went back to making `matmul` a full-fledged ufunc. It turns out the ufuncs used in ndarray are generated by a Python script, `generate_umath.py`, which cannot accept a signature. The ones in `linalg` are generated in C in `umath_linalg.c.src`; it is a toss-up which is clearer. There is also a test implementation of matrix_multiply in `_umath_tests.c.src`. So far I have extended `generate_umath.py` to accept a signature, and created a matrix multiply ufunc that should replace `matmul`, but I haven't worked out how to hook it in to be used. I think I need to extend `PyArray_SetNumericOps`.
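A quick way to poke at existing signatures from Python, using the test gufunc mentioned above (its module name has moved around between NumPy versions, so treat this as a sketch):

```python
import numpy as np
from numpy.core import umath_tests  # later renamed to numpy.core._umath_tests

print(np.add.signature)                       # None: a plain elementwise ufunc
print(umath_tests.matrix_multiply.signature)  # a gufunc core signature

a = np.ones((10, 2, 3))
b = np.ones((10, 3, 4))
print(umath_tests.matrix_multiply(a, b).shape)  # (10, 2, 4): loop dims broadcast, core dims consumed
```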
Still working through the two issues from May 4. I have learned how hard it is to create a decorator class in C, compared to the 4 lines it takes in Python. The main problem is how to move between a C function pointer and a PyMethodObject. I ended up writing the actual decorator in Python and calling that from C. It seems to work, at least enough to get some feedback on whether this is the direction the solution will take.
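For contrast, here is roughly what those few lines look like in Python (a generic sketch of mine, not the actual decorator from the PR):

```python
def call_counter(func):
    """Minimal decorator: wrap func in a closure that adds behaviour."""
    def wrapper(*args, **kwargs):
        wrapper.calls += 1          # the extra behaviour: count invocations
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@call_counter
def add(a, b):
    return a + b

add(1, 2); add(3, 4)
print(add.calls)  # 2
```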
The workaround for the SSE optimization issue is now green on the buildbots, after much back and forth over the different compilers' quirks.
In addition to triaging and cleaning up other pull requests, I have been spending the last few days around two issues:
- creating a decorator class in C that can wrap a "ufunc-like" class method (looking at you, `matmul`) and apply the `__array_ufunc__` resolution machinery, which is what I described below
- working through a solution to the optimizing compiler's reordering of operations involving SSE calls, issue 10370. It took quite a lot of digging to come up with a "solution" that hopefully works on all compilers; at least it works on gcc and clang. It turns out using floating point error handling is non-trivial.
As an alternative to modifying ufunc signatures, there is a suggestion to try to wrap the `__array_ufunc__` override mechanism in a function decorator. So today I worked through what that means. In each of the methods of `numpy.ufunc` in `umath/ufunc_object.c` (`__call__`, `reduce`, `accumulate`, `reduceat`, `outer`, and `at`) there is a call to `PyUFunc_CheckOverride`. If it returns a value, that value is returned instead of running the ufunc. OK, so maybe if I could create a ufunc object that wraps a Python function, my work could be done. So I tried to make a wrapper out of `numpy.frompyfunc`.

Along the way I noticed its implementation is a copy-paste from somewhere else, so I refactored out the duplicated code in favor of calling `PyUFunc_FromFuncAndData`.
Unfortunately, this path is not applicable since `frompyfunc` does not accept a signature, and `matmul` definitely needs a signature; the default is `(n),(n)->(n)`. Even if we extended `frompyfunc` to accept a signature, we would be back where we started, that is, needing to extend gufuncs for a number of signatures.
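For reference, this is all `frompyfunc` gives you (a quick illustration of mine): an elementwise ufunc with object output and no way to declare core dimensions.

```python
import numpy as np

# frompyfunc(func, nin, nout) builds an elementwise ufunc: the Python
# function is applied to scalars, and the result dtype is object.
add3 = np.frompyfunc(lambda x, y, z: x + y + z, 3, 1)
print(add3(np.arange(4), 10, 100))  # [110 111 112 113], dtype=object

# There is no argument for a core signature such as '(n,k),(k,m)->(n,m)',
# so a matrix product cannot be expressed this way.
```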
Hmm, I could create another class, much like `numpy.ufunc`:

    class ufunc_wrapper:
        def __init__(self, func):
            self.func = func

        def __call__(self, *args, **kwargs):
            # the C call is a bit different
            result = PyUFunc_CheckOverride(self.func, "__call__", args, kwargs)
            if result:
                return result
            return self.func(*args, **kwargs)

Then perhaps this class would be suitable as a wrapper ...
Diving deep into parsing generalized ufunc signatures. I think I have to add a new field to the `PyUFuncObject`; I spent some time considering adding options to existing fields (like making them negative to represent flexible dimensions). The general idea is to allow signatures like `(n?, k), (k, m?) -> (n?, m?)`, where the question mark means "it is OK if that dimension is missing in the input/output". This would enable `matmul` to become a generic ufunc, handling `(k),(k,m)->(m)`, `(n,k),(k,m)->(n,m)`, `(k),(k)->()` and `(n,k),(k)->(n)` operations (`vecmat`, `matmat`, `vecvec`, `matvec` respectively). I am also completing an old pull request for fixed dimensions, like `(3),(3)->()`, which would allow `func(shape(4,3), shape(3,)) -> shape(4)` ufuncs. What would `(3?), (3?) -> ()` mean? In the end I rejected this; let's keep it simple. Progress can be viewed here: https://github.com/mattip/numpy/tree/expand-gufunc-signature
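These four cases are exactly what `np.matmul` already supports today by temporarily inserting and removing dimensions; a quick shape check (my own illustration):

```python
import numpy as np

m = np.ones((2, 3))   # (n, k) with n=2, k=3
n = np.ones((3, 4))   # (k, m) with m=4
v = np.ones(3)        # (k,)

print(np.matmul(m, n).shape)  # (2, 4)  matmat: (n,k),(k,m)->(n,m)
print(np.matmul(v, n).shape)  # (4,)    vecmat: (k),(k,m)->(m)
print(np.matmul(m, v).shape)  # (2,)    matvec: (n,k),(k)->(n)
print(np.matmul(v, v).shape)  # ()      vecvec: (k),(k)->()
```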
I also took a look at the `numpydoc` repo, which has a number of issues and open pull requests, and also spent some time with the wonderful tool for showing compiled assembler mentioned in issue #10370.
Jupyter has a nice repo for long-term planning; perhaps we should do something like that. Continued working on the NEP 0015 refactor, triaging, and tweaking documentation. The NEP refactor now achieves the major aims; now I need to think about further refactoring `_import_array` and `_import_umath`.
Looking at the matmul and `__array_ufunc__` issue, it seems there was some work done in PR #5015 that might be applicable. I rebased that to master, and am thinking about how to add the extra data without changing the size of `PyUFuncObject`.
We tried an "Office Hours" meeting, about 7 people showed up for a one hour conversation that ranged from details about how to close duplicate issues on the issue tracker to how to manage the project and how to create a roadmap for NumPy (the one on this wiki is just for my work). I am working on a summary that will be sent to the NumPy dev mailing list.
I looked more at the implement-nep-0015 branch, trying to understand why it segfaults on CPython 3. I found the best way to debug was to run `CFLAGS='-O0 -g' pip install -e . --upgrade` in a virtualenv, then `gdb --args python -c "import numpy as np"`. When things are a bit further along, `CFLAGS='-O0 -g' python runtests.py -t file_that_segfaults` is also effective.
Along the way, I tried diving into the ufunc mechanism, in preparation for tackling matmul and ufuncs.
It is annoying that sometimes when running `runtests.py` the tests all pass, but then CircleCI fails because there was a warning during the `build_ext` compilation. I took a look at using `CFLAGS=-Werror`, but there are many warnings during the configuration stage. I fixed some of those by making the distutils.command.config tool even more complicated, but it seems that path would turn into a new `autoconfig` utility, so I will try a different approach.
The doc changes were merged and are now visible at devdocs; the link on the scipy site will be updated to point there soon. I am now following the https://github.com/numpy/numpy firehose of changes to the repo (pull requests, issues). It is awe-inspiring to see the level of dedication from the volunteers. I have started sifting through old pull requests and issues, trying to find low-hanging fruit.
In a conversation about the new quantile function, which is close to ready for merging, I peeked at the rendering of the percentile function and noticed a few formatting problems, which sent me down the documentation-site rabbit hole again. A new pull request to fix some quirks in rendering and reduce the number of sphinx warnings was born. There are many warnings about "document isn't included in any toctree"; it seems something is not quite right with the autoclass template, even though it is the same one used by scipy.
While the discussion about `matmul` is progressing on the issue tracker, I made some other progress with issue/PR triaging, finished the documentation PR, and started merging multiarray and umath (NEP 0015). While the current pull request passes tests on Python 2, it crashes on Python 3. I suspect there are subtle issues with module initialization and the new import mechanism, but it is hard to pin down.
Next up - issue #9028 about overriding `matmul` if `__array_ufunc__` is set. In trying to understand the issue, I got sidetracked reading the relevant NEPs, but then I went down the rabbit hole of looking at the developer docs. It turns out the docs themselves had a few problems: outdated links, relevant info that was buried or not even linked into the docs, and way too many warnings when building the docs using `sphinx`. This led me to spend some time updating the docs themselves.
Along the way I started learning about the larger NumPy github workflow - how to label and curate issues and pull requests, as an ongoing task I can do to reduce the core devs' workload.
Now I can get back to reading NEPs and trying to figure out what makes `matmul` special.
Trying to take a look at the 1.15 release milestone. Touched a few things, then came across a long-standing issue with complex numbers, negative exponents, and `savetxt`/`loadtxt`. Should be an easy place to start sinking my teeth in.
I began onboarding, thinking about how I can best help the community and the project. Basically a day of getting organized. Had a productive meeting with Stefan van der Walt from BIDS, where we laid out some guidelines for my first steps.
Thanks to the Moore and Sloan Foundations and BIDS for allowing me this opportunity.