Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: start using numpy-1.7 API #8329

Closed
immerrr opened this issue Sep 20, 2014 · 22 comments
Closed

CLN: start using numpy-1.7 API #8329

immerrr opened this issue Sep 20, 2014 · 22 comments
Labels
Compat pandas objects compatability with Numpy or Python functions Dependencies Required and optional dependencies good first issue

Comments

@immerrr
Copy link
Contributor

immerrr commented Sep 20, 2014

Now that numpy-1.6 support is dropped we could proceed to use np-1.7 API.

This involves fixing a macro or two deep down in pandas internals, but otherwise should be pretty straightforward. As a bonus we can get rid of these annoying build warnings:

gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Ipandas/src/klib -Ipandas/src -I/home/immerrr/.conda/envs/pandas/lib/python2.7/site-packages/numpy/core/include -I/home/immerrr/.conda/envs/pandas/include/python2.7 -c pandas/index.c -o build/temp.linux-x86_64-2.7/pandas/index.o
In file included from /home/immerrr/.conda/envs/pandas/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1761:0,
                 from /home/immerrr/.conda/envs/pandas/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
                 from /home/immerrr/.conda/envs/pandas/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from pandas/index.c:244:
/home/immerrr/.conda/envs/pandas/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]

xref https://github.com/pydata/numexpr/pull/228/files

@immerrr immerrr changed the title CLN: start using numpy-1.7 api CLN: start using numpy-1.7 API Sep 20, 2014
@jreback jreback added the Compat pandas objects compatability with Numpy or Python functions label Sep 20, 2014
@jreback jreback added this to the 0.15.0 milestone Sep 20, 2014
@jreback
Copy link
Contributor

jreback commented Sep 20, 2014

@immerrr would be gr8 (esp for 0.15)!

@immerrr
Copy link
Contributor Author

immerrr commented Sep 20, 2014

I was going to suggest it for the hackathon, are you expecting anyone who's into low-level stuff there?

@jreback
Copy link
Contributor

jreback commented Sep 20, 2014

hmm, interesting, I will add this up their too. As their are going to be people who could hack on NumPy, but we'd rather co-op them to pandas :)

@devanshmehta
Copy link

I am trying to work on this bug. But half difficulty finding a starting point. It will be nice if you can point me in right direction. I have the forked the panda repo and added upstream.

@jreback
Copy link
Contributor

jreback commented Sep 27, 2014

are you at the bloomberg hackathon?

@devanshmehta
Copy link

Yes

@jreback
Copy link
Contributor

jreback commented Sep 27, 2014

@devanshmehta can you come over to my table?

@immerrr
Copy link
Contributor Author

immerrr commented Oct 4, 2014

Ok, I gave this one a shot and it's nowhere near easy. After figuring out simple stuff, you run into two huge obstacles. One is that cython doesn't yet support numpy-1.7 api. Pandas has its own numpy.pxd, but it too has the necessary macros commented out with a description saying:

    # dtype PyArray_DESCR(ndarray) wrong refcount semantics

That one can be figured out with some interaction with Cython guys (and probably backported there), but then there are rolling functions.

Some of those functions in algos module do pure magic: they create one a single ndarray to refer to the window, pass it to python-level function and then alter its arr.data field to skip to the next element without incurring overhead of creating/destroying ndarrays. In np-1.7 api referring to data must be done via PyArray_DATA(arr) macro and it's probably not guaranteed to be assignable, so that optimization potential goes out of the window. I'm not sure if there's a way around this that doesn't lose performance.

@jreback
Copy link
Contributor

jreback commented Oct 4, 2014

I should have mentioned
@devanshmehta can u post the branch u did (even of not complete)

@immerrr
Copy link
Contributor Author

immerrr commented Oct 4, 2014

Here's mine: immerrr@71454db

@devanshmehta
Copy link

Apologize for the delayed response. I did not work on it after the hackathon. Will post the over the branch.

@devanshmehta
Copy link

https://github.com/devanshmehta/pandas

Here is the branch. Based on the conversation with Mark who was also present during the hackathon. He said there was particular macro which need to be defined before using the numpy api. The cython files cannot contain #define macro, so we created a c header file which contained the #define macro required by numpy and tried to include it in front of numpy api. After doing this there were few compilation error in both c and python files. I was trying to get rid of the compilation error by using the new numpy api. However my work is till incomplete and gives some compilation error.

@immerrr
Copy link
Contributor Author

immerrr commented Oct 15, 2014

Thanks, @devanshmehta . I'll look into this.

@immerrr
Copy link
Contributor Author

immerrr commented Nov 3, 2014

So I attempted to resolve this once more, but again hit some sort of a wall.

For the reasons outlined here it's not permitted to modify data pointers in living arrays and a lot of reduce.pyx code should be basically rewritten from scratch in a way that will decrease its performance (according to the mailing list message, negligibly).

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@jreback jreback added the Dependencies Required and optional dependencies label Jan 28, 2017
@gfyoung
Copy link
Member

gfyoung commented May 16, 2017

@jreback : I think you can close this now that #15206 has been issued.

@jreback
Copy link
Contributor

jreback commented May 16, 2017

@gfyoung your reference seems to be off, and this is certainly not closed. we have warning messages in the c-code. need to set a macro.

@gfyoung
Copy link
Member

gfyoung commented May 16, 2017

@jreback : I meant to refer to the issue where we are dropping numpy < 1.9. Certainly that should encapsulate this one?

@jreback
Copy link
Contributor

jreback commented May 16, 2017

this refers to the c api
we are actually using the pre 1.7 one
that's the point of setting the macro

so this issue needs to be fixed before (though of course will work when compiling on later numpy anyhow as they haven't removed the old api)

@jbrockmendel
Copy link
Member

According to the cython docs this warning should be ignored. I take that to mean we jus have to wait for them to fix it there (no indication if/when that will be). This is not actionable. Closing.

@gfyoung
Copy link
Member

gfyoung commented Jul 13, 2018

@jbrockmendel : Could you add a link for future reference?

@jreback
Copy link
Contributor

jreback commented Jul 13, 2018

i believe u can set a flag about the api to turn off the old api

@jbrockmendel
Copy link
Member

https://cython.readthedocs.io/en/latest/src/reference/compilation.html#configuring-the-c-build

Despite this, you will still get warnings like the following from the compiler, because Cython is using a deprecated Numpy API:

.../include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]

For the time being, it is just a warning that you can ignore.

@jreback

i believe u can set a flag about the api to turn off the old api

The issue isn't pandas using the old API, it is cython using it. As long as cython-generated C is using the old API, we're not getting rid of those warnings. The relevant flag would be to add to setup.macros the entry ('NPY_NO_DEPRECATED_API', 'NPY_1_7_API_VERSION'), but doing so causes compile-time errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions Dependencies Required and optional dependencies good first issue
Projects
None yet
Development

No branches or pull requests

7 participants