Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use getattr instead of hasattr #573

Closed
wants to merge 8 commits into from

Conversation

jakirkham
Copy link
Member

Calling hasattr is just as expensive as calling getattr and slightly more expensive than direct attribute access. As a result calling hasattr and then perform attribute access, we are effectively paying the cost of attribute access twice.

hasattr vs. getattr vs. direct access
In [1]: import numpy                                                            

In [2]: a = numpy.arange(110).reshape(10, 11)                                   

In [3]: %timeit hasattr(a, "__array_interface__")                               
1.17 µs ± 3.73 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit getattr(a, "__array_interface__", None)                         
1.17 µs ± 5.61 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: %timeit a.__array_interface__                                           
1.11 µs ± 2.27 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

When we look at the overall runtime of get_buffer_data and get_buffer_nbytes, we can see that attribute access makes a significant contribution to runtime of these functions overall.

Runtime before of get_buffer_data and get_buffer_nbytes
In [1]: import numpy 
   ...:  
   ...: from ucp._libs.utils import get_buffer_data, get_buffer_nbytes          

In [2]: a = numpy.arange(110).reshape(10, 11)                                   

In [3]: %timeit get_buffer_data(a)                                              
2.48 µs ± 12.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [4]: %timeit get_buffer_nbytes(a, check_min_size=None, cuda_support=True)    
3.67 µs ± 17.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So we can simply switch to getattr and add a default case of None to check for. Overall this is a pretty minor change to the code here. Though the result in terms of get_buffer_data and get_buffer_nbytes is quite notable, making both ~2x faster.

Runtime after of get_buffer_data and get_buffer_nbytes
In [1]: import numpy 
   ...:  
   ...: from ucp._libs.utils import get_buffer_data, get_buffer_nbytes          

In [2]: a = numpy.arange(110).reshape(10, 11)                                   

In [3]: %timeit get_buffer_data(a)                                              
1.16 µs ± 12.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit get_buffer_nbytes(a, check_min_size=None, cuda_support=True)    
2.13 µs ± 20.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

cc @pentschev @madsbk @quasiben

@jakirkham jakirkham requested a review from a team as a code owner August 11, 2020 03:36
The `default` used in `.get(...)` is always `None`. So there is no need
to specify it. So go ahead and drop it to simplify things.
As calling `hasattr` is just expensive as `getattr` and slightly more
expensive compared to attribute access, simply call `getattr` instead
and handle the case where the attribute is not found with `None`.
To simplify later checks and operations, go ahead and assign `strides`.
@jakirkham
Copy link
Member Author

Absorbed into PR ( #546 ).

@jakirkham jakirkham closed this Aug 11, 2020
@jakirkham jakirkham deleted the use_getattr branch August 11, 2020 04:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant