More tslibs TODOS: small optimizations, trim namespaces #18446

jbrockmendel · 2017-11-23T04:12:57Z

A few methods of Timedelta and Timestamp that don't need to be user-facing are changed from cpdef to cdef. To make the wheels turn, Timestamp methods that call those methods are moved from Timestamp to _Timestamp.

Stronger typing for _Timestamp._get_start_end_field and a couple of others.

@cython.final supposedly allows for a small optimization because Cython does not need to check at runtime whether a method has been overridden. Applied to Timestamp.__add__ and Timestamp.__sub__.

jreback · 2017-11-23T15:37:55Z

pandas/_libs/tslibs/timestamps.pyx

@@ -246,6 +249,7 @@ cdef class _Timestamp(datetime):
            result.nanosecond = self.nanosecond
        return result

+    @cython.final


can you perf check this (you can just compare time-its for a couple of ops in PR and master to see); if it makes a diff then add an asv (and if not let's not do this).

if it makes a diff then add an asv

I'm shocked to see there isn't already an asv for "Timestamp addition/subtraction". I'll make one, or see if one isn't hidden somewhere other than benchmarks/timestamp.py. If it is somewhere else, will moving them around cause a problem for backward-comparability?

can you perf check this

Not seeing much difference, but %timeit results are jumping around more than I'd like. I've got a couple of questions queued up for the cython mailing list, will add this to them.
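
One way to steady a noisy comparison like this is to take the minimum over several timeit.repeat runs rather than a single reading. A minimal sketch, assuming stdlib datetime/timedelta as stand-ins for Timestamp/Timedelta (the helper name best_time is made up for illustration); the same script would be run once on the PR branch and once on master:

```python
import timeit
from datetime import datetime, timedelta

# Stand-ins (assumption): in the PR these would be a pandas Timestamp
# and Timedelta, constructed once before timing.
ts = datetime(2017, 11, 23, 4, 12, 57)
td = timedelta(days=1)


def best_time(stmt, number=10_000, repeat=5):
    """Return the fastest per-call time in seconds across `repeat` runs.

    The minimum is less sensitive to background noise than the mean.
    """
    runs = timeit.repeat(
        stmt, globals={"ts": ts, "td": td}, number=number, repeat=repeat
    )
    return min(runs) / number


add_time = best_time("ts + td")
sub_time = best_time("ts - td")
print(f"add: {add_time * 1e9:.1f} ns/op, sub: {sub_time * 1e9:.1f} ns/op")
```

If the two branches differ by less than the spread between repeats, the change is probably not measurable this way.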

I'm shocked to see there isn't already an asv for "Timestamp addition/subtraction". I'll make one, or see if one isn't hidden somewhere other than benchmarks/timestamp.py. If it is somewhere else, will moving them around cause a problem for backward-comparability?

no there are, it's just that asvs seem to be unstable for you :> (and this is prob a tiny difference)

no there are

Unless the instability extends to the contents of the benchmarks/ files, I'm pretty sure asv_bench/benchmarks/timestamp.py doesn't have any addition/subtraction benchmarks.

timeseries.py has plenty

Yah, so the question about moving was whether it's OK to consolidate the Timestamp asvs (kinda analogous to what I've been doing for the tests).

yes sure. we definitely want comprehensive benchmarks for scalar + scalar (and scalar + index, etc). so consolidation is fine.
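
For reference, a consolidated scalar + scalar benchmark could look roughly like the sketch below. It follows the asv conventions of a setup method and time_-prefixed methods. The class name TimestampOps is hypothetical, and stdlib datetime/timedelta are used as stand-ins so the snippet runs without pandas; in asv_bench the attributes would be a Timestamp and a Timedelta instead.

```python
from datetime import datetime, timedelta


class TimestampOps:
    """Hypothetical asv benchmark class for scalar arithmetic."""

    def setup(self):
        # asv runs setup() before timing each time_* method.
        # Stand-ins (assumption) for Timestamp / Timedelta scalars.
        self.ts = datetime(2017, 11, 23)
        self.other = datetime(2017, 1, 1)
        self.delta = timedelta(hours=1)

    def time_add_timedelta(self):
        self.ts + self.delta

    def time_sub_timedelta(self):
        self.ts - self.delta

    def time_sub_timestamp(self):
        self.ts - self.other
```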

jbrockmendel · 2017-11-24T16:39:39Z

Asked on the cython mailing list: turns out @cython.final doesn't apply to special methods including __add__ and __sub__. Will remove.

jbrockmendel · 2017-11-24T19:02:17Z

Looks like because of cython implementation details, making _round cdef (and therefore not in the namespace) requires setting default values for freq in round, ceil, floor methods. Any objection to making those default to "D"?

jreback · 2017-11-24T19:22:40Z

Looks like because of cython implementation details, making _round cdef (and therefore not in the namespace) requires setting default values for freq in round, ceil, floor methods. Any objection to making those default to "D"?

don't set defaults, this is a required user-settable parameter. move them back to the class impl. not really sure these belong in the c class def anyhow (no benefit).
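
To illustrate the shape being asked for, here is a pure-Python sketch (not the pandas implementation): round, floor and ceil each take a required freq and delegate to a shared private helper. The class Stamp, its integer-seconds value, and the numeric freq are all assumptions made for the example.

```python
import math


class Stamp:
    """Hypothetical stand-in for Timestamp holding an integer count of seconds."""

    def __init__(self, value):
        self.value = value

    def _round(self, freq, rounder):
        # freq is the bucket width in seconds; rounder maps a float to an int.
        return Stamp(int(rounder(self.value / freq)) * freq)

    # freq has no default on purpose: the caller must always choose it.
    def round(self, freq):
        return self._round(freq, round)

    def floor(self, freq):
        return self._round(freq, math.floor)

    def ceil(self, freq):
        return self._round(freq, math.ceil)


stamp = Stamp(125)
print(stamp.floor(60).value, stamp.ceil(60).value)

try:
    stamp.round()  # missing freq
except TypeError as exc:
    print("TypeError:", exc)
```

Leaving freq without a default means a forgotten argument fails loudly instead of silently rounding to some arbitrary unit.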

jbrockmendel · 2017-11-24T19:29:04Z

move them back to the class impl. not really sure these belong in the c class def anyhow (no benefit).

Sure. The idea was to get _round out of the user-facing namespace. Not that big a deal. Will change shortly.

codecov · 2017-11-24T22:13:23Z

Codecov Report

Merging #18446 into master will increase coverage by 0.17%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18446      +/-   ##
==========================================
+ Coverage   91.35%   91.53%   +0.17%     
==========================================
  Files         163      163              
  Lines       49691    50730    +1039     
==========================================
+ Hits        45397    46437    +1040     
+ Misses       4294     4293       -1

Flag	Coverage Δ
#multiple	`89.38% <ø> (+0.24%)`	⬆️
#single	`39.66% <ø> (-0.08%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/core/frame.py	`97.8% <0%> (-0.1%)`	⬇️
pandas/core/indexes/multi.py	`96.4% <0%> (-0.01%)`	⬇️
pandas/core/api.py	`100% <0%> (ø)`	⬆️
pandas/io/pytables.py	`92.84% <0%> (ø)`	⬆️
pandas/core/reshape/api.py	`100% <0%> (ø)`	⬆️
pandas/core/sparse/frame.py	`94.78% <0%> (ø)`	⬆️
pandas/io/parsers.py	`95.59% <0%> (ø)`	⬆️
pandas/core/reshape/melt.py	`96.22% <0%> (+0.03%)`	⬆️
pandas/core/indexes/interval.py	`92.64% <0%> (+0.11%)`	⬆️
... and 4 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fedc503...e4d5f7f. Read the comment docs.

jreback · 2017-11-24T22:20:43Z

pandas/_libs/tslibs/timestamps.pyx

+    def days_in_month(self):
+        return self._get_field('dim')
+
+    cdef bint _get_start_end_field(_Timestamp self, field):


why does moving this to _Timestamp help

why does moving this to _Timestamp help

Among the todos was getting _get_start_end_field and _get_field out of the user-facing namespace. This accomplishes that.

It also adds type declarations, so should be marginally more performant.

why do we care about _get_start_end_field in the user namespace, its a private method?

type declarations are fine (pls run an asv to confirm)

jreback · 2017-11-25T15:12:55Z

pandas/_libs/tslibs/timestamps.pyx

+    def days_in_month(self):
+        return self._get_field('dim')
+
+    cdef bint _get_start_end_field(_Timestamp self, field):


why do we care about _get_start_end_field in the user namespace, its a private method?

type declarations are fine (pls run an asv to confirm)

jbrockmendel · 2017-11-25T16:59:33Z

why do we care about _get_start_end_field in the user namespace, its a private method?

Not sure anymore. I probably put it on the todo list in my youth. Will mark it as fine-as-is.

type declarations are fine (pls run an asv to confirm)

OK. I'll close this for now and circle back, to avoid clogging the CI queue.

jbrockmendel added 3 commits November 22, 2017 20:08

small optimizations, trim namespaces

5534c41

fixup missing cimport

59b58a0

revert change that broke build

9166ab6

jreback added the Clean and Datetime (Datetime data dtype) labels Nov 23, 2017

jreback reviewed Nov 23, 2017

remove cython.final, add default freq to round for cython compat

37501e4

jbrockmendel added 2 commits November 24, 2017 11:34

revert changes to rounding methods

028ac99

revert change to variable name r-->roudned

e4d5f7f

jreback reviewed Nov 24, 2017

jreback requested changes Nov 25, 2017

jbrockmendel closed this Nov 25, 2017

jbrockmendel mentioned this pull request Nov 28, 2017

Fastpaths for Timestamp properties #18539

Merged

jbrockmendel deleted the tslibs-todos2 branch December 8, 2017 19:40
