BUG: Fixes GH9311 groupby on datetime64 #9345

chrisbyboston · 2015-01-23T16:36:04Z

datetime64 columns were changing at the nanosecond scale when
applying a groupby aggregator.

closes #9311
closes #6620
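
For illustration only, here is a minimal sketch of how a nanosecond-scale drift can appear, assuming the aggregation passes the underlying int64 values through float64 (an assumption on my part; this description does not state the exact code path). The timestamp is made up for the example.

```python
import numpy as np

# A timestamp with non-zero nanoseconds, stored as int64 nanoseconds since the epoch.
ts = np.array(["2015-01-23T16:36:04.123456789"], dtype="datetime64[ns]")
as_int = ts.view("int64")

# Round-trip through float64, as a float-based aggregation would.
# float64 has a 53-bit mantissa, so values near 1.4e18 cannot all be represented.
round_tripped = as_int.astype("float64").astype("int64").view("datetime64[ns]")

# Difference in nanoseconds between the round-tripped and original value.
drift_ns = int(round_tripped.view("int64")[0] - as_int[0])
print(ts[0], round_tripped[0], drift_ns)
```

Keeping the values as int64 from start to finish avoids this rounding entirely.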

shoyer · 2015-01-25T22:35:36Z

pandas/core/groupby.py

@@ -1504,7 +1504,11 @@ def aggregate(self, values, how, axis=0):
                raise NotImplementedError
            out_shape = (self.ngroups,) + values.shape[1:]

-        if is_numeric_dtype(values.dtype):
+        if is_datetime:


What does this do to operations like groupby('foo').mean() for datetime columns? Currently these raise DataError: No numeric types to aggregate but perhaps they shouldn't...

mean of datetime64 is a can of worms. It can be done, but completely separate from this.

Agreed... I wanted to understand whether this changes that behavior for groupby aggregations.

yep, I think that should be caught before here though (iow there is a test for it now that asserts that it fails)

shoyer · 2015-01-25T22:43:32Z

This looks like some nice fixes but also looks likely under-tested -- you've changed a lot of functions and only added tests for one of them.

jreback · 2015-01-25T22:45:03Z

pandas/core/groupby.py

@@ -1487,7 +1487,7 @@ def wrapper(*args, **kwargs):
                                      (how, dtype_str))
        return func, dtype_str

-    def aggregate(self, values, how, axis=0):
+    def aggregate(self, values, how, axis=0, is_datetime=False):



you can't put a flag like this here, it's very odd. There are already introspection determinations in the grouper objects.

shoyer · 2015-01-25T22:59:40Z

I would probably scale this back to only change operations without any possibility of precision/overflow issues -- I think that's first/last/min/max/nth. The arithmetic mean/sum/var/std should probably wait until we define it even on datetime64 without grouping, as @jreback mentions above (though efforts there would certainly be appreciated).
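
A rough sketch of why the two groups differ, using plain NumPy rather than the groupby code itself (the timestamps are invented for the example): order-based reductions only select an existing element, while a sum of int64 nanosecond values can exceed the int64 range.

```python
import numpy as np

stamps = np.array(
    ["2015-01-23T16:36:04.123456789", "2015-01-25T22:59:40.000000001"] * 5,
    dtype="datetime64[ns]",
)
as_int = stamps.view("int64")

# min/max pick one of the existing int64 values, so nothing is rounded.
exact_min = np.datetime64(int(as_int.min()), "ns")
exact_max = np.datetime64(int(as_int.max()), "ns")

# Summation is different: ten values of roughly 1.4e18 exceed the int64 range,
# and NumPy wraps silently on array reductions.
wrapped_sum = int(as_int.sum())
true_sum = sum(int(v) for v in as_int)  # Python ints do not overflow

print(exact_min, exact_max)
print(wrapped_sum == true_sum)
```
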

Also, it is almost impossible to write too many tests :).

chrisbyboston · 2015-01-26T00:31:02Z

Awesome. Thanks for the feedback. I'll have updates, and more tests in here shortly.

chrisbyboston · 2015-01-30T21:57:43Z

@shoyer or @jreback - I'm very close to having the above fixes ready on this PR. I do need a bit of guidance on one thing though. Now that the Cython functions are using iNaT for nan values on the integer functions, they are returning the integer value (-9223372036854775808) back in the ndarray, as I would expect.

In order to get it working so that na functions will work (like fillna), I need to replace those values back with something that will be evaluated as na. Is there a helper function somewhere to do that?

shoyer · 2015-01-30T22:04:27Z

@iwschris iNaT -> NaT happens automatically when you cast back to datetime64. You can use values.view('datetime64[ns]') for that, assuming values is an array with int64 dtype. This won't work for integers -- you'll need to upcast those to float.
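
A small standalone sketch of both cases, with an arbitrary example value. `iNaT` is defined locally here as the minimum int64, which matches the integer quoted above.

```python
import numpy as np

iNaT = np.iinfo(np.int64).min  # -9223372036854775808

# An int64 result array: one ordinary value and one missing marker.
raw = np.array([1422030964123456789, iNaT], dtype="int64")

# Reinterpreting the same bytes as datetime64[ns] turns iNaT into NaT.
as_dt = raw.view("datetime64[ns]")
print(as_dt)
print(np.isnat(as_dt))

# Plain integers have no missing-value sentinel, so upcast to float and use NaN.
as_float = raw.astype("float64")
as_float[raw == iNaT] = np.nan
print(as_float)
```
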

jreback · 2015-01-30T22:14:04Z

@iwschris just do the view as @shoyer suggests
you know you have an int64 (or int32) so this will work
(only take a view if it's actually a datetime64/timedelta64 in the first place)
this should be done in python for consistency (we just normally return basic types to/from cython)
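
Roughly what that could look like on the Python side. `restore_dtype` is a hypothetical name for illustration, not an existing pandas helper, and the sketch only handles 64-bit results.

```python
import numpy as np


def restore_dtype(result, original_dtype):
    """Hypothetical helper: view int64 results as the original datetime-like dtype."""
    original_dtype = np.dtype(original_dtype)
    # dtype kind "M" is datetime64 and "m" is timedelta64; anything else is left alone.
    if original_dtype.kind in "mM" and result.dtype == np.int64:
        return result.view(original_dtype)
    return result


iNaT = np.iinfo(np.int64).min
from_cython = np.array([0, iNaT], dtype="int64")  # stand-in for a Cython result

dates = restore_dtype(from_cython, "datetime64[ns]")
deltas = restore_dtype(from_cython, "timedelta64[ns]")
ints = restore_dtype(from_cython, "int64")
print(dates, deltas, ints)
```
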

chrisbyboston · 2015-01-30T22:24:06Z

Awesome. Thanks!

chrisbyboston · 2015-02-02T22:47:20Z

@shoyer -- Pretty sure this is ready for review again. I was able to keep the safe casting turned on in internals.py and I validated that vbench is still looking pretty good.

Here's the output from vbench:


Invoked with :
--ncalls: 3
--repeats: 3


-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
index_from_series_ctor                       |   0.0147 |   0.0337 |   0.4363 |
series_constructor_ndarray                   |   0.0137 |   0.0277 |   0.4943 |
concat_empty_frames1                         |   0.7093 |   1.3227 |   0.5363 |
frame_get_numeric_data                       |   0.0956 |   0.1766 |   0.5414 |
join_dataframe_index_single_key_small        |   7.9147 |  13.9790 |   0.5662 |
groupby_ngroups_100_first                    |   0.2720 |   0.4776 |   0.5694 |
read_csv_infer_datetime_format_custom        |   7.8737 |  11.8280 |   0.6657 |
join_dataframe_index_single_key_bigger       |   8.9413 |  12.7583 |   0.7008 |
groupby_ngroups_100_last                     |   0.2457 |   0.3486 |   0.7046 |
groupby_ngroups_10000_median                 |   2.1223 |   2.9956 |   0.7085 |
frame_fancy_lookup_all                       |  10.6823 |  14.9640 |   0.7139 |
groupby_ngroups_100_max                      |   0.2611 |   0.3640 |   0.7172 |
groupby_ngroups_100_min                      |   0.2487 |   0.3421 |   0.7270 |
groupby_ngroups_100_count                    |   0.2720 |   0.3714 |   0.7325 |
frame_from_series                            |   0.1073 |   0.1454 |   0.7381 |
ctor_index_array_string                      |   0.0143 |   0.0193 |   0.7407 |
dataframe_resample_mean_numpy                |   2.4650 |   3.2423 |   0.7603 |
frame_iteritems_cached                       |   0.4021 |   0.5227 |   0.7692 |
index_float64_boolean_indexer                |   2.8100 |   3.6487 |   0.7701 |
frame_constructor_ndarray                    |   0.0670 |   0.0866 |   0.7734 |
reindex_frame_level_align                    |   0.5527 |   0.6983 |   0.7914 |
frame_count_level_axis0_multi                |  53.3690 |  66.9137 |   0.7976 |
timeseries_custom_bmonthbegin_incr_n         |   0.1923 |   0.2387 |   0.8059 |
frame_ctor_dtindex_BYearEndx1                |   1.1930 |   1.4800 |   0.8061 |
frame_reindex_axis0                          | 123.4897 | 152.9363 |   0.8075 |
frame_assign_timeseries_index                |   0.5610 |   0.6904 |   0.8126 |
read_csv_infer_datetime_format_iso8601       |   1.4773 |   1.8067 |   0.8177 |
groupby_frame_cython_many_columns            |   1.9461 |   2.3790 |   0.8180 |
frame_repr_wide                              |  10.0773 |  12.3091 |   0.8187 |
frame_ctor_dtindex_BDayx2                    |   0.9710 |   1.1800 |   0.8229 |
frame_nonunique_equal                        |   6.3940 |   7.7110 |   0.8292 |
frame_ctor_dtindex_QuarterBeginx1            |   0.9793 |   1.1793 |   0.8304 |
reindex_fillna_pad                           |   0.2143 |   0.2580 |   0.8306 |
frame_apply_ref_by_name                      |  10.7396 |  12.9287 |   0.8307 |
groupby_ngroups_10000_max                    |   1.5074 |   1.8050 |   0.8351 |
groupby_ngroups_10000_first                  |   1.5857 |   1.8973 |   0.8358 |
groupby_ngroups_100_cummax                   |  11.6933 |  13.9883 |   0.8359 |
index_datetime_intersection                  |   8.0390 |   9.6114 |   0.8364 |
frame_mask_floats                            |   4.5877 |   5.4599 |   0.8403 |
frame_drop_duplicates                        |  11.3926 |  13.4079 |   0.8497 |
groupby_ngroups_100_std                      |   0.3264 |   0.3827 |   0.8530 |
frame_reindex_columns                        |   0.2379 |   0.2789 |   0.8530 |
groupby_multi_size                           |  17.3423 |  20.3076 |   0.8540 |
series_ix_slice                              |   0.0416 |   0.0486 |   0.8562 |
datetimeindex_unique                         |   0.0723 |   0.0843 |   0.8577 |
groupby_first_float64                        |   2.2407 |   2.6090 |   0.8588 |
frame_ctor_dtindex_BQuarterEndx2             |   1.1080 |   1.2867 |   0.8611 |
reindex_frame_level_reindex                  |   0.5770 |   0.6690 |   0.8624 |
timeseries_custom_bmonthbegin_decr_n         |   0.2031 |   0.2350 |   0.8641 |
frame_ctor_dtindex_MonthEndx1                |   1.0313 |   1.1810 |   0.8732 |
groupby_series_nth_none                      |   1.0096 |   1.1547 |   0.8744 |
groupby_ngroups_100_rank                     |  11.8863 |  13.5717 |   0.8758 |
stat_ops_frame_sum_float_axis_1              |   3.5137 |   3.9994 |   0.8785 |
series_ctor_from_dict                        |   2.0247 |   2.3017 |   0.8796 |
frame_ctor_dtindex_MonthEndx2                |   1.0126 |   1.1484 |   0.8818 |
groupby_frame_singlekey_integer              |   1.3860 |   1.5713 |   0.8821 |
frame_ctor_dtindex_BYearBeginx1              |   1.1864 |   1.3447 |   0.8823 |
frame_drop_dup_na_inplace                    |   1.7404 |   1.9697 |   0.8836 |
groupby_frame_nth_any                        |   4.4223 |   4.9994 |   0.8846 |
frame_drop_dup_inplace                       |   1.8404 |   2.0734 |   0.8876 |
reindex_daterange_pad                        |   0.5534 |   0.6224 |   0.8892 |
groupby_ngroups_10000_var                    |   1.6543 |   1.8600 |   0.8894 |
timeseries_infer_freq                        |   6.6311 |   7.4290 |   0.8926 |
frame_interpolate_some_good                  |   0.9607 |   1.0746 |   0.8940 |
groupby_ngroups_100_cummin                   |  10.7533 |  12.0024 |   0.8959 |
read_csv_standard                            |   8.7403 |   9.7334 |   0.8980 |
timeseries_to_datetime_iso8601               |   3.1000 |   3.4483 |   0.8990 |
timeseries_to_datetime_YYYYMMDD              |   6.3020 |   6.9950 |   0.9009 |
sql_string_write_sqlalchemy                  |  54.9703 |  60.7810 |   0.9044 |
groupby_ngroups_10000_count                  |   1.5697 |   1.7330 |   0.9058 |
groupby_ngroups_10000_diff                   | 905.3814 | 999.3493 |   0.9060 |
frame_html_repr_trunc_mi                     |  29.4023 |  32.4394 |   0.9064 |
groupby_ngroups_10000_std                    |   1.8214 |   2.0063 |   0.9078 |
groupby_ngroups_10000_sem                    |   2.1150 |   2.3276 |   0.9087 |
frame_from_records_generator_nrows           |   0.6553 |   0.7210 |   0.9090 |
index_int64_union                            |  52.9044 |  58.1080 |   0.9104 |
multiindex_from_product                      |   8.3253 |   9.1114 |   0.9137 |
index_datetime_union                         |   7.9623 |   8.7016 |   0.9150 |
frame_ctor_dtindex_Weekx2                    |   0.7860 |   0.8587 |   0.9153 |
groupby_nth_object_any                       | 872.7303 | 951.7753 |   0.9169 |
groupby_ngroups_100_mean                     |   0.2930 |   0.3186 |   0.9197 |
packers_read_pack                            |  62.9237 |  68.2081 |   0.9225 |
groupby_frame_apply_overhead                 |   5.9593 |   6.4523 |   0.9236 |
timeseries_large_lookup_value                |   0.0126 |   0.0137 |   0.9244 |
packers_write_json_mixed_float_int_str       |  93.9660 | 101.5189 |   0.9256 |
frame_ctor_dtindex_BYearBeginx2              |   1.1083 |   1.1961 |   0.9266 |
eval_frame_add_python_one_thread             |  15.0924 |  16.2520 |   0.9286 |
frame_iloc_dups                              |   0.1764 |   0.1896 |   0.9300 |
groupby_ngroups_10000_sum                    |   1.8277 |   1.9616 |   0.9317 |
groupby_ngroups_10000_last                   |   1.7217 |   1.8463 |   0.9325 |
packers_read_stata                           |  36.7910 |  39.4524 |   0.9325 |
read_csv_precise_converter                   |   1.1957 |   1.2817 |   0.9328 |
timeseries_asof                              |   5.7844 |   6.1994 |   0.9331 |
series_value_counts_int64                    |   1.5830 |   1.6963 |   0.9332 |
frame_dtypes                                 |   0.0799 |   0.0857 |   0.9332 |
panel_shift_minor                            |   0.0700 |   0.0750 |   0.9333 |
groupby_dt_size                              |  18.6746 |  19.9850 |   0.9344 |
timeseries_custom_bday_cal_incr_neg_n        |   0.0196 |   0.0210 |   0.9356 |
index_str_boolean_indexer                    |   6.8731 |   7.3450 |   0.9357 |
groupby_first_float32                        |   2.4620 |   2.6309 |   0.9358 |
stat_ops_frame_sum_float_axis_0              |   3.4143 |   3.6473 |   0.9361 |
groupby_multi_different_functions            |   8.2520 |   8.8123 |   0.9364 |
timeseries_iter_datetimeindex_preexit        |   9.0630 |   9.6117 |   0.9429 |
frame_repr_tall                              |  17.1143 |  18.1334 |   0.9438 |
packers_read_json_date_index                 | 139.4784 | 147.7621 |   0.9439 |
dtype_infer_datetime64                       |   5.7657 |   6.0943 |   0.9461 |
read_csv_roundtrip_converter                 |   1.8900 |   1.9963 |   0.9468 |
groupby_ngroups_100_prod                     |   0.3593 |   0.3793 |   0.9472 |
lib_fast_zip_fillna                          |   8.9866 |   9.4833 |   0.9476 |
frame_reindex_both_axes_ix                   |  27.4524 |  28.9690 |   0.9476 |
sort_level_zero                              |   8.5460 |   9.0003 |   0.9495 |
groupby_ngroups_10000_cumprod                | 987.3126 | 1039.5793 |   0.9497 |
groupby_indices                              |   4.2933 |   4.5194 |   0.9500 |
frame_ctor_dtindex_MonthBeginx2              |   1.0253 |   1.0786 |   0.9506 |
series_getitem_slice                         |   0.0339 |   0.0357 |   0.9510 |
read_csv_skiprows                            |  11.6111 |  12.2014 |   0.9516 |
sql_float_write_fallback                     |  23.6900 |  24.8843 |   0.9520 |
sql_write_sqlalchemy                         | 121.5783 | 127.5000 |   0.9536 |
packers_read_json                            | 142.3674 | 149.2996 |   0.9536 |
sql_float_read_table_sqlalchemy              |  11.9420 |  12.5156 |   0.9542 |
frame_ctor_dtindex_Weekx1                    |   0.8196 |   0.8589 |   0.9542 |
groupby_ngroups_100_head                     |   0.5653 |   0.5923 |   0.9544 |
join_non_unique_equal                        |   0.3424 |   0.3587 |   0.9546 |
groupby_multi_different_numpy_functions      |   8.0523 |   8.4341 |   0.9547 |
merge_2intkey_sort                           |  24.8954 |  26.0690 |   0.9550 |
frame_ctor_dtindex_BMonthBeginx1             |   1.1203 |   1.1717 |   0.9562 |
groupby_transform_multi_key2                 |  33.9526 |  35.4800 |   0.9570 |
series_timestamp_compare                     |   7.5290 |   7.8657 |   0.9572 |
packers_write_stata_with_validation          |  31.8860 |  33.2757 |   0.9582 |
frame_ctor_dtindex_DateOffsetx2              |   0.8277 |   0.8633 |   0.9588 |
strings_count                                |   4.7774 |   4.9780 |   0.9597 |
sql_float_read_query_fallback                |   6.5816 |   6.8550 |   0.9601 |
frame_apply_pass_thru                        |   3.2713 |   3.4067 |   0.9603 |
index_float64_div                            |   2.0644 |   2.1497 |   0.9603 |
frame_multi_and_no_ne                        |  20.8013 |  21.6600 |   0.9604 |
index_int64_intersection                     |  10.9847 |  11.4144 |   0.9624 |
frame_ctor_dtindex_DateOffsetx1              |   0.8663 |   0.8997 |   0.9629 |
timeseries_iter_periodindex_preexit          |   9.4426 |   9.8027 |   0.9633 |
strings_len                                  |   1.3343 |   1.3837 |   0.9643 |
frame_to_csv                                 | 110.6810 | 114.6793 |   0.9651 |
strings_cat                                  |   0.5057 |   0.5236 |   0.9657 |
frame_count_level_axis0_mixed_dtypes_multi   | 107.9280 | 111.7450 |   0.9658 |
join_dataframe_index_multi                   |  12.2537 |  12.6839 |   0.9661 |
groupby_ngroups_10000_mad                    | 3435.5440 | 3555.5203 |   0.9663 |
sql_float_write_sqlalchemy                   |  55.9303 |  57.8830 |   0.9663 |
frame_ctor_dtindex_Easterx1                  |   0.9393 |   0.9716 |   0.9667 |
groupby_ngroups_100_any                      |   6.9167 |   7.1410 |   0.9686 |
stats_rank2d_axis1_average                   |   8.8743 |   9.1600 |   0.9688 |
groupby_ngroups_10000_describe               | 12356.2800 | 12750.7817 |   0.9691 |
groupby_multi_series_op                      |  10.1536 |  10.4600 |   0.9707 |
packers_write_json                           |  73.1703 |  75.3726 |   0.9708 |
series_getitem_array                         |   0.3147 |   0.3237 |   0.9723 |
frame_dropna_axis1_any_mixed_dtypes          | 164.5314 | 169.1933 |   0.9724 |
frame_dropna_axis0_any_mixed_dtypes          | 159.9306 | 164.4520 |   0.9725 |
concat_small_frames                          |  40.3993 |  41.5320 |   0.9727 |
timeseries_timestamp_downsample_mean         |   3.4340 |   3.5297 |   0.9729 |
frame_ctor_dtindex_BusinessDayx2             |   0.9197 |   0.9443 |   0.9739 |
groupby_transform_series2                    | 108.0357 | 110.8853 |   0.9743 |
groupby_ngroups_100_sum                      |   0.3707 |   0.3804 |   0.9745 |
frame_ctor_dtindex_YearEndx1                 |   0.8994 |   0.9213 |   0.9762 |
series_xs_mi_ix                              |   2.9663 |   3.0383 |   0.9763 |
packers_write_csv                            | 943.0904 | 965.2163 |   0.9771 |
timeseries_asof_single                       |   0.0173 |   0.0177 |   0.9776 |
frame_dropna_axis0_all_mixed_dtypes          | 192.6063 | 196.9613 |   0.9779 |
groupby_nth_datetimes_any                    | 889.3877 | 909.3717 |   0.9780 |
frame_ctor_dtindex_CDayx2                    |   0.8940 |   0.9136 |   0.9785 |
frame_interpolate                            |  64.3644 |  65.7176 |   0.9794 |
timeseries_iter_datetimeindex                | 539.4704 | 550.7627 |   0.9795 |
frame_to_string_floats                       |  22.1376 |  22.5883 |   0.9800 |
eval_frame_mult_python                       |  16.0577 |  16.3810 |   0.9803 |
frame_ctor_dtindex_CBMonthBeginx2            |   2.0204 |   2.0594 |   0.9811 |
panel_shift                                  |   0.0716 |   0.0730 |   0.9815 |
frame_ctor_dtindex_BQuarterEndx1             |   1.0660 |   1.0850 |   0.9826 |
reindex_daterange_backfill                   |   0.7203 |   0.7331 |   0.9827 |
packers_write_json_date_index                |  83.0213 |  84.4580 |   0.9830 |
timeseries_custom_bmonthend_decr_n           |   0.2303 |   0.2340 |   0.9840 |
frame_mask_bools                             |   6.4433 |   6.5400 |   0.9852 |
timeseries_add_irregular                     |   9.7550 |   9.8964 |   0.9857 |
frame_apply_lambda_mean                      |   4.3870 |   4.4500 |   0.9858 |
series_drop_duplicates_string                |   0.4007 |   0.4063 |   0.9861 |
frame_ctor_nested_dict                       |  56.1170 |  56.9040 |   0.9862 |
frame_ctor_dtindex_YearBeginx1               |   0.8856 |   0.8980 |   0.9863 |
groupby_pivot_table                          |  12.7386 |  12.9116 |   0.9866 |
frame_interpolate_some_good_infer            |   2.0733 |   2.1013 |   0.9866 |
frame_ctor_dtindex_YearEndx2                 |   0.9444 |   0.9566 |   0.9872 |
groupby_ngroups_100_all                      |   7.2843 |   7.3783 |   0.9873 |
groupby_ngroups_10000_cumsum                 | 987.8446 | 1000.3951 |   0.9875 |
groupby_ngroups_100_sem                      |   0.6073 |   0.6146 |   0.9881 |
groupby_transform_multi_key1                 |  49.5160 |  50.1057 |   0.9882 |
melt_dataframe                               |   1.4334 |   1.4487 |   0.9894 |
stat_ops_frame_sum_int_axis_1                |   3.4297 |   3.4643 |   0.9900 |
strings_contains_many                        |   4.6546 |   4.6970 |   0.9910 |
frame_mult                                   |   4.3540 |   4.3923 |   0.9913 |
groupby_frame_apply                          |  29.8660 |  30.1157 |   0.9917 |
panel_pct_change_major                       | 5035.1934 | 5074.3450 |   0.9923 |
read_csv_comment2                            |  20.2267 |  20.3773 |   0.9926 |
datetimeindex_add_offset                     |   0.1920 |   0.1934 |   0.9930 |
strings_endswith                             |   3.2713 |   3.2940 |   0.9931 |
frame_apply_np_mean                          |   4.2990 |   4.3247 |   0.9941 |
datetimeindex_normalize                      |   2.6603 |   2.6757 |   0.9942 |
frame_dropna_axis1_any                       |  19.4923 |  19.6040 |   0.9943 |
packers_write_json_mixed_delta_int_tstamp    | 102.4027 | 102.9747 |   0.9944 |
read_csv_vb                                  |  16.3047 |  16.3777 |   0.9955 |
groupby_ngroups_10000_cumcount               |  57.6493 |  57.8650 |   0.9963 |
frame_object_equal                           |   6.4127 |   6.4360 |   0.9964 |
frame_ctor_dtindex_Easterx2                  |   0.9423 |   0.9456 |   0.9965 |
series_ix_scalar                             |   0.0236 |   0.0237 |   0.9966 |
write_csv_standard                           |  38.6047 |  38.7180 |   0.9971 |
groupby_transform_multi_key4                 | 100.7117 | 100.9640 |   0.9975 |
groupby_ngroups_10000_size                   |   3.2973 |   3.3036 |   0.9981 |
groupby_nth_float32_none                     |  66.8260 |  66.9477 |   0.9982 |
timeseries_with_format_no_exact              | 601.0760 | 601.8333 |   0.9987 |
frame_fillna_many_columns_pad                |   3.2874 |   3.2890 |   0.9995 |
datetimeindex_infer_dst                      |   2.6283 |   2.6280 |   1.0001 |
panel_from_dict_same_index                   |  30.4280 |  30.4210 |   1.0002 |
groupby_ngroups_10000_value_counts           | 3861.8680 | 3859.5047 |   1.0006 |
timeseries_asof_nan                          |   5.5869 |   5.5803 |   1.0012 |
frame_multi_and                              |  22.1247 |  22.0914 |   1.0015 |
frame_add_st                                 |   4.3914 |   4.3824 |   1.0020 |
groupby_frame_median                         |   4.9524 |   4.9326 |   1.0040 |
frame_fancy_lookup                           |   2.5950 |   2.5843 |   1.0042 |
frame_ctor_dtindex_QuarterEndx1              |   1.0297 |   1.0243 |   1.0052 |
groupby_last_object                          |  12.3967 |  12.3320 |   1.0052 |
timeseries_is_month_start                    |   2.3900 |   2.3774 |   1.0053 |
series_align_left_monotonic                  |  12.2104 |  12.1450 |   1.0054 |
lib_fast_zip                                 |   6.3469 |   6.3126 |   1.0054 |
frame_ctor_dtindex_CBMonthEndx2              |   2.8620 |   2.8457 |   1.0057 |
groupby_ngroups_10000_pct_change             | 2949.9527 | 2930.2514 |   1.0067 |
frame_fillna_inplace                         |   7.3990 |   7.3454 |   1.0073 |
eval_frame_add_python                        |  14.7847 |  14.6750 |   1.0075 |
groupby_nth_object_none                      | 479.3561 | 475.7130 |   1.0077 |
timeseries_slice_minutely                    |   0.0400 |   0.0397 |   1.0080 |
frame_mult_st                                |   4.3757 |   4.3403 |   1.0081 |
groupby_multi_python                         |  64.3490 |  63.8003 |   1.0086 |
frame_reindex_both_axes                      |  27.2237 |  26.9873 |   1.0088 |
dataframe_reindex                            |   0.2473 |   0.2450 |   1.0094 |
groupby_nth_datetimes_none                   | 460.5076 | 456.1773 |   1.0095 |
frame_ctor_dtindex_QuarterEndx2              |   1.0264 |   1.0166 |   1.0096 |
frame_iloc_big                               |   0.1190 |   0.1176 |   1.0115 |
sql_datetime_write_sqlalchemy                | 105.9604 | 104.7047 |   1.0120 |
timestamp_ops_diff1                          |  12.7310 |  12.5770 |   1.0122 |
series_value_counts_strings                  |   3.2010 |   3.1614 |   1.0125 |
frame_ctor_dtindex_BMonthEndx2               |   0.9724 |   0.9593 |   1.0136 |
frame_shift_axis_1                           |  13.0233 |  12.8467 |   1.0137 |
frame_ctor_dtindex_BYearEndx2                |   1.1053 |   1.0897 |   1.0144 |
frame_dropna_axis0_any                       |  20.1437 |  19.8566 |   1.0145 |
frame_ctor_dtindex_CBMonthBeginx1            |   2.2947 |   2.2613 |   1.0148 |
reshape_stack_simple                         |   1.7076 |   1.6827 |   1.0148 |
reindex_fillna_pad_float32                   |   0.4029 |   0.3970 |   1.0150 |
append_frame_single_homogenous               |   0.8920 |   0.8783 |   1.0156 |
frame_add_no_ne                              |   4.4303 |   4.3620 |   1.0157 |
indexing_dataframe_boolean_rows_object       |   0.3884 |   0.3823 |   1.0158 |
groupby_multi_cython                         |  11.0527 |  10.8797 |   1.0159 |
frame_ctor_dtindex_CDayx1                    |   0.9046 |   0.8896 |   1.0169 |
groupby_ngroups_10000_rank                   | 1007.1193 | 989.4890 |   1.0178 |
groupby_nth_float64_none                     |  65.9187 |  64.7260 |   1.0184 |
frame_count_level_axis1_mixed_dtypes_multi   |  81.4800 |  79.9817 |   1.0187 |
frame_sort_index_by_columns                  |  31.9147 |  31.3116 |   1.0193 |
groupby_ngroups_10000_head                   |  65.1494 |  63.9053 |   1.0195 |
frame_to_csv_mixed                           | 503.7344 | 494.0030 |   1.0197 |
unstack_sparse_keyspace                      |   0.9720 |   0.9530 |   1.0199 |
read_csv_thou_vb                             |  14.9190 |  14.6270 |   1.0200 |
timeseries_with_format_replace               | 825.8780 | 809.6297 |   1.0201 |
frame_dropna_axis1_all_mixed_dtypes          | 201.0813 | 197.0267 |   1.0206 |
groupby_transform_multi_key3                 | 563.4913 | 551.8860 |   1.0210 |
timedelta_convert_int                        |   0.1133 |   0.1109 |   1.0215 |
match_strings                                |   0.2960 |   0.2897 |   1.0217 |
timeseries_iter_periodindex                  | 937.8570 | 917.8220 |   1.0218 |
join_dataframe_integer_key                   |   1.1926 |   1.1670 |   1.0220 |
stat_ops_series_std                          |   0.3860 |   0.3777 |   1.0221 |
frame_multi_and_st                           |  22.6813 |  22.1707 |   1.0230 |
frame_apply_axis_1                           |  58.1466 |  56.8313 |   1.0231 |
groupby_ngroups_100_diff                     |  11.0393 |  10.7853 |   1.0236 |
frame_ctor_dtindex_BQuarterBeginx1           |   1.1190 |   1.0923 |   1.0244 |
frame_ctor_dtindex_BQuarterBeginx2           |   1.1107 |   1.0840 |   1.0246 |
frame_float_equal                            |   1.3397 |   1.3063 |   1.0256 |
frame_from_records_generator                 |  54.8224 |  53.4543 |   1.0256 |
strings_contains_many_noregex                |   2.0823 |   2.0300 |   1.0258 |
strings_get                                  |   2.3750 |   2.3136 |   1.0266 |
replace_replacena                            |   0.4417 |   0.4303 |   1.0266 |
dtype_infer_timedelta64_2                    |   8.5163 |   8.2900 |   1.0273 |
frame_html_repr_trunc_si                     |  21.0226 |  20.4570 |   1.0276 |
dataframe_resample_mean_string               |   2.4560 |   2.3890 |   1.0280 |
groupby_ngroups_10000_unique                 | 511.7454 | 497.7160 |   1.0282 |
frame_shift_axis0                            |   9.3033 |   9.0457 |   1.0285 |
frame_dropna_axis0_all                       |  45.4573 |  44.1926 |   1.0286 |
groupby_ngroups_100_nunique                  |   7.7484 |   7.5307 |   1.0289 |
sql_string_write_fallback                    |  23.1417 |  22.4884 |   1.0291 |
dataframe_resample_max_numpy                 |   1.2786 |   1.2423 |   1.0292 |
panel_pct_change_items                       | 5977.8221 | 5807.0730 |   1.0294 |
frame_loc_dups                               |   0.6920 |   0.6714 |   1.0307 |
frame_ctor_dtindex_BMonthBeginx2             |   1.1263 |   1.0924 |   1.0311 |
read_csv_default_converter                   |   1.3087 |   1.2683 |   1.0318 |
read_table_multiple_date_baseline            |  62.8977 |  60.9293 |   1.0323 |
groupby_ngroups_100_value_counts             |  37.9097 |  36.7200 |   1.0324 |
strings_upper                                |   4.4840 |   4.3400 |   1.0332 |
eval_frame_and_python_one_thread             |  21.4583 |  20.7507 |   1.0341 |
frame_insert_100_columns_begin               |  25.8463 |  24.9857 |   1.0344 |
groupby_agg_builtins1                        |   7.4780 |   7.2234 |   1.0353 |
eval_frame_and_python                        |  20.7856 |  20.0644 |   1.0359 |
series_getitem_list_like                     |   0.1480 |   0.1427 |   1.0367 |
strings_strip                                |   3.4354 |   3.3133 |   1.0368 |
frame_ctor_dtindex_BMonthEndx1               |   0.9767 |   0.9417 |   1.0372 |
frame_mult_no_ne                             |   4.4643 |   4.3034 |   1.0374 |
groupby_ngroups_100_cumcount                 |   0.5494 |   0.5294 |   1.0378 |
frame_add                                    |   4.5600 |   4.3924 |   1.0382 |
series_string_vector_slice                   | 168.7703 | 162.5583 |   1.0382 |
groupby_ngroups_10000_cummin                 | 1077.7996 | 1038.0610 |   1.0383 |
groupby_ngroups_10000_all                    | 720.6297 | 694.0330 |   1.0383 |
strings_extract                              |  34.1103 |  32.8497 |   1.0384 |
concat_empty_frames2                         |   0.7330 |   0.7057 |   1.0386 |
frame_ctor_dtindex_CBMonthEndx1              |   2.9496 |   2.8390 |   1.0390 |
packers_write_json_T                         |  87.2324 |  83.9403 |   1.0392 |
groupby_simple_compress_timing               |  22.7056 |  21.8263 |   1.0403 |
frame_ctor_dtindex_YearBeginx2               |   0.8867 |   0.8500 |   1.0432 |
dtype_infer_uint32                           |   0.3046 |   0.2920 |   1.0433 |
groupby_frame_nth_none                       |   1.5750 |   1.5093 |   1.0435 |
strings_startswith                           |   3.4940 |   3.3460 |   1.0442 |
strings_encode_decode                        |   0.2256 |   0.2160 |   1.0445 |
sql_datetime_read_as_native_sqlalchemy       |  19.7659 |  18.9184 |   1.0448 |
groupby_ngroups_10000_mean                   |   1.6270 |   1.5570 |   1.0450 |
frame_dropna_axis1_all                       |  54.1590 |  51.7350 |   1.0469 |
groupby_dt_timegrouper_size                  |  14.8040 |  14.1377 |   1.0471 |
groupby_agg_builtins2                        |  33.8887 |  32.3486 |   1.0476 |
packers_write_sql                            | 1956.6897 | 1866.3857 |   1.0484 |
panel_pct_change_minor                       | 5288.4736 | 5042.6934 |   1.0487 |
frame_reindex_axis1                          | 143.6096 | 136.9224 |   1.0488 |
sort_level_one                               |   8.7850 |   8.3740 |   1.0491 |
timeseries_1min_5min_ohlc                    |   0.6716 |   0.6401 |   1.0493 |
groupby_ngroups_10000_nunique                | 704.0600 | 670.6223 |   1.0499 |
stats_rank2d_axis0_average                   |  18.2641 |  17.3707 |   1.0514 |
groupby_apply_dict_return                    |  25.1443 |  23.9117 |   1.0515 |
strings_lower                                |   4.9067 |   4.6660 |   1.0516 |
groupby_transform_series                     |  15.4867 |  14.7260 |   1.0517 |
series_ix_array                              |   0.5894 |   0.5604 |   1.0518 |
multiindex_with_datetime_level_sliced        |   0.1463 |   0.1390 |   1.0526 |
groupby_ngroups_10000_skew                   | 981.4417 | 930.7914 |   1.0544 |
groupby_ngroups_100_skew                     |   9.8927 |   9.3813 |   1.0545 |
frame_ctor_nested_dict_int64                 |  60.3494 |  57.1480 |   1.0560 |
frame_to_html_mixed                          | 165.5410 | 156.7531 |   1.0561 |
frame_ctor_list_of_dict                      |  59.4960 |  56.2257 |   1.0582 |
groupby_ngroups_100_mad                      |  35.4846 |  33.4873 |   1.0596 |
groupby_ngroups_10000_tail                   |  60.6120 |  57.1950 |   1.0597 |
groupby_transform                            | 119.0536 | 111.8713 |   1.0642 |
stats_rank_average                           |  25.0426 |  23.5287 |   1.0643 |
timestamp_ops_diff2                          |  16.7730 |  15.7553 |   1.0646 |
frame_ctor_dtindex_BusinessDayx1             |   0.9303 |   0.8733 |   1.0652 |
sql_write_fallback                           |  48.0767 |  45.0877 |   1.0663 |
timestamp_series_compare                     |   7.7377 |   7.2530 |   1.0668 |
indexing_dataframe_boolean_rows              |   0.2377 |   0.2227 |   1.0675 |
reshape_unstack_simple                       |   2.2411 |   2.0994 |   1.0675 |
strings_rstrip                               |   3.0786 |   2.8836 |   1.0676 |
dti_reset_index_tz                           |   4.7610 |   4.4587 |   1.0678 |
multiindex_with_datetime_level_full          |   9.3650 |   8.7677 |   1.0681 |
datetime_index_intersection                  |   0.2949 |   0.2760 |   1.0685 |
concat_series_axis1                          |  65.8151 |  61.4983 |   1.0702 |
series_getitem_pos_slice                     |   0.0397 |   0.0370 |   1.0708 |
dtype_infer_float32                          |   0.2507 |   0.2340 |   1.0713 |
stat_ops_frame_mean_int_axis_1               |   4.8420 |   4.5130 |   1.0729 |
stat_ops_frame_mean_float_axis_0             |   3.8706 |   3.6034 |   1.0742 |
timeseries_custom_bday_decr                  |   0.0240 |   0.0223 |   1.0747 |
series_iloc_list_like                        |   0.2007 |   0.1867 |   1.0749 |
frame_reindex_upcast                         |   5.9930 |   5.5740 |   1.0752 |
indexing_panel_subset                        |   0.6670 |   0.6204 |   1.0752 |
packers_read_pickle                          | 108.4936 | 100.8810 |   1.0755 |
groupby_ngroups_10000_any                    | 707.5600 | 657.6886 |   1.0758 |
dtype_infer_timedelta64_1                    |  46.3394 |  42.9693 |   1.0784 |
groupby_ngroups_10000_cummax                 | 1078.3180 | 998.9180 |   1.0795 |
groupby_ngroups_10000_prod                   |   1.8747 |   1.7360 |   1.0799 |
reindex_fillna_backfill                      |   0.2646 |   0.2450 |   1.0801 |
groupby_ngroups_100_tail                     |   0.6087 |   0.5634 |   1.0804 |
strings_contains_few                         |   4.5360 |   4.1974 |   1.0807 |
sql_read_query_fallback                      |  26.6887 |  24.6690 |   1.0819 |
groupby_series_nth_any                       |   3.1127 |   2.8744 |   1.0829 |
frame_ctor_dtindex_CustomBusinessDayx1       |   0.9580 |   0.8833 |   1.0845 |
strings_center                               |   3.7570 |   3.4637 |   1.0847 |
stat_ops_level_frame_sum                     |   2.1319 |   1.9654 |   1.0848 |
strings_lstrip                               |   3.3210 |   3.0596 |   1.0854 |
frame_get_dtype_counts                       |   0.0757 |   0.0697 |   1.0855 |
frame_ctor_dtindex_BDayx1                    |   0.9207 |   0.8463 |   1.0879 |
merge_2intkey_nosort                         |  13.6313 |  12.5194 |   1.0888 |
stats_rolling_mean                           |   0.8846 |   0.8110 |   1.0907 |
dataframe_resample_min_string                |   1.4046 |   1.2860 |   1.0922 |
groupby_ngroups_100_var                      |   0.3340 |   0.3057 |   1.0928 |
timeseries_sort_index                        |   7.1480 |   6.5390 |   1.0931 |
groupby_ngroups_100_pct_change               |  30.0250 |  27.4397 |   1.0942 |
packers_read_sql                             | 456.4653 | 417.0200 |   1.0946 |
groupby_ngroups_100_unique                   |   5.5910 |   5.1040 |   1.0954 |
replace_large_dict                           | 9793.7127 | 8937.8887 |   1.0958 |
timedelta_convert_string_seconds             | 103.9890 |  94.8300 |   1.0966 |
sql_read_table_sqlalchemy                    |  32.6030 |  29.7074 |   1.0975 |
read_table_multiple_date                     | 144.7484 | 131.8140 |   1.0981 |
frame_ctor_dtindex_QuarterBeginx2            |   1.0433 |   0.9487 |   1.0998 |
frame_ctor_dtindex_CustomBusinessDayx2       |   0.9689 |   0.8806 |   1.1003 |
stats_corr_spearman                          |  68.1484 |  61.8983 |   1.1010 |
packers_read_csv                             | 157.6234 | 143.1127 |   1.1014 |
panel_from_dict_two_different_indexes        |  62.8486 |  56.9550 |   1.1035 |
frame_getitem_single_column2                 |  19.1580 |  17.3460 |   1.1045 |
eval_frame_chained_cmp_python_one_thread     |  91.1884 |  82.3380 |   1.1075 |
frame_count_level_axis1_multi                |  86.7036 |  78.2163 |   1.1085 |
frame_apply_user_func                        |  70.7947 |  63.8500 |   1.1088 |
groupby_ngroups_100_size                     |   0.4110 |   0.3699 |   1.1111 |
timeseries_custom_bmonthend_incr_n           |   0.2093 |   0.1884 |   1.1114 |
stats_rank_pct_average                       |  29.2093 |  26.2606 |   1.1123 |
groupby_ngroups_10000_min                    |   1.9766 |   1.7763 |   1.1128 |
frame_ctor_dtindex_MonthBeginx1              |   1.0486 |   0.9410 |   1.1143 |
groupby_multi_count                          |   6.1553 |   5.5184 |   1.1154 |
groupby_ngroups_100_cumprod                  |  11.5967 |  10.3873 |   1.1164 |
groupby_series_simple_cython                 | 167.5251 | 149.9200 |   1.1174 |
sql_read_query_sqlalchemy                    |  32.5234 |  29.1040 |   1.1175 |
index_float64_mul                            |   1.8547 |   1.6540 |   1.1213 |
i8merge                                      | 880.4169 | 784.4897 |   1.1223 |
stat_ops_frame_sum_int_axis_0                |   3.5937 |   3.2016 |   1.1224 |
frame_drop_duplicates_na                     |  12.6983 |  11.3060 |   1.1232 |
timedelta_convert_string                     |  87.5190 |  77.7940 |   1.1250 |
series_loc_scalar                            |   0.0267 |   0.0237 |   1.1275 |
groupby_last_float64                         |   3.5220 |   3.1199 |   1.1289 |
series_drop_duplicates_int                   |   0.6540 |   0.5790 |   1.1294 |
series_ix_list_like                          |   0.1517 |   0.1343 |   1.1296 |
packers_write_stata                          |  20.9417 |  18.5220 |   1.1306 |
timeseries_custom_bday_cal_decr              |   0.0234 |   0.0207 |   1.1308 |
strings_replace                              |  11.4644 |  10.1280 |   1.1319 |
period_setitem                               |  13.2340 |  11.6623 |   1.1348 |
timeseries_custom_bday_incr                  |   0.0140 |   0.0123 |   1.1355 |
frame_to_csv2                                | 110.0953 |  96.9253 |   1.1359 |
groupby_ngroups_100_describe                 | 137.4177 | 120.8344 |   1.1372 |
packers_write_json_mixed_float_int           | 131.6490 | 115.6390 |   1.1384 |
strings_get_dummies                          |  66.3884 |  58.2893 |   1.1389 |
timeseries_custom_bmonthend_incr             |   0.1527 |   0.1340 |   1.1394 |
packers_read_stata_with_validation           |  55.2636 |  48.4761 |   1.1400 |
replace_fillna                               |   0.9890 |   0.8654 |   1.1429 |
dtype_infer_int32                            |   0.4090 |   0.3577 |   1.1433 |
frame_isnull                                 |   0.4657 |   0.4070 |   1.1443 |
join_dataframe_integer_2key                  |   3.3087 |   2.8860 |   1.1465 |
series_align_irregular_string                |  48.6543 |  42.3963 |   1.1476 |
frame_xs_row                                 |   0.0333 |   0.0290 |   1.1479 |
dti_reset_index                              |   0.3053 |   0.2657 |   1.1493 |
append_frame_single_mixed                    |   1.4187 |   1.2324 |   1.1512 |
strings_findall                              |   8.1600 |   7.0873 |   1.1514 |
timeseries_1min_5min_mean                    |   0.6646 |   0.5770 |   1.1519 |
multiindex_duplicated                        |  96.3167 |  83.5230 |   1.1532 |
packers_write_pickle                         | 130.7033 | 113.0910 |   1.1557 |
timeseries_custom_bday_cal_incr              |   0.0216 |   0.0187 |   1.1574 |
panel_from_dict_equiv_indexes                |  39.0920 |  33.7410 |   1.1586 |
frame_insert_500_columns_end                 |  84.1594 |  72.6186 |   1.1589 |
frame_to_csv_date_formatting                 |  11.6611 |  10.0466 |   1.1607 |
frame_iteritems                              |  25.1820 |  21.6257 |   1.1644 |
series_iloc_list_like                        |   0.7620 |   0.6526 |   1.1676 |
strings_repeat                               |   3.8571 |   3.3034 |   1.1676 |
reshape_pivot_time_series                    | 157.2340 | 134.5960 |   1.1682 |
strings_title                                |   7.7453 |   6.6237 |   1.1693 |
timeseries_year_incr                         |   0.0164 |   0.0140 |   1.1705 |
stat_ops_frame_mean_float_axis_1             |   4.6740 |   3.9916 |   1.1709 |
stat_ops_level_series_sum                    |   1.4474 |   1.2354 |   1.1716 |
packers_write_json_mixed_float_int_T         | 107.0007 |  91.1396 |   1.1740 |
stats_rank_pct_average_old                   |  28.4254 |  24.0947 |   1.1797 |
indexing_dataframe_boolean_no_ne             |  85.9000 |  72.7403 |   1.1809 |
strings_join_split                           |  36.1074 |  30.5471 |   1.1820 |
read_parse_dates_iso8601                     |   1.1773 |   0.9950 |   1.1832 |
stats_rank_average_int                       |  19.7994 |  16.6473 |   1.1893 |
series_loc_array                             |   0.9330 |   0.7813 |   1.1942 |
left_outer_join_index                        | 2329.4190 | 1949.8839 |   1.1946 |
groupby_int_count                            |   3.7150 |   3.0710 |   1.2097 |
strings_slice                                |   3.2950 |   2.7227 |   1.2102 |
sparse_frame_constructor                     |   5.0067 |   4.1353 |   1.2107 |
groupby_ngroups_100_cumsum                   |  12.2750 |  10.1330 |   1.2114 |
sparse_series_to_frame                       | 121.3010 |  99.8526 |   1.2148 |
indexing_dataframe_boolean_st                |  90.5453 |  74.4573 |   1.2161 |
groupby_sum_booleans                         |   1.0333 |   0.8493 |   1.2166 |
groupby_first_object                         |  15.8263 |  12.9640 |   1.2208 |
frame_ctor_dtindex_Nanox2                    |   0.9380 |   0.7677 |   1.2218 |
groupby_transform_ufunc                      | 101.8643 |  83.3070 |   1.2228 |
sql_float_read_query_sqlalchemy              |  12.8566 |  10.4984 |   1.2246 |
frame_ctor_dtindex_Microx1                   |   0.9426 |   0.7663 |   1.2301 |
reindex_fillna_backfill_float32              |   0.2473 |   0.2007 |   1.2325 |
frame_ctor_dtindex_Nanox1                    |   0.9500 |   0.7660 |   1.2403 |
sql_datetime_read_and_parse_sqlalchemy       |  18.3130 |  14.7080 |   1.2451 |
timeseries_period_downsample_mean            |  11.1174 |   8.8491 |   1.2563 |
dtype_infer_int64                            |   0.6446 |   0.5107 |   1.2622 |
groupby_last_float32                         |   3.0770 |   2.4350 |   1.2637 |
stat_ops_frame_mean_int_axis_0               |   4.5147 |   3.5590 |   1.2685 |
panel_from_dict_all_different_indexes        | 100.3427 |  78.9137 |   1.2715 |
reindex_multiindex                           |   1.3320 |   1.0437 |   1.2762 |
timeseries_year_apply                        |   0.0153 |   0.0120 |   1.2781 |
series_align_int64_index                     |  34.4250 |  26.7910 |   1.2849 |
strings_pad                                  |   4.4510 |   3.4587 |   1.2869 |
stat_ops_level_series_sum_multiple           |   4.9117 |   3.8057 |   1.2906 |
series_iloc_array                            |   5.0873 |   3.8664 |   1.3158 |
read_csv_infer_datetime_format_ymd           |   2.0707 |   1.5700 |   1.3189 |
dtype_infer_float64                          |   0.6750 |   0.5116 |   1.3192 |
frame_ctor_dtindex_Hourx1                    |   0.9096 |   0.6883 |   1.3216 |
frame_getitem_single_column                  |  19.6416 |  14.7943 |   1.3276 |
frame_ctor_dtindex_Microx2                   |   0.9057 |   0.6807 |   1.3305 |
series_loc_slice                             |   0.0476 |   0.0357 |   1.3341 |
stat_ops_level_frame_sum_multiple            |   6.6206 |   4.9623 |   1.3342 |
frame_ctor_dtindex_Minutex1                  |   0.9073 |   0.6800 |   1.3344 |
index_float64_boolean_series_indexer         |   3.6701 |   2.7480 |   1.3355 |
strings_match                                |   6.0484 |   4.5284 |   1.3357 |
dataframe_resample_min_numpy                 |   1.6863 |   1.2569 |   1.3416 |
eval_frame_chained_cmp_python                |  90.7466 |  66.8840 |   1.3568 |
index_str_boolean_series_indexer             |  10.8276 |   7.8300 |   1.3828 |
frame_ctor_dtindex_Minutex2                  |   0.8980 |   0.6429 |   1.3968 |
packers_write_pack                           |  27.5940 |  19.7157 |   1.3996 |
frame_ctor_dtindex_Dayx1                     |   0.9130 |   0.6493 |   1.4061 |
join_dataframe_index_single_key_bigger_sort  |  12.1287 |   8.5853 |   1.4127 |
frame_ctor_dtindex_Millix1                   |   0.8970 |   0.6333 |   1.4164 |
frame_ctor_dtindex_Millix2                   |   0.9647 |   0.6777 |   1.4236 |
frame_ctor_dtindex_Dayx2                     |   0.9143 |   0.6410 |   1.4264 |
frame_ctor_dtindex_Hourx2                    |   0.9403 |   0.6537 |   1.4385 |
groupby_ngroups_100_median                   |   0.4290 |   0.2973 |   1.4429 |
dataframe_resample_max_string                |   1.7533 |   1.2083 |   1.4511 |
frame_ctor_dtindex_Secondx1                  |   0.9033 |   0.6204 |   1.4561 |
frame_ctor_dtindex_Secondx2                  |   0.9097 |   0.6200 |   1.4674 |
timeseries_custom_bday_cal_incr_n            |   0.0263 |   0.0173 |   1.5183 |
series_getitem_label_slice                   |   0.0650 |   0.0427 |   1.5233 |
series_iloc_slice                            |   0.0410 |   0.0264 |   1.5542 |
indexing_dataframe_boolean                   | 110.2073 |  69.7950 |   1.5790 |
strings_contains_few_noregex                 |   3.1060 |   1.9280 |   1.6110 |
frame_boolean_row_select                     |   0.3196 |   0.1919 |   1.6654 |
frame_xs_mi_ix                               |   4.7997 |   2.8503 |   1.6839 |
datetime_index_union                         |   0.0747 |   0.0406 |   1.8395 |
eval_frame_mult_python_one_thread            |  27.5110 |  14.1670 |   1.9419 |
timeseries_custom_bday_apply_dt64            |   0.0373 |   0.0130 |   2.8598 |
groupby_first_datetimes                      |  24.7090 |   7.6466 |   3.2314 |
groupby_last_datetimes                       |  29.9554 |   9.0907 |   3.2952 |
timeseries_custom_bday_apply                 |   0.0403 |   0.0114 |   3.5455 |
timeseries_day_incr                          |   0.0237 |   0.0060 |   3.9733 |
timeseries_day_apply                         |   0.0267 |   0.0053 |   5.0149 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

Target [f2b54fb] : BUG: Fixes GH9311 groupby on datetime64

datetime64 columns were changing at the nano-second scale when
applying a groupby aggregator.
Base   [5fd1fbd] : Merge pull request #9318 from jorisvandenbossche/doc-api-dt

DOC: delete removed Timedelta properties (see GH9257) from API overview

shoyer · 2015-02-03T04:51:13Z

pandas/src/generate_code.py

@@ -4,6 +4,11 @@
 # or we get a bootstrapping problem
 from StringIO import StringIO

+MAX_INT8 = 127


use the dtype properties instead of hard coding these, e.g., https://github.com/pydata/pandas/blob/484f66814a32324248e3d03203ea991ed896fd7a/pandas/core/common.py#L52
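
As a rough illustration of that suggestion, the integer and float limits can be read from NumPy's dtype metadata rather than typed in by hand. This is only a sketch: apart from `MAX_INT8`, the constant names below are hypothetical and are not taken from `generate_code.py`.

```python
import numpy as np

# Derive the limits from the dtype instead of hard-coding 127.
# Only MAX_INT8 appears in the diff; the other names are illustrative.
MAX_INT8 = np.iinfo(np.int8).max
MIN_INT8 = np.iinfo(np.int8).min
MAX_INT64 = np.iinfo(np.int64).max
MAX_FLOAT64 = np.finfo(np.float64).max

print(MAX_INT8, MIN_INT8, MAX_INT64)
```

`np.iinfo` covers integer dtypes and `np.finfo` covers floating-point dtypes, so the same pattern works for every width the code generator templates over.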

shoyer · 2015-02-03T05:00:07Z

OK, this looks much more reasonable to me.

I'm slightly troubled by the groupby_first_datetimes and groupby_last_datetimes performance tests -- aren't those the exact operations you tried hard to keep fast here? Can you run those tests again (see the -r option to vbench) to verify that they slowed down, and if so, figure out why? Maybe it's casting='safe'?

shoyer · 2015-02-03T05:00:39Z

pandas/core/groupby.py

@@ -85,6 +87,8 @@
 _dataframe_apply_whitelist = \
    _common_apply_whitelist | frozenset(['dtypes', 'corrwith'])

+_non_arithmetic_agg = ('first', 'last', 'min', 'max', 'nth', 'count')


This should probably be a frozenset

chrisbyboston · 2015-02-03T12:58:11Z

@shoyer Thanks for looking at it so quickly. I'll make the changes suggested, and I'll use a profiler to figure out why groupby_first_datetimes and groupby_last_datetimes are slower.

chrisbyboston · 2015-02-04T04:19:13Z

@shoyer

Alright, I found it. Here's the relevant function:

    def _try_coerce_result(self, result):
        """ reverse of try_coerce_args """
        if isinstance(result, np.ndarray):
            if result.dtype == 'i8':
                result = tslib.array_to_datetime(
                    result.astype(object).ravel()).reshape(result.shape)
            elif result.dtype.kind in ['i', 'f', 'O']:
                result = result.astype('M8[ns]', casting='safe')
        elif isinstance(result, (np.integer, np.datetime64)):
            result = lib.Timestamp(result)
        return result

We aren't even hitting the casting='safe' section any more because the new Cython functions keep the datetime64 as an i8 when it comes into this function. The slowdown comes from the line that gets hit under the if result.dtype == 'i8' condition. Additionally, casting='safe' is useless in this elif block, because in numpy, none of the dtypes we're looking for can be safely cast to 'M8[ns]'. Here's the proof:

In [5]: np.can_cast(np.int8, 'M8[ns]')
Out[5]: False

In [6]: np.can_cast(np.int16, 'M8[ns]')
Out[6]: False

In [7]: np.can_cast(np.int32, 'M8[ns]')
Out[7]: False

In [8]: np.can_cast(np.int64, 'M8[ns]')
Out[8]: False

In [9]: np.can_cast(np.float32, 'M8[ns]')
Out[9]: False

In [10]: np.can_cast(np.float64, 'M8[ns]')
Out[10]: False

In [11]: np.can_cast('O', 'M8[ns]')
Out[11]: False

What's more, this block...

            if result.dtype == 'i8':
                result = tslib.array_to_datetime(
                    result.astype(object).ravel()).reshape(result.shape)

...I don't believe is necessary any more, as I think it was covering for this error in numpy 1.6.
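
For illustration only, here is one way an i8 result could be turned back into datetime64 without the round trip through object dtype. This is a sketch of the general idea, not necessarily the cleanup that was pushed, and the sample timestamp is made up.

```python
import numpy as np

# A made-up timestamp with non-zero nanoseconds.
stamps = np.array(['2015-01-23T16:36:04.123456789'], dtype='M8[ns]')

# What the aggregation hands back: the same bits seen as int64.
as_int = stamps.view('i8')

# Reinterpret the int64 buffer as datetime64[ns]; no per-element conversion.
restored = as_int.view('M8[ns]')

print(restored.dtype, (restored == stamps).all())
```

Because `view` only reinterprets the existing buffer, the nanosecond component survives and no per-element Python objects are created.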

I'm going to clean this function up and make sure all the tests are passing and that vbench looks better, and I'll push up my changes.

shoyer · 2015-02-04T04:27:08Z

@iwschris Interesting -- sounds like a good plan to me!

jreback · 2015-02-04T11:08:17Z

@iwschris The point of that coercion was to handle the cases where a returned result was non-i8, e.g. an op was applied to a datetime64 (say mean in a groupby) that returned a float / object. So it probably wasn't hit very much (maybe not tested at all).
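
A small sketch of why a float result is awkward for datetime64 data: float64 has a 53-bit mantissa, so it cannot represent every int64 nanosecond count exactly. The value below is an arbitrary nanosecond timestamp chosen for illustration; it is not taken from the issue.

```python
import numpy as np

# An arbitrary nanosecond count whose last digit is non-zero.
nanos = np.array([1420070400000000001], dtype='i8')

# Round-trip through float64, as an aggregation returning floats would.
via_float = nanos.astype('f8').astype('i8')

# Number of nanoseconds lost in the round trip.
lost = int(nanos[0] - via_float[0])
print(via_float[0], lost)
```

At this magnitude adjacent float64 values are 256 apart, so the trailing nanosecond is rounded away, which is the kind of nanosecond-scale drift the PR description mentions.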

chrisbyboston · 2015-02-05T19:38:23Z

@shoyer I think I've made all the changes that you requested, and the groupby performance is fixed. Let me know if you see anything else that needs to be changed.

chrisbyboston · 2015-02-13T19:05:22Z

Cool. I think that last push takes care of everything mentioned up to this point. Anything else?

jreback · 2015-02-13T19:11:46Z

just FYI

In [1]: x = pd.date_range('20130101',periods=3)

In [2]: x.values            
Out[2]: 
array(['2012-12-31T19:00:00.000000000-0500',
       '2013-01-01T19:00:00.000000000-0500',
       '2013-01-02T19:00:00.000000000-0500'], dtype='datetime64[ns]')

In [3]: x.values.view('i8').base
Out[3]: array([1356998400000000000, 1357084800000000000, 1357171200000000000])

In [4]: x.values.astype('i8',copy=False).base

In [5]: np.asarray(x.values,'i8').base

The reason we always want to take a view on an M8/m8 object is that it does NOT copy,
whereas the other two methods always copy (even with the copy=False flag).

So these DO need to be treated separately (from a regular int64).
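
The same comparison can be made explicit with `np.shares_memory`. This is a sketch under the assumption that a plain NumPy array behaves like `x.values` in the session above. The `astype` result is printed rather than asserted, since its copying behaviour could differ between NumPy versions.

```python
import numpy as np

values = np.array(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='M8[ns]')

# view() reinterprets the existing buffer as int64.
as_view = values.view('i8')
shares_view = np.shares_memory(values, as_view)

# astype() changes dtype, so copy=False does not guarantee shared memory.
as_cast = values.astype('i8', copy=False)
shares_cast = np.shares_memory(values, as_cast)

print(shares_view, shares_cast)
```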

chrisbyboston · 2015-02-13T19:16:20Z

I see. Good info.

Newest stuff now pushed.

shoyer · 2015-02-13T19:18:32Z

@jreback interesting -- didn't realize that about view vs astype for datetime64

jreback · 2015-02-13T19:39:59Z

@iwschris can you just run another vbench vs master (you can limit it with -r groupby|timeseries if you want), just to check? Thanks.

chrisbyboston · 2015-02-13T19:42:43Z

You bet. Should have it in a few minutes.

chrisbyboston · 2015-02-13T20:08:59Z

@jreback looks like objects weren't accounted for in that group of if elif statements. Making that change now...

jreback · 2015-02-13T20:17:58Z

@iwschris right, yeh that should be the else.

datetime64 columns were changing at the nano-second scale when applying a groupby aggregator.

chrisbyboston · 2015-02-13T20:55:24Z

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
groupby_sum_multiindex                       |   0.7521 |   3.2990 |   0.2280 |
groupby_first_float32                        |   2.3560 |   4.8750 |   0.4833 |
groupby_ngroups_100_max                      |   0.2607 |   0.4440 |   0.5871 |
groupby_multi_index                          | 586.1234 | 994.6311 |   0.5893 |
groupby_ngroups_100_count                    |   0.2876 |   0.4670 |   0.6159 |
timeseries_day_apply                         |   0.0319 |   0.0513 |   0.6223 |
groupby_ngroups_100_last                     |   0.2426 |   0.3810 |   0.6368 |
groupby_ngroups_10000_size                   |   3.7566 |   5.5623 |   0.6754 |
timeseries_custom_bday_cal_decr              |   0.0196 |   0.0290 |   0.6767 |
timeseries_custom_bmonthbegin_decr_n         |   0.1824 |   0.2547 |   0.7161 |
groupby_agg_builtins1                        |   6.5397 |   9.0700 |   0.7210 |
groupby_ngroups_100_first                    |   0.2743 |   0.3740 |   0.7335 |
timeseries_day_incr                          |   0.0233 |   0.0313 |   0.7437 |
groupby_ngroups_10000_last                   |   1.9563 |   2.5903 |   0.7552 |
groupby_int_count                            |   2.6743 |   3.5147 |   0.7609 |
groupby_agg_builtins2                        |  32.5734 |  42.7121 |   0.7626 |
groupby_series_nth_none                      |   0.9303 |   1.2077 |   0.7703 |
timeseries_custom_bday_cal_incr_n            |   0.0164 |   0.0211 |   0.7774 |
groupby_transform_series                     |  15.4217 |  19.6993 |   0.7829 |
groupby_last_float32                         |   2.4937 |   3.1406 |   0.7940 |
timeseries_custom_bday_incr                  |   0.0120 |   0.0150 |   0.7989 |
groupby_ngroups_100_skew                     |   9.2690 |  11.5190 |   0.8047 |
groupby_ngroups_10000_first                  |   1.5870 |   1.9684 |   0.8062 |
groupby_ngroups_10000_unique                 | 528.3200 | 650.3677 |   0.8123 |
groupby_ngroups_100_sem                      |   0.5520 |   0.6727 |   0.8207 |
groupby_ngroups_100_min                      |   0.3023 |   0.3684 |   0.8207 |
groupby_frame_apply_overhead                 |   6.6214 |   8.0397 |   0.8236 |
groupby_ngroups_10000_var                    |   1.7893 |   2.1513 |   0.8317 |
groupby_ngroups_100_sum                      |   0.3630 |   0.4364 |   0.8319 |
groupby_transform_ufunc                      |  90.9894 | 108.7949 |   0.8363 |
groupby_transform_multi_key4                 |  96.6847 | 115.4637 |   0.8374 |
groupby_dt_size                              |  19.4210 |  23.1177 |   0.8401 |
groupby_ngroups_100_head                     |   0.5563 |   0.6620 |   0.8403 |
timeseries_custom_bday_decr                  |   0.0197 |   0.0234 |   0.8435 |
groupby_transform_series2                    | 110.8003 | 130.9373 |   0.8462 |
groupby_ngroups_10000_max                    |   1.5440 |   1.8240 |   0.8465 |
groupby_transform_multi_key3                 | 492.3480 | 579.4737 |   0.8496 |
groupby_simple_compress_timing               |  22.3200 |  26.1633 |   0.8531 |
timeseries_to_datetime_iso8601               |   3.0967 |   3.6070 |   0.8585 |
groupby_int64_overflow                       | 251.3187 | 291.4154 |   0.8624 |
groupby_ngroups_10000_rank                   | 972.6830 | 1118.1310 |   0.8699 |
groupby_transform_multi_key2                 |  29.4696 |  33.8450 |   0.8707 |
groupby_ngroups_10000_all                    | 712.3833 | 810.1400 |   0.8793 |
groupby_transform_multi_key1                 |  43.5643 |  49.4163 |   0.8816 |
timeseries_1min_5min_ohlc                    |   0.7386 |   0.8353 |   0.8842 |
groupby_frame_singlekey_integer              |   1.3994 |   1.5797 |   0.8858 |
groupby_multi_different_numpy_functions      |   7.7173 |   8.6620 |   0.8909 |
timeseries_asof_nan                          |   5.7263 |   6.4077 |   0.8937 |
timeseries_custom_bmonthend_decr_n           |   0.2110 |   0.2357 |   0.8951 |
groupby_ngroups_10000_diff                   | 891.7777 | 987.4310 |   0.9031 |
groupby_multi_count                          |   5.0724 |   5.6121 |   0.9038 |
timeseries_custom_bmonthbegin_incr_n         |   0.1677 |   0.1847 |   0.9079 |
groupby_nth_float64_none                     |  69.3804 |  76.1610 |   0.9110 |
groupby_multi_different_functions            |   8.1933 |   8.9907 |   0.9113 |
groupby_indices                              |   4.6080 |   5.0390 |   0.9145 |
timeseries_asof                              |   5.8460 |   6.3770 |   0.9167 |
groupby_ngroups_10000_nunique                | 673.3130 | 732.7573 |   0.9189 |
timeseries_custom_bday_cal_incr              |   0.0207 |   0.0223 |   0.9253 |
groupby_ngroups_100_all                      |   7.4526 |   8.0520 |   0.9256 |
groupby_ngroups_100_pct_change               |  28.9074 |  30.9927 |   0.9327 |
groupby_apply_dict_return                    |  26.3223 |  28.0710 |   0.9377 |
groupby_nth_object_any                       | 907.5707 | 960.7120 |   0.9447 |
groupby_ngroups_100_size                     |   0.4393 |   0.4650 |   0.9448 |
groupby_series_nth_any                       |   2.8097 |   2.9716 |   0.9455 |
groupby_multi_size                           |  18.4967 |  19.5157 |   0.9478 |
groupby_ngroups_10000_sem                    |   2.3330 |   2.4590 |   0.9488 |
timeseries_year_incr                         |   0.0133 |   0.0140 |   0.9489 |
groupby_ngroups_10000_mad                    | 3652.3110 | 3848.0051 |   0.9491 |
groupby_first_datetimes                      |   7.6036 |   8.0037 |   0.9500 |
groupby_ngroups_100_mean                     |   0.2724 |   0.2867 |   0.9501 |
groupby_ngroups_100_any                      |   7.2740 |   7.6460 |   0.9513 |
groupby_nth_datetimes_any                    | 870.5579 | 914.8546 |   0.9516 |
timeseries_sort_index                        |   7.9620 |   8.3427 |   0.9544 |
timeseries_custom_bday_apply                 |   0.0134 |   0.0140 |   0.9545 |
groupby_multi_series_op                      |   9.9360 |  10.3896 |   0.9563 |
groupby_ngroups_100_rank                     |  11.3663 |  11.8790 |   0.9568 |
groupby_nth_object_none                      | 514.6056 | 536.9283 |   0.9584 |
groupby_last_datetimes                       |   9.1613 |   9.5367 |   0.9606 |
groupby_pivot_table                          |  11.9127 |  12.3960 |   0.9610 |
timeseries_period_downsample_mean            |   8.3439 |   8.6637 |   0.9631 |
groupby_ngroups_100_cummin                   |  11.8447 |  12.2477 |   0.9671 |
groupby_last_object                          |  13.5950 |  14.0096 |   0.9704 |
groupby_ngroups_100_unique                   |   5.3563 |   5.5177 |   0.9707 |
groupby_ngroups_10000_mean                   |   1.6189 |   1.6673 |   0.9710 |
groupby_first_object                         |  13.5777 |  13.9623 |   0.9725 |
groupby_frame_median                         |   4.6940 |   4.8234 |   0.9732 |
timeseries_with_format_replace               | 849.3273 | 870.5983 |   0.9756 |
timeseries_with_format_no_exact              | 642.9497 | 657.4820 |   0.9779 |
timeseries_custom_bmonthend_incr             |   0.1337 |   0.1367 |   0.9779 |
groupby_ngroups_10000_any                    | 680.4217 | 694.6084 |   0.9796 |
groupby_ngroups_100_median                   |   0.2913 |   0.2970 |   0.9810 |
groupby_ngroups_100_mad                      |  37.1737 |  37.8356 |   0.9825 |
timeseries_to_datetime_YYYYMMDD              |  10.6564 |  10.8293 |   0.9840 |
timeseries_iter_periodindex_preexit          |   9.1586 |   9.2920 |   0.9856 |
groupby_multi_cython                         |  11.9693 |  12.1080 |   0.9885 |
groupby_ngroups_100_cummax                   |  11.3430 |  11.4384 |   0.9917 |
groupby_sum_booleans                         |   0.9013 |   0.9084 |   0.9922 |
groupby_ngroups_10000_describe               | 13609.3757 | 13623.5873 |   0.9990 |
timeseries_large_lookup_value                |   0.0137 |   0.0137 |   1.0000 |
groupby_ngroups_10000_cumprod                | 1134.5750 | 1133.7350 |   1.0007 |
groupby_ngroups_10000_cummax                 | 1093.6377 | 1090.4837 |   1.0029 |
groupby_nth_float32_none                     |  71.3633 |  71.0537 |   1.0044 |
groupby_ngroups_100_describe                 | 134.6430 | 134.0171 |   1.0047 |
groupby_ngroups_10000_sum                    |   1.7863 |   1.7767 |   1.0054 |
groupby_ngroups_10000_pct_change             | 3164.2790 | 3113.4090 |   1.0163 |
groupby_ngroups_10000_cummin                 | 1061.4703 | 1044.0420 |   1.0167 |
groupby_ngroups_10000_count                  |   1.7674 |   1.7337 |   1.0194 |
groupby_ngroups_10000_value_counts           | 4022.8783 | 3946.0400 |   1.0195 |
groupby_ngroups_10000_median                 |   2.3727 |   2.3223 |   1.0217 |
groupby_ngroups_100_prod                     |   0.4040 |   0.3926 |   1.0291 |
groupby_ngroups_10000_cumsum                 | 1125.8500 | 1093.0517 |   1.0300 |
groupby_nth_datetimes_none                   | 482.6593 | 468.3119 |   1.0306 |
timeseries_timestamp_downsample_mean         |   3.8143 |   3.6987 |   1.0313 |
timeseries_iter_datetimeindex                | 562.7003 | 545.4220 |   1.0317 |
timeseries_infer_freq                        |   7.6024 |   7.3264 |   1.0377 |
groupby_ngroups_10000_cumcount               |  66.6266 |  64.1980 |   1.0378 |
groupby_transform                            | 124.0466 | 119.5150 |   1.0379 |
timeseries_1min_5min_mean                    |   0.6224 |   0.5994 |   1.0383 |
groupby_frame_apply                          |  31.1930 |  29.9776 |   1.0405 |
groupby_series_simple_cython                 | 173.9823 | 166.2357 |   1.0466 |
groupby_first_float64                        |   2.7541 |   2.6210 |   1.0508 |
timeseries_custom_bday_apply_dt64            |   0.0147 |   0.0140 |   1.0511 |
groupby_ngroups_10000_tail                   |  67.1217 |  63.7630 |   1.0527 |
timeseries_iter_periodindex                  | 1014.3323 | 961.3767 |   1.0551 |
groupby_ngroups_100_value_counts             |  42.9380 |  40.6311 |   1.0568 |
groupby_frame_nth_any                        |   4.6680 |   4.3880 |   1.0638 |
groupby_ngroups_100_var                      |   0.2960 |   0.2766 |   1.0701 |
groupby_ngroups_100_cumcount                 |   0.6820 |   0.6333 |   1.0768 |
groupby_ngroups_100_cumsum                   |  11.8704 |  11.0226 |   1.0769 |
groupby_ngroups_100_nunique                  |   8.1023 |   7.5150 |   1.0782 |
groupby_ngroups_100_diff                     |  11.5736 |  10.7230 |   1.0793 |
groupby_ngroups_10000_head                   |  67.4507 |  62.1850 |   1.0847 |
groupby_multi_python                         |  78.9371 |  72.3193 |   1.0915 |
groupby_ngroups_100_std                      |   0.4003 |   0.3661 |   1.0936 |
timeseries_iter_datetimeindex_preexit        |  11.6250 |  10.5534 |   1.1015 |
timeseries_custom_bday_cal_incr_neg_n        |   0.0213 |   0.0193 |   1.1029 |
timeseries_add_irregular                     |  11.4637 |  10.3450 |   1.1081 |
groupby_ngroups_10000_skew                   | 1084.5517 | 978.5136 |   1.1084 |
groupby_ngroups_100_tail                     |   0.6386 |   0.5750 |   1.1107 |
groupby_ngroups_100_cumprod                  |  11.9090 |  10.6903 |   1.1140 |
groupby_frame_nth_none                       |   1.7730 |   1.5844 |   1.1191 |
groupby_dt_timegrouper_size                  |  17.4917 |  15.5451 |   1.1252 |
groupby_ngroups_10000_prod                   |   2.0370 |   1.7896 |   1.1382 |
groupby_ngroups_10000_std                    |   2.1477 |   1.8667 |   1.1505 |
frame_assign_timeseries_index                |   0.6467 |   0.5604 |   1.1540 |
groupby_frame_cython_many_columns            |   2.5174 |   2.1220 |   1.1863 |
timeseries_is_month_start                    |   3.0104 |   2.5110 |   1.1989 |
timeseries_custom_bmonthend_incr_n           |   0.2177 |   0.1777 |   1.2250 |
timeseries_asof_single                       |   0.0223 |   0.0180 |   1.2434 |
groupby_ngroups_10000_min                    |   2.2504 |   1.8086 |   1.2442 |
groupby_last_float64                         |   3.3420 |   2.6394 |   1.2662 |
timeseries_slice_minutely                    |   0.0567 |   0.0400 |   1.4175 |
timeseries_year_apply                        |   0.0286 |   0.0160 |   1.7910 |
timeseries_timestamp_tzinfo_cons             |   0.0143 |   0.0077 |   1.8557 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

chrisbyboston · 2015-02-13T20:56:06Z

@jreback - vbench output posted, and the modification for object is done.

Let me know if there is anything else.

shoyer · 2015-02-13T21:35:47Z

vbench looks pretty good to me -- some nice speedups for grouped aggregations!

jreback · 2015-02-13T21:39:43Z

the vbench differences are actually due to other PRs.

Here is what I get

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
groupby_ngroups_100_count                    |   0.3027 |   0.4513 |   0.6707 |
groupby_ngroups_100_min                      |   0.2793 |   0.4063 |   0.6875 |
groupby_ngroups_100_max                      |   0.2987 |   0.4203 |   0.7107 |
groupby_ngroups_100_last                     |   0.2950 |   0.4013 |   0.7350 |
groupby_ngroups_100_first                    |   0.2836 |   0.3827 |   0.7412 |
groupby_int_count                            |   2.7003 |   3.2799 |   0.8233 |
groupby_ngroups_10000_min                    |   1.8367 |   2.2127 |   0.8301 |
groupby_ngroups_10000_max                    |   1.9150 |   2.1427 |   0.8937 |
timeseries_infer_freq                        |   7.5614 |   8.4263 |   0.8973 |
groupby_dt_timegrouper_size                  |  15.5920 |  17.3516 |   0.8986 |
groupby_simple_compress_timing               |  24.6580 |  27.4227 |   0.8992 |
groupby_ngroups_100_cumsum                   |  11.6700 |  12.9297 |   0.9026 |
timeseries_to_datetime_YYYYMMDD              |   7.5997 |   8.3993 |   0.9048 |
timeseries_sort_index                        |   7.4020 |   8.1673 |   0.9063 |
groupby_ngroups_100_tail                     |   0.6397 |   0.7053 |   0.9069 |
groupby_nth_float32_none                     |  70.5680 |  77.5150 |   0.9104 |
timeseries_custom_bmonthend_incr_n           |   0.1960 |   0.2136 |   0.9174 |
groupby_sum_booleans                         |   0.9297 |   1.0130 |   0.9178 |
groupby_ngroups_100_diff                     |  10.6417 |  11.5767 |   0.9192 |
groupby_ngroups_100_value_counts             |  41.5277 |  45.1584 |   0.9196 |
groupby_ngroups_10000_std                    |   2.0053 |   2.1660 |   0.9258 |
timeseries_period_downsample_mean            |   9.2463 |   9.9670 |   0.9277 |
groupby_ngroups_100_size                     |   0.4164 |   0.4483 |   0.9287 |
timeseries_iter_datetimeindex_preexit        |  10.2930 |  11.0653 |   0.9302 |
timeseries_iter_datetimeindex                | 558.3330 | 600.1287 |   0.9304 |
groupby_transform_multi_key4                 | 102.0021 | 109.6350 |   0.9304 |
groupby_ngroups_10000_value_counts           | 4049.3347 | 4349.6927 |   0.9309 |
timeseries_custom_bday_apply_dt64            |   0.0140 |   0.0150 |   0.9312 |
groupby_ngroups_10000_sum                    |   1.9740 |   2.1197 |   0.9313 |
groupby_multi_index                          | 527.2981 | 565.1563 |   0.9330 |
groupby_ngroups_10000_cummax                 | 1053.8023 | 1128.4770 |   0.9338 |
groupby_agg_builtins2                        |  29.9434 |  32.0370 |   0.9346 |
groupby_ngroups_10000_prod                   |   1.9640 |   2.0967 |   0.9367 |
timeseries_custom_bday_cal_incr              |   0.0190 |   0.0203 |   0.9373 |
plot_timeseries_period                       | 100.2293 | 106.8187 |   0.9383 |
groupby_ngroups_10000_last                   |   2.0286 |   2.1590 |   0.9396 |
groupby_ngroups_10000_cumprod                | 1070.9914 | 1139.1160 |   0.9402 |
timeseries_1min_5min_mean                    |   0.6700 |   0.7126 |   0.9402 |
timeseries_to_datetime_iso8601               |   3.4893 |   3.7096 |   0.9406 |
groupby_ngroups_10000_skew                   | 975.7330 | 1034.3137 |   0.9434 |
groupby_ngroups_10000_describe               | 11928.2053 | 12641.3703 |   0.9436 |
timeseries_custom_bday_cal_incr_neg_n        |   0.0220 |   0.0233 |   0.9454 |
timeseries_custom_bmonthend_decr_n           |   0.2356 |   0.2490 |   0.9464 |
timeseries_1min_5min_ohlc                    |   0.7553 |   0.7976 |   0.9470 |
timeseries_iter_periodindex_preexit          |   9.7260 |  10.2700 |   0.9470 |
groupby_ngroups_100_mad                      |  32.6534 |  34.4483 |   0.9479 |
timeseries_custom_bday_cal_incr_n            |   0.0190 |   0.0200 |   0.9484 |
timeseries_day_apply                         |   0.0257 |   0.0270 |   0.9500 |
timeseries_is_month_start                    |   2.8730 |   3.0140 |   0.9532 |
groupby_transform                            | 119.9337 | 125.6570 |   0.9545 |
timeseries_custom_bday_incr                  |   0.0134 |   0.0140 |   0.9545 |
timeseries_custom_bmonthend_incr             |   0.1537 |   0.1610 |   0.9546 |
timeseries_custom_bmonthbegin_decr_n         |   0.2077 |   0.2174 |   0.9554 |
groupby_transform_ufunc                      |  93.5323 |  97.8587 |   0.9558 |
timeseries_custom_bmonthbegin_incr_n         |   0.1920 |   0.2007 |   0.9568 |
groupby_ngroups_10000_first                  |   1.9853 |   2.0707 |   0.9588 |
timeseries_iter_periodindex                  | 969.0204 | 1010.4167 |   0.9590 |
groupby_frame_apply                          |  30.6340 |  31.9337 |   0.9593 |
timeseries_year_incr                         |   0.0150 |   0.0157 |   0.9594 |
timeseries_with_format_no_exact              | 629.6630 | 655.2920 |   0.9609 |
groupby_frame_cython_many_columns            |   2.4337 |   2.5264 |   0.9633 |
groupby_ngroups_10000_cumsum                 | 1083.9800 | 1123.7434 |   0.9646 |
groupby_last_datetimes                       |  10.0000 |  10.3630 |   0.9650 |
groupby_multi_count                          |   6.0413 |   6.2490 |   0.9668 |
timeseries_custom_bday_decr                  |   0.0210 |   0.0217 |   0.9670 |
timeseries_custom_bday_apply                 |   0.0126 |   0.0130 |   0.9695 |
groupby_transform_multi_key3                 | 581.4103 | 598.2854 |   0.9718 |
groupby_nth_object_any                       | 897.0937 | 922.5266 |   0.9724 |
groupby_nth_object_none                      | 490.6300 | 504.4353 |   0.9726 |
timeseries_year_apply                        |   0.0146 |   0.0150 |   0.9735 |
groupby_multi_series_op                      |  10.7660 |  11.0053 |   0.9783 |
groupby_ngroups_100_cumcount                 |   0.6344 |   0.6467 |   0.9810 |
groupby_ngroups_10000_all                    | 721.7060 | 735.6390 |   0.9811 |
groupby_ngroups_100_std                      |   0.3716 |   0.3787 |   0.9813 |
timeseries_with_format_replace               | 851.9327 | 867.9880 |   0.9815 |
groupby_ngroups_10000_mad                    | 3273.8944 | 3332.5966 |   0.9824 |
groupby_ngroups_10000_cumcount               |  64.3473 |  65.4813 |   0.9827 |
groupby_ngroups_100_describe                 | 118.8650 | 120.9197 |   0.9830 |
groupby_multi_python                         |  75.6930 |  76.8147 |   0.9854 |
groupby_transform_multi_key1                 |  51.7964 |  52.5324 |   0.9860 |
timeseries_slice_minutely                    |   0.0454 |   0.0460 |   0.9862 |
groupby_ngroups_100_cumprod                  |  11.7360 |  11.8947 |   0.9867 |
frame_assign_timeseries_index                |   0.6424 |   0.6500 |   0.9883 |
groupby_sum_multiindex                       |   0.9010 |   0.9106 |   0.9894 |
groupby_ngroups_10000_unique                 | 526.0944 | 531.6893 |   0.9895 |
groupby_ngroups_10000_median                 |   2.4913 |   2.5120 |   0.9918 |
groupby_nth_float64_none                     |  73.2677 |  73.7294 |   0.9937 |
timeseries_large_lookup_value                |   0.0149 |   0.0150 |   0.9947 |
groupby_multi_different_functions            |   9.1166 |   9.1650 |   0.9947 |
groupby_ngroups_100_prod                     |   0.4179 |   0.4200 |   0.9951 |
groupby_ngroups_100_all                      |   8.0884 |   8.1267 |   0.9953 |
groupby_ngroups_10000_pct_change             | 3417.7426 | 3432.1783 |   0.9958 |
groupby_ngroups_100_nunique                  |   8.5557 |   8.5843 |   0.9967 |
groupby_ngroups_10000_size                   |   3.6917 |   3.7020 |   0.9972 |
groupby_ngroups_10000_tail                   |  65.5094 |  65.5634 |   0.9992 |
groupby_multi_cython                         |  11.8380 |  11.8477 |   0.9992 |
groupby_ngroups_100_sum                      |   0.4176 |   0.4177 |   0.9998 |
timeseries_asof_single                       |   0.0207 |   0.0207 |   1.0000 |
timeseries_custom_bday_cal_decr              |   0.0223 |   0.0223 |   1.0000 |
groupby_int64_overflow                       | 286.5373 | 286.4160 |   1.0004 |
groupby_ngroups_100_cummax                   |  12.5953 |  12.5740 |   1.0017 |
timeseries_day_incr                          |   0.0270 |   0.0269 |   1.0029 |
groupby_ngroups_10000_any                    | 714.5850 | 711.8286 |   1.0039 |
groupby_series_nth_any                       |   3.5603 |   3.5450 |   1.0043 |
groupby_frame_apply_overhead                 |   6.8093 |   6.7707 |   1.0057 |
groupby_nth_datetimes_any                    | 918.3880 | 912.4471 |   1.0065 |
groupby_series_nth_none                      |   1.1896 |   1.1800 |   1.0081 |
groupby_ngroups_100_cummin                   |  12.1826 |  12.0743 |   1.0090 |
groupby_frame_median                         |   5.9024 |   5.8440 |   1.0100 |
groupby_frame_nth_any                        |   5.2697 |   5.2167 |   1.0102 |
groupby_ngroups_100_head                     |   0.6737 |   0.6667 |   1.0105 |
groupby_last_object                          |  14.8334 |  14.6753 |   1.0108 |
groupby_ngroups_100_var                      |   0.3307 |   0.3270 |   1.0114 |
groupby_ngroups_10000_var                    |   2.1030 |   2.0790 |   1.0115 |
groupby_ngroups_10000_nunique                | 771.5917 | 761.1760 |   1.0137 |
groupby_ngroups_10000_diff                   | 1028.4077 | 1010.5730 |   1.0176 |
groupby_frame_nth_none                       |   1.9077 |   1.8697 |   1.0203 |
groupby_pivot_table                          |  14.1603 |  13.8530 |   1.0222 |
groupby_last_float32                         |   3.0050 |   2.9330 |   1.0245 |
groupby_transform_series2                    | 112.3273 | 109.4310 |   1.0265 |
groupby_ngroups_100_rank                     |  13.1743 |  12.8287 |   1.0269 |
groupby_apply_dict_return                    |  29.3473 |  28.5690 |   1.0272 |
groupby_ngroups_10000_cummin                 | 1117.0100 | 1080.1910 |   1.0341 |
groupby_first_object                         |  14.8550 |  14.3493 |   1.0352 |
groupby_first_datetimes                      |   9.3730 |   9.0377 |   1.0371 |
groupby_ngroups_100_pct_change               |  37.1253 |  35.6517 |   1.0413 |
groupby_ngroups_10000_head                   |  67.1093 |  64.4406 |   1.0414 |
groupby_ngroups_10000_rank                   | 1145.4223 | 1098.6357 |   1.0426 |
groupby_ngroups_100_skew                     |  11.2367 |  10.7393 |   1.0463 |
groupby_transform_multi_key2                 |  36.3350 |  34.6060 |   1.0500 |
groupby_series_simple_cython                 | 183.0710 | 174.3507 |   1.0500 |
groupby_nth_datetimes_none                   | 443.9003 | 422.1927 |   1.0514 |
groupby_multi_size                           |  19.4400 |  18.4393 |   1.0543 |
timeseries_asof_nan                          |   2.4397 |   2.3093 |   1.0564 |
groupby_transform_series                     |  17.8304 |  16.7394 |   1.0652 |
groupby_multi_different_numpy_functions      |   9.3890 |   8.8094 |   1.0658 |
groupby_ngroups_100_median                   |   0.3647 |   0.3417 |   1.0675 |
groupby_ngroups_100_any                      |   8.1367 |   7.6074 |   1.0696 |
groupby_ngroups_100_unique                   |   6.4450 |   5.9753 |   1.0786 |
groupby_first_float64                        |   2.7567 |   2.5383 |   1.0860 |
groupby_dt_size                              |  22.9206 |  21.0621 |   1.0882 |
groupby_ngroups_10000_mean                   |   2.0073 |   1.8340 |   1.0945 |
groupby_ngroups_100_mean                     |   0.3433 |   0.3096 |   1.1088 |
groupby_last_float64                         |   3.6613 |   2.9603 |   1.2368 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

Target [5f6cbf8] : BUG: Fixes GH9311 groupby on datetime64
datetime64 columns were changing at the nano-second scale when
applying a groupby aggregator.
Base   [f2882b8] : Merge pull request #9479 from jreback/align
BUG: bug in partial setting of with a DatetimeIndex (GH9478)
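As a quick sanity check on how to read the ratio column, the sketch below recomputes head/base for three rows copied from the table above. The helper name `parse_row` is made up for illustration, and it is an assumption that the reported ratios were computed from unrounded timings, so the recomputed values may differ in the last digit.

```python
# Sketch: recompute the ratio column (head[ms] / base[ms]) for a few rows.
# The rows are copied verbatim from the vbench table above.
rows = """
groupby_ngroups_100_count                    |   0.3027 |   0.4513 |   0.6707 |
groupby_ngroups_10000_min                    |   1.8367 |   2.2127 |   0.8301 |
groupby_last_float64                         |   3.6613 |   2.9603 |   1.2368 |
"""


def parse_row(line):
    """Split one table row into (name, head_ms, base_ms, reported_ratio)."""
    name, head_ms, base_ms, ratio = [cell.strip() for cell in line.split("|")[:4]]
    return name, float(head_ms), float(base_ms), float(ratio)


results = [parse_row(line) for line in rows.strip().splitlines()]

for name, head_ms, base_ms, reported in results:
    recomputed = head_ms / base_ms
    # A ratio below 1.0 means the target (head) commit ran faster than base.
    verdict = "faster" if recomputed < 1.0 else "slower"
    print(f"{name}: {recomputed:.4f} (reported {reported:.4f}) -> head is {verdict}")
```

Ratios within a few percent of 1.0 are hard to distinguish from run-to-run noise, which is why only the first handful of rows stand out here.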

@iwschris I think you were benching against an older version.

in any event it looks fine.

ping when green

chrisbyboston · 2015-02-13T21:56:09Z

Good to know for future PRs. It just takes a fetch and a rebase, right?

jreback · 2015-02-13T21:59:04Z

yep, then I run with

./test_perf.sh -b master -t HEAD -r 'groupby|timeseries'

if you originally checked out with git checkout -b your_branch --track master
then a fetch/rebase will work

(or you can set the tracking branch, e.g. git branch yourbranch --set-upstream-to origin/master (or local master), however you prefer)
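For readers unfamiliar with the `-r` filter, here is a minimal sketch of what the pattern `'groupby|timeseries'` would select. It assumes `-r` is applied as an unanchored regular-expression search over benchmark names; that is an inference from names such as `frame_assign_timeseries_index` appearing in the output above, not something confirmed from the script itself. The name `frame_reindex_axis0` is a hypothetical non-matching benchmark added for contrast.

```python
import re

# Assumption: the -r argument is used as an unanchored regex over benchmark names.
pattern = re.compile(r"groupby|timeseries")

names = [
    "groupby_ngroups_100_min",        # from the table above
    "timeseries_infer_freq",          # from the table above
    "frame_assign_timeseries_index",  # from the table above; matches mid-name
    "plot_timeseries_period",         # from the table above; matches mid-name
    "frame_reindex_axis0",            # hypothetical name, should not match
]

# re.search matches anywhere in the string, unlike re.match.
selected = [name for name in names if pattern.search(name)]
print(selected)
```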

chrisbyboston · 2015-02-13T22:30:41Z

Yep, our vbench results match now.

chrisbyboston · 2015-02-13T23:02:17Z

@jreback it's green.

BUG: Fixes GH9311 groupby on datetime64

jreback · 2015-02-14T03:11:27Z

thanks @iwschris !

working with cython and generated code is a bit non-trivial. thanks for all of the patience and effort!

feel free to look at other issues! (hint hint... #4095), might be interesting

chrisbyboston · 2015-02-14T03:16:18Z

Thanks for working with me on it! I'll take a peek at #4095.

shoyer · 2015-02-14T04:48:51Z

Indeed, really nicely done

chrisbyboston mentioned this pull request Jan 23, 2015

BUG: first() changes datetime64 data #9311

Closed

shoyer reviewed Jan 25, 2015

jreback reviewed Jan 25, 2015

jreback added the Bug, Dtype Conversions (unexpected or buggy dtype conversions), and Groupby labels Jan 25, 2015

jreback added this to the 0.16.0 milestone Jan 25, 2015

jreback mentioned this pull request Jan 25, 2015

groupby first can return values not in group #9300

Closed

chrisbyboston force-pushed the groupby_nano_int branch from aff1443 to 2219ef7 Compare February 2, 2015 16:34

shoyer reviewed Feb 3, 2015

chrisbyboston force-pushed the groupby_nano_int branch from 2219ef7 to 0acdaab Compare February 5, 2015 18:36

chrisbyboston force-pushed the groupby_nano_int branch from d33dae7 to 01e6526 Compare February 13, 2015 19:02

chrisbyboston force-pushed the groupby_nano_int branch from 01e6526 to 9fff246 Compare February 13, 2015 19:15

chrisbyboston force-pushed the groupby_nano_int branch from 9fff246 to dd89607 Compare February 13, 2015 20:48

BUG: Fixes GH9311 groupby on datetime64

5f6cbf8

datetime64 columns were changing at the nano-second scale when applying a groupby aggregator.
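To illustrate how a datetime64 value can shift at the nanosecond scale, the sketch below round-trips a timestamp through float64. That a float64 cast was the mechanism behind this bug is an assumption; the thread only states that values changed. The underlying arithmetic is certain, though: nanosecond epoch values for 2015 are around 1.4e18, well above 2**53, so float64 cannot represent every integer in that range.

```python
import numpy as np

# A timestamp with non-zero nanoseconds, stored as datetime64[ns].
values = np.array(["2015-01-23T16:36:04.123456789"], dtype="datetime64[ns]")

# Reinterpret the same bytes as int64 nanoseconds since the epoch.
as_int = values.view("int64")

# Round trip through float64: the 53-bit mantissa cannot hold the value exactly.
via_float = as_int.astype("float64").astype("int64").view("datetime64[ns]")

# Staying in int64 the whole way preserves every nanosecond.
via_int = as_int.copy().view("datetime64[ns]")

print("original :", values[0])
print("via float:", via_float[0])
print("via int  :", via_int[0])
```

Keeping the data as int64 for order-based aggregations such as first, last, min, and max avoids this rounding entirely.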

chrisbyboston force-pushed the groupby_nano_int branch from dd89607 to 5f6cbf8 Compare February 13, 2015 20:53

jreback added a commit that referenced this pull request Feb 14, 2015

Merge pull request #9345 from iwschris/groupby_nano_int

3f24b87

BUG: Fixes GH9311 groupby on datetime64

jreback merged commit 3f24b87 into pandas-dev:master Feb 14, 2015

jreback added a commit that referenced this pull request Sep 3, 2015

DOC: document regression, xref #9345, in #10979 (already fixed in master)

2748e82

jreback mentioned this pull request Sep 3, 2015

REGR: if someone wants to find where this was fixed would be gr8 #10980

Closed

jreback mentioned this pull request Sep 27, 2015

mean of int64 results in int64 instead of float64 #11199

Closed

nickeubank pushed a commit to nickeubank/pandas that referenced this pull request Sep 29, 2015

DOC: document regression, xref pandas-dev#9345, in pandas-dev#10979 (already fixed in master)

e0f5af8
