Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This PR combines two major changes that improve cudf's final binary size. The optimizations are: - Explicitly telling fatbin to always compress device code - Not generating asserts with `__FILE__`, `__LINE__`, etc when in release mode. With this PR we take cudf when built for `ALL` archs from `1.3GB` to `329MB`. Since this has such a dramatic change, I did some analysis on library sizes, compile times, runtime startup times, and runtime performance. ### Library sizes | variant | release 1 arch | release all archs | | -- | -- | -- | | branch-0.19 | 335MB | 1.3GB | | compression | 133MB | 476MB | | compression + no_assert | 110MB | 329MB | | rdc + comp + no_assert | 150MB | Not done | We see that compression and the removal of the asserts while staying with whole compilation ( no `-rdc` ) is best for performance. RDC taking more space makes sense to me as it has a less agressive optimizer and therefore generates more sass/ptx. ### Compile times | variant | release 1 arch | release all archs | | -- | -- | -- | | branch-0.19 | 15m50.411s | 44m12.849s | | compression | 15m14.021s | 44m11.658s | | compression + no_assert | 12m53.049s | 33m59.388s | | rdc + comp + no_assert | 12m58.550s | Not done | Compile times are interesting as we see that compression has no negative effect. I expect that the reduced IO ( from smaller binaries ) offsets the time spent compressing. ## Runtime starup times for `all archs`: | variant | branch-0.19 | compression + no_assert | | -- | -- | -- | | ctest -j1 | 4m0.326s | 3m57.500s | | DISPATCHER_TEST * 12 | 0m3.441s | 0m3.520s | The goal here was to time the tests to see if we had any measurable differences. The `DISPATCHER_TEST * 12` was selected as the runtime `~0.3s` would hopefully highligh any performance differences, and therefore running it 12 times would magnify the difference and wash out any system load impacts. I think from these rather crude measurements we are safe to assume no massive runtime loading time costs. ## Runtime performance for `all archs`: I used the REDUCTION_BENCH as my baseline for detecting any performance changes. The benchmarks executed on a V100 on a shared lab machine. <details> <summary>REDUCTION_BENCH ./compare benchmarks branch_0.19 this_compression_branch</summary> ``` Comparing ./reduction_0.19 to ./reduction_compressed Benchmark Time CPU Time Old Time New CPU Old CPU New ----------------------------------------------------------------------------------------------------------------------------------------------------- Reduction/bool_all/10000/manual_time +0.1038 +0.0785 16621 18346 37366 40299 Reduction/bool_all/100000/manual_time +0.0969 +0.0779 17057 18710 37738 40676 Reduction/bool_all/1000000/manual_time +0.0709 +0.0694 18175 19463 38256 40913 Reduction/bool_all/10000000/manual_time +0.0472 +0.0479 35489 37165 53792 56368 Reduction/bool_all/100000000/manual_time +0.0026 +0.0120 145630 146014 164349 166325 Reduction/int8_t_all/10000/manual_time +0.1415 +0.1098 16286 18591 36729 40762 Reduction/int8_t_all/100000/manual_time +0.0457 +0.0500 17180 17966 37798 39688 Reduction/int8_t_all/1000000/manual_time +0.0656 +0.0565 18297 19498 38522 40700 Reduction/int8_t_all/10000000/manual_time +0.0322 +0.0345 36117 37279 54645 56528 Reduction/int8_t_all/100000000/manual_time +0.0041 +0.0132 146222 146819 165053 167232 Reduction/int32_t_all/10000/manual_time +0.0931 +0.0781 16849 18417 37471 40397 Reduction/int32_t_all/100000/manual_time +0.0841 +0.0775 17371 18831 37772 40701 Reduction/int32_t_all/1000000/manual_time +0.0660 +0.0709 22323 23797 41633 44584 Reduction/int32_t_all/10000000/manual_time +0.0171 +0.0324 68303 69470 87006 89822 Reduction/int32_t_all/100000000/manual_time +0.0028 +0.0065 472041 473370 490693 493869 Reduction/float_all/10000/manual_time +0.0847 +0.0734 16951 18387 37708 40474 Reduction/float_all/100000/manual_time +0.0962 +0.0880 16939 18569 37229 40504 Reduction/float_all/1000000/manual_time +0.1879 +0.1679 22348 26547 41549 48526 Reduction/float_all/10000000/manual_time +0.0465 +0.0652 68851 72054 87438 93139 Reduction/float_all/100000000/manual_time +0.0081 +0.0126 472210 476037 491030 497241 Reduction/bool_any/10000/manual_time +0.0764 +0.0743 16292 17537 36830 39568 Reduction/bool_any/100000/manual_time +0.0568 +0.0591 16633 17578 37197 39394 Reduction/bool_any/1000000/manual_time +0.0771 +0.0593 18024 19414 38439 40720 Reduction/bool_any/10000000/manual_time +0.0388 +0.0452 34990 36347 53210 55614 Reduction/bool_any/100000000/manual_time +0.0012 +0.0096 145729 145903 164726 166315 Reduction/int8_t_any/10000/manual_time +0.1207 +0.0954 15686 17579 36213 39669 Reduction/int8_t_any/100000/manual_time +0.0546 +0.0580 16597 17503 37103 39254 Reduction/int8_t_any/1000000/manual_time +0.0801 +0.0669 17938 19374 38027 40572 Reduction/int8_t_any/10000000/manual_time +0.0492 +0.0541 34726 36434 52812 55671 Reduction/int8_t_any/100000000/manual_time +0.0028 +0.0119 145924 146340 164711 166670 Reduction/int32_t_any/10000/manual_time +0.0489 +0.0539 16805 17626 37587 39614 Reduction/int32_t_any/100000/manual_time +0.0824 +0.0717 16985 18384 37432 40114 Reduction/int32_t_any/1000000/manual_time +0.0588 +0.0590 22338 23652 41842 44311 Reduction/int32_t_any/10000000/manual_time +0.0203 +0.0339 67672 69045 86353 89280 Reduction/int32_t_any/100000000/manual_time +0.0023 +0.0054 471708 472770 490461 493130 Reduction/float_any/10000/manual_time +0.0454 +0.0494 16730 17490 37526 39382 Reduction/float_any/100000/manual_time +0.0895 +0.0739 16872 18381 37349 40110 Reduction/float_any/1000000/manual_time +0.0554 +0.0527 22402 23643 42014 44228 Reduction/float_any/10000000/manual_time +0.0026 +0.0081 68877 69060 88443 89159 Reduction/float_any/100000000/manual_time +0.0022 +0.0049 471781 472801 490753 493170 ReductionDictionary/int32_t_all/10000/manual_time +0.0503 +0.0535 30648 32190 51412 54160 ReductionDictionary/int32_t_all/100000/manual_time +0.0696 +0.0642 31076 33240 51545 54855 ReductionDictionary/int32_t_all/1000000/manual_time +0.9359 +0.4786 38709 74935 58597 86644 ReductionDictionary/int32_t_all/10000000/manual_time +0.0289 +0.0625 95351 98111 113516 120611 ReductionDictionary/int32_t_all/100000000/manual_time -0.0963 -0.0910 645050 582915 663356 602971 ReductionDictionary/float_all/10000/manual_time +0.0476 +0.0504 30177 31615 50814 53374 ReductionDictionary/float_all/100000/manual_time -0.0305 -0.0166 34208 33166 55546 54626 ReductionDictionary/float_all/1000000/manual_time -0.0258 -0.0137 40372 39329 61080 60242 ReductionDictionary/float_all/10000000/manual_time -0.0366 -0.0210 95681 92175 113937 111545 ReductionDictionary/float_all/100000000/manual_time -0.0784 -0.0733 644233 593744 662793 614231 ReductionDictionary/int32_t_any/10000/manual_time +0.0718 +0.0680 29882 32026 50723 54172 ReductionDictionary/int32_t_any/100000/manual_time +0.0776 +0.0704 30316 32670 50640 54203 ReductionDictionary/int32_t_any/1000000/manual_time +0.0291 +0.0372 37311 38397 57039 59159 ReductionDictionary/int32_t_any/10000000/manual_time -0.0497 -0.0285 95928 91163 113924 110675 ReductionDictionary/int32_t_any/100000000/manual_time -0.1160 -0.1109 655803 579715 673920 599181 ReductionDictionary/float_any/10000/manual_time +0.0561 +0.0563 29905 31583 50566 53413 ReductionDictionary/float_any/100000/manual_time +0.0622 +0.0581 30431 32322 50788 53740 ReductionDictionary/float_any/1000000/manual_time +0.0403 +0.0473 36955 38444 56562 59237 ReductionDictionary/float_any/10000000/manual_time -0.0626 -0.0421 96949 90881 115099 110252 ReductionDictionary/float_any/100000000/manual_time -0.1203 -0.1154 661222 581658 679363 600965 ReductionDictionary/int32_t_min/10000/manual_time +0.0539 +0.0525 45796 48265 66802 70306 ReductionDictionary/int32_t_min/100000/manual_time +0.0955 +0.0868 46170 50579 66882 72687 ReductionDictionary/int32_t_min/1000000/manual_time +0.0683 +0.0647 47495 50738 67291 71646 ReductionDictionary/int32_t_min/10000000/manual_time +0.0065 +0.0160 97469 98105 115997 117850 ReductionDictionary/int32_t_min/100000000/manual_time +0.0045 +0.0070 501086 503357 519638 523253 ReductionDictionary/float_min/10000/manual_time +0.0721 +0.0653 45551 48834 66485 70829 ReductionDictionary/float_min/100000/manual_time +0.0966 +0.0865 44953 49295 65457 71120 ReductionDictionary/float_min/1000000/manual_time +0.0618 +0.0582 47227 50145 67004 70900 ReductionDictionary/float_min/10000000/manual_time +0.0052 +0.0167 97166 97669 115559 117490 ReductionDictionary/float_min/100000000/manual_time +0.0060 +0.0083 501044 504029 519674 523988 ReductionDictionary/int32_t_max/10000/manual_time +0.0596 +0.0558 44987 47667 65991 69670 ReductionDictionary/int32_t_max/100000/manual_time +0.0735 +0.0625 44640 47921 65423 69515 ReductionDictionary/int32_t_max/1000000/manual_time +0.0610 +0.0569 46224 49042 65957 69707 ReductionDictionary/int32_t_max/10000000/manual_time -0.0100 -0.0009 97078 96105 115787 115685 ReductionDictionary/int32_t_max/100000000/manual_time +0.0021 +0.0041 500466 501504 518959 521082 ReductionDictionary/float_max/10000/manual_time +0.0815 +0.0716 44141 47738 65134 69798 ReductionDictionary/float_max/100000/manual_time +0.0481 +0.0466 46378 48609 67284 70422 ReductionDictionary/float_max/1000000/manual_time +0.0091 +0.0180 48140 48579 67837 69055 ReductionDictionary/float_max/10000000/manual_time -0.0000 +0.0063 96823 96820 115498 116228 ReductionDictionary/float_max/100000000/manual_time +0.0044 +0.0061 501050 503259 519938 523126 ReductionDictionary/int32_t_mean/10000/manual_time -0.0259 -0.0018 45053 43885 66162 66041 ReductionDictionary/int32_t_mean/100000/manual_time +0.0448 +0.0484 44776 46782 65662 68843 ReductionDictionary/int32_t_mean/1000000/manual_time +0.0134 +0.0263 52069 52766 72163 74061 ReductionDictionary/int32_t_mean/10000000/manual_time -0.1592 -0.1321 135248 113712 153825 133504 ReductionDictionary/int32_t_mean/100000000/manual_time -0.2259 -0.2180 856793 663232 875288 684472 ReductionDictionary/float_mean/10000/manual_time +0.1263 +0.1185 44256 49847 65389 73139 ReductionDictionary/float_mean/100000/manual_time +0.1304 +0.1231 44038 49780 64743 72713 ReductionDictionary/float_mean/1000000/manual_time +0.0792 +0.0887 51678 55771 71756 78124 ReductionDictionary/float_mean/10000000/manual_time -0.1397 -0.1058 136928 117802 155376 138931 ReductionDictionary/float_mean/100000000/manual_time -0.2322 -0.2256 861014 661106 879508 681082 Reduction/bool_sum/10000/manual_time +0.1048 +0.0881 16256 17960 36958 40216 Reduction/bool_sum/100000/manual_time +0.0424 +0.0473 17106 17832 37986 39781 Reduction/bool_sum/1000000/manual_time +0.0837 +0.0620 17928 19428 38311 40688 Reduction/bool_sum/10000000/manual_time +0.0408 +0.0395 34927 36352 53394 55502 Reduction/bool_sum/100000000/manual_time +0.0030 +0.0102 145074 145516 164126 165798 Reduction/int8_t_sum/10000/manual_time +0.1105 +0.0832 15704 17438 36410 39440 Reduction/int8_t_sum/100000/manual_time +0.0694 +0.0629 16539 17688 37233 39575 Reduction/int8_t_sum/1000000/manual_time +0.0783 +0.0574 17880 19279 38316 40517 Reduction/int8_t_sum/10000000/manual_time +0.0416 +0.0406 34809 36256 53305 55467 Reduction/int8_t_sum/100000000/manual_time +0.0010 +0.0098 144214 144358 163105 164700 Reduction/int32_t_sum/10000/manual_time +0.0074 +0.0264 17459 17588 38551 39570 Reduction/int32_t_sum/100000/manual_time +0.0791 +0.0684 17086 18438 37705 40286 Reduction/int32_t_sum/1000000/manual_time -0.0064 -0.0027 23751 23599 44384 44263 Reduction/int32_t_sum/10000000/manual_time +0.0238 +0.0358 67606 69212 86499 89593 Reduction/int32_t_sum/100000000/manual_time +0.0002 -0.0043 472981 473093 495882 493743 Reduction/int64_t_sum/10000/manual_time +0.0340 +0.0383 17246 17833 38306 39772 Reduction/int64_t_sum/100000/manual_time +0.0755 +0.0719 17841 19188 38242 40990 Reduction/int64_t_sum/1000000/manual_time +0.0576 +0.0635 28702 30354 47263 50266 Reduction/int64_t_sum/10000000/manual_time +0.0119 +0.0141 112911 114253 132783 134658 Reduction/int64_t_sum/100000000/manual_time +0.0013 +0.0021 920797 921986 940549 942485 Reduction/float_sum/10000/manual_time +0.0882 +0.0707 16479 17932 37421 40067 Reduction/float_sum/100000/manual_time +0.0903 +0.0774 16677 18183 37163 40041 Reduction/float_sum/1000000/manual_time +0.0571 +0.0622 22393 23673 41960 44569 Reduction/float_sum/10000000/manual_time +0.0179 +0.0345 67968 69184 86691 89678 Reduction/float_sum/100000000/manual_time +0.0015 -0.0015 472247 472973 494236 493474 Reduction/double_sum/10000/manual_time +0.0982 +0.0794 16036 17611 36657 39568 Reduction/double_sum/100000/manual_time +0.0153 +0.0340 18453 18735 38955 40281 Reduction/double_sum/1000000/manual_time +0.0596 +0.0636 28623 30328 47153 50154 Reduction/double_sum/10000000/manual_time +0.0089 +0.0130 113426 114441 132746 134476 Reduction/double_sum/100000000/manual_time +0.0026 +0.0044 920131 922540 939374 943549 Reduction/int32_t_product/10000/manual_time +0.1183 +0.0902 16570 18529 37443 40819 Reduction/int32_t_product/100000/manual_time +0.1097 +0.0905 17076 18948 37741 41158 Reduction/int32_t_product/1000000/manual_time +0.0655 +0.0622 22023 23465 41604 44194 Reduction/int32_t_product/10000000/manual_time +0.0237 +0.0374 67629 69233 86354 89583 Reduction/int32_t_product/100000000/manual_time +0.0003 +0.0024 473110 473265 492461 493666 Reduction/float_product/10000/manual_time +0.0839 +0.0615 16972 18395 38034 40374 Reduction/float_product/100000/manual_time +0.0787 +0.0718 17091 18436 37602 40302 Reduction/float_product/1000000/manual_time +0.0686 +0.0714 22282 23810 41856 44844 Reduction/float_product/10000000/manual_time +0.0135 +0.0191 68313 69233 87798 89474 Reduction/float_product/100000000/manual_time +0.0016 -0.0034 472681 473450 495621 493925 Reduction/int64_t_min/10000/manual_time +0.0885 +0.0697 16886 18380 37763 40393 Reduction/int64_t_min/100000/manual_time +0.0490 +0.0534 17988 18869 38269 40311 Reduction/int64_t_min/1000000/manual_time +0.0087 +0.0218 30355 30618 49407 50485 Reduction/int64_t_min/10000000/manual_time +0.0092 +0.0179 113906 114949 132977 135352 Reduction/int64_t_min/100000000/manual_time +0.0011 +0.0020 921089 922086 940543 942460 Reduction/double_min/10000/manual_time +0.0818 +0.0662 16848 18226 37674 40169 Reduction/double_min/100000/manual_time +0.0839 +0.0761 17470 18936 37537 40394 Reduction/double_min/1000000/manual_time +0.0582 +0.0629 28967 30653 47503 50490 Reduction/double_min/10000000/manual_time +0.0135 +0.0246 113413 114940 132250 135504 Reduction/double_min/100000000/manual_time +0.0017 +0.0027 920812 922344 940237 942769 Reduction/timestamp_ms_min/10000/manual_time +0.0910 +0.0673 16926 18466 37673 40210 Reduction/timestamp_ms_min/100000/manual_time +0.0830 +0.0738 17894 19379 38112 40925 Reduction/timestamp_ms_min/1000000/manual_time +0.0339 +0.0429 30113 31133 48915 51012 Reduction/timestamp_ms_min/10000000/manual_time +0.0109 +0.0189 114185 115433 133132 135650 Reduction/timestamp_ms_min/100000000/manual_time +0.0053 +0.0076 920924 925845 940096 947215 Reduction/int8_t_mean/10000/manual_time +0.0489 +0.0517 29025 30445 50078 52670 Reduction/int8_t_mean/100000/manual_time +0.0401 +0.0493 29218 30389 50050 52515 Reduction/int8_t_mean/1000000/manual_time -0.0215 -0.0036 33316 32600 54162 53965 Reduction/int8_t_mean/10000000/manual_time +0.0387 +0.0407 46726 48536 64936 67581 Reduction/int8_t_mean/100000000/manual_time +0.0007 +0.0058 161956 162077 180866 181918 Reduction/float_mean/10000/manual_time +0.0916 +0.0827 28959 31611 49748 53860 Reduction/float_mean/100000/manual_time +0.0369 +0.0485 30720 31854 51489 53988 Reduction/float_mean/1000000/manual_time +0.0218 +0.0310 36650 37448 56777 58539 Reduction/float_mean/10000000/manual_time +0.0266 +0.0402 81361 83528 99648 103653 Reduction/float_mean/100000000/manual_time -0.0036 -0.0003 489644 487869 508151 507999 Reduction/int32_t_variance/10000/manual_time +0.0337 +0.0343 29718 30718 50967 52713 Reduction/int32_t_variance/100000/manual_time -0.0506 -0.0189 33774 32065 55094 54054 Reduction/int32_t_variance/1000000/manual_time +0.0364 +0.0316 36062 37374 56511 58297 Reduction/int32_t_variance/10000000/manual_time +0.0107 +0.0197 83950 84849 102507 104529 Reduction/int32_t_variance/100000000/manual_time -0.0003 +0.0018 491601 491463 510405 511323 Reduction/double_variance/10000/manual_time +0.0536 +0.0457 29157 30721 50266 52565 Reduction/double_variance/100000/manual_time +0.0367 +0.0415 30552 31674 51014 53130 Reduction/double_variance/1000000/manual_time +0.0238 +0.0334 43737 44780 62726 64821 Reduction/double_variance/10000000/manual_time +0.0046 +0.0084 129382 129980 148232 149476 Reduction/double_variance/100000000/manual_time +0.0027 +0.0042 934085 936595 952680 956708 Reduction/int64_t_std/10000/manual_time +0.0638 +0.0592 28956 30803 49847 52798 Reduction/int64_t_std/100000/manual_time -0.0742 -0.0356 34976 32380 56179 54180 Reduction/int64_t_std/1000000/manual_time -0.1048 -0.0836 49651 44446 70034 64177 Reduction/int64_t_std/10000000/manual_time +0.0077 +0.0118 129373 130373 148033 149784 Reduction/int64_t_std/100000000/manual_time +0.0025 +0.0037 934055 936408 952668 956168 Reduction/float_std/10000/manual_time +0.0754 +0.0699 28573 30728 49316 52764 Reduction/float_std/100000/manual_time +0.0945 +0.0846 28682 31394 49008 53155 Reduction/float_std/1000000/manual_time +0.0356 +0.0437 36195 37485 56074 58526 Reduction/float_std/10000000/manual_time +0.0175 +0.0292 83490 84951 101651 104623 Reduction/float_std/100000000/manual_time +0.0018 +0.0049 490773 491670 509017 511496 ReductionScan/int8_no_nulls/10000/manual_time +0.0523 +0.0511 15603 16420 36157 38005 ReductionScan/int8_no_nulls/100000/manual_time +0.0181 +0.0295 17358 17672 37950 39070 ReductionScan/int8_no_nulls/1000000/manual_time +0.0251 +0.0390 26876 27550 45671 47451 ReductionScan/int8_no_nulls/10000000/manual_time +0.0166 +0.0278 89319 90800 109601 112649 ReductionScan/int8_no_nulls/100000000/manual_time +0.0007 +0.0019 735210 735689 756198 757634 ReductionScan/int32_no_nulls/10000/manual_time +0.0515 +0.0559 15234 16019 35716 37712 ReductionScan/int32_no_nulls/100000/manual_time +0.0593 +0.0583 16808 17805 36980 39134 ReductionScan/int32_no_nulls/1000000/manual_time +0.0390 +0.0469 26405 27436 44665 46759 ReductionScan/int32_no_nulls/10000000/manual_time +0.0103 +0.0196 114948 116137 135580 138232 ReductionScan/int32_no_nulls/100000000/manual_time +0.0032 +0.0056 1016197 1019401 1037281 1043090 ReductionScan/uint64_no_nulls/10000/manual_time +0.0162 +0.0308 16222 16485 36939 38077 ReductionScan/uint64_no_nulls/100000/manual_time +0.0588 +0.0685 18751 19852 38174 40789 ReductionScan/uint64_no_nulls/1000000/manual_time +0.0444 +0.0604 36339 37953 56819 60253 ReductionScan/uint64_no_nulls/10000000/manual_time -0.0072 +0.0016 217943 216376 238710 239081 ReductionScan/uint64_no_nulls/100000000/manual_time -0.0009 -0.0007 2001645 1999848 2023397 2021947 ReductionScan/float_no_nulls/10000/manual_time +0.0298 +0.0399 15775 16246 36567 38028 ReductionScan/float_no_nulls/100000/manual_time +0.0534 +0.0505 16888 17789 37162 39037 ReductionScan/float_no_nulls/1000000/manual_time +0.0339 +0.0416 26655 27558 45083 46960 ReductionScan/float_no_nulls/10000000/manual_time +0.0071 +0.0155 115565 116383 136602 138716 ReductionScan/float_no_nulls/100000000/manual_time -0.0019 -0.0006 1019440 1017475 1040591 1040016 ReductionScan/int16_nulls/10000/manual_time +0.0510 +0.0442 35588 37402 57805 60363 ReductionScan/int16_nulls/100000/manual_time +0.0442 +0.0450 37789 39458 59383 62055 ReductionScan/int16_nulls/1000000/manual_time -0.0270 -0.0174 48836 47519 69731 68521 ReductionScan/int16_nulls/10000000/manual_time -0.0046 +0.0034 122849 122281 145028 145520 ReductionScan/int16_nulls/100000000/manual_time -0.0025 -0.0107 864450 862291 892772 883246 ReductionScan/uint32_nulls/10000/manual_time -0.0178 -0.0017 37685 37016 59988 59883 ReductionScan/uint32_nulls/100000/manual_time +0.0478 +0.0494 37349 39134 58805 61708 ReductionScan/uint32_nulls/1000000/manual_time +0.0085 +0.0130 49350 49771 69758 70663 ReductionScan/uint32_nulls/10000000/manual_time +0.0013 +0.0086 146704 146896 169127 170574 ReductionScan/uint32_nulls/100000000/manual_time +0.0085 +0.0111 1112967 1122427 1133374 1145988 ReductionScan/double_nulls/10000/manual_time +0.0623 +0.0611 35538 37753 57332 60833 ReductionScan/double_nulls/100000/manual_time +0.0602 +0.0635 38508 40827 59025 62775 ReductionScan/double_nulls/1000000/manual_time +0.0202 +0.0275 55522 56645 77436 79567 ReductionScan/double_nulls/10000000/manual_time +0.0084 +0.0118 243996 246039 265992 269142 ReductionScan/double_nulls/100000000/manual_time -0.0013 -0.0008 2120668 2117888 2140996 2139376 Reduction/bool_minmax/10000/manual_time -0.1377 -0.0650 22410 19325 43937 41081 Reduction/bool_minmax/100000/manual_time +0.0523 +0.0537 18947 19937 39665 41795 Reduction/bool_minmax/1000000/manual_time +0.0734 +0.0617 19146 20551 39413 41846 Reduction/bool_minmax/10000000/manual_time +0.0358 +0.0438 37836 39189 56031 58483 Reduction/bool_minmax/100000000/manual_time +0.0000 -0.0063 149637 149641 170798 169728 Reduction/int8_t_minmax/10000/manual_time +0.0361 +0.0438 18831 19512 39671 41408 Reduction/int8_t_minmax/100000/manual_time +0.0629 +0.0571 18429 19588 39093 41327 Reduction/int8_t_minmax/1000000/manual_time +0.0874 +0.0696 18956 20613 39054 41772 Reduction/int8_t_minmax/10000000/manual_time +0.0369 +0.0413 37781 39175 55998 58312 Reduction/int8_t_minmax/100000000/manual_time -0.0002 +0.0059 149816 149783 168898 169894 Reduction/int32_t_minmax/10000/manual_time +0.0813 +0.0613 18423 19921 39271 41676 Reduction/int32_t_minmax/100000/manual_time +0.0848 +0.0668 18560 20133 39161 41779 Reduction/int32_t_minmax/1000000/manual_time +0.0630 +0.0610 23265 24730 42732 45337 Reduction/int32_t_minmax/10000000/manual_time +0.0107 +0.0196 69505 70248 88260 89989 Reduction/int32_t_minmax/100000000/manual_time -0.0004 +0.0009 474594 474410 493944 494395 Reduction/timestamp_ms_minmax/10000/manual_time +0.0716 +0.0605 19241 20618 40084 42509 Reduction/timestamp_ms_minmax/100000/manual_time +0.0418 +0.0513 19638 20459 39825 41867 Reduction/timestamp_ms_minmax/1000000/manual_time +0.0559 +0.0635 30346 32042 48851 51954 Reduction/timestamp_ms_minmax/10000000/manual_time +0.0030 +0.0025 117085 117437 136688 137033 Reduction/timestamp_ms_minmax/100000000/manual_time +0.0010 -0.0037 921557 922453 945924 942449 Reduction/float_minmax/10000/manual_time +0.0650 +0.0521 18507 19710 39467 41526 Reduction/float_minmax/100000/manual_time +0.0740 +0.0619 18504 19874 39105 41525 Reduction/float_minmax/1000000/manual_time +0.0499 +0.0373 23385 24552 43561 45185 Reduction/float_minmax/10000000/manual_time +0.0163 +0.0314 69400 70529 88043 90807 Reduction/float_minmax/100000000/manual_time +0.0000 +0.0025 474346 474361 493215 494461 ``` </details> It looks like the compression has a minor cost? I would have thought the removal of the `release_assert` would have improved performance. But maybe `reduce` isn't using it? Authors: - Robert Maynard (@robertmaynard) Approvers: - Keith Kraus (@kkraus14) - Mark Harris (@harrism) - Jake Hemstad (@jrhemstad) URL: #7583
- Loading branch information