Commit
Compile times between 1.12 and 1.15 are comparable.
libcudf library size with 1.15 is < 1MB smaller compared to 1.12.

Runtime Performance results:

```
    Time      CPU  Time Old  Time New  CPU Old  CPU New  Benchmark
------------------------------------------------------------------------------------------------------------------------
 +0.0519  +0.0268        10        10       27       28  COMPILED_BINARYOP/BM_compiled_binaryop<double, double, double, cudf::binary_operator::MOD>/100000/manual_time
 +0.0851  +0.0295         9        10       28       29  COMPILED_BINARYOP/BM_compiled_binaryop<int32_t, int64_t, double, cudf::binary_operator::PMOD>/10000/manual_time
 -0.0971  -0.0218         6         5       25       24  COMPILED_BINARYOP/BM_compiled_binaryop<int, int, int, cudf::binary_operator::SHIFT_LEFT>/10000/manual_time
 -0.0717  -0.0196         7         6       26       25  COMPILED_BINARYOP/BM_compiled_binaryop<double, int8_t, bool, cudf::binary_operator::LOGICAL_AND>/10000/manual_time
 +0.0748  +0.0191         7         8       26       27  COMPILED_BINARYOP/BM_compiled_binaryop<decimal32, decimal32, decimal32, cudf::binary_operator::NULL_MAX>/10000/manual_time
 -0.0714  -0.0512     31498     29249    50665    48071  ReductionDictionary/float_max/10000/manual_time
 -0.1073  -0.0562     17598     15710    35995    33974  ReductionScan/int16_nulls/100000/manual_time
 -0.1211  -0.0844         0         0        0        0  Concatenate/BM_concatenate<int64_t, true>/4096/64/manual_time
 +0.0227  +0.0576        84        86       78       83  OrcWrite/floats_file_output/31/0/32/1/0/manual_time
 +0.0648  +0.0262         0         0        0        0  CopyIfElse/int16/4096/manual_time
 +0.0609  +0.0192         0         0        0        0  CopyIfElse/float64/262144/manual_time
 +0.1355  +0.0219         0         0        0        0  CopyIfElse/int16_no_nulls/4096/manual_time
 +0.1582  +0.0395         0         0        0        0  CopyIfElse/int16_no_nulls/32768/manual_time
 +0.1300  +0.0213         0         0        0        0  CopyIfElse/uint32_no_nulls/4096/manual_time
 +0.0500  +0.0155         0         0        0        0  CopyIfElse/uint32_no_nulls/32768/manual_time
 +0.0583  +0.0083         0         0        0        0  CopyIfElse/float64_no_nulls/4096/manual_time
 +0.1010  +0.0236      3291      3623    22245    22771  TypeDispatcher/fp64_bandwidth_host/1/1024/1/manual_time
 +0.1157  +0.0309      4888      5453    23798    24533  TypeDispatcher/fp64_bandwidth_host/2/1024/1/manual_time
 +0.0901  +0.0324      8053      8779    26780    27647  TypeDispatcher/fp64_bandwidth_host/4/1024/1/manual_time
 +0.1191  +0.0246      3335      3732    22378    22928  TypeDispatcher/fp64_bandwidth_host/1/2048/1/manual_time
 +0.1349  +0.0321      4890      5550    23839    24604  TypeDispatcher/fp64_bandwidth_host/2/2048/1/manual_time
 +0.0622  +0.0177      8820      9369    27717    28207  TypeDispatcher/fp64_bandwidth_host/4/2048/1/manual_time
 +0.0951  +0.0229      3387      3709    22375    22888  TypeDispatcher/fp64_bandwidth_host/1/4096/1/manual_time
 +0.0507  +0.0125      5509      5788    24491    24796  TypeDispatcher/fp64_bandwidth_host/2/4096/1/manual_time
 +0.0601  +0.0266     10775     11422    29432    30215  TypeDispatcher/fp64_bandwidth_device/4/1024/1/manual_time
 +0.0618  +0.0290      9258      9831    27967    28778  TypeDispatcher/fp64_bandwidth_device/1/2048/1/manual_time
 +0.0516  +0.0218      9773     10277    28453    29073  TypeDispatcher/fp64_bandwidth_device/2/2048/1/manual_time
 +0.0653  +0.0259      9265      9869    27956    28681  TypeDispatcher/fp64_bandwidth_device/1/4096/1/manual_time
 +0.0868  +0.0275      3441      3739    22367    22983  TypeDispatcher/fp64_bandwidth_no/1/1024/1/manual_time
 +0.1635  +0.0396      3690      4293    22609    23503  TypeDispatcher/fp64_bandwidth_no/2/1024/1/manual_time
 +0.1144  +0.0316      4523      5040    23445    24186  TypeDispatcher/fp64_bandwidth_no/4/1024/1/manual_time
 +0.1469  +0.0413      3497      4011    22382    23306  TypeDispatcher/fp64_bandwidth_no/1/2048/1/manual_time
 +0.0550  +0.0216      4097      4323    23012    23509  TypeDispatcher/fp64_bandwidth_no/2/2048/1/manual_time
 +0.0510  +0.0151      4418      4643    23447    23800  TypeDispatcher/fp64_bandwidth_no/2/4096/1/manual_time
 +0.1093  +0.0235      6787      7528    26000    26612  TypeDispatcher/fp64_bandwidth_no/8/4096/1/manual_time
 +0.0608  +0.0071      5435      5766    24657    24832  TypeDispatcher/fp64_bandwidth_no/4/8192/1/manual_time
 -0.0949  -0.0022      3800      3440    22750    22699  SetNullmask/SetNullMaskKernel/1048576/manual_time
 +0.0578  +0.0570         1         1        1        1  StringExtract/four/32768/32/manual_time
 -0.0973  -0.0969         5         4        5        4  StringExtract/eight/4096/32/manual_time
 -0.0882  -0.0879         5         5        5        5  StringExtract/eight/4096/128/manual_time
 -0.0780  -0.0777         5         5        5        5  StringExtract/eight/4096/512/manual_time
 -0.0703  -0.0701         5         5        6        5  StringExtract/eight/4096/2048/manual_time
 -0.0718  -0.0716         7         6        7        6  StringExtract/eight/4096/8192/manual_time
 +0.0576  +0.0577        37        39       37       39  UrlDecode/BM_url_decode<10>/10000000/100/manual_time
 +0.0599  +0.0599       140       148      140      148  UrlDecode/BM_url_decode<50>/100000000/10/manual_time
 -0.0813  -0.0745         0         0        0        0  Shift/shift_half_nullable_out/1048576/manual_time
 -0.2900  -0.2832         1         1        1        1  Sort<false>/unstable_no_nulls/1024/8/manual_time
 -0.3819  -0.3739         1         1        1        1  Sort<false>/unstable_no_nulls/4096/8/manual_time
 -0.2767  -0.2714         1         1        1        1  Sort<false>/unstable_no_nulls/32768/8/manual_time
 -0.1866  -0.1840         1         1        1        1  Sort<false>/unstable_no_nulls/262144/8/manual_time
 -0.2920  -0.2852         1         1        1        1  Sort<true>/stable_no_nulls/1024/8/manual_time
 -0.3818  -0.3737         1         1        1        1  Sort<true>/stable_no_nulls/4096/8/manual_time
 -0.2777  -0.2722         1         1        1        1  Sort<true>/stable_no_nulls/32768/8/manual_time
 -0.1886  -0.1859         1         1        1        1  Sort<true>/stable_no_nulls/262144/8/manual_time
 -0.2835  -0.2778         1         1        1        1  Sort<false>/unstable/1024/8/manual_time
 -0.3425  -0.3365         1         1        1        1  Sort<false>/unstable/4096/8/manual_time
 -0.2179  -0.2149         1         1        1        1  Sort<false>/unstable/32768/8/manual_time
 -0.0930  -0.0922         2         2        2        2  Sort<false>/unstable/262144/8/manual_time
 -0.2874  -0.2817         1         1        1        1  Sort<true>/stable/1024/8/manual_time
 -0.3490  -0.3429         1         1        1        1  Sort<true>/stable/4096/8/manual_time
 -0.2232  -0.2202         1         1        1        1  Sort<true>/stable/32768/8/manual_time
 -0.0964  -0.0955         2         2        2        2  Sort<true>/stable/262144/8/manual_time
 +0.0796  +0.0560         4         4        4        4  Search/Table/4/1000000/manual_time
 +0.0751  +0.0182      8503      9141    27316    27812  Gather/double_coalesce_x/8192/1/manual_time
 +0.0879  +0.0246      8641      9400    27247    27916  Gather/double_coalesce_x/16384/1/manual_time
 +0.0635  +0.0286     16209     17238    34700    35692  Gather/double_coalesce_x/8192/2/manual_time
 +0.0517  +0.0308     30234     31797    49033    50543  Gather/double_coalesce_x/1024/4/manual_time
 +0.0568  +0.0327     30768     32517    49409    51026  Gather/double_coalesce_x/4096/4/manual_time
 +0.0533  +0.0376     59436     62601    78176    81117  Gather/double_coalesce_x/2048/8/manual_time
 +0.0525  +0.0138      8921      9389    27447    27824  Gather/double_coalesce_o/16384/1/manual_time
 +0.0502  +0.0309     41663     43753    60740    62616  ReplaceClamp/float_no_nulls/10000/manual_time
 +0.0581  +0.0579         8         9        8        9  AST/BM_ast_transform<int32_t, TreeType::IMBALANCED_LEFT, true, true>/100000000/1/manual_time
```
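The Time and CPU columns in the results above look like relative differences, `(new - old) / old`, in the style of Google Benchmark's `compare.py` output. That is an assumption, since the commit message does not name the tool. A minimal Python sketch, using the rounded values printed in the `ReductionDictionary/float_max/10000` row, reproduces that row's first two columns. The helper name `relative_change` is hypothetical.

```python
def relative_change(old: float, new: float) -> float:
    """Fractional change from old to new; negative means the new run was faster."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old


# Values copied from the ReductionDictionary/float_max/10000/manual_time row.
time_change = relative_change(31498, 29249)  # Time Old -> Time New
cpu_change = relative_change(50665, 48071)   # CPU Old  -> CPU New

print(f"{time_change:+.4f} {cpu_change:+.4f}")  # -0.0714 -0.0512
```

Rows whose old and new values print as 0 or 1 cannot be checked this way, because the printed times are rounded to whole units while the ratios were presumably computed from unrounded measurements.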