[BugFix] Fix key ordering in pointwise ops #855
Merged
facebook-github-bot added the CLA Signed label on Jul 5, 2024

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 48.0400μs | 17.4676μs | 57.2487 KOps/s | 59.3692 KOps/s | |
test_plain_set_stack_nested | 44.9830μs | 17.6326μs | 56.7133 KOps/s | 59.0901 KOps/s | |
test_plain_set_nested_inplace | 52.9980μs | 19.7295μs | 50.6856 KOps/s | 51.5276 KOps/s | |
test_plain_set_stack_nested_inplace | 58.3480μs | 19.2812μs | 51.8640 KOps/s | 51.7272 KOps/s | |
test_items | 33.1020μs | 2.5346μs | 394.5341 KOps/s | 388.2307 KOps/s | |
test_items_nested | 0.8977ms | 0.2870ms | 3.4838 KOps/s | 3.6669 KOps/s | |
test_items_nested_locked | 0.4075ms | 0.2873ms | 3.4801 KOps/s | 3.6545 KOps/s | |
test_items_nested_leaf | 0.1500ms | 79.5179μs | 12.5758 KOps/s | 12.7497 KOps/s | |
test_items_stack_nested | 0.5658ms | 0.2888ms | 3.4628 KOps/s | 3.6509 KOps/s | |
test_items_stack_nested_leaf | 0.1464ms | 79.4202μs | 12.5913 KOps/s | 12.6915 KOps/s | |
test_items_stack_nested_locked | 0.7013ms | 0.2901ms | 3.4470 KOps/s | 3.5722 KOps/s | |
test_keys | 32.8320μs | 3.9897μs | 250.6479 KOps/s | 245.5290 KOps/s | |
test_keys_nested | 0.2269ms | 0.1372ms | 7.2860 KOps/s | 7.2812 KOps/s | |
test_keys_nested_locked | 0.7233ms | 0.1410ms | 7.0918 KOps/s | 6.8766 KOps/s | |
test_keys_nested_leaf | 0.2068ms | 0.1159ms | 8.6302 KOps/s | 8.4744 KOps/s | |
test_keys_stack_nested | 0.2325ms | 0.1371ms | 7.2936 KOps/s | 7.0867 KOps/s | |
test_keys_stack_nested_leaf | 0.2517ms | 0.1166ms | 8.5729 KOps/s | 8.5373 KOps/s | |
test_keys_stack_nested_locked | 0.2305ms | 0.1410ms | 7.0936 KOps/s | 6.9584 KOps/s | |
test_values | 5.1634μs | 1.1584μs | 863.2640 KOps/s | 866.1952 KOps/s | |
test_values_nested | 96.3490μs | 50.7123μs | 19.7191 KOps/s | 20.0204 KOps/s | |
test_values_nested_locked | 0.1062ms | 51.0613μs | 19.5843 KOps/s | 19.4873 KOps/s | |
test_values_nested_leaf | 90.6490μs | 46.1707μs | 21.6587 KOps/s | 22.1367 KOps/s | |
test_values_stack_nested | 0.1029ms | 51.2310μs | 19.5194 KOps/s | 19.6480 KOps/s | |
test_values_stack_nested_leaf | 96.8600μs | 45.8616μs | 21.8047 KOps/s | 21.9991 KOps/s | |
test_values_stack_nested_locked | 0.1022ms | 51.1773μs | 19.5399 KOps/s | 19.8016 KOps/s | |
test_membership | 33.4420μs | 1.3493μs | 741.1241 KOps/s | 742.3769 KOps/s | |
test_membership_nested | 39.9440μs | 3.4440μs | 290.3642 KOps/s | 292.1907 KOps/s | |
test_membership_nested_leaf | 32.2610μs | 3.4141μs | 292.9023 KOps/s | 286.1330 KOps/s | |
test_membership_stacked_nested | 32.0790μs | 3.3910μs | 294.8950 KOps/s | 292.6537 KOps/s | |
test_membership_stacked_nested_leaf | 19.3460μs | 3.4253μs | 291.9477 KOps/s | 284.5214 KOps/s | |
test_membership_nested_last | 23.7940μs | 4.2204μs | 236.9462 KOps/s | 237.5760 KOps/s | |
test_membership_nested_leaf_last | 21.7700μs | 4.2264μs | 236.6062 KOps/s | 238.7599 KOps/s | |
test_membership_stacked_nested_last | 45.0740μs | 4.1381μs | 241.6560 KOps/s | 239.7896 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.8100μs | 4.1600μs | 240.3844 KOps/s | 236.6099 KOps/s | |
test_nested_getleaf | 43.9410μs | 10.7662μs | 92.8836 KOps/s | 90.3187 KOps/s | |
test_nested_get | 43.4710μs | 10.1914μs | 98.1222 KOps/s | 97.0166 KOps/s | |
test_stacked_getleaf | 90.4380μs | 10.8017μs | 92.5777 KOps/s | 92.5464 KOps/s | |
test_stacked_get | 27.4110μs | 10.2003μs | 98.0364 KOps/s | 97.5946 KOps/s | |
test_nested_getitemleaf | 50.9850μs | 11.3136μs | 88.3891 KOps/s | 87.1703 KOps/s | |
test_nested_getitem | 47.3980μs | 10.4270μs | 95.9046 KOps/s | 96.5153 KOps/s | |
test_stacked_getitemleaf | 65.9920μs | 11.1999μs | 89.2864 KOps/s | 88.2859 KOps/s | |
test_stacked_getitem | 49.5820μs | 10.4183μs | 95.9848 KOps/s | 94.4609 KOps/s | |
test_lock_nested | 1.9713ms | 0.3321ms | 3.0114 KOps/s | 3.0383 KOps/s | |
test_lock_stack_nested | 0.3647ms | 0.3019ms | 3.3125 KOps/s | 3.3095 KOps/s | |
test_unlock_nested | 0.8446ms | 0.3336ms | 2.9976 KOps/s | 3.0015 KOps/s | |
test_unlock_stack_nested | 0.4362ms | 0.3106ms | 3.2197 KOps/s | 3.2256 KOps/s | |
test_flatten_speed | 0.3599ms | 97.8982μs | 10.2147 KOps/s | 10.3779 KOps/s | |
test_unflatten_speed | 0.5928ms | 0.4081ms | 2.4502 KOps/s | 2.4476 KOps/s | |
test_common_ops | 3.9255ms | 0.7355ms | 1.3595 KOps/s | 1.3751 KOps/s | |
test_creation | 15.6990μs | 1.8890μs | 529.3852 KOps/s | 530.2646 KOps/s | |
test_creation_empty | 40.6960μs | 11.1645μs | 89.5696 KOps/s | 91.8493 KOps/s | |
test_creation_nested_1 | 58.1180μs | 13.5524μs | 73.7878 KOps/s | 73.9783 KOps/s | |
test_creation_nested_2 | 41.0770μs | 16.8276μs | 59.4261 KOps/s | 59.1302 KOps/s | |
test_clone | 86.9710μs | 12.7595μs | 78.3729 KOps/s | 75.6635 KOps/s | |
test_getitem[int] | 86.6910μs | 10.9375μs | 91.4282 KOps/s | 90.4363 KOps/s | |
test_getitem[slice_int] | 79.5280μs | 22.0188μs | 45.4157 KOps/s | 43.5293 KOps/s | |
test_getitem[range] | 78.2050μs | 59.0516μs | 16.9343 KOps/s | 17.1898 KOps/s | |
test_getitem[tuple] | 50.4640μs | 18.4174μs | 54.2965 KOps/s | 53.4881 KOps/s | |
test_getitem[list] | 77.9650μs | 39.6869μs | 25.1972 KOps/s | 24.7631 KOps/s | |
test_setitem_dim[int] | 0.1581ms | 35.5473μs | 28.1315 KOps/s | 28.4799 KOps/s | |
test_setitem_dim[slice_int] | 89.6670μs | 60.8280μs | 16.4398 KOps/s | 16.0174 KOps/s | |
test_setitem_dim[range] | 0.1454ms | 82.3422μs | 12.1444 KOps/s | 12.0598 KOps/s | |
test_setitem_dim[tuple] | 91.3500μs | 48.4054μs | 20.6588 KOps/s | 19.6323 KOps/s | |
test_setitem | 51.9960μs | 19.3406μs | 51.7046 KOps/s | 48.1569 KOps/s | |
test_set | 73.2740μs | 18.8158μs | 53.1468 KOps/s | 50.7435 KOps/s | |
test_set_shared | 3.9772ms | 0.1423ms | 7.0269 KOps/s | 6.9668 KOps/s | |
test_update | 0.1298ms | 22.2611μs | 44.9215 KOps/s | 43.2631 KOps/s | |
test_update_nested | 74.4080μs | 31.0185μs | 32.2388 KOps/s | 30.9600 KOps/s | |
test_update__nested | 0.1183ms | 24.8029μs | 40.3179 KOps/s | 39.3909 KOps/s | |
test_set_nested | 67.1250μs | 21.0451μs | 47.5170 KOps/s | 45.2224 KOps/s | |
test_set_nested_new | 83.0850μs | 25.1771μs | 39.7186 KOps/s | 38.0336 KOps/s | |
test_select | 0.1014ms | 40.3256μs | 24.7982 KOps/s | 24.0472 KOps/s | |
test_select_nested | 0.1173ms | 57.2151μs | 17.4779 KOps/s | 17.5885 KOps/s | |
test_exclude_nested | 0.2675ms | 0.1176ms | 8.5069 KOps/s | 8.3296 KOps/s | |
test_empty[True] | 0.5769ms | 0.3937ms | 2.5397 KOps/s | 2.5363 KOps/s | |
test_empty[False] | 8.2874μs | 1.0231μs | 977.4286 KOps/s | 989.4440 KOps/s | |
test_unbind_speed | 0.4523ms | 0.2435ms | 4.1065 KOps/s | 4.0806 KOps/s | |
test_unbind_speed_stack0 | 0.4401ms | 0.2438ms | 4.1018 KOps/s | 4.1893 KOps/s | |
test_unbind_speed_stack1 | 67.6612ms | 0.7046ms | 1.4192 KOps/s | 1.4664 KOps/s | |
test_split | 72.7578ms | 1.5814ms | 632.3617 Ops/s | 636.4504 Ops/s | |
test_chunk | 65.0146ms | 1.5658ms | 638.6522 Ops/s | 633.8638 Ops/s | |
test_creation[device0] | 0.1627ms | 82.4405μs | 12.1300 KOps/s | 11.8951 KOps/s | |
test_creation_from_tensor | 4.0303ms | 84.8635μs | 11.7836 KOps/s | 11.4287 KOps/s | |
test_add_one[memmap_tensor0] | 41.2670μs | 5.7853μs | 172.8504 KOps/s | 182.0908 KOps/s | |
test_contiguous[memmap_tensor0] | 20.0270μs | 0.6532μs | 1.5310 MOps/s | 1.5558 MOps/s | |
test_stack[memmap_tensor0] | 22.8430μs | 3.6951μs | 270.6283 KOps/s | 264.5751 KOps/s | |
test_memmaptd_index | 1.0457ms | 0.2629ms | 3.8040 KOps/s | 3.8622 KOps/s | |
test_memmaptd_index_astensor | 0.6812ms | 0.3355ms | 2.9808 KOps/s | 2.9821 KOps/s | |
test_memmaptd_index_op | 0.9768ms | 0.6388ms | 1.5655 KOps/s | 1.5752 KOps/s | |
test_serialize_model | 0.1703s | 0.1051s | 9.5115 Ops/s | 10.3705 Ops/s | |
test_serialize_model_pickle | 0.4564s | 0.3785s | 2.6422 Ops/s | 2.6248 Ops/s | |
test_serialize_weights | 97.3648ms | 93.5610ms | 10.6882 Ops/s | 9.5075 Ops/s | |
test_serialize_weights_returnearly | 0.1829s | 0.1273s | 7.8555 Ops/s | 8.6030 Ops/s | |
test_serialize_weights_pickle | 0.7621s | 0.4896s | 2.0423 Ops/s | 1.5805 Ops/s | |
test_serialize_weights_filesystem | 99.1071ms | 92.2741ms | 10.8373 Ops/s | 10.3740 Ops/s | |
test_serialize_model_filesystem | 0.1545s | 98.2059ms | 10.1827 Ops/s | 10.7650 Ops/s | |
test_reshape_pytree | 73.3960μs | 25.4517μs | 39.2902 KOps/s | 39.1798 KOps/s | |
test_reshape_td | 79.5080μs | 33.1326μs | 30.1817 KOps/s | 29.7268 KOps/s | |
test_view_pytree | 71.2320μs | 25.3080μs | 39.5131 KOps/s | 39.2397 KOps/s | |
test_view_td | 83.7760μs | 37.6721μs | 26.5448 KOps/s | 25.5600 KOps/s | |
test_unbind_pytree | 90.9590μs | 29.4025μs | 34.0108 KOps/s | 34.0353 KOps/s | |
test_unbind_td | 0.4424ms | 36.2495μs | 27.5866 KOps/s | 27.5263 KOps/s | |
test_split_pytree | 80.7210μs | 28.9392μs | 34.5553 KOps/s | 33.8029 KOps/s | |
test_split_td | 0.1193ms | 38.1275μs | 26.2278 KOps/s | 25.0536 KOps/s | |
test_add_pytree | 0.1209ms | 35.0464μs | 28.5336 KOps/s | 27.6699 KOps/s | |
test_add_td | 0.1332ms | 57.7992μs | 17.3013 KOps/s | 17.4066 KOps/s | |
test_distributed | 0.1780ms | 0.1033ms | 9.6787 KOps/s | 9.6895 KOps/s | |
test_tdmodule | 45.4650μs | 18.5516μs | 53.9036 KOps/s | 56.8430 KOps/s | |
test_tdmodule_dispatch | 53.7400μs | 36.3330μs | 27.5232 KOps/s | 27.8732 KOps/s | |
test_tdseq | 42.0080μs | 21.2272μs | 47.1093 KOps/s | 49.3227 KOps/s | |
test_tdseq_dispatch | 77.7740μs | 41.6460μs | 24.0119 KOps/s | 24.8745 KOps/s | |
test_instantiation_functorch | 1.9451ms | 1.3217ms | 756.5899 Ops/s | 742.2690 Ops/s | |
test_instantiation_td | 1.5508ms | 1.0194ms | 980.9335 Ops/s | 967.4085 Ops/s | |
test_exec_functorch | 0.2979ms | 0.1750ms | 5.7151 KOps/s | 6.1332 KOps/s | |
test_exec_functional_call | 0.2985ms | 0.1536ms | 6.5119 KOps/s | 6.5214 KOps/s | |
test_exec_td | 0.2819ms | 0.1477ms | 6.7690 KOps/s | 6.7960 KOps/s | |
test_exec_td_decorator | 0.3603ms | 0.2237ms | 4.4701 KOps/s | 4.4692 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8252ms | 0.4962ms | 2.0154 KOps/s | 2.0567 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9009ms | 0.4912ms | 2.0357 KOps/s | 2.0670 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6609ms | 0.4034ms | 2.4789 KOps/s | 2.5394 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6138ms | 0.4011ms | 2.4930 KOps/s | 2.5261 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8938ms | 0.5672ms | 1.7632 KOps/s | 1.7913 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9672ms | 0.5669ms | 1.7640 KOps/s | 1.8062 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7698ms | 0.4636ms | 2.1570 KOps/s | 2.2038 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8328ms | 0.4636ms | 2.1571 KOps/s | 2.1972 KOps/s | |
test_to_module_speed[True] | 2.7421ms | 1.6719ms | 598.1370 Ops/s | 593.3370 Ops/s | |
test_to_module_speed[False] | 1.7697ms | 1.6353ms | 611.4960 Ops/s | 613.9396 Ops/s | |
test_tc_init | 0.1193ms | 57.8828μs | 17.2763 KOps/s | 16.4827 KOps/s | |
test_tc_init_nested | 0.1974ms | 0.1187ms | 8.4245 KOps/s | 8.4609 KOps/s | |
test_tc_first_layer_tensor | 28.3130μs | 8.2675μs | 120.9561 KOps/s | 114.8011 KOps/s | |
test_tc_first_layer_nontensor | 48.9110μs | 8.1887μs | 122.1193 KOps/s | 113.8747 KOps/s | |
test_tc_second_layer_tensor | 25.7980μs | 2.5271μs | 395.7068 KOps/s | 379.3959 KOps/s | |
test_tc_second_layer_nontensor | 50.3230μs | 9.2188μs | 108.4736 KOps/s | 101.4017 KOps/s | |
test_unbind | 84.9729ms | 14.5708ms | 68.6304 Ops/s | 71.1493 Ops/s | |
test_full_like | 8.8912ms | 7.1534ms | 139.7940 Ops/s | 146.4651 Ops/s | |
test_zeros_like | 11.5723ms | 5.3233ms | 187.8549 Ops/s | 170.8059 Ops/s | |
test_ones_like | 12.3629ms | 6.2157ms | 160.8827 Ops/s | 168.7361 Ops/s | |
test_clone | 15.3356ms | 8.0565ms | 124.1230 Ops/s | 132.4394 Ops/s | |
test_squeeze | 66.8540μs | 12.8595μs | 77.7637 KOps/s | 76.5473 KOps/s | |
test_unsqueeze | 0.2603ms | 96.7769μs | 10.3330 KOps/s | 9.8892 KOps/s | |
test_split | 0.6016ms | 0.2798ms | 3.5740 KOps/s | 3.5770 KOps/s | |
test_permute | 0.4804ms | 0.2268ms | 4.4085 KOps/s | 4.3523 KOps/s | |
test_stack | 23.8765ms | 21.6894ms | 46.1054 Ops/s | 46.8003 Ops/s | |
test_cat | 24.0961ms | 21.2849ms | 46.9816 Ops/s | 47.5332 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 26.1500μs | 13.9320μs | 71.7771 KOps/s | 80.1548 KOps/s | |
test_plain_set_stack_nested | 32.1810μs | 13.8584μs | 72.1582 KOps/s | 79.9748 KOps/s | |
test_plain_set_nested_inplace | 39.3300μs | 15.1630μs | 65.9499 KOps/s | 72.6469 KOps/s | |
test_plain_set_stack_nested_inplace | 31.8510μs | 15.1347μs | 66.0733 KOps/s | 72.6988 KOps/s | |
test_items | 28.5200μs | 4.6457μs | 215.2550 KOps/s | 212.9098 KOps/s | |
test_items_nested | 0.3786ms | 0.3407ms | 2.9350 KOps/s | 2.9311 KOps/s | |
test_items_nested_locked | 0.3591ms | 0.3381ms | 2.9581 KOps/s | 2.8855 KOps/s | |
test_items_nested_leaf | 99.1610μs | 82.6127μs | 12.1047 KOps/s | 12.1198 KOps/s | |
test_items_stack_nested | 0.3712ms | 0.3452ms | 2.8970 KOps/s | 2.9347 KOps/s | |
test_items_stack_nested_leaf | 0.1058ms | 83.2768μs | 12.0081 KOps/s | 11.7012 KOps/s | |
test_items_stack_nested_locked | 0.3755ms | 0.3436ms | 2.9107 KOps/s | 2.9287 KOps/s | |
test_keys | 28.2210μs | 4.3562μs | 229.5588 KOps/s | 228.1846 KOps/s | |
test_keys_nested | 93.6020μs | 68.6672μs | 14.5630 KOps/s | 14.4130 KOps/s | |
test_keys_nested_locked | 0.7580ms | 74.2838μs | 13.4619 KOps/s | 13.4048 KOps/s | |
test_keys_nested_leaf | 84.8210μs | 59.5444μs | 16.7942 KOps/s | 16.8691 KOps/s | |
test_keys_stack_nested | 98.7410μs | 68.1472μs | 14.6741 KOps/s | 14.3885 KOps/s | |
test_keys_stack_nested_leaf | 82.8410μs | 57.4708μs | 17.4002 KOps/s | 16.8073 KOps/s | |
test_keys_stack_nested_locked | 97.4320μs | 72.9814μs | 13.7021 KOps/s | 13.3041 KOps/s | |
test_values | 9.7870μs | 1.8054μs | 553.8880 KOps/s | 555.2515 KOps/s | |
test_values_nested | 65.5410μs | 35.1973μs | 28.4113 KOps/s | 28.3311 KOps/s | |
test_values_nested_locked | 66.8010μs | 37.1937μs | 26.8863 KOps/s | 26.8611 KOps/s | |
test_values_nested_leaf | 48.9510μs | 31.4843μs | 31.7619 KOps/s | 31.5795 KOps/s | |
test_values_stack_nested | 66.9310μs | 36.2381μs | 27.5953 KOps/s | 27.7524 KOps/s | |
test_values_stack_nested_leaf | 59.2810μs | 32.3150μs | 30.9453 KOps/s | 31.2352 KOps/s | |
test_values_stack_nested_locked | 61.1010μs | 38.1045μs | 26.2436 KOps/s | 26.6896 KOps/s | |
test_membership | 1.7140μs | 0.6932μs | 1.4425 MOps/s | 1.4463 MOps/s | |
test_membership_nested | 17.6800μs | 2.6391μs | 378.9142 KOps/s | 388.1110 KOps/s | |
test_membership_nested_leaf | 30.5610μs | 2.6160μs | 382.2583 KOps/s | 388.8326 KOps/s | |
test_membership_stacked_nested | 20.2100μs | 2.5756μs | 388.2560 KOps/s | 392.9615 KOps/s | |
test_membership_stacked_nested_leaf | 17.8100μs | 2.5794μs | 387.6842 KOps/s | 388.9431 KOps/s | |
test_membership_nested_last | 32.8610μs | 3.0640μs | 326.3743 KOps/s | 326.7200 KOps/s | |
test_membership_nested_leaf_last | 21.3110μs | 3.1207μs | 320.4453 KOps/s | 325.7590 KOps/s | |
test_membership_stacked_nested_last | 33.9400μs | 3.1235μs | 320.1540 KOps/s | 323.4086 KOps/s | |
test_membership_stacked_nested_leaf_last | 21.6310μs | 3.1310μs | 319.3901 KOps/s | 325.6722 KOps/s | |
test_nested_getleaf | 34.1800μs | 8.2974μs | 120.5197 KOps/s | 119.0074 KOps/s | |
test_nested_get | 31.8900μs | 7.8031μs | 128.1541 KOps/s | 127.4612 KOps/s | |
test_stacked_getleaf | 27.7010μs | 8.3832μs | 119.2858 KOps/s | 118.8295 KOps/s | |
test_stacked_get | 30.8310μs | 7.7933μs | 128.3147 KOps/s | 127.0020 KOps/s | |
test_nested_getitemleaf | 30.0700μs | 8.4632μs | 118.1589 KOps/s | 116.4495 KOps/s | |
test_nested_getitem | 33.9410μs | 7.9988μs | 125.0180 KOps/s | 124.5599 KOps/s | |
test_stacked_getitemleaf | 35.6310μs | 8.5558μs | 116.8797 KOps/s | 116.1692 KOps/s | |
test_stacked_getitem | 25.0200μs | 7.9922μs | 125.1227 KOps/s | 124.2232 KOps/s | |
test_lock_nested | 57.4349ms | 0.3969ms | 2.5194 KOps/s | 2.4649 KOps/s | |
test_lock_stack_nested | 0.3425ms | 0.2919ms | 3.4258 KOps/s | 3.3111 KOps/s | |
test_unlock_nested | 59.8476ms | 0.3998ms | 2.5015 KOps/s | 2.4628 KOps/s | |
test_unlock_stack_nested | 0.3212ms | 0.3017ms | 3.3150 KOps/s | 3.2362 KOps/s | |
test_flatten_speed | 0.4123ms | 0.1017ms | 9.8362 KOps/s | 9.6626 KOps/s | |
test_unflatten_speed | 0.3139ms | 0.2898ms | 3.4502 KOps/s | 3.4180 KOps/s | |
test_common_ops | 1.0930ms | 0.6233ms | 1.6043 KOps/s | 1.7273 KOps/s | |
test_creation | 29.0000μs | 1.6481μs | 606.7445 KOps/s | 616.8917 KOps/s | |
test_creation_empty | 26.4110μs | 10.7088μs | 93.3812 KOps/s | 126.8563 KOps/s | |
test_creation_nested_1 | 28.0310μs | 12.5112μs | 79.9284 KOps/s | 103.4794 KOps/s | |
test_creation_nested_2 | 51.3810μs | 14.7616μs | 67.7434 KOps/s | 84.3790 KOps/s | |
test_clone | 56.9610μs | 11.6527μs | 85.8174 KOps/s | 85.3292 KOps/s | |
test_getitem[int] | 31.0600μs | 10.6368μs | 94.0133 KOps/s | 94.0386 KOps/s | |
test_getitem[slice_int] | 46.9210μs | 20.4875μs | 48.8102 KOps/s | 48.8220 KOps/s | |
test_getitem[range] | 64.4310μs | 46.5046μs | 21.5032 KOps/s | 22.0678 KOps/s | |
test_getitem[tuple] | 42.0410μs | 18.3026μs | 54.6370 KOps/s | 54.0202 KOps/s | |
test_getitem[list] | 0.1519ms | 31.9083μs | 31.3398 KOps/s | 30.5250 KOps/s | |
test_setitem_dim[int] | 66.8410μs | 28.4367μs | 35.1658 KOps/s | 37.2978 KOps/s | |
test_setitem_dim[slice_int] | 70.0310μs | 48.5305μs | 20.6056 KOps/s | 20.8238 KOps/s | |
test_setitem_dim[range] | 88.6910μs | 65.2632μs | 15.3226 KOps/s | 15.8482 KOps/s | |
test_setitem_dim[tuple] | 60.5010μs | 42.1918μs | 23.7013 KOps/s | 23.9793 KOps/s | |
test_setitem | 47.4700μs | 17.6002μs | 56.8174 KOps/s | 62.6276 KOps/s | |
test_set | 49.3110μs | 17.1548μs | 58.2927 KOps/s | 63.7642 KOps/s | |
test_set_shared | 1.6643ms | 98.4177μs | 10.1608 KOps/s | 10.1831 KOps/s | |
test_update | 90.7210μs | 20.5981μs | 48.5483 KOps/s | 55.5269 KOps/s | |
test_update_nested | 69.1020μs | 26.5024μs | 37.7325 KOps/s | 42.1088 KOps/s | |
test_update__nested | 67.7410μs | 22.3876μs | 44.6675 KOps/s | 43.9879 KOps/s | |
test_set_nested | 54.0410μs | 17.9797μs | 55.6183 KOps/s | 59.3294 KOps/s | |
test_set_nested_new | 69.0810μs | 20.8766μs | 47.9006 KOps/s | 51.3847 KOps/s | |
test_select | 66.6510μs | 33.4357μs | 29.9081 KOps/s | 30.4024 KOps/s | |
test_select_nested | 0.6063ms | 51.1052μs | 19.5675 KOps/s | 19.1088 KOps/s | |
test_exclude_nested | 0.1409ms | 0.1077ms | 9.2833 KOps/s | 9.0389 KOps/s | |
test_empty[True] | 0.3779ms | 0.3406ms | 2.9360 KOps/s | 2.8756 KOps/s | |
test_empty[False] | 2.7930μs | 0.7932μs | 1.2608 MOps/s | 1.2431 MOps/s | |
test_to | 89.8920μs | 58.9966μs | 16.9501 KOps/s | 17.1530 KOps/s | |
test_to_nonblocking | 75.2710μs | 35.2139μs | 28.3979 KOps/s | 27.6741 KOps/s | |
test_unbind_speed | 0.2927ms | 0.2540ms | 3.9376 KOps/s | 3.8755 KOps/s | |
test_unbind_speed_stack0 | 0.2927ms | 0.2547ms | 3.9258 KOps/s | 3.8177 KOps/s | |
test_unbind_speed_stack1 | 75.3276ms | 0.7656ms | 1.3062 KOps/s | 1.2705 KOps/s | |
test_split | 75.0175ms | 1.6498ms | 606.1403 Ops/s | 600.4106 Ops/s | |
test_chunk | 76.0235ms | 1.6517ms | 605.4437 Ops/s | 599.9829 Ops/s | |
test_creation[device0] | 0.1280ms | 59.3094μs | 16.8607 KOps/s | 17.5575 KOps/s | |
test_creation_from_tensor | 0.1334ms | 56.0179μs | 17.8514 KOps/s | 17.6168 KOps/s | |
test_add_one[memmap_tensor0] | 51.7810μs | 6.8567μs | 145.8435 KOps/s | 146.0735 KOps/s | |
test_contiguous[memmap_tensor0] | 29.5800μs | 0.6601μs | 1.5150 MOps/s | 1.5153 MOps/s | |
test_stack[memmap_tensor0] | 29.8500μs | 4.6944μs | 213.0203 KOps/s | 210.0070 KOps/s | |
test_memmaptd_index | 1.0432ms | 0.2764ms | 3.6177 KOps/s | 3.4875 KOps/s | |
test_memmaptd_index_astensor | 0.5899ms | 0.3329ms | 3.0043 KOps/s | 2.8611 KOps/s | |
test_memmaptd_index_op | 0.9466ms | 0.6587ms | 1.5182 KOps/s | 1.5932 KOps/s | |
test_serialize_model | 96.0782ms | 90.8790ms | 11.0036 Ops/s | 10.4370 Ops/s | |
test_serialize_model_pickle | 1.3485s | 1.2353s | 0.8095 Ops/s | 0.8083 Ops/s | |
test_serialize_weights | 93.0226ms | 89.2009ms | 11.2106 Ops/s | 9.6680 Ops/s | |
test_serialize_weights_returnearly | 0.2211s | 75.8420ms | 13.1853 Ops/s | 13.2590 Ops/s | |
test_serialize_weights_pickle | 1.4070s | 1.2542s | 0.7973 Ops/s | 0.7975 Ops/s | |
test_reshape_pytree | 51.2110μs | 26.0183μs | 38.4344 KOps/s | 38.2878 KOps/s | |
test_reshape_td | 56.0110μs | 31.6022μs | 31.6433 KOps/s | 31.5799 KOps/s | |
test_view_pytree | 48.8010μs | 25.6677μs | 38.9594 KOps/s | 38.7218 KOps/s | |
test_view_td | 67.1210μs | 36.4615μs | 27.4262 KOps/s | 26.1745 KOps/s | |
test_unbind_pytree | 54.9910μs | 31.8505μs | 31.3967 KOps/s | 30.5265 KOps/s | |
test_unbind_td | 0.4760ms | 39.5292μs | 25.2978 KOps/s | 24.7536 KOps/s | |
test_split_pytree | 65.4110μs | 34.5099μs | 28.9772 KOps/s | 28.2453 KOps/s | |
test_split_td | 0.4740ms | 38.9037μs | 25.7045 KOps/s | 25.6009 KOps/s | |
test_add_pytree | 64.7610μs | 37.7100μs | 26.5181 KOps/s | 26.4722 KOps/s | |
test_add_td | 82.1310μs | 52.8089μs | 18.9362 KOps/s | 19.6436 KOps/s | |
test_distributed | 1.7340ms | 85.8737μs | 11.6450 KOps/s | 11.4122 KOps/s | |
test_tdmodule | 37.3710μs | 16.4573μs | 60.7632 KOps/s | 65.2114 KOps/s | |
test_tdmodule_dispatch | 47.7100μs | 31.7356μs | 31.5103 KOps/s | 33.5855 KOps/s | |
test_tdseq | 42.2200μs | 17.6800μs | 56.5612 KOps/s | 59.2304 KOps/s | |
test_tdseq_dispatch | 63.1210μs | 36.1993μs | 27.6249 KOps/s | 30.6682 KOps/s | |
test_instantiation_functorch | 1.5290ms | 1.4049ms | 711.8021 Ops/s | 706.5376 Ops/s | |
test_instantiation_td | 77.9213ms | 1.0753ms | 929.9976 Ops/s | 922.8870 Ops/s | |
test_exec_functorch | 0.1887ms | 0.1432ms | 6.9826 KOps/s | 6.9129 KOps/s | |
test_exec_functional_call | 0.3081ms | 0.1323ms | 7.5589 KOps/s | 7.5565 KOps/s | |
test_exec_td | 0.1770ms | 0.1278ms | 7.8243 KOps/s | 7.7741 KOps/s | |
test_exec_td_decorator | 0.5043ms | 0.2011ms | 4.9725 KOps/s | 4.9529 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.5914ms | 0.5585ms | 1.7904 KOps/s | 1.7964 KOps/s | |
test_vmap_mlp_speed[True-False] | 1.2862ms | 0.5592ms | 1.7883 KOps/s | 1.7968 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5506ms | 0.4856ms | 2.0594 KOps/s | 2.0485 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6313ms | 0.5115ms | 1.9549 KOps/s | 2.0096 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9343ms | 0.6594ms | 1.5165 KOps/s | 1.4251 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7961ms | 0.6583ms | 1.5190 KOps/s | 1.6195 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7185ms | 0.5780ms | 1.7302 KOps/s | 1.8290 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7028ms | 0.5761ms | 1.7358 KOps/s | 1.8171 KOps/s | |
test_vmap_transformer_speed[True-True] | 7.7595ms | 7.3822ms | 135.4607 Ops/s | 135.9745 Ops/s | |
test_vmap_transformer_speed[True-False] | 7.3799ms | 7.3087ms | 136.8223 Ops/s | 134.7067 Ops/s | |
test_vmap_transformer_speed[False-True] | 7.3217ms | 7.2553ms | 137.8304 Ops/s | 136.5672 Ops/s | |
test_vmap_transformer_speed[False-False] | 7.7074ms | 7.3509ms | 136.0377 Ops/s | 136.6042 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.6781ms | 17.7377ms | 56.3771 Ops/s | 56.0912 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 17.8692ms | 17.7769ms | 56.2529 Ops/s | 56.1483 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.8808ms | 17.6272ms | 56.7305 Ops/s | 56.5991 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.2325ms | 17.6550ms | 56.6411 Ops/s | 56.3115 Ops/s | |
test_to_module_speed[True] | 1.5995ms | 1.4811ms | 675.1798 Ops/s | 669.7717 Ops/s | |
test_to_module_speed[False] | 1.5838ms | 1.4791ms | 676.1063 Ops/s | 677.8007 Ops/s | |
test_tc_init | 83.4710μs | 59.5664μs | 16.7880 KOps/s | 19.4609 KOps/s | |
test_tc_init_nested | 0.1507ms | 0.1174ms | 8.5191 KOps/s | 9.7781 KOps/s | |
test_tc_first_layer_tensor | 23.8410μs | 3.6843μs | 271.4221 KOps/s | 268.8321 KOps/s | |
test_tc_first_layer_nontensor | 26.1200μs | 3.7084μs | 269.6562 KOps/s | 267.0216 KOps/s | |
test_tc_second_layer_tensor | 6.5002μs | 1.2089μs | 827.1746 KOps/s | 789.0325 KOps/s | |
test_tc_second_layer_nontensor | 24.3000μs | 4.2164μs | 237.1689 KOps/s | 234.1734 KOps/s | |
test_unbind | 0.1115s | 14.7457ms | 67.8162 Ops/s | 69.4159 Ops/s | |
test_full_like | 14.3288ms | 13.5820ms | 73.6267 Ops/s | 72.2418 Ops/s | |
test_zeros_like | 8.3559ms | 7.9687ms | 125.4908 Ops/s | 124.7093 Ops/s | |
test_ones_like | 8.5419ms | 7.9642ms | 125.5618 Ops/s | 124.9674 Ops/s | |
test_clone | 9.7521ms | 9.4975ms | 105.2905 Ops/s | 100.2970 Ops/s | |
test_squeeze | 72.1410μs | 10.8217μs | 92.4065 KOps/s | 94.4496 KOps/s | |
test_unsqueeze | 0.1345ms | 87.2276μs | 11.4643 KOps/s | 11.1088 KOps/s | |
test_split | 3.4263ms | 3.1011ms | 322.4684 Ops/s | 311.7268 Ops/s | |
test_permute | 0.2630ms | 0.2061ms | 4.8509 KOps/s | 4.9157 KOps/s | |
test_stack | 27.4510ms | 27.2169ms | 36.7419 Ops/s | 36.3331 Ops/s | |
test_cat | 27.2567ms | 27.0044ms | 37.0309 Ops/s | 36.3955 Ops/s |

Labels: bug, CLA Signed
No description provided.