[BugFix] Faster empty_like for MemoryMappedTensor #585
Merged
facebook-github-bot added the CLA Signed label Nov 30, 2023
albertbou92 approved these changes Nov 30, 2023

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 31.8500μs | 15.7819μs | 63.3636 KOps/s | 63.0403 KOps/s | |
test_plain_set_stack_nested | 0.1793ms | 0.1435ms | 6.9673 KOps/s | 6.9051 KOps/s | |
test_plain_set_nested_inplace | 43.5710μs | 18.7853μs | 53.2331 KOps/s | 52.2175 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3254ms | 0.1720ms | 5.8125 KOps/s | 5.8058 KOps/s | |
test_items | 31.3580μs | 2.4211μs | 413.0283 KOps/s | 402.7880 KOps/s | |
test_items_nested | 0.4411ms | 0.2838ms | 3.5242 KOps/s | 3.6204 KOps/s | |
test_items_nested_locked | 0.3233ms | 0.2692ms | 3.7150 KOps/s | 3.5956 KOps/s | |
test_items_nested_leaf | 0.8253ms | 0.1686ms | 5.9317 KOps/s | 5.8707 KOps/s | |
test_items_stack_nested | 1.6585ms | 1.4835ms | 674.0658 Ops/s | 656.7914 Ops/s | |
test_items_stack_nested_leaf | 1.7671ms | 1.3481ms | 741.7949 Ops/s | 714.9756 Ops/s | |
test_items_stack_nested_locked | 0.8474ms | 0.7607ms | 1.3146 KOps/s | 1.2609 KOps/s | |
test_keys | 52.4780μs | 3.8851μs | 257.3912 KOps/s | 259.3479 KOps/s | |
test_keys_nested | 3.1677ms | 0.1409ms | 7.0960 KOps/s | 6.7144 KOps/s | |
test_keys_nested_locked | 0.1857ms | 0.1390ms | 7.1952 KOps/s | 7.0608 KOps/s | |
test_keys_nested_leaf | 0.3809ms | 0.1389ms | 7.2000 KOps/s | 6.9710 KOps/s | |
test_keys_stack_nested | 1.5095ms | 1.4072ms | 710.6479 Ops/s | 695.0822 Ops/s | |
test_keys_stack_nested_leaf | 1.7588ms | 1.4070ms | 710.7531 Ops/s | 697.4718 Ops/s | |
test_keys_stack_nested_locked | 0.7550ms | 0.6704ms | 1.4916 KOps/s | 1.4541 KOps/s | |
test_values | 6.5924μs | 1.1730μs | 852.5254 KOps/s | 772.4969 KOps/s | |
test_values_nested | 89.2960μs | 49.2461μs | 20.3062 KOps/s | 20.0247 KOps/s | |
test_values_nested_locked | 82.5730μs | 49.3704μs | 20.2551 KOps/s | 19.8548 KOps/s | |
test_values_nested_leaf | 99.2950μs | 44.6909μs | 22.3759 KOps/s | 22.3729 KOps/s | |
test_values_stack_nested | 1.4297ms | 1.1966ms | 835.7200 Ops/s | 813.3827 Ops/s | |
test_values_stack_nested_leaf | 1.4125ms | 1.1913ms | 839.4367 Ops/s | 824.1280 Ops/s | |
test_values_stack_nested_locked | 0.9665ms | 0.5118ms | 1.9540 KOps/s | 1.9014 KOps/s | |
test_membership | 9.7480μs | 1.3523μs | 739.4646 KOps/s | 727.6418 KOps/s | |
test_membership_nested | 20.7280μs | 2.8023μs | 356.8508 KOps/s | 351.4277 KOps/s | |
test_membership_nested_leaf | 28.1720μs | 2.8511μs | 350.7442 KOps/s | 348.1738 KOps/s | |
test_membership_stacked_nested | 46.7170μs | 11.8225μs | 84.5844 KOps/s | 83.1906 KOps/s | |
test_membership_stacked_nested_leaf | 34.9650μs | 11.8419μs | 84.4458 KOps/s | 79.7554 KOps/s | |
test_membership_nested_last | 34.5540μs | 5.9203μs | 168.9110 KOps/s | 157.1748 KOps/s | |
test_membership_nested_leaf_last | 35.3150μs | 6.0254μs | 165.9630 KOps/s | 166.2569 KOps/s | |
test_membership_stacked_nested_last | 0.2391ms | 0.1683ms | 5.9428 KOps/s | 5.9803 KOps/s | |
test_membership_stacked_nested_leaf_last | 47.4580μs | 13.7307μs | 72.8297 KOps/s | 71.8509 KOps/s | |
test_nested_getleaf | 39.7940μs | 10.6570μs | 93.8348 KOps/s | 94.1917 KOps/s | |
test_nested_get | 41.1470μs | 10.1225μs | 98.7900 KOps/s | 100.5221 KOps/s | |
test_stacked_getleaf | 1.0610ms | 0.6459ms | 1.5481 KOps/s | 1.5174 KOps/s | |
test_stacked_get | 1.1805ms | 0.6138ms | 1.6293 KOps/s | 1.5753 KOps/s | |
test_nested_getitemleaf | 29.9060μs | 10.6520μs | 93.8793 KOps/s | 93.0424 KOps/s | |
test_nested_getitem | 39.2730μs | 10.0967μs | 99.0420 KOps/s | 97.6376 KOps/s | |
test_stacked_getitemleaf | 0.7623ms | 0.6427ms | 1.5560 KOps/s | 1.5005 KOps/s | |
test_stacked_getitem | 1.3129ms | 0.6098ms | 1.6399 KOps/s | 1.5617 KOps/s | |
test_lock_nested | 7.1352ms | 0.5680ms | 1.7606 KOps/s | 1.7704 KOps/s | |
test_lock_stack_nested | 7.6001ms | 5.0558ms | 197.7943 Ops/s | 197.2205 Ops/s | |
test_unlock_nested | 70.3042ms | 0.5134ms | 1.9478 KOps/s | 2.2528 KOps/s | |
test_unlock_stack_nested | 66.0627ms | 6.7768ms | 147.5613 Ops/s | 147.0538 Ops/s | |
test_flatten_speed | 0.5727ms | 0.2676ms | 3.7364 KOps/s | 3.7207 KOps/s | |
test_unflatten_speed | 1.7455ms | 0.4835ms | 2.0684 KOps/s | 2.1800 KOps/s | |
test_common_ops | 1.2089ms | 0.6642ms | 1.5056 KOps/s | 1.4846 KOps/s | |
test_creation | 58.8390μs | 2.4758μs | 403.9046 KOps/s | 398.9798 KOps/s | |
test_creation_empty | 30.2060μs | 8.1581μs | 122.5778 KOps/s | 123.3529 KOps/s | |
test_creation_nested_1 | 40.4450μs | 11.2943μs | 88.5406 KOps/s | 87.6794 KOps/s | |
test_creation_nested_2 | 39.6440μs | 15.0413μs | 66.4835 KOps/s | 66.1438 KOps/s | |
test_clone | 87.8630μs | 13.3905μs | 74.6799 KOps/s | 74.5813 KOps/s | |
test_getitem[int] | 34.8650μs | 13.0017μs | 76.9133 KOps/s | 76.7222 KOps/s | |
test_getitem[slice_int] | 0.1301ms | 26.3031μs | 38.0183 KOps/s | 38.9085 KOps/s | |
test_getitem[range] | 87.9140μs | 44.0657μs | 22.6934 KOps/s | 22.3669 KOps/s | |
test_getitem[tuple] | 63.3180μs | 20.2697μs | 49.3347 KOps/s | 48.1991 KOps/s | |
test_getitem[list] | 94.2850μs | 38.6804μs | 25.8529 KOps/s | 25.2651 KOps/s | |
test_setitem_dim[int] | 59.4300μs | 28.7006μs | 34.8425 KOps/s | 35.0077 KOps/s | |
test_setitem_dim[slice_int] | 92.5210μs | 52.2875μs | 19.1250 KOps/s | 18.8562 KOps/s | |
test_setitem_dim[range] | 0.1487ms | 73.1579μs | 13.6691 KOps/s | 13.6812 KOps/s | |
test_setitem_dim[tuple] | 84.6870μs | 41.9583μs | 23.8332 KOps/s | 23.8323 KOps/s | |
test_setitem | 78.8260μs | 18.4254μs | 54.2729 KOps/s | 54.1930 KOps/s | |
test_set | 79.7080μs | 17.5207μs | 57.0754 KOps/s | 55.5369 KOps/s | |
test_set_shared | 1.6807ms | 0.1411ms | 7.0867 KOps/s | 7.0358 KOps/s | |
test_update | 0.1111ms | 19.4289μs | 51.4698 KOps/s | 53.3670 KOps/s | |
test_update_nested | 89.2860μs | 26.2988μs | 38.0245 KOps/s | 37.5058 KOps/s | |
test_set_nested | 89.8680μs | 19.3517μs | 51.6751 KOps/s | 51.2719 KOps/s | |
test_set_nested_new | 78.2360μs | 24.4348μs | 40.9252 KOps/s | 39.0178 KOps/s | |
test_select | 0.1050ms | 49.8648μs | 20.0542 KOps/s | 19.7768 KOps/s | |
test_unbind_speed | 0.4551ms | 0.3760ms | 2.6597 KOps/s | 2.6768 KOps/s | |
test_unbind_speed_stack0 | 63.5713ms | 4.7261ms | 211.5898 Ops/s | 224.1267 Ops/s | |
test_unbind_speed_stack1 | 1.6471μs | 0.6518μs | 1.5342 MOps/s | 1.5732 MOps/s | |
test_split | 54.9724ms | 1.7616ms | 567.6578 Ops/s | 567.6658 Ops/s | |
test_chunk | 52.6340ms | 1.7449ms | 573.0871 Ops/s | 549.7865 Ops/s | |
test_creation[device0] | 0.6078ms | 0.2940ms | 3.4013 KOps/s | 3.3334 KOps/s | |
test_creation_from_tensor | 2.6767ms | 0.3320ms | 3.0121 KOps/s | 3.0086 KOps/s | |
test_add_one[memmap_tensor0] | 93.8840μs | 25.3146μs | 39.5029 KOps/s | 39.4299 KOps/s | |
test_contiguous[memmap_tensor0] | 30.4970μs | 5.7286μs | 174.5633 KOps/s | 173.2443 KOps/s | |
test_stack[memmap_tensor0] | 87.5230μs | 19.5318μs | 51.1985 KOps/s | 53.9055 KOps/s | |
test_memmaptd_index | 1.2355ms | 0.2251ms | 4.4426 KOps/s | 2.4373 KOps/s | |
test_memmaptd_index_astensor | 0.3215ms | 0.2573ms | 3.8860 KOps/s | 2.1348 KOps/s | |
test_memmaptd_index_op | 0.6072ms | 0.4956ms | 2.0176 KOps/s | 1.4201 KOps/s | |
test_reshape_pytree | 55.8040μs | 23.4572μs | 42.6308 KOps/s | 43.2654 KOps/s | |
test_reshape_td | 0.3969ms | 31.6100μs | 31.6355 KOps/s | 30.6044 KOps/s | |
test_view_pytree | 58.5290μs | 23.4583μs | 42.6289 KOps/s | 43.0650 KOps/s | |
test_view_td | 20.3280μs | 4.8598μs | 205.7678 KOps/s | 207.6055 KOps/s | |
test_unbind_pytree | 76.8330μs | 26.8161μs | 37.2911 KOps/s | 38.1050 KOps/s | |
test_unbind_td | 0.1258ms | 59.3170μs | 16.8586 KOps/s | 15.2398 KOps/s | |
test_split_pytree | 55.5440μs | 26.7533μs | 37.3785 KOps/s | 38.4671 KOps/s | |
test_split_td | 88.8260μs | 46.3286μs | 21.5849 KOps/s | 20.9907 KOps/s | |
test_add_pytree | 73.7570μs | 32.2100μs | 31.0462 KOps/s | 27.3006 KOps/s | |
test_add_td | 0.1308ms | 45.4094μs | 22.0219 KOps/s | 21.8670 KOps/s | |
test_distributed | 49.1220μs | 5.9400μs | 168.3509 KOps/s | 167.9495 KOps/s | |
test_tdmodule | 0.1728ms | 20.8578μs | 47.9438 KOps/s | 44.3328 KOps/s | |
test_tdmodule_dispatch | 0.1682ms | 37.7168μs | 26.5134 KOps/s | 25.3878 KOps/s | |
test_tdseq | 50.9540μs | 23.7263μs | 42.1474 KOps/s | 40.6494 KOps/s | |
test_tdseq_dispatch | 0.1372ms | 42.6089μs | 23.4693 KOps/s | 22.9931 KOps/s | |
test_instantiation_functorch | 1.4130ms | 1.2995ms | 769.5415 Ops/s | 778.5909 Ops/s | |
test_instantiation_td | 1.6170ms | 1.0298ms | 971.0264 Ops/s | 923.1350 Ops/s | |
test_exec_functorch | 0.3696ms | 0.1589ms | 6.2947 KOps/s | 6.1495 KOps/s | |
test_exec_functional_call | 0.4108ms | 0.1495ms | 6.6896 KOps/s | 6.6385 KOps/s | |
test_exec_td | 0.3274ms | 0.1435ms | 6.9695 KOps/s | 6.8402 KOps/s | |
test_exec_td_decorator | 0.9495ms | 0.1772ms | 5.6420 KOps/s | 5.4875 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4163ms | 0.8990ms | 1.1124 KOps/s | 1.1181 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9511ms | 0.4808ms | 2.0799 KOps/s | 2.1178 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1643ms | 0.7811ms | 1.2802 KOps/s | 1.2923 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5431ms | 0.3814ms | 2.6222 KOps/s | 2.6003 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.6583ms | 1.7782ms | 562.3693 Ops/s | 565.0756 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9007ms | 0.5122ms | 1.9523 KOps/s | 1.9405 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.0033ms | 1.4830ms | 674.3256 Ops/s | 681.4433 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7601ms | 0.3961ms | 2.5247 KOps/s | 2.5088 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.4615ms | 12.6032μs | 79.3450 KOps/s | 78.3180 KOps/s | |
test_plain_set_stack_nested | 0.1967ms | 0.1146ms | 8.7284 KOps/s | 8.6301 KOps/s | |
test_plain_set_nested_inplace | 33.4710μs | 14.8612μs | 67.2893 KOps/s | 66.2095 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1805ms | 0.1399ms | 7.1486 KOps/s | 7.0619 KOps/s | |
test_items | 28.8110μs | 4.6702μs | 214.1250 KOps/s | 212.5502 KOps/s | |
test_items_nested | 0.3774ms | 0.3381ms | 2.9574 KOps/s | 2.9861 KOps/s | |
test_items_nested_locked | 0.4287ms | 0.3426ms | 2.9185 KOps/s | 2.9522 KOps/s | |
test_items_nested_leaf | 0.2438ms | 0.1991ms | 5.0218 KOps/s | 5.0531 KOps/s | |
test_items_stack_nested | 1.5435ms | 1.4793ms | 675.9844 Ops/s | 668.7865 Ops/s | |
test_items_stack_nested_leaf | 1.3606ms | 1.3065ms | 765.4036 Ops/s | 759.3990 Ops/s | |
test_items_stack_nested_locked | 0.8497ms | 0.8129ms | 1.2302 KOps/s | 1.2095 KOps/s | |
test_keys | 41.3110μs | 4.5990μs | 217.4367 KOps/s | 215.9634 KOps/s | |
test_keys_nested | 3.2786ms | 90.7523μs | 11.0190 KOps/s | 11.0600 KOps/s | |
test_keys_nested_locked | 0.1143ms | 90.2784μs | 11.0768 KOps/s | 11.0639 KOps/s | |
test_keys_nested_leaf | 41.3409ms | 86.7979μs | 11.5210 KOps/s | 12.1759 KOps/s | |
test_keys_stack_nested | 1.5592ms | 1.3013ms | 768.4596 Ops/s | 769.2222 Ops/s | |
test_keys_stack_nested_leaf | 1.3658ms | 1.2875ms | 776.7148 Ops/s | 767.7453 Ops/s | |
test_keys_stack_nested_locked | 0.6650ms | 0.6234ms | 1.6040 KOps/s | 1.5779 KOps/s | |
test_values | 14.9237μs | 1.8928μs | 528.3141 KOps/s | 526.2636 KOps/s | |
test_values_nested | 68.0830μs | 43.1929μs | 23.1519 KOps/s | 23.0592 KOps/s | |
test_values_nested_locked | 0.1073ms | 45.3581μs | 22.0468 KOps/s | 21.8343 KOps/s | |
test_values_nested_leaf | 58.0430μs | 37.2890μs | 26.8176 KOps/s | 26.4450 KOps/s | |
test_values_stack_nested | 1.2051ms | 1.1445ms | 873.7288 Ops/s | 866.9907 Ops/s | |
test_values_stack_nested_leaf | 1.1836ms | 1.1219ms | 891.3454 Ops/s | 883.8879 Ops/s | |
test_values_stack_nested_locked | 0.5468ms | 0.4992ms | 2.0032 KOps/s | 1.9616 KOps/s | |
test_membership | 6.1744μs | 0.9373μs | 1.0669 MOps/s | 948.6238 KOps/s | |
test_membership_nested | 18.6710μs | 2.1887μs | 456.8875 KOps/s | 456.7165 KOps/s | |
test_membership_nested_leaf | 14.0505μs | 2.1169μs | 472.3892 KOps/s | 473.1406 KOps/s | |
test_membership_stacked_nested | 40.7510μs | 11.0455μs | 90.5344 KOps/s | 91.2699 KOps/s | |
test_membership_stacked_nested_leaf | 73.6830μs | 11.0465μs | 90.5268 KOps/s | 90.8871 KOps/s | |
test_membership_nested_last | 35.8820μs | 4.6606μs | 214.5650 KOps/s | 218.4917 KOps/s | |
test_membership_nested_leaf_last | 22.7810μs | 4.6893μs | 213.2503 KOps/s | 218.3642 KOps/s | |
test_membership_stacked_nested_last | 0.1866ms | 0.1342ms | 7.4530 KOps/s | 7.3967 KOps/s | |
test_membership_stacked_nested_leaf_last | 51.6820μs | 12.7416μs | 78.4830 KOps/s | 78.2578 KOps/s | |
test_nested_getleaf | 29.2310μs | 8.3546μs | 119.6950 KOps/s | 118.8351 KOps/s | |
test_nested_get | 28.6510μs | 7.9278μs | 126.1391 KOps/s | 125.9206 KOps/s | |
test_stacked_getleaf | 0.6987ms | 0.5690ms | 1.7576 KOps/s | 1.7796 KOps/s | |
test_stacked_get | 0.6061ms | 0.5440ms | 1.8384 KOps/s | 1.8715 KOps/s | |
test_nested_getitemleaf | 27.7610μs | 8.5018μs | 117.6219 KOps/s | 118.0450 KOps/s | |
test_nested_getitem | 31.2610μs | 8.0607μs | 124.0594 KOps/s | 124.8637 KOps/s | |
test_stacked_getitemleaf | 0.8116ms | 0.5668ms | 1.7643 KOps/s | 1.7569 KOps/s | |
test_stacked_getitem | 0.5586ms | 0.5328ms | 1.8768 KOps/s | 1.8543 KOps/s | |
test_lock_nested | 3.2113ms | 0.5539ms | 1.8053 KOps/s | 1.7612 KOps/s | |
test_lock_stack_nested | 82.9370ms | 7.2093ms | 138.7105 Ops/s | 137.9740 Ops/s | |
test_unlock_nested | 2.3688ms | 0.4265ms | 2.3444 KOps/s | 2.3029 KOps/s | |
test_unlock_stack_nested | 66.5467ms | 6.2426ms | 160.1906 Ops/s | 158.0031 Ops/s | |
test_flatten_speed | 0.2246ms | 0.1861ms | 5.3732 KOps/s | 5.3233 KOps/s | |
test_unflatten_speed | 0.4375ms | 0.3647ms | 2.7420 KOps/s | 2.7481 KOps/s | |
test_common_ops | 1.1255ms | 0.5933ms | 1.6856 KOps/s | 1.6475 KOps/s | |
test_creation | 64.1220μs | 2.1034μs | 475.4099 KOps/s | 474.4200 KOps/s | |
test_creation_empty | 27.5710μs | 6.7310μs | 148.5669 KOps/s | 138.8676 KOps/s | |
test_creation_nested_1 | 42.2310μs | 9.1010μs | 109.8778 KOps/s | 104.8260 KOps/s | |
test_creation_nested_2 | 41.1420μs | 11.8089μs | 84.6816 KOps/s | 82.0787 KOps/s | |
test_clone | 97.3440μs | 14.2661μs | 70.0965 KOps/s | 68.7352 KOps/s | |
test_getitem[int] | 30.3310μs | 12.1694μs | 82.1736 KOps/s | 81.2715 KOps/s | |
test_getitem[slice_int] | 50.9020μs | 23.7061μs | 42.1833 KOps/s | 42.0469 KOps/s | |
test_getitem[range] | 0.2405ms | 40.1628μs | 24.8986 KOps/s | 24.8468 KOps/s | |
test_getitem[tuple] | 40.4810μs | 20.0523μs | 49.8696 KOps/s | 48.8674 KOps/s | |
test_getitem[list] | 0.2554ms | 36.5407μs | 27.3668 KOps/s | 26.4924 KOps/s | |
test_setitem_dim[int] | 56.7230μs | 25.3252μs | 39.4863 KOps/s | 37.9526 KOps/s | |
test_setitem_dim[slice_int] | 61.6930μs | 45.3231μs | 22.0638 KOps/s | 21.4143 KOps/s | |
test_setitem_dim[range] | 97.3540μs | 62.7407μs | 15.9386 KOps/s | 15.7419 KOps/s | |
test_setitem_dim[tuple] | 59.4820μs | 38.9700μs | 25.6608 KOps/s | 25.7412 KOps/s | |
test_setitem | 94.3340μs | 17.9339μs | 55.7603 KOps/s | 53.9342 KOps/s | |
test_set | 88.2050μs | 17.4950μs | 57.1593 KOps/s | 56.3603 KOps/s | |
test_set_shared | 2.8966ms | 0.1047ms | 9.5485 KOps/s | 8.6112 KOps/s | |
test_update | 94.9040μs | 18.4243μs | 54.2762 KOps/s | 52.2170 KOps/s | |
test_update_nested | 0.1077ms | 25.2293μs | 39.6364 KOps/s | 38.9236 KOps/s | |
test_set_nested | 98.4540μs | 18.7529μs | 53.3250 KOps/s | 52.2182 KOps/s | |
test_set_nested_new | 96.3240μs | 22.9429μs | 43.5864 KOps/s | 42.5748 KOps/s | |
test_select | 73.8830μs | 46.3572μs | 21.5716 KOps/s | 21.2255 KOps/s | |
test_to | 74.6230μs | 54.8299μs | 18.2382 KOps/s | 18.2716 KOps/s | |
test_to_nonblocking | 68.5530μs | 34.8684μs | 28.6793 KOps/s | 25.7870 KOps/s | |
test_unbind_speed | 0.4077ms | 0.3618ms | 2.7636 KOps/s | 2.7548 KOps/s | |
test_unbind_speed_stack0 | 63.2311ms | 4.3730ms | 228.6754 Ops/s | 233.1590 Ops/s | |
test_unbind_speed_stack1 | 1.6431μs | 0.5259μs | 1.9016 MOps/s | 1.8929 MOps/s | |
test_split | 54.0366ms | 1.8082ms | 553.0379 Ops/s | 541.5858 Ops/s | |
test_chunk | 54.3971ms | 1.7994ms | 555.7271 Ops/s | 544.9208 Ops/s | |
test_creation[device0] | 0.5046ms | 0.3092ms | 3.2346 KOps/s | 3.2283 KOps/s | |
test_creation[device1] | 0.8266ms | 0.3145ms | 3.1797 KOps/s | 3.1659 KOps/s | |
test_creation_from_tensor | 57.3963ms | 0.3675ms | 2.7208 KOps/s | 2.9052 KOps/s | |
test_add_one[memmap_tensor0] | 91.0340μs | 23.9931μs | 41.6787 KOps/s | 40.5487 KOps/s | |
test_add_one[memmap_tensor1] | 0.2175ms | 72.9038μs | 13.7167 KOps/s | 13.3934 KOps/s | |
test_contiguous[memmap_tensor0] | 20.4710μs | 5.7952μs | 172.5577 KOps/s | 172.1464 KOps/s | |
test_contiguous[memmap_tensor1] | 61.6230μs | 21.9253μs | 45.6094 KOps/s | 45.4119 KOps/s | |
test_stack[memmap_tensor0] | 88.9550μs | 19.2375μs | 51.9817 KOps/s | 50.8372 KOps/s | |
test_stack[memmap_tensor1] | 0.1515ms | 73.3245μs | 13.6380 KOps/s | 13.7088 KOps/s | |
test_memmaptd_index | 0.2721ms | 0.2350ms | 4.2548 KOps/s | 2.2685 KOps/s | |
test_memmaptd_index_astensor | 0.3734ms | 0.2938ms | 3.4039 KOps/s | 2.0154 KOps/s | |
test_memmaptd_index_op | 0.5966ms | 0.5452ms | 1.8341 KOps/s | 1.3118 KOps/s | |
test_reshape_pytree | 39.7710μs | 20.6530μs | 48.4192 KOps/s | 47.2713 KOps/s | |
test_reshape_td | 59.2020μs | 30.6095μs | 32.6696 KOps/s | 31.9493 KOps/s | |
test_view_pytree | 36.4320μs | 20.6473μs | 48.4324 KOps/s | 47.8569 KOps/s | |
test_view_td | 17.7210μs | 4.0879μs | 244.6240 KOps/s | 243.0415 KOps/s | |
test_unbind_pytree | 55.5120μs | 25.8276μs | 38.7183 KOps/s | 37.6220 KOps/s | |
test_unbind_td | 96.3440μs | 56.0317μs | 17.8470 KOps/s | 17.4211 KOps/s | |
test_split_pytree | 52.9920μs | 23.8551μs | 41.9197 KOps/s | 41.3863 KOps/s | |
test_split_td | 71.4130μs | 45.0372μs | 22.2039 KOps/s | 22.0811 KOps/s | |
test_add_pytree | 68.3730μs | 31.7818μs | 31.4646 KOps/s | 30.9881 KOps/s | |
test_add_td | 73.6630μs | 43.8924μs | 22.7830 KOps/s | 21.1781 KOps/s | |
test_distributed | 18.1410μs | 5.7663μs | 173.4202 KOps/s | 180.5504 KOps/s | |
test_tdmodule | 32.5110μs | 16.7638μs | 59.6523 KOps/s | 58.6250 KOps/s | |
test_tdmodule_dispatch | 0.1200ms | 32.8489μs | 30.4424 KOps/s | 30.0855 KOps/s | |
test_tdseq | 35.8810μs | 19.6918μs | 50.7826 KOps/s | 48.9194 KOps/s | |
test_tdseq_dispatch | 50.9820μs | 35.6687μs | 28.0358 KOps/s | 27.6472 KOps/s | |
test_instantiation_functorch | 1.7997ms | 1.6697ms | 598.9247 Ops/s | 597.3339 Ops/s | |
test_instantiation_td | 1.6844ms | 1.1908ms | 839.7842 Ops/s | 843.6504 Ops/s | |
test_exec_functorch | 0.2141ms | 0.1600ms | 6.2499 KOps/s | 6.1964 KOps/s | |
test_exec_functional_call | 0.2113ms | 0.1618ms | 6.1791 KOps/s | 6.3270 KOps/s | |
test_exec_td | 0.2088ms | 0.1516ms | 6.5962 KOps/s | 6.7300 KOps/s | |
test_exec_td_decorator | 0.7449ms | 0.1922ms | 5.2030 KOps/s | 5.3099 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1842ms | 1.0811ms | 925.0141 Ops/s | 881.7027 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.6555ms | 0.6194ms | 1.6145 KOps/s | 1.5536 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0816ms | 0.9977ms | 1.0023 KOps/s | 970.9010 Ops/s | |
test_vmap_mlp_speed[False-False] | 0.6074ms | 0.5470ms | 1.8280 KOps/s | 1.7407 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.0028ms | 2.0612ms | 485.1652 Ops/s | 467.5803 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1849ms | 0.6629ms | 1.5085 KOps/s | 1.4614 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.2447ms | 1.7828ms | 560.9020 Ops/s | 531.9862 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9809ms | 0.5640ms | 1.7731 KOps/s | 1.6935 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.8824ms | 12.7318ms | 78.5434 Ops/s | 76.7213 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.3775ms | 8.2961ms | 120.5385 Ops/s | 118.3804 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.7165ms | 12.6251ms | 79.2075 Ops/s | 77.4830 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.2684ms | 8.1915ms | 122.0779 Ops/s | 119.0515 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 66.1297ms | 65.0234ms | 15.3791 Ops/s | 14.0207 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 22.0925ms | 20.0479ms | 49.8805 Ops/s | 48.9901 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 0.1380s | 63.3128ms | 15.7946 Ops/s | 16.5634 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.6904ms | 19.6205ms | 50.9670 Ops/s | 46.2853 Ops/s |
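
The Change column is empty in both tables, so the effect of the PR has to be read off by comparing Ops against Ops on Repo HEAD. A minimal sketch of that arithmetic for the three `test_memmaptd_index*` rows of the second table follows. The helper name `relative_change` is hypothetical, and defining change as (PR - HEAD) / HEAD is an assumption about what the column is meant to report, not something the benchmark output confirms.

```python
# Throughput values copied from the second table, in KOps/s:
# (Ops with this PR, Ops on Repo HEAD).
rows = {
    "test_memmaptd_index": (4.2548, 2.2685),
    "test_memmaptd_index_astensor": (3.4039, 2.0154),
    "test_memmaptd_index_op": (1.8341, 1.3118),
}


def relative_change(pr_ops: float, head_ops: float) -> float:
    """Fractional throughput change of the PR relative to repo HEAD.

    Hypothetical helper: the formula is an assumed definition of "Change".
    """
    return (pr_ops - head_ops) / head_ops


for name, (pr_ops, head_ops) in rows.items():
    # Format as a signed percentage with one decimal place.
    print(f"{name}: {relative_change(pr_ops, head_ops):+.1%}")
```

The same helper can be applied to any other row, as long as both values are first converted to the same unit (Ops/s, KOps/s, or MOps/s).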

Labels: bug, CLA Signed, Performance
No description provided.