[Performance] Faster __init__
#576
Merged
facebook-github-bot added the CLA Signed label Nov 24, 2023
# Conflicts: # tensordict/_td.py
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 27.7520μs | 16.2458μs | 61.5545 KOps/s | 62.5788 KOps/s | |
test_plain_set_stack_nested | 0.1848ms | 0.1436ms | 6.9655 KOps/s | 6.8446 KOps/s | |
test_plain_set_nested_inplace | 45.3550μs | 19.4324μs | 51.4605 KOps/s | 52.1931 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3943ms | 0.1740ms | 5.7460 KOps/s | 5.4785 KOps/s | |
test_items | 18.8560μs | 2.4049μs | 415.8092 KOps/s | 414.6935 KOps/s | |
test_items_nested | 0.3481ms | 0.2653ms | 3.7686 KOps/s | 3.7741 KOps/s | |
test_items_nested_locked | 1.3824ms | 0.2696ms | 3.7091 KOps/s | 3.7807 KOps/s | |
test_items_nested_leaf | 1.0040ms | 0.1631ms | 6.1315 KOps/s | 6.0925 KOps/s | |
test_items_stack_nested | 1.5997ms | 1.4748ms | 678.0480 Ops/s | 672.2472 Ops/s | |
test_items_stack_nested_leaf | 1.7665ms | 1.3485ms | 741.5715 Ops/s | 723.4473 Ops/s | |
test_items_stack_nested_locked | 1.7869ms | 0.7610ms | 1.3140 KOps/s | 1.3149 KOps/s | |
test_keys | 30.6970μs | 3.8994μs | 256.4518 KOps/s | 260.5925 KOps/s | |
test_keys_nested | 0.5954ms | 0.1421ms | 7.0349 KOps/s | 6.5799 KOps/s | |
test_keys_nested_locked | 0.2964ms | 0.1415ms | 7.0654 KOps/s | 7.0693 KOps/s | |
test_keys_nested_leaf | 0.3833ms | 0.1479ms | 6.7609 KOps/s | 7.0076 KOps/s | |
test_keys_stack_nested | 2.0087ms | 1.4114ms | 708.5112 Ops/s | 711.7721 Ops/s | |
test_keys_stack_nested_leaf | 1.7851ms | 1.4080ms | 710.2409 Ops/s | 708.3654 Ops/s | |
test_keys_stack_nested_locked | 1.2292ms | 0.6855ms | 1.4589 KOps/s | 1.4595 KOps/s | |
test_values | 33.4075μs | 1.1874μs | 842.2078 KOps/s | 889.4031 KOps/s | |
test_values_nested | 96.6610μs | 49.3832μs | 20.2498 KOps/s | 20.1497 KOps/s | |
test_values_nested_locked | 0.3318ms | 51.5222μs | 19.4091 KOps/s | 20.0422 KOps/s | |
test_values_nested_leaf | 67.2660μs | 44.4710μs | 22.4866 KOps/s | 22.4800 KOps/s | |
test_values_stack_nested | 1.8409ms | 1.1855ms | 843.5166 Ops/s | 837.0549 Ops/s | |
test_values_stack_nested_leaf | 1.9137ms | 1.1842ms | 844.4246 Ops/s | 837.3378 Ops/s | |
test_values_stack_nested_locked | 0.9341ms | 0.5095ms | 1.9626 KOps/s | 1.9479 KOps/s | |
test_membership | 16.8110μs | 1.3583μs | 736.2321 KOps/s | 740.2884 KOps/s | |
test_membership_nested | 22.1920μs | 2.7834μs | 359.2725 KOps/s | 362.1514 KOps/s | |
test_membership_nested_leaf | 0.1351ms | 2.8111μs | 355.7349 KOps/s | 360.2995 KOps/s | |
test_membership_stacked_nested | 52.4090μs | 11.6193μs | 86.0635 KOps/s | 82.8484 KOps/s | |
test_membership_stacked_nested_leaf | 0.2140ms | 12.3572μs | 80.9245 KOps/s | 80.2100 KOps/s | |
test_membership_nested_last | 34.8250μs | 5.8962μs | 169.6018 KOps/s | 170.7603 KOps/s | |
test_membership_nested_leaf_last | 38.0610μs | 5.8948μs | 169.6413 KOps/s | 170.0306 KOps/s | |
test_membership_stacked_nested_last | 0.3416ms | 0.1667ms | 5.9999 KOps/s | 5.9220 KOps/s | |
test_membership_stacked_nested_leaf_last | 45.0240μs | 13.7228μs | 72.8715 KOps/s | 72.1076 KOps/s | |
test_nested_getleaf | 64.0600μs | 10.7466μs | 93.0524 KOps/s | 93.9317 KOps/s | |
test_nested_get | 34.7150μs | 10.2538μs | 97.5244 KOps/s | 99.7611 KOps/s | |
test_stacked_getleaf | 1.2662ms | 0.6348ms | 1.5754 KOps/s | 1.5680 KOps/s | |
test_stacked_get | 1.3376ms | 0.6071ms | 1.6471 KOps/s | 1.6436 KOps/s | |
test_nested_getitemleaf | 35.2460μs | 10.6422μs | 93.9658 KOps/s | 95.0440 KOps/s | |
test_nested_getitem | 32.8520μs | 10.2238μs | 97.8107 KOps/s | 100.1010 KOps/s | |
test_stacked_getitemleaf | 0.7521ms | 0.6367ms | 1.5705 KOps/s | 1.5706 KOps/s | |
test_stacked_getitem | 1.0664ms | 0.6078ms | 1.6453 KOps/s | 1.6439 KOps/s | |
test_lock_nested | 74.5251ms | 0.6541ms | 1.5288 KOps/s | 2.0493 KOps/s | |
test_lock_stack_nested | 10.1125ms | 5.2267ms | 191.3243 Ops/s | 121.1546 Ops/s | |
test_unlock_nested | 0.9741ms | 0.4444ms | 2.2501 KOps/s | 1.9886 KOps/s | |
test_unlock_stack_nested | 78.6142ms | 7.3497ms | 136.0592 Ops/s | 207.8619 Ops/s | |
test_flatten_speed | 0.5721ms | 0.2694ms | 3.7117 KOps/s | 3.7070 KOps/s | |
test_unflatten_speed | 0.7851ms | 0.4644ms | 2.1534 KOps/s | 2.1495 KOps/s | |
test_common_ops | 1.4001ms | 0.6835ms | 1.4630 KOps/s | 1.5145 KOps/s | |
test_creation | 32.2300μs | 2.4873μs | 402.0502 KOps/s | 415.1656 KOps/s | |
test_creation_empty | 29.8660μs | 8.6523μs | 115.5763 KOps/s | 124.2870 KOps/s | |
test_creation_nested_1 | 51.3260μs | 11.9300μs | 83.8223 KOps/s | 87.1964 KOps/s | |
test_creation_nested_2 | 54.4820μs | 15.6936μs | 63.7200 KOps/s | 67.7525 KOps/s | |
test_clone | 0.1601ms | 13.1904μs | 75.8127 KOps/s | 77.4640 KOps/s | |
test_getitem[int] | 39.6040μs | 13.0808μs | 76.4480 KOps/s | 78.2743 KOps/s | |
test_getitem[slice_int] | 0.1361ms | 24.9276μs | 40.1162 KOps/s | 40.9009 KOps/s | |
test_getitem[range] | 90.3800μs | 44.1557μs | 22.6472 KOps/s | 22.6648 KOps/s | |
test_getitem[tuple] | 86.0250μs | 20.2078μs | 49.4858 KOps/s | 50.3881 KOps/s | |
test_getitem[list] | 0.2348ms | 41.5139μs | 24.0883 KOps/s | 25.7723 KOps/s | |
test_setitem_dim[int] | 52.2980μs | 29.2742μs | 34.1598 KOps/s | 34.3747 KOps/s | |
test_setitem_dim[slice_int] | 89.5680μs | 54.1818μs | 18.4564 KOps/s | 18.4422 KOps/s | |
test_setitem_dim[range] | 0.1091ms | 75.6596μs | 13.2171 KOps/s | 13.4658 KOps/s | |
test_setitem_dim[tuple] | 0.1016ms | 44.2084μs | 22.6201 KOps/s | 23.4924 KOps/s | |
test_setitem | 0.1378ms | 18.4501μs | 54.2003 KOps/s | 56.1873 KOps/s | |
test_set | 0.1762ms | 18.0075μs | 55.5323 KOps/s | 59.0231 KOps/s | |
test_set_shared | 3.0737ms | 0.1444ms | 6.9242 KOps/s | 7.1269 KOps/s | |
test_update | 0.1105ms | 19.6894μs | 50.7887 KOps/s | 54.2388 KOps/s | |
test_update_nested | 0.1712ms | 26.9812μs | 37.0629 KOps/s | 36.6188 KOps/s | |
test_set_nested | 0.1491ms | 19.7660μs | 50.5919 KOps/s | 52.7011 KOps/s | |
test_set_nested_new | 0.1363ms | 24.9007μs | 40.1595 KOps/s | 41.3988 KOps/s | |
test_select | 0.1300ms | 51.6239μs | 19.3709 KOps/s | 20.5125 KOps/s | |
test_unbind_speed | 0.7463ms | 0.3676ms | 2.7201 KOps/s | 2.7043 KOps/s | |
test_unbind_speed_stack0 | 71.9756ms | 4.5562ms | 219.4795 Ops/s | 182.9332 Ops/s | |
test_unbind_speed_stack1 | 2.5392μs | 0.6346μs | 1.5757 MOps/s | 1.5536 MOps/s | |
test_split | 63.6809ms | 1.7865ms | 559.7444 Ops/s | 602.9797 Ops/s | |
test_chunk | 63.2697ms | 1.7439ms | 573.4436 Ops/s | 609.9920 Ops/s | |
test_creation[device0] | 3.3918ms | 0.3043ms | 3.2861 KOps/s | 3.2336 KOps/s | |
test_creation_from_tensor | 2.6679ms | 0.3346ms | 2.9882 KOps/s | 2.8869 KOps/s | |
test_add_one[memmap_tensor0] | 82.2450μs | 24.9082μs | 40.1475 KOps/s | 40.0588 KOps/s | |
test_contiguous[memmap_tensor0] | 30.0570μs | 5.7966μs | 172.5155 KOps/s | 175.5795 KOps/s | |
test_stack[memmap_tensor0] | 0.1220ms | 19.6190μs | 50.9710 KOps/s | 53.5790 KOps/s | |
test_memmaptd_index | 0.7641ms | 0.3975ms | 2.5157 KOps/s | 2.5072 KOps/s | |
test_memmaptd_index_astensor | 0.5957ms | 0.4582ms | 2.1825 KOps/s | 2.1730 KOps/s | |
test_memmaptd_index_op | 1.3118ms | 0.7047ms | 1.4191 KOps/s | 1.4184 KOps/s | |
test_reshape_pytree | 84.4280μs | 22.9536μs | 43.5662 KOps/s | 42.7644 KOps/s | |
test_reshape_td | 0.1365ms | 31.2604μs | 31.9893 KOps/s | 31.7479 KOps/s | |
test_view_pytree | 0.4868ms | 23.0631μs | 43.3592 KOps/s | 42.5491 KOps/s | |
test_view_td | 31.8300μs | 4.9991μs | 200.0362 KOps/s | 208.0765 KOps/s | |
test_unbind_pytree | 63.2390μs | 26.6742μs | 37.4894 KOps/s | 37.3904 KOps/s | |
test_unbind_td | 0.1282ms | 59.3548μs | 16.8478 KOps/s | 17.3361 KOps/s | |
test_split_pytree | 57.8990μs | 26.3568μs | 37.9408 KOps/s | 37.6941 KOps/s | |
test_split_td | 0.1012ms | 45.8028μs | 21.8327 KOps/s | 21.5452 KOps/s | |
test_add_pytree | 0.1000ms | 32.1701μs | 31.0847 KOps/s | 31.0532 KOps/s | |
test_add_td | 98.8250μs | 44.3499μs | 22.5480 KOps/s | 22.2060 KOps/s | |
test_distributed | 19.4970μs | 5.9635μs | 167.6875 KOps/s | 168.3150 KOps/s | |
test_tdmodule | 0.8908ms | 22.4675μs | 44.5088 KOps/s | 43.9878 KOps/s | |
test_tdmodule_dispatch | 0.2205ms | 39.7878μs | 25.1333 KOps/s | 26.1277 KOps/s | |
test_tdseq | 0.1463ms | 24.3614μs | 41.0486 KOps/s | 42.4128 KOps/s | |
test_tdseq_dispatch | 0.1409ms | 43.8875μs | 22.7855 KOps/s | 23.9345 KOps/s | |
test_instantiation_functorch | 2.0252ms | 1.2922ms | 773.8501 Ops/s | 784.4766 Ops/s | |
test_instantiation_td | 1.6295ms | 1.0200ms | 980.3782 Ops/s | 1.0038 KOps/s | |
test_exec_functorch | 0.2441ms | 0.1583ms | 6.3176 KOps/s | 5.9378 KOps/s | |
test_exec_functional_call | 0.3370ms | 0.1474ms | 6.7825 KOps/s | 6.7606 KOps/s | |
test_exec_td | 0.2539ms | 0.1408ms | 7.1024 KOps/s | 7.0019 KOps/s | |
test_exec_td_decorator | 0.9416ms | 0.1784ms | 5.6061 KOps/s | 5.1160 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4479ms | 0.9099ms | 1.0990 KOps/s | 1.1091 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7671ms | 0.4812ms | 2.0781 KOps/s | 2.1177 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1420ms | 0.7839ms | 1.2756 KOps/s | 1.2789 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7990ms | 0.3978ms | 2.5139 KOps/s | 2.5727 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.6138ms | 1.7838ms | 560.5947 Ops/s | 554.8152 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1221ms | 0.5216ms | 1.9172 KOps/s | 1.9107 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.0247ms | 1.4786ms | 676.3009 Ops/s | 662.3262 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8460ms | 0.4012ms | 2.4923 KOps/s | 2.4462 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.6879ms | 12.7184μs | 78.6263 KOps/s | 79.4180 KOps/s | |
test_plain_set_stack_nested | 0.1724ms | 0.1137ms | 8.7958 KOps/s | 8.4791 KOps/s | |
test_plain_set_nested_inplace | 29.0010μs | 15.1387μs | 66.0559 KOps/s | 66.3733 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1738ms | 0.1391ms | 7.1885 KOps/s | 7.1587 KOps/s | |
test_items | 24.5500μs | 4.6473μs | 215.1772 KOps/s | 214.3780 KOps/s | |
test_items_nested | 0.3572ms | 0.3370ms | 2.9672 KOps/s | 2.9648 KOps/s | |
test_items_nested_locked | 0.3771ms | 0.3386ms | 2.9536 KOps/s | 2.9742 KOps/s | |
test_items_nested_leaf | 0.2228ms | 0.1969ms | 5.0790 KOps/s | 5.0372 KOps/s | |
test_items_stack_nested | 1.5117ms | 1.4582ms | 685.7841 Ops/s | 677.8419 Ops/s | |
test_items_stack_nested_leaf | 1.3326ms | 1.2868ms | 777.1007 Ops/s | 768.9885 Ops/s | |
test_items_stack_nested_locked | 1.8376ms | 0.8140ms | 1.2284 KOps/s | 1.2508 KOps/s | |
test_keys | 34.9300μs | 4.5454μs | 220.0046 KOps/s | 219.5543 KOps/s | |
test_keys_nested | 0.5242ms | 89.5896μs | 11.1620 KOps/s | 11.1007 KOps/s | |
test_keys_nested_locked | 0.1110ms | 89.4184μs | 11.1834 KOps/s | 11.1126 KOps/s | |
test_keys_nested_leaf | 43.1561ms | 86.5948μs | 11.5480 KOps/s | 12.3092 KOps/s | |
test_keys_stack_nested | 1.3226ms | 1.2662ms | 789.7845 Ops/s | 781.7098 Ops/s | |
test_keys_stack_nested_leaf | 1.2990ms | 1.2593ms | 794.1049 Ops/s | 791.3633 Ops/s | |
test_keys_stack_nested_locked | 0.6622ms | 0.6136ms | 1.6297 KOps/s | 1.6790 KOps/s | |
test_values | 9.3837μs | 1.8690μs | 535.0328 KOps/s | 515.3924 KOps/s | |
test_values_nested | 67.7520μs | 42.5153μs | 23.5210 KOps/s | 23.4008 KOps/s | |
test_values_nested_locked | 73.1310μs | 45.0099μs | 22.2173 KOps/s | 23.3180 KOps/s | |
test_values_nested_leaf | 61.7320μs | 37.0103μs | 27.0195 KOps/s | 26.8366 KOps/s | |
test_values_stack_nested | 1.1693ms | 1.1239ms | 889.7382 Ops/s | 897.4386 Ops/s | |
test_values_stack_nested_leaf | 1.3023ms | 1.1131ms | 898.4022 Ops/s | 910.1991 Ops/s | |
test_values_stack_nested_locked | 0.5311ms | 0.4887ms | 2.0461 KOps/s | 2.0994 KOps/s | |
test_membership | 6.6060μs | 0.9188μs | 1.0884 MOps/s | 1.0515 MOps/s | |
test_membership_nested | 17.4510μs | 2.2498μs | 444.4746 KOps/s | 450.5587 KOps/s | |
test_membership_nested_leaf | 16.4505μs | 2.1305μs | 469.3814 KOps/s | 468.5226 KOps/s | |
test_membership_stacked_nested | 35.6700μs | 11.1026μs | 90.0691 KOps/s | 92.5335 KOps/s | |
test_membership_stacked_nested_leaf | 55.5800μs | 10.9957μs | 90.9446 KOps/s | 92.3534 KOps/s | |
test_membership_nested_last | 17.6000μs | 4.6247μs | 216.2302 KOps/s | 217.4369 KOps/s | |
test_membership_nested_leaf_last | 21.5310μs | 4.6345μs | 215.7719 KOps/s | 216.2116 KOps/s | |
test_membership_stacked_nested_last | 0.1848ms | 0.1340ms | 7.4604 KOps/s | 7.4725 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.1510μs | 12.9977μs | 76.9369 KOps/s | 77.6306 KOps/s | |
test_nested_getleaf | 33.5210μs | 8.4495μs | 118.3498 KOps/s | 119.7695 KOps/s | |
test_nested_get | 30.1210μs | 7.9425μs | 125.9045 KOps/s | 125.6443 KOps/s | |
test_stacked_getleaf | 0.6401ms | 0.5576ms | 1.7933 KOps/s | 1.7779 KOps/s | |
test_stacked_get | 0.6140ms | 0.5289ms | 1.8906 KOps/s | 1.8570 KOps/s | |
test_nested_getitemleaf | 29.2110μs | 8.4768μs | 117.9697 KOps/s | 118.2978 KOps/s | |
test_nested_getitem | 29.6500μs | 7.9774μs | 125.3543 KOps/s | 125.8414 KOps/s | |
test_stacked_getitemleaf | 0.6326ms | 0.5682ms | 1.7599 KOps/s | 1.7752 KOps/s | |
test_stacked_getitem | 0.5684ms | 0.5378ms | 1.8595 KOps/s | 1.8810 KOps/s | |
test_lock_nested | 3.1552ms | 0.5490ms | 1.8214 KOps/s | 2.2302 KOps/s | |
test_lock_stack_nested | 80.3645ms | 7.1062ms | 140.7213 Ops/s | 152.0633 Ops/s | |
test_unlock_nested | 2.3505ms | 0.4241ms | 2.3580 KOps/s | 2.0651 KOps/s | |
test_unlock_stack_nested | 66.1733ms | 6.1468ms | 162.6867 Ops/s | 138.6734 Ops/s | |
test_flatten_speed | 0.2256ms | 0.1867ms | 5.3555 KOps/s | 5.3728 KOps/s | |
test_unflatten_speed | 0.3928ms | 0.3637ms | 2.7491 KOps/s | 2.7551 KOps/s | |
test_common_ops | 1.1094ms | 0.5894ms | 1.6965 KOps/s | 1.7166 KOps/s | |
test_creation | 31.6000μs | 2.0941μs | 477.5239 KOps/s | 519.0920 KOps/s | |
test_creation_empty | 37.1300μs | 7.2860μs | 137.2500 KOps/s | 143.1736 KOps/s | |
test_creation_nested_1 | 29.7200μs | 9.6585μs | 103.5362 KOps/s | 106.1102 KOps/s | |
test_creation_nested_2 | 41.8300μs | 12.3710μs | 80.8341 KOps/s | 83.9276 KOps/s | |
test_clone | 81.7420μs | 14.1076μs | 70.8840 KOps/s | 72.2927 KOps/s | |
test_getitem[int] | 29.7810μs | 12.1551μs | 82.2697 KOps/s | 84.7239 KOps/s | |
test_getitem[slice_int] | 70.4810μs | 24.4006μs | 40.9827 KOps/s | 43.7938 KOps/s | |
test_getitem[range] | 0.2341ms | 40.2054μs | 24.8723 KOps/s | 27.7200 KOps/s | |
test_getitem[tuple] | 42.6600μs | 20.3169μs | 49.2201 KOps/s | 53.0606 KOps/s | |
test_getitem[list] | 0.2654ms | 34.1106μs | 29.3164 KOps/s | 29.2232 KOps/s | |
test_setitem_dim[int] | 43.5800μs | 26.3220μs | 37.9910 KOps/s | 39.9268 KOps/s | |
test_setitem_dim[slice_int] | 63.5100μs | 44.5592μs | 22.4421 KOps/s | 22.5297 KOps/s | |
test_setitem_dim[range] | 78.6420μs | 59.9423μs | 16.6827 KOps/s | 16.3380 KOps/s | |
test_setitem_dim[tuple] | 55.2210μs | 37.3510μs | 26.7730 KOps/s | 26.2396 KOps/s | |
test_setitem | 93.3520μs | 17.7533μs | 56.3275 KOps/s | 56.6182 KOps/s | |
test_set | 89.3610μs | 17.1413μs | 58.3388 KOps/s | 58.9610 KOps/s | |
test_set_shared | 2.9695ms | 0.1009ms | 9.9066 KOps/s | 10.1046 KOps/s | |
test_update | 77.8410μs | 18.6330μs | 53.6683 KOps/s | 54.0508 KOps/s | |
test_update_nested | 0.1070ms | 25.4467μs | 39.2978 KOps/s | 40.5131 KOps/s | |
test_set_nested | 91.1420μs | 18.6137μs | 53.7238 KOps/s | 55.5278 KOps/s | |
test_set_nested_new | 77.4110μs | 22.7350μs | 43.9851 KOps/s | 45.5653 KOps/s | |
test_select | 0.1099ms | 45.9232μs | 21.7755 KOps/s | 23.1037 KOps/s | |
test_to | 71.6510μs | 51.4066μs | 19.4528 KOps/s | 18.8834 KOps/s | |
test_to_nonblocking | 67.0810μs | 32.7775μs | 30.5087 KOps/s | 30.2559 KOps/s | |
test_unbind_speed | 0.3917ms | 0.3556ms | 2.8124 KOps/s | 2.9233 KOps/s | |
test_unbind_speed_stack0 | 62.0583ms | 4.2923ms | 232.9766 Ops/s | 197.8716 Ops/s | |
test_unbind_speed_stack1 | 1.9365μs | 0.5209μs | 1.9198 MOps/s | 1.8858 MOps/s | |
test_split | 52.9582ms | 1.7692ms | 565.2403 Ops/s | 564.5365 Ops/s | |
test_chunk | 52.7456ms | 1.7466ms | 572.5351 Ops/s | 573.1644 Ops/s | |
test_creation[device0] | 0.5319ms | 0.3078ms | 3.2485 KOps/s | 3.2576 KOps/s | |
test_creation[device1] | 0.8392ms | 0.3120ms | 3.2049 KOps/s | 3.2103 KOps/s | |
test_creation_from_tensor | 56.6480ms | 0.3636ms | 2.7504 KOps/s | 2.9731 KOps/s | |
test_add_one[memmap_tensor0] | 61.1310μs | 22.2941μs | 44.8550 KOps/s | 42.8934 KOps/s | |
test_add_one[memmap_tensor1] | 0.2041ms | 69.9699μs | 14.2919 KOps/s | 13.8670 KOps/s | |
test_contiguous[memmap_tensor0] | 36.4810μs | 5.6948μs | 175.5985 KOps/s | 174.8387 KOps/s | |
test_contiguous[memmap_tensor1] | 47.5110μs | 20.8428μs | 47.9782 KOps/s | 45.4825 KOps/s | |
test_stack[memmap_tensor0] | 36.9020μs | 18.7849μs | 53.2344 KOps/s | 53.3996 KOps/s | |
test_stack[memmap_tensor1] | 0.1533ms | 70.9613μs | 14.0922 KOps/s | 14.0654 KOps/s | |
test_memmaptd_index | 0.4603ms | 0.4115ms | 2.4303 KOps/s | 2.3883 KOps/s | |
test_memmaptd_index_astensor | 0.5258ms | 0.4689ms | 2.1326 KOps/s | 2.0888 KOps/s | |
test_memmaptd_index_op | 0.7801ms | 0.7146ms | 1.3995 KOps/s | 1.3784 KOps/s | |
test_reshape_pytree | 36.5010μs | 20.5431μs | 48.6782 KOps/s | 48.6298 KOps/s | |
test_reshape_td | 59.4000μs | 29.2373μs | 34.2028 KOps/s | 34.9743 KOps/s | |
test_view_pytree | 45.7310μs | 20.4657μs | 48.8622 KOps/s | 49.0798 KOps/s | |
test_view_td | 19.2400μs | 4.0409μs | 247.4708 KOps/s | 249.1835 KOps/s | |
test_unbind_pytree | 52.3620μs | 25.2214μs | 39.6488 KOps/s | 39.5259 KOps/s | |
test_unbind_td | 80.6910μs | 55.1321μs | 18.1383 KOps/s | 18.5430 KOps/s | |
test_split_pytree | 38.9420μs | 23.2892μs | 42.9384 KOps/s | 42.4578 KOps/s | |
test_split_td | 70.5220μs | 42.9314μs | 23.2930 KOps/s | 23.9377 KOps/s | |
test_add_pytree | 77.1310μs | 30.4831μs | 32.8051 KOps/s | 32.9500 KOps/s | |
test_add_td | 74.7710μs | 41.5073μs | 24.0922 KOps/s | 24.1054 KOps/s | |
test_distributed | 18.6200μs | 5.4551μs | 183.3132 KOps/s | 187.9182 KOps/s | |
test_tdmodule | 90.5620μs | 16.7608μs | 59.6631 KOps/s | 60.3500 KOps/s | |
test_tdmodule_dispatch | 0.2176ms | 33.0225μs | 30.2824 KOps/s | 31.2168 KOps/s | |
test_tdseq | 42.3400μs | 20.0394μs | 49.9018 KOps/s | 50.1430 KOps/s | |
test_tdseq_dispatch | 55.5800μs | 36.4281μs | 27.4514 KOps/s | 28.4315 KOps/s | |
test_instantiation_functorch | 1.8756ms | 1.6608ms | 602.1100 Ops/s | 599.2819 Ops/s | |
test_instantiation_td | 1.6378ms | 1.1652ms | 858.2399 Ops/s | 870.4144 Ops/s | |
test_exec_functorch | 0.1912ms | 0.1509ms | 6.6249 KOps/s | 6.5954 KOps/s | |
test_exec_functional_call | 0.2091ms | 0.1440ms | 6.9458 KOps/s | 6.8469 KOps/s | |
test_exec_td | 0.1788ms | 0.1355ms | 7.3810 KOps/s | 7.2366 KOps/s | |
test_exec_td_decorator | 0.7950ms | 0.1710ms | 5.8471 KOps/s | 5.5294 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.5317ms | 1.0733ms | 931.6941 Ops/s | 955.4466 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.6488ms | 0.5894ms | 1.6965 KOps/s | 1.6805 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.2178ms | 0.9723ms | 1.0285 KOps/s | 1.0379 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5734ms | 0.5320ms | 1.8795 KOps/s | 1.8956 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.7216ms | 2.0202ms | 494.9995 Ops/s | 497.4890 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1391ms | 0.6414ms | 1.5591 KOps/s | 1.5421 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.1416ms | 1.7185ms | 581.9097 Ops/s | 577.0918 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0445ms | 0.5335ms | 1.8743 KOps/s | 1.8397 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.6679ms | 12.2732ms | 81.4782 Ops/s | 81.3134 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.0139ms | 7.8758ms | 126.9720 Ops/s | 125.1492 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.3404ms | 12.0005ms | 83.3295 Ops/s | 82.3915 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.0894ms | 7.8053ms | 128.1183 Ops/s | 126.1403 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 63.4798ms | 62.5220ms | 15.9944 Ops/s | 15.9309 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 21.1333ms | 18.9633ms | 52.7335 Ops/s | 48.1198 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 0.1351s | 62.0451ms | 16.1173 Ops/s | 17.4227 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.8002ms | 18.5889ms | 53.7956 Ops/s | 48.7598 Ops/s |
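
The Change column is empty in both tables as captured here. A minimal sketch of how a relative change could be derived from one row follows. It assumes that "Change" means the difference between "Ops" and "Ops on Repo HEAD" relative to "Ops on Repo HEAD"; the page does not state this. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
# Unit multipliers for the throughput cells as they are printed in the tables.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '61.5545 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNITS[unit]


def relative_change(row: str):
    """Return (test name, fractional change of 'Ops' vs 'Ops on Repo HEAD').

    Assumption: column order is Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    ops_head = parse_ops(cells[4])
    return name, (ops - ops_head) / ops_head


# Two rows copied verbatim from the first table; the second mixes Ops/s and KOps/s.
rows = [
    "test_lock_stack_nested | 10.1125ms | 5.2267ms | 191.3243 Ops/s | 121.1546 Ops/s | |",
    "test_instantiation_td | 1.6295ms | 1.0200ms | 980.3782 Ops/s | 1.0038 KOps/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

Normalising the units first matters because a single row can report one column in Ops/s and the other in KOps/s. Large swings on the lock/unlock rows coincide with Max values far above the Mean, so they may reflect run-to-run noise rather than an effect of the change.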
Labels: CLA Signed, Performance
No description provided.