[Feature] Unbind and stack tds in map with chunksize=0 #589
Merged
facebook-github-bot added the CLA Signed label on Dec 4, 2023. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 26.2490μs | 16.0729μs | 62.2166 KOps/s | 63.7023 KOps/s | |
test_plain_set_stack_nested | 0.1775ms | 0.1424ms | 7.0240 KOps/s | 7.0553 KOps/s | |
test_plain_set_nested_inplace | 41.8070μs | 18.1678μs | 55.0425 KOps/s | 57.1476 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3347ms | 0.1768ms | 5.6576 KOps/s | 5.7121 KOps/s | |
test_items | 23.5930μs | 2.4470μs | 408.6644 KOps/s | 387.5942 KOps/s | |
test_items_nested | 0.3707ms | 0.2744ms | 3.6445 KOps/s | 3.6757 KOps/s | |
test_items_nested_locked | 1.2992ms | 0.2692ms | 3.7143 KOps/s | 3.7146 KOps/s | |
test_items_nested_leaf | 0.7269ms | 0.1639ms | 6.1007 KOps/s | 5.9352 KOps/s | |
test_items_stack_nested | 1.5978ms | 1.4865ms | 672.7170 Ops/s | 676.2021 Ops/s | |
test_items_stack_nested_leaf | 1.4773ms | 1.3594ms | 735.6319 Ops/s | 744.9070 Ops/s | |
test_items_stack_nested_locked | 1.8572ms | 0.7654ms | 1.3065 KOps/s | 1.3103 KOps/s | |
test_keys | 20.5280μs | 3.8278μs | 261.2499 KOps/s | 260.7809 KOps/s | |
test_keys_nested | 0.5008ms | 0.1410ms | 7.0902 KOps/s | 6.6272 KOps/s | |
test_keys_nested_locked | 0.3323ms | 0.1403ms | 7.1287 KOps/s | 7.1299 KOps/s | |
test_keys_nested_leaf | 0.4015ms | 0.1383ms | 7.2326 KOps/s | 7.1290 KOps/s | |
test_keys_stack_nested | 1.6785ms | 1.3994ms | 714.6050 Ops/s | 713.4334 Ops/s | |
test_keys_stack_nested_leaf | 2.0549ms | 1.4007ms | 713.9338 Ops/s | 712.6524 Ops/s | |
test_keys_stack_nested_locked | 0.8215ms | 0.6662ms | 1.5010 KOps/s | 1.4891 KOps/s | |
test_values | 7.4288μs | 1.1639μs | 859.1852 KOps/s | 830.3258 KOps/s | |
test_values_nested | 94.1250μs | 48.9625μs | 20.4238 KOps/s | 20.1741 KOps/s | |
test_values_nested_locked | 0.1127ms | 49.4222μs | 20.2338 KOps/s | 20.0689 KOps/s | |
test_values_nested_leaf | 87.8340μs | 43.9645μs | 22.7456 KOps/s | 22.7732 KOps/s | |
test_values_stack_nested | 2.0071ms | 1.1981ms | 834.6320 Ops/s | 832.1607 Ops/s | |
test_values_stack_nested_leaf | 1.8801ms | 1.1878ms | 841.8999 Ops/s | 842.2989 Ops/s | |
test_values_stack_nested_locked | 0.8788ms | 0.5100ms | 1.9609 KOps/s | 1.9671 KOps/s | |
test_membership | 16.2800μs | 1.3288μs | 752.5444 KOps/s | 744.3440 KOps/s | |
test_membership_nested | 19.6970μs | 2.7912μs | 358.2720 KOps/s | 352.1577 KOps/s | |
test_membership_nested_leaf | 20.7880μs | 2.8021μs | 356.8703 KOps/s | 350.9733 KOps/s | |
test_membership_stacked_nested | 36.8290μs | 11.6346μs | 85.9502 KOps/s | 83.2797 KOps/s | |
test_membership_stacked_nested_leaf | 64.4100μs | 11.7014μs | 85.4602 KOps/s | 83.5885 KOps/s | |
test_membership_nested_last | 33.7120μs | 5.9290μs | 168.6618 KOps/s | 164.3784 KOps/s | |
test_membership_nested_leaf_last | 28.2620μs | 5.8695μs | 170.3735 KOps/s | 171.7498 KOps/s | |
test_membership_stacked_nested_last | 0.2180ms | 0.1677ms | 5.9637 KOps/s | 5.9617 KOps/s | |
test_membership_stacked_nested_leaf_last | 38.3310μs | 13.7737μs | 72.6020 KOps/s | 71.1527 KOps/s | |
test_nested_getleaf | 34.5840μs | 10.5811μs | 94.5084 KOps/s | 94.6763 KOps/s | |
test_nested_get | 30.1970μs | 10.1722μs | 98.3071 KOps/s | 99.4247 KOps/s | |
test_stacked_getleaf | 1.1799ms | 0.6420ms | 1.5575 KOps/s | 1.5448 KOps/s | |
test_stacked_get | 1.2336ms | 0.6104ms | 1.6383 KOps/s | 1.6144 KOps/s | |
test_nested_getitemleaf | 31.6690μs | 10.5114μs | 95.1351 KOps/s | 93.6094 KOps/s | |
test_nested_getitem | 32.3800μs | 9.9975μs | 100.0250 KOps/s | 100.0532 KOps/s | |
test_stacked_getitemleaf | 0.7902ms | 0.6483ms | 1.5426 KOps/s | 1.5428 KOps/s | |
test_stacked_getitem | 1.0540ms | 0.6139ms | 1.6290 KOps/s | 1.6222 KOps/s | |
test_lock_nested | 59.2189ms | 0.6206ms | 1.6113 KOps/s | 1.7681 KOps/s | |
test_lock_stack_nested | 9.2049ms | 5.0943ms | 196.2986 Ops/s | 192.8540 Ops/s | |
test_unlock_nested | 1.0874ms | 0.4474ms | 2.2351 KOps/s | 2.2447 KOps/s | |
test_unlock_stack_nested | 70.0117ms | 6.7504ms | 148.1397 Ops/s | 144.0595 Ops/s | |
test_flatten_speed | 0.4067ms | 0.2682ms | 3.7287 KOps/s | 3.7361 KOps/s | |
test_unflatten_speed | 0.9001ms | 0.4606ms | 2.1713 KOps/s | 2.1758 KOps/s | |
test_common_ops | 3.8868ms | 0.6753ms | 1.4809 KOps/s | 1.4519 KOps/s | |
test_creation | 20.9990μs | 2.4745μs | 404.1141 KOps/s | 391.7043 KOps/s | |
test_creation_empty | 21.8000μs | 8.0960μs | 123.5176 KOps/s | 119.8698 KOps/s | |
test_creation_nested_1 | 48.5610μs | 11.3994μs | 87.7240 KOps/s | 86.5649 KOps/s | |
test_creation_nested_2 | 36.0070μs | 14.8855μs | 67.1793 KOps/s | 65.5320 KOps/s | |
test_clone | 55.8040μs | 13.9470μs | 71.6998 KOps/s | 71.6404 KOps/s | |
test_getitem[int] | 43.2400μs | 13.2060μs | 75.7229 KOps/s | 75.2974 KOps/s | |
test_getitem[slice_int] | 63.2080μs | 26.3672μs | 37.9259 KOps/s | 38.6658 KOps/s | |
test_getitem[range] | 82.0130μs | 43.7995μs | 22.8313 KOps/s | 22.1563 KOps/s | |
test_getitem[tuple] | 61.2840μs | 20.9084μs | 47.8278 KOps/s | 46.9783 KOps/s | |
test_getitem[list] | 0.1768ms | 38.6786μs | 25.8541 KOps/s | 25.0042 KOps/s | |
test_setitem_dim[int] | 47.7290μs | 27.6988μs | 36.1026 KOps/s | 36.0449 KOps/s | |
test_setitem_dim[slice_int] | 80.4300μs | 52.7041μs | 18.9739 KOps/s | 18.9997 KOps/s | |
test_setitem_dim[range] | 0.1202ms | 70.3028μs | 14.2242 KOps/s | 13.8421 KOps/s | |
test_setitem_dim[tuple] | 61.8250μs | 40.9375μs | 24.4275 KOps/s | 24.0742 KOps/s | |
test_setitem | 73.4270μs | 18.8064μs | 53.1734 KOps/s | 52.6565 KOps/s | |
test_set | 87.8640μs | 17.9769μs | 55.6269 KOps/s | 54.1487 KOps/s | |
test_set_shared | 3.2228ms | 0.1385ms | 7.2191 KOps/s | 6.8497 KOps/s | |
test_update | 93.6650μs | 19.1854μs | 52.1230 KOps/s | 49.6919 KOps/s | |
test_update_nested | 88.2740μs | 27.5824μs | 36.2550 KOps/s | 35.9558 KOps/s | |
test_set_nested | 69.7200μs | 20.5606μs | 48.6367 KOps/s | 48.5017 KOps/s | |
test_set_nested_new | 88.4550μs | 26.1056μs | 38.3060 KOps/s | 39.1683 KOps/s | |
test_select | 0.1041ms | 52.2659μs | 19.1330 KOps/s | 19.5127 KOps/s | |
test_unbind_speed | 0.7175ms | 0.3818ms | 2.6194 KOps/s | 2.6257 KOps/s | |
test_unbind_speed_stack0 | 66.2016ms | 4.7034ms | 212.6120 Ops/s | 210.7403 Ops/s | |
test_unbind_speed_stack1 | 2.0053μs | 0.6445μs | 1.5517 MOps/s | 1.5887 MOps/s | |
test_split | 56.2655ms | 1.7791ms | 562.0754 Ops/s | 560.0625 Ops/s | |
test_chunk | 58.7017ms | 1.7535ms | 570.2729 Ops/s | 567.4537 Ops/s | |
test_creation[device0] | 0.4948ms | 0.2933ms | 3.4091 KOps/s | 3.4243 KOps/s | |
test_creation_from_tensor | 3.5844ms | 0.3301ms | 3.0298 KOps/s | 3.0163 KOps/s | |
test_add_one[memmap_tensor0] | 70.3510μs | 25.4034μs | 39.3649 KOps/s | 39.6142 KOps/s | |
test_contiguous[memmap_tensor0] | 25.5080μs | 5.8870μs | 169.8655 KOps/s | 162.0815 KOps/s | |
test_stack[memmap_tensor0] | 60.3430μs | 19.4528μs | 51.4065 KOps/s | 50.0121 KOps/s | |
test_memmaptd_index | 0.3635ms | 0.2081ms | 4.8045 KOps/s | 4.8451 KOps/s | |
test_memmaptd_index_astensor | 0.5307ms | 0.2673ms | 3.7413 KOps/s | 3.7326 KOps/s | |
test_memmaptd_index_op | 0.8038ms | 0.5046ms | 1.9816 KOps/s | 1.9458 KOps/s | |
test_reshape_pytree | 0.2339ms | 23.6248μs | 42.3284 KOps/s | 42.2182 KOps/s | |
test_reshape_td | 71.2830μs | 32.1026μs | 31.1502 KOps/s | 30.2129 KOps/s | |
test_view_pytree | 0.4033ms | 23.1696μs | 43.1599 KOps/s | 42.4550 KOps/s | |
test_view_td | 17.9630μs | 4.9004μs | 204.0662 KOps/s | 201.8268 KOps/s | |
test_unbind_pytree | 87.0720μs | 26.4076μs | 37.8679 KOps/s | 37.5133 KOps/s | |
test_unbind_td | 0.1458ms | 59.5665μs | 16.7880 KOps/s | 16.5535 KOps/s | |
test_split_pytree | 60.6430μs | 26.4421μs | 37.8185 KOps/s | 37.6835 KOps/s | |
test_split_td | 0.1002ms | 46.3669μs | 21.5671 KOps/s | 21.0127 KOps/s | |
test_add_pytree | 84.8590μs | 32.2172μs | 31.0393 KOps/s | 30.9275 KOps/s | |
test_add_td | 0.1069ms | 45.6218μs | 21.9194 KOps/s | 21.9585 KOps/s | |
test_distributed | 36.1470μs | 6.3103μs | 158.4712 KOps/s | 166.1509 KOps/s | |
test_tdmodule | 1.7198ms | 22.7683μs | 43.9207 KOps/s | 46.4709 KOps/s | |
test_tdmodule_dispatch | 0.1754ms | 38.7191μs | 25.8271 KOps/s | 25.6657 KOps/s | |
test_tdseq | 43.8920μs | 24.4628μs | 40.8784 KOps/s | 42.0138 KOps/s | |
test_tdseq_dispatch | 0.1392ms | 43.5245μs | 22.9755 KOps/s | 23.0294 KOps/s | |
test_instantiation_functorch | 1.5215ms | 1.3195ms | 757.8363 Ops/s | 755.8897 Ops/s | |
test_instantiation_td | 1.5487ms | 1.0390ms | 962.5052 Ops/s | 893.0799 Ops/s | |
test_exec_functorch | 0.2219ms | 0.1590ms | 6.2908 KOps/s | 6.2716 KOps/s | |
test_exec_functional_call | 0.3994ms | 0.1484ms | 6.7383 KOps/s | 6.6676 KOps/s | |
test_exec_td | 0.2130ms | 0.1423ms | 7.0268 KOps/s | 6.0609 KOps/s | |
test_exec_td_decorator | 0.7598ms | 0.1767ms | 5.6609 KOps/s | 5.5551 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4205ms | 0.8776ms | 1.1395 KOps/s | 1.1188 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.5789ms | 0.4642ms | 2.1544 KOps/s | 2.1523 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0678ms | 0.7612ms | 1.3138 KOps/s | 1.2926 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7795ms | 0.3891ms | 2.5699 KOps/s | 2.6281 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.7122ms | 1.7416ms | 574.1725 Ops/s | 566.7688 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9776ms | 0.5173ms | 1.9331 KOps/s | 1.9446 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.9201ms | 1.4516ms | 688.8769 Ops/s | 673.8862 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7772ms | 0.3988ms | 2.5076 KOps/s | 2.5132 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.6691ms | 12.5737μs | 79.5308 KOps/s | 79.8783 KOps/s | |
test_plain_set_stack_nested | 0.1436ms | 0.1157ms | 8.6427 KOps/s | 8.3623 KOps/s | |
test_plain_set_nested_inplace | 39.9310μs | 13.8574μs | 72.1634 KOps/s | 72.2631 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1770ms | 0.1425ms | 7.0179 KOps/s | 7.0398 KOps/s | |
test_items | 24.1710μs | 4.6637μs | 214.4198 KOps/s | 214.8345 KOps/s | |
test_items_nested | 0.3891ms | 0.3379ms | 2.9597 KOps/s | 2.9537 KOps/s | |
test_items_nested_locked | 0.3927ms | 0.3384ms | 2.9552 KOps/s | 2.9188 KOps/s | |
test_items_nested_leaf | 0.2402ms | 0.1980ms | 5.0496 KOps/s | 4.9813 KOps/s | |
test_items_stack_nested | 1.5848ms | 1.4793ms | 676.0038 Ops/s | 677.8422 Ops/s | |
test_items_stack_nested_leaf | 1.3977ms | 1.3060ms | 765.7243 Ops/s | 764.3955 Ops/s | |
test_items_stack_nested_locked | 0.8737ms | 0.8145ms | 1.2278 KOps/s | 1.1841 KOps/s | |
test_keys | 27.6110μs | 4.5786μs | 218.4095 KOps/s | 218.9096 KOps/s | |
test_keys_nested | 3.5756ms | 90.8047μs | 11.0126 KOps/s | 11.0778 KOps/s | |
test_keys_nested_locked | 0.1170ms | 90.4965μs | 11.0501 KOps/s | 11.1765 KOps/s | |
test_keys_nested_leaf | 41.9019ms | 86.3293μs | 11.5836 KOps/s | 12.2639 KOps/s | |
test_keys_stack_nested | 1.3663ms | 1.2895ms | 775.5183 Ops/s | 778.7913 Ops/s | |
test_keys_stack_nested_leaf | 1.3423ms | 1.2757ms | 783.8609 Ops/s | 780.8083 Ops/s | |
test_keys_stack_nested_locked | 0.7013ms | 0.6186ms | 1.6166 KOps/s | 1.5852 KOps/s | |
test_values | 9.0473μs | 1.8898μs | 529.1602 KOps/s | 529.3324 KOps/s | |
test_values_nested | 62.9330μs | 42.7495μs | 23.3921 KOps/s | 23.4151 KOps/s | |
test_values_nested_locked | 67.3220μs | 45.0829μs | 22.1814 KOps/s | 22.0521 KOps/s | |
test_values_nested_leaf | 57.7520μs | 37.2287μs | 26.8610 KOps/s | 26.8618 KOps/s | |
test_values_stack_nested | 1.2581ms | 1.1284ms | 886.1831 Ops/s | 880.4665 Ops/s | |
test_values_stack_nested_leaf | 1.1970ms | 1.1302ms | 884.8298 Ops/s | 880.2564 Ops/s | |
test_values_stack_nested_locked | 0.5912ms | 0.4977ms | 2.0093 KOps/s | 1.9550 KOps/s | |
test_membership | 5.2762μs | 0.9412μs | 1.0624 MOps/s | 1.0638 MOps/s | |
test_membership_nested | 16.9610μs | 2.1618μs | 462.5743 KOps/s | 453.4986 KOps/s | |
test_membership_nested_leaf | 11.6740μs | 2.0255μs | 493.7067 KOps/s | 478.0759 KOps/s | |
test_membership_stacked_nested | 45.6710μs | 10.8029μs | 92.5673 KOps/s | 90.7485 KOps/s | |
test_membership_stacked_nested_leaf | 28.5410μs | 10.8234μs | 92.3928 KOps/s | 92.2739 KOps/s | |
test_membership_nested_last | 32.7720μs | 4.5251μs | 220.9911 KOps/s | 219.0454 KOps/s | |
test_membership_nested_leaf_last | 20.2910μs | 4.5165μs | 221.4099 KOps/s | 219.5337 KOps/s | |
test_membership_stacked_nested_last | 0.1683ms | 0.1335ms | 7.4929 KOps/s | 7.4293 KOps/s | |
test_membership_stacked_nested_leaf_last | 41.4020μs | 12.6537μs | 79.0280 KOps/s | 79.4016 KOps/s | |
test_nested_getleaf | 28.7010μs | 8.3534μs | 119.7113 KOps/s | 117.3702 KOps/s | |
test_nested_get | 29.5710μs | 7.9157μs | 126.3314 KOps/s | 124.3579 KOps/s | |
test_stacked_getleaf | 0.6228ms | 0.5622ms | 1.7788 KOps/s | 1.7824 KOps/s | |
test_stacked_get | 0.6347ms | 0.5455ms | 1.8333 KOps/s | 1.8834 KOps/s | |
test_nested_getitemleaf | 31.8010μs | 8.4367μs | 118.5299 KOps/s | 118.3964 KOps/s | |
test_nested_getitem | 28.5220μs | 7.9544μs | 125.7164 KOps/s | 123.8566 KOps/s | |
test_stacked_getitemleaf | 0.6305ms | 0.5673ms | 1.7628 KOps/s | 1.7832 KOps/s | |
test_stacked_getitem | 0.5921ms | 0.5418ms | 1.8458 KOps/s | 1.8993 KOps/s | |
test_lock_nested | 3.2547ms | 0.5524ms | 1.8103 KOps/s | 1.8270 KOps/s | |
test_lock_stack_nested | 81.3943ms | 7.2070ms | 138.7540 Ops/s | 137.9253 Ops/s | |
test_unlock_nested | 2.3936ms | 0.4269ms | 2.3426 KOps/s | 2.3490 KOps/s | |
test_unlock_stack_nested | 67.4711ms | 6.2365ms | 160.3453 Ops/s | 163.2099 Ops/s | |
test_flatten_speed | 0.2338ms | 0.1865ms | 5.3628 KOps/s | 5.3392 KOps/s | |
test_unflatten_speed | 0.4276ms | 0.3631ms | 2.7538 KOps/s | 2.7515 KOps/s | |
test_common_ops | 1.1081ms | 0.5874ms | 1.7023 KOps/s | 1.7081 KOps/s | |
test_creation | 32.1910μs | 2.1194μs | 471.8379 KOps/s | 484.3657 KOps/s | |
test_creation_empty | 22.6120μs | 6.5741μs | 152.1114 KOps/s | 152.6133 KOps/s | |
test_creation_nested_1 | 40.9520μs | 8.8675μs | 112.7720 KOps/s | 113.6153 KOps/s | |
test_creation_nested_2 | 29.2610μs | 11.4055μs | 87.6767 KOps/s | 87.1290 KOps/s | |
test_clone | 0.1052ms | 14.0145μs | 71.3547 KOps/s | 70.0824 KOps/s | |
test_getitem[int] | 39.0110μs | 11.9711μs | 83.5346 KOps/s | 83.1167 KOps/s | |
test_getitem[slice_int] | 39.9720μs | 22.0066μs | 45.4409 KOps/s | 44.3595 KOps/s | |
test_getitem[range] | 62.4420μs | 38.5965μs | 25.9091 KOps/s | 25.5435 KOps/s | |
test_getitem[tuple] | 47.6430μs | 20.2312μs | 49.4287 KOps/s | 49.3468 KOps/s | |
test_getitem[list] | 0.2366ms | 34.6509μs | 28.8593 KOps/s | 28.7219 KOps/s | |
test_setitem_dim[int] | 41.6220μs | 25.3295μs | 39.4797 KOps/s | 40.5974 KOps/s | |
test_setitem_dim[slice_int] | 71.7340μs | 44.3288μs | 22.5587 KOps/s | 23.3350 KOps/s | |
test_setitem_dim[range] | 96.6140μs | 59.9948μs | 16.6681 KOps/s | 16.6969 KOps/s | |
test_setitem_dim[tuple] | 54.3420μs | 37.7515μs | 26.4890 KOps/s | 26.3681 KOps/s | |
test_setitem | 97.4030μs | 17.2003μs | 58.1386 KOps/s | 55.7457 KOps/s | |
test_set | 95.3430μs | 16.6128μs | 60.1946 KOps/s | 57.2660 KOps/s | |
test_set_shared | 2.7193ms | 0.1031ms | 9.7038 KOps/s | 8.7529 KOps/s | |
test_update | 87.4040μs | 17.9454μs | 55.7247 KOps/s | 53.6129 KOps/s | |
test_update_nested | 98.2330μs | 24.2706μs | 41.2021 KOps/s | 39.4117 KOps/s | |
test_set_nested | 58.9920μs | 18.7007μs | 53.4741 KOps/s | 54.3543 KOps/s | |
test_set_nested_new | 89.1730μs | 22.6595μs | 44.1317 KOps/s | 43.4220 KOps/s | |
test_select | 0.1172ms | 44.7271μs | 22.3578 KOps/s | 21.7877 KOps/s | |
test_to | 73.6630μs | 52.8666μs | 18.9156 KOps/s | 18.9183 KOps/s | |
test_to_nonblocking | 65.0120μs | 33.8022μs | 29.5839 KOps/s | 28.8592 KOps/s | |
test_unbind_speed | 0.4056ms | 0.3577ms | 2.7958 KOps/s | 2.8413 KOps/s | |
test_unbind_speed_stack0 | 62.0898ms | 4.5716ms | 218.7432 Ops/s | 235.0193 Ops/s | |
test_unbind_speed_stack1 | 2.0021μs | 0.5251μs | 1.9043 MOps/s | 1.8645 MOps/s | |
test_split | 1.9353ms | 1.6443ms | 608.1760 Ops/s | 573.2499 Ops/s | |
test_chunk | 53.3118ms | 1.7317ms | 577.4709 Ops/s | 580.1378 Ops/s | |
test_creation[device0] | 0.4149ms | 0.3055ms | 3.2736 KOps/s | 3.2825 KOps/s | |
test_creation[device1] | 55.4840ms | 0.3303ms | 3.0274 KOps/s | 3.2365 KOps/s | |
test_creation_from_tensor | 0.5687ms | 0.3328ms | 3.0052 KOps/s | 3.0040 KOps/s | |
test_add_one[memmap_tensor0] | 0.2669ms | 23.2078μs | 43.0890 KOps/s | 42.7841 KOps/s | |
test_add_one[memmap_tensor1] | 0.2107ms | 72.0568μs | 13.8779 KOps/s | 13.7892 KOps/s | |
test_contiguous[memmap_tensor0] | 26.2300μs | 5.7688μs | 173.3471 KOps/s | 177.2258 KOps/s | |
test_contiguous[memmap_tensor1] | 43.0120μs | 21.2719μs | 47.0104 KOps/s | 46.4226 KOps/s | |
test_stack[memmap_tensor0] | 49.7620μs | 18.7181μs | 53.4244 KOps/s | 52.4914 KOps/s | |
test_stack[memmap_tensor1] | 0.1524ms | 71.8149μs | 13.9247 KOps/s | 13.0345 KOps/s | |
test_memmaptd_index | 0.2982ms | 0.2278ms | 4.3908 KOps/s | 4.3376 KOps/s | |
test_memmaptd_index_astensor | 0.4017ms | 0.2850ms | 3.5082 KOps/s | 3.5030 KOps/s | |
test_memmaptd_index_op | 0.6090ms | 0.5414ms | 1.8471 KOps/s | 1.8072 KOps/s | |
test_reshape_pytree | 47.7320μs | 20.3050μs | 49.2490 KOps/s | 48.3486 KOps/s | |
test_reshape_td | 53.7530μs | 29.8198μs | 33.5348 KOps/s | 33.2508 KOps/s | |
test_view_pytree | 51.0030μs | 19.9471μs | 50.1325 KOps/s | 49.5953 KOps/s | |
test_view_td | 17.6210μs | 3.9776μs | 251.4073 KOps/s | 246.5422 KOps/s | |
test_unbind_pytree | 43.4110μs | 25.7506μs | 38.8340 KOps/s | 39.4009 KOps/s | |
test_unbind_td | 78.1730μs | 55.0730μs | 18.1577 KOps/s | 17.8616 KOps/s | |
test_split_pytree | 47.7710μs | 23.4933μs | 42.5652 KOps/s | 42.0999 KOps/s | |
test_split_td | 67.4220μs | 42.1480μs | 23.7259 KOps/s | 23.7794 KOps/s | |
test_add_pytree | 49.6620μs | 31.2964μs | 31.9525 KOps/s | 30.4371 KOps/s | |
test_add_td | 83.8530μs | 43.3759μs | 23.0543 KOps/s | 22.4379 KOps/s | |
test_distributed | 23.5910μs | 5.4655μs | 182.9673 KOps/s | 182.5189 KOps/s | |
test_tdmodule | 37.3220μs | 16.0774μs | 62.1991 KOps/s | 60.6416 KOps/s | |
test_tdmodule_dispatch | 0.1952ms | 32.0408μs | 31.2102 KOps/s | 31.0099 KOps/s | |
test_tdseq | 34.6110μs | 19.4069μs | 51.5280 KOps/s | 51.8164 KOps/s | |
test_tdseq_dispatch | 78.9430μs | 34.8034μs | 28.7328 KOps/s | 28.3503 KOps/s | |
test_instantiation_functorch | 2.2430ms | 1.6833ms | 594.0625 Ops/s | 606.2592 Ops/s | |
test_instantiation_td | 65.0585ms | 1.2454ms | 802.9470 Ops/s | 860.2498 Ops/s | |
test_exec_functorch | 0.2062ms | 0.1547ms | 6.4646 KOps/s | 6.3867 KOps/s | |
test_exec_functional_call | 0.2145ms | 0.1528ms | 6.5428 KOps/s | 6.3857 KOps/s | |
test_exec_td | 0.2311ms | 0.1461ms | 6.8466 KOps/s | 6.8478 KOps/s | |
test_exec_td_decorator | 0.8067ms | 0.1853ms | 5.3973 KOps/s | 5.4245 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1315ms | 1.0482ms | 954.0189 Ops/s | 948.6899 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.6799ms | 0.6014ms | 1.6627 KOps/s | 1.6330 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0485ms | 0.9620ms | 1.0395 KOps/s | 1.0428 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6119ms | 0.5365ms | 1.8639 KOps/s | 1.8457 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.8557ms | 1.9670ms | 508.3947 Ops/s | 506.3380 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1071ms | 0.6445ms | 1.5517 KOps/s | 1.5320 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.1351ms | 1.7017ms | 587.6359 Ops/s | 582.1585 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9265ms | 0.5464ms | 1.8303 KOps/s | 1.7905 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.4069ms | 12.2611ms | 81.5588 Ops/s | 80.3128 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.4357ms | 8.1133ms | 123.2544 Ops/s | 121.2130 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.8488ms | 12.1675ms | 82.1864 Ops/s | 80.9779 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.0498ms | 7.9955ms | 125.0709 Ops/s | 121.8171 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 63.4635ms | 62.5962ms | 15.9754 Ops/s | 14.8472 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 21.8868ms | 19.6784ms | 50.8171 Ops/s | 49.3829 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 58.3044ms | 57.0333ms | 17.5336 Ops/s | 17.4943 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.4439ms | 19.2787ms | 51.8708 Ops/s | 46.5763 Ops/s |
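
The Change column is empty in this capture. As a rough aid for reading the tables, here is a minimal sketch of how a relative change between the Ops and Ops on Repo HEAD columns could be computed. It assumes the change is defined as `(ops - head) / head`; the benchmark bot's actual formula is not shown on this page, and the helper names `parse_ops` and `relative_change` are hypothetical.

```python
# Hypothetical helpers for reading the benchmark tables; not part of the PR.
# Assumption: change = (ops - head) / head, with both values in ops per second.

UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a table cell such as '62.2166 KOps/s' to ops per second."""
    value, unit = cell.split()
    return float(value) * UNITS[unit]


def relative_change(ops: str, head: str) -> float:
    """Relative throughput change of this PR against the repository HEAD."""
    base = parse_ops(head)
    return (parse_ops(ops) - base) / base


# Example with the test_exec_td row of the first table.
change = relative_change("7.0268 KOps/s", "6.0609 KOps/s")
print(f"test_exec_td: {change:+.1%}")
```

A positive value means the PR branch measured more operations per second than HEAD. Most rows differ by only a few percent, and single-run differences of that size may be noise rather than a real effect.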
Labels: CLA Signed, enhancement (New feature or request)
This PR allows `map` to be called on single items of a tensordict. This is useful whenever we want to work independently on each element of a stack, and where the stack dimension should be discarded.
This example uses `transforms.v2` in torchvision and returns tensors of the same type as the original one, which wouldn't be possible without this PR.
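
The PR's own example is not reproduced on this page. To illustrate the idea only, the following is a conceptual sketch in plain Python lists of the two behaviours the title contrasts: with a positive chunk size the function receives sub-batches that still carry the leading (stack) dimension, whereas with `chunksize=0` the batch is unbound, the function sees one element at a time, and the results are stacked again. The names `map_sketch` and `scale_row` are hypothetical, and this is not tensordict's implementation (which works on tensordicts and can dispatch to worker processes).

```python
from typing import Any, Callable, List


def map_sketch(fn: Callable[[Any], Any], batch: List[Any], chunksize: int) -> List[Any]:
    """Conceptual stand-in for a map over the first dimension of a batch."""
    if chunksize == 0:
        # Unbind: fn receives each element without the stack dimension,
        # and the per-element results are stacked back into a list.
        return [fn(item) for item in batch]
    out: List[Any] = []
    for start in range(0, len(batch), chunksize):
        # Split: fn receives a sub-batch that keeps the stack dimension,
        # and the per-chunk results are concatenated.
        out.extend(fn(batch[start:start + chunksize]))
    return out


def scale_row(row: List[float]) -> List[float]:
    """A per-element function: it expects one row, not a batch of rows."""
    peak = max(row)
    return [x / peak for x in row]


batch = [[1.0, 2.0], [4.0, 8.0]]

# chunksize=0: scale_row is applied to each row independently.
print(map_sketch(scale_row, batch, chunksize=0))

# chunksize>0: the function must handle a whole chunk of rows itself.
print(map_sketch(lambda chunk: [scale_row(r) for r in chunk], batch, chunksize=1))
```

The per-element path is what makes functions written for a single sample usable, such as an image transform that does not expect a leading batch dimension: in the chunked path the caller has to wrap them to iterate over the chunk.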