[Feature] TensorDict.<reduction>(dim='feature') #1121
Merged
facebook-github-bot added the CLA Signed label on Dec 2, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 46.5770μs | 18.8717μs | 52.9893 KOps/s | 55.4831 KOps/s | |
test_plain_set_stack_nested | 52.1370μs | 19.2934μs | 51.8311 KOps/s | 54.5593 KOps/s | |
test_plain_set_nested_inplace | 60.5530μs | 21.4758μs | 46.5641 KOps/s | 50.4287 KOps/s | |
test_plain_set_stack_nested_inplace | 79.0390μs | 20.9721μs | 47.6824 KOps/s | 50.2772 KOps/s | |
test_items | 23.1130μs | 4.0780μs | 245.2192 KOps/s | 245.2626 KOps/s | |
test_items_nested | 0.7400ms | 0.4011ms | 2.4930 KOps/s | 2.5574 KOps/s | |
test_items_nested_locked | 0.8960ms | 0.3975ms | 2.5157 KOps/s | 2.5676 KOps/s | |
test_items_nested_leaf | 0.1298ms | 71.4034μs | 14.0049 KOps/s | 13.7712 KOps/s | |
test_items_stack_nested | 0.7316ms | 0.4001ms | 2.4993 KOps/s | 2.5478 KOps/s | |
test_items_stack_nested_leaf | 0.1411ms | 72.3471μs | 13.8223 KOps/s | 13.6395 KOps/s | |
test_items_stack_nested_locked | 0.7787ms | 0.4012ms | 2.4924 KOps/s | 2.5444 KOps/s | |
test_keys | 28.3830μs | 3.4969μs | 285.9693 KOps/s | 287.9936 KOps/s | |
test_keys_nested | 0.2226ms | 0.1377ms | 7.2613 KOps/s | 7.3402 KOps/s | |
test_keys_nested_locked | 0.7500ms | 0.1442ms | 6.9358 KOps/s | 6.9468 KOps/s | |
test_keys_nested_leaf | 1.7828ms | 0.1192ms | 8.3871 KOps/s | 8.4782 KOps/s | |
test_keys_stack_nested | 0.2195ms | 0.1369ms | 7.3055 KOps/s | 7.2193 KOps/s | |
test_keys_stack_nested_leaf | 0.2283ms | 0.1204ms | 8.3078 KOps/s | 8.4767 KOps/s | |
test_keys_stack_nested_locked | 0.2467ms | 0.1439ms | 6.9490 KOps/s | 7.0241 KOps/s | |
test_values | 36.3656μs | 1.0729μs | 932.0488 KOps/s | 968.4179 KOps/s | |
test_values_nested | 0.1023ms | 55.5018μs | 18.0174 KOps/s | 18.1351 KOps/s | |
test_values_nested_locked | 0.1386ms | 55.3207μs | 18.0764 KOps/s | 18.1391 KOps/s | |
test_values_nested_leaf | 0.1220ms | 60.4682μs | 16.5376 KOps/s | 15.8978 KOps/s | |
test_values_stack_nested | 0.2137ms | 55.7536μs | 17.9360 KOps/s | 17.3363 KOps/s | |
test_values_stack_nested_leaf | 0.1165ms | 61.0149μs | 16.3894 KOps/s | 16.4520 KOps/s | |
test_values_stack_nested_locked | 0.1071ms | 56.0927μs | 17.8276 KOps/s | 17.8695 KOps/s | |
test_membership | 14.1770μs | 0.8578μs | 1.1657 MOps/s | 1.4183 MOps/s | |
test_membership_nested | 44.2720μs | 2.9913μs | 334.3067 KOps/s | 334.7078 KOps/s | |
test_membership_nested_leaf | 25.9080μs | 2.9975μs | 333.6122 KOps/s | 337.9482 KOps/s | |
test_membership_stacked_nested | 31.6690μs | 2.9780μs | 335.7959 KOps/s | 341.9644 KOps/s | |
test_membership_stacked_nested_leaf | 25.2970μs | 2.9997μs | 333.3668 KOps/s | 339.9705 KOps/s | |
test_membership_nested_last | 30.5870μs | 4.3773μs | 228.4519 KOps/s | 235.2443 KOps/s | |
test_membership_nested_leaf_last | 25.5670μs | 4.3148μs | 231.7625 KOps/s | 234.2868 KOps/s | |
test_membership_stacked_nested_last | 23.1730μs | 4.3528μs | 229.7396 KOps/s | 195.5843 KOps/s | |
test_membership_stacked_nested_leaf_last | 36.4980μs | 4.3139μs | 231.8067 KOps/s | 203.9664 KOps/s | |
test_nested_getleaf | 54.7640μs | 11.1621μs | 89.5889 KOps/s | 93.1920 KOps/s | |
test_nested_get | 33.3920μs | 10.4509μs | 95.6857 KOps/s | 95.8663 KOps/s | |
test_stacked_getleaf | 45.2720μs | 10.9743μs | 91.1217 KOps/s | 92.0891 KOps/s | |
test_stacked_get | 52.4800μs | 10.6097μs | 94.2529 KOps/s | 96.7738 KOps/s | |
test_nested_getitemleaf | 47.7390μs | 11.4535μs | 87.3094 KOps/s | 88.1215 KOps/s | |
test_nested_getitem | 56.8660μs | 11.1778μs | 89.4628 KOps/s | 95.2570 KOps/s | |
test_stacked_getitemleaf | 48.3300μs | 11.7080μs | 85.4115 KOps/s | 89.9798 KOps/s | |
test_stacked_getitem | 45.9560μs | 10.9997μs | 90.9115 KOps/s | 93.8048 KOps/s | |
test_lock_nested | 1.8608ms | 0.4417ms | 2.2639 KOps/s | 2.2189 KOps/s | |
test_lock_stack_nested | 0.5222ms | 0.4123ms | 2.4255 KOps/s | 2.3964 KOps/s | |
test_unlock_nested | 1.0197ms | 0.3677ms | 2.7193 KOps/s | 2.7125 KOps/s | |
test_unlock_stack_nested | 0.8294ms | 0.3355ms | 2.9805 KOps/s | 2.9916 KOps/s | |
test_flatten_speed | 0.1792ms | 95.7300μs | 10.4460 KOps/s | 10.4451 KOps/s | |
test_unflatten_speed | 0.7077ms | 0.4979ms | 2.0086 KOps/s | 2.0391 KOps/s | |
test_common_ops | 6.6208ms | 0.8091ms | 1.2360 KOps/s | 1.2442 KOps/s | |
test_creation | 27.2810μs | 2.1053μs | 474.9810 KOps/s | 473.0011 KOps/s | |
test_creation_empty | 41.8470μs | 11.7958μs | 84.7760 KOps/s | 90.2984 KOps/s | |
test_creation_nested_1 | 40.9260μs | 14.4013μs | 69.4381 KOps/s | 71.6645 KOps/s | |
test_creation_nested_2 | 65.9930μs | 19.3309μs | 51.7307 KOps/s | 54.8798 KOps/s | |
test_clone | 0.2292ms | 13.2350μs | 75.5571 KOps/s | 75.3051 KOps/s | |
test_getitem[int] | 1.2972ms | 12.7915μs | 78.1772 KOps/s | 80.2240 KOps/s | |
test_getitem[slice_int] | 0.2574ms | 25.4468μs | 39.2976 KOps/s | 39.6218 KOps/s | |
test_getitem[range] | 0.1652ms | 48.2100μs | 20.7426 KOps/s | 19.3949 KOps/s | |
test_getitem[tuple] | 0.1308ms | 20.7358μs | 48.2259 KOps/s | 48.5897 KOps/s | |
test_getitem[list] | 0.4936ms | 43.4214μs | 23.0301 KOps/s | 22.9274 KOps/s | |
test_setitem_dim[int] | 49.7520μs | 25.4261μs | 39.3296 KOps/s | 38.6585 KOps/s | |
test_setitem_dim[slice_int] | 0.1184ms | 54.1522μs | 18.4665 KOps/s | 18.5033 KOps/s | |
test_setitem_dim[range] | 0.1461ms | 74.1165μs | 13.4923 KOps/s | 13.5418 KOps/s | |
test_setitem_dim[tuple] | 71.0420μs | 40.8805μs | 24.4615 KOps/s | 23.5014 KOps/s | |
test_setitem | 0.3099ms | 20.7021μs | 48.3042 KOps/s | 48.8766 KOps/s | |
test_set | 0.3106ms | 20.1488μs | 49.6307 KOps/s | 50.2381 KOps/s | |
test_set_shared | 1.2644ms | 0.1729ms | 5.7821 KOps/s | 5.9311 KOps/s | |
test_update | 0.3098ms | 23.5185μs | 42.5197 KOps/s | 44.2643 KOps/s | |
test_update_nested | 0.2762ms | 34.9260μs | 28.6320 KOps/s | 30.0645 KOps/s | |
test_update__nested | 0.4041ms | 33.3976μs | 29.9423 KOps/s | 30.6325 KOps/s | |
test_set_nested | 0.2981ms | 22.8887μs | 43.6898 KOps/s | 45.1828 KOps/s | |
test_set_nested_new | 0.3209ms | 27.9125μs | 35.8263 KOps/s | 37.4077 KOps/s | |
test_select | 0.3535ms | 43.5669μs | 22.9532 KOps/s | 23.1593 KOps/s | |
test_select_nested | 0.1219ms | 60.1683μs | 16.6201 KOps/s | 16.8466 KOps/s | |
test_exclude_nested | 0.1469ms | 78.6500μs | 12.7146 KOps/s | 13.0180 KOps/s | |
test_empty[True] | 0.5210ms | 0.3885ms | 2.5743 KOps/s | 2.6319 KOps/s | |
test_empty[False] | 7.3978μs | 1.2045μs | 830.2408 KOps/s | 820.4894 KOps/s | |
test_unbind_speed | 0.3794ms | 0.2648ms | 3.7764 KOps/s | 3.7544 KOps/s | |
test_unbind_speed_stack0 | 0.4376ms | 0.2636ms | 3.7942 KOps/s | 3.8382 KOps/s | |
test_unbind_speed_stack1 | 0.1104s | 0.7737ms | 1.2925 KOps/s | 1.4027 KOps/s | |
test_split | 0.1177s | 1.7599ms | 568.2162 Ops/s | 574.8821 Ops/s | |
test_chunk | 0.1024s | 1.7034ms | 587.0598 Ops/s | 576.9642 Ops/s | |
test_consolidate_njt[False-None] | 11.3033ms | 8.1724ms | 122.3633 Ops/s | 122.9608 Ops/s | |
test_creation[device0] | 0.2235ms | 91.2835μs | 10.9549 KOps/s | 10.7076 KOps/s | |
test_creation_from_tensor | 3.3529ms | 95.6241μs | 10.4576 KOps/s | 10.3732 KOps/s | |
test_add_one[memmap_tensor0] | 0.2851ms | 4.9199μs | 203.2578 KOps/s | 210.5007 KOps/s | |
test_contiguous[memmap_tensor0] | 12.9240μs | 0.5168μs | 1.9349 MOps/s | 1.9788 MOps/s | |
test_stack[memmap_tensor0] | 63.3080μs | 3.4474μs | 290.0722 KOps/s | 297.1332 KOps/s | |
test_memmaptd_index | 1.0759ms | 0.2430ms | 4.1153 KOps/s | 4.2785 KOps/s | |
test_memmaptd_index_astensor | 0.6596ms | 0.3192ms | 3.1329 KOps/s | 3.1299 KOps/s | |
test_memmaptd_index_op | 1.3883ms | 0.5912ms | 1.6916 KOps/s | 1.7302 KOps/s | |
test_serialize_model | 0.1197s | 0.1155s | 8.6597 Ops/s | 7.3021 Ops/s | |
test_serialize_model_pickle | 0.4541s | 0.3963s | 2.5235 Ops/s | 2.5353 Ops/s | |
test_serialize_weights | 0.1272s | 0.1171s | 8.5386 Ops/s | 8.8687 Ops/s | |
test_serialize_weights_returnearly | 0.1744s | 0.1622s | 6.1647 Ops/s | 6.4959 Ops/s | |
test_serialize_weights_pickle | 0.4934s | 0.4029s | 2.4819 Ops/s | 2.4326 Ops/s | |
test_serialize_weights_filesystem | 0.2523s | 0.1594s | 6.2739 Ops/s | 6.8807 Ops/s | |
test_serialize_model_filesystem | 0.1758s | 0.1525s | 6.5574 Ops/s | 5.9816 Ops/s | |
test_reshape_pytree | 75.0100μs | 27.1766μs | 36.7964 KOps/s | 38.0091 KOps/s | |
test_reshape_td | 74.2980μs | 33.4946μs | 29.8556 KOps/s | 30.5400 KOps/s | |
test_view_pytree | 67.9270μs | 27.1367μs | 36.8505 KOps/s | 37.8727 KOps/s | |
test_view_td | 0.1005ms | 38.5984μs | 25.9078 KOps/s | 25.9238 KOps/s | |
test_unbind_pytree | 77.8750μs | 30.1855μs | 33.1284 KOps/s | 33.0601 KOps/s | |
test_unbind_td | 0.3423ms | 38.1062μs | 26.2424 KOps/s | 25.9360 KOps/s | |
test_split_pytree | 82.1230μs | 29.6265μs | 33.7536 KOps/s | 33.9104 KOps/s | |
test_split_td | 0.5156ms | 45.5446μs | 21.9565 KOps/s | 22.3368 KOps/s | |
test_add_pytree | 87.2830μs | 35.5934μs | 28.0951 KOps/s | 28.5995 KOps/s | |
test_add_td | 0.2920ms | 61.2999μs | 16.3133 KOps/s | 18.9561 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1369ms | 62.7730μs | 15.9304 KOps/s | 16.0731 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.8180ms | 0.1613ms | 6.2014 KOps/s | 6.2287 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1292ms | 46.0911μs | 21.6961 KOps/s | 21.6929 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3203ms | 0.1186ms | 8.4334 KOps/s | 8.5667 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1414ms | 26.3945μs | 37.8867 KOps/s | 37.4193 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1169ms | 54.6961μs | 18.2828 KOps/s | 18.6063 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1951ms | 79.1964μs | 12.6268 KOps/s | 12.6780 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1326ms | 68.4893μs | 14.6008 KOps/s | 14.8018 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2185ms | 0.1067ms | 9.3752 KOps/s | 9.5616 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3339ms | 0.2010ms | 4.9758 KOps/s | 4.9685 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2402ms | 46.3198μs | 21.5891 KOps/s | 20.5725 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 1.3488ms | 62.8243μs | 15.9174 KOps/s | 16.3813 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1897ms | 0.1056ms | 9.4667 KOps/s | 9.5776 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4255ms | 0.2034ms | 4.9171 KOps/s | 4.9418 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3452ms | 0.2135ms | 4.6847 KOps/s | 4.6932 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2370ms | 0.1071ms | 9.3388 KOps/s | 9.5538 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.3378ms | 55.4747μs | 18.0262 KOps/s | 18.3785 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2087ms | 47.3735μs | 21.1089 KOps/s | 22.0484 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6124ms | 0.1587ms | 6.3019 KOps/s | 6.2167 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2141ms | 0.1037ms | 9.6420 KOps/s | 9.6784 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 75.9810μs | 21.1844μs | 47.2044 KOps/s | 45.6878 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1171ms | 60.8060μs | 16.4458 KOps/s | 16.7311 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1554ms | 80.1601μs | 12.4750 KOps/s | 12.4550 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1693ms | 68.1438μs | 14.6748 KOps/s | 14.7021 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4569ms | 0.2138ms | 4.6775 KOps/s | 4.8844 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.4963ms | 1.3164ms | 759.6627 Ops/s | 777.4561 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2790ms | 0.2088ms | 4.7884 KOps/s | 4.9412 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9125ms | 0.7881ms | 1.2688 KOps/s | 1.3006 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.7801ms | 0.4688ms | 2.1330 KOps/s | 2.2308 KOps/s | |
test_compile_assign_and_add_stack[eager] | 2.9527ms | 2.6469ms | 377.7947 Ops/s | 376.2672 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1454ms | 36.4230μs | 27.4552 KOps/s | 28.3907 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6471ms | 32.9684μs | 30.3321 KOps/s | 30.2804 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1023ms | 30.1810μs | 33.1334 KOps/s | 34.1557 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 68.1270μs | 23.4121μs | 42.7129 KOps/s | 42.8806 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1074ms | 31.0303μs | 32.2266 KOps/s | 33.7327 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 92.3820μs | 23.1553μs | 43.1867 KOps/s | 43.5209 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1286ms | 52.0418μs | 19.2153 KOps/s | 19.3468 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4216ms | 20.3126μs | 49.2306 KOps/s | 50.7910 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1032ms | 44.2754μs | 22.5859 KOps/s | 22.7526 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 85.4390μs | 18.7584μs | 53.3094 KOps/s | 53.0458 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1251ms | 44.8164μs | 22.3133 KOps/s | 22.3317 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 54.1110μs | 18.7828μs | 53.2403 KOps/s | 53.2193 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1164ms | 52.8907μs | 18.9069 KOps/s | 18.9524 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0024ms | 19.9591μs | 50.1024 KOps/s | 51.1753 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1048ms | 44.7402μs | 22.3512 KOps/s | 22.1817 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 81.2010μs | 18.6524μs | 53.6124 KOps/s | 53.5070 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1140ms | 44.7737μs | 22.3345 KOps/s | 22.2798 KOps/s | |
test_compile_indexing[int-pytree-eager] | 65.6230μs | 18.9043μs | 52.8981 KOps/s | 53.0472 KOps/s | |
test_mod_add[eager] | 0.1059ms | 35.1202μs | 28.4737 KOps/s | 28.4267 KOps/s | |
test_mod_add[compile] | 0.1324ms | 46.4675μs | 21.5204 KOps/s | 20.9646 KOps/s | |
test_mod_add[compile-overhead] | 0.1136ms | 46.1229μs | 21.6812 KOps/s | 21.1657 KOps/s | |
test_mod_wrap[eager] | 0.3914ms | 0.2237ms | 4.4708 KOps/s | 4.4673 KOps/s | |
test_mod_wrap[compile] | 0.3249ms | 0.2082ms | 4.8039 KOps/s | 4.8985 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4527ms | 0.2077ms | 4.8149 KOps/s | 4.9390 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.7092ms | 11.7953ms | 84.7798 Ops/s | 85.4702 Ops/s | |
test_mod_wrap_and_backward[compile] | 11.9642ms | 11.1120ms | 89.9925 Ops/s | 78.4122 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.2555ms | 11.5945ms | 86.2475 Ops/s | 73.5592 Ops/s | |
test_seq_add[eager] | 0.1953ms | 0.1130ms | 8.8530 KOps/s | 8.8537 KOps/s | |
test_seq_add[compile] | 0.1734ms | 61.9775μs | 16.1349 KOps/s | 15.9631 KOps/s | |
test_seq_add[compile-overhead] | 0.1295ms | 59.6622μs | 16.7610 KOps/s | 16.2060 KOps/s | |
test_seq_wrap[eager] | 0.8031ms | 0.4370ms | 2.2882 KOps/s | 2.2646 KOps/s | |
test_seq_wrap[compile] | 0.3212ms | 0.2287ms | 4.3728 KOps/s | 4.3851 KOps/s | |
test_seq_wrap[compile-overhead] | 0.5237ms | 0.2320ms | 4.3096 KOps/s | 4.4284 KOps/s | |
test_func_call_runtime[False-eager] | 1.0512ms | 0.5350ms | 1.8692 KOps/s | 1.8242 KOps/s | |
test_func_call_runtime[False-compile] | 0.8098ms | 0.4279ms | 2.3373 KOps/s | 2.3887 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8473ms | 0.4245ms | 2.3557 KOps/s | 2.3713 KOps/s | |
test_func_call_runtime[True-eager] | 0.8564ms | 0.7373ms | 1.3562 KOps/s | 1.3202 KOps/s | |
test_func_call_runtime[True-compile] | 0.5847ms | 0.4614ms | 2.1672 KOps/s | 2.1545 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9104ms | 0.4666ms | 2.1431 KOps/s | 2.1790 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9636ms | 0.5310ms | 1.8833 KOps/s | 1.8111 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7807ms | 0.4264ms | 2.3454 KOps/s | 2.3745 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5335ms | 0.4264ms | 2.3454 KOps/s | 2.3654 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.6538ms | 0.8866ms | 1.1280 KOps/s | 1.1089 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6164ms | 0.4899ms | 2.0413 KOps/s | 2.0383 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0113ms | 0.4952ms | 2.0194 KOps/s | 2.0413 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4942ms | 1.8657ms | 535.9866 Ops/s | 533.9930 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.6879ms | 0.5231ms | 1.9117 KOps/s | 1.9065 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7150ms | 0.5224ms | 1.9141 KOps/s | 1.9055 KOps/s | |
test_distributed | 0.2829ms | 0.1251ms | 7.9943 KOps/s | 7.7502 KOps/s | |
test_tdmodule | 45.4240μs | 26.2654μs | 38.0729 KOps/s | 36.8850 KOps/s | |
test_tdmodule_dispatch | 83.0140μs | 48.6217μs | 20.5669 KOps/s | 20.1671 KOps/s | |
test_tdseq | 58.4590μs | 26.3074μs | 38.0121 KOps/s | 36.9568 KOps/s | |
test_tdseq_dispatch | 93.6840μs | 51.4193μs | 19.4479 KOps/s | 19.5989 KOps/s | |
test_instantiation_functorch | 2.1840ms | 1.5592ms | 641.3375 Ops/s | 640.7101 Ops/s | |
test_exec_functorch | 0.4056ms | 0.1771ms | 5.6478 KOps/s | 5.4886 KOps/s | |
test_exec_functional_call | 0.3889ms | 0.1729ms | 5.7853 KOps/s | 5.7884 KOps/s | |
test_exec_td_decorator | 0.6060ms | 0.2273ms | 4.3992 KOps/s | 4.2427 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9227ms | 0.6500ms | 1.5384 KOps/s | 1.5386 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9091ms | 0.6489ms | 1.5412 KOps/s | 1.5074 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8015ms | 0.5225ms | 1.9140 KOps/s | 1.9077 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6931ms | 0.5193ms | 1.9256 KOps/s | 1.8906 KOps/s | |
test_to_module_speed[True] | 1.9570ms | 1.2975ms | 770.7319 Ops/s | 767.2519 Ops/s | |
test_to_module_speed[False] | 1.9838ms | 1.2634ms | 791.4843 Ops/s | 789.8247 Ops/s | |
test_tc_init | 87.9140μs | 45.8432μs | 21.8135 KOps/s | 21.0349 KOps/s | |
test_tc_init_nested | 0.1559ms | 90.1290μs | 11.0952 KOps/s | 10.8736 KOps/s | |
test_tc_first_layer_tensor | 24.2950μs | 1.5333μs | 652.2051 KOps/s | 658.3637 KOps/s | |
test_tc_first_layer_nontensor | 48.2000μs | 4.8114μs | 207.8377 KOps/s | 205.4476 KOps/s | |
test_tc_second_layer_tensor | 40.3950μs | 2.8294μs | 353.4345 KOps/s | 336.4271 KOps/s | |
test_tc_second_layer_nontensor | 27.7320μs | 6.1441μs | 162.7576 KOps/s | 163.4452 KOps/s | |
test_unbind | 0.2337s | 12.8671ms | 77.7173 Ops/s | 80.7436 Ops/s | |
test_full_like | 11.5778ms | 8.1609ms | 122.5362 Ops/s | 80.2472 Ops/s | |
test_zeros_like | 5.5324ms | 3.4075ms | 293.4716 Ops/s | 124.9594 Ops/s | |
test_ones_like | 6.2906ms | 3.7043ms | 269.9560 Ops/s | 124.2350 Ops/s | |
test_clone | 7.1438ms | 5.5349ms | 180.6729 Ops/s | 102.5211 Ops/s | |
test_squeeze | 62.1250μs | 11.5839μs | 86.3267 KOps/s | 83.9207 KOps/s | |
test_unsqueeze | 0.2966ms | 89.2230μs | 11.2079 KOps/s | 11.2675 KOps/s | |
test_split | 0.3673ms | 0.1928ms | 5.1861 KOps/s | 5.1641 KOps/s | |
test_permute | 0.3545ms | 0.2177ms | 4.5932 KOps/s | 4.6053 KOps/s | |
test_stack | 30.1093ms | 26.6396ms | 37.5382 Ops/s | 38.9251 Ops/s | |
test_cat | 31.8705ms | 27.2319ms | 36.7217 Ops/s | 39.7687 Ops/s |
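
The Change column is empty in the table above, so the difference between this branch and repo HEAD has to be read off the two Ops columns by hand. The following is a minimal sketch of one way to do that arithmetic, assuming the change is defined as `(Ops - Ops on Repo HEAD) / Ops on Repo HEAD`; that definition and the helper names `parse_ops` and `relative_change` are illustrative assumptions, not taken from the benchmark tooling. The two sample rows are copied verbatim from the table.

```python
# Unit multipliers for the throughput columns as they appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '52.9893 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, fractional change of Ops relative to Ops on Repo HEAD).

    Assumed convention: a positive value means the branch is faster than HEAD.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])   # "Ops" column
    head = parse_ops(cells[4])  # "Ops on Repo HEAD" column
    return name, (ops - head) / head


# Rows copied from the table above.
rows = [
    "test_plain_set_nested | 46.5770μs | 18.8717μs | 52.9893 KOps/s | 55.4831 KOps/s | |",
    "test_zeros_like | 5.5324ms | 3.4075ms | 293.4716 Ops/s | 124.9594 Ops/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

Normalising the units before dividing matters because some rows mix scales between runs (for example KOps/s against MOps/s). Small differences of a few percent are likely within run-to-run noise for microbenchmarks like these, so only large deviations are worth a closer look.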

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 31.5310μs | 11.4966μs | 86.9824 KOps/s | 93.4053 KOps/s | |
test_plain_set_stack_nested | 34.6700μs | 11.4790μs | 87.1155 KOps/s | 93.5376 KOps/s | |
test_plain_set_nested_inplace | 37.3010μs | 12.3116μs | 81.2241 KOps/s | 86.0871 KOps/s | |
test_plain_set_stack_nested_inplace | 45.4010μs | 12.4193μs | 80.5196 KOps/s | 86.7603 KOps/s | |
test_items | 24.9810μs | 2.9012μs | 344.6835 KOps/s | 341.9352 KOps/s | |
test_items_nested | 0.3789ms | 0.3482ms | 2.8722 KOps/s | 2.8318 KOps/s | |
test_items_nested_locked | 0.4127ms | 0.3491ms | 2.8643 KOps/s | 2.8331 KOps/s | |
test_items_nested_leaf | 80.2310μs | 57.7618μs | 17.3125 KOps/s | 17.2681 KOps/s | |
test_items_stack_nested | 0.4079ms | 0.3520ms | 2.8408 KOps/s | 2.8259 KOps/s | |
test_items_stack_nested_leaf | 86.5000μs | 58.6471μs | 17.0511 KOps/s | 16.9442 KOps/s | |
test_items_stack_nested_locked | 0.3746ms | 0.3515ms | 2.8449 KOps/s | 2.8170 KOps/s | |
test_keys | 33.1800μs | 3.7021μs | 270.1191 KOps/s | 292.1499 KOps/s | |
test_keys_nested | 99.1310μs | 70.0521μs | 14.2751 KOps/s | 14.2313 KOps/s | |
test_keys_nested_locked | 0.7041ms | 76.1489μs | 13.1322 KOps/s | 13.0255 KOps/s | |
test_keys_nested_leaf | 94.8410μs | 60.9257μs | 16.4134 KOps/s | 16.2554 KOps/s | |
test_keys_stack_nested | 0.1044ms | 70.8573μs | 14.1129 KOps/s | 14.0932 KOps/s | |
test_keys_stack_nested_leaf | 98.6610μs | 61.4114μs | 16.2836 KOps/s | 15.9430 KOps/s | |
test_keys_stack_nested_locked | 0.1062ms | 76.3142μs | 13.1037 KOps/s | 13.0925 KOps/s | |
test_values | 8.7802μs | 0.8474μs | 1.1801 MOps/s | 1.1727 MOps/s | |
test_values_nested | 55.5710μs | 31.1672μs | 32.0850 KOps/s | 32.3531 KOps/s | |
test_values_nested_locked | 59.5710μs | 32.5991μs | 30.6757 KOps/s | 30.7078 KOps/s | |
test_values_nested_leaf | 57.2810μs | 33.3587μs | 29.9772 KOps/s | 29.8760 KOps/s | |
test_values_stack_nested | 52.4110μs | 31.3702μs | 31.8773 KOps/s | 31.6551 KOps/s | |
test_values_stack_nested_leaf | 59.9010μs | 33.6816μs | 29.6898 KOps/s | 29.4339 KOps/s | |
test_values_stack_nested_locked | 0.1236ms | 33.3930μs | 29.9464 KOps/s | 30.3629 KOps/s | |
test_membership | 1.4760μs | 0.5168μs | 1.9351 MOps/s | 1.9693 MOps/s | |
test_membership_nested | 29.7910μs | 2.0460μs | 488.7680 KOps/s | 498.9046 KOps/s | |
test_membership_nested_leaf | 16.6505μs | 1.9956μs | 501.1013 KOps/s | 504.1315 KOps/s | |
test_membership_stacked_nested | 33.5200μs | 2.0808μs | 480.5782 KOps/s | 480.8136 KOps/s | |
test_membership_stacked_nested_leaf | 28.7100μs | 2.0529μs | 487.1119 KOps/s | 482.1761 KOps/s | |
test_membership_nested_last | 30.0410μs | 2.8939μs | 345.5504 KOps/s | 343.9843 KOps/s | |
test_membership_nested_leaf_last | 29.1600μs | 2.9231μs | 342.1040 KOps/s | 338.5766 KOps/s | |
test_membership_stacked_nested_last | 25.3700μs | 3.3408μs | 299.3337 KOps/s | 347.2531 KOps/s | |
test_membership_stacked_nested_leaf_last | 42.4500μs | 3.3096μs | 302.1516 KOps/s | 342.5276 KOps/s | |
test_nested_getleaf | 30.4100μs | 6.1511μs | 162.5713 KOps/s | 162.5516 KOps/s | |
test_nested_get | 31.9000μs | 5.8160μs | 171.9391 KOps/s | 171.3314 KOps/s | |
test_stacked_getleaf | 38.6210μs | 6.1392μs | 162.8890 KOps/s | 163.5920 KOps/s | |
test_stacked_get | 26.6900μs | 5.8005μs | 172.3993 KOps/s | 172.7816 KOps/s | |
test_nested_getitemleaf | 33.8000μs | 6.2288μs | 160.5440 KOps/s | 159.8520 KOps/s | |
test_nested_getitem | 35.5610μs | 5.9004μs | 169.4792 KOps/s | 168.5306 KOps/s | |
test_stacked_getitemleaf | 44.9400μs | 6.2219μs | 160.7227 KOps/s | 159.8207 KOps/s | |
test_stacked_getitem | 32.6900μs | 5.8824μs | 169.9978 KOps/s | 170.1987 KOps/s | |
test_lock_nested | 1.1588ms | 0.3596ms | 2.7807 KOps/s | 2.6915 KOps/s | |
test_lock_stack_nested | 0.3861ms | 0.3305ms | 3.0258 KOps/s | 3.0126 KOps/s | |
test_unlock_nested | 0.6233ms | 0.2990ms | 3.3448 KOps/s | 3.2833 KOps/s | |
test_unlock_stack_nested | 0.3275ms | 0.2710ms | 3.6906 KOps/s | 3.6751 KOps/s | |
test_flatten_speed | 0.1059ms | 74.1427μs | 13.4875 KOps/s | 13.2400 KOps/s | |
test_unflatten_speed | 0.3399ms | 0.3014ms | 3.3176 KOps/s | 3.2650 KOps/s | |
test_common_ops | 1.6038ms | 0.6104ms | 1.6383 KOps/s | 1.7269 KOps/s | |
test_creation | 0.1864ms | 1.4689μs | 680.7897 KOps/s | 692.1686 KOps/s | |
test_creation_empty | 30.9610μs | 9.2424μs | 108.1966 KOps/s | 134.3623 KOps/s | |
test_creation_nested_1 | 51.3110μs | 10.7231μs | 93.2569 KOps/s | 114.5620 KOps/s | |
test_creation_nested_2 | 52.3500μs | 13.1266μs | 76.1809 KOps/s | 87.9016 KOps/s | |
test_clone | 1.9153ms | 10.4191μs | 95.9774 KOps/s | 99.2732 KOps/s | |
test_getitem[int] | 1.1452ms | 10.4622μs | 95.5822 KOps/s | 95.9968 KOps/s | |
test_getitem[slice_int] | 0.1130ms | 20.1777μs | 49.5597 KOps/s | 49.5495 KOps/s | |
test_getitem[range] | 0.1344ms | 39.1556μs | 25.5391 KOps/s | 27.6726 KOps/s | |
test_getitem[tuple] | 0.1084ms | 17.7601μs | 56.3060 KOps/s | 55.8395 KOps/s | |
test_getitem[list] | 0.1311ms | 33.4153μs | 29.9264 KOps/s | 30.8307 KOps/s | |
test_setitem_dim[int] | 38.0600μs | 19.1063μs | 52.3387 KOps/s | 55.2942 KOps/s | |
test_setitem_dim[slice_int] | 86.0100μs | 39.8050μs | 25.1225 KOps/s | 27.1617 KOps/s | |
test_setitem_dim[range] | 79.4210μs | 55.9252μs | 17.8810 KOps/s | 19.3869 KOps/s | |
test_setitem_dim[tuple] | 54.1900μs | 32.5853μs | 30.6887 KOps/s | 31.8249 KOps/s | |
test_setitem | 0.1347ms | 15.4333μs | 64.7949 KOps/s | 70.4507 KOps/s | |
test_set | 0.1310ms | 14.8811μs | 67.1992 KOps/s | 72.2131 KOps/s | |
test_set_shared | 1.7101ms | 0.1442ms | 6.9327 KOps/s | 6.9767 KOps/s | |
test_update | 0.5253ms | 18.1105μs | 55.2165 KOps/s | 60.2293 KOps/s | |
test_update_nested | 0.1246ms | 23.6313μs | 42.3168 KOps/s | 44.7656 KOps/s | |
test_update__nested | 1.1115ms | 24.7471μs | 40.4087 KOps/s | 43.0207 KOps/s | |
test_set_nested | 0.1215ms | 15.7675μs | 63.4218 KOps/s | 67.1321 KOps/s | |
test_set_nested_new | 0.1257ms | 18.2529μs | 54.7858 KOps/s | 57.1524 KOps/s | |
test_select | 65.0010μs | 30.8990μs | 32.3635 KOps/s | 33.6252 KOps/s | |
test_select_nested | 75.0410μs | 41.3762μs | 24.1685 KOps/s | 23.7582 KOps/s | |
test_exclude_nested | 90.7410μs | 61.3906μs | 16.2891 KOps/s | 16.1609 KOps/s | |
test_empty[True] | 0.3158ms | 0.2730ms | 3.6625 KOps/s | 3.6559 KOps/s | |
test_empty[False] | 4.7860μs | 0.7395μs | 1.3523 MOps/s | 1.3541 MOps/s | |
test_to | 88.1810μs | 54.0930μs | 18.4867 KOps/s | 18.2072 KOps/s | |
test_to_nonblocking | 88.1110μs | 48.3879μs | 20.6663 KOps/s | 21.5577 KOps/s | |
test_unbind_speed | 0.8463ms | 0.2264ms | 4.4168 KOps/s | 4.4187 KOps/s | |
test_unbind_speed_stack0 | 0.2846ms | 0.2253ms | 4.4395 KOps/s | 4.3949 KOps/s | |
test_unbind_speed_stack1 | 93.7262ms | 0.6478ms | 1.5436 KOps/s | 1.5524 KOps/s | |
test_split | 94.1869ms | 1.6726ms | 597.8668 Ops/s | 594.9992 Ops/s | |
test_chunk | 95.4542ms | 1.5639ms | 639.4444 Ops/s | 703.1940 Ops/s | |
test_consolidate[False-None] | 2.6436ms | 2.5626ms | 390.2242 Ops/s | 355.7900 Ops/s | |
test_consolidate[default-None] | 1.7668ms | 1.6260ms | 615.0190 Ops/s | 608.0817 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8675ms | 1.6599ms | 602.4623 Ops/s | 599.1183 Ops/s | |
test_consolidate_njt[False-None] | 6.6712ms | 6.3601ms | 157.2301 Ops/s | 156.8697 Ops/s | |
test_to[False-False-None] | 1.8158ms | 1.6714ms | 598.2873 Ops/s | 588.8536 Ops/s | |
test_to[True-False-None] | 1.5252ms | 1.2386ms | 807.3327 Ops/s | 807.6642 Ops/s | |
test_to[within-False-None] | 4.1397ms | 3.9070ms | 255.9489 Ops/s | 255.0598 Ops/s | |
test_to[True-default-None] | 5.1737ms | 4.9902ms | 200.3923 Ops/s | 191.4188 Ops/s | |
test_to_njt[False-False-None] | 7.1159ms | 6.8548ms | 145.8829 Ops/s | 145.4635 Ops/s | |
test_to_njt[True-False-None] | 5.6280ms | 5.3746ms | 186.0618 Ops/s | 184.2683 Ops/s | |
test_to_njt[within-False-None] | 12.2723ms | 12.0633ms | 82.8958 Ops/s | 82.6243 Ops/s | |
test_creation[device0] | 0.5760ms | 80.7403μs | 12.3854 KOps/s | 12.2305 KOps/s | |
test_creation_from_tensor | 0.5809ms | 83.9600μs | 11.9104 KOps/s | 11.6643 KOps/s | |
test_add_one[memmap_tensor0] | 0.2023ms | 6.6569μs | 150.2199 KOps/s | 148.1060 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8470μs | 0.4020μs | 2.4879 MOps/s | 2.5048 MOps/s | |
test_stack[memmap_tensor0] | 44.1600μs | 4.4543μs | 224.5044 KOps/s | 221.8087 KOps/s | |
test_memmaptd_index | 1.8472ms | 0.2495ms | 4.0081 KOps/s | 4.0661 KOps/s | |
test_memmaptd_index_astensor | 0.9372ms | 0.3081ms | 3.2457 KOps/s | 3.3125 KOps/s | |
test_memmaptd_index_op | 0.9914ms | 0.5958ms | 1.6784 KOps/s | 1.7425 KOps/s | |
test_serialize_model | 0.1314s | 0.1307s | 7.6536 Ops/s | 7.6695 Ops/s | |
test_serialize_model_pickle | 1.3495s | 1.2120s | 0.8251 Ops/s | 0.7766 Ops/s | |
test_serialize_weights | 0.1320s | 0.1296s | 7.7165 Ops/s | 7.6932 Ops/s | |
test_serialize_weights_returnearly | 0.3194s | 53.4081ms | 18.7238 Ops/s | 16.3817 Ops/s | |
test_serialize_weights_pickle | 1.3771s | 1.2208s | 0.8191 Ops/s | 0.8177 Ops/s | |
test_reshape_pytree | 91.0310μs | 21.9282μs | 45.6035 KOps/s | 45.0086 KOps/s | |
test_reshape_td | 53.1810μs | 26.3047μs | 38.0161 KOps/s | 37.3957 KOps/s | |
test_view_pytree | 53.8700μs | 21.9226μs | 45.6151 KOps/s | 45.1700 KOps/s | |
test_view_td | 53.9010μs | 28.9888μs | 34.4960 KOps/s | 33.7731 KOps/s | |
test_unbind_pytree | 54.6610μs | 27.7623μs | 36.0201 KOps/s | 35.9487 KOps/s | |
test_unbind_td | 0.8251ms | 35.1456μs | 28.4531 KOps/s | 28.2705 KOps/s | |
test_split_pytree | 53.0510μs | 29.2452μs | 34.1937 KOps/s | 33.5727 KOps/s | |
test_split_td | 0.1741ms | 37.4609μs | 26.6945 KOps/s | 24.9914 KOps/s | |
test_add_pytree | 61.3600μs | 33.9101μs | 29.4897 KOps/s | 30.0340 KOps/s | |
test_add_td | 96.1810μs | 50.0709μs | 19.9717 KOps/s | 21.3699 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1721ms | 0.1180ms | 8.4736 KOps/s | 8.2548 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2135ms | 0.1240ms | 8.0661 KOps/s | 7.9707 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1642ms | 96.0729μs | 10.4088 KOps/s | 10.5363 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.2229ms | 0.1485ms | 6.7350 KOps/s | 6.6926 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 53.7910μs | 23.2198μs | 43.0666 KOps/s | 45.3002 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 60.3610μs | 26.3102μs | 38.0081 KOps/s | 34.2021 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2929ms | 64.6555μs | 15.4666 KOps/s | 15.3662 KOps/s | |
test_compile_copy_nested[pytree-eager] | 86.3910μs | 49.2449μs | 20.3067 KOps/s | 20.3043 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1818ms | 0.1421ms | 7.0353 KOps/s | 6.9366 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2955ms | 0.2079ms | 4.8110 KOps/s | 4.8289 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1470ms | 97.5724μs | 10.2488 KOps/s | 9.8300 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1064ms | 51.2880μs | 19.4978 KOps/s | 18.8941 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1810ms | 0.1358ms | 7.3629 KOps/s | 7.3757 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5604ms | 0.4805ms | 2.0810 KOps/s | 2.0832 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3643ms | 0.2451ms | 4.0808 KOps/s | 4.0151 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1810ms | 0.1428ms | 7.0051 KOps/s | 6.8198 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1742ms | 62.8632μs | 15.9075 KOps/s | 15.8445 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1417ms | 97.9750μs | 10.2067 KOps/s | 9.9761 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4452ms | 0.4060ms | 2.4631 KOps/s | 2.4628 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1981ms | 0.1349ms | 7.4144 KOps/s | 7.4284 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 44.4910μs | 19.3445μs | 51.6942 KOps/s | 55.9042 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 53.2200μs | 26.7268μs | 37.4156 KOps/s | 38.2940 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1721ms | 69.0120μs | 14.4902 KOps/s | 14.3281 KOps/s | |
test_compile_copy_flat[pytree-eager] | 97.4810μs | 51.2535μs | 19.5109 KOps/s | 19.3958 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.5954ms | 0.3865ms | 2.5871 KOps/s | 2.2332 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.6518ms | 2.5723ms | 388.7637 Ops/s | 384.3186 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5586ms | 0.3792ms | 2.6368 KOps/s | 2.2406 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7171ms | 2.6262ms | 380.7791 Ops/s | 376.2602 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.6351ms | 0.1117ms | 8.9505 KOps/s | 8.5587 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5555ms | 78.5918μs | 12.7240 KOps/s | 11.9377 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.4677ms | 0.1050ms | 9.5267 KOps/s | 9.1384 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1110ms | 68.1770μs | 14.6677 KOps/s | 14.1235 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1577ms | 0.1056ms | 9.4701 KOps/s | 9.2671 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1398ms | 68.0641μs | 14.6920 KOps/s | 14.4699 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1428ms | 0.1002ms | 9.9837 KOps/s | 10.0778 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1482ms | 16.6871μs | 59.9264 KOps/s | 57.0380 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1279ms | 95.3931μs | 10.4829 KOps/s | 10.6217 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 46.2800μs | 15.6179μs | 64.0290 KOps/s | 64.9722 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1453ms | 96.1855μs | 10.3966 KOps/s | 10.4811 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 44.6300μs | 15.6527μs | 63.8866 KOps/s | 64.2884 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1541ms | 0.1004ms | 9.9591 KOps/s | 10.0081 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6052ms | 16.3798μs | 61.0509 KOps/s | 57.8182 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1418ms | 95.5409μs | 10.4667 KOps/s | 10.5195 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 48.9400μs | 15.4745μs | 64.6224 KOps/s | 64.6503 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1371ms | 95.5687μs | 10.4637 KOps/s | 10.5071 KOps/s | |
test_compile_indexing[int-pytree-eager] | 91.6310μs | 19.8012μs | 50.5021 KOps/s | 64.6714 KOps/s | |
test_mod_add[eager] | 0.1367ms | 37.3001μs | 26.8096 KOps/s | 27.0701 KOps/s | |
test_mod_add[compile] | 0.1784ms | 79.4768μs | 12.5823 KOps/s | 12.4137 KOps/s | |
test_mod_add[compile-overhead] | 0.3188ms | 0.1652ms | 6.0526 KOps/s | 5.5550 KOps/s | |
test_mod_wrap[eager] | 0.3343ms | 0.2512ms | 3.9803 KOps/s | 3.8585 KOps/s | |
test_mod_wrap[compile] | 0.5698ms | 0.2914ms | 3.4313 KOps/s | 3.4862 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1247ms | 3.8412ms | 260.3367 Ops/s | 262.7099 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4908ms | 1.3601ms | 735.2227 Ops/s | 690.5795 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3814ms | 1.2601ms | 793.5850 Ops/s | 731.6543 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3534ms | 0.9243ms | 1.0819 KOps/s | 990.1941 Ops/s | |
test_seq_add[eager] | 0.1772ms | 0.1128ms | 8.8636 KOps/s | 9.0353 KOps/s | |
test_seq_add[compile] | 0.2368ms | 95.4114μs | 10.4809 KOps/s | 11.3198 KOps/s | |
test_seq_add[compile-overhead] | 0.1791ms | 0.1272ms | 7.8646 KOps/s | 7.8195 KOps/s | |
test_seq_wrap[eager] | 0.5504ms | 0.4262ms | 2.3466 KOps/s | 2.3725 KOps/s | |
test_seq_wrap[compile] | 0.3737ms | 0.3146ms | 3.1790 KOps/s | 3.3370 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2651ms | 0.2201ms | 4.5432 KOps/s | 4.3866 KOps/s | |
test_func_call_runtime[False-eager] | 0.7791ms | 0.7350ms | 1.3605 KOps/s | 1.3379 KOps/s | |
test_func_call_runtime[False-compile] | 0.9294ms | 0.7438ms | 1.3444 KOps/s | 1.3539 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4027ms | 0.3583ms | 2.7913 KOps/s | 2.7978 KOps/s | |
test_func_call_runtime[True-eager] | 0.9923ms | 0.8953ms | 1.1170 KOps/s | 1.1015 KOps/s | |
test_func_call_runtime[True-compile] | 0.9250ms | 0.7965ms | 1.2555 KOps/s | 1.3139 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5423ms | 0.3805ms | 2.6281 KOps/s | 2.6345 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8219ms | 0.7284ms | 1.3729 KOps/s | 1.3546 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0664ms | 0.7396ms | 1.3521 KOps/s | 1.3456 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4357ms | 0.3593ms | 2.7831 KOps/s | 2.7952 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0796ms | 1.0002ms | 999.7563 Ops/s | 1.0004 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0330ms | 0.7930ms | 1.2610 KOps/s | 1.2574 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4880ms | 0.4063ms | 2.4614 KOps/s | 2.4380 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5325ms | 2.0735ms | 482.2807 Ops/s | 477.6691 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8897ms | 0.8033ms | 1.2449 KOps/s | 1.2334 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4546ms | 0.4056ms | 2.4656 KOps/s | 2.4625 KOps/s | |
test_distributed | 0.6509ms | 0.1526ms | 6.5528 KOps/s | 8.4101 KOps/s | |
test_tdmodule | 62.0710μs | 19.9385μs | 50.1542 KOps/s | 52.1547 KOps/s | |
test_tdmodule_dispatch | 75.8600μs | 36.2018μs | 27.6229 KOps/s | 28.6413 KOps/s | |
test_tdseq | 39.5610μs | 19.3960μs | 51.5569 KOps/s | 52.7835 KOps/s | |
test_tdseq_dispatch | 57.3710μs | 37.8990μs | 26.3859 KOps/s | 27.7619 KOps/s | |
test_instantiation_functorch | 1.6083ms | 1.5203ms | 657.7713 Ops/s | 650.5880 Ops/s | |
test_exec_functorch | 0.1861ms | 0.1404ms | 7.1247 KOps/s | 7.0577 KOps/s | |
test_exec_functional_call | 0.1866ms | 0.1384ms | 7.2280 KOps/s | 7.4602 KOps/s | |
test_exec_td_decorator | 0.3624ms | 0.1787ms | 5.5956 KOps/s | 5.5461 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8334ms | 0.6832ms | 1.4637 KOps/s | 1.4642 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8378ms | 0.6844ms | 1.4610 KOps/s | 1.4532 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7615ms | 0.6143ms | 1.6280 KOps/s | 1.6620 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6983ms | 0.5896ms | 1.6962 KOps/s | 1.6738 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.4675ms | 19.1094ms | 52.3303 Ops/s | 52.1419 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.0625ms | 19.8342ms | 50.4180 Ops/s | 52.0898 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.6174ms | 19.0913ms | 52.3798 Ops/s | 52.5141 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.0331ms | 19.4570ms | 51.3954 Ops/s | 52.3865 Ops/s | |
test_to_module_speed[True] | 1.0792ms | 0.9315ms | 1.0735 KOps/s | 1.0495 KOps/s | |
test_to_module_speed[False] | 1.3082ms | 0.9275ms | 1.0781 KOps/s | 1.0884 KOps/s | |
test_tc_init | 56.2610μs | 35.9321μs | 27.8302 KOps/s | 28.4881 KOps/s | |
test_tc_init_nested | 0.2069ms | 74.4251μs | 13.4363 KOps/s | 14.1496 KOps/s | |
test_tc_first_layer_tensor | 4.8944μs | 0.7022μs | 1.4240 MOps/s | 1.4260 MOps/s | |
test_tc_first_layer_nontensor | 18.2700μs | 2.3177μs | 431.4685 KOps/s | 429.4303 KOps/s | |
test_tc_second_layer_tensor | 11.9867μs | 1.4226μs | 702.9326 KOps/s | 703.9102 KOps/s | |
test_tc_second_layer_nontensor | 39.8300μs | 3.0466μs | 328.2374 KOps/s | 327.8897 KOps/s | |
test_unbind | 0.2233s | 9.9358ms | 100.6458 Ops/s | 152.4530 Ops/s | |
test_full_like | 10.6615ms | 9.1558ms | 109.2206 Ops/s | 108.6675 Ops/s | |
test_zeros_like | 4.8546ms | 4.3008ms | 232.5154 Ops/s | 115.6708 Ops/s | |
test_ones_like | 10.0966ms | 7.1359ms | 140.1368 Ops/s | 231.5833 Ops/s | |
test_clone | 6.6902ms | 6.3467ms | 157.5620 Ops/s | 158.0967 Ops/s | |
test_squeeze | 60.3500μs | 9.2463μs | 108.1509 KOps/s | 107.8226 KOps/s | |
test_unsqueeze | 0.1213ms | 69.6665μs | 14.3541 KOps/s | 14.5829 KOps/s | |
test_split | 0.3617ms | 0.1547ms | 6.4651 KOps/s | 6.2182 KOps/s | |
test_permute | 0.2585ms | 0.1746ms | 5.7287 KOps/s | 5.5176 KOps/s | |
test_stack | 50.6203ms | 50.4039ms | 19.8397 Ops/s | 19.8492 Ops/s | |
test_cat | 50.5282ms | 50.2747ms | 19.8907 Ops/s | 23.6898 Ops/s |
vmoens added a commit that referenced this pull request (Dec 2, 2024)
ghstack-source-id: 68f21aca722895e8a240dbca66e97310c20a6b5d
Pull Request resolved: #1121
Labels: CLA Signed, enhancement (New feature or request)
Stack from ghstack (oldest at bottom):