[Performance] Faster copy of TDParams #1096
Merged
facebook-github-bot added the CLA Signed label on Nov 20, 2024.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 33.9340μs | 16.4075μs | 60.9478 KOps/s | 57.5544 KOps/s | |
test_plain_set_stack_nested | 47.0790μs | 16.7674μs | 59.6395 KOps/s | 57.0728 KOps/s | |
test_plain_set_nested_inplace | 68.0560μs | 18.3168μs | 54.5947 KOps/s | 52.3311 KOps/s | |
test_plain_set_stack_nested_inplace | 67.1860μs | 18.3636μs | 54.4556 KOps/s | 51.7642 KOps/s | |
test_items | 0.1604ms | 4.2693μs | 234.2330 KOps/s | 237.2934 KOps/s | |
test_items_nested | 0.5744ms | 0.3387ms | 2.9526 KOps/s | 2.8966 KOps/s | |
test_items_nested_locked | 0.6404ms | 0.3422ms | 2.9225 KOps/s | 2.9316 KOps/s | |
test_items_nested_leaf | 0.1299ms | 71.7923μs | 13.9291 KOps/s | 14.0357 KOps/s | |
test_items_stack_nested | 0.7922ms | 0.3477ms | 2.8756 KOps/s | 2.9071 KOps/s | |
test_items_stack_nested_leaf | 0.3105ms | 74.9423μs | 13.3436 KOps/s | 13.4479 KOps/s | |
test_items_stack_nested_locked | 0.4743ms | 0.3404ms | 2.9375 KOps/s | 2.8848 KOps/s | |
test_keys | 39.8530μs | 3.5642μs | 280.5713 KOps/s | 280.9288 KOps/s | |
test_keys_nested | 0.2069ms | 0.1361ms | 7.3470 KOps/s | 7.3606 KOps/s | |
test_keys_nested_locked | 1.7900ms | 0.1434ms | 6.9745 KOps/s | 7.1206 KOps/s | |
test_keys_nested_leaf | 0.2673ms | 0.1175ms | 8.5125 KOps/s | 8.4796 KOps/s | |
test_keys_stack_nested | 0.2001ms | 0.1365ms | 7.3244 KOps/s | 7.4868 KOps/s | |
test_keys_stack_nested_leaf | 0.2101ms | 0.1176ms | 8.5033 KOps/s | 8.6686 KOps/s | |
test_keys_stack_nested_locked | 0.2426ms | 0.1422ms | 7.0314 KOps/s | 7.1727 KOps/s | |
test_values | 8.7986μs | 1.0712μs | 933.5466 KOps/s | 933.0029 KOps/s | |
test_values_nested | 0.1100ms | 55.3382μs | 18.0707 KOps/s | 17.7850 KOps/s | |
test_values_nested_locked | 0.1086ms | 55.3634μs | 18.0625 KOps/s | 18.0568 KOps/s | |
test_values_nested_leaf | 0.1094ms | 60.9151μs | 16.4163 KOps/s | 16.3443 KOps/s | |
test_values_stack_nested | 0.1078ms | 56.9476μs | 17.5600 KOps/s | 16.2191 KOps/s | |
test_values_stack_nested_leaf | 0.1129ms | 61.5324μs | 16.2516 KOps/s | 16.5554 KOps/s | |
test_values_stack_nested_locked | 0.1274ms | 57.1346μs | 17.5025 KOps/s | 17.8103 KOps/s | |
test_membership | 6.5966μs | 0.7644μs | 1.3083 MOps/s | 1.1033 MOps/s | |
test_membership_nested | 35.7070μs | 2.7535μs | 363.1767 KOps/s | 358.5908 KOps/s | |
test_membership_nested_leaf | 44.1730μs | 2.7842μs | 359.1648 KOps/s | 356.7595 KOps/s | |
test_membership_stacked_nested | 54.7400μs | 2.7558μs | 362.8658 KOps/s | 367.1706 KOps/s | |
test_membership_stacked_nested_leaf | 18.4040μs | 2.8250μs | 353.9877 KOps/s | 363.0385 KOps/s | |
test_membership_nested_last | 23.2140μs | 4.0731μs | 245.5109 KOps/s | 241.3717 KOps/s | |
test_membership_nested_leaf_last | 24.3860μs | 4.1436μs | 241.3352 KOps/s | 246.4473 KOps/s | |
test_membership_stacked_nested_last | 46.1980μs | 4.6701μs | 214.1298 KOps/s | 167.2533 KOps/s | |
test_membership_stacked_nested_leaf_last | 35.9670μs | 4.7093μs | 212.3465 KOps/s | 174.8331 KOps/s | |
test_nested_getleaf | 62.5980μs | 10.8364μs | 92.2812 KOps/s | 94.1053 KOps/s | |
test_nested_get | 49.7330μs | 10.2899μs | 97.1824 KOps/s | 99.1629 KOps/s | |
test_stacked_getleaf | 50.5630μs | 10.6091μs | 94.2586 KOps/s | 94.8322 KOps/s | |
test_stacked_get | 47.5610μs | 10.0983μs | 99.0266 KOps/s | 98.8330 KOps/s | |
test_nested_getitemleaf | 54.9410μs | 11.0380μs | 90.5958 KOps/s | 90.2159 KOps/s | |
test_nested_getitem | 0.1365ms | 10.3704μs | 96.4286 KOps/s | 96.2914 KOps/s | |
test_stacked_getitemleaf | 73.3170μs | 11.1559μs | 89.6384 KOps/s | 91.7882 KOps/s | |
test_stacked_getitem | 34.8050μs | 10.4614μs | 95.5891 KOps/s | 96.6378 KOps/s | |
test_lock_nested | 3.1877ms | 0.4423ms | 2.2610 KOps/s | 1.8688 KOps/s | |
test_lock_stack_nested | 0.9089ms | 0.4122ms | 2.4258 KOps/s | 2.4737 KOps/s | |
test_unlock_nested | 0.6911ms | 0.3536ms | 2.8283 KOps/s | 2.7852 KOps/s | |
test_unlock_stack_nested | 0.5609ms | 0.3281ms | 3.0476 KOps/s | 3.0758 KOps/s | |
test_flatten_speed | 0.1608ms | 91.2894μs | 10.9542 KOps/s | 10.7514 KOps/s | |
test_unflatten_speed | 0.8281ms | 0.4728ms | 2.1151 KOps/s | 2.0547 KOps/s | |
test_common_ops | 3.6292ms | 0.7205ms | 1.3880 KOps/s | 1.3144 KOps/s | |
test_creation | 33.6330μs | 2.0400μs | 490.1978 KOps/s | 473.5704 KOps/s | |
test_creation_empty | 38.6420μs | 8.6075μs | 116.1782 KOps/s | 96.6443 KOps/s | |
test_creation_nested_1 | 48.0900μs | 11.2476μs | 88.9082 KOps/s | 78.4437 KOps/s | |
test_creation_nested_2 | 66.6050μs | 15.6330μs | 63.9673 KOps/s | 57.3822 KOps/s | |
test_clone | 0.1185ms | 12.9312μs | 77.3322 KOps/s | 78.0149 KOps/s | |
test_getitem[int] | 1.4990ms | 12.4798μs | 80.1292 KOps/s | 81.6276 KOps/s | |
test_getitem[slice_int] | 0.1390ms | 24.4772μs | 40.8544 KOps/s | 41.9955 KOps/s | |
test_getitem[range] | 0.2631ms | 48.2659μs | 20.7186 KOps/s | 20.8050 KOps/s | |
test_getitem[tuple] | 0.1389ms | 19.9232μs | 50.1927 KOps/s | 50.9505 KOps/s | |
test_getitem[list] | 0.4124ms | 43.7978μs | 22.8322 KOps/s | 23.2349 KOps/s | |
test_setitem_dim[int] | 84.8380μs | 26.9490μs | 37.1071 KOps/s | 38.7809 KOps/s | |
test_setitem_dim[slice_int] | 0.1034ms | 52.3427μs | 19.1049 KOps/s | 19.4999 KOps/s | |
test_setitem_dim[range] | 0.1270ms | 73.7729μs | 13.5551 KOps/s | 13.3982 KOps/s | |
test_setitem_dim[tuple] | 80.5610μs | 40.4905μs | 24.6971 KOps/s | 24.6975 KOps/s | |
test_setitem | 0.1333ms | 18.5675μs | 53.8576 KOps/s | 51.4862 KOps/s | |
test_set | 0.1571ms | 17.9651μs | 55.6636 KOps/s | 53.2340 KOps/s | |
test_set_shared | 2.0953ms | 0.1716ms | 5.8266 KOps/s | 5.9438 KOps/s | |
test_update | 0.1125ms | 19.3383μs | 51.7109 KOps/s | 46.7011 KOps/s | |
test_update_nested | 92.1950μs | 27.9288μs | 35.8053 KOps/s | 32.3746 KOps/s | |
test_update__nested | 0.8732ms | 32.0801μs | 31.1719 KOps/s | 30.7899 KOps/s | |
test_set_nested | 84.6190μs | 19.9974μs | 50.0066 KOps/s | 46.7800 KOps/s | |
test_set_nested_new | 80.9320μs | 24.0272μs | 41.6196 KOps/s | 38.4121 KOps/s | |
test_select | 99.6650μs | 39.0106μs | 25.6340 KOps/s | 24.1011 KOps/s | |
test_select_nested | 0.1193ms | 59.8001μs | 16.7224 KOps/s | 16.8722 KOps/s | |
test_exclude_nested | 0.1464ms | 75.1846μs | 13.3006 KOps/s | 13.4078 KOps/s | |
test_empty[True] | 0.7040ms | 0.3510ms | 2.8491 KOps/s | 2.8504 KOps/s | |
test_empty[False] | 11.0807μs | 1.2122μs | 824.9398 KOps/s | 820.7680 KOps/s | |
test_unbind_speed | 0.3980ms | 0.2618ms | 3.8198 KOps/s | 3.8462 KOps/s | |
test_unbind_speed_stack0 | 0.5162ms | 0.2582ms | 3.8726 KOps/s | 3.9714 KOps/s | |
test_unbind_speed_stack1 | 0.1041s | 0.7616ms | 1.3131 KOps/s | 1.4569 KOps/s | |
test_split | 0.1010s | 1.7213ms | 580.9665 Ops/s | 570.1472 Ops/s | |
test_chunk | 0.1084s | 1.7371ms | 575.6673 Ops/s | 571.5657 Ops/s | |
test_consolidate_njt[False-None] | 8.5916ms | 8.0625ms | 124.0312 Ops/s | 120.7531 Ops/s | |
test_creation[device0] | 0.2305ms | 89.3481μs | 11.1922 KOps/s | 10.3730 KOps/s | |
test_creation_from_tensor | 0.2232ms | 92.3374μs | 10.8299 KOps/s | 10.5555 KOps/s | |
test_add_one[memmap_tensor0] | 0.1224ms | 5.0934μs | 196.3338 KOps/s | 197.7421 KOps/s | |
test_contiguous[memmap_tensor0] | 10.3290μs | 0.5195μs | 1.9249 MOps/s | 1.8639 MOps/s | |
test_stack[memmap_tensor0] | 49.4820μs | 3.4257μs | 291.9140 KOps/s | 271.3112 KOps/s | |
test_memmaptd_index | 1.1471ms | 0.2372ms | 4.2161 KOps/s | 4.1704 KOps/s | |
test_memmaptd_index_astensor | 0.5691ms | 0.3138ms | 3.1863 KOps/s | 3.1371 KOps/s | |
test_memmaptd_index_op | 1.0379ms | 0.5531ms | 1.8081 KOps/s | 1.7321 KOps/s | |
test_serialize_model | 0.1369s | 0.1196s | 8.3635 Ops/s | 7.3003 Ops/s | |
test_serialize_model_pickle | 0.4685s | 0.4000s | 2.5003 Ops/s | 2.4933 Ops/s | |
test_serialize_weights | 0.2150s | 0.1284s | 7.7897 Ops/s | 8.7086 Ops/s | |
test_serialize_weights_returnearly | 0.1792s | 0.1628s | 6.1426 Ops/s | 6.2516 Ops/s | |
test_serialize_weights_pickle | 0.4898s | 0.4401s | 2.2723 Ops/s | 2.2923 Ops/s | |
test_serialize_weights_filesystem | 0.1448s | 0.1402s | 7.1343 Ops/s | 7.1491 Ops/s | |
test_serialize_model_filesystem | 0.2401s | 0.1610s | 6.2095 Ops/s | 5.8713 Ops/s | |
test_reshape_pytree | 76.5930μs | 26.5389μs | 37.6805 KOps/s | 37.3538 KOps/s | |
test_reshape_td | 74.8900μs | 32.3596μs | 30.9027 KOps/s | 31.1944 KOps/s | |
test_view_pytree | 78.2260μs | 26.6012μs | 37.5923 KOps/s | 37.0224 KOps/s | |
test_view_td | 0.1145ms | 38.8238μs | 25.7574 KOps/s | 26.2810 KOps/s | |
test_unbind_pytree | 80.4910μs | 30.5181μs | 32.7675 KOps/s | 33.2225 KOps/s | |
test_unbind_td | 0.3360ms | 38.5822μs | 25.9187 KOps/s | 26.0543 KOps/s | |
test_split_pytree | 85.6210μs | 29.9990μs | 33.3345 KOps/s | 33.4925 KOps/s | |
test_split_td | 0.1983ms | 44.3526μs | 22.5466 KOps/s | 22.8950 KOps/s | |
test_add_pytree | 77.0140μs | 35.8208μs | 27.9167 KOps/s | 27.4896 KOps/s | |
test_add_td | 0.1428ms | 53.0205μs | 18.8606 KOps/s | 17.0826 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1250ms | 61.9784μs | 16.1347 KOps/s | 16.3361 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3630ms | 0.1622ms | 6.1638 KOps/s | 6.2653 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 93.2240μs | 45.4058μs | 22.0236 KOps/s | 21.6488 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2264ms | 0.1183ms | 8.4529 KOps/s | 8.3188 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 61.4350μs | 26.1366μs | 38.2605 KOps/s | 39.3128 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1032ms | 53.5781μs | 18.6643 KOps/s | 18.4940 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1615ms | 78.9573μs | 12.6651 KOps/s | 12.6957 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1409ms | 68.3646μs | 14.6274 KOps/s | 14.9449 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1928ms | 0.1040ms | 9.6120 KOps/s | 9.5595 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3776ms | 0.1983ms | 5.0418 KOps/s | 5.0574 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1065ms | 44.3886μs | 22.5283 KOps/s | 21.4609 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5210ms | 61.4678μs | 16.2687 KOps/s | 16.4049 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1917ms | 0.1027ms | 9.7409 KOps/s | 9.5889 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4380ms | 0.2005ms | 4.9867 KOps/s | 4.8914 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4275ms | 0.2101ms | 4.7595 KOps/s | 4.7407 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2248ms | 0.1043ms | 9.5848 KOps/s | 9.4137 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1171ms | 55.9406μs | 17.8761 KOps/s | 18.7188 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1518ms | 47.6659μs | 20.9794 KOps/s | 21.7983 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3361ms | 0.1593ms | 6.2777 KOps/s | 6.2643 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2049ms | 0.1032ms | 9.6854 KOps/s | 9.6861 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 66.0740μs | 21.2060μs | 47.1566 KOps/s | 47.5589 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1609ms | 59.2666μs | 16.8729 KOps/s | 17.0928 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1785ms | 81.4238μs | 12.2814 KOps/s | 12.1028 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1452ms | 69.5699μs | 14.3740 KOps/s | 14.2666 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2846ms | 0.2093ms | 4.7784 KOps/s | 4.8158 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.1704ms | 1.2892ms | 775.6659 Ops/s | 797.7010 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4645ms | 0.2020ms | 4.9502 KOps/s | 4.9108 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9369ms | 0.7760ms | 1.2887 KOps/s | 1.2635 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5547ms | 0.4583ms | 2.1819 KOps/s | 2.1519 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.0979ms | 2.4709ms | 404.7189 Ops/s | 390.3909 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1469ms | 35.0629μs | 28.5201 KOps/s | 27.3903 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.8343ms | 32.2903μs | 30.9690 KOps/s | 29.1441 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 73.8780μs | 28.7864μs | 34.7386 KOps/s | 32.9911 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 78.6870μs | 23.1633μs | 43.1718 KOps/s | 41.5548 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1077ms | 29.9647μs | 33.3726 KOps/s | 32.2818 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 79.8910μs | 23.2733μs | 42.9677 KOps/s | 42.0353 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2393ms | 52.4659μs | 19.0600 KOps/s | 18.8002 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6119ms | 19.4000μs | 51.5464 KOps/s | 51.0506 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1260ms | 44.1963μs | 22.6263 KOps/s | 21.9257 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 63.4290μs | 18.7204μs | 53.4177 KOps/s | 51.9935 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1275ms | 45.1512μs | 22.1478 KOps/s | 21.2524 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1709ms | 18.9873μs | 52.6667 KOps/s | 52.3329 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1061ms | 53.4934μs | 18.6939 KOps/s | 18.3612 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0014ms | 19.4722μs | 51.3553 KOps/s | 51.1196 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1048ms | 45.4898μs | 21.9829 KOps/s | 21.2585 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 78.8580μs | 19.0196μs | 52.5772 KOps/s | 52.7284 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1214ms | 45.9088μs | 21.7823 KOps/s | 21.1854 KOps/s | |
test_compile_indexing[int-pytree-eager] | 85.0580μs | 18.8050μs | 53.1774 KOps/s | 52.0474 KOps/s | |
test_mod_add[eager] | 83.1660μs | 23.6258μs | 42.3266 KOps/s | 36.8722 KOps/s | |
test_mod_add[compile] | 0.1213ms | 44.6156μs | 22.4137 KOps/s | 21.2462 KOps/s | |
test_mod_add[compile-overhead] | 0.1116ms | 45.2132μs | 22.1174 KOps/s | 21.5231 KOps/s | |
test_mod_wrap[eager] | 0.3500ms | 0.2066ms | 4.8404 KOps/s | 4.5925 KOps/s | |
test_mod_wrap[compile] | 1.9887ms | 0.2051ms | 4.8757 KOps/s | 4.7373 KOps/s | |
test_mod_wrap[compile-overhead] | 1.8411ms | 0.2053ms | 4.8703 KOps/s | 4.7732 KOps/s | |
test_mod_wrap_and_backward[eager] | 14.2706ms | 11.5741ms | 86.3997 Ops/s | 91.0004 Ops/s | |
test_mod_wrap_and_backward[compile] | 16.4117ms | 12.3571ms | 80.9249 Ops/s | 78.1459 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.8766ms | 12.5505ms | 79.6781 Ops/s | 80.9790 Ops/s | |
test_seq_add[eager] | 0.1536ms | 84.1887μs | 11.8781 KOps/s | 10.9720 KOps/s | |
test_seq_add[compile] | 0.1170ms | 60.7357μs | 16.4648 KOps/s | 16.4877 KOps/s | |
test_seq_add[compile-overhead] | 0.1257ms | 57.4094μs | 17.4188 KOps/s | 16.9505 KOps/s | |
test_seq_wrap[eager] | 0.4981ms | 0.3713ms | 2.6931 KOps/s | 2.6160 KOps/s | |
test_seq_wrap[compile] | 0.4055ms | 0.2263ms | 4.4194 KOps/s | 4.3571 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4115ms | 0.2286ms | 4.3753 KOps/s | 4.4520 KOps/s | |
test_func_call_runtime[False-eager] | 0.8134ms | 0.5437ms | 1.8394 KOps/s | 1.8337 KOps/s | |
test_func_call_runtime[False-compile] | 1.0050ms | 0.4359ms | 2.2939 KOps/s | 2.3131 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5515ms | 0.4345ms | 2.3015 KOps/s | 2.3264 KOps/s | |
test_func_call_runtime[True-eager] | 1.0731ms | 0.7520ms | 1.3298 KOps/s | 1.3365 KOps/s | |
test_func_call_runtime[True-compile] | 0.7070ms | 0.4703ms | 2.1262 KOps/s | 2.1326 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9219ms | 0.4773ms | 2.0951 KOps/s | 2.1348 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8577ms | 0.5410ms | 1.8486 KOps/s | 1.8256 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7677ms | 0.4354ms | 2.2967 KOps/s | 2.3025 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6917ms | 0.4335ms | 2.3070 KOps/s | 2.3257 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4463ms | 0.8967ms | 1.1152 KOps/s | 1.1380 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.5912ms | 0.4920ms | 2.0327 KOps/s | 2.0324 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.9082ms | 0.4998ms | 2.0008 KOps/s | 2.0303 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.1259ms | 1.8923ms | 528.4584 Ops/s | 524.5319 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.6738ms | 0.5236ms | 1.9098 KOps/s | 1.8552 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9001ms | 0.5284ms | 1.8924 KOps/s | 1.9459 KOps/s | |
test_distributed | 0.3585ms | 0.1263ms | 7.9196 KOps/s | 7.6640 KOps/s | |
test_tdmodule | 52.5790μs | 16.9899μs | 58.8586 KOps/s | 52.6840 KOps/s | |
test_tdmodule_dispatch | 55.1630μs | 32.7197μs | 30.5626 KOps/s | 27.2051 KOps/s | |
test_tdseq | 48.8610μs | 18.9425μs | 52.7913 KOps/s | 47.3859 KOps/s | |
test_tdseq_dispatch | 80.0900μs | 37.3074μs | 26.8043 KOps/s | 24.3984 KOps/s | |
test_instantiation_functorch | 1.8485ms | 1.5241ms | 656.1327 Ops/s | 623.9535 Ops/s | |
test_exec_functorch | 0.2814ms | 0.1795ms | 5.5708 KOps/s | 5.5443 KOps/s | |
test_exec_functional_call | 0.3019ms | 0.1688ms | 5.9256 KOps/s | 5.7171 KOps/s | |
test_exec_td_decorator | 0.5559ms | 0.2270ms | 4.4051 KOps/s | 4.3716 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7757ms | 0.6363ms | 1.5716 KOps/s | 1.5262 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0182ms | 0.6359ms | 1.5726 KOps/s | 1.5632 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8377ms | 0.5289ms | 1.8906 KOps/s | 1.8761 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7360ms | 0.5275ms | 1.8957 KOps/s | 1.8774 KOps/s | |
test_to_module_speed[True] | 2.0682ms | 1.3188ms | 758.2804 Ops/s | 784.2784 Ops/s | |
test_to_module_speed[False] | 1.8460ms | 1.2924ms | 773.7269 Ops/s | 807.2439 Ops/s | |
test_tc_init | 0.1080ms | 42.2898μs | 23.6464 KOps/s | 21.8100 KOps/s | |
test_tc_init_nested | 0.1485ms | 83.2507μs | 12.0119 KOps/s | 10.8806 KOps/s | |
test_tc_first_layer_tensor | 24.4850μs | 1.5322μs | 652.6514 KOps/s | 670.8925 KOps/s | |
test_tc_first_layer_nontensor | 59.0710μs | 4.6135μs | 216.7535 KOps/s | 216.5018 KOps/s | |
test_tc_second_layer_tensor | 31.7590μs | 2.8396μs | 352.1655 KOps/s | 365.4092 KOps/s | |
test_tc_second_layer_nontensor | 50.2440μs | 5.9646μs | 167.6545 KOps/s | 168.6621 KOps/s | |
test_unbind | 0.2416s | 13.0820ms | 76.4408 Ops/s | 81.7615 Ops/s | |
test_full_like | 18.0985ms | 13.8667ms | 72.1154 Ops/s | 122.0499 Ops/s | |
test_zeros_like | 14.3647ms | 8.0280ms | 124.5641 Ops/s | 335.6527 Ops/s | |
test_ones_like | 13.7291ms | 7.7377ms | 129.2366 Ops/s | 284.4030 Ops/s | |
test_clone | 14.1399ms | 9.6139ms | 104.0162 Ops/s | 177.1764 Ops/s | |
test_squeeze | 60.0730μs | 12.4198μs | 80.5164 KOps/s | 84.3989 KOps/s | |
test_unsqueeze | 0.1778ms | 88.8223μs | 11.2584 KOps/s | 11.2391 KOps/s | |
test_split | 0.5698ms | 0.1906ms | 5.2472 KOps/s | 5.3094 KOps/s | |
test_permute | 0.4662ms | 0.2241ms | 4.4614 KOps/s | 4.5828 KOps/s | |
test_stack | 30.2855ms | 27.8996ms | 35.8428 Ops/s | 37.5281 Ops/s | |
test_cat | 31.6475ms | 27.7583ms | 36.0253 Ops/s | 37.7820 Ops/s |
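
The Change column is empty in these tables as captured here. A minimal sketch of how one might fill it in, assuming (this is not stated in the PR) that Change means the relative difference between the Ops column and the Ops on Repo HEAD column. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling; the two sample rows are copied from the table above.

```python
from __future__ import annotations

# Unit multipliers for the throughput cells as they are printed in the tables.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '60.9478 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNITS[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, (ops - head) / head) for one table row.

    Assumes the column order Name | Max | Mean | Ops | Ops on Repo HEAD.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops, head = cells[0], parse_ops(cells[3]), parse_ops(cells[4])
    return name, (ops - head) / head


rows = [
    "test_plain_set_nested | 33.9340μs | 16.4075μs | 60.9478 KOps/s | 57.5544 KOps/s | |",
    "test_zeros_like | 14.3647ms | 8.0280ms | 124.5641 Ops/s | 335.6527 Ops/s | |",
]

for row in rows:
    name, change = relative_change(row)
    # A positive value means the PR branch is faster than repo HEAD.
    print(f"{name}: {change:+.1%}")
```

Small differences between the two columns may simply reflect run-to-run variance, so a computed change is best read as a rough indicator rather than a precise measurement.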

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 35.9720μs | 10.8118μs | 92.4916 KOps/s | 86.7898 KOps/s | |
test_plain_set_stack_nested | 31.7610μs | 10.8716μs | 91.9830 KOps/s | 86.4394 KOps/s | |
test_plain_set_nested_inplace | 40.7020μs | 11.7402μs | 85.1775 KOps/s | 80.7435 KOps/s | |
test_plain_set_stack_nested_inplace | 39.3920μs | 11.6706μs | 85.6857 KOps/s | 80.4588 KOps/s | |
test_items | 33.7620μs | 2.9193μs | 342.5439 KOps/s | 345.3822 KOps/s | |
test_items_nested | 0.4724ms | 0.3187ms | 3.1373 KOps/s | 3.1507 KOps/s | |
test_items_nested_locked | 0.3769ms | 0.3214ms | 3.1115 KOps/s | 3.1430 KOps/s | |
test_items_nested_leaf | 84.5850μs | 58.9570μs | 16.9615 KOps/s | 16.8843 KOps/s | |
test_items_stack_nested | 0.3813ms | 0.3208ms | 3.1170 KOps/s | 3.1373 KOps/s | |
test_items_stack_nested_leaf | 93.9040μs | 60.5336μs | 16.5197 KOps/s | 17.1711 KOps/s | |
test_items_stack_nested_locked | 0.4003ms | 0.3230ms | 3.0956 KOps/s | 3.0951 KOps/s | |
test_keys | 75.4540μs | 3.4690μs | 288.2635 KOps/s | 289.2965 KOps/s | |
test_keys_nested | 0.1058ms | 69.4865μs | 14.3913 KOps/s | 14.1941 KOps/s | |
test_keys_nested_locked | 2.9828ms | 75.1597μs | 13.3050 KOps/s | 13.1283 KOps/s | |
test_keys_nested_leaf | 93.6340μs | 61.5464μs | 16.2479 KOps/s | 16.1823 KOps/s | |
test_keys_stack_nested | 0.1063ms | 70.9397μs | 14.0965 KOps/s | 14.1673 KOps/s | |
test_keys_stack_nested_leaf | 95.0040μs | 62.0630μs | 16.1127 KOps/s | 16.2620 KOps/s | |
test_keys_stack_nested_locked | 0.1071ms | 76.0625μs | 13.1471 KOps/s | 13.1942 KOps/s | |
test_values | 5.5070μs | 0.8440μs | 1.1848 MOps/s | 1.1826 MOps/s | |
test_values_nested | 67.3030μs | 31.2155μs | 32.0353 KOps/s | 32.1778 KOps/s | |
test_values_nested_locked | 63.4430μs | 32.9622μs | 30.3378 KOps/s | 30.3160 KOps/s | |
test_values_nested_leaf | 65.6530μs | 34.1276μs | 29.3018 KOps/s | 29.6468 KOps/s | |
test_values_stack_nested | 66.3630μs | 31.7251μs | 31.5208 KOps/s | 32.2408 KOps/s | |
test_values_stack_nested_leaf | 63.3130μs | 34.5730μs | 28.9243 KOps/s | 29.7626 KOps/s | |
test_values_stack_nested_locked | 56.9130μs | 33.3552μs | 29.9804 KOps/s | 30.2861 KOps/s | |
test_membership | 1.9366μs | 0.5177μs | 1.9318 MOps/s | 1.9474 MOps/s | |
test_membership_nested | 13.3705μs | 1.8878μs | 529.7093 KOps/s | 523.3712 KOps/s | |
test_membership_nested_leaf | 18.7655μs | 1.9034μs | 525.3745 KOps/s | 515.1535 KOps/s | |
test_membership_stacked_nested | 37.9020μs | 1.9607μs | 510.0094 KOps/s | 493.2736 KOps/s | |
test_membership_stacked_nested_leaf | 37.6920μs | 1.9603μs | 510.1242 KOps/s | 500.4893 KOps/s | |
test_membership_nested_last | 30.2510μs | 2.7858μs | 358.9674 KOps/s | 350.6043 KOps/s | |
test_membership_nested_leaf_last | 35.3510μs | 2.8264μs | 353.8050 KOps/s | 347.5504 KOps/s | |
test_membership_stacked_nested_last | 38.6620μs | 3.2367μs | 308.9601 KOps/s | 350.1974 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.3720μs | 3.2596μs | 306.7880 KOps/s | 351.8986 KOps/s | |
test_nested_getleaf | 36.6020μs | 6.0335μs | 165.7411 KOps/s | 167.5535 KOps/s | |
test_nested_get | 35.9520μs | 5.7097μs | 175.1418 KOps/s | 175.0355 KOps/s | |
test_stacked_getleaf | 36.0020μs | 6.0002μs | 166.6619 KOps/s | 166.5239 KOps/s | |
test_stacked_get | 45.4320μs | 5.7026μs | 175.3581 KOps/s | 175.5316 KOps/s | |
test_nested_getitemleaf | 29.9720μs | 6.1613μs | 162.3031 KOps/s | 164.5527 KOps/s | |
test_nested_getitem | 37.9810μs | 5.8117μs | 172.0665 KOps/s | 171.8239 KOps/s | |
test_stacked_getitemleaf | 39.3720μs | 6.1178μs | 163.4580 KOps/s | 163.8504 KOps/s | |
test_stacked_getitem | 41.9920μs | 5.7567μs | 173.7120 KOps/s | 171.9548 KOps/s | |
test_lock_nested | 7.2502ms | 0.3725ms | 2.6844 KOps/s | 2.7134 KOps/s | |
test_lock_stack_nested | 0.3913ms | 0.3333ms | 3.0001 KOps/s | 2.9951 KOps/s | |
test_unlock_nested | 0.7503ms | 0.3054ms | 3.2745 KOps/s | 3.3178 KOps/s | |
test_unlock_stack_nested | 0.3347ms | 0.2742ms | 3.6463 KOps/s | 3.7026 KOps/s | |
test_flatten_speed | 0.1108ms | 73.3224μs | 13.6384 KOps/s | 13.6665 KOps/s | |
test_unflatten_speed | 0.3836ms | 0.2950ms | 3.3902 KOps/s | 3.4151 KOps/s | |
test_common_ops | 1.7696ms | 0.5800ms | 1.7241 KOps/s | 1.6797 KOps/s | |
test_creation | 0.1069ms | 1.4751μs | 677.9322 KOps/s | 685.2447 KOps/s | |
test_creation_empty | 50.3430μs | 7.7966μs | 128.2605 KOps/s | 107.9885 KOps/s | |
test_creation_nested_1 | 38.7520μs | 9.2284μs | 108.3611 KOps/s | 93.1911 KOps/s | |
test_creation_nested_2 | 51.7420μs | 11.8817μs | 84.1629 KOps/s | 76.2559 KOps/s | |
test_clone | 56.4430μs | 9.6631μs | 103.4863 KOps/s | 104.3942 KOps/s | |
test_getitem[int] | 92.3143ms | 15.4716μs | 64.6344 KOps/s | 94.7482 KOps/s | |
test_getitem[slice_int] | 0.1105ms | 20.6032μs | 48.5362 KOps/s | 48.5374 KOps/s | |
test_getitem[range] | 0.1326ms | 36.0692μs | 27.7245 KOps/s | 27.6920 KOps/s | |
test_getitem[tuple] | 0.1042ms | 18.0521μs | 55.3951 KOps/s | 54.4916 KOps/s | |
test_getitem[list] | 0.2361ms | 31.7664μs | 31.4798 KOps/s | 31.2975 KOps/s | |
test_setitem_dim[int] | 25.0010μs | 17.0566μs | 58.6283 KOps/s | 57.1178 KOps/s | |
test_setitem_dim[slice_int] | 59.5840μs | 35.5149μs | 28.1572 KOps/s | 28.2020 KOps/s | |
test_setitem_dim[range] | 82.5040μs | 51.5448μs | 19.4006 KOps/s | 19.5055 KOps/s | |
test_setitem_dim[tuple] | 57.1830μs | 29.2575μs | 34.1793 KOps/s | 33.1783 KOps/s | |
test_setitem | 81.6350μs | 13.5130μs | 74.0027 KOps/s | 67.7075 KOps/s | |
test_set | 92.1950μs | 13.3198μs | 75.0762 KOps/s | 71.0424 KOps/s | |
test_set_shared | 1.6487ms | 0.1436ms | 6.9649 KOps/s | 6.9537 KOps/s | |
test_update | 0.3386ms | 16.2197μs | 61.6534 KOps/s | 56.1272 KOps/s | |
test_update_nested | 1.1225ms | 20.9525μs | 47.7270 KOps/s | 44.6071 KOps/s | |
test_update__nested | 81.2440μs | 23.5511μs | 42.4608 KOps/s | 43.3704 KOps/s | |
test_set_nested | 83.1650μs | 14.3771μs | 69.5550 KOps/s | 66.3022 KOps/s | |
test_set_nested_new | 72.2640μs | 16.3688μs | 61.0919 KOps/s | 58.1001 KOps/s | |
test_select | 62.0940μs | 28.5941μs | 34.9722 KOps/s | 33.2868 KOps/s | |
test_select_nested | 72.5740μs | 41.5138μs | 24.0883 KOps/s | 24.0411 KOps/s | |
test_exclude_nested | 0.1127ms | 59.5023μs | 16.8061 KOps/s | 16.9944 KOps/s | |
test_empty[True] | 0.3150ms | 0.2542ms | 3.9335 KOps/s | 3.9361 KOps/s | |
test_empty[False] | 3.4271μs | 0.7419μs | 1.3479 MOps/s | 1.3520 MOps/s | |
test_to | 86.1950μs | 54.1027μs | 18.4834 KOps/s | 18.3974 KOps/s | |
test_to_nonblocking | 95.3350μs | 45.2303μs | 22.1091 KOps/s | 22.5853 KOps/s | |
test_unbind_speed | 1.6743ms | 0.2307ms | 4.3340 KOps/s | 4.3456 KOps/s | |
test_unbind_speed_stack0 | 0.2800ms | 0.2311ms | 4.3264 KOps/s | 4.3151 KOps/s | |
test_unbind_speed_stack1 | 92.9464ms | 0.6427ms | 1.5560 KOps/s | 1.5513 KOps/s | |
test_split | 93.6500ms | 1.5872ms | 630.0279 Ops/s | 640.9608 Ops/s | |
test_chunk | 95.7835ms | 1.6012ms | 624.5151 Ops/s | 587.8331 Ops/s | |
test_consolidate[False-None] | 96.2208ms | 2.8902ms | 345.9914 Ops/s | 383.4067 Ops/s | |
test_consolidate[default-None] | 1.8450ms | 1.6980ms | 588.9128 Ops/s | 596.9174 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8652ms | 1.6982ms | 588.8714 Ops/s | 577.3676 Ops/s | |
test_consolidate_njt[False-None] | 6.6552ms | 6.3377ms | 157.7862 Ops/s | 151.6324 Ops/s | |
test_to[False-False-None] | 1.7951ms | 1.6481ms | 606.7465 Ops/s | 613.9869 Ops/s | |
test_to[True-False-None] | 1.6510ms | 1.2922ms | 773.8788 Ops/s | 777.2406 Ops/s | |
test_to[within-False-None] | 4.5622ms | 4.0257ms | 248.4068 Ops/s | 253.2489 Ops/s | |
test_to[True-default-None] | 5.5862ms | 5.1776ms | 193.1407 Ops/s | 199.9816 Ops/s | |
test_to_njt[False-False-None] | 7.3806ms | 6.9411ms | 144.0695 Ops/s | 146.8333 Ops/s | |
test_to_njt[True-False-None] | 5.8206ms | 5.4220ms | 184.4350 Ops/s | 188.4751 Ops/s | |
test_to_njt[within-False-None] | 12.3872ms | 11.9945ms | 83.3714 Ops/s | 85.1553 Ops/s | |
test_creation[device0] | 0.5826ms | 79.2740μs | 12.6145 KOps/s | 12.8214 KOps/s | |
test_creation_from_tensor | 0.4697ms | 81.2200μs | 12.3122 KOps/s | 12.4794 KOps/s | |
test_add_one[memmap_tensor0] | 0.3078ms | 5.9783μs | 167.2706 KOps/s | 167.0615 KOps/s | |
test_contiguous[memmap_tensor0] | 1.7421μs | 0.4103μs | 2.4373 MOps/s | 2.4112 MOps/s | |
test_stack[memmap_tensor0] | 38.6720μs | 4.3693μs | 228.8672 KOps/s | 231.9893 KOps/s | |
test_memmaptd_index | 1.7822ms | 0.2386ms | 4.1905 KOps/s | 4.1593 KOps/s | |
test_memmaptd_index_astensor | 0.9544ms | 0.2962ms | 3.3764 KOps/s | 3.3630 KOps/s | |
test_memmaptd_index_op | 0.9763ms | 0.5392ms | 1.8547 KOps/s | 1.7711 KOps/s | |
test_serialize_model | 0.1328s | 0.1311s | 7.6292 Ops/s | 7.6410 Ops/s | |
test_serialize_model_pickle | 1.3505s | 1.2121s | 0.8250 Ops/s | 0.8244 Ops/s | |
test_serialize_weights | 0.1311s | 0.1301s | 7.6842 Ops/s | 7.6826 Ops/s | |
test_serialize_weights_returnearly | 47.4668ms | 41.2615ms | 24.2357 Ops/s | 14.5642 Ops/s | |
test_serialize_weights_pickle | 1.3483s | 1.2111s | 0.8257 Ops/s | 0.8351 Ops/s | |
test_reshape_pytree | 44.9110μs | 22.2840μs | 44.8752 KOps/s | 43.2435 KOps/s | |
test_reshape_td | 57.5020μs | 26.2096μs | 38.1539 KOps/s | 36.9473 KOps/s | |
test_view_pytree | 52.7620μs | 22.2982μs | 44.8467 KOps/s | 44.9154 KOps/s | |
test_view_td | 53.8520μs | 28.9234μs | 34.5741 KOps/s | 32.2934 KOps/s | |
test_unbind_pytree | 56.9320μs | 27.3736μs | 36.5316 KOps/s | 35.9214 KOps/s | |
test_unbind_td | 0.8847ms | 35.9118μs | 27.8460 KOps/s | 27.6272 KOps/s | |
test_split_pytree | 68.0130μs | 30.2922μs | 33.0118 KOps/s | 33.5186 KOps/s | |
test_split_td | 1.0545ms | 39.4590μs | 25.3427 KOps/s | 25.6437 KOps/s | |
test_add_pytree | 75.7430μs | 32.9817μs | 30.3198 KOps/s | 31.0402 KOps/s | |
test_add_td | 85.3640μs | 46.7498μs | 21.3904 KOps/s | 21.6153 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1766ms | 0.1210ms | 8.2673 KOps/s | 8.0140 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2332ms | 0.1237ms | 8.0858 KOps/s | 7.9891 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1512ms | 96.0847μs | 10.4075 KOps/s | 10.1955 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2063ms | 0.1458ms | 6.8593 KOps/s | 6.8912 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 52.8330μs | 23.6786μs | 42.2323 KOps/s | 32.1528 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1558ms | 27.0708μs | 36.9402 KOps/s | 36.9819 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3305ms | 64.5064μs | 15.5023 KOps/s | 15.0244 KOps/s | |
test_compile_copy_nested[pytree-eager] | 83.4140μs | 49.0404μs | 20.3914 KOps/s | 20.0173 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1818ms | 0.1431ms | 6.9894 KOps/s | 6.8913 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2939ms | 0.2060ms | 4.8548 KOps/s | 4.9437 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1487ms | 98.8316μs | 10.1182 KOps/s | 10.0865 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1069ms | 52.9333μs | 18.8917 KOps/s | 19.7497 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1827ms | 0.1364ms | 7.3314 KOps/s | 7.2912 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5578ms | 0.4647ms | 2.1520 KOps/s | 2.1426 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3872ms | 0.2465ms | 4.0560 KOps/s | 4.1280 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1926ms | 0.1441ms | 6.9406 KOps/s | 6.9402 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1355ms | 63.0568μs | 15.8587 KOps/s | 16.4427 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1553ms | 99.7416μs | 10.0259 KOps/s | 10.1384 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4656ms | 0.4006ms | 2.4964 KOps/s | 2.5515 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1989ms | 0.1367ms | 7.3163 KOps/s | 7.3278 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 64.8730μs | 19.3870μs | 51.5811 KOps/s | 55.8420 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 55.1530μs | 26.9482μs | 37.1082 KOps/s | 36.9472 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1171ms | 69.9369μs | 14.2986 KOps/s | 14.1441 KOps/s | |
test_compile_copy_flat[pytree-eager] | 95.7350μs | 51.4669μs | 19.4300 KOps/s | 19.2706 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6466ms | 0.3937ms | 2.5403 KOps/s | 2.1716 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.6878ms | 2.5431ms | 393.2154 Ops/s | 396.9514 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6114ms | 0.4366ms | 2.2905 KOps/s | 2.2183 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.6610ms | 2.5288ms | 395.4467 Ops/s | 394.0710 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1757ms | 0.1107ms | 9.0294 KOps/s | 9.0522 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5769ms | 76.0322μs | 13.1523 KOps/s | 13.1180 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2454ms | 0.1036ms | 9.6529 KOps/s | 9.8426 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1602ms | 66.7969μs | 14.9708 KOps/s | 15.3697 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2162ms | 0.1041ms | 9.6037 KOps/s | 9.6389 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1892ms | 67.2061μs | 14.8796 KOps/s | 15.5617 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2161ms | 0.1000ms | 9.9991 KOps/s | 9.8601 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1477ms | 16.8737μs | 59.2638 KOps/s | 59.4377 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1598ms | 95.4252μs | 10.4794 KOps/s | 10.5524 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1084ms | 15.9885μs | 62.5450 KOps/s | 64.4333 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2053ms | 95.7604μs | 10.4427 KOps/s | 10.4359 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 98.5840μs | 15.7850μs | 63.3513 KOps/s | 64.7694 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1951ms | 0.1012ms | 9.8766 KOps/s | 9.9869 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5645ms | 16.6845μs | 59.9359 KOps/s | 60.5537 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1935ms | 96.2237μs | 10.3924 KOps/s | 10.4474 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1523ms | 15.8605μs | 63.0497 KOps/s | 64.6856 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1875ms | 96.3209μs | 10.3820 KOps/s | 10.4864 KOps/s | |
test_compile_indexing[int-pytree-eager] | 97.6240μs | 15.8723μs | 63.0027 KOps/s | 64.6225 KOps/s | |
test_mod_add[eager] | 0.1338ms | 30.3562μs | 32.9422 KOps/s | 32.5019 KOps/s | |
test_mod_add[compile] | 0.1867ms | 77.8205μs | 12.8501 KOps/s | 12.5743 KOps/s | |
test_mod_add[compile-overhead] | 0.3134ms | 0.1641ms | 6.0929 KOps/s | 5.7693 KOps/s | |
test_mod_wrap[eager] | 0.3347ms | 0.2298ms | 4.3514 KOps/s | 4.3107 KOps/s | |
test_mod_wrap[compile] | 1.6067ms | 0.2822ms | 3.5436 KOps/s | 3.5561 KOps/s | |
test_mod_wrap[compile-overhead] | 7.4127ms | 3.8386ms | 260.5113 Ops/s | 264.0540 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5361ms | 1.3258ms | 754.2898 Ops/s | 718.1627 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4667ms | 1.2863ms | 777.4345 Ops/s | 726.4200 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4137ms | 0.9820ms | 1.0184 KOps/s | 972.0550 Ops/s | |
test_seq_add[eager] | 0.1611ms | 96.3985μs | 10.3736 KOps/s | 10.2988 KOps/s | |
test_seq_add[compile] | 0.1813ms | 87.4066μs | 11.4408 KOps/s | 11.6527 KOps/s | |
test_seq_add[compile-overhead] | 0.1844ms | 0.1284ms | 7.7869 KOps/s | 7.8182 KOps/s | |
test_seq_wrap[eager] | 0.4513ms | 0.3720ms | 2.6885 KOps/s | 2.6666 KOps/s | |
test_seq_wrap[compile] | 0.4208ms | 0.2990ms | 3.3448 KOps/s | 3.3431 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2986ms | 0.2221ms | 4.5017 KOps/s | 4.4929 KOps/s | |
test_func_call_runtime[False-eager] | 0.8043ms | 0.7105ms | 1.4074 KOps/s | 1.4183 KOps/s | |
test_func_call_runtime[False-compile] | 0.8246ms | 0.7436ms | 1.3448 KOps/s | 1.3685 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4746ms | 0.3636ms | 2.7505 KOps/s | 2.7170 KOps/s | |
test_func_call_runtime[True-eager] | 0.9982ms | 0.8735ms | 1.1448 KOps/s | 1.1339 KOps/s | |
test_func_call_runtime[True-compile] | 0.8392ms | 0.7681ms | 1.3019 KOps/s | 1.3035 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4589ms | 0.3845ms | 2.6005 KOps/s | 2.5976 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8330ms | 0.7429ms | 1.3461 KOps/s | 1.3012 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9138ms | 0.7545ms | 1.3253 KOps/s | 1.3286 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4223ms | 0.3650ms | 2.7394 KOps/s | 2.7332 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0796ms | 0.9699ms | 1.0310 KOps/s | 1.0270 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8601ms | 0.7954ms | 1.2572 KOps/s | 1.2557 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4615ms | 0.4101ms | 2.4383 KOps/s | 2.4273 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4884ms | 1.9833ms | 504.1984 Ops/s | 498.3557 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8893ms | 0.8092ms | 1.2358 KOps/s | 1.1694 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4668ms | 0.4144ms | 2.4131 KOps/s | 2.4129 KOps/s | |
test_distributed | 1.8704ms | 0.1853ms | 5.3962 KOps/s | 8.7547 KOps/s | |
test_tdmodule | 48.0830μs | 14.0397μs | 71.2266 KOps/s | 68.8997 KOps/s | |
test_tdmodule_dispatch | 67.4630μs | 27.6255μs | 36.1984 KOps/s | 34.0742 KOps/s | |
test_tdseq | 47.3820μs | 15.5224μs | 64.4231 KOps/s | 61.9709 KOps/s | |
test_tdseq_dispatch | 52.1920μs | 30.2898μs | 33.0144 KOps/s | 31.2162 KOps/s | |
test_instantiation_functorch | 1.6298ms | 1.5419ms | 648.5319 Ops/s | 650.8334 Ops/s | |
test_exec_functorch | 0.1974ms | 0.1386ms | 7.2136 KOps/s | 7.0092 KOps/s | |
test_exec_functional_call | 0.1686ms | 0.1314ms | 7.6126 KOps/s | 7.5589 KOps/s | |
test_exec_td_decorator | 0.3829ms | 0.1767ms | 5.6603 KOps/s | 5.6767 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7845ms | 0.6478ms | 1.5438 KOps/s | 1.5255 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7470ms | 0.6475ms | 1.5444 KOps/s | 1.5304 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7163ms | 0.5648ms | 1.7706 KOps/s | 1.7542 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6727ms | 0.5679ms | 1.7608 KOps/s | 1.7544 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.5943ms | 18.4224ms | 54.2818 Ops/s | 54.4962 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 18.5803ms | 18.4536ms | 54.1898 Ops/s | 54.3603 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.4517ms | 18.2705ms | 54.7331 Ops/s | 55.0474 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.5464ms | 18.3361ms | 54.5372 Ops/s | 54.8736 Ops/s | |
test_to_module_speed[True] | 1.2946ms | 0.9244ms | 1.0817 KOps/s | 1.0487 KOps/s | |
test_to_module_speed[False] | 1.4830ms | 0.9197ms | 1.0873 KOps/s | 1.0781 KOps/s | |
test_tc_init | 61.7030μs | 34.7133μs | 28.8074 KOps/s | 27.1292 KOps/s | |
test_tc_init_nested | 0.1733ms | 71.2934μs | 14.0266 KOps/s | 13.2134 KOps/s | |
test_tc_first_layer_tensor | 5.2360μs | 0.6982μs | 1.4322 MOps/s | 1.4249 MOps/s | |
test_tc_first_layer_nontensor | 53.9730μs | 2.2889μs | 436.8872 KOps/s | 427.2113 KOps/s | |
test_tc_second_layer_tensor | 9.3703μs | 1.4271μs | 700.6983 KOps/s | 699.2530 KOps/s | |
test_tc_second_layer_nontensor | 29.1020μs | 3.0469μs | 328.2033 KOps/s | 323.9309 KOps/s | |
test_unbind | 0.2376s | 10.0010ms | 99.9899 Ops/s | 152.2043 Ops/s | |
test_full_like | 12.9254ms | 9.1288ms | 109.5428 Ops/s | 109.4642 Ops/s | |
test_zeros_like | 5.2746ms | 4.3234ms | 231.2984 Ops/s | 231.8210 Ops/s | |
test_ones_like | 9.2153ms | 7.2211ms | 138.4839 Ops/s | 231.5149 Ops/s | |
test_clone | 6.8334ms | 6.3153ms | 158.3466 Ops/s | 157.9370 Ops/s | |
test_squeeze | 57.6530μs | 9.4195μs | 106.1630 KOps/s | 107.4572 KOps/s | |
test_unsqueeze | 0.1204ms | 71.0344μs | 14.0777 KOps/s | 13.8297 KOps/s | |
test_split | 0.3762ms | 0.1587ms | 6.3003 KOps/s | 6.2757 KOps/s | |
test_permute | 0.2253ms | 0.1812ms | 5.5179 KOps/s | 5.3962 KOps/s | |
test_stack | 51.1818ms | 50.7562ms | 19.7020 Ops/s | 19.6945 Ops/s | |
test_cat | 51.3135ms | 50.5710ms | 19.7742 Ops/s | 19.7232 Ops/s |
vmoens force-pushed the params-new-unsafe branch from 0166842 to 2c04c84 on November 20, 2024 10:44
Labels: CLA Signed, Refactor (Refactoring code - not a new feature)