[Quality] Avoid torch.distributed imports at root #1134
Merged
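The PR description is not reproduced in this capture, so the following is only a minimal sketch of one common way to keep `torch.distributed` out of a package's root import: defer the import to the function that needs it. The names `_load` and `dist_is_available` are hypothetical and are not taken from the PR or from the tensordict API.

```python
import importlib


def _load(name):
    """Hypothetical helper: import a module by name at call time."""
    return importlib.import_module(name)


def dist_is_available(module_name="torch.distributed"):
    """Report whether the distributed backend can be imported and is usable.

    The import happens inside the function body, so importing the enclosing
    package does not import the submodule as a side effect.
    """
    try:
        dist = _load(module_name)
    except ImportError:
        # Covers both a missing top-level package and a missing submodule.
        return False
    # torch.distributed exposes is_available(); modules without that
    # attribute are treated as available (an assumption of this sketch).
    return bool(getattr(dist, "is_available", lambda: True)())
```

With this layout, code paths that never call `dist_is_available()` never pay the import cost, and environments where the submodule cannot be imported still load the package.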
vmoens added a commit that referenced this pull request on Dec 9, 2024:
ghstack-source-id: f773ed94d4b7ca13c603f32251f0735c751ebf94
Pull Request resolved: #1134
facebook-github-bot added the CLA Signed label on Dec 9, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 43.5410μs | 18.4301μs | 54.2590 KOps/s | 56.7546 KOps/s | |
test_plain_set_stack_nested | 47.3590μs | 18.3522μs | 54.4893 KOps/s | 54.9862 KOps/s | |
test_plain_set_nested_inplace | 74.1980μs | 19.9350μs | 50.1630 KOps/s | 50.6672 KOps/s | |
test_plain_set_stack_nested_inplace | 58.9790μs | 19.7323μs | 50.6782 KOps/s | 51.1127 KOps/s | |
test_items | 28.0520μs | 4.1968μs | 238.2742 KOps/s | 238.8849 KOps/s | |
test_items_nested | 0.5753ms | 0.3994ms | 2.5041 KOps/s | 2.5149 KOps/s | |
test_items_nested_locked | 0.7397ms | 0.4025ms | 2.4844 KOps/s | 2.5396 KOps/s | |
test_items_nested_leaf | 0.1155ms | 70.2419μs | 14.2365 KOps/s | 14.1354 KOps/s | |
test_items_stack_nested | 0.7601ms | 0.4051ms | 2.4687 KOps/s | 2.4859 KOps/s | |
test_items_stack_nested_leaf | 0.1323ms | 72.5296μs | 13.7875 KOps/s | 13.5671 KOps/s | |
test_items_stack_nested_locked | 0.7702ms | 0.4077ms | 2.4530 KOps/s | 2.4961 KOps/s | |
test_keys | 52.0470μs | 3.5159μs | 284.4207 KOps/s | 278.7372 KOps/s | |
test_keys_nested | 0.2294ms | 0.1365ms | 7.3248 KOps/s | 7.3771 KOps/s | |
test_keys_nested_locked | 0.7393ms | 0.1424ms | 7.0210 KOps/s | 6.8795 KOps/s | |
test_keys_nested_leaf | 1.9661ms | 0.1165ms | 8.5817 KOps/s | 8.5114 KOps/s | |
test_keys_stack_nested | 0.2467ms | 0.1374ms | 7.2795 KOps/s | 7.3577 KOps/s | |
test_keys_stack_nested_leaf | 0.2299ms | 0.1170ms | 8.5473 KOps/s | 8.5687 KOps/s | |
test_keys_stack_nested_locked | 0.3003ms | 0.1403ms | 7.1270 KOps/s | 7.0549 KOps/s | |
test_values | 9.3234μs | 1.0835μs | 922.8998 KOps/s | 949.6284 KOps/s | |
test_values_nested | 0.1090ms | 54.7677μs | 18.2589 KOps/s | 17.7571 KOps/s | |
test_values_nested_locked | 0.3581ms | 55.7024μs | 17.9525 KOps/s | 17.8835 KOps/s | |
test_values_nested_leaf | 0.1168ms | 59.8711μs | 16.7026 KOps/s | 16.3318 KOps/s | |
test_values_stack_nested | 0.1003ms | 55.3606μs | 18.0634 KOps/s | 16.6886 KOps/s | |
test_values_stack_nested_leaf | 0.1172ms | 59.8440μs | 16.7101 KOps/s | 15.1867 KOps/s | |
test_values_stack_nested_locked | 0.1398ms | 55.5277μs | 18.0090 KOps/s | 17.3509 KOps/s | |
test_membership | 21.7900μs | 0.9029μs | 1.1076 MOps/s | 1.1592 MOps/s | |
test_membership_nested | 0.1191ms | 2.9584μs | 338.0217 KOps/s | 342.5148 KOps/s | |
test_membership_nested_leaf | 44.5930μs | 2.9506μs | 338.9104 KOps/s | 336.3967 KOps/s | |
test_membership_stacked_nested | 29.9760μs | 2.9137μs | 343.2021 KOps/s | 338.2489 KOps/s | |
test_membership_stacked_nested_leaf | 24.5060μs | 2.8939μs | 345.5552 KOps/s | 342.7936 KOps/s | |
test_membership_nested_last | 46.4460μs | 4.2244μs | 236.7212 KOps/s | 238.3727 KOps/s | |
test_membership_nested_leaf_last | 20.5880μs | 4.2703μs | 234.1770 KOps/s | 234.3640 KOps/s | |
test_membership_stacked_nested_last | 52.7780μs | 4.1826μs | 239.0866 KOps/s | 203.1504 KOps/s | |
test_membership_stacked_nested_leaf_last | 39.2230μs | 4.2463μs | 235.4972 KOps/s | 194.8626 KOps/s | |
test_nested_getleaf | 60.2320μs | 10.9574μs | 91.2623 KOps/s | 91.7585 KOps/s | |
test_nested_get | 59.3410μs | 10.4423μs | 95.7645 KOps/s | 91.9844 KOps/s | |
test_stacked_getleaf | 32.4110μs | 10.8726μs | 91.9745 KOps/s | 90.8673 KOps/s | |
test_stacked_get | 61.4650μs | 10.3342μs | 96.7659 KOps/s | 95.9426 KOps/s | |
test_nested_getitemleaf | 65.7830μs | 11.4193μs | 87.5707 KOps/s | 88.2937 KOps/s | |
test_nested_getitem | 34.4440μs | 10.6038μs | 94.3062 KOps/s | 94.0639 KOps/s | |
test_stacked_getitemleaf | 63.2270μs | 11.1014μs | 90.0787 KOps/s | 87.1568 KOps/s | |
test_stacked_getitem | 33.2930μs | 10.5954μs | 94.3802 KOps/s | 94.1520 KOps/s | |
test_lock_nested | 1.7478ms | 0.4442ms | 2.2511 KOps/s | 2.2554 KOps/s | |
test_lock_stack_nested | 0.7699ms | 0.4155ms | 2.4066 KOps/s | 2.4173 KOps/s | |
test_unlock_nested | 0.8586ms | 0.3649ms | 2.7408 KOps/s | 2.7333 KOps/s | |
test_unlock_stack_nested | 0.4930ms | 0.3334ms | 2.9998 KOps/s | 3.0274 KOps/s | |
test_flatten_speed | 0.1874ms | 94.8687μs | 10.5409 KOps/s | 10.6572 KOps/s | |
test_unflatten_speed | 1.0057ms | 0.5018ms | 1.9927 KOps/s | 2.0299 KOps/s | |
test_common_ops | 1.8207ms | 0.7941ms | 1.2594 KOps/s | 1.2770 KOps/s | |
test_creation | 23.5530μs | 2.1708μs | 460.6635 KOps/s | 483.5192 KOps/s | |
test_creation_empty | 53.9610μs | 11.7334μs | 85.2266 KOps/s | 92.1510 KOps/s | |
test_creation_nested_1 | 63.7390μs | 14.5043μs | 68.9449 KOps/s | 73.7634 KOps/s | |
test_creation_nested_2 | 54.1110μs | 18.8346μs | 53.0938 KOps/s | 56.0682 KOps/s | |
test_clone | 1.3663ms | 13.1169μs | 76.2377 KOps/s | 75.0308 KOps/s | |
test_getitem[int] | 0.8502ms | 12.6266μs | 79.1976 KOps/s | 75.5386 KOps/s | |
test_getitem[slice_int] | 0.1403ms | 24.5394μs | 40.7507 KOps/s | 38.1230 KOps/s | |
test_getitem[range] | 0.1739ms | 45.9735μs | 21.7517 KOps/s | 20.5428 KOps/s | |
test_getitem[tuple] | 0.1486ms | 20.5018μs | 48.7762 KOps/s | 47.4913 KOps/s | |
test_getitem[list] | 0.6778ms | 42.1813μs | 23.7072 KOps/s | 22.4317 KOps/s | |
test_setitem_dim[int] | 65.5220μs | 25.0777μs | 39.8760 KOps/s | 37.7132 KOps/s | |
test_setitem_dim[slice_int] | 0.1005ms | 52.6702μs | 18.9861 KOps/s | 18.5506 KOps/s | |
test_setitem_dim[range] | 0.1139ms | 73.7888μs | 13.5522 KOps/s | 13.6839 KOps/s | |
test_setitem_dim[tuple] | 79.7390μs | 40.9690μs | 24.4087 KOps/s | 24.2754 KOps/s | |
test_setitem | 0.2992ms | 20.4102μs | 48.9952 KOps/s | 47.7192 KOps/s | |
test_set | 61.4450μs | 20.1889μs | 49.5321 KOps/s | 49.8311 KOps/s | |
test_set_shared | 4.9586ms | 0.1661ms | 6.0219 KOps/s | 5.8375 KOps/s | |
test_update | 0.2910ms | 23.2034μs | 43.0971 KOps/s | 44.7617 KOps/s | |
test_update_nested | 0.3412ms | 33.5034μs | 29.8478 KOps/s | 30.1476 KOps/s | |
test_update__nested | 0.9509ms | 32.3877μs | 30.8760 KOps/s | 29.9881 KOps/s | |
test_set_nested | 0.3029ms | 22.5650μs | 44.3164 KOps/s | 44.3194 KOps/s | |
test_set_nested_new | 0.3371ms | 26.9096μs | 37.1615 KOps/s | 36.8485 KOps/s | |
test_select | 0.3385ms | 43.3327μs | 23.0772 KOps/s | 23.1463 KOps/s | |
test_select_nested | 0.1184ms | 59.9520μs | 16.6800 KOps/s | 16.5281 KOps/s | |
test_exclude_nested | 0.1432ms | 78.1573μs | 12.7947 KOps/s | 12.7634 KOps/s | |
test_empty[True] | 0.5205ms | 0.3794ms | 2.6356 KOps/s | 2.5583 KOps/s | |
test_empty[False] | 7.7093μs | 1.2675μs | 788.9724 KOps/s | 825.2033 KOps/s | |
test_unbind_speed | 0.3452ms | 0.2582ms | 3.8724 KOps/s | 3.7684 KOps/s | |
test_unbind_speed_stack0 | 0.4563ms | 0.2565ms | 3.8991 KOps/s | 3.9185 KOps/s | |
test_unbind_speed_stack1 | 0.1041s | 0.7663ms | 1.3049 KOps/s | 1.4626 KOps/s | |
test_split | 0.1061s | 1.7285ms | 578.5347 Ops/s | 562.7803 Ops/s | |
test_chunk | 0.1141s | 1.7505ms | 571.2594 Ops/s | 560.8005 Ops/s | |
test_consolidate_njt[False-None] | 9.2629ms | 8.0162ms | 124.7475 Ops/s | 123.8444 Ops/s | |
test_creation[device0] | 0.2330ms | 91.1631μs | 10.9694 KOps/s | 10.9759 KOps/s | |
test_creation_from_tensor | 3.6791ms | 94.8671μs | 10.5411 KOps/s | 10.5344 KOps/s | |
test_add_one[memmap_tensor0] | 0.3205ms | 4.7236μs | 211.7036 KOps/s | 194.2059 KOps/s | |
test_contiguous[memmap_tensor0] | 29.4750μs | 0.4954μs | 2.0184 MOps/s | 1.9379 MOps/s | |
test_stack[memmap_tensor0] | 79.0370μs | 3.2429μs | 308.3706 KOps/s | 270.8130 KOps/s | |
test_memmaptd_index | 0.9531ms | 0.2378ms | 4.2045 KOps/s | 4.1402 KOps/s | |
test_memmaptd_index_astensor | 0.5961ms | 0.3157ms | 3.1678 KOps/s | 3.1195 KOps/s | |
test_memmaptd_index_op | 0.9671ms | 0.5912ms | 1.6915 KOps/s | 1.7218 KOps/s | |
test_serialize_model | 0.1234s | 0.1162s | 8.6029 Ops/s | 7.5223 Ops/s | |
test_serialize_model_pickle | 0.4909s | 0.4009s | 2.4941 Ops/s | 2.5490 Ops/s | |
test_serialize_weights | 0.1221s | 0.1139s | 8.7791 Ops/s | 8.6279 Ops/s | |
test_serialize_weights_returnearly | 0.1770s | 0.1613s | 6.2010 Ops/s | 6.2926 Ops/s | |
test_serialize_weights_pickle | 1.1089s | 0.7066s | 1.4153 Ops/s | 2.4664 Ops/s | |
test_serialize_weights_filesystem | 0.1461s | 0.1411s | 7.0861 Ops/s | 7.0263 Ops/s | |
test_serialize_model_filesystem | 0.1523s | 0.1424s | 7.0219 Ops/s | 6.1495 Ops/s | |
test_reshape_pytree | 76.2020μs | 27.1957μs | 36.7706 KOps/s | 35.6946 KOps/s | |
test_reshape_td | 82.5340μs | 32.7572μs | 30.5276 KOps/s | 30.1733 KOps/s | |
test_view_pytree | 62.0760μs | 27.0348μs | 36.9894 KOps/s | 36.1786 KOps/s | |
test_view_td | 0.2194ms | 37.9093μs | 26.3788 KOps/s | 25.8734 KOps/s | |
test_unbind_pytree | 72.1740μs | 29.9216μs | 33.4207 KOps/s | 32.7467 KOps/s | |
test_unbind_td | 0.3744ms | 38.8667μs | 25.7290 KOps/s | 25.3864 KOps/s | |
test_split_pytree | 72.4440μs | 29.4333μs | 33.9751 KOps/s | 32.6842 KOps/s | |
test_split_td | 0.1081s | 54.1327μs | 18.4731 KOps/s | 21.3996 KOps/s | |
test_add_pytree | 0.1042ms | 35.2590μs | 28.3615 KOps/s | 27.6137 KOps/s | |
test_add_td | 0.1213ms | 57.9860μs | 17.2455 KOps/s | 17.8674 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1561ms | 61.2064μs | 16.3382 KOps/s | 15.7102 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4035ms | 0.1598ms | 6.2587 KOps/s | 6.2132 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 97.5720μs | 44.9326μs | 22.2555 KOps/s | 21.9296 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2882ms | 0.1200ms | 8.3302 KOps/s | 8.3683 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 84.5680μs | 26.0079μs | 38.4499 KOps/s | 35.0506 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1224ms | 53.5662μs | 18.6685 KOps/s | 18.6058 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1737ms | 79.3515μs | 12.6022 KOps/s | 12.5224 KOps/s | |
test_compile_copy_nested[pytree-eager] | 2.7558ms | 68.0082μs | 14.7041 KOps/s | 14.6993 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2273ms | 0.1046ms | 9.5647 KOps/s | 9.2300 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3131ms | 0.1972ms | 5.0703 KOps/s | 4.9115 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1064ms | 44.1751μs | 22.6372 KOps/s | 21.2262 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5513ms | 60.9492μs | 16.4071 KOps/s | 16.2142 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2164ms | 0.1027ms | 9.7379 KOps/s | 9.5369 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.2883ms | 0.2002ms | 4.9938 KOps/s | 4.9633 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3792ms | 0.2069ms | 4.8338 KOps/s | 4.6942 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2214ms | 0.1039ms | 9.6223 KOps/s | 9.5364 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1693ms | 54.9825μs | 18.1876 KOps/s | 18.2439 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 96.8810μs | 45.5802μs | 21.9394 KOps/s | 21.1829 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5782ms | 0.1585ms | 6.3102 KOps/s | 6.2881 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2026ms | 0.1025ms | 9.7531 KOps/s | 9.6708 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 70.6820μs | 20.7126μs | 48.2797 KOps/s | 45.1753 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1589ms | 60.5905μs | 16.5042 KOps/s | 16.7925 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1584ms | 81.0871μs | 12.3324 KOps/s | 12.2722 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1556ms | 68.5937μs | 14.5786 KOps/s | 14.3457 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3019ms | 0.2042ms | 4.8981 KOps/s | 4.8379 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.5015ms | 1.3159ms | 759.9149 Ops/s | 768.9831 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3220ms | 0.1971ms | 5.0731 KOps/s | 4.9070 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3495ms | 0.7736ms | 1.2927 KOps/s | 1.2690 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5380ms | 0.4423ms | 2.2607 KOps/s | 2.1732 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.4488ms | 2.5818ms | 387.3200 Ops/s | 385.6056 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1250ms | 34.7766μs | 28.7550 KOps/s | 26.9807 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5529ms | 33.1057μs | 30.2062 KOps/s | 30.1061 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 69.0890μs | 28.2200μs | 35.4358 KOps/s | 33.5975 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 71.0320μs | 23.1401μs | 43.2150 KOps/s | 41.4173 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 73.9280μs | 29.2936μs | 34.1371 KOps/s | 33.0434 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 66.0330μs | 23.3956μs | 42.7431 KOps/s | 42.3181 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1423ms | 49.8322μs | 20.0673 KOps/s | 19.0591 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5434ms | 20.0387μs | 49.9033 KOps/s | 47.5964 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1051ms | 43.2209μs | 23.1369 KOps/s | 22.0161 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 58.6600μs | 19.3908μs | 51.5709 KOps/s | 51.7176 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1307ms | 44.4189μs | 22.5130 KOps/s | 21.7948 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 65.5530μs | 19.3740μs | 51.6156 KOps/s | 51.8418 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1040ms | 51.2658μs | 19.5062 KOps/s | 18.7086 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8729ms | 20.1648μs | 49.5913 KOps/s | 48.2298 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1162ms | 44.5127μs | 22.4655 KOps/s | 21.9905 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2824ms | 19.3727μs | 51.6191 KOps/s | 51.7972 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1528ms | 44.2944μs | 22.5762 KOps/s | 22.3082 KOps/s | |
test_compile_indexing[int-pytree-eager] | 71.8650μs | 19.4005μs | 51.5451 KOps/s | 52.4106 KOps/s | |
test_mod_add[eager] | 0.3562ms | 34.9879μs | 28.5813 KOps/s | 29.3376 KOps/s | |
test_mod_add[compile] | 0.1247ms | 46.7111μs | 21.4082 KOps/s | 20.6576 KOps/s | |
test_mod_add[compile-overhead] | 0.1060ms | 45.8890μs | 21.7917 KOps/s | 20.6281 KOps/s | |
test_mod_wrap[eager] | 0.4403ms | 0.2232ms | 4.4797 KOps/s | 4.4111 KOps/s | |
test_mod_wrap[compile] | 0.3136ms | 0.2018ms | 4.9551 KOps/s | 4.7368 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3433ms | 0.2001ms | 4.9978 KOps/s | 4.7775 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.4312ms | 11.5987ms | 86.2163 Ops/s | 84.7359 Ops/s | |
test_mod_wrap_and_backward[compile] | 19.0724ms | 12.4183ms | 80.5264 Ops/s | 79.1348 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.4300ms | 11.1296ms | 89.8507 Ops/s | 75.9496 Ops/s | |
test_seq_add[eager] | 0.2379ms | 0.1124ms | 8.8975 KOps/s | 9.0822 KOps/s | |
test_seq_add[compile] | 0.1066ms | 60.9707μs | 16.4013 KOps/s | 15.8282 KOps/s | |
test_seq_add[compile-overhead] | 0.1741ms | 58.8631μs | 16.9886 KOps/s | 16.4253 KOps/s | |
test_seq_wrap[eager] | 0.8666ms | 0.4397ms | 2.2741 KOps/s | 2.2677 KOps/s | |
test_seq_wrap[compile] | 0.4110ms | 0.2232ms | 4.4813 KOps/s | 4.2090 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3910ms | 0.2222ms | 4.4999 KOps/s | 4.2957 KOps/s | |
test_func_call_runtime[False-eager] | 0.9239ms | 0.5465ms | 1.8298 KOps/s | 1.8050 KOps/s | |
test_func_call_runtime[False-compile] | 0.8073ms | 0.4213ms | 2.3737 KOps/s | 2.2959 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5212ms | 0.4213ms | 2.3737 KOps/s | 2.2975 KOps/s | |
test_func_call_runtime[True-eager] | 1.1824ms | 0.7575ms | 1.3202 KOps/s | 1.2961 KOps/s | |
test_func_call_runtime[True-compile] | 0.8238ms | 0.4658ms | 2.1469 KOps/s | 2.0853 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5748ms | 0.4642ms | 2.1541 KOps/s | 2.0925 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7458ms | 0.5439ms | 1.8384 KOps/s | 1.8062 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9013ms | 0.4209ms | 2.3759 KOps/s | 2.3107 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7860ms | 0.4226ms | 2.3663 KOps/s | 2.3164 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0535ms | 0.8935ms | 1.1193 KOps/s | 1.1071 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.7966ms | 0.4904ms | 2.0392 KOps/s | 2.0122 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6045ms | 0.4834ms | 2.0685 KOps/s | 2.0128 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.2219ms | 1.8666ms | 535.7409 Ops/s | 527.4590 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7114ms | 0.5141ms | 1.9450 KOps/s | 1.8843 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.6328ms | 0.5133ms | 1.9483 KOps/s | 1.8868 KOps/s | |
test_distributed | 0.2224ms | 0.1254ms | 7.9776 KOps/s | 7.6644 KOps/s | |
test_tdmodule | 57.6480μs | 26.2231μs | 38.1343 KOps/s | 38.3117 KOps/s | |
test_tdmodule_dispatch | 81.9330μs | 47.8501μs | 20.8986 KOps/s | 20.5457 KOps/s | |
test_tdseq | 42.3280μs | 26.5696μs | 37.6370 KOps/s | 38.0765 KOps/s | |
test_tdseq_dispatch | 92.1620μs | 51.1636μs | 19.5452 KOps/s | 20.0698 KOps/s | |
test_instantiation_functorch | 2.7351ms | 1.5309ms | 653.1977 Ops/s | 646.4587 Ops/s | |
test_exec_functorch | 0.2933ms | 0.1807ms | 5.5334 KOps/s | 5.3813 KOps/s | |
test_exec_functional_call | 0.4015ms | 0.1743ms | 5.7373 KOps/s | 5.8295 KOps/s | |
test_exec_td_decorator | 0.5770ms | 0.2286ms | 4.3753 KOps/s | 4.1729 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9696ms | 0.6652ms | 1.5033 KOps/s | 1.5188 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8629ms | 0.6450ms | 1.5503 KOps/s | 1.5496 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8145ms | 0.5188ms | 1.9277 KOps/s | 1.8944 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9514ms | 0.5275ms | 1.8956 KOps/s | 1.8888 KOps/s | |
test_to_module_speed[True] | 1.8791ms | 1.2769ms | 783.1539 Ops/s | 763.2110 Ops/s | |
test_to_module_speed[False] | 1.9707ms | 1.2611ms | 792.9718 Ops/s | 798.5748 Ops/s | |
test_tc_init | 89.0960μs | 48.1496μs | 20.7686 KOps/s | 22.4475 KOps/s | |
test_tc_init_nested | 0.1869ms | 93.7676μs | 10.6647 KOps/s | 10.9262 KOps/s | |
test_tc_first_layer_tensor | 41.5000μs | 1.4867μs | 672.6243 KOps/s | 659.7877 KOps/s | |
test_tc_first_layer_nontensor | 29.8560μs | 4.7152μs | 212.0795 KOps/s | 204.6342 KOps/s | |
test_tc_second_layer_tensor | 34.3440μs | 2.7573μs | 362.6695 KOps/s | 362.5930 KOps/s | |
test_tc_second_layer_nontensor | 53.9810μs | 6.0627μs | 164.9421 KOps/s | 167.2721 KOps/s | |
test_unbind | 0.2240s | 13.0261ms | 76.7692 Ops/s | 75.9648 Ops/s | |
test_full_like | 8.9543ms | 7.2131ms | 138.6362 Ops/s | 130.3522 Ops/s | |
test_zeros_like | 3.6484ms | 2.7963ms | 357.6154 Ops/s | 327.9688 Ops/s | |
test_ones_like | 11.9553ms | 6.3870ms | 156.5668 Ops/s | 292.1165 Ops/s | |
test_clone | 10.7872ms | 8.2736ms | 120.8660 Ops/s | 156.6369 Ops/s | |
test_squeeze | 79.7890μs | 12.0364μs | 83.0813 KOps/s | 81.9017 KOps/s | |
test_unsqueeze | 0.1518ms | 89.5803μs | 11.1632 KOps/s | 10.9897 KOps/s | |
test_split | 0.5494ms | 0.1950ms | 5.1276 KOps/s | 5.0201 KOps/s | |
test_permute | 0.3865ms | 0.2041ms | 4.8997 KOps/s | 4.8554 KOps/s | |
test_stack | 28.6310ms | 24.6912ms | 40.5002 Ops/s | 37.6961 Ops/s | |
test_cat | 27.4176ms | 24.5151ms | 40.7912 Ops/s | 37.8607 Ops/s |
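The Change column is empty in this capture. As a rough aid for reading the table above, the sketch below computes a relative change from the Ops and Ops on Repo HEAD cells of a row. The formula (PR throughput minus HEAD throughput, divided by HEAD throughput) is an assumption about what the column is meant to show, and `parse_ops` and `relative_change` are hypothetical helper names.

```python
# Unit multipliers for the throughput strings used in the table.
_OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text):
    """Convert a string such as '54.2590 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * _OPS_UNITS[unit]


def relative_change(row):
    """Return (test name, percent change of PR throughput vs. repo HEAD)."""
    cells = [cell.strip() for cell in row.split("|")]
    name, ops_pr, ops_head = cells[0], cells[3], cells[4]
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    # Assumed definition: positive means the PR branch is faster than HEAD.
    return name, 100.0 * (pr - head) / head


rows = [
    "test_plain_set_nested | 43.5410μs | 18.4301μs | 54.2590 KOps/s | 56.7546 KOps/s | |",
    "test_serialize_weights_pickle | 1.1089s | 0.7066s | 1.4153 Ops/s | 2.4664 Ops/s | |",
]
for row in rows:
    name, pct = relative_change(row)
    print(f"{name}: {pct:+.1f}%")
```

Single-run microbenchmarks are noisy, so differences of a few percent on rows with large Max values are best read as within run-to-run variation rather than as regressions.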

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.8710μs | 11.0041μs | 90.8748 KOps/s | 99.2428 KOps/s | |
test_plain_set_stack_nested | 34.8010μs | 11.0622μs | 90.3981 KOps/s | 99.1158 KOps/s | |
test_plain_set_nested_inplace | 0.2097ms | 11.9140μs | 83.9351 KOps/s | 91.2335 KOps/s | |
test_plain_set_stack_nested_inplace | 42.0310μs | 11.9248μs | 83.8591 KOps/s | 91.2350 KOps/s | |
test_items | 36.4910μs | 2.8894μs | 346.0947 KOps/s | 343.0515 KOps/s | |
test_items_nested | 0.5518ms | 0.3553ms | 2.8148 KOps/s | 2.8045 KOps/s | |
test_items_nested_locked | 0.4142ms | 0.3573ms | 2.7984 KOps/s | 2.7966 KOps/s | |
test_items_nested_leaf | 82.0110μs | 57.9861μs | 17.2455 KOps/s | 17.1991 KOps/s | |
test_items_stack_nested | 0.4009ms | 0.3545ms | 2.8206 KOps/s | 2.7645 KOps/s | |
test_items_stack_nested_leaf | 86.0520μs | 57.7730μs | 17.3091 KOps/s | 16.6326 KOps/s | |
test_items_stack_nested_locked | 0.3962ms | 0.3563ms | 2.8062 KOps/s | 2.7876 KOps/s | |
test_keys | 33.8600μs | 3.4879μs | 286.7093 KOps/s | 288.1990 KOps/s | |
test_keys_nested | 0.1117ms | 70.5725μs | 14.1698 KOps/s | 13.9837 KOps/s | |
test_keys_nested_locked | 0.7801ms | 76.8735μs | 13.0084 KOps/s | 12.9818 KOps/s | |
test_keys_nested_leaf | 97.4020μs | 61.6455μs | 16.2218 KOps/s | 16.1851 KOps/s | |
test_keys_stack_nested | 0.1277ms | 70.7646μs | 14.1314 KOps/s | 13.9639 KOps/s | |
test_keys_stack_nested_leaf | 0.1011ms | 61.2538μs | 16.3255 KOps/s | 15.8927 KOps/s | |
test_keys_stack_nested_locked | 0.1125ms | 76.2963μs | 13.1068 KOps/s | 12.9378 KOps/s | |
test_values | 6.5102μs | 0.8530μs | 1.1724 MOps/s | 1.1348 MOps/s | |
test_values_nested | 95.2620μs | 31.4049μs | 31.8421 KOps/s | 32.1721 KOps/s | |
test_values_nested_locked | 64.3210μs | 33.2048μs | 30.1162 KOps/s | 30.7125 KOps/s | |
test_values_nested_leaf | 62.6310μs | 33.6128μs | 29.7506 KOps/s | 29.9822 KOps/s | |
test_values_stack_nested | 72.7710μs | 31.5240μs | 31.7219 KOps/s | 31.3094 KOps/s | |
test_values_stack_nested_leaf | 60.2410μs | 33.9152μs | 29.4853 KOps/s | 29.1574 KOps/s | |
test_values_stack_nested_locked | 64.9410μs | 33.0922μs | 30.2186 KOps/s | 29.9292 KOps/s | |
test_membership | 1.9350μs | 0.5257μs | 1.9023 MOps/s | 1.8586 MOps/s | |
test_membership_nested | 19.5150μs | 2.0086μs | 497.8522 KOps/s | 493.8639 KOps/s | |
test_membership_nested_leaf | 15.8955μs | 2.0152μs | 496.2238 KOps/s | 492.7361 KOps/s | |
test_membership_stacked_nested | 29.6500μs | 2.1069μs | 474.6322 KOps/s | 480.3819 KOps/s | |
test_membership_stacked_nested_leaf | 24.7910μs | 2.1431μs | 466.6126 KOps/s | 482.9710 KOps/s | |
test_membership_nested_last | 26.6410μs | 2.9834μs | 335.1864 KOps/s | 334.4776 KOps/s | |
test_membership_nested_leaf_last | 25.7000μs | 2.9732μs | 336.3377 KOps/s | 337.0392 KOps/s | |
test_membership_stacked_nested_last | 36.0110μs | 3.0141μs | 331.7715 KOps/s | 295.5516 KOps/s | |
test_membership_stacked_nested_leaf_last | 41.6310μs | 2.9638μs | 337.4099 KOps/s | 297.0103 KOps/s | |
test_nested_getleaf | 30.5110μs | 6.1781μs | 161.8613 KOps/s | 162.6317 KOps/s | |
test_nested_get | 48.9010μs | 5.8865μs | 169.8792 KOps/s | 169.3999 KOps/s | |
test_stacked_getleaf | 41.0110μs | 6.1916μs | 161.5084 KOps/s | 162.5174 KOps/s | |
test_stacked_get | 32.1300μs | 5.8915μs | 169.7359 KOps/s | 171.7732 KOps/s | |
test_nested_getitemleaf | 41.0110μs | 6.2612μs | 159.7140 KOps/s | 160.7271 KOps/s | |
test_nested_getitem | 32.7410μs | 5.9501μs | 168.0655 KOps/s | 168.3493 KOps/s | |
test_stacked_getitemleaf | 43.6710μs | 6.2497μs | 160.0090 KOps/s | 160.7908 KOps/s | |
test_stacked_getitem | 27.0510μs | 5.9391μs | 168.3752 KOps/s | 169.6052 KOps/s | |
test_lock_nested | 1.0829ms | 0.3666ms | 2.7274 KOps/s | 2.6574 KOps/s | |
test_lock_stack_nested | 0.3829ms | 0.3356ms | 2.9794 KOps/s | 2.9374 KOps/s | |
test_unlock_nested | 0.6567ms | 0.3037ms | 3.2923 KOps/s | 3.2342 KOps/s | |
test_unlock_stack_nested | 0.3777ms | 0.2728ms | 3.6651 KOps/s | 3.6093 KOps/s | |
test_flatten_speed | 0.1266ms | 74.2452μs | 13.4689 KOps/s | 13.3398 KOps/s | |
test_unflatten_speed | 0.3545ms | 0.3059ms | 3.2690 KOps/s | 3.2359 KOps/s | |
test_common_ops | 1.5069ms | 0.5900ms | 1.6949 KOps/s | 1.8026 KOps/s | |
test_creation | 0.1915ms | 1.4520μs | 688.7262 KOps/s | 672.4305 KOps/s | |
test_creation_empty | 31.7600μs | 7.8731μs | 127.0142 KOps/s | 164.9215 KOps/s | |
test_creation_nested_1 | 1.6214ms | 9.3727μs | 106.6923 KOps/s | 131.5830 KOps/s | |
test_creation_nested_2 | 61.8210μs | 11.9320μs | 83.8085 KOps/s | 98.7094 KOps/s | |
test_clone | 0.1421ms | 9.9534μs | 100.4685 KOps/s | 96.5006 KOps/s | |
test_getitem[int] | 1.1859ms | 10.6200μs | 94.1621 KOps/s | 91.3714 KOps/s | |
test_getitem[slice_int] | 0.1327ms | 20.4255μs | 48.9584 KOps/s | 46.9091 KOps/s | |
test_getitem[range] | 0.1821ms | 39.0707μs | 25.5946 KOps/s | 26.9495 KOps/s | |
test_getitem[tuple] | 0.1057ms | 18.0626μs | 55.3630 KOps/s | 53.9004 KOps/s | |
test_getitem[list] | 0.1327ms | 32.0609μs | 31.1906 KOps/s | 28.8805 KOps/s | |
test_setitem_dim[int] | 44.9900μs | 18.7600μs | 53.3048 KOps/s | 55.0459 KOps/s | |
test_setitem_dim[slice_int] | 59.6710μs | 38.0448μs | 26.2848 KOps/s | 26.8973 KOps/s | |
test_setitem_dim[range] | 78.1810μs | 54.8812μs | 18.2212 KOps/s | 19.4689 KOps/s | |
test_setitem_dim[tuple] | 54.2010μs | 32.5604μs | 30.7121 KOps/s | 32.3023 KOps/s | |
test_setitem | 0.1536ms | 15.8016μs | 63.2847 KOps/s | 74.5713 KOps/s | |
test_set | 50.0710μs | 14.3334μs | 69.7669 KOps/s | 75.5425 KOps/s | |
test_set_shared | 1.6386ms | 0.1441ms | 6.9388 KOps/s | 6.8314 KOps/s | |
test_update | 0.5672ms | 16.6994μs | 59.8823 KOps/s | 66.6476 KOps/s | |
test_update_nested | 1.1611ms | 24.2835μs | 41.1802 KOps/s | 49.5103 KOps/s | |
test_update__nested | 67.8310μs | 24.5384μs | 40.7524 KOps/s | 42.1476 KOps/s | |
test_set_nested | 0.1329ms | 14.9513μs | 66.8837 KOps/s | 70.1718 KOps/s | |
test_set_nested_new | 0.1330ms | 17.8878μs | 55.9040 KOps/s | 60.5532 KOps/s | |
test_select | 0.1436ms | 30.5214μs | 32.7639 KOps/s | 35.4977 KOps/s | |
test_select_nested | 69.6610μs | 42.1584μs | 23.7201 KOps/s | 23.6565 KOps/s | |
test_exclude_nested | 0.1012ms | 61.7277μs | 16.2002 KOps/s | 16.2623 KOps/s | |
test_empty[True] | 0.3756ms | 0.2759ms | 3.6240 KOps/s | 3.5650 KOps/s | |
test_empty[False] | 3.3421μs | 0.7787μs | 1.2842 MOps/s | 1.3031 MOps/s | |
test_to | 87.0120μs | 54.6667μs | 18.2927 KOps/s | 17.2563 KOps/s | |
test_to_nonblocking | 0.1960ms | 45.4017μs | 22.0256 KOps/s | 21.6784 KOps/s | |
test_unbind_speed | 1.6694ms | 0.2339ms | 4.2748 KOps/s | 4.3597 KOps/s | |
test_unbind_speed_stack0 | 0.2971ms | 0.2284ms | 4.3785 KOps/s | 4.2961 KOps/s | |
test_unbind_speed_stack1 | 93.1758ms | 0.6464ms | 1.5471 KOps/s | 1.6735 KOps/s | |
test_split | 95.5329ms | 1.7094ms | 585.0142 Ops/s | 619.2744 Ops/s | |
test_chunk | 97.3832ms | 1.5828ms | 631.7880 Ops/s | 616.9278 Ops/s | |
test_consolidate[False-None] | 3.4352ms | 2.6134ms | 382.6479 Ops/s | 344.0850 Ops/s | |
test_consolidate[default-None] | 1.7529ms | 1.6779ms | 595.9765 Ops/s | 583.1654 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8372ms | 1.7073ms | 585.7086 Ops/s | 569.1823 Ops/s | |
test_consolidate_njt[False-None] | 6.5931ms | 6.3181ms | 158.2744 Ops/s | 158.6550 Ops/s | |
test_to[False-False-None] | 1.7598ms | 1.6320ms | 612.7491 Ops/s | 609.2914 Ops/s | |
test_to[True-False-None] | 1.4933ms | 1.2700ms | 787.3749 Ops/s | 774.2872 Ops/s | |
test_to[within-False-None] | 4.1573ms | 3.9130ms | 255.5610 Ops/s | 182.4377 Ops/s | |
test_to[True-default-None] | 5.2707ms | 5.0480ms | 198.0974 Ops/s | 191.3883 Ops/s | |
test_to_njt[False-False-None] | 6.9883ms | 6.8237ms | 146.5479 Ops/s | 140.9777 Ops/s | |
test_to_njt[True-False-None] | 5.5974ms | 5.3468ms | 187.0291 Ops/s | 179.0251 Ops/s | |
test_to_njt[within-False-None] | 11.9290ms | 11.8002ms | 84.7447 Ops/s | 80.5381 Ops/s | |
test_creation[device0] | 0.4669ms | 77.9331μs | 12.8315 KOps/s | 12.7164 KOps/s | |
test_creation_from_tensor | 0.5989ms | 81.8433μs | 12.2185 KOps/s | 12.3143 KOps/s | |
test_add_one[memmap_tensor0] | 0.2761ms | 5.9978μs | 166.7276 KOps/s | 151.3426 KOps/s | |
test_contiguous[memmap_tensor0] | 1.7561μs | 0.4065μs | 2.4601 MOps/s | 2.4123 MOps/s | |
test_stack[memmap_tensor0] | 29.3200μs | 4.4055μs | 226.9868 KOps/s | 215.9160 KOps/s | |
test_memmaptd_index | 1.6368ms | 0.2414ms | 4.1433 KOps/s | 3.9541 KOps/s | |
test_memmaptd_index_astensor | 0.5689ms | 0.3010ms | 3.3228 KOps/s | 3.2359 KOps/s | |
test_memmaptd_index_op | 0.9586ms | 0.5438ms | 1.8389 KOps/s | 1.8273 KOps/s | |
test_serialize_model | 0.1317s | 0.1307s | 7.6494 Ops/s | 7.6800 Ops/s | |
test_serialize_model_pickle | 1.3524s | 1.2120s | 0.8251 Ops/s | 0.8213 Ops/s | |
test_serialize_weights | 0.1314s | 0.1301s | 7.6886 Ops/s | 7.7170 Ops/s | |
test_serialize_weights_returnearly | 0.4289s | 65.3719ms | 15.2971 Ops/s | 11.9756 Ops/s | |
test_serialize_weights_pickle | 1.3520s | 1.2151s | 0.8230 Ops/s | 0.8097 Ops/s | |
test_reshape_pytree | 62.4310μs | 22.2704μs | 44.9026 KOps/s | 43.6986 KOps/s | |
test_reshape_td | 0.1307ms | 26.9961μs | 37.0424 KOps/s | 36.6236 KOps/s | |
test_view_pytree | 45.1100μs | 22.1741μs | 45.0977 KOps/s | 44.9749 KOps/s | |
test_view_td | 55.3410μs | 29.9817μs | 33.3536 KOps/s | 31.6246 KOps/s | |
test_unbind_pytree | 63.3410μs | 27.7601μs | 36.0229 KOps/s | 35.4055 KOps/s | |
test_unbind_td | 0.7539ms | 35.2381μs | 28.3784 KOps/s | 27.9572 KOps/s | |
test_split_pytree | 0.2229ms | 29.6768μs | 33.6964 KOps/s | 32.9420 KOps/s | |
test_split_td | 0.9425ms | 38.4849μs | 25.9842 KOps/s | 24.7440 KOps/s | |
test_add_pytree | 81.1810μs | 33.3267μs | 30.0060 KOps/s | 29.1207 KOps/s | |
test_add_td | 0.1010ms | 45.2660μs | 22.0917 KOps/s | 23.5531 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1879ms | 0.1208ms | 8.2770 KOps/s | 8.0177 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2665ms | 0.1242ms | 8.0514 KOps/s | 7.8525 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2441ms | 96.5336μs | 10.3591 KOps/s | 10.0702 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3108ms | 0.1448ms | 6.9038 KOps/s | 6.6651 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1527ms | 23.3177μs | 42.8859 KOps/s | 37.1916 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1398ms | 26.9561μs | 37.0974 KOps/s | 36.6195 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1120ms | 65.1364μs | 15.3524 KOps/s | 15.1034 KOps/s | |
test_compile_copy_nested[pytree-eager] | 87.8920μs | 49.2063μs | 20.3226 KOps/s | 19.9538 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1823ms | 0.1437ms | 6.9613 KOps/s | 7.0175 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3222ms | 0.2079ms | 4.8103 KOps/s | 4.8165 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2527ms | 0.1017ms | 9.8328 KOps/s | 10.1802 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2020ms | 52.8713μs | 18.9139 KOps/s | 19.3353 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2315ms | 0.1381ms | 7.2430 KOps/s | 7.3714 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6325ms | 0.4671ms | 2.1408 KOps/s | 2.0730 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3813ms | 0.2476ms | 4.0383 KOps/s | 4.0396 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2978ms | 0.1458ms | 6.8581 KOps/s | 6.9499 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1954ms | 61.3196μs | 16.3080 KOps/s | 16.0530 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2511ms | 0.1006ms | 9.9442 KOps/s | 10.0914 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4337ms | 0.3980ms | 2.5124 KOps/s | 2.4712 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1774ms | 0.1381ms | 7.2405 KOps/s | 7.4440 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 55.6210μs | 19.5877μs | 51.0524 KOps/s | 44.9562 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1036ms | 27.0948μs | 36.9074 KOps/s | 37.2064 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1203ms | 69.9991μs | 14.2859 KOps/s | 14.2657 KOps/s | |
test_compile_copy_flat[pytree-eager] | 83.6610μs | 51.3586μs | 19.4709 KOps/s | 19.3201 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6708ms | 0.4047ms | 2.4709 KOps/s | 2.1537 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7174ms | 2.5562ms | 391.2004 Ops/s | 392.8270 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6220ms | 0.4382ms | 2.2822 KOps/s | 2.2651 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7814ms | 2.5880ms | 386.4003 Ops/s | 385.6575 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2577ms | 0.1184ms | 8.4447 KOps/s | 8.8490 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5532ms | 81.6189μs | 12.2521 KOps/s | 12.7811 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2646ms | 0.1128ms | 8.8616 KOps/s | 9.5231 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2187ms | 69.8419μs | 14.3181 KOps/s | 15.0157 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2616ms | 0.1125ms | 8.8889 KOps/s | 9.4446 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2209ms | 70.7866μs | 14.1270 KOps/s | 14.9580 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2456ms | 0.1062ms | 9.4120 KOps/s | 9.6838 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1529ms | 17.2353μs | 58.0205 KOps/s | 56.3916 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1433ms | 98.7622μs | 10.1253 KOps/s | 10.2780 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 56.3810μs | 15.8869μs | 62.9449 KOps/s | 62.7754 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1526ms | 0.1019ms | 9.8160 KOps/s | 10.2709 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1539ms | 15.8090μs | 63.2551 KOps/s | 60.3242 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2524ms | 0.1055ms | 9.4764 KOps/s | 9.8314 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5858ms | 17.0207μs | 58.7520 KOps/s | 57.7100 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.4868ms | 0.1015ms | 9.8570 KOps/s | 9.8995 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 43.7310μs | 15.8541μs | 63.0751 KOps/s | 62.7043 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.4928ms | 98.7879μs | 10.1227 KOps/s | 10.2815 KOps/s | |
test_compile_indexing[int-pytree-eager] | 45.8010μs | 16.0993μs | 62.1145 KOps/s | 62.8196 KOps/s | |
test_mod_add[eager] | 80.7910μs | 37.5291μs | 26.6460 KOps/s | 27.6718 KOps/s | |
test_mod_add[compile] | 0.2005ms | 81.7292μs | 12.2355 KOps/s | 12.2280 KOps/s | |
test_mod_add[compile-overhead] | 0.3241ms | 0.1669ms | 5.9902 KOps/s | 5.6952 KOps/s | |
test_mod_wrap[eager] | 0.3493ms | 0.2604ms | 3.8401 KOps/s | 4.0453 KOps/s | |
test_mod_wrap[compile] | 0.4307ms | 0.2863ms | 3.4927 KOps/s | 3.4573 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1899ms | 3.7873ms | 264.0407 Ops/s | 265.1275 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4700ms | 1.3254ms | 754.4637 Ops/s | 699.8996 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3928ms | 1.2535ms | 797.7621 Ops/s | 725.2439 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3757ms | 0.9362ms | 1.0681 KOps/s | 964.2428 Ops/s | |
test_seq_add[eager] | 0.1596ms | 0.1122ms | 8.9117 KOps/s | 8.8924 KOps/s | |
test_seq_add[compile] | 0.2250ms | 87.1435μs | 11.4753 KOps/s | 11.1027 KOps/s | |
test_seq_add[compile-overhead] | 0.3130ms | 0.1290ms | 7.7496 KOps/s | 7.6057 KOps/s | |
test_seq_wrap[eager] | 0.6097ms | 0.4070ms | 2.4573 KOps/s | 2.4219 KOps/s | |
test_seq_wrap[compile] | 0.4451ms | 0.2962ms | 3.3767 KOps/s | 3.2528 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3601ms | 0.2229ms | 4.4855 KOps/s | 4.3995 KOps/s | |
test_func_call_runtime[False-eager] | 0.8705ms | 0.7064ms | 1.4157 KOps/s | 1.3721 KOps/s | |
test_func_call_runtime[False-compile] | 0.8989ms | 0.7378ms | 1.3554 KOps/s | 1.3211 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5147ms | 0.3625ms | 2.7587 KOps/s | 2.7394 KOps/s | |
test_func_call_runtime[True-eager] | 1.0039ms | 0.8649ms | 1.1562 KOps/s | 1.1111 KOps/s | |
test_func_call_runtime[True-compile] | 0.8714ms | 0.7578ms | 1.3195 KOps/s | 1.2870 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5176ms | 0.3846ms | 2.6003 KOps/s | 2.5922 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8473ms | 0.7046ms | 1.4193 KOps/s | 1.3703 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8885ms | 0.7388ms | 1.3536 KOps/s | 1.3214 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5047ms | 0.3650ms | 2.7399 KOps/s | 2.7348 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1246ms | 0.9670ms | 1.0341 KOps/s | 998.2042 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9215ms | 0.7848ms | 1.2743 KOps/s | 1.2429 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5331ms | 0.4170ms | 2.3981 KOps/s | 2.4214 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4628ms | 2.0161ms | 496.0027 Ops/s | 492.2531 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9433ms | 0.8147ms | 1.2275 KOps/s | 1.2104 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4668ms | 0.4179ms | 2.3927 KOps/s | 2.3482 KOps/s | |
test_distributed | 2.7818ms | 0.2000ms | 5.0006 KOps/s | 8.6894 KOps/s | |
test_tdmodule | 40.7410μs | 19.2123μs | 52.0501 KOps/s | 56.7661 KOps/s | |
test_tdmodule_dispatch | 0.2544ms | 34.6188μs | 28.8861 KOps/s | 31.5588 KOps/s | |
test_tdseq | 30.9210μs | 19.0560μs | 52.4769 KOps/s | 57.1716 KOps/s | |
test_tdseq_dispatch | 76.8810μs | 36.6970μs | 27.2502 KOps/s | 29.6596 KOps/s | |
test_instantiation_functorch | 1.6753ms | 1.5453ms | 647.1383 Ops/s | 651.8740 Ops/s | |
test_exec_functorch | 0.5304ms | 0.1405ms | 7.1193 KOps/s | 7.1539 KOps/s | |
test_exec_functional_call | 0.5196ms | 0.1297ms | 7.7126 KOps/s | 7.6200 KOps/s | |
test_exec_td_decorator | 0.3703ms | 0.1743ms | 5.7357 KOps/s | 5.6296 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0618ms | 0.6607ms | 1.5135 KOps/s | 1.4917 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0517ms | 0.6654ms | 1.5029 KOps/s | 1.4842 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9792ms | 0.5707ms | 1.7522 KOps/s | 1.6705 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9550ms | 0.5722ms | 1.7477 KOps/s | 1.7329 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.9059ms | 18.5055ms | 54.0380 Ops/s | 53.7454 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.4022ms | 18.6435ms | 53.6380 Ops/s | 53.6639 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.6330ms | 18.5704ms | 53.8492 Ops/s | 54.2341 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.6033ms | 18.5605ms | 53.8779 Ops/s | 54.1558 Ops/s | |
test_to_module_speed[True] | 1.2586ms | 0.9222ms | 1.0844 KOps/s | 1.0731 KOps/s | |
test_to_module_speed[False] | 1.5378ms | 0.9077ms | 1.1017 KOps/s | 1.0947 KOps/s | |
test_tc_init | 68.1410μs | 32.8584μs | 30.4336 KOps/s | 33.0453 KOps/s | |
test_tc_init_nested | 0.1279ms | 66.9432μs | 14.9380 KOps/s | 15.7353 KOps/s | |
test_tc_first_layer_tensor | 14.0187μs | 0.6995μs | 1.4296 MOps/s | 1.4197 MOps/s | |
test_tc_first_layer_nontensor | 89.9510μs | 2.3059μs | 433.6643 KOps/s | 429.2672 KOps/s | |
test_tc_second_layer_tensor | 36.4605μs | 1.4325μs | 698.0764 KOps/s | 694.6352 KOps/s | |
test_tc_second_layer_nontensor | 38.3910μs | 3.0855μs | 324.0953 KOps/s | 327.9530 KOps/s | |
test_unbind | 0.2390s | 9.9683ms | 100.3182 Ops/s | 149.4197 Ops/s | |
test_full_like | 9.9667ms | 9.5087ms | 105.1664 Ops/s | 101.8522 Ops/s | |
test_zeros_like | 9.3940ms | 7.3176ms | 136.6572 Ops/s | 233.6041 Ops/s | |
test_ones_like | 4.9454ms | 4.3799ms | 228.3173 Ops/s | 232.7041 Ops/s | |
test_clone | 7.1517ms | 6.6907ms | 149.4621 Ops/s | 105.6851 Ops/s | |
test_squeeze | 59.8410μs | 9.1742μs | 109.0008 KOps/s | 109.7053 KOps/s | |
test_unsqueeze | 0.1210ms | 69.0752μs | 14.4770 KOps/s | 14.8200 KOps/s | |
test_split | 0.3765ms | 0.1546ms | 6.4680 KOps/s | 6.3746 KOps/s | |
test_permute | 0.2464ms | 0.1770ms | 5.6489 KOps/s | 5.6061 KOps/s | |
test_stack | 52.2387ms | 51.7392ms | 19.3277 Ops/s | 19.4173 Ops/s | |
test_cat | 52.5280ms | 51.6492ms | 19.3614 Ops/s | 19.4230 Ops/s |
This was referenced Dec 9, 2024
vmoens added a commit that referenced this pull request Dec 9, 2024
ghstack-source-id: f773ed94d4b7ca13c603f32251f0735c751ebf94 Pull Request resolved: #1134
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Stack from ghstack (oldest at bottom):