[FBCode] Deactivate vmap monkey-patching in FBCode #1135
Merged
vmoens added a commit that referenced this pull request Dec 9, 2024
ghstack-source-id: aa1e4393b693705dbb7caecac95b28f85828998b
Pull Request resolved: #1135
facebook-github-bot added the CLA Signed label Dec 9, 2024 (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request Dec 9, 2024
ghstack-source-id: 4f1ebfd0fd4ff5b7378c8692a064406e72fc68c0
Pull Request resolved: #1135
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 40.0950μs | 18.1561μs | 55.0778 KOps/s | 60.5053 KOps/s | |
test_plain_set_stack_nested | 47.9300μs | 18.2970μs | 54.6538 KOps/s | 58.4908 KOps/s | |
test_plain_set_nested_inplace | 62.0450μs | 19.8470μs | 50.3854 KOps/s | 54.3557 KOps/s | |
test_plain_set_stack_nested_inplace | 58.6790μs | 19.7280μs | 50.6894 KOps/s | 54.4226 KOps/s | |
test_items | 19.0350μs | 4.2066μs | 237.7213 KOps/s | 237.9675 KOps/s | |
test_items_nested | 0.4490ms | 0.4000ms | 2.5001 KOps/s | 2.4677 KOps/s | |
test_items_nested_locked | 0.6933ms | 0.3971ms | 2.5180 KOps/s | 2.4546 KOps/s | |
test_items_nested_leaf | 0.1274ms | 71.5707μs | 13.9722 KOps/s | 14.0344 KOps/s | |
test_items_stack_nested | 0.7934ms | 0.4050ms | 2.4690 KOps/s | 2.4390 KOps/s | |
test_items_stack_nested_leaf | 0.1423ms | 74.0596μs | 13.5026 KOps/s | 13.5693 KOps/s | |
test_items_stack_nested_locked | 0.7594ms | 0.4040ms | 2.4752 KOps/s | 2.4412 KOps/s | |
test_keys | 21.3100μs | 3.5839μs | 279.0241 KOps/s | 286.4257 KOps/s | |
test_keys_nested | 0.1849ms | 0.1367ms | 7.3131 KOps/s | 7.4270 KOps/s | |
test_keys_nested_locked | 1.9252ms | 0.1427ms | 7.0093 KOps/s | 7.0769 KOps/s | |
test_keys_nested_leaf | 0.1837ms | 0.1173ms | 8.5238 KOps/s | 8.6237 KOps/s | |
test_keys_stack_nested | 0.2419ms | 0.1348ms | 7.4171 KOps/s | 7.3975 KOps/s | |
test_keys_stack_nested_leaf | 0.1887ms | 0.1162ms | 8.6042 KOps/s | 8.6696 KOps/s | |
test_keys_stack_nested_locked | 0.2719ms | 0.1417ms | 7.0579 KOps/s | 7.1251 KOps/s | |
test_values | 6.6464μs | 1.0434μs | 958.3796 KOps/s | 960.4027 KOps/s | |
test_values_nested | 0.1308ms | 56.1705μs | 17.8029 KOps/s | 17.5560 KOps/s | |
test_values_nested_locked | 0.1112ms | 55.4773μs | 18.0254 KOps/s | 18.3825 KOps/s | |
test_values_nested_leaf | 0.1140ms | 60.1070μs | 16.6370 KOps/s | 17.0420 KOps/s | |
test_values_stack_nested | 0.1241ms | 56.1365μs | 17.8137 KOps/s | 17.9689 KOps/s | |
test_values_stack_nested_leaf | 0.1166ms | 59.5044μs | 16.8055 KOps/s | 16.7314 KOps/s | |
test_values_stack_nested_locked | 0.1160ms | 56.1697μs | 17.8032 KOps/s | 17.7655 KOps/s | |
test_membership | 5.1287μs | 0.7491μs | 1.3349 MOps/s | 1.3961 MOps/s | |
test_membership_nested | 25.2070μs | 2.9409μs | 340.0311 KOps/s | 343.7777 KOps/s | |
test_membership_nested_leaf | 23.1130μs | 2.9971μs | 333.6506 KOps/s | 341.9048 KOps/s | |
test_membership_stacked_nested | 28.3430μs | 2.9310μs | 341.1768 KOps/s | 344.8048 KOps/s | |
test_membership_stacked_nested_leaf | 24.8970μs | 2.9698μs | 336.7282 KOps/s | 346.6255 KOps/s | |
test_membership_nested_last | 45.8850μs | 4.2793μs | 233.6810 KOps/s | 237.3931 KOps/s | |
test_membership_nested_leaf_last | 23.6140μs | 4.2601μs | 234.7341 KOps/s | 236.0288 KOps/s | |
test_membership_stacked_nested_last | 43.3110μs | 5.3710μs | 186.1838 KOps/s | 148.2057 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.1920μs | 5.4790μs | 182.5150 KOps/s | 147.7408 KOps/s | |
test_nested_getleaf | 50.2330μs | 10.6923μs | 93.5250 KOps/s | 94.9583 KOps/s | |
test_nested_get | 46.0460μs | 10.4215μs | 95.9551 KOps/s | 97.1296 KOps/s | |
test_stacked_getleaf | 37.2800μs | 10.8344μs | 92.2989 KOps/s | 94.1207 KOps/s | |
test_stacked_get | 49.0020μs | 10.1230μs | 98.7848 KOps/s | 96.4116 KOps/s | |
test_nested_getitemleaf | 39.1400μs | 11.0959μs | 90.1233 KOps/s | 90.0193 KOps/s | |
test_nested_getitem | 34.4850μs | 10.4642μs | 95.5636 KOps/s | 94.9361 KOps/s | |
test_stacked_getitemleaf | 48.1500μs | 11.1649μs | 89.5661 KOps/s | 91.1666 KOps/s | |
test_stacked_getitem | 29.7050μs | 10.4149μs | 96.0166 KOps/s | 96.0517 KOps/s | |
test_lock_nested | 4.3484ms | 0.4401ms | 2.2723 KOps/s | 2.2473 KOps/s | |
test_lock_stack_nested | 0.6364ms | 0.4058ms | 2.4645 KOps/s | 2.4489 KOps/s | |
test_unlock_nested | 0.6596ms | 0.3559ms | 2.8099 KOps/s | 2.7510 KOps/s | |
test_unlock_stack_nested | 0.4037ms | 0.3238ms | 3.0884 KOps/s | 3.0878 KOps/s | |
test_flatten_speed | 0.1490ms | 93.5367μs | 10.6910 KOps/s | 10.6642 KOps/s | |
test_unflatten_speed | 0.8867ms | 0.4965ms | 2.0142 KOps/s | 2.0465 KOps/s | |
test_common_ops | 4.4028ms | 0.8046ms | 1.2429 KOps/s | 1.3542 KOps/s | |
test_creation | 12.3130μs | 2.1168μs | 472.4046 KOps/s | 477.9785 KOps/s | |
test_creation_empty | 49.5820μs | 12.4128μs | 80.5619 KOps/s | 114.5590 KOps/s | |
test_creation_nested_1 | 40.1850μs | 15.1161μs | 66.1546 KOps/s | 89.5032 KOps/s | |
test_creation_nested_2 | 59.7820μs | 19.3186μs | 51.7635 KOps/s | 63.7593 KOps/s | |
test_clone | 70.9920μs | 12.9778μs | 77.0546 KOps/s | 75.7311 KOps/s | |
test_getitem[int] | 1.2227ms | 12.9602μs | 77.1592 KOps/s | 80.3344 KOps/s | |
test_getitem[slice_int] | 0.1440ms | 25.9150μs | 38.5877 KOps/s | 39.2007 KOps/s | |
test_getitem[range] | 0.1703ms | 49.9480μs | 20.0208 KOps/s | 20.5343 KOps/s | |
test_getitem[tuple] | 0.1380ms | 20.7580μs | 48.1742 KOps/s | 48.5521 KOps/s | |
test_getitem[list] | 0.1686ms | 45.9274μs | 21.7735 KOps/s | 22.7433 KOps/s | |
test_setitem_dim[int] | 49.0710μs | 25.3906μs | 39.3846 KOps/s | 38.6961 KOps/s | |
test_setitem_dim[slice_int] | 83.9070μs | 52.6949μs | 18.9772 KOps/s | 18.3351 KOps/s | |
test_setitem_dim[range] | 0.1236ms | 73.8429μs | 13.5423 KOps/s | 13.3669 KOps/s | |
test_setitem_dim[tuple] | 85.2990μs | 41.7861μs | 23.9314 KOps/s | 23.8106 KOps/s | |
test_setitem | 64.1500μs | 21.5100μs | 46.4901 KOps/s | 51.6479 KOps/s | |
test_set | 68.8780μs | 20.9385μs | 47.7589 KOps/s | 54.4717 KOps/s | |
test_set_shared | 3.4300ms | 0.1660ms | 6.0252 KOps/s | 5.9122 KOps/s | |
test_update | 0.1200ms | 24.6365μs | 40.5901 KOps/s | 50.2741 KOps/s | |
test_update_nested | 0.1016ms | 35.0769μs | 28.5088 KOps/s | 32.5430 KOps/s | |
test_update__nested | 0.2581ms | 31.8561μs | 31.3912 KOps/s | 29.6997 KOps/s | |
test_set_nested | 67.6360μs | 22.6394μs | 44.1707 KOps/s | 48.1480 KOps/s | |
test_set_nested_new | 77.7150μs | 27.2909μs | 36.6423 KOps/s | 39.4211 KOps/s | |
test_select | 0.3940ms | 43.4190μs | 23.0314 KOps/s | 23.8850 KOps/s | |
test_select_nested | 0.1260ms | 59.4783μs | 16.8129 KOps/s | 16.8978 KOps/s | |
test_exclude_nested | 0.1534ms | 78.5252μs | 12.7348 KOps/s | 12.7379 KOps/s | |
test_empty[True] | 0.5302ms | 0.3783ms | 2.6433 KOps/s | 2.6339 KOps/s | |
test_empty[False] | 8.0710μs | 1.1866μs | 842.7309 KOps/s | 821.9498 KOps/s | |
test_unbind_speed | 0.3830ms | 0.2587ms | 3.8653 KOps/s | 3.7655 KOps/s | |
test_unbind_speed_stack0 | 0.4720ms | 0.2526ms | 3.9585 KOps/s | 3.9315 KOps/s | |
test_unbind_speed_stack1 | 96.1509ms | 0.7287ms | 1.3723 KOps/s | 1.6126 KOps/s | |
test_split | 1.8201ms | 1.5897ms | 629.0432 Ops/s | 575.9940 Ops/s | |
test_chunk | 88.1541ms | 1.8664ms | 535.7890 Ops/s | 584.0087 Ops/s | |
test_consolidate_njt[False-None] | 8.4849ms | 8.0001ms | 124.9979 Ops/s | 115.4930 Ops/s | |
test_creation[device0] | 0.2157ms | 90.2719μs | 11.0776 KOps/s | 10.6633 KOps/s | |
test_creation_from_tensor | 3.0037ms | 93.5804μs | 10.6860 KOps/s | 10.3297 KOps/s | |
test_add_one[memmap_tensor0] | 0.1235ms | 5.0583μs | 197.6937 KOps/s | 203.5954 KOps/s | |
test_contiguous[memmap_tensor0] | 10.1290μs | 0.5188μs | 1.9276 MOps/s | 1.9652 MOps/s | |
test_stack[memmap_tensor0] | 41.6880μs | 3.5324μs | 283.0915 KOps/s | 297.0641 KOps/s | |
test_memmaptd_index | 1.1196ms | 0.2350ms | 4.2557 KOps/s | 4.2078 KOps/s | |
test_memmaptd_index_astensor | 0.5882ms | 0.3137ms | 3.1880 KOps/s | 3.1437 KOps/s | |
test_memmaptd_index_op | 0.9360ms | 0.6008ms | 1.6644 KOps/s | 1.8008 KOps/s | |
test_serialize_model | 0.1173s | 0.1107s | 9.0310 Ops/s | 7.9109 Ops/s | |
test_serialize_model_pickle | 0.4526s | 0.3865s | 2.5872 Ops/s | 2.5805 Ops/s | |
test_serialize_weights | 0.1950s | 0.1245s | 8.0310 Ops/s | 9.0252 Ops/s | |
test_serialize_weights_returnearly | 0.1641s | 0.1529s | 6.5413 Ops/s | 6.4453 Ops/s | |
test_serialize_weights_pickle | 1.1521s | 0.7017s | 1.4250 Ops/s | 2.4096 Ops/s | |
test_serialize_weights_filesystem | 0.1466s | 0.1392s | 7.1834 Ops/s | 6.4309 Ops/s | |
test_serialize_model_filesystem | 0.2345s | 0.1561s | 6.4048 Ops/s | 6.8329 Ops/s | |
test_reshape_pytree | 86.9130μs | 27.3943μs | 36.5040 KOps/s | 36.5000 KOps/s | |
test_reshape_td | 65.7330μs | 33.0697μs | 30.2392 KOps/s | 30.4772 KOps/s | |
test_view_pytree | 78.4570μs | 26.9809μs | 37.0633 KOps/s | 36.4234 KOps/s | |
test_view_td | 82.9250μs | 38.9003μs | 25.7068 KOps/s | 24.7085 KOps/s | |
test_unbind_pytree | 66.3440μs | 30.1361μs | 33.1828 KOps/s | 32.8789 KOps/s | |
test_unbind_td | 0.3565ms | 38.4934μs | 25.9785 KOps/s | 25.7618 KOps/s | |
test_split_pytree | 63.2380μs | 30.3794μs | 32.9170 KOps/s | 32.8468 KOps/s | |
test_split_td | 0.5635ms | 45.8358μs | 21.8170 KOps/s | 22.3814 KOps/s | |
test_add_pytree | 77.9560μs | 36.4891μs | 27.4054 KOps/s | 27.4418 KOps/s | |
test_add_td | 0.1176ms | 56.5415μs | 17.6861 KOps/s | 19.3206 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1694ms | 62.2914μs | 16.0536 KOps/s | 15.9433 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 1.2493ms | 0.1609ms | 6.2154 KOps/s | 5.2417 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1076ms | 46.0199μs | 21.7297 KOps/s | 21.4997 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2173ms | 0.1205ms | 8.3012 KOps/s | 8.2825 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 59.2910μs | 26.1546μs | 38.2341 KOps/s | 39.3765 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1162ms | 53.5535μs | 18.6729 KOps/s | 18.7807 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2082ms | 78.1066μs | 12.8030 KOps/s | 12.7728 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1439ms | 67.8634μs | 14.7355 KOps/s | 14.7885 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2242ms | 0.1038ms | 9.6302 KOps/s | 9.5214 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2859ms | 0.2007ms | 4.9836 KOps/s | 4.9768 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1080ms | 45.5119μs | 21.9723 KOps/s | 22.6872 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.3741ms | 61.2658μs | 16.3223 KOps/s | 16.0289 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2237ms | 0.1040ms | 9.6143 KOps/s | 9.8193 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4301ms | 0.2023ms | 4.9424 KOps/s | 4.9483 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3128ms | 0.2128ms | 4.6986 KOps/s | 4.6865 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2274ms | 0.1078ms | 9.2792 KOps/s | 9.3262 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1201ms | 55.9357μs | 17.8777 KOps/s | 18.0415 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1315ms | 47.3815μs | 21.1053 KOps/s | 21.8990 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5444ms | 0.1594ms | 6.2723 KOps/s | 6.2276 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1919ms | 0.1039ms | 9.6287 KOps/s | 9.7219 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 60.6630μs | 21.4407μs | 46.6402 KOps/s | 48.6653 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1253ms | 58.8319μs | 16.9976 KOps/s | 16.9973 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2241ms | 82.0959μs | 12.1809 KOps/s | 12.2964 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1548ms | 69.9135μs | 14.3034 KOps/s | 14.4762 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4673ms | 0.2103ms | 4.7553 KOps/s | 4.7778 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.4090ms | 1.3127ms | 761.7756 Ops/s | 779.5152 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4101ms | 0.2059ms | 4.8572 KOps/s | 4.8331 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.6400ms | 0.7816ms | 1.2795 KOps/s | 1.2991 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5629ms | 0.4565ms | 2.1907 KOps/s | 2.2013 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.1795ms | 2.7644ms | 361.7364 Ops/s | 398.4745 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1404ms | 36.5118μs | 27.3884 KOps/s | 27.0704 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2264s | 44.5252μs | 22.4592 KOps/s | 29.7378 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 71.4130μs | 30.1541μs | 33.1630 KOps/s | 34.1343 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 86.6700μs | 23.8406μs | 41.9453 KOps/s | 41.5049 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 70.7030μs | 31.2041μs | 32.0470 KOps/s | 33.4954 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 69.3890μs | 23.9175μs | 41.8104 KOps/s | 41.8937 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1199ms | 52.3702μs | 19.0948 KOps/s | 19.2551 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4193ms | 19.8990μs | 50.2537 KOps/s | 48.5492 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 98.5140μs | 44.6596μs | 22.3916 KOps/s | 22.3296 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 65.8230μs | 18.9063μs | 52.8924 KOps/s | 51.5444 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1221ms | 46.1307μs | 21.6776 KOps/s | 22.0592 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.6231ms | 19.0974μs | 52.3632 KOps/s | 50.8005 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5254ms | 53.7578μs | 18.6019 KOps/s | 18.9830 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8622ms | 20.2209μs | 49.4538 KOps/s | 49.4769 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1120ms | 45.3906μs | 22.0310 KOps/s | 22.1814 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 57.4770μs | 19.1150μs | 52.3150 KOps/s | 51.5077 KOps/s | |
test_compile_indexing[int-pytree-compile] | 95.8390μs | 44.9434μs | 22.2502 KOps/s | 22.1777 KOps/s | |
test_compile_indexing[int-pytree-eager] | 53.0290μs | 19.0425μs | 52.5142 KOps/s | 51.9731 KOps/s | |
test_mod_add[eager] | 99.9760μs | 36.5498μs | 27.3600 KOps/s | 29.8468 KOps/s | |
test_mod_add[compile] | 92.0620μs | 47.1802μs | 21.1953 KOps/s | 20.5422 KOps/s | |
test_mod_add[compile-overhead] | 0.1220ms | 46.6567μs | 21.4331 KOps/s | 20.8132 KOps/s | |
test_mod_wrap[eager] | 0.4075ms | 0.2277ms | 4.3926 KOps/s | 4.4113 KOps/s | |
test_mod_wrap[compile] | 0.4184ms | 0.2085ms | 4.7958 KOps/s | 4.7743 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3769ms | 0.2074ms | 4.8213 KOps/s | 4.7957 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.0482ms | 10.6424ms | 93.9639 Ops/s | 85.0782 Ops/s | |
test_mod_wrap_and_backward[compile] | 11.9395ms | 10.5673ms | 94.6314 Ops/s | 83.2479 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.1289ms | 10.5265ms | 94.9979 Ops/s | 80.9532 Ops/s | |
test_seq_add[eager] | 0.2496ms | 0.1188ms | 8.4163 KOps/s | 8.8422 KOps/s | |
test_seq_add[compile] | 0.1129ms | 62.5193μs | 15.9951 KOps/s | 16.2723 KOps/s | |
test_seq_add[compile-overhead] | 0.1153ms | 59.2936μs | 16.8652 KOps/s | 16.9568 KOps/s | |
test_seq_wrap[eager] | 1.0130ms | 0.4491ms | 2.2267 KOps/s | 2.2725 KOps/s | |
test_seq_wrap[compile] | 0.4160ms | 0.2256ms | 4.4332 KOps/s | 4.3582 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4441ms | 0.2279ms | 4.3879 KOps/s | 4.2612 KOps/s | |
test_func_call_runtime[False-eager] | 0.7294ms | 0.5394ms | 1.8540 KOps/s | 1.7745 KOps/s | |
test_func_call_runtime[False-compile] | 0.8026ms | 0.4277ms | 2.3381 KOps/s | 2.3553 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5171ms | 0.4249ms | 2.3534 KOps/s | 2.3343 KOps/s | |
test_func_call_runtime[True-eager] | 0.8358ms | 0.7515ms | 1.3307 KOps/s | 1.2934 KOps/s | |
test_func_call_runtime[True-compile] | 0.5449ms | 0.4611ms | 2.1688 KOps/s | 2.1564 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6180ms | 0.4642ms | 2.1541 KOps/s | 2.1417 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8673ms | 0.5378ms | 1.8594 KOps/s | 1.7995 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8027ms | 0.4261ms | 2.3467 KOps/s | 2.3564 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5178ms | 0.4258ms | 2.3487 KOps/s | 2.3461 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.5164ms | 0.8843ms | 1.1308 KOps/s | 1.1096 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6397ms | 0.4878ms | 2.0502 KOps/s | 2.0565 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6305ms | 0.4869ms | 2.0537 KOps/s | 2.0518 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7805ms | 1.9057ms | 524.7528 Ops/s | 530.8484 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8410ms | 0.5174ms | 1.9328 KOps/s | 1.9198 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0194ms | 0.5166ms | 1.9358 KOps/s | 1.9137 KOps/s | |
test_distributed | 0.2301ms | 0.1263ms | 7.9158 KOps/s | 7.7633 KOps/s | |
test_tdmodule | 55.2030μs | 26.8713μs | 37.2144 KOps/s | 38.9645 KOps/s | |
test_tdmodule_dispatch | 80.8010μs | 48.8253μs | 20.4812 KOps/s | 21.6029 KOps/s | |
test_tdseq | 54.6120μs | 27.5986μs | 36.2338 KOps/s | 39.8497 KOps/s | |
test_tdseq_dispatch | 0.1009ms | 54.0558μs | 18.4994 KOps/s | 20.6032 KOps/s | |
test_instantiation_functorch | 2.5221ms | 1.5994ms | 625.2510 Ops/s | 630.4313 Ops/s | |
test_exec_functorch | 0.2923ms | 0.1778ms | 5.6255 KOps/s | 5.4126 KOps/s | |
test_exec_functional_call | 0.2959ms | 0.1704ms | 5.8676 KOps/s | 5.5597 KOps/s | |
test_exec_td_decorator | 0.4456ms | 0.2279ms | 4.3881 KOps/s | 4.3016 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2142ms | 0.6608ms | 1.5134 KOps/s | 1.5163 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.3286ms | 0.6592ms | 1.5171 KOps/s | 1.4917 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8527ms | 0.5282ms | 1.8932 KOps/s | 1.8709 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6606ms | 0.5258ms | 1.9018 KOps/s | 1.8737 KOps/s | |
test_to_module_speed[True] | 2.1453ms | 1.2894ms | 775.5308 Ops/s | 781.4907 Ops/s | |
test_to_module_speed[False] | 1.7150ms | 1.2572ms | 795.4028 Ops/s | 797.6166 Ops/s | |
test_tc_init | 99.6870μs | 48.1987μs | 20.7475 KOps/s | 24.8047 KOps/s | |
test_tc_init_nested | 0.2155ms | 97.3088μs | 10.2766 KOps/s | 12.3640 KOps/s | |
test_tc_first_layer_tensor | 20.6580μs | 1.4945μs | 669.1077 KOps/s | 658.0302 KOps/s | |
test_tc_first_layer_nontensor | 26.9900μs | 4.6766μs | 213.8287 KOps/s | 213.2901 KOps/s | |
test_tc_second_layer_tensor | 21.7510μs | 2.7630μs | 361.9195 KOps/s | 360.3360 KOps/s | |
test_tc_second_layer_nontensor | 35.3660μs | 5.9067μs | 169.3007 KOps/s | 166.0555 KOps/s | |
test_unbind | 0.2091s | 12.5503ms | 79.6796 Ops/s | 84.2978 Ops/s | |
test_full_like | 14.3146ms | 11.7589ms | 85.0421 Ops/s | 92.3558 Ops/s | |
test_zeros_like | 12.8306ms | 8.1838ms | 122.1921 Ops/s | 140.6882 Ops/s | |
test_ones_like | 10.2930ms | 7.5260ms | 132.8726 Ops/s | 137.2385 Ops/s | |
test_clone | 12.1790ms | 8.7569ms | 114.1950 Ops/s | 112.4714 Ops/s | |
test_squeeze | 61.3150μs | 11.7270μs | 85.2734 KOps/s | 86.0497 KOps/s | |
test_unsqueeze | 0.2857ms | 88.9371μs | 11.2439 KOps/s | 10.7371 KOps/s | |
test_split | 0.3880ms | 0.1998ms | 5.0052 KOps/s | 5.1936 KOps/s | |
test_permute | 0.3892ms | 0.2039ms | 4.9035 KOps/s | 4.8592 KOps/s | |
test_stack | 29.1480ms | 24.6262ms | 40.6071 Ops/s | 40.9438 Ops/s | |
test_cat | 26.5040ms | 24.3648ms | 41.0427 Ops/s | 40.4722 Ops/s |
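
The Change column is blank in this capture, so the difference between the PR and repo HEAD has to be read off the two Ops columns by hand. A minimal sketch of that arithmetic follows. It assumes "change" means the percent difference of the PR's Ops relative to Ops on Repo HEAD, which may not be the formula the benchmark bot uses. The helper names `parse_ops` and `relative_change` are hypothetical and not part of any tool referenced in this PR.

```python
# Multipliers for the throughput units that appear in the tables.
_UNIT = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '237.7213 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT[unit]


def relative_change(row):
    """Return (test name, percent change of PR throughput vs. repo HEAD).

    Assumed convention: positive means the PR is faster than HEAD.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops_pr, ops_head = cells[0], cells[3], cells[4]
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return name, 100.0 * (pr - head) / head


if __name__ == "__main__":
    # Three rows copied verbatim from the table above.
    rows = [
        "test_items | 19.0350μs | 4.2066μs | 237.7213 KOps/s | 237.9675 KOps/s | |",
        "test_creation_empty | 49.5820μs | 12.4128μs | 80.5619 KOps/s | 114.5590 KOps/s | |",
        "test_membership_stacked_nested_last | 43.3110μs | 5.3710μs | 186.1838 KOps/s | 148.2057 KOps/s | |",
    ]
    for row in rows:
        name, change = relative_change(row)
        print(f"{name}: {change:+.1f}%")
```

Converting every cell to plain operations per second first matters because a few rows report Ops/s or MOps/s rather than KOps/s. Single-run differences of a few percent are often within benchmark noise, so only the larger gaps are likely to be meaningful.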
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1158ms | 11.0786μs | 90.2641 KOps/s | 100.6182 KOps/s | |
test_plain_set_stack_nested | 39.4410μs | 11.0311μs | 90.6525 KOps/s | 100.2648 KOps/s | |
test_plain_set_nested_inplace | 52.5410μs | 11.9517μs | 83.6700 KOps/s | 92.1155 KOps/s | |
test_plain_set_stack_nested_inplace | 36.9000μs | 11.9240μs | 83.8643 KOps/s | 92.1665 KOps/s | |
test_items | 36.2710μs | 2.9090μs | 343.7555 KOps/s | 342.9961 KOps/s | |
test_items_nested | 0.3868ms | 0.3550ms | 2.8170 KOps/s | 2.8591 KOps/s | |
test_items_nested_locked | 0.5088ms | 0.3575ms | 2.7968 KOps/s | 2.8167 KOps/s | |
test_items_nested_leaf | 94.9720μs | 58.7503μs | 17.0212 KOps/s | 17.1740 KOps/s | |
test_items_stack_nested | 0.3932ms | 0.3541ms | 2.8241 KOps/s | 2.8397 KOps/s | |
test_items_stack_nested_leaf | 92.9120μs | 58.2887μs | 17.1560 KOps/s | 17.0940 KOps/s | |
test_items_stack_nested_locked | 0.5284ms | 0.3574ms | 2.7983 KOps/s | 2.8218 KOps/s | |
test_keys | 0.1812ms | 3.4886μs | 286.6495 KOps/s | 289.2708 KOps/s | |
test_keys_nested | 0.2433ms | 69.5048μs | 14.3875 KOps/s | 14.2619 KOps/s | |
test_keys_nested_locked | 0.8872ms | 75.8019μs | 13.1923 KOps/s | 13.2085 KOps/s | |
test_keys_nested_leaf | 93.5220μs | 61.5573μs | 16.2450 KOps/s | 16.2891 KOps/s | |
test_keys_stack_nested | 0.1029ms | 70.4171μs | 14.2011 KOps/s | 14.0734 KOps/s | |
test_keys_stack_nested_leaf | 0.1358ms | 61.6380μs | 16.2238 KOps/s | 16.1322 KOps/s | |
test_keys_stack_nested_locked | 0.1215ms | 75.6306μs | 13.2222 KOps/s | 13.0660 KOps/s | |
test_values | 5.6585μs | 0.8435μs | 1.1855 MOps/s | 1.1870 MOps/s | |
test_values_nested | 0.1447ms | 31.2624μs | 31.9873 KOps/s | 32.1996 KOps/s | |
test_values_nested_locked | 66.2510μs | 33.0857μs | 30.2246 KOps/s | 30.4706 KOps/s | |
test_values_nested_leaf | 0.1598ms | 33.6406μs | 29.7260 KOps/s | 29.7692 KOps/s | |
test_values_stack_nested | 0.1811ms | 31.4562μs | 31.7903 KOps/s | 31.8373 KOps/s | |
test_values_stack_nested_leaf | 0.1006ms | 33.9553μs | 29.4505 KOps/s | 29.5747 KOps/s | |
test_values_stack_nested_locked | 61.4120μs | 33.1593μs | 30.1575 KOps/s | 30.0827 KOps/s | |
test_membership | 2.0810μs | 0.5101μs | 1.9602 MOps/s | 1.9663 MOps/s | |
test_membership_nested | 37.1155μs | 2.0409μs | 489.9712 KOps/s | 492.4674 KOps/s | |
test_membership_nested_leaf | 17.9650μs | 2.0658μs | 484.0844 KOps/s | 494.9287 KOps/s | |
test_membership_stacked_nested | 0.1480ms | 2.1382μs | 467.6756 KOps/s | 475.6107 KOps/s | |
test_membership_stacked_nested_leaf | 32.1710μs | 2.1190μs | 471.9170 KOps/s | 471.3529 KOps/s | |
test_membership_nested_last | 42.3010μs | 2.9317μs | 341.0974 KOps/s | 339.0168 KOps/s | |
test_membership_nested_leaf_last | 33.7610μs | 2.9853μs | 334.9727 KOps/s | 335.1325 KOps/s | |
test_membership_stacked_nested_last | 28.1300μs | 3.0111μs | 332.1012 KOps/s | 336.4362 KOps/s | |
test_membership_stacked_nested_leaf_last | 29.1010μs | 2.9623μs | 337.5730 KOps/s | 340.1518 KOps/s | |
test_nested_getleaf | 69.5320μs | 6.1125μs | 163.5985 KOps/s | 162.4525 KOps/s | |
test_nested_get | 28.3210μs | 5.8698μs | 170.3650 KOps/s | 172.0342 KOps/s | |
test_stacked_getleaf | 37.5510μs | 6.1652μs | 162.2013 KOps/s | 163.1330 KOps/s | |
test_stacked_get | 32.5810μs | 5.8593μs | 170.6691 KOps/s | 171.5939 KOps/s | |
test_nested_getitemleaf | 35.2110μs | 6.2808μs | 159.2163 KOps/s | 160.2169 KOps/s | |
test_nested_getitem | 29.6800μs | 5.9711μs | 167.4736 KOps/s | 169.5126 KOps/s | |
test_stacked_getitemleaf | 58.7610μs | 6.2576μs | 159.8063 KOps/s | 160.2491 KOps/s | |
test_stacked_getitem | 66.6220μs | 5.9190μs | 168.9461 KOps/s | 169.5119 KOps/s | |
test_lock_nested | 9.1979ms | 0.3782ms | 2.6444 KOps/s | 2.7079 KOps/s | |
test_lock_stack_nested | 0.4647ms | 0.3367ms | 2.9698 KOps/s | 3.0020 KOps/s | |
test_unlock_nested | 0.7575ms | 0.3101ms | 3.2243 KOps/s | 3.2974 KOps/s | |
test_unlock_stack_nested | 0.4425ms | 0.2766ms | 3.6156 KOps/s | 3.6759 KOps/s | |
test_flatten_speed | 0.2582ms | 75.2230μs | 13.2938 KOps/s | 13.3987 KOps/s | |
test_unflatten_speed | 0.5364ms | 0.3132ms | 3.1924 KOps/s | 3.2709 KOps/s | |
test_common_ops | 1.7862ms | 0.6146ms | 1.6270 KOps/s | 1.8041 KOps/s | |
test_creation | 0.1149ms | 1.4789μs | 676.1619 KOps/s | 675.4237 KOps/s | |
test_creation_empty | 32.8210μs | 8.3112μs | 120.3194 KOps/s | 165.1243 KOps/s | |
test_creation_nested_1 | 1.7595ms | 9.7785μs | 102.2655 KOps/s | 131.4359 KOps/s | |
test_creation_nested_2 | 37.4910μs | 12.5219μs | 79.8600 KOps/s | 99.6338 KOps/s | |
test_clone | 0.1060ms | 11.0072μs | 90.8492 KOps/s | 96.2708 KOps/s | |
test_getitem[int] | 1.2943ms | 10.7736μs | 92.8199 KOps/s | 94.8147 KOps/s | |
test_getitem[slice_int] | 94.3627ms | 30.1494μs | 33.1682 KOps/s | 49.6787 KOps/s | |
test_getitem[range] | 0.1624ms | 36.6095μs | 27.3153 KOps/s | 27.6115 KOps/s | |
test_getitem[tuple] | 0.1056ms | 18.5402μs | 53.9369 KOps/s | 56.2152 KOps/s | |
test_getitem[list] | 0.2541ms | 34.9370μs | 28.6229 KOps/s | 31.3866 KOps/s | |
test_setitem_dim[int] | 40.9810μs | 19.7504μs | 50.6318 KOps/s | 54.9696 KOps/s | |
test_setitem_dim[slice_int] | 63.7420μs | 40.3749μs | 24.7679 KOps/s | 26.8818 KOps/s | |
test_setitem_dim[range] | 0.1593ms | 56.9013μs | 17.5743 KOps/s | 19.2895 KOps/s | |
test_setitem_dim[tuple] | 73.4010μs | 34.2807μs | 29.1710 KOps/s | 31.8396 KOps/s | |
test_setitem | 0.1566ms | 15.9568μs | 62.6692 KOps/s | 73.0623 KOps/s | |
test_set | 0.1516ms | 15.5112μs | 64.4694 KOps/s | 75.6511 KOps/s | |
test_set_shared | 1.5257ms | 0.1471ms | 6.8004 KOps/s | 6.8964 KOps/s | |
test_update | 0.3608ms | 18.3622μs | 54.4596 KOps/s | 66.4583 KOps/s | |
test_update_nested | 0.1394ms | 23.3547μs | 42.8180 KOps/s | 49.7052 KOps/s | |
test_update__nested | 1.0986ms | 24.8512μs | 40.2396 KOps/s | 41.5833 KOps/s | |
test_set_nested | 0.1142ms | 16.6491μs | 60.0631 KOps/s | 68.7662 KOps/s | |
test_set_nested_new | 0.1622ms | 18.9506μs | 52.7688 KOps/s | 60.1863 KOps/s | |
test_select | 0.1399ms | 31.6094μs | 31.6361 KOps/s | 34.2043 KOps/s | |
test_select_nested | 78.9720μs | 42.4660μs | 23.5482 KOps/s | 23.8535 KOps/s | |
test_exclude_nested | 0.3731ms | 62.6698μs | 15.9566 KOps/s | 16.2605 KOps/s | |
test_empty[True] | 0.3392ms | 0.2789ms | 3.5856 KOps/s | 3.6370 KOps/s | |
test_empty[False] | 3.6161μs | 0.7559μs | 1.3229 MOps/s | 1.3599 MOps/s | |
test_to | 85.5020μs | 53.4486μs | 18.7096 KOps/s | 18.5422 KOps/s | |
test_to_nonblocking | 0.1981ms | 45.9317μs | 21.7714 KOps/s | 21.9947 KOps/s | |
test_unbind_speed | 0.2610ms | 0.2306ms | 4.3367 KOps/s | 4.3968 KOps/s | |
test_unbind_speed_stack0 | 0.3429ms | 0.2346ms | 4.2624 KOps/s | 4.3402 KOps/s | |
test_unbind_speed_stack1 | 93.6029ms | 0.6515ms | 1.5349 KOps/s | 1.5510 KOps/s | |
test_split | 95.1344ms | 1.7192ms | 581.6725 Ops/s | 592.9489 Ops/s | |
test_chunk | 1.5610ms | 1.4470ms | 691.0719 Ops/s | 633.1177 Ops/s | |
test_consolidate[False-None] | 97.5196ms | 2.8346ms | 352.7839 Ops/s | 384.8231 Ops/s | |
test_consolidate[default-None] | 1.8197ms | 1.6753ms | 596.9143 Ops/s | 607.9256 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9615ms | 1.7067ms | 585.9161 Ops/s | 601.3920 Ops/s | |
test_consolidate_njt[False-None] | 6.6486ms | 6.4360ms | 155.3748 Ops/s | 154.0013 Ops/s | |
test_to[False-False-None] | 1.8606ms | 1.6415ms | 609.1873 Ops/s | 611.4548 Ops/s | |
test_to[True-False-None] | 1.5305ms | 1.2742ms | 784.8116 Ops/s | 813.5842 Ops/s | |
test_to[within-False-None] | 4.2501ms | 3.9975ms | 250.1591 Ops/s | 254.4441 Ops/s | |
test_to[True-default-None] | 5.4032ms | 5.2604ms | 190.1014 Ops/s | 191.5795 Ops/s | |
test_to_njt[False-False-None] | 7.3286ms | 7.1066ms | 140.7135 Ops/s | 143.9985 Ops/s | |
test_to_njt[True-False-None] | 5.9748ms | 5.7448ms | 174.0690 Ops/s | 179.8099 Ops/s | |
test_to_njt[within-False-None] | 12.2321ms | 12.0004ms | 83.3307 Ops/s | 83.0075 Ops/s | |
test_creation[device0] | 0.4420ms | 83.0846μs | 12.0359 KOps/s | 12.4186 KOps/s | |
test_creation_from_tensor | 0.6111ms | 85.4006μs | 11.7095 KOps/s | 11.7798 KOps/s | |
test_add_one[memmap_tensor0] | 0.3249ms | 6.8847μs | 145.2492 KOps/s | 152.4422 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8205μs | 0.4000μs | 2.5000 MOps/s | 2.4461 MOps/s | |
test_stack[memmap_tensor0] | 42.1010μs | 4.4198μs | 226.2558 KOps/s | 228.0005 KOps/s | |
test_memmaptd_index | 1.8237ms | 0.2454ms | 4.0754 KOps/s | 4.0688 KOps/s | |
test_memmaptd_index_astensor | 0.6047ms | 0.3024ms | 3.3072 KOps/s | 3.2974 KOps/s | |
test_memmaptd_index_op | 1.0121ms | 0.5854ms | 1.7083 KOps/s | 1.8082 KOps/s | |
test_serialize_model | 0.4149s | 0.1712s | 5.8424 Ops/s | 7.6392 Ops/s | |
test_serialize_model_pickle | 1.3504s | 1.2171s | 0.8216 Ops/s | 0.8239 Ops/s | |
test_serialize_weights | 0.1302s | 0.1292s | 7.7390 Ops/s | 7.7231 Ops/s | |
test_serialize_weights_returnearly | 0.3178s | 52.8066ms | 18.9370 Ops/s | 23.5690 Ops/s | |
test_serialize_weights_pickle | 1.3670s | 1.2246s | 0.8166 Ops/s | 0.8196 Ops/s | |
test_reshape_pytree | 0.1659ms | 22.0507μs | 45.3500 KOps/s | 44.7411 KOps/s | |
test_reshape_td | 0.1550ms | 26.5026μs | 37.7322 KOps/s | 38.0033 KOps/s | |
test_view_pytree | 0.1476ms | 21.7448μs | 45.9880 KOps/s | 44.9435 KOps/s | |
test_view_td | 0.1176ms | 30.1658μs | 33.1501 KOps/s | 33.3866 KOps/s | |
test_unbind_pytree | 0.1448ms | 28.0275μs | 35.6793 KOps/s | 35.7743 KOps/s | |
test_unbind_td | 0.7867ms | 36.0986μs | 27.7019 KOps/s | 27.9938 KOps/s | |
test_split_pytree | 0.1549ms | 29.8576μs | 33.4923 KOps/s | 33.0619 KOps/s | |
test_split_td | 0.9944ms | 38.4470μs | 26.0099 KOps/s | 25.7446 KOps/s | |
test_add_pytree | 0.1380ms | 34.8385μs | 28.7039 KOps/s | 29.0986 KOps/s | |
test_add_td | 0.2269ms | 52.0380μs | 19.2167 KOps/s | 22.9815 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2598ms | 0.1211ms | 8.2576 KOps/s | 7.6516 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2763ms | 0.1267ms | 7.8952 KOps/s | 8.0138 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2460ms | 95.9315μs | 10.4241 KOps/s | 10.3261 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3036ms | 0.1486ms | 6.7282 KOps/s | 6.8882 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1656ms | 24.2520μs | 41.2337 KOps/s | 44.2945 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1575ms | 26.3295μs | 37.9803 KOps/s | 37.3403 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4409ms | 64.2791μs | 15.5572 KOps/s | 15.3652 KOps/s | |
test_compile_copy_nested[pytree-eager] | 93.6420μs | 49.6398μs | 20.1451 KOps/s | 20.0310 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.5505ms | 0.1431ms | 6.9872 KOps/s | 7.1525 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6383ms | 0.2110ms | 4.7399 KOps/s | 4.7941 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2578ms | 97.7025μs | 10.2352 KOps/s | 10.1008 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2082ms | 52.1853μs | 19.1625 KOps/s | 19.2634 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.5200ms | 0.1349ms | 7.4109 KOps/s | 7.4575 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8454ms | 0.4766ms | 2.0982 KOps/s | 2.1390 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4272ms | 0.2521ms | 3.9672 KOps/s | 4.0527 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3063ms | 0.1500ms | 6.6684 KOps/s | 7.0646 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2532ms | 62.7456μs | 15.9374 KOps/s | 16.4594 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2790ms | 0.1009ms | 9.9101 KOps/s | 10.3736 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.7656ms | 0.3985ms | 2.5093 KOps/s | 2.5504 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2744ms | 0.1356ms | 7.3752 KOps/s | 7.4951 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2128ms | 21.5743μs | 46.3514 KOps/s | 54.9703 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 97.5820μs | 26.2686μs | 38.0683 KOps/s | 37.4626 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1826ms | 69.0604μs | 14.4801 KOps/s | 14.4462 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1046ms | 51.3201μs | 19.4855 KOps/s | 19.4447 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6011ms | 0.3887ms | 2.5728 KOps/s | 2.2799 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7954ms | 2.5678ms | 389.4420 Ops/s | 379.8061 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5802ms | 0.4337ms | 2.3059 KOps/s | 2.3072 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8259ms | 2.5573ms | 391.0323 Ops/s | 397.3517 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.3701ms | 0.1129ms | 8.8573 KOps/s | 8.8037 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5870ms | 77.9194μs | 12.8338 KOps/s | 13.0526 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.5352ms | 0.1061ms | 9.4268 KOps/s | 9.7782 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2573ms | 71.2560μs | 14.0339 KOps/s | 15.1333 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2891ms | 0.1111ms | 8.9970 KOps/s | 9.6938 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2609ms | 70.6369μs | 14.1569 KOps/s | 14.9575 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2506ms | 99.8020μs | 10.0198 KOps/s | 10.1596 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1573ms | 17.0621μs | 58.6095 KOps/s | 58.3262 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2571ms | 95.3688μs | 10.4856 KOps/s | 10.6571 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1505ms | 15.8286μs | 63.1769 KOps/s | 63.4140 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.3020ms | 95.5092μs | 10.4702 KOps/s | 10.5472 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.2159ms | 15.8129μs | 63.2395 KOps/s | 63.3615 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2832ms | 0.1016ms | 9.8395 KOps/s | 10.1296 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6503ms | 16.8118μs | 59.4819 KOps/s | 57.7602 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2992ms | 0.1008ms | 9.9226 KOps/s | 10.5514 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1535ms | 15.6218μs | 64.0130 KOps/s | 63.1865 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2954ms | 95.3312μs | 10.4897 KOps/s | 10.5850 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1356ms | 15.7392μs | 63.5357 KOps/s | 63.6314 KOps/s | |
test_mod_add[eager] | 0.1885ms | 37.6430μs | 26.5653 KOps/s | 28.4673 KOps/s | |
test_mod_add[compile] | 0.3781ms | 79.3550μs | 12.6016 KOps/s | 12.5653 KOps/s | |
test_mod_add[compile-overhead] | 0.3187ms | 0.1650ms | 6.0596 KOps/s | 5.7915 KOps/s | |
test_mod_wrap[eager] | 0.3904ms | 0.2437ms | 4.1034 KOps/s | 4.1456 KOps/s | |
test_mod_wrap[compile] | 0.4256ms | 0.2773ms | 3.6063 KOps/s | 3.5473 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1049ms | 3.6844ms | 271.4173 Ops/s | 263.2948 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5426ms | 1.3494ms | 741.0492 Ops/s | 696.7550 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4587ms | 1.2627ms | 791.9411 Ops/s | 737.7342 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3821ms | 0.9178ms | 1.0895 KOps/s | 951.4185 Ops/s | |
test_seq_add[eager] | 0.2945ms | 0.1172ms | 8.5344 KOps/s | 9.2682 KOps/s | |
test_seq_add[compile] | 0.2693ms | 92.3448μs | 10.8290 KOps/s | 10.8471 KOps/s | |
test_seq_add[compile-overhead] | 0.3161ms | 0.1340ms | 7.4642 KOps/s | 7.8452 KOps/s | |
test_seq_wrap[eager] | 0.6240ms | 0.4290ms | 2.3309 KOps/s | 2.4714 KOps/s | |
test_seq_wrap[compile] | 0.5205ms | 0.3157ms | 3.1681 KOps/s | 3.3774 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3667ms | 0.2218ms | 4.5088 KOps/s | 4.4900 KOps/s | |
test_func_call_runtime[False-eager] | 0.8758ms | 0.7269ms | 1.3757 KOps/s | 1.3902 KOps/s | |
test_func_call_runtime[False-compile] | 0.9708ms | 0.7568ms | 1.3213 KOps/s | 1.3461 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5046ms | 0.3632ms | 2.7536 KOps/s | 2.7889 KOps/s | |
test_func_call_runtime[True-eager] | 1.0554ms | 0.8848ms | 1.1302 KOps/s | 1.1232 KOps/s | |
test_func_call_runtime[True-compile] | 0.9287ms | 0.7536ms | 1.3270 KOps/s | 1.3305 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5726ms | 0.3831ms | 2.6100 KOps/s | 2.6423 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8740ms | 0.7217ms | 1.3855 KOps/s | 1.3867 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8999ms | 0.7399ms | 1.3515 KOps/s | 1.3578 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5052ms | 0.3602ms | 2.7764 KOps/s | 2.7564 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1510ms | 0.9814ms | 1.0190 KOps/s | 1.0111 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9335ms | 0.7846ms | 1.2746 KOps/s | 1.2833 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5495ms | 0.4036ms | 2.4775 KOps/s | 2.4578 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5213ms | 2.0448ms | 489.0493 Ops/s | 483.7951 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9862ms | 0.8055ms | 1.2415 KOps/s | 1.2546 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5625ms | 0.4061ms | 2.4624 KOps/s | 2.4553 KOps/s | |
test_distributed | 4.2462ms | 0.2906ms | 3.4416 KOps/s | 8.1819 KOps/s | |
test_tdmodule | 64.9010μs | 19.2100μs | 52.0562 KOps/s | 53.3287 KOps/s | |
test_tdmodule_dispatch | 0.1595ms | 34.7893μs | 28.7444 KOps/s | 30.8069 KOps/s | |
test_tdseq | 37.9410μs | 18.6636μs | 53.5804 KOps/s | 57.1845 KOps/s | |
test_tdseq_dispatch | 0.1127ms | 36.3180μs | 27.5345 KOps/s | 29.6530 KOps/s | |
test_instantiation_functorch | 1.7315ms | 1.5320ms | 652.7337 Ops/s | 641.2741 Ops/s | |
test_exec_functorch | 0.2853ms | 0.1464ms | 6.8290 KOps/s | 7.0397 KOps/s | |
test_exec_functional_call | 0.2423ms | 0.1390ms | 7.1947 KOps/s | 7.4337 KOps/s | |
test_exec_td_decorator | 0.3738ms | 0.1858ms | 5.3829 KOps/s | 5.5439 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8397ms | 0.6737ms | 1.4844 KOps/s | 1.4478 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8469ms | 0.6717ms | 1.4887 KOps/s | 1.4616 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7489ms | 0.5828ms | 1.7158 KOps/s | 1.7028 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7843ms | 0.5905ms | 1.6936 KOps/s | 1.6974 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.1030ms | 18.8868ms | 52.9470 Ops/s | 52.9716 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.7197ms | 19.0164ms | 52.5863 Ops/s | 52.9226 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.0397ms | 18.8330ms | 53.0982 Ops/s | 53.3530 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.9847ms | 18.7790ms | 53.2509 Ops/s | 53.3424 Ops/s | |
test_to_module_speed[True] | 1.0543ms | 0.9441ms | 1.0592 KOps/s | 1.0546 KOps/s | |
test_to_module_speed[False] | 1.3027ms | 0.9329ms | 1.0719 KOps/s | 1.0720 KOps/s | |
test_tc_init | 0.1190ms | 36.2921μs | 27.5542 KOps/s | 29.5447 KOps/s | |
test_tc_init_nested | 0.1544ms | 73.3899μs | 13.6259 KOps/s | 14.9013 KOps/s | |
test_tc_first_layer_tensor | 4.7286μs | 0.7027μs | 1.4231 MOps/s | 1.4649 MOps/s | |
test_tc_first_layer_nontensor | 26.0010μs | 2.3052μs | 433.8070 KOps/s | 430.6062 KOps/s | |
test_tc_second_layer_tensor | 16.8955μs | 1.4377μs | 695.5535 KOps/s | 714.6280 KOps/s | |
test_tc_second_layer_nontensor | 65.7120μs | 3.0455μs | 328.3540 KOps/s | 326.8768 KOps/s | |
test_unbind | 0.2209s | 9.9300ms | 100.7050 Ops/s | 151.1369 Ops/s | |
test_full_like | 11.4193ms | 9.5652ms | 104.5454 Ops/s | 105.2023 Ops/s | |
test_zeros_like | 5.3477ms | 4.3647ms | 229.1102 Ops/s | 231.6464 Ops/s | |
test_ones_like | 4.9352ms | 4.3703ms | 228.8187 Ops/s | 229.2322 Ops/s | |
test_clone | 11.9437ms | 9.3713ms | 106.7085 Ops/s | 106.6670 Ops/s | |
test_squeeze | 0.1491ms | 9.2081μs | 108.6001 KOps/s | 108.4736 KOps/s | |
test_unsqueeze | 0.2115ms | 71.9543μs | 13.8977 KOps/s | 14.4467 KOps/s | |
test_split | 0.2979ms | 0.1577ms | 6.3399 KOps/s | 6.3373 KOps/s | |
test_permute | 0.3030ms | 0.1790ms | 5.5857 KOps/s | 5.7446 KOps/s | |
test_stack | 51.5214ms | 51.0742ms | 19.5794 Ops/s | 19.4460 Ops/s | |
test_cat | 51.6775ms | 50.9127ms | 19.6415 Ops/s | 19.4921 Ops/s |
Stack from ghstack (oldest at bottom):