[Feature,Refactor] More args in constructors, refactor free functions #1116

Merged (1 commit) on Nov 28, 2024


vmoens (Contributor) commented on Nov 28, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 28, 2024
ghstack-source-id: 35e2444bb5d4bf92b78437063e2f5aec83651713
Pull Request resolved: #1116
facebook-github-bot added the "CLA Signed" label on Nov 28, 2024 (label description: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: 6. Worsened: 47.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 35.2550μs 18.2388μs 54.8282 KOps/s 63.0230 KOps/s $\textbf{\color{#d91a1a}-13.00\%}$
test_plain_set_stack_nested 62.2660μs 17.5925μs 56.8424 KOps/s 59.3763 KOps/s $\color{#d91a1a}-4.27\%$
test_plain_set_nested_inplace 53.5400μs 19.5894μs 51.0480 KOps/s 54.2757 KOps/s $\textbf{\color{#d91a1a}-5.95\%}$
test_plain_set_stack_nested_inplace 51.9270μs 19.8421μs 50.3979 KOps/s 55.7857 KOps/s $\textbf{\color{#d91a1a}-9.66\%}$
test_items 32.2680μs 4.1642μs 240.1422 KOps/s 248.3198 KOps/s $\color{#d91a1a}-3.29\%$
test_items_nested 0.4956ms 0.3833ms 2.6089 KOps/s 2.5543 KOps/s $\color{#35bf28}+2.14\%$
test_items_nested_locked 0.5127ms 0.3941ms 2.5372 KOps/s 2.5802 KOps/s $\color{#d91a1a}-1.67\%$
test_items_nested_leaf 0.1364ms 72.3209μs 13.8273 KOps/s 14.4643 KOps/s $\color{#d91a1a}-4.40\%$
test_items_stack_nested 0.6151ms 0.3949ms 2.5323 KOps/s 2.4811 KOps/s $\color{#35bf28}+2.06\%$
test_items_stack_nested_leaf 0.1316ms 71.7176μs 13.9436 KOps/s 14.0368 KOps/s $\color{#d91a1a}-0.66\%$
test_items_stack_nested_locked 0.5083ms 0.3866ms 2.5869 KOps/s 2.5813 KOps/s $\color{#35bf28}+0.21\%$
test_keys 15.8290μs 3.4109μs 293.1749 KOps/s 292.1116 KOps/s $\color{#35bf28}+0.36\%$
test_keys_nested 0.2822ms 0.1358ms 7.3630 KOps/s 7.3421 KOps/s $\color{#35bf28}+0.28\%$
test_keys_nested_locked 0.6965ms 0.1430ms 6.9906 KOps/s 7.0888 KOps/s $\color{#d91a1a}-1.38\%$
test_keys_nested_leaf 0.2135ms 0.1156ms 8.6517 KOps/s 8.7388 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_stack_nested 0.1934ms 0.1349ms 7.4121 KOps/s 7.5645 KOps/s $\color{#d91a1a}-2.01\%$
test_keys_stack_nested_leaf 0.2046ms 0.1154ms 8.6680 KOps/s 8.8559 KOps/s $\color{#d91a1a}-2.12\%$
test_keys_stack_nested_locked 0.2191ms 0.1429ms 7.0002 KOps/s 7.2318 KOps/s $\color{#d91a1a}-3.20\%$
test_values 7.7064μs 1.0681μs 936.2654 KOps/s 984.6997 KOps/s $\color{#d91a1a}-4.92\%$
test_values_nested 0.1095ms 55.1934μs 18.1181 KOps/s 18.7184 KOps/s $\color{#d91a1a}-3.21\%$
test_values_nested_locked 0.1085ms 54.3848μs 18.3875 KOps/s 18.6158 KOps/s $\color{#d91a1a}-1.23\%$
test_values_nested_leaf 0.1160ms 59.0856μs 16.9246 KOps/s 17.5316 KOps/s $\color{#d91a1a}-3.46\%$
test_values_stack_nested 0.1247ms 57.0596μs 17.5255 KOps/s 18.3990 KOps/s $\color{#d91a1a}-4.75\%$
test_values_stack_nested_leaf 0.1198ms 61.3388μs 16.3029 KOps/s 16.8924 KOps/s $\color{#d91a1a}-3.49\%$
test_values_stack_nested_locked 0.1158ms 55.3944μs 18.0524 KOps/s 17.9222 KOps/s $\color{#35bf28}+0.73\%$
test_membership 16.3400μs 0.8635μs 1.1580 MOps/s 1.1685 MOps/s $\color{#d91a1a}-0.89\%$
test_membership_nested 22.2510μs 2.8078μs 356.1485 KOps/s 359.0121 KOps/s $\color{#d91a1a}-0.80\%$
test_membership_nested_leaf 40.5860μs 2.9520μs 338.7530 KOps/s 360.4117 KOps/s $\textbf{\color{#d91a1a}-6.01\%}$
test_membership_stacked_nested 25.9290μs 2.8780μs 347.4619 KOps/s 355.1657 KOps/s $\color{#d91a1a}-2.17\%$
test_membership_stacked_nested_leaf 23.5540μs 2.9486μs 339.1450 KOps/s 351.6689 KOps/s $\color{#d91a1a}-3.56\%$
test_membership_nested_last 42.2790μs 4.4432μs 225.0635 KOps/s 243.3449 KOps/s $\textbf{\color{#d91a1a}-7.51\%}$
test_membership_nested_leaf_last 38.5320μs 4.4418μs 225.1352 KOps/s 240.3105 KOps/s $\textbf{\color{#d91a1a}-6.31\%}$
test_membership_stacked_nested_last 42.5100μs 4.2462μs 235.5050 KOps/s 76.2293 KOps/s $\textbf{\color{#35bf28}+208.94\%}$
test_membership_stacked_nested_leaf_last 27.2810μs 4.3560μs 229.5709 KOps/s 76.2606 KOps/s $\textbf{\color{#35bf28}+201.03\%}$
test_nested_getleaf 52.1980μs 10.6340μs 94.0381 KOps/s 97.2893 KOps/s $\color{#d91a1a}-3.34\%$
test_nested_get 45.1950μs 10.2125μs 97.9195 KOps/s 103.3443 KOps/s $\textbf{\color{#d91a1a}-5.25\%}$
test_stacked_getleaf 33.4630μs 10.5462μs 94.8212 KOps/s 92.9434 KOps/s $\color{#35bf28}+2.02\%$
test_stacked_get 28.1630μs 9.9661μs 100.3404 KOps/s 101.9306 KOps/s $\color{#d91a1a}-1.56\%$
test_nested_getitemleaf 57.0570μs 10.9690μs 91.1659 KOps/s 93.1813 KOps/s $\color{#d91a1a}-2.16\%$
test_nested_getitem 47.4090μs 10.3617μs 96.5093 KOps/s 95.4660 KOps/s $\color{#35bf28}+1.09\%$
test_stacked_getitemleaf 56.2550μs 11.1036μs 90.0608 KOps/s 90.5838 KOps/s $\color{#d91a1a}-0.58\%$
test_stacked_getitem 36.1180μs 10.4083μs 96.0774 KOps/s 97.5908 KOps/s $\color{#d91a1a}-1.55\%$
test_lock_nested 0.8162ms 0.4322ms 2.3139 KOps/s 2.3615 KOps/s $\color{#d91a1a}-2.01\%$
test_lock_stack_nested 0.6379ms 0.4009ms 2.4942 KOps/s 2.5603 KOps/s $\color{#d91a1a}-2.58\%$
test_unlock_nested 0.7063ms 0.3460ms 2.8901 KOps/s 2.8434 KOps/s $\color{#35bf28}+1.64\%$
test_unlock_stack_nested 0.6425ms 0.3214ms 3.1116 KOps/s 3.2477 KOps/s $\color{#d91a1a}-4.19\%$
test_flatten_speed 0.1938ms 93.7609μs 10.6654 KOps/s 10.9651 KOps/s $\color{#d91a1a}-2.73\%$
test_unflatten_speed 0.9111ms 0.4936ms 2.0260 KOps/s 2.0687 KOps/s $\color{#d91a1a}-2.06\%$
test_common_ops 2.9917ms 0.7602ms 1.3154 KOps/s 1.4270 KOps/s $\textbf{\color{#d91a1a}-7.82\%}$
test_creation 34.9350μs 1.9316μs 517.7103 KOps/s 499.7471 KOps/s $\color{#35bf28}+3.59\%$
test_creation_empty 45.0940μs 11.1853μs 89.4034 KOps/s 118.3961 KOps/s $\textbf{\color{#d91a1a}-24.49\%}$
test_creation_nested_1 1.8667ms 13.8892μs 71.9984 KOps/s 91.4109 KOps/s $\textbf{\color{#d91a1a}-21.24\%}$
test_creation_nested_2 42.8300μs 17.8727μs 55.9514 KOps/s 65.3145 KOps/s $\textbf{\color{#d91a1a}-14.34\%}$
test_clone 0.1021ms 12.7454μs 78.4595 KOps/s 78.6441 KOps/s $\color{#d91a1a}-0.23\%$
test_getitem[int] 0.8852ms 12.1953μs 81.9986 KOps/s 80.1922 KOps/s $\color{#35bf28}+2.25\%$
test_getitem[slice_int] 0.1400ms 24.0419μs 41.5941 KOps/s 40.6064 KOps/s $\color{#35bf28}+2.43\%$
test_getitem[range] 0.2540ms 46.7883μs 21.3729 KOps/s 21.6452 KOps/s $\color{#d91a1a}-1.26\%$
test_getitem[tuple] 0.1298ms 19.4925μs 51.3018 KOps/s 49.5041 KOps/s $\color{#35bf28}+3.63\%$
test_getitem[list] 0.2663ms 42.4668μs 23.5478 KOps/s 24.1752 KOps/s $\color{#d91a1a}-2.60\%$
test_setitem_dim[int] 48.9720μs 23.6367μs 42.3072 KOps/s 41.1746 KOps/s $\color{#35bf28}+2.75\%$
test_setitem_dim[slice_int] 0.1062ms 50.4114μs 19.8368 KOps/s 20.1663 KOps/s $\color{#d91a1a}-1.63\%$
test_setitem_dim[range] 0.1124ms 72.0283μs 13.8834 KOps/s 14.2282 KOps/s $\color{#d91a1a}-2.42\%$
test_setitem_dim[tuple] 66.9050μs 39.1830μs 25.5213 KOps/s 24.2958 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_setitem 93.1440μs 20.2960μs 49.2709 KOps/s 52.6371 KOps/s $\textbf{\color{#d91a1a}-6.40\%}$
test_set 0.1083ms 19.7348μs 50.6718 KOps/s 55.2837 KOps/s $\textbf{\color{#d91a1a}-8.34\%}$
test_set_shared 1.0856ms 0.1620ms 6.1718 KOps/s 6.0909 KOps/s $\color{#35bf28}+1.33\%$
test_update 0.1851ms 22.3671μs 44.7085 KOps/s 49.4310 KOps/s $\textbf{\color{#d91a1a}-9.55\%}$
test_update_nested 0.1135ms 32.7858μs 30.5010 KOps/s 32.3652 KOps/s $\textbf{\color{#d91a1a}-5.76\%}$
test_update__nested 0.2702ms 33.0885μs 30.2220 KOps/s 31.2481 KOps/s $\color{#d91a1a}-3.28\%$
test_set_nested 0.1493ms 21.9196μs 45.6213 KOps/s 49.5229 KOps/s $\textbf{\color{#d91a1a}-7.88\%}$
test_set_nested_new 0.1349ms 26.7674μs 37.3589 KOps/s 40.2036 KOps/s $\textbf{\color{#d91a1a}-7.08\%}$
test_select 0.2328ms 42.4912μs 23.5343 KOps/s 25.6267 KOps/s $\textbf{\color{#d91a1a}-8.16\%}$
test_select_nested 0.1220ms 59.2900μs 16.8663 KOps/s 17.4562 KOps/s $\color{#d91a1a}-3.38\%$
test_exclude_nested 0.1559ms 77.0966μs 12.9707 KOps/s 12.9177 KOps/s $\color{#35bf28}+0.41\%$
test_empty[True] 0.5312ms 0.3711ms 2.6944 KOps/s 2.7220 KOps/s $\color{#d91a1a}-1.01\%$
test_empty[False] 10.9003μs 1.2067μs 828.7105 KOps/s 864.0256 KOps/s $\color{#d91a1a}-4.09\%$
test_unbind_speed 0.3185ms 0.2506ms 3.9912 KOps/s 3.9040 KOps/s $\color{#35bf28}+2.23\%$
test_unbind_speed_stack0 0.3462ms 0.2509ms 3.9859 KOps/s 4.1524 KOps/s $\color{#d91a1a}-4.01\%$
test_unbind_speed_stack1 99.7101ms 0.7444ms 1.3434 KOps/s 1.5116 KOps/s $\textbf{\color{#d91a1a}-11.13\%}$
test_split 96.2533ms 1.6555ms 604.0495 Ops/s 592.3442 Ops/s $\color{#35bf28}+1.98\%$
test_chunk 0.1001s 1.6645ms 600.7783 Ops/s 604.2201 Ops/s $\color{#d91a1a}-0.57\%$
test_consolidate_njt[False-None] 9.7532ms 7.7742ms 128.6299 Ops/s 127.9156 Ops/s $\color{#35bf28}+0.56\%$
test_creation[device0] 0.2261ms 87.4427μs 11.4361 KOps/s 11.1993 KOps/s $\color{#35bf28}+2.11\%$
test_creation_from_tensor 4.3579ms 93.6284μs 10.6805 KOps/s 10.7909 KOps/s $\color{#d91a1a}-1.02\%$
test_add_one[memmap_tensor0] 0.1194ms 4.8522μs 206.0903 KOps/s 217.2374 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_contiguous[memmap_tensor0] 21.5410μs 0.4887μs 2.0463 MOps/s 2.0272 MOps/s $\color{#35bf28}+0.94\%$
test_stack[memmap_tensor0] 63.2880μs 3.3724μs 296.5216 KOps/s 301.4758 KOps/s $\color{#d91a1a}-1.64\%$
test_memmaptd_index 0.4732ms 0.2239ms 4.4664 KOps/s 4.3342 KOps/s $\color{#35bf28}+3.05\%$
test_memmaptd_index_astensor 0.6778ms 0.3040ms 3.2896 KOps/s 3.2998 KOps/s $\color{#d91a1a}-0.31\%$
test_memmaptd_index_op 1.0964ms 0.5758ms 1.7366 KOps/s 1.9019 KOps/s $\textbf{\color{#d91a1a}-8.69\%}$
test_serialize_model 0.1231s 0.1136s 8.8066 Ops/s 7.7829 Ops/s $\textbf{\color{#35bf28}+13.15\%}$
test_serialize_model_pickle 0.4750s 0.3967s 2.5205 Ops/s 2.5561 Ops/s $\color{#d91a1a}-1.39\%$
test_serialize_weights 0.2014s 0.1265s 7.9065 Ops/s 8.6790 Ops/s $\textbf{\color{#d91a1a}-8.90\%}$
test_serialize_weights_returnearly 0.1844s 0.1655s 6.0406 Ops/s 6.4357 Ops/s $\textbf{\color{#d91a1a}-6.14\%}$
test_serialize_weights_pickle 0.5471s 0.4394s 2.2760 Ops/s 2.5466 Ops/s $\textbf{\color{#d91a1a}-10.63\%}$
test_serialize_weights_filesystem 0.1450s 0.1410s 7.0922 Ops/s 7.1577 Ops/s $\color{#d91a1a}-0.92\%$
test_serialize_model_filesystem 0.1690s 0.1490s 6.7119 Ops/s 6.6946 Ops/s $\color{#35bf28}+0.26\%$
test_reshape_pytree 54.1410μs 25.8336μs 38.7093 KOps/s 37.9270 KOps/s $\color{#35bf28}+2.06\%$
test_reshape_td 80.9600μs 31.9904μs 31.2594 KOps/s 31.1755 KOps/s $\color{#35bf28}+0.27\%$
test_view_pytree 66.8130μs 26.6046μs 37.5875 KOps/s 37.8559 KOps/s $\color{#d91a1a}-0.71\%$
test_view_td 81.3910μs 38.4049μs 26.0384 KOps/s 27.6333 KOps/s $\textbf{\color{#d91a1a}-5.77\%}$
test_unbind_pytree 89.2750μs 30.0393μs 33.2897 KOps/s 33.7105 KOps/s $\color{#d91a1a}-1.25\%$
test_unbind_td 0.3586ms 37.6751μs 26.5427 KOps/s 25.9700 KOps/s $\color{#35bf28}+2.21\%$
test_split_pytree 69.9510μs 28.6116μs 34.9509 KOps/s 33.7563 KOps/s $\color{#35bf28}+3.54\%$
test_split_td 0.4978ms 42.5938μs 23.4776 KOps/s 23.4114 KOps/s $\color{#35bf28}+0.28\%$
test_add_pytree 85.9500μs 35.3492μs 28.2891 KOps/s 28.9404 KOps/s $\color{#d91a1a}-2.25\%$
test_add_td 0.1438ms 54.1916μs 18.4531 KOps/s 20.7449 KOps/s $\textbf{\color{#d91a1a}-11.05\%}$
test_compile_add_one_nested[tensordict-compile] 0.1369ms 59.7355μs 16.7405 KOps/s 16.7962 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_add_one_nested[tensordict-eager] 0.4908ms 0.1567ms 6.3802 KOps/s 6.4062 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_add_one_nested[pytree-compile] 87.5130μs 44.6692μs 22.3868 KOps/s 22.3819 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_one_nested[pytree-eager] 0.2214ms 0.1190ms 8.4033 KOps/s 8.6023 KOps/s $\color{#d91a1a}-2.31\%$
test_compile_copy_nested[tensordict-compile] 60.6230μs 24.7238μs 40.4469 KOps/s 40.0430 KOps/s $\color{#35bf28}+1.01\%$
test_compile_copy_nested[tensordict-eager] 0.1559ms 52.4958μs 19.0491 KOps/s 18.9525 KOps/s $\color{#35bf28}+0.51\%$
test_compile_copy_nested[pytree-compile] 0.1506ms 78.6137μs 12.7204 KOps/s 12.7734 KOps/s $\color{#d91a1a}-0.42\%$
test_compile_copy_nested[pytree-eager] 0.1211ms 66.5749μs 15.0207 KOps/s 14.9827 KOps/s $\color{#35bf28}+0.25\%$
test_compile_add_one_flat[tensordict-compile] 0.1882ms 0.1024ms 9.7660 KOps/s 9.2670 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_compile_add_one_flat[tensordict-eager] 0.3204ms 0.1960ms 5.1027 KOps/s 4.9442 KOps/s $\color{#35bf28}+3.21\%$
test_compile_add_one_flat[tensorclass-compile] 0.1451ms 43.9864μs 22.7343 KOps/s 21.5630 KOps/s $\textbf{\color{#35bf28}+5.43\%}$
test_compile_add_one_flat[tensorclass-eager] 0.4934ms 61.7538μs 16.1933 KOps/s 16.6406 KOps/s $\color{#d91a1a}-2.69\%$
test_compile_add_one_flat[pytree-compile] 0.1717ms 98.7389μs 10.1277 KOps/s 9.7604 KOps/s $\color{#35bf28}+3.76\%$
test_compile_add_one_flat[pytree-eager] 0.3516ms 0.2012ms 4.9700 KOps/s 5.0506 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_add_self_flat[tensordict-eager] 0.4808ms 0.2080ms 4.8077 KOps/s 5.0208 KOps/s $\color{#d91a1a}-4.24\%$
test_compile_add_self_flat[tensordict-compile] 0.1631ms 0.1010ms 9.8966 KOps/s 9.9538 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_add_self_flat[tensorclass-eager] 0.2174ms 53.3925μs 18.7292 KOps/s 19.0085 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_add_self_flat[tensorclass-compile] 0.2105ms 47.9630μs 20.8494 KOps/s 22.7635 KOps/s $\textbf{\color{#d91a1a}-8.41\%}$
test_compile_add_self_flat[pytree-eager] 1.6135ms 0.1604ms 6.2337 KOps/s 6.3267 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_add_self_flat[pytree-compile] 0.1849ms 0.1011ms 9.8946 KOps/s 10.2495 KOps/s $\color{#d91a1a}-3.46\%$
test_compile_copy_flat[tensordict-compile] 62.5560μs 20.3878μs 49.0489 KOps/s 48.3904 KOps/s $\color{#35bf28}+1.36\%$
test_compile_copy_flat[tensordict-eager] 0.1318ms 57.6677μs 17.3407 KOps/s 17.1953 KOps/s $\color{#35bf28}+0.85\%$
test_compile_copy_flat[pytree-compile] 0.1763ms 79.5441μs 12.5716 KOps/s 12.6197 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_copy_flat[pytree-eager] 0.1527ms 67.6794μs 14.7755 KOps/s 14.8678 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_assign_and_add[tensordict-compile] 0.3058ms 0.2041ms 4.9005 KOps/s 5.0286 KOps/s $\color{#d91a1a}-2.55\%$
test_compile_assign_and_add[tensordict-eager] 1.4887ms 1.2778ms 782.6085 Ops/s 800.9458 Ops/s $\color{#d91a1a}-2.29\%$
test_compile_assign_and_add[pytree-compile] 0.3403ms 0.2016ms 4.9599 KOps/s 5.1324 KOps/s $\color{#d91a1a}-3.36\%$
test_compile_assign_and_add[pytree-eager] 1.2436ms 0.7616ms 1.3131 KOps/s 1.3389 KOps/s $\color{#d91a1a}-1.93\%$
test_compile_assign_and_add_stack[compile] 0.7919ms 0.4414ms 2.2655 KOps/s 2.2732 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_assign_and_add_stack[eager] 3.5638ms 2.5747ms 388.3889 Ops/s 407.2657 Ops/s $\color{#d91a1a}-4.64\%$
test_compile_indexing[tensor-tensordict-compile] 91.2600μs 35.6084μs 28.0833 KOps/s 29.0042 KOps/s $\color{#d91a1a}-3.18\%$
test_compile_indexing[tensor-tensordict-eager] 0.5839ms 32.2260μs 31.0308 KOps/s 31.4096 KOps/s $\color{#d91a1a}-1.21\%$
test_compile_indexing[tensor-tensorclass-compile] 94.2150μs 28.3615μs 35.2591 KOps/s 36.1251 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_indexing[tensor-tensorclass-eager] 75.8810μs 23.1918μs 43.1187 KOps/s 44.4601 KOps/s $\color{#d91a1a}-3.02\%$
test_compile_indexing[tensor-pytree-compile] 68.8690μs 28.9307μs 34.5654 KOps/s 35.5181 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_indexing[tensor-pytree-eager] 67.8660μs 23.4184μs 42.7014 KOps/s 43.9587 KOps/s $\color{#d91a1a}-2.86\%$
test_compile_indexing[slice-tensordict-compile] 0.1160ms 50.6366μs 19.7486 KOps/s 20.0303 KOps/s $\color{#d91a1a}-1.41\%$
test_compile_indexing[slice-tensordict-eager] 0.5838ms 20.0405μs 49.8988 KOps/s 51.0895 KOps/s $\color{#d91a1a}-2.33\%$
test_compile_indexing[slice-tensorclass-compile] 0.1101ms 44.0035μs 22.7255 KOps/s 23.3813 KOps/s $\color{#d91a1a}-2.81\%$
test_compile_indexing[slice-tensorclass-eager] 59.8420μs 18.8640μs 53.0110 KOps/s 55.0942 KOps/s $\color{#d91a1a}-3.78\%$
test_compile_indexing[slice-pytree-compile] 0.1120ms 44.1073μs 22.6720 KOps/s 23.4361 KOps/s $\color{#d91a1a}-3.26\%$
test_compile_indexing[slice-pytree-eager] 51.1650μs 18.6741μs 53.5501 KOps/s 54.4117 KOps/s $\color{#d91a1a}-1.58\%$
test_compile_indexing[int-tensordict-compile] 0.1182ms 52.3429μs 19.1048 KOps/s 20.3877 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_compile_indexing[int-tensordict-eager] 0.9588ms 19.4263μs 51.4765 KOps/s 49.8591 KOps/s $\color{#35bf28}+3.24\%$
test_compile_indexing[int-tensorclass-compile] 0.1626ms 43.8964μs 22.7809 KOps/s 23.2968 KOps/s $\color{#d91a1a}-2.21\%$
test_compile_indexing[int-tensorclass-eager] 54.0710μs 18.5354μs 53.9508 KOps/s 53.2205 KOps/s $\color{#35bf28}+1.37\%$
test_compile_indexing[int-pytree-compile] 0.1342ms 43.7767μs 22.8432 KOps/s 23.0642 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_indexing[int-pytree-eager] 59.1200μs 18.3556μs 54.4791 KOps/s 55.2477 KOps/s $\color{#d91a1a}-1.39\%$
test_mod_add[eager] 75.6410μs 32.6658μs 30.6130 KOps/s 31.4234 KOps/s $\color{#d91a1a}-2.58\%$
test_mod_add[compile] 0.1005ms 47.1200μs 21.2224 KOps/s 21.8045 KOps/s $\color{#d91a1a}-2.67\%$
test_mod_add[compile-overhead] 0.1053ms 45.1906μs 22.1285 KOps/s 22.1011 KOps/s $\color{#35bf28}+0.12\%$
test_mod_wrap[eager] 0.3531ms 0.2228ms 4.4881 KOps/s 4.7736 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_mod_wrap[compile] 0.4075ms 0.2063ms 4.8469 KOps/s 5.1141 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_mod_wrap[compile-overhead] 0.3381ms 0.2047ms 4.8852 KOps/s 5.1667 KOps/s $\textbf{\color{#d91a1a}-5.45\%}$
test_mod_wrap_and_backward[eager] 14.5672ms 12.0753ms 82.8134 Ops/s 91.1530 Ops/s $\textbf{\color{#d91a1a}-9.15\%}$
test_mod_wrap_and_backward[compile] 19.2037ms 13.9088ms 71.8972 Ops/s 77.8041 Ops/s $\textbf{\color{#d91a1a}-7.59\%}$
test_mod_wrap_and_backward[compile-overhead] 18.1347ms 12.7136ms 78.6561 Ops/s 80.1331 Ops/s $\color{#d91a1a}-1.84\%$
test_seq_add[eager] 0.2385ms 0.1071ms 9.3383 KOps/s 9.5322 KOps/s $\color{#d91a1a}-2.03\%$
test_seq_add[compile] 0.1391ms 61.9941μs 16.1306 KOps/s 17.2046 KOps/s $\textbf{\color{#d91a1a}-6.24\%}$
test_seq_add[compile-overhead] 0.1346ms 57.7952μs 17.3025 KOps/s 17.5444 KOps/s $\color{#d91a1a}-1.38\%$
test_seq_wrap[eager] 0.8679ms 0.4308ms 2.3213 KOps/s 2.4026 KOps/s $\color{#d91a1a}-3.38\%$
test_seq_wrap[compile] 0.3816ms 0.2293ms 4.3619 KOps/s 4.5486 KOps/s $\color{#d91a1a}-4.11\%$
test_seq_wrap[compile-overhead] 0.3433ms 0.2290ms 4.3670 KOps/s 4.6196 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_func_call_runtime[False-eager] 0.9331ms 0.5206ms 1.9209 KOps/s 1.9406 KOps/s $\color{#d91a1a}-1.02\%$
test_func_call_runtime[False-compile] 0.8999ms 0.4226ms 2.3664 KOps/s 2.4547 KOps/s $\color{#d91a1a}-3.60\%$
test_func_call_runtime[False-compile-overhead] 0.8676ms 0.4187ms 2.3882 KOps/s 2.3889 KOps/s $\color{#d91a1a}-0.03\%$
test_func_call_runtime[True-eager] 1.2719ms 0.7428ms 1.3462 KOps/s 1.3581 KOps/s $\color{#d91a1a}-0.87\%$
test_func_call_runtime[True-compile] 0.5682ms 0.4642ms 2.1543 KOps/s 2.2732 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_func_call_runtime[True-compile-overhead] 0.5680ms 0.4724ms 2.1167 KOps/s 2.2770 KOps/s $\textbf{\color{#d91a1a}-7.04\%}$
test_func_call_cm_runtime[False-eager] 1.2416ms 0.5373ms 1.8612 KOps/s 1.9592 KOps/s $\textbf{\color{#d91a1a}-5.00\%}$
test_func_call_cm_runtime[False-compile] 0.7398ms 0.4188ms 2.3877 KOps/s 2.4519 KOps/s $\color{#d91a1a}-2.62\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8318ms 0.4174ms 2.3958 KOps/s 2.4667 KOps/s $\color{#d91a1a}-2.88\%$
test_func_call_cm_runtime[True-eager] 1.0941ms 0.8663ms 1.1544 KOps/s 1.1624 KOps/s $\color{#d91a1a}-0.69\%$
test_func_call_cm_runtime[True-compile] 0.5940ms 0.4772ms 2.0957 KOps/s 2.1526 KOps/s $\color{#d91a1a}-2.64\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5535ms 0.4741ms 2.1092 KOps/s 2.1528 KOps/s $\color{#d91a1a}-2.03\%$
test_vmap_func_call_cm_runtime[eager] 3.4981ms 1.8533ms 539.5844 Ops/s 548.0001 Ops/s $\color{#d91a1a}-1.54\%$
test_vmap_func_call_cm_runtime[compile] 0.8789ms 0.5089ms 1.9649 KOps/s 1.9571 KOps/s $\color{#35bf28}+0.40\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8048ms 0.5236ms 1.9097 KOps/s 1.9764 KOps/s $\color{#d91a1a}-3.37\%$
test_distributed 0.2936ms 0.1223ms 8.1745 KOps/s 8.1759 KOps/s $\color{#d91a1a}-0.02\%$
test_tdmodule 54.3210μs 24.7178μs 40.4568 KOps/s 42.2517 KOps/s $\color{#d91a1a}-4.25\%$
test_tdmodule_dispatch 73.5970μs 46.3968μs 21.5532 KOps/s 23.1964 KOps/s $\textbf{\color{#d91a1a}-7.08\%}$
test_tdseq 56.4060μs 25.2063μs 39.6726 KOps/s 43.7262 KOps/s $\textbf{\color{#d91a1a}-9.27\%}$
test_tdseq_dispatch 74.4180μs 47.1663μs 21.2016 KOps/s 21.7756 KOps/s $\color{#d91a1a}-2.64\%$
test_instantiation_functorch 2.3955ms 1.4932ms 669.7082 Ops/s 669.0413 Ops/s $\color{#35bf28}+0.10\%$
test_exec_functorch 0.3515ms 0.1768ms 5.6557 KOps/s 5.9588 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_exec_functional_call 0.2606ms 0.1698ms 5.8910 KOps/s 6.0986 KOps/s $\color{#d91a1a}-3.40\%$
test_exec_td_decorator 0.4515ms 0.2210ms 4.5245 KOps/s 4.6137 KOps/s $\color{#d91a1a}-1.93\%$
test_vmap_mlp_speed_decorator[True-True] 1.4616ms 0.6725ms 1.4869 KOps/s 1.5951 KOps/s $\textbf{\color{#d91a1a}-6.78\%}$
test_vmap_mlp_speed_decorator[True-False] 0.8869ms 0.6553ms 1.5260 KOps/s 1.5807 KOps/s $\color{#d91a1a}-3.46\%$
test_vmap_mlp_speed_decorator[False-True] 0.6919ms 0.5238ms 1.9090 KOps/s 1.9861 KOps/s $\color{#d91a1a}-3.88\%$
test_vmap_mlp_speed_decorator[False-False] 0.9203ms 0.5248ms 1.9055 KOps/s 1.9906 KOps/s $\color{#d91a1a}-4.28\%$
test_to_module_speed[True] 2.0257ms 1.2495ms 800.3492 Ops/s 806.2999 Ops/s $\color{#d91a1a}-0.74\%$
test_to_module_speed[False] 1.6877ms 1.2076ms 828.0702 Ops/s 808.4425 Ops/s $\color{#35bf28}+2.43\%$
test_tc_init 83.3950μs 45.7896μs 21.8390 KOps/s 22.9264 KOps/s $\color{#d91a1a}-4.74\%$
test_tc_init_nested 0.2125ms 90.6412μs 11.0325 KOps/s 11.8277 KOps/s $\textbf{\color{#d91a1a}-6.72\%}$
test_tc_first_layer_tensor 15.8290μs 1.4734μs 678.6965 KOps/s 681.8348 KOps/s $\color{#d91a1a}-0.46\%$
test_tc_first_layer_nontensor 24.4060μs 4.5616μs 219.2190 KOps/s 222.3620 KOps/s $\color{#d91a1a}-1.41\%$
test_tc_second_layer_tensor 26.2790μs 2.7638μs 361.8219 KOps/s 365.7807 KOps/s $\color{#d91a1a}-1.08\%$
test_tc_second_layer_nontensor 24.1850μs 6.0416μs 165.5183 KOps/s 171.6511 KOps/s $\color{#d91a1a}-3.57\%$
test_unbind 0.2178s 12.5020ms 79.9870 Ops/s 82.8611 Ops/s $\color{#d91a1a}-3.47\%$
test_full_like 16.0905ms 11.3111ms 88.4088 Ops/s 137.9852 Ops/s $\textbf{\color{#d91a1a}-35.93\%}$
test_zeros_like 14.0145ms 7.7203ms 129.5287 Ops/s 361.9480 Ops/s $\textbf{\color{#d91a1a}-64.21\%}$
test_ones_like 10.7280ms 7.6695ms 130.3869 Ops/s 297.7170 Ops/s $\textbf{\color{#d91a1a}-56.20\%}$
test_clone 15.1664ms 9.2990ms 107.5386 Ops/s 200.7360 Ops/s $\textbf{\color{#d91a1a}-46.43\%}$
test_squeeze 67.1050μs 11.9054μs 83.9954 KOps/s 84.7569 KOps/s $\color{#d91a1a}-0.90\%$
test_unsqueeze 0.2160ms 88.6092μs 11.2855 KOps/s 11.6088 KOps/s $\color{#d91a1a}-2.79\%$
test_split 0.4597ms 0.1915ms 5.2232 KOps/s 5.2535 KOps/s $\color{#d91a1a}-0.58\%$
test_permute 0.3882ms 0.2151ms 4.6484 KOps/s 4.7451 KOps/s $\color{#d91a1a}-2.04\%$
test_stack 26.5562ms 24.4182ms 40.9531 Ops/s 41.2787 Ops/s $\color{#d91a1a}-0.79\%$
test_cat 28.7790ms 24.4595ms 40.8839 Ops/s 41.3496 Ops/s $\color{#d91a1a}-1.13\%$
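
For readers checking the table by hand, the columns appear to be related as follows: `Ops` is the reciprocal of `Mean`, and `Change` is the relative difference between `Ops` and `Ops on Repo HEAD`. The sketch below reproduces the first row under those assumptions. The function names are hypothetical and are not part of the benchmark tooling. The 5% cut-off is inferred from which rows are bold in the table, not from the bot's source.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Operations per second, assuming Ops = 1 / Mean."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's Ops against the Ops on repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bold rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# Values taken from the test_plain_set_nested row of the CPU table.
mean_s = 18.2388e-6          # Mean: 18.2388 microseconds
ops_kops = 54.8282           # Ops: 54.8282 KOps/s
ops_head_kops = 63.0230      # Ops on Repo HEAD: 63.0230 KOps/s

recomputed_kops = ops_per_second(mean_s) / 1000.0
change = percent_change(ops_kops, ops_head_kops)
label = classify(change)
```

With these inputs, `change` rounds to -13.00, which matches the `Change` column for that row, and the row is labelled as worsened. Small rows such as `test_plain_set_stack_nested` (-4.27%) fall under the assumed threshold, so they would not count toward the summary totals.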


⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: 11. Worsened: 24.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 33.5910μs 11.2927μs 88.5530 KOps/s 97.1432 KOps/s $\textbf{\color{#d91a1a}-8.84\%}$
test_plain_set_stack_nested 34.0310μs 11.3447μs 88.1468 KOps/s 96.8939 KOps/s $\textbf{\color{#d91a1a}-9.03\%}$
test_plain_set_nested_inplace 73.3210μs 12.2159μs 81.8603 KOps/s 89.2071 KOps/s $\textbf{\color{#d91a1a}-8.24\%}$
test_plain_set_stack_nested_inplace 38.0010μs 12.0886μs 82.7227 KOps/s 90.4243 KOps/s $\textbf{\color{#d91a1a}-8.52\%}$
test_items 27.6310μs 2.8954μs 345.3696 KOps/s 342.9443 KOps/s $\color{#35bf28}+0.71\%$
test_items_nested 0.4558ms 0.3508ms 2.8504 KOps/s 2.8035 KOps/s $\color{#35bf28}+1.67\%$
test_items_nested_locked 0.4214ms 0.3489ms 2.8661 KOps/s 2.7970 KOps/s $\color{#35bf28}+2.47\%$
test_items_nested_leaf 81.1610μs 58.0640μs 17.2224 KOps/s 17.2948 KOps/s $\color{#d91a1a}-0.42\%$
test_items_stack_nested 0.4045ms 0.3535ms 2.8287 KOps/s 2.7988 KOps/s $\color{#35bf28}+1.07\%$
test_items_stack_nested_leaf 87.5910μs 58.6703μs 17.0444 KOps/s 17.3354 KOps/s $\color{#d91a1a}-1.68\%$
test_items_stack_nested_locked 0.4142ms 0.3525ms 2.8365 KOps/s 2.7730 KOps/s $\color{#35bf28}+2.29\%$
test_keys 30.2700μs 3.4625μs 288.8102 KOps/s 290.2719 KOps/s $\color{#d91a1a}-0.50\%$
test_keys_nested 95.8120μs 70.4204μs 14.2004 KOps/s 14.1736 KOps/s $\color{#35bf28}+0.19\%$
test_keys_nested_locked 0.7339ms 76.1185μs 13.1374 KOps/s 13.0870 KOps/s $\color{#35bf28}+0.39\%$
test_keys_nested_leaf 94.7910μs 61.8688μs 16.1632 KOps/s 16.1671 KOps/s $\color{#d91a1a}-0.02\%$
test_keys_stack_nested 0.1189ms 70.7967μs 14.1250 KOps/s 14.2612 KOps/s $\color{#d91a1a}-0.96\%$
test_keys_stack_nested_leaf 97.1210μs 62.1157μs 16.0990 KOps/s 16.3825 KOps/s $\color{#d91a1a}-1.73\%$
test_keys_stack_nested_locked 0.1594ms 76.7143μs 13.0354 KOps/s 13.1292 KOps/s $\color{#d91a1a}-0.71\%$
test_values 5.0767μs 0.8506μs 1.1756 MOps/s 1.1772 MOps/s $\color{#d91a1a}-0.14\%$
test_values_nested 59.8010μs 31.3373μs 31.9108 KOps/s 32.0139 KOps/s $\color{#d91a1a}-0.32\%$
test_values_nested_locked 63.1010μs 33.2758μs 30.0519 KOps/s 30.6070 KOps/s $\color{#d91a1a}-1.81\%$
test_values_nested_leaf 64.3210μs 33.8550μs 29.5377 KOps/s 29.7537 KOps/s $\color{#d91a1a}-0.73\%$
test_values_stack_nested 89.9510μs 31.5005μs 31.7455 KOps/s 31.8577 KOps/s $\color{#d91a1a}-0.35\%$
test_values_stack_nested_leaf 72.9710μs 34.0473μs 29.3709 KOps/s 29.7150 KOps/s $\color{#d91a1a}-1.16\%$
test_values_stack_nested_locked 67.3700μs 33.3102μs 30.0208 KOps/s 30.4245 KOps/s $\color{#d91a1a}-1.33\%$
test_membership 1.6650μs 0.5072μs 1.9715 MOps/s 1.9520 MOps/s $\color{#35bf28}+1.00\%$
test_membership_nested 15.9250μs 1.9581μs 510.6997 KOps/s 488.5148 KOps/s $\color{#35bf28}+4.54\%$
test_membership_nested_leaf 17.6450μs 1.9595μs 510.3400 KOps/s 494.6488 KOps/s $\color{#35bf28}+3.17\%$
test_membership_stacked_nested 24.3910μs 2.0607μs 485.2788 KOps/s 473.7609 KOps/s $\color{#35bf28}+2.43\%$
test_membership_stacked_nested_leaf 37.3910μs 2.0611μs 485.1771 KOps/s 476.3541 KOps/s $\color{#35bf28}+1.85\%$
test_membership_nested_last 39.5600μs 2.9373μs 340.4453 KOps/s 339.4457 KOps/s $\color{#35bf28}+0.29\%$
test_membership_nested_leaf_last 28.3000μs 2.9490μs 339.1001 KOps/s 339.7361 KOps/s $\color{#d91a1a}-0.19\%$
test_membership_stacked_nested_last 29.5400μs 2.9348μs 340.7374 KOps/s 339.0455 KOps/s $\color{#35bf28}+0.50\%$
test_membership_stacked_nested_leaf_last 46.6000μs 2.9327μs 340.9871 KOps/s 337.6603 KOps/s $\color{#35bf28}+0.99\%$
test_nested_getleaf 45.1310μs 6.1583μs 162.3817 KOps/s 162.1059 KOps/s $\color{#35bf28}+0.17\%$
test_nested_get 71.0910μs 5.8434μs 171.1346 KOps/s 169.8330 KOps/s $\color{#35bf28}+0.77\%$
test_stacked_getleaf 44.1010μs 6.1088μs 163.6996 KOps/s 163.1077 KOps/s $\color{#35bf28}+0.36\%$
test_stacked_get 0.1796ms 5.8200μs 171.8200 KOps/s 171.6212 KOps/s $\color{#35bf28}+0.12\%$
test_nested_getitemleaf 29.6500μs 6.2453μs 160.1211 KOps/s 161.1590 KOps/s $\color{#d91a1a}-0.64\%$
test_nested_getitem 29.6800μs 5.9096μs 169.2157 KOps/s 168.0546 KOps/s $\color{#35bf28}+0.69\%$
test_stacked_getitemleaf 39.1200μs 6.1983μs 161.3336 KOps/s 160.6321 KOps/s $\color{#35bf28}+0.44\%$
test_stacked_getitem 43.3600μs 5.8951μs 169.6335 KOps/s 168.6229 KOps/s $\color{#35bf28}+0.60\%$
test_lock_nested 0.8179ms 0.3631ms 2.7543 KOps/s 2.6888 KOps/s $\color{#35bf28}+2.44\%$
test_lock_stack_nested 0.3681ms 0.3322ms 3.0098 KOps/s 2.9751 KOps/s $\color{#35bf28}+1.17\%$
test_unlock_nested 0.6328ms 0.3036ms 3.2940 KOps/s 3.2594 KOps/s $\color{#35bf28}+1.06\%$
test_unlock_stack_nested 0.3052ms 0.2721ms 3.6755 KOps/s 3.6623 KOps/s $\color{#35bf28}+0.36\%$
test_flatten_speed 0.1144ms 74.0238μs 13.5092 KOps/s 13.3160 KOps/s $\color{#35bf28}+1.45\%$
test_unflatten_speed 0.3277ms 0.3042ms 3.2876 KOps/s 3.2595 KOps/s $\color{#35bf28}+0.86\%$
test_common_ops 1.5249ms 0.6017ms 1.6620 KOps/s 1.7505 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_creation 0.1948ms 1.4748μs 678.0427 KOps/s 670.5271 KOps/s $\color{#35bf28}+1.12\%$
test_creation_empty 29.0810μs 8.6850μs 115.1407 KOps/s 146.0132 KOps/s $\textbf{\color{#d91a1a}-21.14\%}$
test_creation_nested_1 31.8600μs 10.3426μs 96.6878 KOps/s 119.0659 KOps/s $\textbf{\color{#d91a1a}-18.79\%}$
test_creation_nested_2 34.5010μs 12.7456μs 78.4582 KOps/s 91.7629 KOps/s $\textbf{\color{#d91a1a}-14.50\%}$
test_clone 0.1319ms 9.9753μs 100.2474 KOps/s 99.2478 KOps/s $\color{#35bf28}+1.01\%$
test_getitem[int] 1.6496ms 10.6247μs 94.1199 KOps/s 91.9891 KOps/s $\color{#35bf28}+2.32\%$
test_getitem[slice_int] 0.1049ms 20.1019μs 49.7465 KOps/s 47.7051 KOps/s $\color{#35bf28}+4.28\%$
test_getitem[range] 0.1324ms 36.2833μs 27.5609 KOps/s 27.3007 KOps/s $\color{#35bf28}+0.95\%$
test_getitem[tuple] 0.1105ms 18.0039μs 55.5435 KOps/s 53.6479 KOps/s $\color{#35bf28}+3.53\%$
test_getitem[list] 0.3710ms 31.5804μs 31.6653 KOps/s 29.1883 KOps/s $\textbf{\color{#35bf28}+8.49\%}$
test_setitem_dim[int] 30.8600μs 17.8095μs 56.1498 KOps/s 57.8211 KOps/s $\color{#d91a1a}-2.89\%$
test_setitem_dim[slice_int] 61.4410μs 36.9267μs 27.0807 KOps/s 28.0899 KOps/s $\color{#d91a1a}-3.59\%$
test_setitem_dim[range] 77.5610μs 51.9041μs 19.2663 KOps/s 18.8454 KOps/s $\color{#35bf28}+2.23\%$
test_setitem_dim[tuple] 69.5610μs 30.3495μs 32.9495 KOps/s 31.5705 KOps/s $\color{#35bf28}+4.37\%$
test_setitem 0.1297ms 14.6052μs 68.4686 KOps/s 72.4926 KOps/s $\textbf{\color{#d91a1a}-5.55\%}$
test_set 0.1239ms 14.3289μs 69.7893 KOps/s 74.1366 KOps/s $\textbf{\color{#d91a1a}-5.86\%}$
test_set_shared 1.5639ms 0.1451ms 6.8910 KOps/s 6.8868 KOps/s $\color{#35bf28}+0.06\%$
test_update 0.4400ms 17.2738μs 57.8910 KOps/s 64.9457 KOps/s $\textbf{\color{#d91a1a}-10.86\%}$
test_update_nested 0.1414ms 22.1437μs 45.1595 KOps/s 47.3675 KOps/s $\color{#d91a1a}-4.66\%$
test_update__nested 0.6876ms 23.2124μs 43.0804 KOps/s 43.1345 KOps/s $\color{#d91a1a}-0.13\%$
test_set_nested 0.1899ms 15.2965μs 65.3744 KOps/s 68.9866 KOps/s $\textbf{\color{#d91a1a}-5.24\%}$
test_set_nested_new 0.1376ms 17.3094μs 57.7721 KOps/s 58.3942 KOps/s $\color{#d91a1a}-1.07\%$
test_select 0.1393ms 29.1165μs 34.3448 KOps/s 34.5382 KOps/s $\color{#d91a1a}-0.56\%$
test_select_nested 0.1025ms 41.0418μs 24.3654 KOps/s 23.8578 KOps/s $\color{#35bf28}+2.13\%$
test_exclude_nested 83.6710μs 60.7770μs 16.4536 KOps/s 16.1744 KOps/s $\color{#35bf28}+1.73\%$
test_empty[True] 0.3224ms 0.2735ms 3.6565 KOps/s 3.5963 KOps/s $\color{#35bf28}+1.67\%$
test_empty[False] 3.5900μs 0.7514μs 1.3309 MOps/s 1.3430 MOps/s $\color{#d91a1a}-0.90\%$
test_to 88.5810μs 55.1779μs 18.1232 KOps/s 18.0308 KOps/s $\color{#35bf28}+0.51\%$
test_to_nonblocking 92.7110μs 45.8656μs 21.8028 KOps/s 21.7934 KOps/s $\color{#35bf28}+0.04\%$
test_unbind_speed 0.2696ms 0.2309ms 4.3309 KOps/s 4.3843 KOps/s $\color{#d91a1a}-1.22\%$
test_unbind_speed_stack0 0.2908ms 0.2304ms 4.3409 KOps/s 4.3995 KOps/s $\color{#d91a1a}-1.33\%$
test_unbind_speed_stack1 96.5192ms 0.6420ms 1.5576 KOps/s 1.5518 KOps/s $\color{#35bf28}+0.37\%$
test_split 97.4115ms 1.7200ms 581.4091 Ops/s 580.3805 Ops/s $\color{#35bf28}+0.18\%$
test_chunk 96.4564ms 1.5950ms 626.9691 Ops/s 632.6644 Ops/s $\color{#d91a1a}-0.90\%$
test_consolidate[False-None] 3.4376ms 2.6258ms 380.8371 Ops/s 381.4417 Ops/s $\color{#d91a1a}-0.16\%$
test_consolidate[default-None] 2.0871ms 1.6845ms 593.6351 Ops/s 593.8786 Ops/s $\color{#d91a1a}-0.04\%$
test_consolidate[reduce-overhead-None] 1.9172ms 1.7225ms 580.5493 Ops/s 580.0965 Ops/s $\color{#35bf28}+0.08\%$
test_consolidate_njt[False-None] 6.8186ms 6.5078ms 153.6627 Ops/s 151.6236 Ops/s $\color{#35bf28}+1.34\%$
test_to[False-False-None] 1.7620ms 1.6622ms 601.6196 Ops/s 606.1835 Ops/s $\color{#d91a1a}-0.75\%$
test_to[True-False-None] 1.5023ms 1.2903ms 775.0260 Ops/s 782.5633 Ops/s $\color{#d91a1a}-0.96\%$
test_to[within-False-None] 4.2741ms 4.0079ms 249.5061 Ops/s 252.6117 Ops/s $\color{#d91a1a}-1.23\%$
test_to[True-default-None] 5.2919ms 5.0553ms 197.8132 Ops/s 192.5787 Ops/s $\color{#35bf28}+2.72\%$
test_to_njt[False-False-None] 7.0792ms 6.8362ms 146.2809 Ops/s 141.9894 Ops/s $\color{#35bf28}+3.02\%$
test_to_njt[True-False-None] 5.6369ms 5.4057ms 184.9890 Ops/s 179.2532 Ops/s $\color{#35bf28}+3.20\%$
test_to_njt[within-False-None] 12.2804ms 12.0023ms 83.3173 Ops/s 81.6918 Ops/s $\color{#35bf28}+1.99\%$
test_creation[device0] 0.4623ms 78.0242μs 12.8165 KOps/s 12.4086 KOps/s $\color{#35bf28}+3.29\%$
test_creation_from_tensor 0.5585ms 80.6864μs 12.3937 KOps/s 12.3719 KOps/s $\color{#35bf28}+0.18\%$
test_add_one[memmap_tensor0] 0.4248ms 6.3037μs 158.6368 KOps/s 158.8084 KOps/s $\color{#d91a1a}-0.11\%$
test_contiguous[memmap_tensor0] 4.1025μs 0.4053μs 2.4673 MOps/s 2.4066 MOps/s $\color{#35bf28}+2.52\%$
test_stack[memmap_tensor0] 42.8710μs 4.6157μs 216.6541 KOps/s 216.3122 KOps/s $\color{#35bf28}+0.16\%$
test_memmaptd_index 1.7646ms 0.2454ms 4.0745 KOps/s 4.0775 KOps/s $\color{#d91a1a}-0.07\%$
test_memmaptd_index_astensor 0.5661ms 0.3000ms 3.3328 KOps/s 3.2990 KOps/s $\color{#35bf28}+1.02\%$
test_memmaptd_index_op 0.9981ms 0.5637ms 1.7740 KOps/s 1.8310 KOps/s $\color{#d91a1a}-3.11\%$
test_serialize_model 0.1315s 0.1305s 7.6646 Ops/s 7.6888 Ops/s $\color{#d91a1a}-0.31\%$
test_serialize_model_pickle 1.3494s 1.1849s 0.8439 Ops/s 0.8247 Ops/s $\color{#35bf28}+2.33\%$
test_serialize_weights 0.1309s 0.1294s 7.7306 Ops/s 7.7134 Ops/s $\color{#35bf28}+0.22\%$
test_serialize_weights_returnearly 0.4169s 65.8573ms 15.1843 Ops/s 23.7269 Ops/s $\textbf{\color{#d91a1a}-36.00\%}$
test_serialize_weights_pickle 1.3820s 1.2171s 0.8217 Ops/s 0.8184 Ops/s $\color{#35bf28}+0.40\%$
test_reshape_pytree 52.2010μs 22.1236μs 45.2005 KOps/s 44.8613 KOps/s $\color{#35bf28}+0.76\%$
test_reshape_td 53.5110μs 26.3930μs 37.8888 KOps/s 37.8625 KOps/s $\color{#35bf28}+0.07\%$
test_view_pytree 54.9710μs 21.8991μs 45.6640 KOps/s 45.3434 KOps/s $\color{#35bf28}+0.71\%$
test_view_td 64.8000μs 28.5964μs 34.9694 KOps/s 35.9171 KOps/s $\color{#d91a1a}-2.64\%$
test_unbind_pytree 87.5110μs 27.4506μs 36.4291 KOps/s 36.0430 KOps/s $\color{#35bf28}+1.07\%$
test_unbind_td 0.8400ms 35.1061μs 28.4851 KOps/s 28.5081 KOps/s $\color{#d91a1a}-0.08\%$
test_split_pytree 73.7410μs 29.8791μs 33.4682 KOps/s 32.9382 KOps/s $\color{#35bf28}+1.61\%$
test_split_td 1.0103ms 37.7655μs 26.4792 KOps/s 25.9139 KOps/s $\color{#35bf28}+2.18\%$
test_add_pytree 0.1119ms 32.5568μs 30.7156 KOps/s 30.0803 KOps/s $\color{#35bf28}+2.11\%$
test_add_td 87.6610μs 44.4978μs 22.4730 KOps/s 23.6055 KOps/s $\color{#d91a1a}-4.80\%$
test_compile_add_one_nested[tensordict-compile] 0.1771ms 0.1217ms 8.2157 KOps/s 8.0961 KOps/s $\color{#35bf28}+1.48\%$
test_compile_add_one_nested[tensordict-eager] 0.2323ms 0.1252ms 7.9893 KOps/s 8.0327 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_add_one_nested[pytree-compile] 0.1425ms 98.5157μs 10.1507 KOps/s 10.0768 KOps/s $\color{#35bf28}+0.73\%$
test_compile_add_one_nested[pytree-eager] 1.5281ms 0.1461ms 6.8450 KOps/s 6.7607 KOps/s $\color{#35bf28}+1.25\%$
test_compile_copy_nested[tensordict-compile] 73.6610μs 23.3378μs 42.8490 KOps/s 43.8020 KOps/s $\color{#d91a1a}-2.18\%$
test_compile_copy_nested[tensordict-eager] 60.0710μs 26.7474μs 37.3868 KOps/s 37.8365 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_copy_nested[pytree-compile] 0.2983ms 65.0496μs 15.3729 KOps/s 15.1543 KOps/s $\color{#35bf28}+1.44\%$
test_compile_copy_nested[pytree-eager] 86.3810μs 49.1696μs 20.3378 KOps/s 20.0826 KOps/s $\color{#35bf28}+1.27\%$
test_compile_add_one_flat[tensordict-compile] 0.1988ms 0.1426ms 7.0147 KOps/s 6.9183 KOps/s $\color{#35bf28}+1.39\%$
test_compile_add_one_flat[tensordict-eager] 0.5900ms 0.2087ms 4.7914 KOps/s 4.8099 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_add_one_flat[tensorclass-compile] 0.1466ms 98.3261μs 10.1702 KOps/s 9.3787 KOps/s $\textbf{\color{#35bf28}+8.44\%}$
test_compile_add_one_flat[tensorclass-eager] 0.4481ms 51.6412μs 19.3644 KOps/s 19.1235 KOps/s $\color{#35bf28}+1.26\%$
test_compile_add_one_flat[pytree-compile] 0.2295ms 0.1360ms 7.3519 KOps/s 7.2708 KOps/s $\color{#35bf28}+1.12\%$
test_compile_add_one_flat[pytree-eager] 0.8610ms 0.4696ms 2.1294 KOps/s 2.1521 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_add_self_flat[tensordict-eager] 0.6520ms 0.2523ms 3.9638 KOps/s 4.0474 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_add_self_flat[tensordict-compile] 0.5436ms 0.1448ms 6.9077 KOps/s 6.8691 KOps/s $\color{#35bf28}+0.56\%$
test_compile_add_self_flat[tensorclass-eager] 0.4540ms 63.7508μs 15.6861 KOps/s 16.2570 KOps/s $\color{#d91a1a}-3.51\%$
test_compile_add_self_flat[tensorclass-compile] 0.2512ms 99.3312μs 10.0673 KOps/s 9.9195 KOps/s $\color{#35bf28}+1.49\%$
test_compile_add_self_flat[pytree-eager] 0.7777ms 0.4034ms 2.4791 KOps/s 2.4640 KOps/s $\color{#35bf28}+0.61\%$
test_compile_add_self_flat[pytree-compile] 0.5325ms 0.1382ms 7.2342 KOps/s 7.3846 KOps/s $\color{#d91a1a}-2.04\%$
test_compile_copy_flat[tensordict-compile] 0.1245ms 19.3722μs 51.6205 KOps/s 54.5777 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_compile_copy_flat[tensordict-eager] 0.3970ms 27.5051μs 36.3569 KOps/s 37.7395 KOps/s $\color{#d91a1a}-3.66\%$
test_compile_copy_flat[pytree-compile] 0.4601ms 69.6597μs 14.3555 KOps/s 14.3043 KOps/s $\color{#35bf28}+0.36\%$
test_compile_copy_flat[pytree-eager] 0.4246ms 51.3598μs 19.4705 KOps/s 19.3188 KOps/s $\color{#35bf28}+0.79\%$
test_compile_assign_and_add[tensordict-compile] 1.6351ms 0.3933ms 2.5425 KOps/s 2.2208 KOps/s $\textbf{\color{#35bf28}+14.48\%}$
test_compile_assign_and_add[tensordict-eager] 2.7390ms 2.5954ms 385.2997 Ops/s 389.0543 Ops/s $\color{#d91a1a}-0.97\%$
test_compile_assign_and_add[pytree-compile] 1.5831ms 0.3823ms 2.6159 KOps/s 2.2413 KOps/s $\textbf{\color{#35bf28}+16.71\%}$
test_compile_assign_and_add[pytree-eager] 2.6343ms 2.5677ms 389.4565 Ops/s 384.5781 Ops/s $\color{#35bf28}+1.27\%$
test_compile_indexing[tensor-tensordict-compile] 0.1581ms 0.1119ms 8.9388 KOps/s 8.6325 KOps/s $\color{#35bf28}+3.55\%$
test_compile_indexing[tensor-tensordict-eager] 0.5615ms 77.1192μs 12.9669 KOps/s 12.9040 KOps/s $\color{#35bf28}+0.49\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1581ms 0.1093ms 9.1491 KOps/s 9.6117 KOps/s $\color{#d91a1a}-4.81\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1331ms 67.8176μs 14.7454 KOps/s 14.7684 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_indexing[tensor-pytree-compile] 0.1615ms 0.1097ms 9.1135 KOps/s 9.4549 KOps/s $\color{#d91a1a}-3.61\%$
test_compile_indexing[tensor-pytree-eager] 0.1100ms 66.6162μs 15.0114 KOps/s 14.9637 KOps/s $\color{#35bf28}+0.32\%$
test_compile_indexing[slice-tensordict-compile] 0.1449ms 0.1038ms 9.6343 KOps/s 9.8436 KOps/s $\color{#d91a1a}-2.13\%$
test_compile_indexing[slice-tensordict-eager] 0.1483ms 20.6849μs 48.3444 KOps/s 57.9813 KOps/s $\textbf{\color{#d91a1a}-16.62\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1436ms 99.2157μs 10.0791 KOps/s 9.8593 KOps/s $\color{#35bf28}+2.23\%$
test_compile_indexing[slice-tensorclass-eager] 48.1200μs 15.9122μs 62.8450 KOps/s 63.3089 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_indexing[slice-pytree-compile] 0.1478ms 0.1014ms 9.8657 KOps/s 9.7732 KOps/s $\color{#35bf28}+0.95\%$
test_compile_indexing[slice-pytree-eager] 50.7010μs 15.9269μs 62.7869 KOps/s 63.4745 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_indexing[int-tensordict-compile] 0.1574ms 0.1036ms 9.6512 KOps/s 9.3979 KOps/s $\color{#35bf28}+2.70\%$
test_compile_indexing[int-tensordict-eager] 0.5715ms 16.9231μs 59.0907 KOps/s 58.5440 KOps/s $\color{#35bf28}+0.93\%$
test_compile_indexing[int-tensorclass-compile] 0.1383ms 96.7739μs 10.3334 KOps/s 9.7858 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_compile_indexing[int-tensorclass-eager] 45.8610μs 15.9289μs 62.7790 KOps/s 63.3385 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_indexing[int-pytree-compile] 0.2488ms 97.5404μs 10.2522 KOps/s 9.8980 KOps/s $\color{#35bf28}+3.58\%$
test_compile_indexing[int-pytree-eager] 0.1399ms 15.7473μs 63.5031 KOps/s 64.7238 KOps/s $\color{#d91a1a}-1.89\%$
test_mod_add[eager] 85.6910μs 37.1701μs 26.9034 KOps/s 26.4955 KOps/s $\color{#35bf28}+1.54\%$
test_mod_add[compile] 0.1736ms 81.1921μs 12.3165 KOps/s 12.2048 KOps/s $\color{#35bf28}+0.92\%$
test_mod_add[compile-overhead] 0.3272ms 0.1678ms 5.9588 KOps/s 5.5167 KOps/s $\textbf{\color{#35bf28}+8.01\%}$
test_mod_wrap[eager] 0.3368ms 0.2440ms 4.0980 KOps/s 4.0489 KOps/s $\color{#35bf28}+1.21\%$
test_mod_wrap[compile] 0.3480ms 0.2916ms 3.4292 KOps/s 3.3942 KOps/s $\color{#35bf28}+1.03\%$
test_mod_wrap[compile-overhead] 7.1313ms 3.7643ms 265.6544 Ops/s 267.4384 Ops/s $\color{#d91a1a}-0.67\%$
test_mod_wrap_and_backward[eager] 1.5183ms 1.3266ms 753.7812 Ops/s 706.0877 Ops/s $\textbf{\color{#35bf28}+6.75\%}$
test_mod_wrap_and_backward[compile] 1.3802ms 1.2553ms 796.6210 Ops/s 728.8653 Ops/s $\textbf{\color{#35bf28}+9.30\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3741ms 0.9268ms 1.0790 KOps/s 909.9590 Ops/s $\textbf{\color{#35bf28}+18.58\%}$
test_seq_add[eager] 0.1735ms 0.1108ms 9.0286 KOps/s 9.0244 KOps/s $\color{#35bf28}+0.05\%$
test_seq_add[compile] 0.2390ms 90.6397μs 11.0327 KOps/s 11.0555 KOps/s $\color{#d91a1a}-0.21\%$
test_seq_add[compile-overhead] 0.2617ms 0.1290ms 7.7512 KOps/s 7.4774 KOps/s $\color{#35bf28}+3.66\%$
test_seq_wrap[eager] 0.5647ms 0.4097ms 2.4409 KOps/s 2.3465 KOps/s $\color{#35bf28}+4.02\%$
test_seq_wrap[compile] 0.3495ms 0.2984ms 3.3510 KOps/s 3.2708 KOps/s $\color{#35bf28}+2.45\%$
test_seq_wrap[compile-overhead] 0.2689ms 0.2261ms 4.4230 KOps/s 4.4067 KOps/s $\color{#35bf28}+0.37\%$
test_func_call_runtime[False-eager] 0.7792ms 0.7228ms 1.3835 KOps/s 1.3680 KOps/s $\color{#35bf28}+1.13\%$
test_func_call_runtime[False-compile] 0.7900ms 0.7354ms 1.3597 KOps/s 1.3301 KOps/s $\color{#35bf28}+2.23\%$
test_func_call_runtime[False-compile-overhead] 0.4734ms 0.3627ms 2.7571 KOps/s 2.7452 KOps/s $\color{#35bf28}+0.43\%$
test_func_call_runtime[True-eager] 0.9377ms 0.8850ms 1.1299 KOps/s 1.1182 KOps/s $\color{#35bf28}+1.05\%$
test_func_call_runtime[True-compile] 0.8290ms 0.7581ms 1.3191 KOps/s 1.3013 KOps/s $\color{#35bf28}+1.37\%$
test_func_call_runtime[True-compile-overhead] 0.5356ms 0.3838ms 2.6054 KOps/s 2.5734 KOps/s $\color{#35bf28}+1.24\%$
test_func_call_cm_runtime[False-eager] 0.7986ms 0.7185ms 1.3918 KOps/s 1.3789 KOps/s $\color{#35bf28}+0.94\%$
test_func_call_cm_runtime[False-compile] 0.7973ms 0.7467ms 1.3393 KOps/s 1.3297 KOps/s $\color{#35bf28}+0.72\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4053ms 0.3650ms 2.7398 KOps/s 2.7394 KOps/s $\color{#35bf28}+0.02\%$
test_func_call_cm_runtime[True-eager] 1.1338ms 0.9735ms 1.0272 KOps/s 991.8479 Ops/s $\color{#35bf28}+3.57\%$
test_func_call_cm_runtime[True-compile] 0.8439ms 0.7934ms 1.2603 KOps/s 1.2470 KOps/s $\color{#35bf28}+1.07\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4549ms 0.4089ms 2.4454 KOps/s 2.4316 KOps/s $\color{#35bf28}+0.57\%$
test_vmap_func_call_cm_runtime[eager] 2.4534ms 2.0061ms 498.4798 Ops/s 492.7131 Ops/s $\color{#35bf28}+1.17\%$
test_vmap_func_call_cm_runtime[compile] 0.9620ms 0.8125ms 1.2307 KOps/s 1.2309 KOps/s $\color{#d91a1a}-0.01\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4829ms 0.4230ms 2.3639 KOps/s 2.4179 KOps/s $\color{#d91a1a}-2.23\%$
test_distributed 3.0288ms 0.2273ms 4.3991 KOps/s 8.7991 KOps/s $\textbf{\color{#d91a1a}-50.01\%}$
test_tdmodule 48.0610μs 20.0269μs 49.9327 KOps/s 52.1284 KOps/s $\color{#d91a1a}-4.21\%$
test_tdmodule_dispatch 64.4510μs 35.8495μs 27.8944 KOps/s 30.1176 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_tdseq 48.2500μs 19.8706μs 50.3255 KOps/s 53.4690 KOps/s $\textbf{\color{#d91a1a}-5.88\%}$
test_tdseq_dispatch 64.8500μs 36.7180μs 27.2346 KOps/s 28.3984 KOps/s $\color{#d91a1a}-4.10\%$
test_instantiation_functorch 1.6212ms 1.5071ms 663.5093 Ops/s 640.4032 Ops/s $\color{#35bf28}+3.61\%$
test_exec_functorch 0.2660ms 0.1405ms 7.1170 KOps/s 7.0446 KOps/s $\color{#35bf28}+1.03\%$
test_exec_functional_call 0.2082ms 0.1309ms 7.6385 KOps/s 7.5699 KOps/s $\color{#35bf28}+0.91\%$
test_exec_td_decorator 0.3628ms 0.1760ms 5.6808 KOps/s 5.6593 KOps/s $\color{#35bf28}+0.38\%$
test_vmap_mlp_speed_decorator[True-True] 0.7863ms 0.6678ms 1.4974 KOps/s 1.5176 KOps/s $\color{#d91a1a}-1.33\%$
test_vmap_mlp_speed_decorator[True-False] 0.7985ms 0.6632ms 1.5078 KOps/s 1.5150 KOps/s $\color{#d91a1a}-0.48\%$
test_vmap_mlp_speed_decorator[False-True] 0.7845ms 0.5745ms 1.7405 KOps/s 1.7456 KOps/s $\color{#d91a1a}-0.29\%$
test_vmap_mlp_speed_decorator[False-False] 0.6856ms 0.5758ms 1.7368 KOps/s 1.7488 KOps/s $\color{#d91a1a}-0.69\%$
test_vmap_transformer_speed_decorator[True-True] 18.8089ms 18.6284ms 53.6816 Ops/s 54.4358 Ops/s $\color{#d91a1a}-1.39\%$
test_vmap_transformer_speed_decorator[True-False] 18.7451ms 18.6676ms 53.5688 Ops/s 54.2533 Ops/s $\color{#d91a1a}-1.26\%$
test_vmap_transformer_speed_decorator[False-True] 18.8679ms 18.5459ms 53.9204 Ops/s 54.6333 Ops/s $\color{#d91a1a}-1.30\%$
test_vmap_transformer_speed_decorator[False-False] 18.9258ms 18.5324ms 53.9594 Ops/s 54.5076 Ops/s $\color{#d91a1a}-1.01\%$
test_to_module_speed[True] 1.0451ms 0.9439ms 1.0594 KOps/s 1.0691 KOps/s $\color{#d91a1a}-0.91\%$
test_to_module_speed[False] 1.4082ms 0.9271ms 1.0786 KOps/s 1.0831 KOps/s $\color{#d91a1a}-0.41\%$
test_tc_init 59.0310μs 38.1630μs 26.2034 KOps/s 29.9302 KOps/s $\textbf{\color{#d91a1a}-12.45\%}$
test_tc_init_nested 0.1743ms 75.4936μs 13.2462 KOps/s 14.6521 KOps/s $\textbf{\color{#d91a1a}-9.60\%}$
test_tc_first_layer_tensor 4.4471μs 0.7046μs 1.4192 MOps/s 1.4145 MOps/s $\color{#35bf28}+0.33\%$
test_tc_first_layer_nontensor 21.3100μs 2.3757μs 420.9305 KOps/s 435.5318 KOps/s $\color{#d91a1a}-3.35\%$
test_tc_second_layer_tensor 7.7277μs 1.4326μs 698.0203 KOps/s 708.6444 KOps/s $\color{#d91a1a}-1.50\%$
test_tc_second_layer_nontensor 0.1289ms 3.1891μs 313.5659 KOps/s 330.1334 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_unbind 0.2153s 9.5813ms 104.3703 Ops/s 152.2306 Ops/s $\textbf{\color{#d91a1a}-31.44\%}$
test_full_like 10.1866ms 9.2734ms 107.8354 Ops/s 101.3297 Ops/s $\textbf{\color{#35bf28}+6.42\%}$
test_zeros_like 5.2588ms 4.3279ms 231.0582 Ops/s 227.0435 Ops/s $\color{#35bf28}+1.77\%$
test_ones_like 4.5008ms 4.3331ms 230.7831 Ops/s 226.0003 Ops/s $\color{#35bf28}+2.12\%$
test_clone 12.0023ms 9.2870ms 107.6772 Ops/s 145.5368 Ops/s $\textbf{\color{#d91a1a}-26.01\%}$
test_squeeze 58.3910μs 9.3973μs 106.4131 KOps/s 108.4336 KOps/s $\color{#d91a1a}-1.86\%$
test_unsqueeze 0.1161ms 72.0130μs 13.8864 KOps/s 13.8339 KOps/s $\color{#35bf28}+0.38\%$
test_split 0.3855ms 0.1550ms 6.4514 KOps/s 6.1294 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_permute 0.2362ms 0.1783ms 5.6094 KOps/s 5.5777 KOps/s $\color{#35bf28}+0.57\%$
test_stack 52.3750ms 51.2068ms 19.5287 Ops/s 19.2146 Ops/s $\color{#35bf28}+1.63\%$
test_cat 52.6950ms 51.7074ms 19.3396 Ops/s 22.9336 Ops/s $\textbf{\color{#d91a1a}-15.67\%}$
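
For anyone reading the table by hand: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column, after converting both to the same unit. The sketch below is not the benchmark bot's code. The helper names `parse_ops` and `relative_change` are hypothetical, and it works from the rounded values printed in the table, so the last digit may differ from what the bot computed on raw timings.

```python
# Minimal sketch, assuming Change = (Ops - Ops on HEAD) / Ops on HEAD.
# Helper names are illustrative and do not come from the benchmark tooling.

# Multipliers for the throughput units that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '4.3991 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Return the percentage change of the PR throughput relative to HEAD."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


# Rows taken from the table: test_distributed and
# test_mod_wrap_and_backward[compile-overhead] (mixed KOps/s and Ops/s).
print(f"{relative_change('4.3991 KOps/s', '8.7991 KOps/s'):+.2f}%")
print(f"{relative_change('1.0790 KOps/s', '909.9590 Ops/s'):+.2f}%")
```

The second call shows why the unit conversion matters: some rows report the PR in KOps/s and HEAD in Ops/s, so comparing the printed numbers directly would be misleading.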

@vmoens vmoens merged commit e3fe346 into gh/vmoens/38/base Nov 28, 2024
48 of 50 checks passed
@vmoens vmoens deleted the gh/vmoens/38/head branch November 28, 2024 16:08