[CI] Add missing packages for smoke test #1149

Merged (3 commits) on Dec 19, 2024

vmoens (Contributor) commented Dec 19, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
This was referenced Dec 19, 2024
vmoens added a commit that referenced this pull request Dec 19, 2024
ghstack-source-id: 17746de9509078e3693ca4a5234b2a1bdd29d5f4
Pull Request resolved: #1149
@facebook-github-bot added the CLA Signed label Dec 19, 2024
@vmoens added the CI label Dec 19, 2024

github-actions bot commented Dec 19, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}17$.
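
The Improved and Worsened totals appear to count only rows whose Change is at least 5% in magnitude, which are the rows rendered in bold in the detailed results. In the CPU table, 15 rows sit at or above +5% and 17 at or below -5%, matching the totals. The bot does not state this cut-off, so the 5% threshold below is an inference from the data. The following Python sketch parses raw rows and labels them under that assumption. The helper names `parse_row` and `classify` are hypothetical.

```python
import re
from collections import Counter

# Matches the signed number in the LaTeX-wrapped Change cell, e.g. "-5.52\%".
CHANGE_RE = re.compile(r"([+-]\d+(?:\.\d+)?)\\%")


def parse_row(line: str) -> tuple[str, float]:
    """Return (benchmark name, change in percent) for one raw table row."""
    name = line.split(" ", 1)[0]
    match = CHANGE_RE.search(line)
    if match is None:
        raise ValueError(f"no change value found in: {line!r}")
    return name, float(match.group(1))


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"


# Three rows copied verbatim from the CPU results.
sample = [
    r"test_plain_set_nested 45.6350μs 21.6109μs 46.2730 KOps/s 48.9761 KOps/s $\textbf{\color{#d91a1a}-5.52\%}$",
    r"test_plain_set_stack_nested 42.7790μs 21.6538μs 46.1814 KOps/s 48.1862 KOps/s $\color{#d91a1a}-4.16\%$",
    r"test_compile_indexing[int-tensorclass-eager] 81.6220μs 18.2324μs 54.8473 KOps/s 52.1943 KOps/s $\textbf{\color{#35bf28}+5.08\%}$",
]

# Tally the labels the same way the summary line seems to.
tally = Counter(classify(parse_row(line)[1]) for line in sample)
print(dict(tally))
```

Running the same tally over every row of a complete table should reproduce that table's summary line if the assumed threshold is right.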

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 45.6350μs 21.6109μs 46.2730 KOps/s 48.9761 KOps/s $\textbf{\color{#d91a1a}-5.52\%}$
test_plain_set_stack_nested 42.7790μs 21.6538μs 46.1814 KOps/s 48.1862 KOps/s $\color{#d91a1a}-4.16\%$
test_plain_set_nested_inplace 70.9120μs 23.4825μs 42.5849 KOps/s 44.7100 KOps/s $\color{#d91a1a}-4.75\%$
test_plain_set_stack_nested_inplace 77.1840μs 23.5367μs 42.4868 KOps/s 44.6061 KOps/s $\color{#d91a1a}-4.75\%$
test_items 86.2800μs 4.2006μs 238.0633 KOps/s 240.5506 KOps/s $\color{#d91a1a}-1.03\%$
test_items_nested 0.7311ms 0.4007ms 2.4959 KOps/s 2.4516 KOps/s $\color{#35bf28}+1.80\%$
test_items_nested_locked 0.4620ms 0.4019ms 2.4884 KOps/s 2.4578 KOps/s $\color{#35bf28}+1.24\%$
test_items_nested_leaf 0.1451ms 77.0818μs 12.9732 KOps/s 13.0279 KOps/s $\color{#d91a1a}-0.42\%$
test_items_stack_nested 0.7365ms 0.4028ms 2.4824 KOps/s 2.4568 KOps/s $\color{#35bf28}+1.04\%$
test_items_stack_nested_leaf 0.1587ms 76.9808μs 12.9903 KOps/s 12.9379 KOps/s $\color{#35bf28}+0.40\%$
test_items_stack_nested_locked 0.4833ms 0.4045ms 2.4722 KOps/s 2.4585 KOps/s $\color{#35bf28}+0.56\%$
test_keys 31.9090μs 3.4966μs 285.9922 KOps/s 284.6050 KOps/s $\color{#35bf28}+0.49\%$
test_keys_nested 0.3247ms 0.1673ms 5.9755 KOps/s 5.9601 KOps/s $\color{#35bf28}+0.26\%$
test_keys_nested_locked 1.7162ms 0.1733ms 5.7714 KOps/s 5.7320 KOps/s $\color{#35bf28}+0.69\%$
test_keys_nested_leaf 0.2583ms 0.1488ms 6.7199 KOps/s 6.8447 KOps/s $\color{#d91a1a}-1.82\%$
test_keys_stack_nested 0.3178ms 0.1676ms 5.9664 KOps/s 5.9443 KOps/s $\color{#35bf28}+0.37\%$
test_keys_stack_nested_leaf 0.3120ms 0.1464ms 6.8307 KOps/s 6.8382 KOps/s $\color{#d91a1a}-0.11\%$
test_keys_stack_nested_locked 0.3126ms 0.1727ms 5.7919 KOps/s 5.8090 KOps/s $\color{#d91a1a}-0.29\%$
test_values 6.6484μs 1.0711μs 933.5955 KOps/s 973.7252 KOps/s $\color{#d91a1a}-4.12\%$
test_values_nested 0.1237ms 63.2841μs 15.8018 KOps/s 16.1512 KOps/s $\color{#d91a1a}-2.16\%$
test_values_nested_locked 0.1125ms 62.6682μs 15.9571 KOps/s 15.9546 KOps/s $\color{#35bf28}+0.02\%$
test_values_nested_leaf 0.1640ms 72.2646μs 13.8380 KOps/s 13.5603 KOps/s $\color{#35bf28}+2.05\%$
test_values_stack_nested 0.1128ms 63.1301μs 15.8403 KOps/s 16.2539 KOps/s $\color{#d91a1a}-2.54\%$
test_values_stack_nested_leaf 0.1319ms 71.8173μs 13.9242 KOps/s 13.9660 KOps/s $\color{#d91a1a}-0.30\%$
test_values_stack_nested_locked 0.1269ms 62.4604μs 16.0101 KOps/s 16.0532 KOps/s $\color{#d91a1a}-0.27\%$
test_membership 20.4680μs 0.8893μs 1.1245 MOps/s 1.1052 MOps/s $\color{#35bf28}+1.75\%$
test_membership_nested 42.8100μs 2.9030μs 344.4745 KOps/s 346.7499 KOps/s $\color{#d91a1a}-0.66\%$
test_membership_nested_leaf 21.3300μs 2.8440μs 351.6116 KOps/s 340.7638 KOps/s $\color{#35bf28}+3.18\%$
test_membership_stacked_nested 39.3830μs 2.8955μs 345.3632 KOps/s 345.4190 KOps/s $\color{#d91a1a}-0.02\%$
test_membership_stacked_nested_leaf 18.9560μs 2.8796μs 347.2716 KOps/s 342.3222 KOps/s $\color{#35bf28}+1.45\%$
test_membership_nested_last 43.5810μs 4.3669μs 228.9967 KOps/s 231.6805 KOps/s $\color{#d91a1a}-1.16\%$
test_membership_nested_leaf_last 50.6140μs 4.3367μs 230.5874 KOps/s 231.2988 KOps/s $\color{#d91a1a}-0.31\%$
test_membership_stacked_nested_last 24.5260μs 4.3323μs 230.8245 KOps/s 232.6235 KOps/s $\color{#d91a1a}-0.77\%$
test_membership_stacked_nested_leaf_last 25.1060μs 4.2917μs 233.0056 KOps/s 232.6206 KOps/s $\color{#35bf28}+0.17\%$
test_nested_getleaf 45.6650μs 10.9526μs 91.3023 KOps/s 92.6439 KOps/s $\color{#d91a1a}-1.45\%$
test_nested_get 37.0590μs 10.5077μs 95.1684 KOps/s 98.9350 KOps/s $\color{#d91a1a}-3.81\%$
test_stacked_getleaf 42.3990μs 10.7823μs 92.7443 KOps/s 93.4391 KOps/s $\color{#d91a1a}-0.74\%$
test_stacked_get 43.5410μs 10.3208μs 96.8916 KOps/s 97.9811 KOps/s $\color{#d91a1a}-1.11\%$
test_nested_getitemleaf 39.7040μs 11.3487μs 88.1156 KOps/s 90.4723 KOps/s $\color{#d91a1a}-2.60\%$
test_nested_getitem 35.6660μs 10.5135μs 95.1160 KOps/s 96.5561 KOps/s $\color{#d91a1a}-1.49\%$
test_stacked_getitemleaf 50.9150μs 11.1258μs 89.8813 KOps/s 89.5639 KOps/s $\color{#35bf28}+0.35\%$
test_stacked_getitem 78.0550μs 10.3962μs 96.1891 KOps/s 95.1167 KOps/s $\color{#35bf28}+1.13\%$
test_lock_nested 4.6162ms 0.4693ms 2.1307 KOps/s 2.1836 KOps/s $\color{#d91a1a}-2.42\%$
test_lock_stack_nested 0.7217ms 0.4341ms 2.3037 KOps/s 2.3652 KOps/s $\color{#d91a1a}-2.60\%$
test_unlock_nested 0.8568ms 0.3794ms 2.6358 KOps/s 2.6836 KOps/s $\color{#d91a1a}-1.78\%$
test_unlock_stack_nested 0.5296ms 0.3506ms 2.8525 KOps/s 2.8930 KOps/s $\color{#d91a1a}-1.40\%$
test_flatten_speed 0.2432ms 99.8697μs 10.0130 KOps/s 10.0770 KOps/s $\color{#d91a1a}-0.64\%$
test_unflatten_speed 0.7056ms 0.5295ms 1.8887 KOps/s 1.9173 KOps/s $\color{#d91a1a}-1.49\%$
test_common_ops 1.6823ms 0.7971ms 1.2545 KOps/s 1.3180 KOps/s $\color{#d91a1a}-4.82\%$
test_creation 65.6310μs 2.5492μs 392.2752 KOps/s 406.1558 KOps/s $\color{#d91a1a}-3.42\%$
test_creation_empty 40.2250μs 12.5856μs 79.4559 KOps/s 96.3061 KOps/s $\textbf{\color{#d91a1a}-17.50\%}$
test_creation_nested_1 38.7120μs 15.5867μs 64.1571 KOps/s 77.2668 KOps/s $\textbf{\color{#d91a1a}-16.97\%}$
test_creation_nested_2 56.6260μs 19.8489μs 50.3807 KOps/s 56.6271 KOps/s $\textbf{\color{#d91a1a}-11.03\%}$
test_clone 75.9510μs 13.3129μs 75.1153 KOps/s 72.3093 KOps/s $\color{#35bf28}+3.88\%$
test_getitem[int] 0.8641ms 12.7435μs 78.4714 KOps/s 75.7028 KOps/s $\color{#35bf28}+3.66\%$
test_getitem[slice_int] 0.1370ms 24.5665μs 40.7058 KOps/s 39.5097 KOps/s $\color{#35bf28}+3.03\%$
test_getitem[range] 0.2042ms 48.9003μs 20.4498 KOps/s 20.5665 KOps/s $\color{#d91a1a}-0.57\%$
test_getitem[tuple] 0.1340ms 20.2939μs 49.2759 KOps/s 48.2998 KOps/s $\color{#35bf28}+2.02\%$
test_getitem[list] 0.1861ms 44.0141μs 22.7200 KOps/s 22.9544 KOps/s $\color{#d91a1a}-1.02\%$
test_setitem_dim[int] 62.7660μs 25.3770μs 39.4058 KOps/s 40.7493 KOps/s $\color{#d91a1a}-3.30\%$
test_setitem_dim[slice_int] 0.1076ms 50.5858μs 19.7684 KOps/s 19.3673 KOps/s $\color{#35bf28}+2.07\%$
test_setitem_dim[range] 0.1212ms 72.2706μs 13.8369 KOps/s 13.7576 KOps/s $\color{#35bf28}+0.58\%$
test_setitem_dim[tuple] 94.6760μs 41.6151μs 24.0297 KOps/s 24.9953 KOps/s $\color{#d91a1a}-3.86\%$
test_setitem 81.8220μs 20.9483μs 47.7365 KOps/s 50.7980 KOps/s $\textbf{\color{#d91a1a}-6.03\%}$
test_set 0.1166ms 20.3427μs 49.1577 KOps/s 51.4241 KOps/s $\color{#d91a1a}-4.41\%$
test_set_shared 5.1673ms 0.1723ms 5.8022 KOps/s 5.8879 KOps/s $\color{#d91a1a}-1.45\%$
test_update 0.1303ms 23.5806μs 42.4078 KOps/s 47.0188 KOps/s $\textbf{\color{#d91a1a}-9.81\%}$
test_update_nested 0.2249ms 34.5824μs 28.9165 KOps/s 30.5745 KOps/s $\textbf{\color{#d91a1a}-5.42\%}$
test_update__nested 0.4827ms 33.7950μs 29.5902 KOps/s 28.9186 KOps/s $\color{#35bf28}+2.32\%$
test_set_nested 0.1156ms 22.4285μs 44.5862 KOps/s 46.3947 KOps/s $\color{#d91a1a}-3.90\%$
test_set_nested_new 0.1713ms 27.5347μs 36.3178 KOps/s 38.0894 KOps/s $\color{#d91a1a}-4.65\%$
test_select 0.2208ms 44.4288μs 22.5079 KOps/s 23.4088 KOps/s $\color{#d91a1a}-3.85\%$
test_select_nested 0.1237ms 62.7027μs 15.9483 KOps/s 15.8824 KOps/s $\color{#35bf28}+0.41\%$
test_exclude_nested 0.1574ms 81.9007μs 12.2099 KOps/s 12.0784 KOps/s $\color{#35bf28}+1.09\%$
test_empty[True] 0.7297ms 0.4134ms 2.4188 KOps/s 2.4021 KOps/s $\color{#35bf28}+0.70\%$
test_empty[False] 12.0023μs 1.4060μs 711.2598 KOps/s 709.1523 KOps/s $\color{#35bf28}+0.30\%$
test_unbind_speed 0.6644ms 0.2849ms 3.5102 KOps/s 3.7161 KOps/s $\textbf{\color{#d91a1a}-5.54\%}$
test_unbind_speed_stack0 0.6728ms 0.2728ms 3.6657 KOps/s 3.7468 KOps/s $\color{#d91a1a}-2.16\%$
test_unbind_speed_stack1 0.1071s 0.8055ms 1.2414 KOps/s 1.3692 KOps/s $\textbf{\color{#d91a1a}-9.33\%}$
test_split 1.8817ms 1.5926ms 627.8935 Ops/s 555.5820 Ops/s $\textbf{\color{#35bf28}+13.02\%}$
test_chunk 0.1048s 1.9304ms 518.0178 Ops/s 566.9586 Ops/s $\textbf{\color{#d91a1a}-8.63\%}$
test_consolidate_njt[False-None] 9.8369ms 8.1623ms 122.5151 Ops/s 120.9927 Ops/s $\color{#35bf28}+1.26\%$
test_creation[device0] 3.7014ms 91.3595μs 10.9458 KOps/s 11.0237 KOps/s $\color{#d91a1a}-0.71\%$
test_creation_from_tensor 0.3638ms 93.1368μs 10.7369 KOps/s 10.5050 KOps/s $\color{#35bf28}+2.21\%$
test_add_one[memmap_tensor0] 0.3202ms 4.7011μs 212.7141 KOps/s 210.0717 KOps/s $\color{#35bf28}+1.26\%$
test_contiguous[memmap_tensor0] 12.3230μs 0.5209μs 1.9196 MOps/s 1.9601 MOps/s $\color{#d91a1a}-2.06\%$
test_stack[memmap_tensor0] 45.5950μs 3.3358μs 299.7761 KOps/s 307.4237 KOps/s $\color{#d91a1a}-2.49\%$
test_memmaptd_index 1.1329ms 0.2351ms 4.2532 KOps/s 4.2655 KOps/s $\color{#d91a1a}-0.29\%$
test_memmaptd_index_astensor 0.5808ms 0.3219ms 3.1070 KOps/s 3.0889 KOps/s $\color{#35bf28}+0.59\%$
test_memmaptd_index_op 1.1968ms 0.5938ms 1.6842 KOps/s 1.7863 KOps/s $\textbf{\color{#d91a1a}-5.72\%}$
test_serialize_model 0.1296s 0.1183s 8.4539 Ops/s 7.5102 Ops/s $\textbf{\color{#35bf28}+12.57\%}$
test_serialize_model_pickle 0.4348s 0.3900s 2.5643 Ops/s 2.5680 Ops/s $\color{#d91a1a}-0.15\%$
test_serialize_weights 0.2264s 0.1330s 7.5171 Ops/s 8.6421 Ops/s $\textbf{\color{#d91a1a}-13.02\%}$
test_serialize_weights_returnearly 0.1851s 0.1616s 6.1873 Ops/s 6.3976 Ops/s $\color{#d91a1a}-3.29\%$
test_serialize_weights_pickle 1.3042s 0.7009s 1.4268 Ops/s 2.4822 Ops/s $\textbf{\color{#d91a1a}-42.52\%}$
test_serialize_weights_filesystem 0.1592s 0.1448s 6.9078 Ops/s 7.0742 Ops/s $\color{#d91a1a}-2.35\%$
test_serialize_model_filesystem 0.2483s 0.1583s 6.3169 Ops/s 6.6150 Ops/s $\color{#d91a1a}-4.51\%$
test_reshape_pytree 68.2370μs 26.5037μs 37.7306 KOps/s 37.6453 KOps/s $\color{#35bf28}+0.23\%$
test_reshape_td 0.1062ms 32.2536μs 31.0043 KOps/s 30.0575 KOps/s $\color{#35bf28}+3.15\%$
test_view_pytree 68.9380μs 26.5495μs 37.6654 KOps/s 37.2625 KOps/s $\color{#35bf28}+1.08\%$
test_view_td 0.1025ms 38.1779μs 26.1932 KOps/s 25.8867 KOps/s $\color{#35bf28}+1.18\%$
test_unbind_pytree 95.9580μs 29.5788μs 33.8080 KOps/s 33.2764 KOps/s $\color{#35bf28}+1.60\%$
test_unbind_td 0.3691ms 39.6698μs 25.2081 KOps/s 24.8324 KOps/s $\color{#35bf28}+1.51\%$
test_split_pytree 74.5280μs 29.2240μs 34.2185 KOps/s 33.9435 KOps/s $\color{#35bf28}+0.81\%$
test_split_td 0.5429ms 45.1877μs 22.1299 KOps/s 22.1939 KOps/s $\color{#d91a1a}-0.29\%$
test_add_pytree 93.6330μs 35.3471μs 28.2908 KOps/s 27.8121 KOps/s $\color{#35bf28}+1.72\%$
test_add_td 0.1142ms 55.2444μs 18.1014 KOps/s 18.1744 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_add_one_nested[tensordict-compile] 0.1604ms 61.7117μs 16.2044 KOps/s 15.9850 KOps/s $\color{#35bf28}+1.37\%$
test_compile_add_one_nested[tensordict-eager] 1.4847ms 0.1684ms 5.9385 KOps/s 5.9572 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_add_one_nested[pytree-compile] 0.1363ms 45.2655μs 22.0919 KOps/s 22.1305 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_add_one_nested[pytree-eager] 0.1987ms 0.1183ms 8.4566 KOps/s 8.5325 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_copy_nested[tensordict-compile] 0.1205ms 26.4622μs 37.7898 KOps/s 38.1872 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_copy_nested[tensordict-eager] 0.1133ms 59.2429μs 16.8797 KOps/s 16.9405 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_copy_nested[pytree-compile] 0.1935ms 79.2478μs 12.6187 KOps/s 12.4835 KOps/s $\color{#35bf28}+1.08\%$
test_compile_copy_nested[pytree-eager] 0.1204ms 67.7370μs 14.7630 KOps/s 14.5645 KOps/s $\color{#35bf28}+1.36\%$
test_compile_add_one_flat[tensordict-compile] 0.1903ms 0.1033ms 9.6789 KOps/s 9.4644 KOps/s $\color{#35bf28}+2.27\%$
test_compile_add_one_flat[tensordict-eager] 0.4227ms 0.2130ms 4.6949 KOps/s 4.5940 KOps/s $\color{#35bf28}+2.20\%$
test_compile_add_one_flat[tensorclass-compile] 0.1423ms 44.3599μs 22.5429 KOps/s 22.3275 KOps/s $\color{#35bf28}+0.96\%$
test_compile_add_one_flat[tensorclass-eager] 0.5133ms 62.4709μs 16.0074 KOps/s 15.7374 KOps/s $\color{#35bf28}+1.72\%$
test_compile_add_one_flat[pytree-compile] 0.2471ms 0.1036ms 9.6485 KOps/s 9.6965 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_one_flat[pytree-eager] 0.3322ms 0.1997ms 5.0084 KOps/s 4.9337 KOps/s $\color{#35bf28}+1.51\%$
test_compile_add_self_flat[tensordict-eager] 0.5606ms 0.2369ms 4.2219 KOps/s 4.2879 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_add_self_flat[tensordict-compile] 0.2284ms 0.1058ms 9.4490 KOps/s 9.5311 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_add_self_flat[tensorclass-eager] 0.2534ms 60.1874μs 16.6148 KOps/s 17.2768 KOps/s $\color{#d91a1a}-3.83\%$
test_compile_add_self_flat[tensorclass-compile] 0.1294ms 46.4521μs 21.5276 KOps/s 21.8524 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_add_self_flat[pytree-eager] 0.6369ms 0.1561ms 6.4046 KOps/s 6.2229 KOps/s $\color{#35bf28}+2.92\%$
test_compile_add_self_flat[pytree-compile] 0.3366ms 0.1016ms 9.8427 KOps/s 9.4540 KOps/s $\color{#35bf28}+4.11\%$
test_compile_copy_flat[tensordict-compile] 61.4140μs 21.2525μs 47.0532 KOps/s 45.5449 KOps/s $\color{#35bf28}+3.31\%$
test_compile_copy_flat[tensordict-eager] 0.1081ms 67.2289μs 14.8746 KOps/s 14.8768 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_copy_flat[pytree-compile] 0.1438ms 81.9602μs 12.2010 KOps/s 12.1610 KOps/s $\color{#35bf28}+0.33\%$
test_compile_copy_flat[pytree-eager] 0.1551ms 68.4358μs 14.6122 KOps/s 14.3427 KOps/s $\color{#35bf28}+1.88\%$
test_compile_assign_and_add[tensordict-compile] 0.4554ms 0.2053ms 4.8716 KOps/s 4.7869 KOps/s $\color{#35bf28}+1.77\%$
test_compile_assign_and_add[tensordict-eager] 1.5201ms 1.3035ms 767.1524 Ops/s 768.8583 Ops/s $\color{#d91a1a}-0.22\%$
test_compile_assign_and_add[pytree-compile] 0.2663ms 0.1978ms 5.0563 KOps/s 4.8709 KOps/s $\color{#35bf28}+3.81\%$
test_compile_assign_and_add[pytree-eager] 0.9206ms 0.7678ms 1.3025 KOps/s 1.2865 KOps/s $\color{#35bf28}+1.24\%$
test_compile_assign_and_add_stack[compile] 0.7830ms 0.4505ms 2.2196 KOps/s 2.1638 KOps/s $\color{#35bf28}+2.58\%$
test_compile_assign_and_add_stack[eager] 2.9074ms 2.6620ms 375.6638 Ops/s 384.0023 Ops/s $\color{#d91a1a}-2.17\%$
test_compile_indexing[tensor-tensordict-compile] 0.1520ms 36.6884μs 27.2566 KOps/s 26.8514 KOps/s $\color{#35bf28}+1.51\%$
test_compile_indexing[tensor-tensordict-eager] 0.4629ms 31.6910μs 31.5547 KOps/s 29.3300 KOps/s $\textbf{\color{#35bf28}+7.59\%}$
test_compile_indexing[tensor-tensorclass-compile] 74.6880μs 28.9771μs 34.5100 KOps/s 33.7043 KOps/s $\color{#35bf28}+2.39\%$
test_compile_indexing[tensor-tensorclass-eager] 71.9640μs 22.7010μs 44.0508 KOps/s 42.2615 KOps/s $\color{#35bf28}+4.23\%$
test_compile_indexing[tensor-pytree-compile] 84.2560μs 29.8975μs 33.4476 KOps/s 32.9538 KOps/s $\color{#35bf28}+1.50\%$
test_compile_indexing[tensor-pytree-eager] 78.9060μs 22.5218μs 44.4014 KOps/s 42.4553 KOps/s $\color{#35bf28}+4.58\%$
test_compile_indexing[slice-tensordict-compile] 0.1150ms 52.5049μs 19.0458 KOps/s 18.9685 KOps/s $\color{#35bf28}+0.41\%$
test_compile_indexing[slice-tensordict-eager] 0.5590ms 19.2855μs 51.8525 KOps/s 46.9524 KOps/s $\textbf{\color{#35bf28}+10.44\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1375ms 44.4964μs 22.4737 KOps/s 21.8293 KOps/s $\color{#35bf28}+2.95\%$
test_compile_indexing[slice-tensorclass-eager] 53.8400μs 18.4103μs 54.3174 KOps/s 51.0352 KOps/s $\textbf{\color{#35bf28}+6.43\%}$
test_compile_indexing[slice-pytree-compile] 0.1104ms 45.0474μs 22.1988 KOps/s 21.4116 KOps/s $\color{#35bf28}+3.68\%$
test_compile_indexing[slice-pytree-eager] 59.4700μs 18.3862μs 54.3885 KOps/s 49.9455 KOps/s $\textbf{\color{#35bf28}+8.90\%}$
test_compile_indexing[int-tensordict-compile] 0.1146ms 53.5828μs 18.6627 KOps/s 18.2878 KOps/s $\color{#35bf28}+2.05\%$
test_compile_indexing[int-tensordict-eager] 1.2031ms 19.2210μs 52.0263 KOps/s 47.0340 KOps/s $\textbf{\color{#35bf28}+10.61\%}$
test_compile_indexing[int-tensorclass-compile] 0.1342ms 45.4380μs 22.0080 KOps/s 21.4791 KOps/s $\color{#35bf28}+2.46\%$
test_compile_indexing[int-tensorclass-eager] 81.6220μs 18.2324μs 54.8473 KOps/s 52.1943 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_compile_indexing[int-pytree-compile] 0.1083ms 44.5983μs 22.4224 KOps/s 21.4175 KOps/s $\color{#35bf28}+4.69\%$
test_compile_indexing[int-pytree-eager] 57.0370μs 18.2127μs 54.9069 KOps/s 52.3566 KOps/s $\color{#35bf28}+4.87\%$
test_mod_add[eager] 0.1113ms 35.4028μs 28.2463 KOps/s 29.2126 KOps/s $\color{#d91a1a}-3.31\%$
test_mod_add[compile] 0.1333ms 48.1805μs 20.7553 KOps/s 19.8407 KOps/s $\color{#35bf28}+4.61\%$
test_mod_add[compile-overhead] 91.9310μs 47.4172μs 21.0894 KOps/s 19.8871 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_mod_wrap[eager] 0.3346ms 0.2282ms 4.3823 KOps/s 4.4076 KOps/s $\color{#d91a1a}-0.57\%$
test_mod_wrap[compile] 0.3084ms 0.2070ms 4.8301 KOps/s 4.6493 KOps/s $\color{#35bf28}+3.89\%$
test_mod_wrap[compile-overhead] 0.3133ms 0.2057ms 4.8624 KOps/s 4.7204 KOps/s $\color{#35bf28}+3.01\%$
test_mod_wrap_and_backward[eager] 19.6830ms 12.6572ms 79.0063 Ops/s 89.7759 Ops/s $\textbf{\color{#d91a1a}-12.00\%}$
test_mod_wrap_and_backward[compile] 19.0979ms 13.1782ms 75.8827 Ops/s 78.6843 Ops/s $\color{#d91a1a}-3.56\%$
test_mod_wrap_and_backward[compile-overhead] 15.7388ms 12.7809ms 78.2421 Ops/s 83.8044 Ops/s $\textbf{\color{#d91a1a}-6.64\%}$
test_seq_add[eager] 0.2588ms 0.1128ms 8.8643 KOps/s 8.4693 KOps/s $\color{#35bf28}+4.66\%$
test_seq_add[compile] 0.1318ms 61.6454μs 16.2218 KOps/s 15.7618 KOps/s $\color{#35bf28}+2.92\%$
test_seq_add[compile-overhead] 0.1649ms 60.3384μs 16.5732 KOps/s 16.3372 KOps/s $\color{#35bf28}+1.44\%$
test_seq_wrap[eager] 0.7348ms 0.4546ms 2.2000 KOps/s 2.2568 KOps/s $\color{#d91a1a}-2.52\%$
test_seq_wrap[compile] 0.4165ms 0.2284ms 4.3790 KOps/s 4.3195 KOps/s $\color{#35bf28}+1.38\%$
test_seq_wrap[compile-overhead] 0.3704ms 0.2312ms 4.3251 KOps/s 4.3324 KOps/s $\color{#d91a1a}-0.17\%$
test_func_call_runtime[False-eager] 0.9954ms 0.5534ms 1.8071 KOps/s 1.8164 KOps/s $\color{#d91a1a}-0.51\%$
test_func_call_runtime[False-compile] 0.5842ms 0.4286ms 2.3333 KOps/s 2.2952 KOps/s $\color{#35bf28}+1.66\%$
test_func_call_runtime[False-compile-overhead] 0.8362ms 0.4240ms 2.3584 KOps/s 2.2980 KOps/s $\color{#35bf28}+2.63\%$
test_func_call_runtime[True-eager] 1.5883ms 0.7763ms 1.2882 KOps/s 1.3192 KOps/s $\color{#d91a1a}-2.34\%$
test_func_call_runtime[True-compile] 0.7045ms 0.4665ms 2.1435 KOps/s 2.1132 KOps/s $\color{#35bf28}+1.43\%$
test_func_call_runtime[True-compile-overhead] 0.5872ms 0.4649ms 2.1511 KOps/s 2.1251 KOps/s $\color{#35bf28}+1.22\%$
test_func_call_cm_runtime[False-eager] 0.6910ms 0.5499ms 1.8184 KOps/s 1.8004 KOps/s $\color{#35bf28}+1.00\%$
test_func_call_cm_runtime[False-compile] 0.5231ms 0.4237ms 2.3600 KOps/s 2.3098 KOps/s $\color{#35bf28}+2.18\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8355ms 0.4287ms 2.3325 KOps/s 2.2944 KOps/s $\color{#35bf28}+1.66\%$
test_func_call_cm_runtime[True-eager] 1.0447ms 0.9060ms 1.1037 KOps/s 1.1132 KOps/s $\color{#d91a1a}-0.85\%$
test_func_call_cm_runtime[True-compile] 0.6411ms 0.4927ms 2.0296 KOps/s 2.0234 KOps/s $\color{#35bf28}+0.31\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8656ms 0.4946ms 2.0220 KOps/s 2.0057 KOps/s $\color{#35bf28}+0.81\%$
test_vmap_func_call_cm_runtime[eager] 2.7293ms 1.9281ms 518.6441 Ops/s 524.6508 Ops/s $\color{#d91a1a}-1.14\%$
test_vmap_func_call_cm_runtime[compile] 0.9009ms 0.5153ms 1.9405 KOps/s 1.9551 KOps/s $\color{#d91a1a}-0.75\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6383ms 0.5245ms 1.9065 KOps/s 1.9311 KOps/s $\color{#d91a1a}-1.28\%$
test_distributed 0.2911ms 0.1254ms 7.9762 KOps/s 7.7543 KOps/s $\color{#35bf28}+2.86\%$
test_tdmodule 57.4770μs 25.5118μs 39.1976 KOps/s 37.6778 KOps/s $\color{#35bf28}+4.03\%$
test_tdmodule_dispatch 95.2770μs 47.3293μs 21.1285 KOps/s 20.8876 KOps/s $\color{#35bf28}+1.15\%$
test_tdseq 54.7720μs 28.6499μs 34.9041 KOps/s 34.9783 KOps/s $\color{#d91a1a}-0.21\%$
test_tdseq_dispatch 88.9250μs 54.5682μs 18.3257 KOps/s 18.7408 KOps/s $\color{#d91a1a}-2.22\%$
test_instantiation_functorch 2.2113ms 1.5329ms 652.3794 Ops/s 655.1296 Ops/s $\color{#d91a1a}-0.42\%$
test_exec_functorch 0.3128ms 0.1813ms 5.5160 KOps/s 5.5244 KOps/s $\color{#d91a1a}-0.15\%$
test_exec_functional_call 0.3925ms 0.1739ms 5.7498 KOps/s 5.7450 KOps/s $\color{#35bf28}+0.08\%$
test_exec_td_decorator 0.4860ms 0.2402ms 4.1639 KOps/s 4.2207 KOps/s $\color{#d91a1a}-1.35\%$
test_vmap_mlp_speed_decorator[True-True] 1.0912ms 0.6619ms 1.5109 KOps/s 1.5369 KOps/s $\color{#d91a1a}-1.69\%$
test_vmap_mlp_speed_decorator[True-False] 1.1482ms 0.6768ms 1.4775 KOps/s 1.5339 KOps/s $\color{#d91a1a}-3.68\%$
test_vmap_mlp_speed_decorator[False-True] 1.2166ms 0.5292ms 1.8895 KOps/s 1.8878 KOps/s $\color{#35bf28}+0.09\%$
test_vmap_mlp_speed_decorator[False-False] 0.8077ms 0.5305ms 1.8850 KOps/s 1.8862 KOps/s $\color{#d91a1a}-0.06\%$
test_to_module_speed[True] 1.4479ms 1.3325ms 750.4445 Ops/s 738.8871 Ops/s $\color{#35bf28}+1.56\%$
test_to_module_speed[False] 1.4196ms 1.3046ms 766.5111 Ops/s 760.7333 Ops/s $\color{#35bf28}+0.76\%$
test_tc_init 0.1178ms 48.5956μs 20.5780 KOps/s 22.2522 KOps/s $\textbf{\color{#d91a1a}-7.52\%}$
test_tc_init_nested 0.2176ms 95.4781μs 10.4736 KOps/s 11.1015 KOps/s $\textbf{\color{#d91a1a}-5.66\%}$
test_tc_first_layer_tensor 28.1030μs 1.5103μs 662.1138 KOps/s 659.7567 KOps/s $\color{#35bf28}+0.36\%$
test_tc_first_layer_nontensor 30.9470μs 4.6601μs 214.5878 KOps/s 217.7249 KOps/s $\color{#d91a1a}-1.44\%$
test_tc_second_layer_tensor 17.3120μs 2.8044μs 356.5788 KOps/s 349.4903 KOps/s $\color{#35bf28}+2.03\%$
test_tc_second_layer_nontensor 36.1570μs 5.9212μs 168.8850 KOps/s 164.2180 KOps/s $\color{#35bf28}+2.84\%$
test_unbind 0.2385s 13.6546ms 73.2356 Ops/s 75.6552 Ops/s $\color{#d91a1a}-3.20\%$
test_full_like 8.6858ms 7.4148ms 134.8661 Ops/s 80.7252 Ops/s $\textbf{\color{#35bf28}+67.07\%}$
test_zeros_like 4.0054ms 2.9093ms 343.7221 Ops/s 124.8517 Ops/s $\textbf{\color{#35bf28}+175.30\%}$
test_ones_like 3.7836ms 3.2938ms 303.6032 Ops/s 127.7508 Ops/s $\textbf{\color{#35bf28}+137.65\%}$
test_clone 6.0285ms 5.2843ms 189.2399 Ops/s 106.7570 Ops/s $\textbf{\color{#35bf28}+77.26\%}$
test_squeeze 86.6210μs 12.1049μs 82.6115 KOps/s 79.6637 KOps/s $\color{#35bf28}+3.70\%$
test_unsqueeze 0.1613ms 90.7151μs 11.0235 KOps/s 10.9890 KOps/s $\color{#35bf28}+0.31\%$
test_split 0.3637ms 0.1934ms 5.1694 KOps/s 5.0060 KOps/s $\color{#35bf28}+3.27\%$
test_permute 0.3121ms 0.2086ms 4.7948 KOps/s 4.8640 KOps/s $\color{#d91a1a}-1.42\%$
test_stack 28.8100ms 25.0397ms 39.9366 Ops/s 37.8852 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_cat 27.1532ms 24.7070ms 40.4744 Ops/s 37.9450 Ops/s $\textbf{\color{#35bf28}+6.67\%}$
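
The columns of the table above appear to be related by simple arithmetic: Ops is the reciprocal of Mean, and Change is the relative difference between Ops and Ops on Repo HEAD. The Python sketch below checks this reading against three rows copied from the table. The function names are hypothetical, and the formulas are inferred from the numbers rather than taken from the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean run time (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops: float, ops_head: float) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    return (ops / ops_head - 1.0) * 100.0


# (name, Ops, Ops on Repo HEAD) in operations per second, from the CPU table.
rows = [
    ("test_plain_set_nested", 46.2730e3, 48.9761e3),
    ("test_serialize_weights_pickle", 1.4268, 2.4822),
    ("test_zeros_like", 343.7221, 124.8517),
]

for name, ops, ops_head in rows:
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")

# Mean of test_plain_set_nested is 21.6109 microseconds.
print(f"{ops_per_second(21.6109e-6) / 1e3:.4f} KOps/s")
```

Because Change is a ratio of throughputs, a halved throughput reads as -50% while a doubled one reads as +100%, so gains and losses of the same factor are not symmetric in this column.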

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 19, 2024
ghstack-source-id: 6a78016ac15a324c036ebb85ab82efcbc8dc3fbd
Pull Request resolved: #1149
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 19, 2024
ghstack-source-id: 1ae2795ee734baac4419b722d4e11d522051b112
Pull Request resolved: #1149

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}36$. Worsened: $\large\color{#d91a1a}11$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 37.7400μs 11.5768μs 86.3795 KOps/s 76.1494 KOps/s $\textbf{\color{#35bf28}+13.43\%}$
test_plain_set_stack_nested 35.7110μs 11.4697μs 87.1863 KOps/s 74.9186 KOps/s $\textbf{\color{#35bf28}+16.37\%}$
test_plain_set_nested_inplace 36.4500μs 12.4631μs 80.2371 KOps/s 70.5957 KOps/s $\textbf{\color{#35bf28}+13.66\%}$
test_plain_set_stack_nested_inplace 35.5910μs 12.4392μs 80.3912 KOps/s 70.1141 KOps/s $\textbf{\color{#35bf28}+14.66\%}$
test_items 24.0100μs 2.8947μs 345.4595 KOps/s 344.8690 KOps/s $\color{#35bf28}+0.17\%$
test_items_nested 0.4149ms 0.3617ms 2.7644 KOps/s 2.7464 KOps/s $\color{#35bf28}+0.65\%$
test_items_nested_locked 0.4167ms 0.3613ms 2.7677 KOps/s 2.7706 KOps/s $\color{#d91a1a}-0.10\%$
test_items_nested_leaf 87.1010μs 58.6274μs 17.0569 KOps/s 16.9273 KOps/s $\color{#35bf28}+0.77\%$
test_items_stack_nested 0.4205ms 0.3647ms 2.7417 KOps/s 2.7685 KOps/s $\color{#d91a1a}-0.97\%$
test_items_stack_nested_leaf 0.1051ms 60.6241μs 16.4951 KOps/s 16.4514 KOps/s $\color{#35bf28}+0.27\%$
test_items_stack_nested_locked 0.4174ms 0.3612ms 2.7683 KOps/s 2.7509 KOps/s $\color{#35bf28}+0.63\%$
test_keys 24.7410μs 3.4439μs 290.3649 KOps/s 287.5110 KOps/s $\color{#35bf28}+0.99\%$
test_keys_nested 0.1152ms 81.9556μs 12.2017 KOps/s 12.0671 KOps/s $\color{#35bf28}+1.12\%$
test_keys_nested_locked 0.7105ms 88.0399μs 11.3585 KOps/s 11.2437 KOps/s $\color{#35bf28}+1.02\%$
test_keys_nested_leaf 0.1056ms 72.4923μs 13.7946 KOps/s 13.4917 KOps/s $\color{#35bf28}+2.24\%$
test_keys_stack_nested 0.1124ms 83.5616μs 11.9672 KOps/s 11.9330 KOps/s $\color{#35bf28}+0.29\%$
test_keys_stack_nested_leaf 0.1100ms 73.6137μs 13.5844 KOps/s 13.2195 KOps/s $\color{#35bf28}+2.76\%$
test_keys_stack_nested_locked 0.1274ms 89.1880μs 11.2123 KOps/s 11.0292 KOps/s $\color{#35bf28}+1.66\%$
test_values 6.1302μs 0.8523μs 1.1733 MOps/s 1.1675 MOps/s $\color{#35bf28}+0.49\%$
test_values_nested 66.2310μs 34.1954μs 29.2437 KOps/s 28.8289 KOps/s $\color{#35bf28}+1.44\%$
test_values_nested_locked 68.1510μs 35.7849μs 27.9448 KOps/s 27.5581 KOps/s $\color{#35bf28}+1.40\%$
test_values_nested_leaf 62.7910μs 39.0009μs 25.6404 KOps/s 25.4967 KOps/s $\color{#35bf28}+0.56\%$
test_values_stack_nested 67.0510μs 34.7162μs 28.8050 KOps/s 28.8074 KOps/s $-0.01\%$
test_values_stack_nested_leaf 77.7120μs 39.2814μs 25.4574 KOps/s 25.3169 KOps/s $\color{#35bf28}+0.55\%$
test_values_stack_nested_locked 74.8110μs 36.3154μs 27.5365 KOps/s 27.4288 KOps/s $\color{#35bf28}+0.39\%$
test_membership 2.3301μs 0.5170μs 1.9342 MOps/s 1.9540 MOps/s $\color{#d91a1a}-1.02\%$
test_membership_nested 15.4055μs 2.0479μs 488.3072 KOps/s 488.1466 KOps/s $\color{#35bf28}+0.03\%$
test_membership_nested_leaf 29.7055μs 2.0754μs 481.8324 KOps/s 484.9108 KOps/s $\color{#d91a1a}-0.63\%$
test_membership_stacked_nested 23.9000μs 2.1231μs 471.0180 KOps/s 458.7036 KOps/s $\color{#35bf28}+2.68\%$
test_membership_stacked_nested_leaf 33.2700μs 2.1143μs 472.9618 KOps/s 464.0475 KOps/s $\color{#35bf28}+1.92\%$
test_membership_nested_last 49.1610μs 3.1158μs 320.9440 KOps/s 319.5997 KOps/s $\color{#35bf28}+0.42\%$
test_membership_nested_leaf_last 27.5900μs 3.0908μs 323.5433 KOps/s 318.1838 KOps/s $\color{#35bf28}+1.68\%$
test_membership_stacked_nested_last 32.2510μs 3.1387μs 318.6025 KOps/s 190.4649 KOps/s $\textbf{\color{#35bf28}+67.28\%}$
test_membership_stacked_nested_leaf_last 32.9610μs 3.1610μs 316.3543 KOps/s 186.5954 KOps/s $\textbf{\color{#35bf28}+69.54\%}$
test_nested_getleaf 41.0500μs 6.1857μs 161.6630 KOps/s 159.0052 KOps/s $\color{#35bf28}+1.67\%$
test_nested_get 30.3410μs 5.8248μs 171.6789 KOps/s 169.6771 KOps/s $\color{#35bf28}+1.18\%$
test_stacked_getleaf 46.7710μs 6.1656μs 162.1891 KOps/s 160.8067 KOps/s $\color{#35bf28}+0.86\%$
test_stacked_get 26.6710μs 5.8576μs 170.7192 KOps/s 170.1448 KOps/s $\color{#35bf28}+0.34\%$
test_nested_getitemleaf 51.9510μs 6.2529μs 159.9249 KOps/s 159.7574 KOps/s $\color{#35bf28}+0.10\%$
test_nested_getitem 86.4520μs 5.9400μs 168.3497 KOps/s 167.5984 KOps/s $\color{#35bf28}+0.45\%$
test_stacked_getitemleaf 50.5710μs 6.2551μs 159.8700 KOps/s 158.5746 KOps/s $\color{#35bf28}+0.82\%$
test_stacked_getitem 29.9100μs 5.9178μs 168.9830 KOps/s 167.1111 KOps/s $\color{#35bf28}+1.12\%$
test_lock_nested 8.9793ms 0.4007ms 2.4958 KOps/s 2.5295 KOps/s $\color{#d91a1a}-1.33\%$
test_lock_stack_nested 0.4106ms 0.3581ms 2.7928 KOps/s 2.7901 KOps/s $\color{#35bf28}+0.09\%$
test_unlock_nested 0.6304ms 0.3323ms 3.0094 KOps/s 3.0277 KOps/s $\color{#d91a1a}-0.61\%$
test_unlock_stack_nested 0.4288ms 0.2995ms 3.3393 KOps/s 3.3710 KOps/s $\color{#d91a1a}-0.94\%$
test_flatten_speed 0.1200ms 75.5253μs 13.2406 KOps/s 13.0564 KOps/s $\color{#35bf28}+1.41\%$
test_unflatten_speed 0.3811ms 0.3279ms 3.0499 KOps/s 3.0552 KOps/s $\color{#d91a1a}-0.17\%$
test_common_ops 1.6885ms 0.6010ms 1.6638 KOps/s 1.5240 KOps/s $\textbf{\color{#35bf28}+9.18\%}$
test_creation 0.1627ms 1.7810μs 561.4964 KOps/s 550.3152 KOps/s $\color{#35bf28}+2.03\%$
test_creation_empty 39.4610μs 6.5744μs 152.1040 KOps/s 99.7754 KOps/s $\textbf{\color{#35bf28}+52.45\%}$
test_creation_nested_1 28.4800μs 8.1900μs 122.1008 KOps/s 84.8523 KOps/s $\textbf{\color{#35bf28}+43.90\%}$
test_creation_nested_2 35.2410μs 11.0646μs 90.3782 KOps/s 69.0116 KOps/s $\textbf{\color{#35bf28}+30.96\%}$
test_clone 73.7920μs 11.9768μs 83.4950 KOps/s 87.9041 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_getitem[int] 1.5463ms 11.5120μs 86.8656 KOps/s 88.9897 KOps/s $\color{#d91a1a}-2.39\%$
test_getitem[slice_int] 0.1138ms 22.3455μs 44.7518 KOps/s 45.4066 KOps/s $\color{#d91a1a}-1.44\%$
test_getitem[range] 0.1306ms 40.4207μs 24.7398 KOps/s 25.1925 KOps/s $\color{#d91a1a}-1.80\%$
test_getitem[tuple] 0.1042ms 19.2133μs 52.0472 KOps/s 52.8953 KOps/s $\color{#d91a1a}-1.60\%$
test_getitem[list] 0.1950ms 35.7507μs 27.9715 KOps/s 28.2979 KOps/s $\color{#d91a1a}-1.15\%$
test_setitem_dim[int] 39.1700μs 20.5672μs 48.6212 KOps/s 49.1197 KOps/s $\color{#d91a1a}-1.01\%$
test_setitem_dim[slice_int] 62.1910μs 40.1302μs 24.9189 KOps/s 24.7240 KOps/s $\color{#35bf28}+0.79\%$
test_setitem_dim[range] 77.4810μs 54.5332μs 18.3375 KOps/s 18.0194 KOps/s $\color{#35bf28}+1.76\%$
test_setitem_dim[tuple] 65.5010μs 33.5962μs 29.7653 KOps/s 29.2870 KOps/s $\color{#35bf28}+1.63\%$
test_setitem 77.4320μs 15.8688μs 63.0167 KOps/s 58.2304 KOps/s $\textbf{\color{#35bf28}+8.22\%}$
test_set 80.6120μs 15.1069μs 66.1948 KOps/s 60.1689 KOps/s $\textbf{\color{#35bf28}+10.01\%}$
test_set_shared 1.5956ms 0.1521ms 6.5763 KOps/s 6.5897 KOps/s $\color{#d91a1a}-0.20\%$
test_update 0.2058ms 17.4500μs 57.3066 KOps/s 49.6697 KOps/s $\textbf{\color{#35bf28}+15.38\%}$
test_update_nested 92.0510μs 22.9123μs 43.6447 KOps/s 38.5717 KOps/s $\textbf{\color{#35bf28}+13.15\%}$
test_update__nested 0.1356ms 27.2356μs 36.7167 KOps/s 36.6825 KOps/s $\color{#35bf28}+0.09\%$
test_set_nested 80.3210μs 16.8283μs 59.4237 KOps/s 55.9226 KOps/s $\textbf{\color{#35bf28}+6.26\%}$
test_set_nested_new 88.4920μs 19.0590μs 52.4687 KOps/s 49.4929 KOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_select 91.2120μs 29.8089μs 33.5470 KOps/s 30.5020 KOps/s $\textbf{\color{#35bf28}+9.98\%}$
test_select_nested 86.4210μs 44.9233μs 22.2602 KOps/s 22.2557 KOps/s $\color{#35bf28}+0.02\%$
test_exclude_nested 94.5620μs 65.7305μs 15.2136 KOps/s 15.2968 KOps/s $\color{#d91a1a}-0.54\%$
test_empty[True] 0.3548ms 0.2947ms 3.3929 KOps/s 3.3605 KOps/s $\color{#35bf28}+0.96\%$
test_empty[False] 3.4690μs 0.8250μs 1.2121 MOps/s 1.2047 MOps/s $\color{#35bf28}+0.61\%$
test_to 0.1204ms 64.2156μs 15.5725 KOps/s 17.3758 KOps/s $\textbf{\color{#d91a1a}-10.38\%}$
test_to_nonblocking 84.8220μs 49.5599μs 20.1776 KOps/s 20.1072 KOps/s $\color{#35bf28}+0.35\%$
test_unbind_speed 1.5687ms 0.2518ms 3.9715 KOps/s 4.0698 KOps/s $\color{#d91a1a}-2.42\%$
test_unbind_speed_stack0 0.3332ms 0.2533ms 3.9485 KOps/s 4.0420 KOps/s $\color{#d91a1a}-2.31\%$
test_unbind_speed_stack1 91.5298ms 0.6975ms 1.4337 KOps/s 1.4738 KOps/s $\color{#d91a1a}-2.73\%$
test_split 93.2217ms 1.6669ms 599.9094 Ops/s 615.2880 Ops/s $\color{#d91a1a}-2.50\%$
test_chunk 93.3932ms 1.6708ms 598.5172 Ops/s 612.0057 Ops/s $\color{#d91a1a}-2.20\%$
test_consolidate[False-None] 96.1194ms 3.0542ms 327.4133 Ops/s 327.2565 Ops/s $\color{#35bf28}+0.05\%$
test_consolidate[default-None] 2.2063ms 1.8087ms 552.8838 Ops/s 567.9491 Ops/s $\color{#d91a1a}-2.65\%$
test_consolidate[reduce-overhead-None] 2.0179ms 1.8324ms 545.7248 Ops/s 564.7843 Ops/s $\color{#d91a1a}-3.37\%$
test_consolidate_njt[False-None] 7.2683ms 6.9159ms 144.5944 Ops/s 108.9073 Ops/s $\textbf{\color{#35bf28}+32.77\%}$
test_to[False-False-None] 1.9547ms 1.7928ms 557.7893 Ops/s 565.2088 Ops/s $\color{#d91a1a}-1.31\%$
test_to[True-False-None] 1.6373ms 1.4675ms 681.4530 Ops/s 714.2225 Ops/s $\color{#d91a1a}-4.59\%$
test_to[within-False-None] 4.6052ms 4.3747ms 228.5881 Ops/s 233.0220 Ops/s $\color{#d91a1a}-1.90\%$
test_to[True-default-None] 5.7705ms 5.5233ms 181.0503 Ops/s 179.8920 Ops/s $\color{#35bf28}+0.64\%$
test_to_njt[False-False-None] 7.5964ms 7.2123ms 138.6512 Ops/s 140.6943 Ops/s $\color{#d91a1a}-1.45\%$
test_to_njt[True-False-None] 6.2685ms 5.9313ms 168.5972 Ops/s 174.8105 Ops/s $\color{#d91a1a}-3.55\%$
test_to_njt[within-False-None] 13.8543ms 12.7763ms 78.2700 Ops/s 79.3445 Ops/s $\color{#d91a1a}-1.35\%$
test_creation[device0] 0.4709ms 81.7543μs 12.2318 KOps/s 12.3156 KOps/s $\color{#d91a1a}-0.68\%$
test_creation_from_tensor 0.5150ms 85.9920μs 11.6290 KOps/s 11.8133 KOps/s $\color{#d91a1a}-1.56\%$
test_add_one[memmap_tensor0] 0.3799ms 7.5356μs 132.7037 KOps/s 139.5651 KOps/s $\color{#d91a1a}-4.92\%$
test_contiguous[memmap_tensor0] 2.3805μs 0.4313μs 2.3186 MOps/s 2.3249 MOps/s $\color{#d91a1a}-0.27\%$
test_stack[memmap_tensor0] 41.3900μs 4.8278μs 207.1356 KOps/s 210.5310 KOps/s $\color{#d91a1a}-1.61\%$
test_memmaptd_index 0.6076ms 0.2711ms 3.6891 KOps/s 3.7844 KOps/s $\color{#d91a1a}-2.52\%$
test_memmaptd_index_astensor 0.7071ms 0.3342ms 2.9926 KOps/s 3.0405 KOps/s $\color{#d91a1a}-1.58\%$
test_memmaptd_index_op 1.0448ms 0.6039ms 1.6558 KOps/s 1.5555 KOps/s $\textbf{\color{#35bf28}+6.45\%}$
test_serialize_model 0.1321s 0.1311s 7.6260 Ops/s 7.6397 Ops/s $\color{#d91a1a}-0.18\%$
test_serialize_model_pickle 1.3490s 1.2149s 0.8231 Ops/s 0.8251 Ops/s $\color{#d91a1a}-0.24\%$
test_serialize_weights 0.1328s 0.1307s 7.6495 Ops/s 7.6642 Ops/s $\color{#d91a1a}-0.19\%$
test_serialize_weights_returnearly 0.5489s 62.7710ms 15.9309 Ops/s 15.3351 Ops/s $\color{#35bf28}+3.89\%$
test_serialize_weights_pickle 1.3456s 1.2207s 0.8192 Ops/s 0.8206 Ops/s $\color{#d91a1a}-0.17\%$
test_reshape_pytree 55.3210μs 22.8356μs 43.7913 KOps/s 43.4898 KOps/s $\color{#35bf28}+0.69\%$
test_reshape_td 56.9410μs 27.7317μs 36.0598 KOps/s 36.0437 KOps/s $\color{#35bf28}+0.04\%$
test_view_pytree 61.1510μs 22.7180μs 44.0179 KOps/s 44.1500 KOps/s $\color{#d91a1a}-0.30\%$
test_view_td 58.5810μs 30.8584μs 32.4061 KOps/s 29.1793 KOps/s $\textbf{\color{#35bf28}+11.06\%}$
test_unbind_pytree 56.3010μs 29.0135μs 34.4668 KOps/s 34.4020 KOps/s $\color{#35bf28}+0.19\%$
test_unbind_td 0.8025ms 38.1480μs 26.2137 KOps/s 26.1773 KOps/s $\color{#35bf28}+0.14\%$
test_split_pytree 91.5020μs 30.8740μs 32.3897 KOps/s 32.1951 KOps/s $\color{#35bf28}+0.60\%$
test_split_td 0.9747ms 41.1689μs 24.2902 KOps/s 24.9304 KOps/s $\color{#d91a1a}-2.57\%$
test_add_pytree 62.8110μs 36.8366μs 27.1469 KOps/s 27.7519 KOps/s $\color{#d91a1a}-2.18\%$
test_add_td 0.1937ms 48.6377μs 20.5602 KOps/s 18.8812 KOps/s $\textbf{\color{#35bf28}+8.89\%}$
test_compile_add_one_nested[tensordict-compile] 0.1763ms 0.1230ms 8.1301 KOps/s 7.8442 KOps/s $\color{#35bf28}+3.65\%$
test_compile_add_one_nested[tensordict-eager] 0.2436ms 0.1313ms 7.6167 KOps/s 7.5982 KOps/s $\color{#35bf28}+0.24\%$
test_compile_add_one_nested[pytree-compile] 0.1448ms 97.0291μs 10.3062 KOps/s 10.2286 KOps/s $\color{#35bf28}+0.76\%$
test_compile_add_one_nested[pytree-eager] 0.2163ms 0.1570ms 6.3702 KOps/s 6.5206 KOps/s $\color{#d91a1a}-2.31\%$
test_compile_copy_nested[tensordict-compile] 48.9410μs 23.4899μs 42.5714 KOps/s 44.5554 KOps/s $\color{#d91a1a}-4.45\%$
test_compile_copy_nested[tensordict-eager] 60.4620μs 30.0534μs 33.2741 KOps/s 33.1083 KOps/s $\color{#35bf28}+0.50\%$
test_compile_copy_nested[pytree-compile] 0.4312ms 65.3767μs 15.2960 KOps/s 15.0975 KOps/s $\color{#35bf28}+1.31\%$
test_compile_copy_nested[pytree-eager] 76.7420μs 49.3405μs 20.2673 KOps/s 20.2449 KOps/s $\color{#35bf28}+0.11\%$
test_compile_add_one_flat[tensordict-compile] 0.1885ms 0.1466ms 6.8200 KOps/s 7.0696 KOps/s $\color{#d91a1a}-3.53\%$
test_compile_add_one_flat[tensordict-eager] 0.3093ms 0.2175ms 4.5973 KOps/s 4.5620 KOps/s $\color{#35bf28}+0.77\%$
test_compile_add_one_flat[tensorclass-compile] 0.3476ms 0.1028ms 9.7288 KOps/s 9.7440 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_add_one_flat[tensorclass-eager] 0.1218ms 56.2835μs 17.7672 KOps/s 17.7946 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_add_one_flat[pytree-compile] 0.1895ms 0.1403ms 7.1286 KOps/s 7.3433 KOps/s $\color{#d91a1a}-2.92\%$
test_compile_add_one_flat[pytree-eager] 0.5927ms 0.5135ms 1.9474 KOps/s 1.9761 KOps/s $\color{#d91a1a}-1.45\%$
test_compile_add_self_flat[tensordict-eager] 0.3805ms 0.2610ms 3.8314 KOps/s 3.8315 KOps/s $-0.00\%$
test_compile_add_self_flat[tensordict-compile] 0.1879ms 0.1463ms 6.8351 KOps/s 7.0093 KOps/s $\color{#d91a1a}-2.49\%$
test_compile_add_self_flat[tensorclass-eager] 0.1521ms 67.3226μs 14.8539 KOps/s 15.1641 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_add_self_flat[tensorclass-compile] 0.1553ms 0.1016ms 9.8440 KOps/s 10.1093 KOps/s $\color{#d91a1a}-2.62\%$
test_compile_add_self_flat[pytree-eager] 0.4956ms 0.4281ms 2.3358 KOps/s 2.3752 KOps/s $\color{#d91a1a}-1.66\%$
test_compile_add_self_flat[pytree-compile] 0.2292ms 0.1394ms 7.1736 KOps/s 7.0949 KOps/s $\color{#35bf28}+1.11\%$
test_compile_copy_flat[tensordict-compile] 57.4710μs 19.8421μs 50.3979 KOps/s 55.2526 KOps/s $\textbf{\color{#d91a1a}-8.79\%}$
test_compile_copy_flat[tensordict-eager] 63.8910μs 32.3589μs 30.9034 KOps/s 31.2419 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_copy_flat[pytree-compile] 0.1254ms 71.1027μs 14.0642 KOps/s 14.0149 KOps/s $\color{#35bf28}+0.35\%$
test_compile_copy_flat[pytree-eager] 87.7810μs 51.5293μs 19.4064 KOps/s 19.1958 KOps/s $\color{#35bf28}+1.10\%$
test_compile_assign_and_add[tensordict-compile] 1.6870ms 0.4044ms 2.4726 KOps/s 2.2174 KOps/s $\textbf{\color{#35bf28}+11.51\%}$
test_compile_assign_and_add[tensordict-eager] 3.1407ms 2.7555ms 362.9099 Ops/s 367.8918 Ops/s $\color{#d91a1a}-1.35\%$
test_compile_assign_and_add[pytree-compile] 1.6643ms 0.4479ms 2.2328 KOps/s 2.2353 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_assign_and_add[pytree-eager] 2.9446ms 2.8027ms 356.8024 Ops/s 360.5573 Ops/s $\color{#d91a1a}-1.04\%$
test_compile_indexing[tensor-tensordict-compile] 0.1854ms 0.1194ms 8.3785 KOps/s 8.1995 KOps/s $\color{#35bf28}+2.18\%$
test_compile_indexing[tensor-tensordict-eager] 0.5694ms 84.3104μs 11.8609 KOps/s 12.1093 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2046ms 0.1122ms 8.9110 KOps/s 9.1742 KOps/s $\color{#d91a1a}-2.87\%$
test_compile_indexing[tensor-tensorclass-eager] 0.4558ms 73.0520μs 13.6889 KOps/s 13.4606 KOps/s $\color{#35bf28}+1.70\%$
test_compile_indexing[tensor-pytree-compile] 0.5158ms 0.1165ms 8.5835 KOps/s 8.6762 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_indexing[tensor-pytree-eager] 0.4697ms 73.0785μs 13.6839 KOps/s 13.6018 KOps/s $\color{#35bf28}+0.60\%$
test_compile_indexing[slice-tensordict-compile] 0.5030ms 0.1034ms 9.6758 KOps/s 9.7905 KOps/s $\color{#d91a1a}-1.17\%$
test_compile_indexing[slice-tensordict-eager] 0.1425ms 18.8005μs 53.1900 KOps/s 49.3225 KOps/s $\textbf{\color{#35bf28}+7.84\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1385ms 98.6346μs 10.1384 KOps/s 10.1213 KOps/s $\color{#35bf28}+0.17\%$
test_compile_indexing[slice-tensorclass-eager] 0.4040ms 16.7901μs 59.5590 KOps/s 60.0882 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_indexing[slice-pytree-compile] 0.5041ms 0.1043ms 9.5917 KOps/s 10.0610 KOps/s $\color{#d91a1a}-4.66\%$
test_compile_indexing[slice-pytree-eager] 0.3994ms 17.8707μs 55.9574 KOps/s 60.1358 KOps/s $\textbf{\color{#d91a1a}-6.95\%}$
test_compile_indexing[int-tensordict-compile] 0.2009ms 0.1089ms 9.1850 KOps/s 9.3365 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_indexing[int-tensordict-eager] 0.5401ms 18.5299μs 53.9668 KOps/s 54.8043 KOps/s $\color{#d91a1a}-1.53\%$
test_compile_indexing[int-tensorclass-compile] 0.5027ms 0.1038ms 9.6294 KOps/s 9.6761 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_indexing[int-tensorclass-eager] 0.4093ms 17.7016μs 56.4921 KOps/s 60.1257 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_compile_indexing[int-pytree-compile] 0.5035ms 0.1042ms 9.5934 KOps/s 10.1489 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_compile_indexing[int-pytree-eager] 0.3978ms 16.8979μs 59.1790 KOps/s 60.5814 KOps/s $\color{#d91a1a}-2.32\%$
test_mod_add[eager] 0.4371ms 38.0091μs 26.3095 KOps/s 24.8808 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_mod_add[compile] 0.4835ms 82.8171μs 12.0748 KOps/s 11.4401 KOps/s $\textbf{\color{#35bf28}+5.55\%}$
test_mod_add[compile-overhead] 0.3283ms 0.1718ms 5.8194 KOps/s 5.6582 KOps/s $\color{#35bf28}+2.85\%$
test_mod_wrap[eager] 0.6579ms 0.2539ms 3.9379 KOps/s 3.6770 KOps/s $\textbf{\color{#35bf28}+7.10\%}$
test_mod_wrap[compile] 0.3716ms 0.2934ms 3.4086 KOps/s 3.2189 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_mod_wrap[compile-overhead] 6.6466ms 3.6854ms 271.3409 Ops/s 271.3332 Ops/s $+0.00\%$
test_mod_wrap_and_backward[eager] 1.5422ms 1.3909ms 718.9579 Ops/s 676.0073 Ops/s $\textbf{\color{#35bf28}+6.35\%}$
test_mod_wrap_and_backward[compile] 1.4196ms 1.3051ms 766.2034 Ops/s 759.1150 Ops/s $\color{#35bf28}+0.93\%$
test_mod_wrap_and_backward[compile-overhead] 1.4067ms 0.9473ms 1.0557 KOps/s 1.0563 KOps/s $\color{#d91a1a}-0.06\%$
test_seq_add[eager] 0.2042ms 0.1204ms 8.3069 KOps/s 8.2316 KOps/s $\color{#35bf28}+0.91\%$
test_seq_add[compile] 0.2541ms 91.7790μs 10.8957 KOps/s 10.5128 KOps/s $\color{#35bf28}+3.64\%$
test_seq_add[compile-overhead] 0.2013ms 0.1329ms 7.5234 KOps/s 7.5414 KOps/s $\color{#d91a1a}-0.24\%$
test_seq_wrap[eager] 0.4896ms 0.4204ms 2.3785 KOps/s 2.1933 KOps/s $\textbf{\color{#35bf28}+8.44\%}$
test_seq_wrap[compile] 0.3625ms 0.3091ms 3.2352 KOps/s 3.1979 KOps/s $\color{#35bf28}+1.17\%$
test_seq_wrap[compile-overhead] 0.2851ms 0.2308ms 4.3334 KOps/s 4.2922 KOps/s $\color{#35bf28}+0.96\%$
test_func_call_runtime[False-eager] 0.8472ms 0.7511ms 1.3315 KOps/s 1.3188 KOps/s $\color{#35bf28}+0.96\%$
test_func_call_runtime[False-compile] 0.8993ms 0.7717ms 1.2958 KOps/s 1.2850 KOps/s $\color{#35bf28}+0.84\%$
test_func_call_runtime[False-compile-overhead] 0.4388ms 0.3754ms 2.6639 KOps/s 2.6703 KOps/s $\color{#d91a1a}-0.24\%$
test_func_call_runtime[True-eager] 1.0618ms 0.9245ms 1.0817 KOps/s 1.0629 KOps/s $\color{#35bf28}+1.77\%$
test_func_call_runtime[True-compile] 0.8557ms 0.7897ms 1.2663 KOps/s 1.2558 KOps/s $\color{#35bf28}+0.84\%$
test_func_call_runtime[True-compile-overhead] 0.4745ms 0.4038ms 2.4764 KOps/s 2.5280 KOps/s $\color{#d91a1a}-2.04\%$
test_func_call_cm_runtime[False-eager] 0.9210ms 0.8082ms 1.2373 KOps/s 1.3308 KOps/s $\textbf{\color{#d91a1a}-7.03\%}$
test_func_call_cm_runtime[False-compile] 0.9364ms 0.8262ms 1.2104 KOps/s 1.2820 KOps/s $\textbf{\color{#d91a1a}-5.58\%}$
test_func_call_cm_runtime[False-compile-overhead] 0.4589ms 0.3774ms 2.6497 KOps/s 2.6533 KOps/s $\color{#d91a1a}-0.14\%$
test_func_call_cm_runtime[True-eager] 1.1730ms 1.0241ms 976.4357 Ops/s 967.4522 Ops/s $\color{#35bf28}+0.93\%$
test_func_call_cm_runtime[True-compile] 0.9275ms 0.8208ms 1.2184 KOps/s 1.2127 KOps/s $\color{#35bf28}+0.47\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4773ms 0.4249ms 2.3533 KOps/s 2.3615 KOps/s $\color{#d91a1a}-0.35\%$
test_vmap_func_call_cm_runtime[eager] 2.5734ms 2.1306ms 469.3575 Ops/s 467.9958 Ops/s $\color{#35bf28}+0.29\%$
test_vmap_func_call_cm_runtime[compile] 0.9607ms 0.8698ms 1.1497 KOps/s 1.1773 KOps/s $\color{#d91a1a}-2.35\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4896ms 0.4259ms 2.3479 KOps/s 2.3445 KOps/s $\color{#35bf28}+0.14\%$
test_distributed 2.0693ms 0.1781ms 5.6154 KOps/s 8.7632 KOps/s $\textbf{\color{#d91a1a}-35.92\%}$
test_tdmodule 38.5500μs 19.0287μs 52.5521 KOps/s 46.0823 KOps/s $\textbf{\color{#35bf28}+14.04\%}$
test_tdmodule_dispatch 0.2787ms 34.7041μs 28.8151 KOps/s 25.9306 KOps/s $\textbf{\color{#35bf28}+11.12\%}$
test_tdseq 42.2610μs 19.7290μs 50.6869 KOps/s 44.7231 KOps/s $\textbf{\color{#35bf28}+13.33\%}$
test_tdseq_dispatch 58.8010μs 36.6310μs 27.2993 KOps/s 23.9465 KOps/s $\textbf{\color{#35bf28}+14.00\%}$
test_instantiation_functorch 1.7211ms 1.6123ms 620.2182 Ops/s 629.4023 Ops/s $\color{#d91a1a}-1.46\%$
test_exec_functorch 0.2893ms 0.1576ms 6.3465 KOps/s 6.5204 KOps/s $\color{#d91a1a}-2.67\%$
test_exec_functional_call 0.2054ms 0.1475ms 6.7818 KOps/s 6.8374 KOps/s $\color{#d91a1a}-0.81\%$
test_exec_td_decorator 0.3856ms 0.1962ms 5.0965 KOps/s 5.0817 KOps/s $\color{#35bf28}+0.29\%$
test_vmap_mlp_speed_decorator[True-True] 0.7972ms 0.6923ms 1.4444 KOps/s 1.4259 KOps/s $\color{#35bf28}+1.29\%$
test_vmap_mlp_speed_decorator[True-False] 0.8013ms 0.6902ms 1.4489 KOps/s 1.4253 KOps/s $\color{#35bf28}+1.65\%$
test_vmap_mlp_speed_decorator[False-True] 0.7304ms 0.6074ms 1.6462 KOps/s 1.6507 KOps/s $\color{#d91a1a}-0.27\%$
test_vmap_mlp_speed_decorator[False-False] 0.7315ms 0.6082ms 1.6441 KOps/s 1.6460 KOps/s $\color{#d91a1a}-0.12\%$
test_vmap_transformer_speed_decorator[True-True] 19.9041ms 19.5834ms 51.0636 Ops/s 51.3822 Ops/s $\color{#d91a1a}-0.62\%$
test_vmap_transformer_speed_decorator[True-False] 19.8032ms 19.5697ms 51.0995 Ops/s 51.3599 Ops/s $\color{#d91a1a}-0.51\%$
test_vmap_transformer_speed_decorator[False-True] 19.8049ms 19.5072ms 51.2632 Ops/s 51.8207 Ops/s $\color{#d91a1a}-1.08\%$
test_vmap_transformer_speed_decorator[False-False] 19.7636ms 19.4524ms 51.4074 Ops/s 51.7368 Ops/s $\color{#d91a1a}-0.64\%$
test_to_module_speed[True] 1.0968ms 0.9894ms 1.0107 KOps/s 1.0077 KOps/s $\color{#35bf28}+0.29\%$
test_to_module_speed[False] 1.5388ms 0.9718ms 1.0291 KOps/s 1.0288 KOps/s $\color{#35bf28}+0.02\%$
test_tc_init 89.7020μs 36.1989μs 27.6251 KOps/s 24.2897 KOps/s $\textbf{\color{#35bf28}+13.73\%}$
test_tc_init_nested 0.1087ms 72.5923μs 13.7756 KOps/s 12.2609 KOps/s $\textbf{\color{#35bf28}+12.35\%}$
test_tc_first_layer_tensor 4.8959μs 0.7078μs 1.4129 MOps/s 1.4270 MOps/s $\color{#d91a1a}-0.99\%$
test_tc_first_layer_nontensor 30.2600μs 2.3787μs 420.3952 KOps/s 418.1248 KOps/s $\color{#35bf28}+0.54\%$
test_tc_second_layer_tensor 9.0852μs 1.4309μs 698.8520 KOps/s 700.3483 KOps/s $\color{#d91a1a}-0.21\%$
test_tc_second_layer_nontensor 31.5800μs 3.1334μs 319.1458 KOps/s 317.6426 KOps/s $\color{#35bf28}+0.47\%$
test_unbind 0.2375s 10.6590ms 93.8172 Ops/s 140.1949 Ops/s $\textbf{\color{#d91a1a}-33.08\%}$
test_full_like 10.2366ms 9.1085ms 109.7877 Ops/s 109.4485 Ops/s $\color{#35bf28}+0.31\%$
test_zeros_like 4.8254ms 4.3226ms 231.3438 Ops/s 137.2173 Ops/s $\textbf{\color{#35bf28}+68.60\%}$
test_ones_like 4.6221ms 4.3196ms 231.5043 Ops/s 231.7171 Ops/s $\color{#d91a1a}-0.09\%$
test_clone 6.9790ms 6.3426ms 157.6643 Ops/s 157.8505 Ops/s $\color{#d91a1a}-0.12\%$
test_squeeze 82.3710μs 10.0214μs 99.7866 KOps/s 103.3867 KOps/s $\color{#d91a1a}-3.48\%$
test_unsqueeze 0.1320ms 77.5356μs 12.8973 KOps/s 13.9773 KOps/s $\textbf{\color{#d91a1a}-7.73\%}$
test_split 0.2914ms 0.1631ms 6.1316 KOps/s 6.0499 KOps/s $\color{#35bf28}+1.35\%$
test_permute 0.2518ms 0.1866ms 5.3584 KOps/s 5.5144 KOps/s $\color{#d91a1a}-2.83\%$
test_stack 50.8125ms 50.6235ms 19.7537 Ops/s 19.7680 Ops/s $\color{#d91a1a}-0.07\%$
test_cat 50.8735ms 50.5898ms 19.7668 Ops/s 19.8231 Ops/s $\color{#d91a1a}-0.28\%$
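
For anyone reading the table above, the Change column appears to be the relative difference between the Ops value of this PR and the Ops on Repo HEAD, and Ops appears to be the reciprocal of Mean. The sketch below reproduces a few rows under those assumptions. The function names and the 5% bold threshold are hypothetical, inferred from the rows rather than taken from the benchmark tooling.

```python
def ops_from_mean(mean_seconds: float) -> float:
    """Operations per second, assuming Ops = 1 / Mean (inferred from the rows)."""
    return 1.0 / mean_seconds


def relative_change(ops: float, head_ops: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    return (ops - head_ops) / head_ops * 100.0


def is_highlighted(change_percent: float, threshold: float = 5.0) -> bool:
    """Assumed rule: rows are bolded when |change| exceeds the threshold.

    The 5% default is a guess that matches the bolded rows in the table.
    """
    return abs(change_percent) > threshold


# test_set row: Mean 15.1069 us, 66.1948 KOps/s vs 60.1689 KOps/s on HEAD
set_change = relative_change(66.1948, 60.1689)
# test_unbind row: 93.8172 Ops/s vs 140.1949 Ops/s on HEAD
unbind_change = relative_change(93.8172, 140.1949)

print(f"test_set: {set_change:+.2f}%")
print(f"test_unbind: {unbind_change:+.2f}%")
```

Both units cancel in the ratio, so KOps/s or Ops/s values can be passed directly as long as the two arguments share a unit.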

@vmoens vmoens merged commit f9c99cb into gh/vmoens/41/base Dec 19, 2024
41 of 45 checks passed
vmoens added a commit that referenced this pull request Dec 19, 2024
ghstack-source-id: 1ae2795ee734baac4419b722d4e11d522051b112
Pull Request resolved: #1149
@vmoens vmoens deleted the gh/vmoens/41/head branch December 19, 2024 11:09