[BugFix] TorchScript compat #1141

Open: wants to merge 1 commit into base: gh/vmoens/35/base


vmoens
Contributor

@vmoens vmoens commented Dec 17, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
@facebook-github-bot added the "CLA Signed" label Dec 17, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: a2bd5ea52bd65d81b109c2b82b8a09ef2505453d
Pull Request resolved: #1141

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}19$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 45.0340μs 21.2680μs 47.0190 KOps/s 49.3471 KOps/s $\color{#d91a1a}-4.72\%$
test_plain_set_stack_nested 72.8260μs 21.4210μs 46.6832 KOps/s 48.5349 KOps/s $\color{#d91a1a}-3.82\%$
test_plain_set_nested_inplace 66.4430μs 23.1648μs 43.1690 KOps/s 44.5391 KOps/s $\color{#d91a1a}-3.08\%$
test_plain_set_stack_nested_inplace 0.1033ms 22.8574μs 43.7496 KOps/s 44.8863 KOps/s $\color{#d91a1a}-2.53\%$
test_items 23.5440μs 4.2501μs 235.2870 KOps/s 237.6643 KOps/s $\color{#d91a1a}-1.00\%$
test_items_nested 0.5655ms 0.4046ms 2.4717 KOps/s 2.4629 KOps/s $\color{#35bf28}+0.36\%$
test_items_nested_locked 0.7380ms 0.4105ms 2.4358 KOps/s 2.4542 KOps/s $\color{#d91a1a}-0.75\%$
test_items_nested_leaf 0.1361ms 76.4867μs 13.0742 KOps/s 12.8085 KOps/s $\color{#35bf28}+2.07\%$
test_items_stack_nested 0.8881ms 0.4080ms 2.4508 KOps/s 2.4283 KOps/s $\color{#35bf28}+0.93\%$
test_items_stack_nested_leaf 0.1346ms 77.7326μs 12.8646 KOps/s 12.1708 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_items_stack_nested_locked 0.8524ms 0.4097ms 2.4407 KOps/s 2.4309 KOps/s $\color{#35bf28}+0.40\%$
test_keys 21.6800μs 3.4776μs 287.5550 KOps/s 282.8066 KOps/s $\color{#35bf28}+1.68\%$
test_keys_nested 0.2673ms 0.1650ms 6.0605 KOps/s 5.9278 KOps/s $\color{#35bf28}+2.24\%$
test_keys_nested_locked 0.7157ms 0.1713ms 5.8381 KOps/s 5.6741 KOps/s $\color{#35bf28}+2.89\%$
test_keys_nested_leaf 1.5820ms 0.1437ms 6.9579 KOps/s 6.7412 KOps/s $\color{#35bf28}+3.22\%$
test_keys_stack_nested 0.3025ms 0.1657ms 6.0348 KOps/s 6.0176 KOps/s $\color{#35bf28}+0.29\%$
test_keys_stack_nested_leaf 0.2596ms 0.1443ms 6.9313 KOps/s 6.9857 KOps/s $\color{#d91a1a}-0.78\%$
test_keys_stack_nested_locked 0.3768ms 0.1718ms 5.8205 KOps/s 5.8190 KOps/s $\color{#35bf28}+0.03\%$
test_values 9.6040μs 1.0361μs 965.1129 KOps/s 945.1036 KOps/s $\color{#35bf28}+2.12\%$
test_values_nested 0.1315ms 62.4538μs 16.0118 KOps/s 15.7063 KOps/s $\color{#35bf28}+1.95\%$
test_values_nested_locked 0.1200ms 62.9714μs 15.8802 KOps/s 15.8867 KOps/s $\color{#d91a1a}-0.04\%$
test_values_nested_leaf 0.1293ms 72.3465μs 13.8224 KOps/s 13.6269 KOps/s $\color{#35bf28}+1.43\%$
test_values_stack_nested 0.1157ms 63.9879μs 15.6280 KOps/s 15.4913 KOps/s $\color{#35bf28}+0.88\%$
test_values_stack_nested_leaf 0.1374ms 72.7479μs 13.7461 KOps/s 13.8263 KOps/s $\color{#d91a1a}-0.58\%$
test_values_stack_nested_locked 0.1471ms 63.1650μs 15.8316 KOps/s 15.4866 KOps/s $\color{#35bf28}+2.23\%$
test_membership 40.1250μs 0.8970μs 1.1148 MOps/s 1.1453 MOps/s $\color{#d91a1a}-2.66\%$
test_membership_nested 21.9510μs 3.0274μs 330.3189 KOps/s 337.9826 KOps/s $\color{#d91a1a}-2.27\%$
test_membership_nested_leaf 42.0080μs 3.0863μs 324.0163 KOps/s 338.0672 KOps/s $\color{#d91a1a}-4.16\%$
test_membership_stacked_nested 20.5980μs 3.0273μs 330.3231 KOps/s 345.1090 KOps/s $\color{#d91a1a}-4.28\%$
test_membership_stacked_nested_leaf 33.4920μs 3.0426μs 328.6687 KOps/s 341.5634 KOps/s $\color{#d91a1a}-3.78\%$
test_membership_nested_last 26.5300μs 4.5087μs 221.7919 KOps/s 221.6974 KOps/s $\color{#35bf28}+0.04\%$
test_membership_nested_leaf_last 48.2100μs 4.5469μs 219.9301 KOps/s 224.2752 KOps/s $\color{#d91a1a}-1.94\%$
test_membership_stacked_nested_last 28.0530μs 5.3623μs 186.4878 KOps/s 74.3643 KOps/s $\textbf{\color{#35bf28}+150.78\%}$
test_membership_stacked_nested_leaf_last 47.8490μs 5.4287μs 184.2054 KOps/s 74.1934 KOps/s $\textbf{\color{#35bf28}+148.28\%}$
test_nested_getleaf 33.2720μs 10.8366μs 92.2797 KOps/s 90.8610 KOps/s $\color{#35bf28}+1.56\%$
test_nested_get 53.5000μs 10.1659μs 98.3682 KOps/s 98.2878 KOps/s $\color{#35bf28}+0.08\%$
test_stacked_getleaf 49.9830μs 10.9067μs 91.6866 KOps/s 90.1873 KOps/s $\color{#35bf28}+1.66\%$
test_stacked_get 32.9410μs 10.1343μs 98.6750 KOps/s 98.1655 KOps/s $\color{#35bf28}+0.52\%$
test_nested_getitemleaf 53.2090μs 11.2182μs 89.1412 KOps/s 87.1073 KOps/s $\color{#35bf28}+2.33\%$
test_nested_getitem 47.4190μs 10.4864μs 95.3618 KOps/s 94.1656 KOps/s $\color{#35bf28}+1.27\%$
test_stacked_getitemleaf 35.5460μs 11.2636μs 88.7818 KOps/s 88.3804 KOps/s $\color{#35bf28}+0.45\%$
test_stacked_getitem 52.5180μs 10.5692μs 94.6142 KOps/s 94.9250 KOps/s $\color{#d91a1a}-0.33\%$
test_lock_nested 1.8963ms 0.4594ms 2.1765 KOps/s 2.1828 KOps/s $\color{#d91a1a}-0.29\%$
test_lock_stack_nested 0.6311ms 0.4325ms 2.3120 KOps/s 2.3856 KOps/s $\color{#d91a1a}-3.09\%$
test_unlock_nested 0.7650ms 0.3779ms 2.6463 KOps/s 2.6428 KOps/s $\color{#35bf28}+0.14\%$
test_unlock_stack_nested 0.5448ms 0.3507ms 2.8515 KOps/s 2.9640 KOps/s $\color{#d91a1a}-3.80\%$
test_flatten_speed 0.1992ms 0.1009ms 9.9127 KOps/s 9.8801 KOps/s $\color{#35bf28}+0.33\%$
test_unflatten_speed 0.6909ms 0.5145ms 1.9438 KOps/s 1.8824 KOps/s $\color{#35bf28}+3.26\%$
test_common_ops 4.8465ms 0.8335ms 1.1998 KOps/s 1.2853 KOps/s $\textbf{\color{#d91a1a}-6.66\%}$
test_creation 17.2520μs 2.5566μs 391.1459 KOps/s 402.8764 KOps/s $\color{#d91a1a}-2.91\%$
test_creation_empty 30.1560μs 12.6736μs 78.9042 KOps/s 94.8327 KOps/s $\textbf{\color{#d91a1a}-16.80\%}$
test_creation_nested_1 73.7550μs 15.8245μs 63.1932 KOps/s 72.7796 KOps/s $\textbf{\color{#d91a1a}-13.17\%}$
test_creation_nested_2 51.4760μs 20.6130μs 48.5130 KOps/s 55.4954 KOps/s $\textbf{\color{#d91a1a}-12.58\%}$
test_clone 0.2141ms 14.0407μs 71.2217 KOps/s 72.5432 KOps/s $\color{#d91a1a}-1.82\%$
test_getitem[int] 1.3211ms 13.4720μs 74.2280 KOps/s 77.0792 KOps/s $\color{#d91a1a}-3.70\%$
test_getitem[slice_int] 0.1413ms 25.0608μs 39.9030 KOps/s 39.9383 KOps/s $\color{#d91a1a}-0.09\%$
test_getitem[range] 0.1647ms 48.6879μs 20.5390 KOps/s 20.2495 KOps/s $\color{#35bf28}+1.43\%$
test_getitem[tuple] 0.1374ms 21.0648μs 47.4725 KOps/s 47.9882 KOps/s $\color{#d91a1a}-1.07\%$
test_getitem[list] 0.4916ms 44.2388μs 22.6046 KOps/s 22.1583 KOps/s $\color{#35bf28}+2.01\%$
test_setitem_dim[int] 53.4400μs 25.7081μs 38.8983 KOps/s 37.1047 KOps/s $\color{#35bf28}+4.83\%$
test_setitem_dim[slice_int] 87.0220μs 52.9879μs 18.8722 KOps/s 18.6698 KOps/s $\color{#35bf28}+1.08\%$
test_setitem_dim[range] 0.1336ms 73.0143μs 13.6960 KOps/s 13.1681 KOps/s $\color{#35bf28}+4.01\%$
test_setitem_dim[tuple] 66.7840μs 41.8817μs 23.8768 KOps/s 23.4973 KOps/s $\color{#35bf28}+1.61\%$
test_setitem 0.3064ms 22.0978μs 45.2534 KOps/s 49.1714 KOps/s $\textbf{\color{#d91a1a}-7.97\%}$
test_set 0.1297ms 21.6325μs 46.2267 KOps/s 50.9307 KOps/s $\textbf{\color{#d91a1a}-9.24\%}$
test_set_shared 1.1903ms 0.1738ms 5.7533 KOps/s 5.8554 KOps/s $\color{#d91a1a}-1.74\%$
test_update 0.2539ms 24.6671μs 40.5399 KOps/s 46.4001 KOps/s $\textbf{\color{#d91a1a}-12.63\%}$
test_update_nested 0.2739ms 36.0650μs 27.7277 KOps/s 30.6573 KOps/s $\textbf{\color{#d91a1a}-9.56\%}$
test_update__nested 0.8165ms 35.9231μs 27.8372 KOps/s 28.3888 KOps/s $\color{#d91a1a}-1.94\%$
test_set_nested 0.2261ms 24.0307μs 41.6135 KOps/s 45.5799 KOps/s $\textbf{\color{#d91a1a}-8.70\%}$
test_set_nested_new 0.2838ms 28.9885μs 34.4965 KOps/s 37.7929 KOps/s $\textbf{\color{#d91a1a}-8.72\%}$
test_select 0.2242ms 46.6009μs 21.4588 KOps/s 22.8396 KOps/s $\textbf{\color{#d91a1a}-6.05\%}$
test_select_nested 0.1281ms 65.6162μs 15.2401 KOps/s 15.6161 KOps/s $\color{#d91a1a}-2.41\%$
test_exclude_nested 0.1616ms 86.2963μs 11.5880 KOps/s 11.9042 KOps/s $\color{#d91a1a}-2.66\%$
test_empty[True] 0.9426ms 0.4207ms 2.3772 KOps/s 2.3813 KOps/s $\color{#d91a1a}-0.17\%$
test_empty[False] 11.9072μs 1.4047μs 711.9060 KOps/s 688.9268 KOps/s $\color{#35bf28}+3.34\%$
test_unbind_speed 0.6054ms 0.2797ms 3.5757 KOps/s 3.6190 KOps/s $\color{#d91a1a}-1.20\%$
test_unbind_speed_stack0 0.4924ms 0.2769ms 3.6111 KOps/s 3.7888 KOps/s $\color{#d91a1a}-4.69\%$
test_unbind_speed_stack1 0.1140s 0.8291ms 1.2061 KOps/s 1.3998 KOps/s $\textbf{\color{#d91a1a}-13.83\%}$
test_split 0.1057s 1.7985ms 556.0139 Ops/s 566.3306 Ops/s $\color{#d91a1a}-1.82\%$
test_chunk 0.1017s 1.7990ms 555.8730 Ops/s 559.8401 Ops/s $\color{#d91a1a}-0.71\%$
test_consolidate_njt[False-None] 8.2465ms 7.9909ms 125.1427 Ops/s 119.2055 Ops/s $\color{#35bf28}+4.98\%$
test_creation[device0] 0.2220ms 89.5886μs 11.1621 KOps/s 10.8339 KOps/s $\color{#35bf28}+3.03\%$
test_creation_from_tensor 3.5658ms 93.7478μs 10.6669 KOps/s 10.4197 KOps/s $\color{#35bf28}+2.37\%$
test_add_one[memmap_tensor0] 0.2137ms 5.0410μs 198.3735 KOps/s 204.1577 KOps/s $\color{#d91a1a}-2.83\%$
test_contiguous[memmap_tensor0] 18.8050μs 0.5090μs 1.9648 MOps/s 1.9115 MOps/s $\color{#35bf28}+2.79\%$
test_stack[memmap_tensor0] 49.7030μs 3.6047μs 277.4139 KOps/s 280.5899 KOps/s $\color{#d91a1a}-1.13\%$
test_memmaptd_index 1.0275ms 0.2397ms 4.1714 KOps/s 3.9445 KOps/s $\textbf{\color{#35bf28}+5.75\%}$
test_memmaptd_index_astensor 0.8110ms 0.3254ms 3.0731 KOps/s 2.8908 KOps/s $\textbf{\color{#35bf28}+6.31\%}$
test_memmaptd_index_op 1.0184ms 0.6185ms 1.6168 KOps/s 1.6649 KOps/s $\color{#d91a1a}-2.89\%$
test_serialize_model 0.1276s 0.1164s 8.5899 Ops/s 8.6100 Ops/s $\color{#d91a1a}-0.23\%$
test_serialize_model_pickle 0.4622s 0.3842s 2.6030 Ops/s 2.5264 Ops/s $\color{#35bf28}+3.04\%$
test_serialize_weights 0.1254s 0.1139s 8.7783 Ops/s 8.7413 Ops/s $\color{#35bf28}+0.42\%$
test_serialize_weights_returnearly 0.3887s 0.2016s 4.9601 Ops/s 6.4815 Ops/s $\textbf{\color{#d91a1a}-23.47\%}$
test_serialize_weights_pickle 0.4960s 0.4070s 2.4567 Ops/s 2.5994 Ops/s $\textbf{\color{#d91a1a}-5.49\%}$
test_serialize_weights_filesystem 0.1477s 0.1438s 6.9559 Ops/s 7.0403 Ops/s $\color{#d91a1a}-1.20\%$
test_serialize_model_filesystem 0.1554s 0.1451s 6.8907 Ops/s 5.9441 Ops/s $\textbf{\color{#35bf28}+15.93\%}$
test_reshape_pytree 65.0110μs 27.3195μs 36.6039 KOps/s 36.7147 KOps/s $\color{#d91a1a}-0.30\%$
test_reshape_td 67.1150μs 34.0547μs 29.3646 KOps/s 29.9077 KOps/s $\color{#d91a1a}-1.82\%$
test_view_pytree 57.6780μs 27.4050μs 36.4897 KOps/s 36.6436 KOps/s $\color{#d91a1a}-0.42\%$
test_view_td 84.6280μs 39.1340μs 25.5532 KOps/s 25.8203 KOps/s $\color{#d91a1a}-1.03\%$
test_unbind_pytree 72.4250μs 30.5621μs 32.7203 KOps/s 33.3037 KOps/s $\color{#d91a1a}-1.75\%$
test_unbind_td 0.4043ms 41.5165μs 24.0868 KOps/s 24.4180 KOps/s $\color{#d91a1a}-1.36\%$
test_split_pytree 64.4710μs 30.0733μs 33.2521 KOps/s 33.2490 KOps/s $+0.01\%$
test_split_td 0.4883ms 45.9476μs 21.7639 KOps/s 20.9301 KOps/s $\color{#35bf28}+3.98\%$
test_add_pytree 87.7540μs 36.6934μs 27.2529 KOps/s 27.3551 KOps/s $\color{#d91a1a}-0.37\%$
test_add_td 0.1333ms 59.7681μs 16.7313 KOps/s 16.3256 KOps/s $\color{#35bf28}+2.49\%$
test_compile_add_one_nested[tensordict-compile] 0.1330ms 60.8345μs 16.4380 KOps/s 16.0861 KOps/s $\color{#35bf28}+2.19\%$
test_compile_add_one_nested[tensordict-eager] 0.3995ms 0.1678ms 5.9612 KOps/s 5.6539 KOps/s $\textbf{\color{#35bf28}+5.43\%}$
test_compile_add_one_nested[pytree-compile] 93.3240μs 45.7261μs 21.8694 KOps/s 21.6481 KOps/s $\color{#35bf28}+1.02\%$
test_compile_add_one_nested[pytree-eager] 0.2144ms 0.1197ms 8.3546 KOps/s 8.4411 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_copy_nested[tensordict-compile] 63.2180μs 26.0528μs 38.3837 KOps/s 39.3020 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_copy_nested[tensordict-eager] 0.1306ms 60.2865μs 16.5875 KOps/s 16.8916 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_copy_nested[pytree-compile] 0.1668ms 78.9877μs 12.6602 KOps/s 12.7229 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_copy_nested[pytree-eager] 0.1464ms 68.6359μs 14.5696 KOps/s 14.7662 KOps/s $\color{#d91a1a}-1.33\%$
test_compile_add_one_flat[tensordict-compile] 0.1840ms 0.1037ms 9.6410 KOps/s 9.5041 KOps/s $\color{#35bf28}+1.44\%$
test_compile_add_one_flat[tensordict-eager] 0.4328ms 0.2160ms 4.6295 KOps/s 4.5868 KOps/s $\color{#35bf28}+0.93\%$
test_compile_add_one_flat[tensorclass-compile] 89.4770μs 44.8765μs 22.2834 KOps/s 21.4385 KOps/s $\color{#35bf28}+3.94\%$
test_compile_add_one_flat[tensorclass-eager] 0.4986ms 65.0218μs 15.3794 KOps/s 15.3457 KOps/s $\color{#35bf28}+0.22\%$
test_compile_add_one_flat[pytree-compile] 0.1753ms 0.1018ms 9.8274 KOps/s 9.7596 KOps/s $\color{#35bf28}+0.70\%$
test_compile_add_one_flat[pytree-eager] 0.3670ms 0.2012ms 4.9712 KOps/s 4.9299 KOps/s $\color{#35bf28}+0.84\%$
test_compile_add_self_flat[tensordict-eager] 0.4572ms 0.2330ms 4.2918 KOps/s 4.1632 KOps/s $\color{#35bf28}+3.09\%$
test_compile_add_self_flat[tensordict-compile] 0.2215ms 0.1040ms 9.6158 KOps/s 9.6124 KOps/s $\color{#35bf28}+0.04\%$
test_compile_add_self_flat[tensorclass-eager] 0.1352ms 58.3631μs 17.1341 KOps/s 16.5823 KOps/s $\color{#35bf28}+3.33\%$
test_compile_add_self_flat[tensorclass-compile] 0.1156ms 46.2000μs 21.6450 KOps/s 22.3108 KOps/s $\color{#d91a1a}-2.98\%$
test_compile_add_self_flat[pytree-eager] 0.6446ms 0.1617ms 6.1849 KOps/s 6.3233 KOps/s $\color{#d91a1a}-2.19\%$
test_compile_add_self_flat[pytree-compile] 0.1841ms 0.1022ms 9.7828 KOps/s 9.7492 KOps/s $\color{#35bf28}+0.34\%$
test_compile_copy_flat[tensordict-compile] 65.8930μs 21.4507μs 46.6185 KOps/s 48.7525 KOps/s $\color{#d91a1a}-4.38\%$
test_compile_copy_flat[tensordict-eager] 0.1297ms 65.9643μs 15.1597 KOps/s 14.7932 KOps/s $\color{#35bf28}+2.48\%$
test_compile_copy_flat[pytree-compile] 0.1546ms 80.8002μs 12.3762 KOps/s 12.1997 KOps/s $\color{#35bf28}+1.45\%$
test_compile_copy_flat[pytree-eager] 0.1310ms 69.3193μs 14.4260 KOps/s 14.5074 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_assign_and_add[tensordict-compile] 0.3166ms 0.2037ms 4.9098 KOps/s 4.8771 KOps/s $\color{#35bf28}+0.67\%$
test_compile_assign_and_add[tensordict-eager] 2.7809ms 1.3415ms 745.4429 Ops/s 744.0523 Ops/s $\color{#35bf28}+0.19\%$
test_compile_assign_and_add[pytree-compile] 0.2930ms 0.2020ms 4.9507 KOps/s 5.0695 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_assign_and_add[pytree-eager] 0.9859ms 0.7766ms 1.2877 KOps/s 1.2838 KOps/s $\color{#35bf28}+0.30\%$
test_compile_assign_and_add_stack[compile] 0.6939ms 0.4479ms 2.2328 KOps/s 2.2505 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_assign_and_add_stack[eager] 2.9978ms 2.8288ms 353.5055 Ops/s 372.2595 Ops/s $\textbf{\color{#d91a1a}-5.04\%}$
test_compile_indexing[tensor-tensordict-compile] 0.1011ms 35.4355μs 28.2203 KOps/s 28.2461 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_indexing[tensor-tensordict-eager] 0.5251ms 33.0295μs 30.2760 KOps/s 29.1809 KOps/s $\color{#35bf28}+3.75\%$
test_compile_indexing[tensor-tensorclass-compile] 67.5660μs 29.4807μs 33.9205 KOps/s 35.1603 KOps/s $\color{#d91a1a}-3.53\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1008ms 23.9674μs 41.7234 KOps/s 41.1343 KOps/s $\color{#35bf28}+1.43\%$
test_compile_indexing[tensor-pytree-compile] 88.2750μs 29.5638μs 33.8252 KOps/s 33.3044 KOps/s $\color{#35bf28}+1.56\%$
test_compile_indexing[tensor-pytree-eager] 91.1700μs 23.9546μs 41.7456 KOps/s 41.7546 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_indexing[slice-tensordict-compile] 0.1060ms 50.7354μs 19.7101 KOps/s 19.3261 KOps/s $\color{#35bf28}+1.99\%$
test_compile_indexing[slice-tensordict-eager] 0.5543ms 20.0059μs 49.9853 KOps/s 48.4778 KOps/s $\color{#35bf28}+3.11\%$
test_compile_indexing[slice-tensorclass-compile] 0.1307ms 43.1478μs 23.1761 KOps/s 22.4887 KOps/s $\color{#35bf28}+3.06\%$
test_compile_indexing[slice-tensorclass-eager] 51.7360μs 18.8896μs 52.9393 KOps/s 51.6190 KOps/s $\color{#35bf28}+2.56\%$
test_compile_indexing[slice-pytree-compile] 0.1066ms 43.9833μs 22.7359 KOps/s 21.5248 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_compile_indexing[slice-pytree-eager] 64.6200μs 18.9923μs 52.6528 KOps/s 52.0121 KOps/s $\color{#35bf28}+1.23\%$
test_compile_indexing[int-tensordict-compile] 0.1203ms 52.1369μs 19.1803 KOps/s 18.8792 KOps/s $\color{#35bf28}+1.60\%$
test_compile_indexing[int-tensordict-eager] 0.9476ms 20.0672μs 49.8326 KOps/s 49.8208 KOps/s $\color{#35bf28}+0.02\%$
test_compile_indexing[int-tensorclass-compile] 0.1145ms 44.2137μs 22.6174 KOps/s 22.0388 KOps/s $\color{#35bf28}+2.63\%$
test_compile_indexing[int-tensorclass-eager] 65.2700μs 18.8838μs 52.9554 KOps/s 52.2682 KOps/s $\color{#35bf28}+1.31\%$
test_compile_indexing[int-pytree-compile] 0.1030ms 44.3701μs 22.5377 KOps/s 21.9309 KOps/s $\color{#35bf28}+2.77\%$
test_compile_indexing[int-pytree-eager] 62.0060μs 18.8388μs 53.0819 KOps/s 51.8444 KOps/s $\color{#35bf28}+2.39\%$
test_mod_add[eager] 83.4060μs 34.4392μs 29.0367 KOps/s 29.0540 KOps/s $\color{#d91a1a}-0.06\%$
test_mod_add[compile] 0.1113ms 47.4561μs 21.0721 KOps/s 20.9776 KOps/s $\color{#35bf28}+0.45\%$
test_mod_add[compile-overhead] 0.1094ms 46.4375μs 21.5343 KOps/s 20.4286 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_mod_wrap[eager] 0.4159ms 0.2222ms 4.5014 KOps/s 4.4494 KOps/s $\color{#35bf28}+1.17\%$
test_mod_wrap[compile] 0.3546ms 0.2026ms 4.9364 KOps/s 4.8384 KOps/s $\color{#35bf28}+2.03\%$
test_mod_wrap[compile-overhead] 0.2702ms 0.2026ms 4.9348 KOps/s 4.8850 KOps/s $\color{#35bf28}+1.02\%$
test_mod_wrap_and_backward[eager] 14.1846ms 11.6884ms 85.5548 Ops/s 80.0720 Ops/s $\textbf{\color{#35bf28}+6.85\%}$
test_mod_wrap_and_backward[compile] 17.7714ms 13.1284ms 76.1706 Ops/s 75.1930 Ops/s $\color{#35bf28}+1.30\%$
test_mod_wrap_and_backward[compile-overhead] 17.0207ms 12.1146ms 82.5451 Ops/s 80.5837 Ops/s $\color{#35bf28}+2.43\%$
test_seq_add[eager] 0.2137ms 0.1149ms 8.6999 KOps/s 8.5487 KOps/s $\color{#35bf28}+1.77\%$
test_seq_add[compile] 0.1208ms 59.2461μs 16.8788 KOps/s 16.1237 KOps/s $\color{#35bf28}+4.68\%$
test_seq_add[compile-overhead] 0.1531ms 58.3282μs 17.1444 KOps/s 16.2722 KOps/s $\textbf{\color{#35bf28}+5.36\%}$
test_seq_wrap[eager] 0.6804ms 0.4598ms 2.1749 KOps/s 2.2163 KOps/s $\color{#d91a1a}-1.87\%$
test_seq_wrap[compile] 0.3124ms 0.2232ms 4.4797 KOps/s 4.3046 KOps/s $\color{#35bf28}+4.07\%$
test_seq_wrap[compile-overhead] 0.4241ms 0.2261ms 4.4222 KOps/s 4.3589 KOps/s $\color{#35bf28}+1.45\%$
test_func_call_runtime[False-eager] 0.9408ms 0.5402ms 1.8511 KOps/s 1.8056 KOps/s $\color{#35bf28}+2.52\%$
test_func_call_runtime[False-compile] 0.5528ms 0.4234ms 2.3618 KOps/s 2.3348 KOps/s $\color{#35bf28}+1.15\%$
test_func_call_runtime[False-compile-overhead] 0.5933ms 0.4230ms 2.3643 KOps/s 2.3306 KOps/s $\color{#35bf28}+1.44\%$
test_func_call_runtime[True-eager] 1.2382ms 0.7580ms 1.3192 KOps/s 1.2845 KOps/s $\color{#35bf28}+2.70\%$
test_func_call_runtime[True-compile] 0.5751ms 0.4582ms 2.1823 KOps/s 2.1437 KOps/s $\color{#35bf28}+1.80\%$
test_func_call_runtime[True-compile-overhead] 0.9738ms 0.4686ms 2.1340 KOps/s 2.1400 KOps/s $\color{#d91a1a}-0.28\%$
test_func_call_cm_runtime[False-eager] 0.9763ms 0.5428ms 1.8424 KOps/s 1.8262 KOps/s $\color{#35bf28}+0.89\%$
test_func_call_cm_runtime[False-compile] 0.5575ms 0.4220ms 2.3699 KOps/s 2.3445 KOps/s $\color{#35bf28}+1.09\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5405ms 0.4253ms 2.3515 KOps/s 2.3417 KOps/s $\color{#35bf28}+0.42\%$
test_func_call_cm_runtime[True-eager] 1.0497ms 0.9130ms 1.0953 KOps/s 1.0927 KOps/s $\color{#35bf28}+0.24\%$
test_func_call_cm_runtime[True-compile] 0.6359ms 0.4879ms 2.0497 KOps/s 2.0350 KOps/s $\color{#35bf28}+0.72\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9129ms 0.4891ms 2.0448 KOps/s 2.0540 KOps/s $\color{#d91a1a}-0.45\%$
test_vmap_func_call_cm_runtime[eager] 2.8425ms 1.9079ms 524.1460 Ops/s 521.0721 Ops/s $\color{#35bf28}+0.59\%$
test_vmap_func_call_cm_runtime[compile] 0.8437ms 0.5117ms 1.9544 KOps/s 1.9093 KOps/s $\color{#35bf28}+2.36\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6957ms 0.5127ms 1.9506 KOps/s 1.9301 KOps/s $\color{#35bf28}+1.06\%$
test_distributed 0.2909ms 0.1253ms 7.9784 KOps/s 7.8208 KOps/s $\color{#35bf28}+2.02\%$
test_tdmodule 47.7500μs 25.4048μs 39.3627 KOps/s 37.7593 KOps/s $\color{#35bf28}+4.25\%$
test_tdmodule_dispatch 73.5880μs 48.3171μs 20.6966 KOps/s 20.9042 KOps/s $\color{#d91a1a}-0.99\%$
test_tdseq 47.9990μs 26.1682μs 38.2144 KOps/s 38.2082 KOps/s $\color{#35bf28}+0.02\%$
test_tdseq_dispatch 77.8760μs 50.8112μs 19.6807 KOps/s 20.0790 KOps/s $\color{#d91a1a}-1.98\%$
test_instantiation_functorch 1.7616ms 1.5115ms 661.5833 Ops/s 638.5717 Ops/s $\color{#35bf28}+3.60\%$
test_exec_functorch 0.7974ms 0.1832ms 5.4573 KOps/s 5.5852 KOps/s $\color{#d91a1a}-2.29\%$
test_exec_functional_call 0.3513ms 0.1717ms 5.8253 KOps/s 5.7878 KOps/s $\color{#35bf28}+0.65\%$
test_exec_td_decorator 0.4888ms 0.2299ms 4.3502 KOps/s 4.2835 KOps/s $\color{#35bf28}+1.56\%$
test_vmap_mlp_speed_decorator[True-True] 0.9915ms 0.6557ms 1.5251 KOps/s 1.5368 KOps/s $\color{#d91a1a}-0.76\%$
test_vmap_mlp_speed_decorator[True-False] 0.9140ms 0.6561ms 1.5241 KOps/s 1.5299 KOps/s $\color{#d91a1a}-0.38\%$
test_vmap_mlp_speed_decorator[False-True] 1.0345ms 0.5300ms 1.8868 KOps/s 1.8954 KOps/s $\color{#d91a1a}-0.46\%$
test_vmap_mlp_speed_decorator[False-False] 1.0791ms 0.5273ms 1.8965 KOps/s 1.8857 KOps/s $\color{#35bf28}+0.57\%$
test_to_module_speed[True] 2.1904ms 1.3530ms 739.0771 Ops/s 738.4737 Ops/s $\color{#35bf28}+0.08\%$
test_to_module_speed[False] 1.8023ms 1.3092ms 763.8455 Ops/s 756.5758 Ops/s $\color{#35bf28}+0.96\%$
test_tc_init 83.3050μs 47.7584μs 20.9387 KOps/s 21.5224 KOps/s $\color{#d91a1a}-2.71\%$
test_tc_init_nested 0.1763ms 98.0669μs 10.1971 KOps/s 10.6196 KOps/s $\color{#d91a1a}-3.98\%$
test_tc_first_layer_tensor 42.8600μs 1.5073μs 663.4343 KOps/s 648.0631 KOps/s $\color{#35bf28}+2.37\%$
test_tc_first_layer_nontensor 33.5330μs 4.6590μs 214.6401 KOps/s 208.2979 KOps/s $\color{#35bf28}+3.04\%$
test_tc_second_layer_tensor 33.0190μs 2.7441μs 364.4196 KOps/s 352.9297 KOps/s $\color{#35bf28}+3.26\%$
test_tc_second_layer_nontensor 28.9540μs 6.0074μs 166.4620 KOps/s 162.4829 KOps/s $\color{#35bf28}+2.45\%$
test_unbind 0.2132s 13.7627ms 72.6600 Ops/s 78.5207 Ops/s $\textbf{\color{#d91a1a}-7.46\%}$
test_full_like 8.9633ms 7.2674ms 137.6017 Ops/s 136.3050 Ops/s $\color{#35bf28}+0.95\%$
test_zeros_like 12.4368ms 7.5541ms 132.3778 Ops/s 359.9212 Ops/s $\textbf{\color{#d91a1a}-63.22\%}$
test_ones_like 10.7830ms 7.7576ms 128.9059 Ops/s 315.0060 Ops/s $\textbf{\color{#d91a1a}-59.08\%}$
test_clone 17.0024ms 9.7336ms 102.7372 Ops/s 199.2713 Ops/s $\textbf{\color{#d91a1a}-48.44\%}$
test_squeeze 57.7880μs 12.5087μs 79.9445 KOps/s 79.3434 KOps/s $\color{#35bf28}+0.76\%$
test_unsqueeze 0.1721ms 91.6715μs 10.9085 KOps/s 10.0343 KOps/s $\textbf{\color{#35bf28}+8.71\%}$
test_split 0.4953ms 0.1964ms 5.0925 KOps/s 5.0207 KOps/s $\color{#35bf28}+1.43\%$
test_permute 0.3569ms 0.2052ms 4.8742 KOps/s 4.6878 KOps/s $\color{#35bf28}+3.98\%$
test_stack 34.6889ms 25.2601ms 39.5881 Ops/s 39.8501 Ops/s $\color{#d91a1a}-0.66\%$
test_cat 30.5922ms 25.1086ms 39.8270 Ops/s 38.8148 Ops/s $\color{#35bf28}+2.61\%$
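
For readers checking the table by hand, the sketch below shows how the Change column and the Improved/Worsened counts appear to be derived. It is an illustration, not the benchmark bot's actual code: the function names are hypothetical, and the 5% cutoff is an assumption inferred from the rows above, where the bolded entries (12 green, 19 red) match the reported counts and all have a magnitude above 5%. The Ops column is consistent with the reciprocal of the Mean column.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


def ops_from_mean(mean_seconds: float) -> float:
    """Operations per second implied by a mean runtime in seconds."""
    return 1.0 / mean_seconds


# test_plain_set_nested (CPU): 47.0190 KOps/s vs 49.3471 KOps/s on HEAD
change = relative_change(47.0190, 49.3471)
print(round(change, 2), classify(change))  # -4.72 unchanged

# test_zeros_like (CPU): 132.3778 Ops/s vs 359.9212 Ops/s on HEAD
change = relative_change(132.3778, 359.9212)
print(round(change, 2), classify(change))  # -63.22 worsened
```

Both operands must be in the same unit (for example both in KOps/s) before computing the change, since the table mixes Ops/s, KOps/s, and MOps/s across rows.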


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}26$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 39.0210μs 13.1368μs 76.1223 KOps/s 78.5362 KOps/s $\color{#d91a1a}-3.07\%$
test_plain_set_stack_nested 39.9310μs 13.2767μs 75.3197 KOps/s 77.2398 KOps/s $\color{#d91a1a}-2.49\%$
test_plain_set_nested_inplace 67.4920μs 14.2499μs 70.1758 KOps/s 71.9214 KOps/s $\color{#d91a1a}-2.43\%$
test_plain_set_stack_nested_inplace 44.0110μs 14.3972μs 69.4579 KOps/s 72.0576 KOps/s $\color{#d91a1a}-3.61\%$
test_items 38.2300μs 2.8864μs 346.4478 KOps/s 345.4498 KOps/s $\color{#35bf28}+0.29\%$
test_items_nested 0.4159ms 0.3621ms 2.7616 KOps/s 2.7589 KOps/s $\color{#35bf28}+0.10\%$
test_items_nested_locked 0.4225ms 0.3647ms 2.7423 KOps/s 2.7520 KOps/s $\color{#d91a1a}-0.35\%$
test_items_nested_leaf 80.2310μs 57.9860μs 17.2455 KOps/s 17.1704 KOps/s $\color{#35bf28}+0.44\%$
test_items_stack_nested 0.4160ms 0.3636ms 2.7500 KOps/s 2.7413 KOps/s $\color{#35bf28}+0.32\%$
test_items_stack_nested_leaf 90.2920μs 58.9419μs 16.9658 KOps/s 16.7178 KOps/s $\color{#35bf28}+1.48\%$
test_items_stack_nested_locked 0.4168ms 0.3622ms 2.7610 KOps/s 2.7390 KOps/s $\color{#35bf28}+0.80\%$
test_keys 27.5000μs 3.4704μs 288.1521 KOps/s 289.8236 KOps/s $\color{#d91a1a}-0.58\%$
test_keys_nested 0.1249ms 81.4685μs 12.2747 KOps/s 12.2531 KOps/s $\color{#35bf28}+0.18\%$
test_keys_nested_locked 0.7460ms 87.1770μs 11.4709 KOps/s 11.4734 KOps/s $\color{#d91a1a}-0.02\%$
test_keys_nested_leaf 0.1079ms 71.9067μs 13.9069 KOps/s 13.8625 KOps/s $\color{#35bf28}+0.32\%$
test_keys_stack_nested 0.1193ms 81.6794μs 12.2430 KOps/s 12.0175 KOps/s $\color{#35bf28}+1.88\%$
test_keys_stack_nested_leaf 0.1078ms 72.9616μs 13.7058 KOps/s 13.5073 KOps/s $\color{#35bf28}+1.47\%$
test_keys_stack_nested_locked 0.1254ms 87.2869μs 11.4565 KOps/s 11.2424 KOps/s $\color{#35bf28}+1.90\%$
test_values 5.7802μs 0.8529μs 1.1724 MOps/s 1.1684 MOps/s $\color{#35bf28}+0.35\%$
test_values_nested 66.6310μs 34.9781μs 28.5894 KOps/s 29.3290 KOps/s $\color{#d91a1a}-2.52\%$
test_values_nested_locked 77.2020μs 36.1706μs 27.6468 KOps/s 28.0696 KOps/s $\color{#d91a1a}-1.51\%$
test_values_nested_leaf 69.0610μs 39.8218μs 25.1119 KOps/s 25.4402 KOps/s $\color{#d91a1a}-1.29\%$
test_values_stack_nested 68.1520μs 35.1500μs 28.4495 KOps/s 28.5073 KOps/s $\color{#d91a1a}-0.20\%$
test_values_stack_nested_leaf 91.3420μs 39.8444μs 25.0976 KOps/s 25.4924 KOps/s $\color{#d91a1a}-1.55\%$
test_values_stack_nested_locked 64.0310μs 36.3664μs 27.4979 KOps/s 27.6883 KOps/s $\color{#d91a1a}-0.69\%$
test_membership 2.1490μs 0.5214μs 1.9180 MOps/s 1.9786 MOps/s $\color{#d91a1a}-3.06\%$
test_membership_nested 22.6205μs 1.9670μs 508.3944 KOps/s 476.8773 KOps/s $\textbf{\color{#35bf28}+6.61\%}$
test_membership_nested_leaf 18.3905μs 1.9887μs 502.8528 KOps/s 495.6430 KOps/s $\color{#35bf28}+1.45\%$
test_membership_stacked_nested 39.4310μs 2.0818μs 480.3635 KOps/s 491.8342 KOps/s $\color{#d91a1a}-2.33\%$
test_membership_stacked_nested_leaf 30.2510μs 2.0371μs 490.9001 KOps/s 492.6623 KOps/s $\color{#d91a1a}-0.36\%$
test_membership_nested_last 40.1910μs 3.0753μs 325.1758 KOps/s 329.1931 KOps/s $\color{#d91a1a}-1.22\%$
test_membership_nested_leaf_last 31.4710μs 3.0414μs 328.7954 KOps/s 329.6375 KOps/s $\color{#d91a1a}-0.26\%$
test_membership_stacked_nested_last 35.2110μs 3.0894μs 323.6845 KOps/s 328.9488 KOps/s $\color{#d91a1a}-1.60\%$
test_membership_stacked_nested_leaf_last 39.8510μs 3.0432μs 328.6052 KOps/s 325.0405 KOps/s $\color{#35bf28}+1.10\%$
test_nested_getleaf 38.9110μs 6.1446μs 162.7434 KOps/s 162.4166 KOps/s $\color{#35bf28}+0.20\%$
test_nested_get 40.3110μs 5.7521μs 173.8505 KOps/s 169.4204 KOps/s $\color{#35bf28}+2.61\%$
test_stacked_getleaf 40.6500μs 6.1267μs 163.2190 KOps/s 162.7967 KOps/s $\color{#35bf28}+0.26\%$
test_stacked_get 51.1210μs 5.8516μs 170.8943 KOps/s 171.5759 KOps/s $\color{#d91a1a}-0.40\%$
test_nested_getitemleaf 38.8510μs 6.2935μs 158.8940 KOps/s 158.7326 KOps/s $\color{#35bf28}+0.10\%$
test_nested_getitem 41.2310μs 5.9778μs 167.2855 KOps/s 167.0766 KOps/s $\color{#35bf28}+0.13\%$
test_stacked_getitemleaf 43.6710μs 6.3465μs 157.5674 KOps/s 160.0806 KOps/s $\color{#d91a1a}-1.57\%$
test_stacked_getitem 49.7710μs 5.9711μs 167.4737 KOps/s 168.7500 KOps/s $\color{#d91a1a}-0.76\%$
test_lock_nested 0.8315ms 0.3744ms 2.6706 KOps/s 2.6603 KOps/s $\color{#35bf28}+0.39\%$
test_lock_stack_nested 0.4145ms 0.3511ms 2.8486 KOps/s 2.9262 KOps/s $\color{#d91a1a}-2.65\%$
test_unlock_nested 0.6575ms 0.3161ms 3.1637 KOps/s 3.2621 KOps/s $\color{#d91a1a}-3.02\%$
test_unlock_stack_nested 0.3422ms 0.2886ms 3.4648 KOps/s 3.5749 KOps/s $\color{#d91a1a}-3.08\%$
test_flatten_speed 0.1123ms 74.7777μs 13.3730 KOps/s 13.3524 KOps/s $\color{#35bf28}+0.15\%$
test_unflatten_speed 0.3766ms 0.3200ms 3.1249 KOps/s 3.1088 KOps/s $\color{#35bf28}+0.52\%$
test_common_ops 1.5365ms 0.6323ms 1.5815 KOps/s 1.6407 KOps/s $\color{#d91a1a}-3.61\%$
test_creation 0.1125ms 1.7211μs 581.0333 KOps/s 579.2806 KOps/s $\color{#35bf28}+0.30\%$
test_creation_empty 37.5810μs 9.8303μs 101.7260 KOps/s 106.6056 KOps/s $\color{#d91a1a}-4.58\%$
test_creation_nested_1 48.3500μs 11.4494μs 87.3409 KOps/s 91.7236 KOps/s $\color{#d91a1a}-4.78\%$
test_creation_nested_2 37.0200μs 14.1443μs 70.7001 KOps/s 73.9672 KOps/s $\color{#d91a1a}-4.42\%$
test_clone 0.1201ms 10.5226μs 95.0339 KOps/s 98.1424 KOps/s $\color{#d91a1a}-3.17\%$
test_getitem[int] 1.8181ms 10.8425μs 92.2293 KOps/s 93.6355 KOps/s $\color{#d91a1a}-1.50\%$
test_getitem[slice_int] 0.1094ms 21.9156μs 45.6295 KOps/s 49.2116 KOps/s $\textbf{\color{#d91a1a}-7.28\%}$
test_getitem[range] 0.1353ms 39.6526μs 25.2190 KOps/s 27.9139 KOps/s $\textbf{\color{#d91a1a}-9.65\%}$
test_getitem[tuple] 0.1126ms 18.9765μs 52.6969 KOps/s 54.7156 KOps/s $\color{#d91a1a}-3.69\%$
test_getitem[list] 0.1257ms 33.8624μs 29.5313 KOps/s 30.7141 KOps/s $\color{#d91a1a}-3.85\%$
test_setitem_dim[int] 30.5010μs 19.3136μs 51.7769 KOps/s 57.3985 KOps/s $\textbf{\color{#d91a1a}-9.79\%}$
test_setitem_dim[slice_int] 75.2810μs 38.4953μs 25.9772 KOps/s 27.6449 KOps/s $\textbf{\color{#d91a1a}-6.03\%}$
test_setitem_dim[range] 80.7820μs 54.6874μs 18.2857 KOps/s 19.8046 KOps/s $\textbf{\color{#d91a1a}-7.67\%}$
test_setitem_dim[tuple] 54.1320μs 32.7763μs 30.5098 KOps/s 31.3742 KOps/s $\color{#d91a1a}-2.75\%$
test_setitem 51.5010μs 16.0629μs 62.2552 KOps/s 65.7967 KOps/s $\textbf{\color{#d91a1a}-5.38\%}$
test_set 0.1234ms 15.5228μs 64.4213 KOps/s 68.0955 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_set_shared 1.6327ms 0.1487ms 6.7237 KOps/s 6.7540 KOps/s $\color{#d91a1a}-0.45\%$
test_update 0.5557ms 19.3580μs 51.6582 KOps/s 56.1929 KOps/s $\textbf{\color{#d91a1a}-8.07\%}$
test_update_nested 0.1244ms 24.7366μs 40.4260 KOps/s 42.5051 KOps/s $\color{#d91a1a}-4.89\%$
test_update__nested 0.9100ms 25.5775μs 39.0968 KOps/s 42.5405 KOps/s $\textbf{\color{#d91a1a}-8.10\%}$
test_set_nested 0.1325ms 16.6832μs 59.9404 KOps/s 63.4724 KOps/s $\textbf{\color{#d91a1a}-5.56\%}$
test_set_nested_new 0.1223ms 19.2677μs 51.9002 KOps/s 55.0538 KOps/s $\textbf{\color{#d91a1a}-5.73\%}$
test_select 0.1354ms 31.8629μs 31.3845 KOps/s 33.9238 KOps/s $\textbf{\color{#d91a1a}-7.49\%}$
test_select_nested 73.9620μs 43.8381μs 22.8112 KOps/s 23.2607 KOps/s $\color{#d91a1a}-1.93\%$
test_exclude_nested 0.1031ms 62.8792μs 15.9035 KOps/s 15.8571 KOps/s $\color{#35bf28}+0.29\%$
test_empty[True] 0.3558ms 0.2877ms 3.4754 KOps/s 3.4785 KOps/s $\color{#d91a1a}-0.09\%$
test_empty[False] 3.7731μs 0.8273μs 1.2088 MOps/s 1.1981 MOps/s $\color{#35bf28}+0.89\%$
test_to 82.7920μs 55.5583μs 17.9991 KOps/s 17.8567 KOps/s $\color{#35bf28}+0.80\%$
test_to_nonblocking 0.9683ms 47.8102μs 20.9160 KOps/s 21.3313 KOps/s $\color{#d91a1a}-1.95\%$
test_unbind_speed 1.8433ms 0.2358ms 4.2407 KOps/s 4.2753 KOps/s $\color{#d91a1a}-0.81\%$
test_unbind_speed_stack0 0.3617ms 0.2401ms 4.1656 KOps/s 4.2621 KOps/s $\color{#d91a1a}-2.26\%$
test_unbind_speed_stack1 91.7154ms 0.6702ms 1.4921 KOps/s 1.4973 KOps/s $\color{#d91a1a}-0.34\%$
test_split 92.0432ms 1.6179ms 618.0773 Ops/s 637.8002 Ops/s $\color{#d91a1a}-3.09\%$
test_chunk 94.5223ms 1.7670ms 565.9293 Ops/s 582.6468 Ops/s $\color{#d91a1a}-2.87\%$
test_consolidate[False-None] 2.9947ms 2.6742ms 373.9450 Ops/s 381.1775 Ops/s $\color{#d91a1a}-1.90\%$
test_consolidate[default-None] 1.8384ms 1.7251ms 579.6824 Ops/s 590.2473 Ops/s $\color{#d91a1a}-1.79\%$
test_consolidate[reduce-overhead-None] 1.9102ms 1.7683ms 565.5114 Ops/s 577.1500 Ops/s $\color{#d91a1a}-2.02\%$
test_consolidate_njt[False-None] 6.7341ms 6.5582ms 152.4805 Ops/s 154.2526 Ops/s $\color{#d91a1a}-1.15\%$
test_to[False-False-None] 1.8064ms 1.7104ms 584.6473 Ops/s 579.3490 Ops/s $\color{#35bf28}+0.91\%$
test_to[True-False-None] 1.5575ms 1.3296ms 752.0819 Ops/s 763.1031 Ops/s $\color{#d91a1a}-1.44\%$
test_to[within-False-None] 4.4892ms 4.1858ms 238.9034 Ops/s 245.9609 Ops/s $\color{#d91a1a}-2.87\%$
test_to[True-default-None] 5.5606ms 5.2929ms 188.9316 Ops/s 178.9869 Ops/s $\textbf{\color{#35bf28}+5.56\%}$
test_to_njt[False-False-None] 7.0154ms 6.8603ms 145.7653 Ops/s 145.8504 Ops/s $\color{#d91a1a}-0.06\%$
test_to_njt[True-False-None] 5.8832ms 5.4582ms 183.2112 Ops/s 185.0032 Ops/s $\color{#d91a1a}-0.97\%$
test_to_njt[within-False-None] 12.4625ms 12.1718ms 82.1571 Ops/s 82.7965 Ops/s $\color{#d91a1a}-0.77\%$
test_creation[device0] 0.7147ms 78.6515μs 12.7143 KOps/s 12.7696 KOps/s $\color{#d91a1a}-0.43\%$
test_creation_from_tensor 0.6045ms 82.6553μs 12.0984 KOps/s 12.0795 KOps/s $\color{#35bf28}+0.16\%$
test_add_one[memmap_tensor0] 0.4753ms 6.3516μs 157.4414 KOps/s 163.6476 KOps/s $\color{#d91a1a}-3.79\%$
test_contiguous[memmap_tensor0] 1.8505μs 0.4071μs 2.4564 MOps/s 2.5096 MOps/s $\color{#d91a1a}-2.12\%$
test_stack[memmap_tensor0] 46.5310μs 4.6191μs 216.4909 KOps/s 223.9520 KOps/s $\color{#d91a1a}-3.33\%$
test_memmaptd_index 1.8130ms 0.2640ms 3.7873 KOps/s 4.0084 KOps/s $\textbf{\color{#d91a1a}-5.51\%}$
test_memmaptd_index_astensor 0.9737ms 0.3306ms 3.0248 KOps/s 3.2014 KOps/s $\textbf{\color{#d91a1a}-5.52\%}$
test_memmaptd_index_op 1.0555ms 0.6106ms 1.6376 KOps/s 1.7122 KOps/s $\color{#d91a1a}-4.36\%$
test_serialize_model 0.1309s 0.1298s 7.7022 Ops/s 7.6661 Ops/s $\color{#35bf28}+0.47\%$
test_serialize_model_pickle 1.3494s 1.2152s 0.8229 Ops/s 0.8222 Ops/s $\color{#35bf28}+0.08\%$
test_serialize_weights 0.4113s 0.1697s 5.8914 Ops/s 7.6890 Ops/s $\textbf{\color{#d91a1a}-23.38\%}$
test_serialize_weights_returnearly 0.3140s 53.5523ms 18.6733 Ops/s 14.7766 Ops/s $\textbf{\color{#35bf28}+26.37\%}$
test_serialize_weights_pickle 1.3729s 1.2155s 0.8227 Ops/s 0.8222 Ops/s $\color{#35bf28}+0.06\%$
test_reshape_pytree 83.2420μs 22.0522μs 45.3469 KOps/s 45.9656 KOps/s $\color{#d91a1a}-1.35\%$
test_reshape_td 52.5510μs 26.5055μs 37.7280 KOps/s 36.5794 KOps/s $\color{#35bf28}+3.14\%$
test_view_pytree 47.8710μs 22.1203μs 45.2074 KOps/s 46.6570 KOps/s $\color{#d91a1a}-3.11\%$
test_view_td 60.3610μs 30.5680μs 32.7140 KOps/s 33.0339 KOps/s $\color{#d91a1a}-0.97\%$
test_unbind_pytree 55.4810μs 28.1236μs 35.5573 KOps/s 36.7003 KOps/s $\color{#d91a1a}-3.11\%$
test_unbind_td 0.8392ms 36.3838μs 27.4848 KOps/s 27.6954 KOps/s $\color{#d91a1a}-0.76\%$
test_split_pytree 61.4110μs 30.8405μs 32.4249 KOps/s 33.9817 KOps/s $\color{#d91a1a}-4.58\%$
test_split_td 1.0917ms 39.6647μs 25.2113 KOps/s 27.0273 KOps/s $\textbf{\color{#d91a1a}-6.72\%}$
test_add_pytree 63.2710μs 33.7783μs 29.6048 KOps/s 31.1241 KOps/s $\color{#d91a1a}-4.88\%$
test_add_td 0.1872ms 50.2621μs 19.8957 KOps/s 21.3675 KOps/s $\textbf{\color{#d91a1a}-6.89\%}$
test_compile_add_one_nested[tensordict-compile] 0.1720ms 0.1192ms 8.3881 KOps/s 8.3989 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_add_one_nested[tensordict-eager] 0.2159ms 0.1281ms 7.8048 KOps/s 7.6568 KOps/s $\color{#35bf28}+1.93\%$
test_compile_add_one_nested[pytree-compile] 0.1584ms 94.5265μs 10.5790 KOps/s 10.6926 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_add_one_nested[pytree-eager] 1.3785ms 0.1493ms 6.6959 KOps/s 6.6505 KOps/s $\color{#35bf28}+0.68\%$
test_compile_copy_nested[tensordict-compile] 52.3410μs 22.8747μs 43.7163 KOps/s 47.0566 KOps/s $\textbf{\color{#d91a1a}-7.10\%}$
test_compile_copy_nested[tensordict-eager] 55.2320μs 29.3456μs 34.0766 KOps/s 34.0377 KOps/s $\color{#35bf28}+0.11\%$
test_compile_copy_nested[pytree-compile] 99.4320μs 64.1058μs 15.5992 KOps/s 15.4898 KOps/s $\color{#35bf28}+0.71\%$
test_compile_copy_nested[pytree-eager] 79.5310μs 48.7973μs 20.4929 KOps/s 20.1268 KOps/s $\color{#35bf28}+1.82\%$
test_compile_add_one_flat[tensordict-compile] 0.1812ms 0.1406ms 7.1146 KOps/s 7.3294 KOps/s $\color{#d91a1a}-2.93\%$
test_compile_add_one_flat[tensordict-eager] 0.3117ms 0.2149ms 4.6534 KOps/s 4.6823 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_add_one_flat[tensorclass-compile] 0.1562ms 99.9890μs 10.0011 KOps/s 10.5223 KOps/s $\color{#d91a1a}-4.95\%$
test_compile_add_one_flat[tensorclass-eager] 0.1142ms 54.0755μs 18.4927 KOps/s 18.0204 KOps/s $\color{#35bf28}+2.62\%$
test_compile_add_one_flat[pytree-compile] 0.1850ms 0.1369ms 7.3036 KOps/s 7.4143 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_add_one_flat[pytree-eager] 0.6008ms 0.4867ms 2.0548 KOps/s 2.0688 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_add_self_flat[tensordict-eager] 0.3722ms 0.2583ms 3.8713 KOps/s 3.8739 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_add_self_flat[tensordict-compile] 0.2100ms 0.1463ms 6.8335 KOps/s 7.1968 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1622ms 64.4133μs 15.5247 KOps/s 15.4355 KOps/s $\color{#35bf28}+0.58\%$
test_compile_add_self_flat[tensorclass-compile] 0.1482ms 99.9866μs 10.0013 KOps/s 10.0682 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_add_self_flat[pytree-eager] 0.5094ms 0.4159ms 2.4044 KOps/s 2.4484 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_add_self_flat[pytree-compile] 0.1836ms 0.1348ms 7.4206 KOps/s 7.2278 KOps/s $\color{#35bf28}+2.67\%$
test_compile_copy_flat[tensordict-compile] 51.4510μs 18.7097μs 53.4482 KOps/s 56.3523 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_compile_copy_flat[tensordict-eager] 58.8610μs 30.5439μs 32.7398 KOps/s 32.1659 KOps/s $\color{#35bf28}+1.78\%$
test_compile_copy_flat[pytree-compile] 0.1772ms 70.0578μs 14.2739 KOps/s 14.4246 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_copy_flat[pytree-eager] 80.8420μs 51.1734μs 19.5414 KOps/s 19.4725 KOps/s $\color{#35bf28}+0.35\%$
test_compile_assign_and_add[tensordict-compile] 1.7169ms 0.4200ms 2.3809 KOps/s 2.2721 KOps/s $\color{#35bf28}+4.79\%$
test_compile_assign_and_add[tensordict-eager] 2.9484ms 2.6414ms 378.5803 Ops/s 385.0892 Ops/s $\color{#d91a1a}-1.69\%$
test_compile_assign_and_add[pytree-compile] 1.5727ms 0.4272ms 2.3406 KOps/s 2.3157 KOps/s $\color{#35bf28}+1.07\%$
test_compile_assign_and_add[pytree-eager] 3.0413ms 2.7183ms 367.8785 Ops/s 383.8057 Ops/s $\color{#d91a1a}-4.15\%$
test_compile_indexing[tensor-tensordict-compile] 0.2319ms 0.1196ms 8.3645 KOps/s 8.8019 KOps/s $\color{#d91a1a}-4.97\%$
test_compile_indexing[tensor-tensordict-eager] 0.5477ms 77.2885μs 12.9385 KOps/s 12.6805 KOps/s $\color{#35bf28}+2.04\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1666ms 0.1049ms 9.5292 KOps/s 9.5207 KOps/s $\color{#35bf28}+0.09\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1092ms 66.6579μs 15.0020 KOps/s 14.8689 KOps/s $\color{#35bf28}+0.89\%$
test_compile_indexing[tensor-pytree-compile] 0.1630ms 0.1055ms 9.4819 KOps/s 9.4286 KOps/s $\color{#35bf28}+0.56\%$
test_compile_indexing[tensor-pytree-eager] 0.1485ms 70.3738μs 14.2098 KOps/s 14.8917 KOps/s $\color{#d91a1a}-4.58\%$
test_compile_indexing[slice-tensordict-compile] 0.1490ms 0.1034ms 9.6756 KOps/s 9.7133 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_indexing[slice-tensordict-eager] 0.1487ms 17.3434μs 57.6589 KOps/s 56.6672 KOps/s $\color{#35bf28}+1.75\%$
test_compile_indexing[slice-tensorclass-compile] 0.1455ms 0.1006ms 9.9408 KOps/s 10.3922 KOps/s $\color{#d91a1a}-4.34\%$
test_compile_indexing[slice-tensorclass-eager] 42.7700μs 15.6720μs 63.8080 KOps/s 64.7889 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_indexing[slice-pytree-compile] 0.1528ms 0.1014ms 9.8639 KOps/s 10.1063 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_indexing[slice-pytree-eager] 79.1920μs 15.4517μs 64.7178 KOps/s 65.1692 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_indexing[int-tensordict-compile] 0.1516ms 0.1005ms 9.9466 KOps/s 9.6231 KOps/s $\color{#35bf28}+3.36\%$
test_compile_indexing[int-tensordict-eager] 0.5634ms 16.8814μs 59.2367 KOps/s 59.9622 KOps/s $\color{#d91a1a}-1.21\%$
test_compile_indexing[int-tensorclass-compile] 0.1391ms 96.7940μs 10.3312 KOps/s 10.2800 KOps/s $\color{#35bf28}+0.50\%$
test_compile_indexing[int-tensorclass-eager] 49.8710μs 15.6690μs 63.8201 KOps/s 65.0500 KOps/s $\color{#d91a1a}-1.89\%$
test_compile_indexing[int-pytree-compile] 0.1486ms 96.8750μs 10.3226 KOps/s 10.2172 KOps/s $\color{#35bf28}+1.03\%$
test_compile_indexing[int-pytree-eager] 41.7310μs 15.7740μs 63.3955 KOps/s 65.2235 KOps/s $\color{#d91a1a}-2.80\%$
test_mod_add[eager] 0.1163ms 38.3374μs 26.0842 KOps/s 26.0311 KOps/s $\color{#35bf28}+0.20\%$
test_mod_add[compile] 0.1296ms 81.3895μs 12.2866 KOps/s 12.7114 KOps/s $\color{#d91a1a}-3.34\%$
test_mod_add[compile-overhead] 0.3247ms 0.1766ms 5.6632 KOps/s 5.8441 KOps/s $\color{#d91a1a}-3.10\%$
test_mod_wrap[eager] 0.3313ms 0.2540ms 3.9368 KOps/s 4.0739 KOps/s $\color{#d91a1a}-3.36\%$
test_mod_wrap[compile] 0.3481ms 0.2797ms 3.5750 KOps/s 3.5349 KOps/s $\color{#35bf28}+1.14\%$
test_mod_wrap[compile-overhead] 7.1096ms 3.7643ms 265.6540 Ops/s 266.9544 Ops/s $\color{#d91a1a}-0.49\%$
test_mod_wrap_and_backward[eager] 1.4252ms 1.3200ms 757.5848 Ops/s 704.3921 Ops/s $\textbf{\color{#35bf28}+7.55\%}$
test_mod_wrap_and_backward[compile] 1.6854ms 1.2507ms 799.5617 Ops/s 727.4414 Ops/s $\textbf{\color{#35bf28}+9.91\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3743ms 0.9219ms 1.0847 KOps/s 980.2974 Ops/s $\textbf{\color{#35bf28}+10.65\%}$
test_seq_add[eager] 0.5212ms 0.1147ms 8.7165 KOps/s 8.6547 KOps/s $\color{#35bf28}+0.71\%$
test_seq_add[compile] 0.4794ms 88.2753μs 11.3282 KOps/s 11.4213 KOps/s $\color{#d91a1a}-0.82\%$
test_seq_add[compile-overhead] 0.5181ms 0.1264ms 7.9112 KOps/s 7.8903 KOps/s $\color{#35bf28}+0.27\%$
test_seq_wrap[eager] 0.7927ms 0.4153ms 2.4081 KOps/s 2.4022 KOps/s $\color{#35bf28}+0.24\%$
test_seq_wrap[compile] 0.7395ms 0.3056ms 3.2718 KOps/s 3.3311 KOps/s $\color{#d91a1a}-1.78\%$
test_seq_wrap[compile-overhead] 0.6157ms 0.2205ms 4.5353 KOps/s 4.5601 KOps/s $\color{#d91a1a}-0.54\%$
test_func_call_runtime[False-eager] 1.1031ms 0.7067ms 1.4150 KOps/s 1.3494 KOps/s $\color{#35bf28}+4.87\%$
test_func_call_runtime[False-compile] 1.1579ms 0.7372ms 1.3565 KOps/s 1.2864 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_func_call_runtime[False-compile-overhead] 0.4381ms 0.3564ms 2.8057 KOps/s 2.8238 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_runtime[True-eager] 1.2841ms 0.8734ms 1.1450 KOps/s 1.1379 KOps/s $\color{#35bf28}+0.62\%$
test_func_call_runtime[True-compile] 1.1619ms 0.7622ms 1.3121 KOps/s 1.2974 KOps/s $\color{#35bf28}+1.13\%$
test_func_call_runtime[True-compile-overhead] 0.7925ms 0.3794ms 2.6356 KOps/s 2.6552 KOps/s $\color{#d91a1a}-0.74\%$
test_func_call_cm_runtime[False-eager] 1.1240ms 0.7082ms 1.4120 KOps/s 1.3864 KOps/s $\color{#35bf28}+1.85\%$
test_func_call_cm_runtime[False-compile] 1.1291ms 0.7392ms 1.3529 KOps/s 1.3283 KOps/s $\color{#35bf28}+1.85\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4265ms 0.3605ms 2.7738 KOps/s 2.7839 KOps/s $\color{#d91a1a}-0.36\%$
test_func_call_cm_runtime[True-eager] 1.3810ms 0.9860ms 1.0142 KOps/s 1.0059 KOps/s $\color{#35bf28}+0.82\%$
test_func_call_cm_runtime[True-compile] 1.1214ms 0.8141ms 1.2283 KOps/s 1.2596 KOps/s $\color{#d91a1a}-2.49\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4626ms 0.4037ms 2.4772 KOps/s 2.4599 KOps/s $\color{#35bf28}+0.70\%$
test_vmap_func_call_cm_runtime[eager] 2.5309ms 2.0057ms 498.5880 Ops/s 493.0859 Ops/s $\color{#35bf28}+1.12\%$
test_vmap_func_call_cm_runtime[compile] 0.9003ms 0.7949ms 1.2581 KOps/s 1.2303 KOps/s $\color{#35bf28}+2.26\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4657ms 0.4048ms 2.4701 KOps/s 2.4597 KOps/s $\color{#35bf28}+0.42\%$
test_distributed 2.8632ms 0.3489ms 2.8660 KOps/s 8.1271 KOps/s $\textbf{\color{#d91a1a}-64.73\%}$
test_tdmodule 77.7520μs 20.0186μs 49.9537 KOps/s 47.6033 KOps/s $\color{#35bf28}+4.94\%$
test_tdmodule_dispatch 64.9110μs 37.0159μs 27.0154 KOps/s 26.9646 KOps/s $\color{#35bf28}+0.19\%$
test_tdseq 41.7810μs 20.1526μs 49.6214 KOps/s 48.9263 KOps/s $\color{#35bf28}+1.42\%$
test_tdseq_dispatch 57.0920μs 38.5958μs 25.9096 KOps/s 25.5012 KOps/s $\color{#35bf28}+1.60\%$
test_instantiation_functorch 1.6411ms 1.5250ms 655.7187 Ops/s 651.9999 Ops/s $\color{#35bf28}+0.57\%$
test_exec_functorch 0.2093ms 0.1400ms 7.1426 KOps/s 7.1566 KOps/s $\color{#d91a1a}-0.20\%$
test_exec_functional_call 0.1834ms 0.1315ms 7.6066 KOps/s 7.4481 KOps/s $\color{#35bf28}+2.13\%$
test_exec_td_decorator 0.3654ms 0.1798ms 5.5631 KOps/s 5.5375 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed_decorator[True-True] 0.8534ms 0.6790ms 1.4727 KOps/s 1.4529 KOps/s $\color{#35bf28}+1.36\%$
test_vmap_mlp_speed_decorator[True-False] 0.8872ms 0.6908ms 1.4475 KOps/s 1.4565 KOps/s $\color{#d91a1a}-0.61\%$
test_vmap_mlp_speed_decorator[False-True] 0.7238ms 0.6006ms 1.6650 KOps/s 1.7314 KOps/s $\color{#d91a1a}-3.83\%$
test_vmap_mlp_speed_decorator[False-False] 0.7372ms 0.6030ms 1.6584 KOps/s 1.7306 KOps/s $\color{#d91a1a}-4.17\%$
test_vmap_transformer_speed_decorator[True-True] 19.9346ms 18.9257ms 52.8381 Ops/s 53.6828 Ops/s $\color{#d91a1a}-1.57\%$
test_vmap_transformer_speed_decorator[True-False] 19.3842ms 18.6755ms 53.5461 Ops/s 53.4498 Ops/s $\color{#35bf28}+0.18\%$
test_vmap_transformer_speed_decorator[False-True] 18.7925ms 18.5658ms 53.8624 Ops/s 54.1120 Ops/s $\color{#d91a1a}-0.46\%$
test_vmap_transformer_speed_decorator[False-False] 18.7233ms 18.5171ms 54.0042 Ops/s 54.1277 Ops/s $\color{#d91a1a}-0.23\%$
test_to_module_speed[True] 1.0569ms 0.9596ms 1.0421 KOps/s 1.0276 KOps/s $\color{#35bf28}+1.42\%$
test_to_module_speed[False] 1.3282ms 0.9437ms 1.0597 KOps/s 1.0520 KOps/s $\color{#35bf28}+0.73\%$
test_tc_init 72.0710μs 39.0386μs 25.6157 KOps/s 26.5280 KOps/s $\color{#d91a1a}-3.44\%$
test_tc_init_nested 0.1192ms 79.6364μs 12.5571 KOps/s 13.5484 KOps/s $\textbf{\color{#d91a1a}-7.32\%}$
test_tc_first_layer_tensor 26.7110μs 0.8054μs 1.2417 MOps/s 1.4419 MOps/s $\textbf{\color{#d91a1a}-13.89\%}$
test_tc_first_layer_nontensor 25.4710μs 2.2934μs 436.0337 KOps/s 435.9161 KOps/s $\color{#35bf28}+0.03\%$
test_tc_second_layer_tensor 36.2282μs 1.4101μs 709.1483 KOps/s 714.0697 KOps/s $\color{#d91a1a}-0.69\%$
test_tc_second_layer_nontensor 26.2710μs 3.0424μs 328.6830 KOps/s 332.5238 KOps/s $\color{#d91a1a}-1.16\%$
test_unbind 0.2227s 10.3147ms 96.9491 Ops/s 144.8519 Ops/s $\textbf{\color{#d91a1a}-33.07\%}$
test_full_like 10.0962ms 9.1059ms 109.8190 Ops/s 110.2935 Ops/s $\color{#d91a1a}-0.43\%$
test_zeros_like 9.1652ms 7.1783ms 139.3083 Ops/s 235.5167 Ops/s $\textbf{\color{#d91a1a}-40.85\%}$
test_ones_like 4.4809ms 4.3037ms 232.3560 Ops/s 237.9432 Ops/s $\color{#d91a1a}-2.35\%$
test_clone 13.6095ms 10.3209ms 96.8906 Ops/s 111.1751 Ops/s $\textbf{\color{#d91a1a}-12.85\%}$
test_squeeze 57.4710μs 9.5322μs 104.9075 KOps/s 107.1155 KOps/s $\color{#d91a1a}-2.06\%$
test_unsqueeze 0.2117ms 72.5303μs 13.7873 KOps/s 13.5943 KOps/s $\color{#35bf28}+1.42\%$
test_split 0.2669ms 0.1586ms 6.3034 KOps/s 6.1041 KOps/s $\color{#35bf28}+3.26\%$
test_permute 0.2598ms 0.1770ms 5.6482 KOps/s 5.4102 KOps/s $\color{#35bf28}+4.40\%$
test_stack 50.6017ms 50.2868ms 19.8859 Ops/s 19.9289 Ops/s $\color{#d91a1a}-0.22\%$
test_cat 50.4578ms 50.1391ms 19.9445 Ops/s 19.8803 Ops/s $\color{#35bf28}+0.32\%$
