[Feature] Add __abs__ docstrings, __neg__, __rxor__, __ror__, __invert__, __and__, __rand__, __radd__, __rtruediv__, __rmul__, __rsub__, __rpow__, bitwise_and, logical_and #1154

Merged
vmoens merged 1 commit into gh/vmoens/40/base from gh/vmoens/40/head on Dec 20, 2024

[ghstack-poisoned]
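
For readers unfamiliar with the reflected ("r"-prefixed) operators named in the title, here is a minimal pure-Python sketch of how they dispatch. `LeafMap` is a hypothetical stand-in that applies each operator to every value of a plain dict. It is not the tensordict implementation, and the method bodies are assumptions made purely for illustration.

```python
import operator


class LeafMap:
    """Hypothetical dict-like container; every operator is applied leaf by leaf."""

    def __init__(self, **leaves):
        self.leaves = dict(leaves)

    def _unary(self, fn):
        return LeafMap(**{k: fn(v) for k, v in self.leaves.items()})

    def _binary(self, fn, other, reflected=False):
        out = {}
        for key, value in self.leaves.items():
            rhs = other.leaves[key] if isinstance(other, LeafMap) else other
            # Reflected operators put the foreign operand on the left.
            out[key] = fn(rhs, value) if reflected else fn(value, rhs)
        return LeafMap(**out)

    # Unary operators.
    def __abs__(self):
        return self._unary(operator.abs)

    def __neg__(self):
        return self._unary(operator.neg)

    def __invert__(self):
        return self._unary(operator.invert)

    # Binary "and" plus named-method spellings (illustrative only).
    def __and__(self, other):
        return self._binary(operator.and_, other)

    def bitwise_and(self, other):
        return self & other

    def logical_and(self, other):
        return self._binary(lambda a, b: bool(a) and bool(b), other)

    # Reflected operators: Python falls back to these when the left
    # operand (e.g. an int) returns NotImplemented for the operation.
    def __rand__(self, other):
        return self._binary(operator.and_, other, reflected=True)

    def __ror__(self, other):
        return self._binary(operator.or_, other, reflected=True)

    def __rxor__(self, other):
        return self._binary(operator.xor, other, reflected=True)

    def __radd__(self, other):
        return self._binary(operator.add, other, reflected=True)

    def __rsub__(self, other):
        return self._binary(operator.sub, other, reflected=True)

    def __rmul__(self, other):
        return self._binary(operator.mul, other, reflected=True)

    def __rtruediv__(self, other):
        return self._binary(operator.truediv, other, reflected=True)

    def __rpow__(self, other):
        return self._binary(operator.pow, other, reflected=True)


# A plain int on the left triggers the reflected method on the right.
print((10 - LeafMap(a=1, b=4)).leaves)
print((2 ** LeafMap(a=3)).leaves)
```

Operand order matters for the non-commutative cases: `10 - x` must compute `10 - leaf` rather than `leaf - 10`, which is why the sketch swaps the arguments when `reflected=True`.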
@facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Dec 20, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 42.8790μs 21.1193μs 47.3501 KOps/s 48.8049 KOps/s $\color{#d91a1a}-2.98\%$
test_plain_set_stack_nested 49.1110μs 21.5889μs 46.3200 KOps/s 48.1243 KOps/s $\color{#d91a1a}-3.75\%$
test_plain_set_nested_inplace 62.5970μs 23.3040μs 42.9111 KOps/s 45.0251 KOps/s $\color{#d91a1a}-4.70\%$
test_plain_set_stack_nested_inplace 79.0170μs 23.3793μs 42.7728 KOps/s 44.8545 KOps/s $\color{#d91a1a}-4.64\%$
test_items 23.6440μs 4.3200μs 231.4838 KOps/s 231.3787 KOps/s $\color{#35bf28}+0.05\%$
test_items_nested 0.5279ms 0.4083ms 2.4492 KOps/s 2.4511 KOps/s $\color{#d91a1a}-0.08\%$
test_items_nested_locked 0.6450ms 0.4080ms 2.4511 KOps/s 2.4448 KOps/s $\color{#35bf28}+0.26\%$
test_items_nested_leaf 0.1379ms 77.4136μs 12.9176 KOps/s 13.0992 KOps/s $\color{#d91a1a}-1.39\%$
test_items_stack_nested 0.6172ms 0.4073ms 2.4549 KOps/s 2.4474 KOps/s $\color{#35bf28}+0.30\%$
test_items_stack_nested_leaf 0.1378ms 79.9134μs 12.5135 KOps/s 12.7104 KOps/s $\color{#d91a1a}-1.55\%$
test_items_stack_nested_locked 0.7446ms 0.4098ms 2.4404 KOps/s 2.4346 KOps/s $\color{#35bf28}+0.24\%$
test_keys 18.9960μs 3.5043μs 285.3662 KOps/s 287.4073 KOps/s $\color{#d91a1a}-0.71\%$
test_keys_nested 0.2761ms 0.1657ms 6.0343 KOps/s 5.9197 KOps/s $\color{#35bf28}+1.94\%$
test_keys_nested_locked 0.8029ms 0.1691ms 5.9121 KOps/s 5.7396 KOps/s $\color{#35bf28}+3.00\%$
test_keys_nested_leaf 1.9522ms 0.1430ms 6.9945 KOps/s 6.7596 KOps/s $\color{#35bf28}+3.48\%$
test_keys_stack_nested 0.2603ms 0.1633ms 6.1221 KOps/s 5.9664 KOps/s $\color{#35bf28}+2.61\%$
test_keys_stack_nested_leaf 0.2262ms 0.1410ms 7.0899 KOps/s 6.9083 KOps/s $\color{#35bf28}+2.63\%$
test_keys_stack_nested_locked 0.3246ms 0.1689ms 5.9219 KOps/s 5.7685 KOps/s $\color{#35bf28}+2.66\%$
test_values 7.0590μs 1.0519μs 950.6955 KOps/s 958.3737 KOps/s $\color{#d91a1a}-0.80\%$
test_values_nested 0.1199ms 63.1820μs 15.8273 KOps/s 15.9831 KOps/s $\color{#d91a1a}-0.98\%$
test_values_nested_locked 0.1175ms 62.3780μs 16.0313 KOps/s 15.7821 KOps/s $\color{#35bf28}+1.58\%$
test_values_nested_leaf 0.1322ms 72.4357μs 13.8053 KOps/s 13.9916 KOps/s $\color{#d91a1a}-1.33\%$
test_values_stack_nested 0.1361ms 64.6030μs 15.4791 KOps/s 15.9310 KOps/s $\color{#d91a1a}-2.84\%$
test_values_stack_nested_leaf 0.1262ms 72.8333μs 13.7300 KOps/s 13.9192 KOps/s $\color{#d91a1a}-1.36\%$
test_values_stack_nested_locked 0.1200ms 64.0284μs 15.6181 KOps/s 15.8648 KOps/s $\color{#d91a1a}-1.56\%$
test_membership 4.5686μs 0.7641μs 1.3088 MOps/s 1.3666 MOps/s $\color{#d91a1a}-4.23\%$
test_membership_nested 25.4180μs 2.9763μs 335.9829 KOps/s 336.4250 KOps/s $\color{#d91a1a}-0.13\%$
test_membership_nested_leaf 23.0130μs 2.9928μs 334.1323 KOps/s 334.5169 KOps/s $\color{#d91a1a}-0.11\%$
test_membership_stacked_nested 39.8640μs 2.9364μs 340.5489 KOps/s 333.5409 KOps/s $\color{#35bf28}+2.10\%$
test_membership_stacked_nested_leaf 16.2400μs 2.9503μs 338.9523 KOps/s 338.0220 KOps/s $\color{#35bf28}+0.28\%$
test_membership_nested_last 24.6160μs 4.4358μs 225.4371 KOps/s 229.3202 KOps/s $\color{#d91a1a}-1.69\%$
test_membership_nested_leaf_last 39.8340μs 4.3978μs 227.3846 KOps/s 226.5840 KOps/s $\color{#35bf28}+0.35\%$
test_membership_stacked_nested_last 34.6140μs 5.1632μs 193.6775 KOps/s 226.1516 KOps/s $\textbf{\color{#d91a1a}-14.36\%}$
test_membership_stacked_nested_leaf_last 34.8750μs 5.2010μs 192.2713 KOps/s 226.3639 KOps/s $\textbf{\color{#d91a1a}-15.06\%}$
test_nested_getleaf 63.6740μs 11.0751μs 90.2927 KOps/s 92.4608 KOps/s $\color{#d91a1a}-2.34\%$
test_nested_get 36.5280μs 10.3914μs 96.2338 KOps/s 96.4859 KOps/s $\color{#d91a1a}-0.26\%$
test_stacked_getleaf 37.4390μs 10.8000μs 92.5929 KOps/s 90.6684 KOps/s $\color{#35bf28}+2.12\%$
test_stacked_get 35.2450μs 10.4609μs 95.5942 KOps/s 87.7308 KOps/s $\textbf{\color{#35bf28}+8.96\%}$
test_nested_getitemleaf 36.7080μs 11.4243μs 87.5328 KOps/s 88.0138 KOps/s $\color{#d91a1a}-0.55\%$
test_nested_getitem 38.8220μs 10.6128μs 94.2257 KOps/s 94.2501 KOps/s $\color{#d91a1a}-0.03\%$
test_stacked_getitemleaf 36.2770μs 11.3205μs 88.3356 KOps/s 85.3235 KOps/s $\color{#35bf28}+3.53\%$
test_stacked_getitem 40.0740μs 10.5782μs 94.5339 KOps/s 92.1675 KOps/s $\color{#35bf28}+2.57\%$
test_lock_nested 1.9575ms 0.4535ms 2.2050 KOps/s 2.1950 KOps/s $\color{#35bf28}+0.45\%$
test_lock_stack_nested 1.7800ms 0.4286ms 2.3332 KOps/s 2.3168 KOps/s $\color{#35bf28}+0.71\%$
test_unlock_nested 0.7414ms 0.3694ms 2.7073 KOps/s 2.6447 KOps/s $\color{#35bf28}+2.37\%$
test_unlock_stack_nested 0.5408ms 0.3413ms 2.9298 KOps/s 2.8584 KOps/s $\color{#35bf28}+2.50\%$
test_flatten_speed 0.1866ms 99.7204μs 10.0280 KOps/s 9.7951 KOps/s $\color{#35bf28}+2.38\%$
test_unflatten_speed 0.9114ms 0.5333ms 1.8752 KOps/s 1.8771 KOps/s $\color{#d91a1a}-0.10\%$
test_common_ops 3.2440ms 0.8022ms 1.2465 KOps/s 1.2936 KOps/s $\color{#d91a1a}-3.64\%$
test_creation 37.5200μs 2.5095μs 398.4866 KOps/s 396.6423 KOps/s $\color{#35bf28}+0.46\%$
test_creation_empty 47.3880μs 12.3785μs 80.7853 KOps/s 92.6458 KOps/s $\textbf{\color{#d91a1a}-12.80\%}$
test_creation_nested_1 65.3620μs 14.9072μs 67.0819 KOps/s 72.8954 KOps/s $\textbf{\color{#d91a1a}-7.98\%}$
test_creation_nested_2 67.5850μs 19.7799μs 50.5565 KOps/s 54.4268 KOps/s $\textbf{\color{#d91a1a}-7.11\%}$
test_clone 0.2052ms 13.5562μs 73.7670 KOps/s 72.9761 KOps/s $\color{#35bf28}+1.08\%$
test_getitem[int] 1.4043ms 13.0737μs 76.4896 KOps/s 77.7896 KOps/s $\color{#d91a1a}-1.67\%$
test_getitem[slice_int] 0.4136ms 24.9776μs 40.0358 KOps/s 35.3816 KOps/s $\textbf{\color{#35bf28}+13.15\%}$
test_getitem[range] 0.1645ms 47.9098μs 20.8725 KOps/s 20.4763 KOps/s $\color{#35bf28}+1.94\%$
test_getitem[tuple] 0.1308ms 20.6379μs 48.4545 KOps/s 49.6202 KOps/s $\color{#d91a1a}-2.35\%$
test_getitem[list] 0.1605ms 42.9603μs 23.2773 KOps/s 22.5161 KOps/s $\color{#35bf28}+3.38\%$
test_setitem_dim[int] 41.1370μs 25.5805μs 39.0922 KOps/s 39.1816 KOps/s $\color{#d91a1a}-0.23\%$
test_setitem_dim[slice_int] 0.1166ms 53.1598μs 18.8112 KOps/s 19.2890 KOps/s $\color{#d91a1a}-2.48\%$
test_setitem_dim[range] 0.1412ms 71.8266μs 13.9224 KOps/s 13.8228 KOps/s $\color{#35bf28}+0.72\%$
test_setitem_dim[tuple] 82.4640μs 41.9917μs 23.8142 KOps/s 24.2016 KOps/s $\color{#d91a1a}-1.60\%$
test_setitem 0.1261ms 21.2582μs 47.0408 KOps/s 49.9225 KOps/s $\textbf{\color{#d91a1a}-5.77\%}$
test_set 0.2337ms 21.4666μs 46.5839 KOps/s 50.8840 KOps/s $\textbf{\color{#d91a1a}-8.45\%}$
test_set_shared 2.8392ms 0.1735ms 5.7652 KOps/s 5.8081 KOps/s $\color{#d91a1a}-0.74\%$
test_update 0.3531ms 23.6977μs 42.1981 KOps/s 45.5085 KOps/s $\textbf{\color{#d91a1a}-7.27\%}$
test_update_nested 0.1426ms 33.8821μs 29.5141 KOps/s 31.6739 KOps/s $\textbf{\color{#d91a1a}-6.82\%}$
test_update__nested 0.6020ms 34.9874μs 28.5817 KOps/s 28.4851 KOps/s $\color{#35bf28}+0.34\%$
test_set_nested 0.1468ms 22.4120μs 44.6190 KOps/s 46.1339 KOps/s $\color{#d91a1a}-3.28\%$
test_set_nested_new 0.1121ms 26.8639μs 37.2246 KOps/s 38.6668 KOps/s $\color{#d91a1a}-3.73\%$
test_select 0.2167ms 44.6264μs 22.4083 KOps/s 23.7592 KOps/s $\textbf{\color{#d91a1a}-5.69\%}$
test_select_nested 0.1233ms 63.6308μs 15.7157 KOps/s 15.9866 KOps/s $\color{#d91a1a}-1.69\%$
test_exclude_nested 0.1478ms 82.0238μs 12.1916 KOps/s 11.7369 KOps/s $\color{#35bf28}+3.87\%$
test_empty[True] 0.5282ms 0.4123ms 2.4254 KOps/s 2.4477 KOps/s $\color{#d91a1a}-0.91\%$
test_empty[False] 15.3885μs 1.3595μs 735.5391 KOps/s 721.4085 KOps/s $\color{#35bf28}+1.96\%$
test_unbind_speed 0.3927ms 0.2716ms 3.6815 KOps/s 3.7008 KOps/s $\color{#d91a1a}-0.52\%$
test_unbind_speed_stack0 0.4740ms 0.2684ms 3.7264 KOps/s 3.7471 KOps/s $\color{#d91a1a}-0.55\%$
test_unbind_speed_stack1 0.1033s 0.8800ms 1.1363 KOps/s 1.2910 KOps/s $\textbf{\color{#d91a1a}-11.98\%}$
test_split 97.2960ms 1.7412ms 574.3211 Ops/s 630.0658 Ops/s $\textbf{\color{#d91a1a}-8.85\%}$
test_chunk 0.1035s 1.7655ms 566.4014 Ops/s 518.0719 Ops/s $\textbf{\color{#35bf28}+9.33\%}$
test_consolidate_njt[False-None] 13.4235ms 8.1614ms 122.5286 Ops/s 122.8432 Ops/s $\color{#d91a1a}-0.26\%$
test_creation[device0] 2.9695ms 91.2408μs 10.9600 KOps/s 10.6036 KOps/s $\color{#35bf28}+3.36\%$
test_creation_from_tensor 0.2134ms 93.9026μs 10.6493 KOps/s 10.5437 KOps/s $\color{#35bf28}+1.00\%$
test_add_one[memmap_tensor0] 0.2118ms 4.9335μs 202.6975 KOps/s 203.5109 KOps/s $\color{#d91a1a}-0.40\%$
test_contiguous[memmap_tensor0] 20.2480μs 0.5115μs 1.9551 MOps/s 2.0070 MOps/s $\color{#d91a1a}-2.59\%$
test_stack[memmap_tensor0] 54.7520μs 3.6472μs 274.1829 KOps/s 301.0362 KOps/s $\textbf{\color{#d91a1a}-8.92\%}$
test_memmaptd_index 0.4524ms 0.2413ms 4.1436 KOps/s 4.1046 KOps/s $\color{#35bf28}+0.95\%$
test_memmaptd_index_astensor 0.7430ms 0.3317ms 3.0148 KOps/s 3.0198 KOps/s $\color{#d91a1a}-0.17\%$
test_memmaptd_index_op 0.9912ms 0.6043ms 1.6549 KOps/s 1.6989 KOps/s $\color{#d91a1a}-2.59\%$
test_serialize_model 0.1245s 0.1171s 8.5387 Ops/s 8.5158 Ops/s $\color{#35bf28}+0.27\%$
test_serialize_model_pickle 0.4470s 0.3902s 2.5630 Ops/s 2.5248 Ops/s $\color{#35bf28}+1.51\%$
test_serialize_weights 0.2185s 0.1299s 7.7007 Ops/s 7.5862 Ops/s $\color{#35bf28}+1.51\%$
test_serialize_weights_returnearly 0.1741s 0.1579s 6.3320 Ops/s 6.4118 Ops/s $\color{#d91a1a}-1.24\%$
test_serialize_weights_pickle 0.4936s 0.4096s 2.4413 Ops/s 1.1595 Ops/s $\textbf{\color{#35bf28}+110.55\%}$
test_serialize_weights_filesystem 0.1462s 0.1395s 7.1682 Ops/s 7.0017 Ops/s $\color{#35bf28}+2.38\%$
test_serialize_model_filesystem 0.2451s 0.1617s 6.1860 Ops/s 6.4856 Ops/s $\color{#d91a1a}-4.62\%$
test_reshape_pytree 60.5930μs 27.0702μs 36.9410 KOps/s 37.3771 KOps/s $\color{#d91a1a}-1.17\%$
test_reshape_td 97.1710μs 34.0689μs 29.3523 KOps/s 30.7511 KOps/s $\color{#d91a1a}-4.55\%$
test_view_pytree 70.5320μs 27.6655μs 36.1461 KOps/s 37.3034 KOps/s $\color{#d91a1a}-3.10\%$
test_view_td 91.2800μs 37.8045μs 26.4519 KOps/s 25.4326 KOps/s $\color{#35bf28}+4.01\%$
test_unbind_pytree 80.6290μs 30.5707μs 32.7111 KOps/s 32.9473 KOps/s $\color{#d91a1a}-0.72\%$
test_unbind_td 0.3590ms 40.5831μs 24.6408 KOps/s 24.7077 KOps/s $\color{#d91a1a}-0.27\%$
test_split_pytree 72.6250μs 30.4105μs 32.8833 KOps/s 33.4769 KOps/s $\color{#d91a1a}-1.77\%$
test_split_td 0.2071ms 45.5764μs 21.9412 KOps/s 22.1099 KOps/s $\color{#d91a1a}-0.76\%$
test_add_pytree 0.1232ms 36.1011μs 27.7000 KOps/s 27.7473 KOps/s $\color{#d91a1a}-0.17\%$
test_add_td 0.1379ms 57.6306μs 17.3519 KOps/s 18.8475 KOps/s $\textbf{\color{#d91a1a}-7.94\%}$
test_compile_add_one_nested[tensordict-compile] 0.1168ms 60.7209μs 16.4688 KOps/s 16.4066 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_one_nested[tensordict-eager] 0.4011ms 0.1689ms 5.9195 KOps/s 5.9458 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_add_one_nested[pytree-compile] 97.4510μs 45.2321μs 22.1082 KOps/s 22.1761 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_add_one_nested[pytree-eager] 0.2299ms 0.1209ms 8.2734 KOps/s 8.3251 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_copy_nested[tensordict-compile] 77.9150μs 26.3952μs 37.8857 KOps/s 38.7578 KOps/s $\color{#d91a1a}-2.25\%$
test_compile_copy_nested[tensordict-eager] 0.1549ms 59.5186μs 16.8015 KOps/s 17.2636 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_copy_nested[pytree-compile] 0.1597ms 80.0959μs 12.4850 KOps/s 12.7215 KOps/s $\color{#d91a1a}-1.86\%$
test_compile_copy_nested[pytree-eager] 0.1317ms 68.7291μs 14.5499 KOps/s 14.8492 KOps/s $\color{#d91a1a}-2.02\%$
test_compile_add_one_flat[tensordict-compile] 0.6504ms 0.1027ms 9.7340 KOps/s 9.7526 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_add_one_flat[tensordict-eager] 0.3788ms 0.2143ms 4.6670 KOps/s 4.6696 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_add_one_flat[tensorclass-compile] 94.9360μs 45.1262μs 22.1601 KOps/s 22.3815 KOps/s $\color{#d91a1a}-0.99\%$
test_compile_add_one_flat[tensorclass-eager] 0.5037ms 63.6207μs 15.7182 KOps/s 15.3845 KOps/s $\color{#35bf28}+2.17\%$
test_compile_add_one_flat[pytree-compile] 0.2121ms 0.1021ms 9.7982 KOps/s 9.8464 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_add_one_flat[pytree-eager] 0.3502ms 0.2040ms 4.9015 KOps/s 4.7322 KOps/s $\color{#35bf28}+3.58\%$
test_compile_add_self_flat[tensordict-eager] 0.3150ms 0.2345ms 4.2642 KOps/s 4.3322 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_add_self_flat[tensordict-compile] 0.1725ms 0.1026ms 9.7511 KOps/s 9.4943 KOps/s $\color{#35bf28}+2.70\%$
test_compile_add_self_flat[tensorclass-eager] 0.1325ms 60.4895μs 16.5318 KOps/s 17.1766 KOps/s $\color{#d91a1a}-3.75\%$
test_compile_add_self_flat[tensorclass-compile] 0.4045ms 47.4886μs 21.0577 KOps/s 22.5559 KOps/s $\textbf{\color{#d91a1a}-6.64\%}$
test_compile_add_self_flat[pytree-eager] 1.6442ms 0.1621ms 6.1693 KOps/s 6.1664 KOps/s $\color{#35bf28}+0.05\%$
test_compile_add_self_flat[pytree-compile] 0.2462ms 0.1021ms 9.7901 KOps/s 9.8752 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_copy_flat[tensordict-compile] 55.7230μs 21.4957μs 46.5209 KOps/s 47.9272 KOps/s $\color{#d91a1a}-2.93\%$
test_compile_copy_flat[tensordict-eager] 0.1325ms 67.0402μs 14.9164 KOps/s 15.2406 KOps/s $\color{#d91a1a}-2.13\%$
test_compile_copy_flat[pytree-compile] 0.1442ms 80.9447μs 12.3541 KOps/s 12.1617 KOps/s $\color{#35bf28}+1.58\%$
test_compile_copy_flat[pytree-eager] 0.1412ms 69.2508μs 14.4403 KOps/s 14.3157 KOps/s $\color{#35bf28}+0.87\%$
test_compile_assign_and_add[tensordict-compile] 0.3549ms 0.1996ms 5.0112 KOps/s 4.7990 KOps/s $\color{#35bf28}+4.42\%$
test_compile_assign_and_add[tensordict-eager] 2.5102ms 1.3093ms 763.7952 Ops/s 758.0692 Ops/s $\color{#35bf28}+0.76\%$
test_compile_assign_and_add[pytree-compile] 0.2826ms 0.1958ms 5.1064 KOps/s 4.9441 KOps/s $\color{#35bf28}+3.28\%$
test_compile_assign_and_add[pytree-eager] 0.9558ms 0.7811ms 1.2802 KOps/s 1.2620 KOps/s $\color{#35bf28}+1.44\%$
test_compile_assign_and_add_stack[compile] 0.5271ms 0.4360ms 2.2938 KOps/s 2.2027 KOps/s $\color{#35bf28}+4.13\%$
test_compile_assign_and_add_stack[eager] 4.4849ms 2.7279ms 366.5837 Ops/s 378.4095 Ops/s $\color{#d91a1a}-3.13\%$
test_compile_indexing[tensor-tensordict-compile] 86.0110μs 34.9051μs 28.6491 KOps/s 27.9286 KOps/s $\color{#35bf28}+2.58\%$
test_compile_indexing[tensor-tensordict-eager] 0.5634ms 32.8454μs 30.4456 KOps/s 31.2641 KOps/s $\color{#d91a1a}-2.62\%$
test_compile_indexing[tensor-tensorclass-compile] 91.3100μs 29.0405μs 34.4346 KOps/s 34.1635 KOps/s $\color{#35bf28}+0.79\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1055ms 23.8890μs 41.8603 KOps/s 42.5793 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_indexing[tensor-pytree-compile] 77.0140μs 29.0611μs 34.4102 KOps/s 33.9426 KOps/s $\color{#35bf28}+1.38\%$
test_compile_indexing[tensor-pytree-eager] 57.4270μs 23.4215μs 42.6958 KOps/s 42.5043 KOps/s $\color{#35bf28}+0.45\%$
test_compile_indexing[slice-tensordict-compile] 99.4350μs 51.4272μs 19.4450 KOps/s 19.5137 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_indexing[slice-tensordict-eager] 0.3937ms 20.7326μs 48.2332 KOps/s 49.9092 KOps/s $\color{#d91a1a}-3.36\%$
test_compile_indexing[slice-tensorclass-compile] 89.9370μs 43.0974μs 23.2033 KOps/s 22.4632 KOps/s $\color{#35bf28}+3.29\%$
test_compile_indexing[slice-tensorclass-eager] 56.4650μs 19.2240μs 52.0183 KOps/s 51.9817 KOps/s $\color{#35bf28}+0.07\%$
test_compile_indexing[slice-pytree-compile] 0.2755ms 44.2073μs 22.6207 KOps/s 22.3816 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[slice-pytree-eager] 70.8920μs 19.2593μs 51.9229 KOps/s 52.4563 KOps/s $\color{#d91a1a}-1.02\%$
test_compile_indexing[int-tensordict-compile] 0.1138ms 52.4299μs 19.0731 KOps/s 17.7530 KOps/s $\textbf{\color{#35bf28}+7.44\%}$
test_compile_indexing[int-tensordict-eager] 1.0031ms 21.3718μs 46.7907 KOps/s 49.8166 KOps/s $\textbf{\color{#d91a1a}-6.07\%}$
test_compile_indexing[int-tensorclass-compile] 0.1751ms 44.6059μs 22.4186 KOps/s 22.4152 KOps/s $\color{#35bf28}+0.02\%$
test_compile_indexing[int-tensorclass-eager] 0.3051ms 19.1681μs 52.1701 KOps/s 52.5273 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_indexing[int-pytree-compile] 0.2436ms 44.4660μs 22.4891 KOps/s 22.4622 KOps/s $\color{#35bf28}+0.12\%$
test_compile_indexing[int-pytree-eager] 73.1760μs 19.1228μs 52.2935 KOps/s 52.4624 KOps/s $\color{#d91a1a}-0.32\%$
test_mod_add[eager] 85.6590μs 34.5952μs 28.9058 KOps/s 29.9058 KOps/s $\color{#d91a1a}-3.34\%$
test_mod_add[compile] 0.1123ms 47.4399μs 21.0793 KOps/s 21.2702 KOps/s $\color{#d91a1a}-0.90\%$
test_mod_add[compile-overhead] 0.1228ms 47.5935μs 21.0113 KOps/s 20.9808 KOps/s $\color{#35bf28}+0.15\%$
test_mod_wrap[eager] 0.3616ms 0.2241ms 4.4621 KOps/s 4.3519 KOps/s $\color{#35bf28}+2.53\%$
test_mod_wrap[compile] 0.4593ms 0.2090ms 4.7847 KOps/s 4.9529 KOps/s $\color{#d91a1a}-3.40\%$
test_mod_wrap[compile-overhead] 0.3921ms 0.2015ms 4.9617 KOps/s 4.9203 KOps/s $\color{#35bf28}+0.84\%$
test_mod_wrap_and_backward[eager] 12.3799ms 11.1350ms 89.8072 Ops/s 92.0980 Ops/s $\color{#d91a1a}-2.49\%$
test_mod_wrap_and_backward[compile] 14.0918ms 11.2719ms 88.7166 Ops/s 92.3067 Ops/s $\color{#d91a1a}-3.89\%$
test_mod_wrap_and_backward[compile-overhead] 13.7342ms 11.5042ms 86.9247 Ops/s 93.0264 Ops/s $\textbf{\color{#d91a1a}-6.56\%}$
test_seq_add[eager] 0.2166ms 0.1116ms 8.9597 KOps/s 8.7755 KOps/s $\color{#35bf28}+2.10\%$
test_seq_add[compile] 0.1473ms 59.6472μs 16.7652 KOps/s 16.3251 KOps/s $\color{#35bf28}+2.70\%$
test_seq_add[compile-overhead] 0.1247ms 59.5377μs 16.7961 KOps/s 17.1400 KOps/s $\color{#d91a1a}-2.01\%$
test_seq_wrap[eager] 0.7490ms 0.4318ms 2.3156 KOps/s 2.2321 KOps/s $\color{#35bf28}+3.74\%$
test_seq_wrap[compile] 0.3418ms 0.2195ms 4.5567 KOps/s 4.4195 KOps/s $\color{#35bf28}+3.10\%$
test_seq_wrap[compile-overhead] 0.4435ms 0.2225ms 4.4953 KOps/s 4.4687 KOps/s $\color{#35bf28}+0.59\%$
test_func_call_runtime[False-eager] 1.6077ms 0.5523ms 1.8105 KOps/s 1.7528 KOps/s $\color{#35bf28}+3.29\%$
test_func_call_runtime[False-compile] 0.7146ms 0.4206ms 2.3776 KOps/s 2.3929 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_runtime[False-compile-overhead] 1.0365ms 0.4250ms 2.3529 KOps/s 2.3942 KOps/s $\color{#d91a1a}-1.72\%$
test_func_call_runtime[True-eager] 1.2422ms 0.7566ms 1.3218 KOps/s 1.2898 KOps/s $\color{#35bf28}+2.48\%$
test_func_call_runtime[True-compile] 0.9608ms 0.4592ms 2.1775 KOps/s 2.1692 KOps/s $\color{#35bf28}+0.38\%$
test_func_call_runtime[True-compile-overhead] 0.8527ms 0.4621ms 2.1640 KOps/s 2.1535 KOps/s $\color{#35bf28}+0.49\%$
test_func_call_cm_runtime[False-eager] 1.3613ms 0.5508ms 1.8156 KOps/s 1.7921 KOps/s $\color{#35bf28}+1.31\%$
test_func_call_cm_runtime[False-compile] 0.5202ms 0.4178ms 2.3937 KOps/s 2.3851 KOps/s $\color{#35bf28}+0.36\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6128ms 0.4162ms 2.4029 KOps/s 2.3825 KOps/s $\color{#35bf28}+0.86\%$
test_func_call_cm_runtime[True-eager] 1.0461ms 0.8992ms 1.1121 KOps/s 1.0865 KOps/s $\color{#35bf28}+2.36\%$
test_func_call_cm_runtime[True-compile] 0.8785ms 0.4803ms 2.0820 KOps/s 2.0587 KOps/s $\color{#35bf28}+1.13\%$
test_func_call_cm_runtime[True-compile-overhead] 0.7328ms 0.4763ms 2.0993 KOps/s 2.0730 KOps/s $\color{#35bf28}+1.27\%$
test_vmap_func_call_cm_runtime[eager] 2.6002ms 1.8759ms 533.0769 Ops/s 512.8304 Ops/s $\color{#35bf28}+3.95\%$
test_vmap_func_call_cm_runtime[compile] 0.8294ms 0.5056ms 1.9777 KOps/s 1.9251 KOps/s $\color{#35bf28}+2.73\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.9130ms 0.5035ms 1.9863 KOps/s 1.9300 KOps/s $\color{#35bf28}+2.91\%$
test_distributed 0.2489ms 0.1241ms 8.0601 KOps/s 7.9005 KOps/s $\color{#35bf28}+2.02\%$
test_tdmodule 67.6160μs 25.8545μs 38.6780 KOps/s 39.1340 KOps/s $\color{#d91a1a}-1.17\%$
test_tdmodule_dispatch 73.0060μs 47.0538μs 21.2523 KOps/s 21.3665 KOps/s $\color{#d91a1a}-0.53\%$
test_tdseq 58.9600μs 28.5224μs 35.0601 KOps/s 34.9866 KOps/s $\color{#35bf28}+0.21\%$
test_tdseq_dispatch 76.5220μs 52.4098μs 19.0804 KOps/s 18.9498 KOps/s $\color{#35bf28}+0.69\%$
test_instantiation_functorch 1.7531ms 1.5140ms 660.5208 Ops/s 649.0316 Ops/s $\color{#35bf28}+1.77\%$
test_exec_functorch 0.4648ms 0.1866ms 5.3579 KOps/s 5.5810 KOps/s $\color{#d91a1a}-4.00\%$
test_exec_functional_call 0.3498ms 0.1694ms 5.9044 KOps/s 5.7217 KOps/s $\color{#35bf28}+3.19\%$
test_exec_td_decorator 0.4233ms 0.2294ms 4.3586 KOps/s 4.2242 KOps/s $\color{#35bf28}+3.18\%$
test_vmap_mlp_speed_decorator[True-True] 0.8232ms 0.6469ms 1.5458 KOps/s 1.4929 KOps/s $\color{#35bf28}+3.55\%$
test_vmap_mlp_speed_decorator[True-False] 0.9645ms 0.6643ms 1.5053 KOps/s 1.5025 KOps/s $\color{#35bf28}+0.18\%$
test_vmap_mlp_speed_decorator[False-True] 0.8633ms 0.5221ms 1.9155 KOps/s 1.8650 KOps/s $\color{#35bf28}+2.71\%$
test_vmap_mlp_speed_decorator[False-False] 1.0319ms 0.5216ms 1.9171 KOps/s 1.8381 KOps/s $\color{#35bf28}+4.30\%$
test_to_module_speed[True] 2.3907ms 1.3273ms 753.4047 Ops/s 725.8361 Ops/s $\color{#35bf28}+3.80\%$
test_to_module_speed[False] 2.2359ms 1.2925ms 773.7082 Ops/s 750.2196 Ops/s $\color{#35bf28}+3.13\%$
test_tc_init 71.5730μs 47.2999μs 21.1417 KOps/s 20.6388 KOps/s $\color{#35bf28}+2.44\%$
test_tc_init_nested 0.2204ms 95.0777μs 10.5177 KOps/s 10.6054 KOps/s $\color{#d91a1a}-0.83\%$
test_tc_first_layer_tensor 44.1230μs 1.4731μs 678.8571 KOps/s 664.4559 KOps/s $\color{#35bf28}+2.17\%$
test_tc_first_layer_nontensor 29.1510μs 4.7289μs 211.4641 KOps/s 218.1898 KOps/s $\color{#d91a1a}-3.08\%$
test_tc_second_layer_tensor 25.9780μs 2.7670μs 361.4076 KOps/s 355.7421 KOps/s $\color{#35bf28}+1.59\%$
test_tc_second_layer_nontensor 46.0560μs 5.9092μs 169.2278 KOps/s 169.7020 KOps/s $\color{#d91a1a}-0.28\%$
test_unbind 0.2091s 13.0316ms 76.7365 Ops/s 67.3446 Ops/s $\textbf{\color{#35bf28}+13.95\%}$
test_full_like 7.4776ms 6.8821ms 145.3044 Ops/s 143.1784 Ops/s $\color{#35bf28}+1.48\%$
test_zeros_like 7.1962ms 2.8726ms 348.1150 Ops/s 351.1624 Ops/s $\color{#d91a1a}-0.87\%$
test_ones_like 3.8031ms 3.2660ms 306.1844 Ops/s 288.2360 Ops/s $\textbf{\color{#35bf28}+6.23\%}$
test_clone 5.2550ms 4.8220ms 207.3849 Ops/s 126.2637 Ops/s $\textbf{\color{#35bf28}+64.25\%}$
test_squeeze 57.1070μs 12.2968μs 81.3217 KOps/s 80.2226 KOps/s $\color{#35bf28}+1.37\%$
test_unsqueeze 0.1976ms 91.8179μs 10.8911 KOps/s 10.9353 KOps/s $\color{#d91a1a}-0.40\%$
test_split 0.5248ms 0.1931ms 5.1788 KOps/s 4.9151 KOps/s $\textbf{\color{#35bf28}+5.36\%}$
test_permute 0.4080ms 0.2068ms 4.8367 KOps/s 4.8845 KOps/s $\color{#d91a1a}-0.98\%$
test_stack 29.8353ms 24.1399ms 41.4251 Ops/s 39.9206 Ops/s $\color{#35bf28}+3.77\%$
test_cat 24.7090ms 23.6697ms 42.2481 Ops/s 39.2999 Ops/s $\textbf{\color{#35bf28}+7.50\%}$
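
The Change column above is consistent with the relative difference between the PR's throughput (Ops) and the throughput on repo HEAD. The bold entries also match the summary counts (10 improved, 17 worsened), and every bold entry exceeds roughly 5% in magnitude while no plain entry does. The 5% cutoff is inferred from the table, not stated by the bot, so treat `threshold_pct` in this sketch as an assumption. Function names are hypothetical.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    # threshold_pct is an assumption inferred from the table,
    # not a documented setting of the benchmark bot.
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the table; units match within a row.
rows = {
    "test_plain_set_nested": (47.3501, 48.8049),
    "test_serialize_weights_pickle": (2.4413, 1.1595),
    "test_creation_empty": (80.7853, 92.6458),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the comparison is a ratio of two throughputs in the same unit, the KOps/s versus Ops/s scaling cancels out within each row.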

@vmoens added the enhancement (New feature or request) label on Dec 20, 2024
@vmoens merged commit 6d56dc7 into gh/vmoens/40/base on Dec 20, 2024
48 of 55 checks passed
vmoens added a commit that referenced this pull request Dec 20, 2024
… `__invert__`, `__and__`, `__rand__`, `__radd__`, `__rtruediv__`, `__rmul__`, `__rsub__`, `__rpow__`, `bitwise_and`, `logical_and`

ghstack-source-id: 97ce710b5a4b552d9477182e1836cf3777c2d756
Pull Request resolved: #1154
@vmoens deleted the gh/vmoens/40/head branch on December 20, 2024 at 17:58