[Refactor] Refactor implement_for #556

Merged: vmoens merged 2 commits into main from update_implement_for on Nov 13, 2023

@vmoens (Contributor) commented on Nov 13, 2023

No description provided.

@facebook-github-bot added the CLA Signed label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) on Nov 13, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 105. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}2$.

Expand to view detailed results
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 49.2220μs | 15.3244μs | 65.2556 KOps/s | 66.4241 KOps/s | $\color{#d91a1a}-1.76\%$ |
| test_plain_set_stack_nested | 0.1921ms | 0.1447ms | 6.9095 KOps/s | 7.0653 KOps/s | $\color{#d91a1a}-2.20\%$ |
| test_plain_set_nested_inplace | 41.9990μs | 18.2530μs | 54.7854 KOps/s | 55.1610 KOps/s | $\color{#d91a1a}-0.68\%$ |
| test_plain_set_stack_nested_inplace | 0.3231ms | 0.1727ms | 5.7915 KOps/s | 5.8943 KOps/s | $\color{#d91a1a}-1.74\%$ |
| test_items | 33.9940μs | 2.4100μs | 414.9426 KOps/s | 417.3353 KOps/s | $\color{#d91a1a}-0.57\%$ |
| test_items_nested | 0.3376ms | 0.2713ms | 3.6861 KOps/s | 3.6538 KOps/s | $\color{#35bf28}+0.88\%$ |
| test_items_nested_locked | 0.5559ms | 0.2731ms | 3.6622 KOps/s | 3.6497 KOps/s | $\color{#35bf28}+0.34\%$ |
| test_items_nested_leaf | 0.5183ms | 0.1671ms | 5.9840 KOps/s | 5.9321 KOps/s | $\color{#35bf28}+0.88\%$ |
| test_items_stack_nested | 1.4895ms | 1.3802ms | 724.5371 Ops/s | 713.3605 Ops/s | $\color{#35bf28}+1.57\%$ |
| test_items_stack_nested_leaf | 1.4217ms | 1.2558ms | 796.3021 Ops/s | 788.7269 Ops/s | $\color{#35bf28}+0.96\%$ |
| test_items_stack_nested_locked | 1.8996ms | 0.7518ms | 1.3301 KOps/s | 1.3064 KOps/s | $\color{#35bf28}+1.82\%$ |
| test_keys | 23.4740μs | 3.8539μs | 259.4771 KOps/s | 254.3893 KOps/s | $\color{#35bf28}+2.00\%$ |
| test_keys_nested | 4.5442ms | 0.1427ms | 7.0074 KOps/s | 6.7675 KOps/s | $\color{#35bf28}+3.54\%$ |
| test_keys_nested_locked | 0.1855ms | 0.1415ms | 7.0656 KOps/s | 7.0606 KOps/s | $\color{#35bf28}+0.07\%$ |
| test_keys_nested_leaf | 0.2204ms | 0.1393ms | 7.1767 KOps/s | 7.2515 KOps/s | $\color{#d91a1a}-1.03\%$ |
| test_keys_stack_nested | 2.0858ms | 1.2874ms | 776.7581 Ops/s | 760.8439 Ops/s | $\color{#35bf28}+2.09\%$ |
| test_keys_stack_nested_leaf | 1.4156ms | 1.2839ms | 778.8720 Ops/s | 767.2768 Ops/s | $\color{#35bf28}+1.51\%$ |
| test_keys_stack_nested_locked | 0.7893ms | 0.6431ms | 1.5549 KOps/s | 1.5643 KOps/s | $\color{#d91a1a}-0.60\%$ |
| test_values | 10.4973μs | 1.1620μs | 860.5673 KOps/s | 869.9382 KOps/s | $\color{#d91a1a}-1.08\%$ |
| test_values_nested | 90.3680μs | 48.2671μs | 20.7180 KOps/s | 20.7855 KOps/s | $\color{#d91a1a}-0.32\%$ |
| test_values_nested_locked | 95.7380μs | 48.0795μs | 20.7989 KOps/s | 20.9289 KOps/s | $\color{#d91a1a}-0.62\%$ |
| test_values_nested_leaf | 72.4050μs | 42.7367μs | 23.3991 KOps/s | 23.6024 KOps/s | $\color{#d91a1a}-0.86\%$ |
| test_values_stack_nested | 1.1926ms | 1.0959ms | 912.5214 Ops/s | 891.9563 Ops/s | $\color{#35bf28}+2.31\%$ |
| test_values_stack_nested_leaf | 1.4018ms | 1.0975ms | 911.1820 Ops/s | 903.5134 Ops/s | $\color{#35bf28}+0.85\%$ |
| test_values_stack_nested_locked | 0.8191ms | 0.4966ms | 2.0137 KOps/s | 1.9974 KOps/s | $\color{#35bf28}+0.82\%$ |
| test_membership | 34.9750μs | 1.3498μs | 740.8459 KOps/s | 753.9941 KOps/s | $\color{#d91a1a}-1.74\%$ |
| test_membership_nested | 18.3640μs | 2.8085μs | 356.0612 KOps/s | 357.6059 KOps/s | $\color{#d91a1a}-0.43\%$ |
| test_membership_nested_leaf | 22.2010μs | 2.8004μs | 357.0939 KOps/s | 360.6306 KOps/s | $\color{#d91a1a}-0.98\%$ |
| test_membership_stacked_nested | 50.1330μs | 11.5126μs | 86.8613 KOps/s | 86.0732 KOps/s | $\color{#35bf28}+0.92\%$ |
| test_membership_stacked_nested_leaf | 42.8700μs | 11.4876μs | 87.0504 KOps/s | 85.8622 KOps/s | $\color{#35bf28}+1.38\%$ |
| test_membership_nested_last | 46.6070μs | 5.7518μs | 173.8571 KOps/s | 171.6809 KOps/s | $\color{#35bf28}+1.27\%$ |
| test_membership_nested_leaf_last | 25.0470μs | 5.7463μs | 174.0259 KOps/s | 171.6587 KOps/s | $\color{#35bf28}+1.38\%$ |
| test_membership_stacked_nested_last | 0.3005ms | 0.1789ms | 5.5896 KOps/s | 5.5297 KOps/s | $\color{#35bf28}+1.08\%$ |
| test_membership_stacked_nested_leaf_last | 43.5620μs | 13.6553μs | 73.2316 KOps/s | 72.9648 KOps/s | $\color{#35bf28}+0.37\%$ |
| test_nested_getleaf | 0.1475ms | 12.3575μs | 80.9223 KOps/s | 82.2180 KOps/s | $\color{#d91a1a}-1.58\%$ |
| test_nested_get | 56.4620μs | 11.3172μs | 88.3611 KOps/s | 86.2792 KOps/s | $\color{#35bf28}+2.41\%$ |
| test_stacked_getleaf | 0.6758ms | 0.5713ms | 1.7504 KOps/s | 1.7604 KOps/s | $\color{#d91a1a}-0.57\%$ |
| test_stacked_get | 0.6336ms | 0.5477ms | 1.8260 KOps/s | 1.8561 KOps/s | $\color{#d91a1a}-1.62\%$ |
| test_nested_getitemleaf | 47.7590μs | 12.0118μs | 83.2518 KOps/s | 82.6026 KOps/s | $\color{#35bf28}+0.79\%$ |
| test_nested_getitem | 45.2640μs | 11.4948μs | 86.9959 KOps/s | 87.2667 KOps/s | $\color{#d91a1a}-0.31\%$ |
| test_stacked_getitemleaf | 0.6567ms | 0.5687ms | 1.7585 KOps/s | 1.7695 KOps/s | $\color{#d91a1a}-0.63\%$ |
| test_stacked_getitem | 1.1823ms | 0.5479ms | 1.8253 KOps/s | 1.8642 KOps/s | $\color{#d91a1a}-2.09\%$ |
| test_lock_nested | 55.1570ms | 0.9392ms | 1.0647 KOps/s | 1.1287 KOps/s | $\textbf{\color{#d91a1a}-5.68\%}$ |
| test_lock_stack_nested | 67.9023ms | 11.6229ms | 86.0371 Ops/s | 77.3456 Ops/s | $\textbf{\color{#35bf28}+11.24\%}$ |
| test_unlock_nested | 51.5868ms | 0.9360ms | 1.0684 KOps/s | 1.0529 KOps/s | $\color{#35bf28}+1.47\%$ |
| test_unlock_stack_nested | 72.0774ms | 12.1463ms | 82.3294 Ops/s | 75.1997 Ops/s | $\textbf{\color{#35bf28}+9.48\%}$ |
| test_flatten_speed | 0.7582ms | 0.6872ms | 1.4552 KOps/s | 1.4617 KOps/s | $\color{#d91a1a}-0.44\%$ |
| test_unflatten_speed | 1.7889ms | 1.1798ms | 847.5745 Ops/s | 852.4092 Ops/s | $\color{#d91a1a}-0.57\%$ |
| test_common_ops | 0.7272ms | 0.6277ms | 1.5932 KOps/s | 1.6071 KOps/s | $\color{#d91a1a}-0.87\%$ |
| test_creation | 28.9140μs | 2.1359μs | 468.1872 KOps/s | 474.4243 KOps/s | $\color{#d91a1a}-1.31\%$ |
| test_creation_empty | 39.2930μs | 7.3461μs | 136.1262 KOps/s | 138.8764 KOps/s | $\color{#d91a1a}-1.98\%$ |
| test_creation_nested_1 | 54.8020μs | 11.4214μs | 87.5549 KOps/s | 89.7875 KOps/s | $\color{#d91a1a}-2.49\%$ |
| test_creation_nested_2 | 38.5620μs | 13.9256μs | 71.8100 KOps/s | 73.8822 KOps/s | $\color{#d91a1a}-2.80\%$ |
| test_clone | 59.0300μs | 10.2692μs | 97.3786 KOps/s | 94.4638 KOps/s | $\color{#35bf28}+3.09\%$ |
| test_getitem[int] | 32.4300μs | 12.8907μs | 77.5751 KOps/s | 77.8697 KOps/s | $\color{#d91a1a}-0.38\%$ |
| test_getitem[slice_int] | 69.9610μs | 29.7374μs | 33.6276 KOps/s | 32.2945 KOps/s | $\color{#35bf28}+4.13\%$ |
| test_getitem[range] | 0.1901ms | 54.8517μs | 18.2310 KOps/s | 17.5221 KOps/s | $\color{#35bf28}+4.05\%$ |
| test_getitem[tuple] | 54.4720μs | 23.7212μs | 42.1564 KOps/s | 41.5438 KOps/s | $\color{#35bf28}+1.47\%$ |
| test_getitem[list] | 0.2257ms | 48.6546μs | 20.5530 KOps/s | 19.2375 KOps/s | $\textbf{\color{#35bf28}+6.84\%}$ |
| test_setitem_dim[int] | 45.9760μs | 26.4437μs | 37.8161 KOps/s | 37.4646 KOps/s | $\color{#35bf28}+0.94\%$ |
| test_setitem_dim[slice_int] | 85.8200μs | 49.2655μs | 20.2982 KOps/s | 19.7875 KOps/s | $\color{#35bf28}+2.58\%$ |
| test_setitem_dim[range] | 0.1090ms | 70.3989μs | 14.2048 KOps/s | 13.5427 KOps/s | $\color{#35bf28}+4.89\%$ |
| test_setitem_dim[tuple] | 77.1930μs | 39.5474μs | 25.2861 KOps/s | 25.1150 KOps/s | $\color{#35bf28}+0.68\%$ |
| test_setitem | 71.7540μs | 14.8310μs | 67.4264 KOps/s | 67.3189 KOps/s | $\color{#35bf28}+0.16\%$ |
| test_set | 65.8630μs | 14.2769μs | 70.0433 KOps/s | 70.4722 KOps/s | $\color{#d91a1a}-0.61\%$ |
| test_set_shared | 0.2597ms | 0.1502ms | 6.6582 KOps/s | 6.2475 KOps/s | $\textbf{\color{#35bf28}+6.57\%}$ |
| test_update | 91.2600μs | 19.0289μs | 52.5517 KOps/s | 51.8602 KOps/s | $\color{#35bf28}+1.33\%$ |
| test_update_nested | 0.1146ms | 27.7801μs | 35.9970 KOps/s | 36.4538 KOps/s | $\color{#d91a1a}-1.25\%$ |
| test_set_nested | 60.9240μs | 16.0250μs | 62.4026 KOps/s | 63.5836 KOps/s | $\color{#d91a1a}-1.86\%$ |
| test_set_nested_new | 0.1053ms | 22.5838μs | 44.2795 KOps/s | 44.9252 KOps/s | $\color{#d91a1a}-1.44\%$ |
| test_select | 0.1353ms | 46.7926μs | 21.3709 KOps/s | 22.0348 KOps/s | $\color{#d91a1a}-3.01\%$ |
| test_unbind_speed | 0.3780ms | 0.2833ms | 3.5297 KOps/s | 3.5208 KOps/s | $\color{#35bf28}+0.25\%$ |
| test_unbind_speed_stack0 | 61.2736ms | 4.2826ms | 233.5042 Ops/s | 222.5234 Ops/s | $\color{#35bf28}+4.93\%$ |
| test_unbind_speed_stack1 | 1.6410μs | 0.6026μs | 1.6596 MOps/s | 1.6471 MOps/s | $\color{#35bf28}+0.76\%$ |
| test_creation[device0] | 0.7044ms | 0.2890ms | 3.4597 KOps/s | 3.4213 KOps/s | $\color{#35bf28}+1.12\%$ |
| test_creation_from_tensor | 5.1004ms | 0.3251ms | 3.0764 KOps/s | 3.0783 KOps/s | $\color{#d91a1a}-0.06\%$ |
| test_add_one[memmap_tensor0] | 0.4386ms | 24.6315μs | 40.5985 KOps/s | 39.4579 KOps/s | $\color{#35bf28}+2.89\%$ |
| test_contiguous[memmap_tensor0] | 28.8230μs | 5.6242μs | 177.8027 KOps/s | 175.7060 KOps/s | $\color{#35bf28}+1.19\%$ |
| test_stack[memmap_tensor0] | 60.6430μs | 18.6211μs | 53.7024 KOps/s | 53.0368 KOps/s | $\color{#35bf28}+1.25\%$ |
| test_memmaptd_index | 0.3173ms | 0.2368ms | 4.2224 KOps/s | 4.2816 KOps/s | $\color{#d91a1a}-1.38\%$ |
| test_memmaptd_index_astensor | 1.5684ms | 0.9352ms | 1.0693 KOps/s | 1.1227 KOps/s | $\color{#d91a1a}-4.76\%$ |
| test_memmaptd_index_op | 2.3104ms | 2.1916ms | 456.2949 Ops/s | 444.2448 Ops/s | $\color{#35bf28}+2.71\%$ |
| test_reshape_pytree | 75.3200μs | 23.0975μs | 43.2948 KOps/s | 43.2257 KOps/s | $\color{#35bf28}+0.16\%$ |
| test_reshape_td | 51.7360μs | 21.4623μs | 46.5933 KOps/s | 48.2504 KOps/s | $\color{#d91a1a}-3.43\%$ |
| test_view_pytree | 57.1060μs | 23.0709μs | 43.3446 KOps/s | 43.7901 KOps/s | $\color{#d91a1a}-1.02\%$ |
| test_view_td | 14.5770μs | 4.1832μs | 239.0536 KOps/s | 241.5096 KOps/s | $\color{#d91a1a}-1.02\%$ |
| test_unbind_pytree | 59.1400μs | 26.3589μs | 37.9379 KOps/s | 38.3165 KOps/s | $\color{#d91a1a}-0.99\%$ |
| test_unbind_td | 80.7010μs | 39.8358μs | 25.1030 KOps/s | 25.2097 KOps/s | $\color{#d91a1a}-0.42\%$ |
| test_split_pytree | 58.1390μs | 26.2179μs | 38.1418 KOps/s | 38.4746 KOps/s | $\color{#d91a1a}-0.86\%$ |
| test_split_td | 0.1716ms | 75.7513μs | 13.2011 KOps/s | 12.9736 KOps/s | $\color{#35bf28}+1.75\%$ |
| test_add_pytree | 66.6040μs | 31.9976μs | 31.2523 KOps/s | 30.9503 KOps/s | $\color{#35bf28}+0.98\%$ |
| test_add_td | 93.6860μs | 43.4642μs | 23.0074 KOps/s | 23.3264 KOps/s | $\color{#d91a1a}-1.37\%$ |
| test_distributed | 19.3860μs | 6.6903μs | 149.4696 KOps/s | 157.3327 KOps/s | $\color{#d91a1a}-5.00\%$ |
| test_tdmodule | 1.4173ms | 21.6128μs | 46.2688 KOps/s | 45.2798 KOps/s | $\color{#35bf28}+2.18\%$ |
| test_tdmodule_dispatch | 0.1889ms | 37.1449μs | 26.9216 KOps/s | 26.6755 KOps/s | $\color{#35bf28}+0.92\%$ |
| test_tdseq | 42.6290μs | 22.9575μs | 43.5587 KOps/s | 42.2545 KOps/s | $\color{#35bf28}+3.09\%$ |
| test_tdseq_dispatch | 0.4838ms | 41.3097μs | 24.2074 KOps/s | 24.6990 KOps/s | $\color{#d91a1a}-1.99\%$ |
| test_instantiation_functorch | 1.9946ms | 1.2958ms | 771.7044 Ops/s | 776.9857 Ops/s | $\color{#d91a1a}-0.68\%$ |
| test_instantiation_td | 58.7427ms | 1.1054ms | 904.6606 Ops/s | 981.9966 Ops/s | $\textbf{\color{#d91a1a}-7.88\%}$ |
| test_exec_functorch | 0.2386ms | 0.1455ms | 6.8746 KOps/s | 6.9107 KOps/s | $\color{#d91a1a}-0.52\%$ |
| test_exec_td | 0.2306ms | 0.1410ms | 7.0938 KOps/s | 7.0255 KOps/s | $\color{#35bf28}+0.97\%$ |
| test_vmap_mlp_speed[True-True] | 1.1804ms | 0.8353ms | 1.1972 KOps/s | 1.1745 KOps/s | $\color{#35bf28}+1.93\%$ |
| test_vmap_mlp_speed[True-False] | 0.5480ms | 0.4587ms | 2.1801 KOps/s | 2.1809 KOps/s | $\color{#d91a1a}-0.04\%$ |
| test_vmap_mlp_speed[False-True] | 0.9892ms | 0.7508ms | 1.3320 KOps/s | 1.3024 KOps/s | $\color{#35bf28}+2.27\%$ |
| test_vmap_mlp_speed[False-False] | 0.6972ms | 0.3862ms | 2.5896 KOps/s | 2.5824 KOps/s | $\color{#35bf28}+0.28\%$ |
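
For readers checking the numbers: the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the bold rows match the "Improved" and "Worsened" counts if a row only counts once it moves by more than 5%. The sketch below reproduces that arithmetic. The helper names (`parse_ops`, `percent_change`, `is_significant`) and the 5% threshold are assumptions inferred from the table, not taken from the benchmark tooling itself.

```python
# Hypothetical helpers for re-deriving the "Change" column; the names and the
# 5% threshold are assumptions inferred from the table, not the bot's own code.

# Multipliers for the throughput units that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '65.2556 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def percent_change(pr_ops: str, head_ops: str) -> float:
    """Relative throughput change of the PR against repo HEAD, in percent."""
    pr, head = parse_ops(pr_ops), parse_ops(head_ops)
    return (pr - head) / head * 100.0


def is_significant(change: float, threshold: float = 5.0) -> bool:
    """Assumed rule: a row counts as improved or worsened past +/- threshold."""
    return abs(change) > threshold


# A few rows copied from the table: (Ops, Ops on Repo HEAD).
rows = {
    "test_plain_set_nested": ("65.2556 KOps/s", "66.4241 KOps/s"),
    "test_lock_stack_nested": ("86.0371 Ops/s", "77.3456 Ops/s"),
    "test_distributed": ("149.4696 KOps/s", "157.3327 KOps/s"),
}

for name, (pr_ops, head_ops) in rows.items():
    change = percent_change(pr_ops, head_ops)
    marker = " (significant)" if is_significant(change) else ""
    print(f"{name}: {change:+.2f}%{marker}")
```

Under this reading, test_distributed rounds to -5.00% but sits just inside the threshold, which would explain why it is not bold in the table.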

@vmoens added the Refactor label (Refactoring code - not a new feature) on Nov 13, 2023
@vmoens marked this pull request as ready for review on November 13, 2023, 16:41
@vmoens merged commit 95bd846 into main on Nov 13, 2023
41 of 43 checks passed
@vmoens deleted the update_implement_for branch on November 13, 2023, 16:42
Labels
- CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
- Refactor: Refactoring code - not a new feature
2 participants