[Refactor] Refactor to create empty tds where necessary #522

Merged
merged 1 commit into from
Sep 6, 2023

Conversation

@vmoens vmoens commented Sep 6, 2023

Description

  • Ensures that empty tensordicts are created for parameter-free modules (e.g., nn.Tanh).
  • Adds an as_module argument that returns a TensorDictParams object, which can be used for functional calls as well as for storing parameters in an nn.Module.
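
The first point can be illustrated with a minimal, library-free sketch. It is not the tensordict implementation: ToyModule and collect_params are hypothetical stand-ins for nn.Module and for the parameter-extraction step, and plain nested dicts stand in for tensordicts. The only behaviour it aims to show is that a parameter-free submodule still gets an (empty) entry, so the extracted structure mirrors the module tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyModule:
    """Hypothetical stand-in for nn.Module: named parameters plus named children."""
    params: Dict[str, List[float]] = field(default_factory=dict)
    children: Dict[str, "ToyModule"] = field(default_factory=dict)


def collect_params(module: ToyModule) -> dict:
    """Return a nested dict that mirrors the module tree.

    A child without parameters maps to an empty dict instead of being
    skipped, so every submodule name is present in the result.
    """
    out = dict(module.params)
    for name, child in module.children.items():
        out[name] = collect_params(child)
    return out


# Toy analogue of a Sequential(Linear, Tanh) stack.
linear = ToyModule(params={"weight": [0.5, -0.5], "bias": [0.0]})
tanh = ToyModule()  # parameter-free, like nn.Tanh
seq = ToyModule(children={"0": linear, "1": tanh})

tree = collect_params(seq)
print(tree)
```

Keeping the empty entry for "1" means code that walks the module and the parameter structure in lockstep does not need a special case for parameter-free layers.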

@facebook-github-bot added the CLA Signed label Sep 6, 2023
@vmoens added the enhancement and Refactor labels Sep 6, 2023
github-actions bot commented Sep 6, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 109. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}2$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 39.7010μs 20.3796μs 49.0688 KOps/s 49.2512 KOps/s $\color{#d91a1a}-0.37\%$
test_plain_set_stack_nested 0.2256ms 0.1903ms 5.2542 KOps/s 5.2729 KOps/s $\color{#d91a1a}-0.36\%$
test_plain_set_nested_inplace 58.2010μs 23.6491μs 42.2849 KOps/s 41.3360 KOps/s $\color{#35bf28}+2.30\%$
test_plain_set_stack_nested_inplace 0.2541ms 0.2230ms 4.4835 KOps/s 4.4200 KOps/s $\color{#35bf28}+1.44\%$
test_items 28.0010μs 3.4987μs 285.8177 KOps/s 292.6771 KOps/s $\color{#d91a1a}-2.34\%$
test_items_nested 2.4148ms 0.3690ms 2.7099 KOps/s 2.7509 KOps/s $\color{#d91a1a}-1.49\%$
test_items_nested_locked 0.4391ms 0.3663ms 2.7297 KOps/s 2.7320 KOps/s $\color{#d91a1a}-0.08\%$
test_items_nested_leaf 1.2308ms 0.2240ms 4.4641 KOps/s 4.5281 KOps/s $\color{#d91a1a}-1.41\%$
test_items_stack_nested 2.1427ms 1.9908ms 502.3153 Ops/s 481.1388 Ops/s $\color{#35bf28}+4.40\%$
test_items_stack_nested_leaf 1.9118ms 1.8100ms 552.4912 Ops/s 528.2673 Ops/s $\color{#35bf28}+4.59\%$
test_items_stack_nested_locked 3.1283ms 1.0006ms 999.4469 Ops/s 1.0208 KOps/s $\color{#d91a1a}-2.09\%$
test_keys 29.4000μs 5.0354μs 198.5946 KOps/s 195.9676 KOps/s $\color{#35bf28}+1.34\%$
test_keys_nested 2.5521ms 0.1851ms 5.4026 KOps/s 5.3920 KOps/s $\color{#35bf28}+0.20\%$
test_keys_nested_locked 0.2314ms 0.1826ms 5.4766 KOps/s 5.4750 KOps/s $\color{#35bf28}+0.03\%$
test_keys_nested_leaf 0.3962ms 0.1768ms 5.6553 KOps/s 5.1894 KOps/s $\textbf{\color{#35bf28}+8.98\%}$
test_keys_stack_nested 2.6896ms 1.8485ms 540.9932 Ops/s 543.2763 Ops/s $\color{#d91a1a}-0.42\%$
test_keys_stack_nested_leaf 2.0579ms 1.8229ms 548.5681 Ops/s 544.3132 Ops/s $\color{#35bf28}+0.78\%$
test_keys_stack_nested_locked 0.9316ms 0.8174ms 1.2233 KOps/s 1.2167 KOps/s $\color{#35bf28}+0.54\%$
test_values 19.8010μs 1.5262μs 655.2240 KOps/s 649.4649 KOps/s $\color{#35bf28}+0.89\%$
test_values_nested 91.5010μs 66.2421μs 15.0961 KOps/s 15.1268 KOps/s $\color{#d91a1a}-0.20\%$
test_values_nested_locked 96.6010μs 66.2110μs 15.1032 KOps/s 15.1674 KOps/s $\color{#d91a1a}-0.42\%$
test_values_nested_leaf 92.4010μs 58.6565μs 17.0484 KOps/s 17.1428 KOps/s $\color{#d91a1a}-0.55\%$
test_values_stack_nested 2.0072ms 1.5892ms 629.2608 Ops/s 623.6622 Ops/s $\color{#35bf28}+0.90\%$
test_values_stack_nested_leaf 1.6687ms 1.5819ms 632.1698 Ops/s 627.3512 Ops/s $\color{#35bf28}+0.77\%$
test_values_stack_nested_locked 0.7163ms 0.6409ms 1.5602 KOps/s 1.5349 KOps/s $\color{#35bf28}+1.65\%$
test_membership 13.5000μs 1.8562μs 538.7403 KOps/s 541.8499 KOps/s $\color{#d91a1a}-0.57\%$
test_membership_nested 40.2000μs 3.5264μs 283.5758 KOps/s 279.4347 KOps/s $\color{#35bf28}+1.48\%$
test_membership_nested_leaf 36.9000μs 3.5738μs 279.8139 KOps/s 279.6260 KOps/s $\color{#35bf28}+0.07\%$
test_membership_stacked_nested 51.4000μs 14.3139μs 69.8620 KOps/s 69.6105 KOps/s $\color{#35bf28}+0.36\%$
test_membership_stacked_nested_leaf 40.2000μs 14.4806μs 69.0577 KOps/s 69.6901 KOps/s $\color{#d91a1a}-0.91\%$
test_membership_nested_last 37.3000μs 7.4490μs 134.2455 KOps/s 133.3085 KOps/s $\color{#35bf28}+0.70\%$
test_membership_nested_leaf_last 57.4010μs 7.4722μs 133.8300 KOps/s 133.7622 KOps/s $\color{#35bf28}+0.05\%$
test_membership_stacked_nested_last 0.3203ms 0.2259ms 4.4259 KOps/s 4.4076 KOps/s $\color{#35bf28}+0.42\%$
test_membership_stacked_nested_leaf_last 47.8010μs 16.7235μs 59.7961 KOps/s 59.5154 KOps/s $\color{#35bf28}+0.47\%$
test_nested_getleaf 42.6000μs 15.8561μs 63.0671 KOps/s 63.2136 KOps/s $\color{#d91a1a}-0.23\%$
test_nested_get 50.2010μs 14.9784μs 66.7626 KOps/s 66.5145 KOps/s $\color{#35bf28}+0.37\%$
test_stacked_getleaf 1.3323ms 0.9223ms 1.0843 KOps/s 1.1403 KOps/s $\color{#d91a1a}-4.91\%$
test_stacked_get 0.8810ms 0.8439ms 1.1850 KOps/s 1.1887 KOps/s $\color{#d91a1a}-0.31\%$
test_nested_getitemleaf 92.5010μs 15.9091μs 62.8570 KOps/s 63.1091 KOps/s $\color{#d91a1a}-0.40\%$
test_nested_getitem 41.9010μs 14.9813μs 66.7499 KOps/s 66.0310 KOps/s $\color{#35bf28}+1.09\%$
test_stacked_getitemleaf 1.0012ms 0.8786ms 1.1382 KOps/s 1.1357 KOps/s $\color{#35bf28}+0.22\%$
test_stacked_getitem 0.8767ms 0.8427ms 1.1866 KOps/s 1.1912 KOps/s $\color{#d91a1a}-0.39\%$
test_lock_nested 97.1692ms 1.5284ms 654.2640 Ops/s 697.3505 Ops/s $\textbf{\color{#d91a1a}-6.18\%}$
test_lock_stack_nested 0.1184s 22.0060ms 45.4421 Ops/s 49.7370 Ops/s $\textbf{\color{#d91a1a}-8.64\%}$
test_unlock_nested 1.8135ms 1.4520ms 688.7016 Ops/s 645.9879 Ops/s $\textbf{\color{#35bf28}+6.61\%}$
test_unlock_stack_nested 0.1192s 20.2851ms 49.2972 Ops/s 48.7945 Ops/s $\color{#35bf28}+1.03\%$
test_flatten_speed 1.1071ms 1.0265ms 974.1819 Ops/s 975.9414 Ops/s $\color{#d91a1a}-0.18\%$
test_unflatten_speed 1.8831ms 1.8372ms 544.3175 Ops/s 549.4583 Ops/s $\color{#d91a1a}-0.94\%$
test_common_ops 6.4459ms 1.1154ms 896.5624 Ops/s 894.2186 Ops/s $\color{#35bf28}+0.26\%$
test_creation 70.2010μs 6.1765μs 161.9041 KOps/s 162.9211 KOps/s $\color{#d91a1a}-0.62\%$
test_creation_empty 34.6000μs 13.8186μs 72.3663 KOps/s 73.2249 KOps/s $\color{#d91a1a}-1.17\%$
test_creation_nested_1 51.7010μs 25.2356μs 39.6266 KOps/s 39.8992 KOps/s $\color{#d91a1a}-0.68\%$
test_creation_nested_2 61.3010μs 27.5524μs 36.2945 KOps/s 36.5533 KOps/s $\color{#d91a1a}-0.71\%$
test_clone 0.2044ms 24.8654μs 40.2166 KOps/s 39.6439 KOps/s $\color{#35bf28}+1.44\%$
test_getitem[int] 49.0000μs 27.2303μs 36.7238 KOps/s 35.3972 KOps/s $\color{#35bf28}+3.75\%$
test_getitem[slice_int] 0.1193ms 54.4284μs 18.3728 KOps/s 18.6605 KOps/s $\color{#d91a1a}-1.54\%$
test_getitem[range] 0.1377ms 80.7256μs 12.3876 KOps/s 12.2893 KOps/s $\color{#35bf28}+0.80\%$
test_getitem[tuple] 0.1562ms 44.7598μs 22.3415 KOps/s 22.0686 KOps/s $\color{#35bf28}+1.24\%$
test_getitem[list] 0.3992ms 76.3966μs 13.0896 KOps/s 13.0304 KOps/s $\color{#35bf28}+0.45\%$
test_setitem_dim[int] 57.8000μs 32.3273μs 30.9336 KOps/s 30.4703 KOps/s $\color{#35bf28}+1.52\%$
test_setitem_dim[slice_int] 97.0010μs 57.2680μs 17.4618 KOps/s 17.3480 KOps/s $\color{#35bf28}+0.66\%$
test_setitem_dim[range] 0.1079ms 77.7447μs 12.8626 KOps/s 12.7132 KOps/s $\color{#35bf28}+1.18\%$
test_setitem_dim[tuple] 80.8010μs 47.6711μs 20.9771 KOps/s 20.7513 KOps/s $\color{#35bf28}+1.09\%$
test_setitem 0.2442ms 33.1812μs 30.1376 KOps/s 30.5253 KOps/s $\color{#d91a1a}-1.27\%$
test_set 0.2518ms 31.6195μs 31.6261 KOps/s 31.6669 KOps/s $\color{#d91a1a}-0.13\%$
test_set_shared 4.3719ms 0.1809ms 5.5292 KOps/s 5.5698 KOps/s $\color{#d91a1a}-0.73\%$
test_update 0.2574ms 35.7791μs 27.9493 KOps/s 27.6802 KOps/s $\color{#35bf28}+0.97\%$
test_update_nested 0.2489ms 53.4046μs 18.7250 KOps/s 18.8393 KOps/s $\color{#d91a1a}-0.61\%$
test_set_nested 0.2588ms 35.0358μs 28.5422 KOps/s 28.8167 KOps/s $\color{#d91a1a}-0.95\%$
test_set_nested_new 0.2458ms 53.4132μs 18.7220 KOps/s 18.6691 KOps/s $\color{#35bf28}+0.28\%$
test_select 0.3189ms 97.7472μs 10.2305 KOps/s 10.2298 KOps/s $+0.01\%$
test_unbind_speed 0.9665ms 0.6540ms 1.5291 KOps/s 1.5279 KOps/s $\color{#35bf28}+0.07\%$
test_unbind_speed_stack0 0.1069s 9.3202ms 107.2943 Ops/s 104.8847 Ops/s $\color{#35bf28}+2.30\%$
test_unbind_speed_stack1 21.6000μs 1.1832μs 845.1598 KOps/s 851.5423 KOps/s $\color{#d91a1a}-0.75\%$
test_creation[device0] 2.8688ms 0.4600ms 2.1741 KOps/s 2.1953 KOps/s $\color{#d91a1a}-0.97\%$
test_creation_from_tensor 0.6259ms 0.5058ms 1.9772 KOps/s 1.9406 KOps/s $\color{#35bf28}+1.89\%$
test_add_one[memmap_tensor0] 2.3827ms 33.2028μs 30.1180 KOps/s 30.1032 KOps/s $\color{#35bf28}+0.05\%$
test_contiguous[memmap_tensor0] 34.4000μs 8.6319μs 115.8493 KOps/s 111.2333 KOps/s $\color{#35bf28}+4.15\%$
test_stack[memmap_tensor0] 81.6010μs 26.4700μs 37.7786 KOps/s 36.4420 KOps/s $\color{#35bf28}+3.67\%$
test_memmaptd_index 0.3808ms 0.3195ms 3.1296 KOps/s 3.1721 KOps/s $\color{#d91a1a}-1.34\%$
test_memmaptd_index_astensor 4.9978ms 1.3561ms 737.3984 Ops/s 742.4319 Ops/s $\color{#d91a1a}-0.68\%$
test_memmaptd_index_op 2.7484ms 2.6145ms 382.4761 Ops/s 381.4665 Ops/s $\color{#35bf28}+0.26\%$
test_reshape_pytree 0.1023ms 37.5808μs 26.6093 KOps/s 25.9627 KOps/s $\color{#35bf28}+2.49\%$
test_reshape_td 82.0010μs 46.1291μs 21.6783 KOps/s 22.0118 KOps/s $\color{#d91a1a}-1.52\%$
test_view_pytree 94.1010μs 34.9171μs 28.6393 KOps/s 28.2917 KOps/s $\color{#35bf28}+1.23\%$
test_view_td 27.6000μs 8.8684μs 112.7596 KOps/s 113.7771 KOps/s $\color{#d91a1a}-0.89\%$
test_unbind_pytree 79.2010μs 38.5694μs 25.9273 KOps/s 25.7367 KOps/s $\color{#35bf28}+0.74\%$
test_unbind_td 0.1938ms 95.8365μs 10.4344 KOps/s 10.2515 KOps/s $\color{#35bf28}+1.78\%$
test_split_pytree 65.5010μs 44.2835μs 22.5818 KOps/s 21.9596 KOps/s $\color{#35bf28}+2.83\%$
test_split_td 3.3036ms 0.1174ms 8.5156 KOps/s 8.5774 KOps/s $\color{#d91a1a}-0.72\%$
test_add_pytree 96.1010μs 46.5880μs 21.4648 KOps/s 20.7902 KOps/s $\color{#35bf28}+3.24\%$
test_add_td 0.2369ms 76.3035μs 13.1056 KOps/s 13.1055 KOps/s $+0.00\%$
test_distributed 26.7010μs 8.9579μs 111.6337 KOps/s 111.5716 KOps/s $\color{#35bf28}+0.06\%$
test_tdmodule 0.2008ms 28.5320μs 35.0484 KOps/s 35.2764 KOps/s $\color{#d91a1a}-0.65\%$
test_tdmodule_dispatch 0.3086ms 55.6870μs 17.9575 KOps/s 18.1341 KOps/s $\color{#d91a1a}-0.97\%$
test_tdseq 58.8000μs 30.4704μs 32.8187 KOps/s 30.5358 KOps/s $\textbf{\color{#35bf28}+7.48\%}$
test_tdseq_dispatch 0.6251ms 66.8771μs 14.9528 KOps/s 15.0515 KOps/s $\color{#d91a1a}-0.66\%$
test_instantiation_functorch 1.7425ms 1.6382ms 610.4251 Ops/s 601.2512 Ops/s $\color{#35bf28}+1.53\%$
test_instantiation_td 2.1725ms 1.3696ms 730.1171 Ops/s 729.1821 Ops/s $\color{#35bf28}+0.13\%$
test_exec_functorch 0.2445ms 0.1873ms 5.3399 KOps/s 5.2284 KOps/s $\color{#35bf28}+2.13\%$
test_exec_td 0.2149ms 0.1768ms 5.6565 KOps/s 5.4480 KOps/s $\color{#35bf28}+3.83\%$
test_vmap_mlp_speed[True-True] 9.0583ms 1.2265ms 815.3040 Ops/s 811.5939 Ops/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed[True-False] 5.6599ms 0.6344ms 1.5762 KOps/s 1.5790 KOps/s $\color{#d91a1a}-0.18\%$
test_vmap_mlp_speed[False-True] 10.3196ms 1.0477ms 954.4628 Ops/s 961.6785 Ops/s $\color{#d91a1a}-0.75\%$
test_vmap_mlp_speed[False-False] 7.8358ms 0.4657ms 2.1471 KOps/s 2.1125 KOps/s $\color{#35bf28}+1.64\%$
test_vmap_transformer_speed[True-True] 21.4103ms 14.0822ms 71.0117 Ops/s 70.2650 Ops/s $\color{#35bf28}+1.06\%$
test_vmap_transformer_speed[True-False] 16.8576ms 9.1688ms 109.0660 Ops/s 90.8038 Ops/s $\textbf{\color{#35bf28}+20.11\%}$
test_vmap_transformer_speed[False-True] 23.2853ms 13.9375ms 71.7487 Ops/s 70.5381 Ops/s $\color{#35bf28}+1.72\%$
test_vmap_transformer_speed[False-False] 15.3879ms 8.7919ms 113.7409 Ops/s 111.5662 Ops/s $\color{#35bf28}+1.95\%$
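
The Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below checks that reading against a few rows of the table; relative_change is a hypothetical helper, not part of the benchmark tooling, and the formula is inferred from the numbers rather than taken from the bot's source.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (name, Ops, Ops on Repo HEAD), both values in the same unit per row.
rows = [
    ("test_plain_set_nested", 49.0688, 49.2512),                      # KOps/s
    ("test_lock_stack_nested", 45.4421, 49.7370),                     # Ops/s
    ("test_vmap_transformer_speed[True-False]", 109.0660, 90.8038),   # Ops/s
]

for name, ops_pr, ops_head in rows:
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Rows that mix units (for example test_items_stack_nested_locked, reported as 999.4469 Ops/s against 1.0208 KOps/s) need both values converted to the same unit first. The six rows with a bold Change value split into four positive and two negative, which matches the Improved and Worsened counts in the summary line.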

@matteobettini matteobettini left a comment

LGTM

if key in keys and (
    not is_tensor_collection(out.get(key)) or not out.get(key).is_empty()
):
    print(out.get(key))
left over

@vmoens marked this pull request as ready for review September 6, 2023 14:58
@vmoens merged commit 14ca63b into main Sep 6, 2023
26 of 27 checks passed
@vmoens deleted the refactor_from_module branch September 6, 2023 15:25
3 participants