[Feature] state_dict hooks compatibility in from_module and to_module #596
Closed
facebook-github-bot added the CLA Signed label on Dec 11, 2023. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1400ms | 17.3453μs | 57.6525 KOps/s | 62.4285 KOps/s | |
test_plain_set_stack_nested | 0.2517ms | 0.1421ms | 7.0353 KOps/s | 7.0234 KOps/s | |
test_plain_set_nested_inplace | 55.7840μs | 19.8769μs | 50.3096 KOps/s | 53.8207 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3250ms | 0.1759ms | 5.6844 KOps/s | 5.5473 KOps/s | |
test_items | 15.9090μs | 2.4462μs | 408.7928 KOps/s | 412.8885 KOps/s | |
test_items_nested | 0.5542ms | 0.2711ms | 3.6893 KOps/s | 3.6919 KOps/s | |
test_items_nested_locked | 0.6440ms | 0.2709ms | 3.6919 KOps/s | 3.6828 KOps/s | |
test_items_nested_leaf | 0.2125ms | 0.1670ms | 5.9889 KOps/s | 6.0391 KOps/s | |
test_items_stack_nested | 2.1524ms | 1.3680ms | 730.9749 Ops/s | 751.7134 Ops/s | |
test_items_stack_nested_leaf | 1.9284ms | 1.2150ms | 823.0260 Ops/s | 835.2675 Ops/s | |
test_items_stack_nested_locked | 1.1229ms | 0.8823ms | 1.1335 KOps/s | 1.1398 KOps/s | |
test_keys | 19.2750μs | 4.2274μs | 236.5531 KOps/s | 257.5496 KOps/s | |
test_keys_nested | 53.4843ms | 0.1573ms | 6.3577 KOps/s | 6.7540 KOps/s | |
test_keys_nested_locked | 0.2641ms | 0.1468ms | 6.8113 KOps/s | 6.7202 KOps/s | |
test_keys_nested_leaf | 0.2131ms | 0.1297ms | 7.7086 KOps/s | 7.7116 KOps/s | |
test_keys_stack_nested | 1.5192ms | 1.3025ms | 767.7499 Ops/s | 778.2348 Ops/s | |
test_keys_stack_nested_leaf | 2.0958ms | 1.2980ms | 770.3907 Ops/s | 776.1664 Ops/s | |
test_keys_stack_nested_locked | 3.6089ms | 0.8248ms | 1.2124 KOps/s | 1.2283 KOps/s | |
test_values | 6.1418μs | 1.1369μs | 879.5606 KOps/s | 857.1922 KOps/s | |
test_values_nested | 93.6640μs | 51.9106μs | 19.2639 KOps/s | 18.1565 KOps/s | |
test_values_nested_locked | 0.1056ms | 52.2033μs | 19.1559 KOps/s | 19.1455 KOps/s | |
test_values_nested_leaf | 4.0486ms | 46.7328μs | 21.3982 KOps/s | 21.4655 KOps/s | |
test_values_stack_nested | 1.2799ms | 1.0701ms | 934.4855 Ops/s | 953.4445 Ops/s | |
test_values_stack_nested_leaf | 1.8489ms | 1.0555ms | 947.4106 Ops/s | 970.2931 Ops/s | |
test_values_stack_nested_locked | 0.7331ms | 0.6208ms | 1.6108 KOps/s | 1.6508 KOps/s | |
test_membership | 19.1560μs | 1.3388μs | 746.9485 KOps/s | 760.9406 KOps/s | |
test_membership_nested | 22.6120μs | 2.9299μs | 341.3133 KOps/s | 348.2275 KOps/s | |
test_membership_nested_leaf | 23.9650μs | 2.9392μs | 340.2243 KOps/s | 339.1227 KOps/s | |
test_membership_stacked_nested | 30.9580μs | 11.9142μs | 83.9334 KOps/s | 85.1624 KOps/s | |
test_membership_stacked_nested_leaf | 38.0310μs | 11.9277μs | 83.8385 KOps/s | 83.9895 KOps/s | |
test_membership_nested_last | 39.3040μs | 6.0050μs | 166.5276 KOps/s | 164.3708 KOps/s | |
test_membership_nested_leaf_last | 29.0040μs | 5.9513μs | 168.0296 KOps/s | 166.8085 KOps/s | |
test_membership_stacked_nested_last | 0.3597ms | 0.1674ms | 5.9721 KOps/s | 5.9828 KOps/s | |
test_membership_stacked_nested_leaf_last | 49.1120μs | 13.9794μs | 71.5338 KOps/s | 71.5025 KOps/s | |
test_nested_getleaf | 32.2200μs | 10.5228μs | 95.0316 KOps/s | 94.4209 KOps/s | |
test_nested_get | 30.3960μs | 10.0318μs | 99.6834 KOps/s | 98.7462 KOps/s | |
test_stacked_getleaf | 0.6318ms | 0.3981ms | 2.5119 KOps/s | 2.4488 KOps/s | |
test_stacked_get | 0.5772ms | 0.3628ms | 2.7564 KOps/s | 2.6682 KOps/s | |
test_nested_getitemleaf | 29.9560μs | 10.6096μs | 94.2545 KOps/s | 93.3900 KOps/s | |
test_nested_getitem | 37.2600μs | 10.0129μs | 99.8713 KOps/s | 98.5275 KOps/s | |
test_stacked_getitemleaf | 0.8669ms | 0.4006ms | 2.4965 KOps/s | 2.4252 KOps/s | |
test_stacked_getitem | 0.5456ms | 0.3638ms | 2.7485 KOps/s | 2.6573 KOps/s | |
test_lock_nested | 1.2383ms | 0.4084ms | 2.4485 KOps/s | 2.4072 KOps/s | |
test_lock_stack_nested | 70.7318ms | 6.2733ms | 159.4069 Ops/s | 155.0296 Ops/s | |
test_unlock_nested | 62.2035ms | 0.4775ms | 2.0944 KOps/s | 2.3793 KOps/s | |
test_unlock_stack_nested | 71.5023ms | 5.9403ms | 168.3425 Ops/s | 161.9871 Ops/s | |
test_flatten_speed | 0.5772ms | 0.3647ms | 2.7423 KOps/s | 2.6978 KOps/s | |
test_unflatten_speed | 0.6817ms | 0.4583ms | 2.1818 KOps/s | 2.1707 KOps/s | |
test_common_ops | 5.1225ms | 0.6914ms | 1.4463 KOps/s | 1.5180 KOps/s | |
test_creation | 27.2710μs | 2.0194μs | 495.1909 KOps/s | 503.8599 KOps/s | |
test_creation_empty | 34.4440μs | 10.8500μs | 92.1656 KOps/s | 114.9754 KOps/s | |
test_creation_nested_1 | 37.4600μs | 13.5740μs | 73.6704 KOps/s | 86.2351 KOps/s | |
test_creation_nested_2 | 51.1550μs | 16.7592μs | 59.6686 KOps/s | 67.9146 KOps/s | |
test_clone | 95.7890μs | 12.1685μs | 82.1795 KOps/s | 80.5660 KOps/s | |
test_getitem[int] | 35.8660μs | 11.6806μs | 85.6124 KOps/s | 83.8435 KOps/s | |
test_getitem[slice_int] | 93.0640μs | 23.2096μs | 43.0857 KOps/s | 41.5850 KOps/s | |
test_getitem[range] | 0.1225ms | 41.8944μs | 23.8695 KOps/s | 23.0152 KOps/s | |
test_getitem[tuple] | 44.9130μs | 19.1502μs | 52.2188 KOps/s | 51.7980 KOps/s | |
test_getitem[list] | 80.2200μs | 37.0535μs | 26.9880 KOps/s | 25.4109 KOps/s | |
test_setitem_dim[int] | 58.0380μs | 29.3076μs | 34.1208 KOps/s | 34.6682 KOps/s | |
test_setitem_dim[slice_int] | 91.9210μs | 55.2657μs | 18.0944 KOps/s | 17.8909 KOps/s | |
test_setitem_dim[range] | 0.1326ms | 74.0322μs | 13.5076 KOps/s | 13.4722 KOps/s | |
test_setitem_dim[tuple] | 76.7320μs | 44.0915μs | 22.6801 KOps/s | 22.7287 KOps/s | |
test_setitem | 0.1666ms | 18.5095μs | 54.0264 KOps/s | 55.2890 KOps/s | |
test_set | 0.1754ms | 18.3880μs | 54.3832 KOps/s | 57.0777 KOps/s | |
test_set_shared | 2.1736ms | 0.1350ms | 7.4092 KOps/s | 7.2492 KOps/s | |
test_update | 99.1240μs | 21.4208μs | 46.6836 KOps/s | 50.7753 KOps/s | |
test_update_nested | 97.1610μs | 28.5029μs | 35.0841 KOps/s | 36.8657 KOps/s | |
test_set_nested | 0.1553ms | 19.9718μs | 50.0706 KOps/s | 52.2562 KOps/s | |
test_set_nested_new | 0.1193ms | 24.5688μs | 40.7020 KOps/s | 42.9142 KOps/s | |
test_select | 97.4620μs | 46.7336μs | 21.3979 KOps/s | 21.0495 KOps/s | |
test_unbind_speed | 0.6271ms | 0.3364ms | 2.9727 KOps/s | 2.9549 KOps/s | |
test_unbind_speed_stack0 | 63.9612ms | 4.1792ms | 239.2800 Ops/s | 237.2108 Ops/s | |
test_unbind_speed_stack1 | 1.4893μs | 0.6218μs | 1.6083 MOps/s | 1.5217 MOps/s | |
test_split | 59.6358ms | 1.6481ms | 606.7616 Ops/s | 594.7048 Ops/s | |
test_chunk | 3.2130ms | 1.5723ms | 636.0058 Ops/s | 606.7421 Ops/s | |
test_creation[device0] | 0.1800ms | 96.7006μs | 10.3412 KOps/s | 9.9764 KOps/s | |
test_creation_from_tensor | 4.7488ms | 77.7598μs | 12.8601 KOps/s | 12.2693 KOps/s | |
test_add_one[memmap_tensor0] | 0.2559ms | 5.2152μs | 191.7486 KOps/s | 188.0493 KOps/s | |
test_contiguous[memmap_tensor0] | 17.7040μs | 0.6380μs | 1.5674 MOps/s | 1.5648 MOps/s | |
test_stack[memmap_tensor0] | 49.5620μs | 3.5411μs | 282.3992 KOps/s | 286.4341 KOps/s | |
test_memmaptd_index | 0.3695ms | 0.1968ms | 5.0822 KOps/s | 5.0323 KOps/s | |
test_memmaptd_index_astensor | 0.4529ms | 0.2572ms | 3.8886 KOps/s | 3.8272 KOps/s | |
test_memmaptd_index_op | 0.7627ms | 0.5507ms | 1.8157 KOps/s | 1.9234 KOps/s | |
test_serialize_model | 0.1017s | 96.8312ms | 10.3273 Ops/s | 9.2541 Ops/s | |
test_serialize_model_pickle | 0.4507s | 0.3757s | 2.6618 Ops/s | 2.5923 Ops/s | |
test_serialize_weights | 0.1572s | 0.1032s | 9.6855 Ops/s | 9.4085 Ops/s | |
test_serialize_weights_returnearly | 0.1822s | 0.1280s | 7.8148 Ops/s | 7.6669 Ops/s | |
test_serialize_weights_pickle | 0.6961s | 0.4971s | 2.0118 Ops/s | 2.3615 Ops/s | |
test_serialize_weights_filesystem | 0.1476s | 94.4118ms | 10.5919 Ops/s | 10.6792 Ops/s | |
test_serialize_model_filesystem | 0.1498s | 94.7292ms | 10.5564 Ops/s | 11.1123 Ops/s | |
test_reshape_pytree | 57.0960μs | 23.0467μs | 43.3902 KOps/s | 42.4144 KOps/s | |
test_reshape_td | 80.1990μs | 30.0731μs | 33.2523 KOps/s | 32.4092 KOps/s | |
test_view_pytree | 54.0610μs | 22.9079μs | 43.6530 KOps/s | 42.7319 KOps/s | |
test_view_td | 33.5420μs | 4.8386μs | 206.6697 KOps/s | 198.7917 KOps/s | |
test_unbind_pytree | 67.7770μs | 26.4390μs | 37.8229 KOps/s | 37.7172 KOps/s | |
test_unbind_td | 99.0240μs | 53.5419μs | 18.6770 KOps/s | 18.1711 KOps/s | |
test_split_pytree | 54.7920μs | 26.0595μs | 38.3737 KOps/s | 38.0002 KOps/s | |
test_split_td | 0.5481ms | 42.5521μs | 23.5006 KOps/s | 23.0724 KOps/s | |
test_add_pytree | 83.9070μs | 32.5334μs | 30.7376 KOps/s | 30.7930 KOps/s | |
test_add_td | 0.1072ms | 50.6934μs | 19.7265 KOps/s | 20.6325 KOps/s | |
test_distributed | 0.1736ms | 97.1816μs | 10.2900 KOps/s | 9.9042 KOps/s | |
test_tdmodule | 0.7531ms | 22.5298μs | 44.3856 KOps/s | 46.7229 KOps/s | |
test_tdmodule_dispatch | 0.1822ms | 39.8000μs | 25.1256 KOps/s | 25.6425 KOps/s | |
test_tdseq | 0.1167ms | 25.9130μs | 38.5907 KOps/s | 41.1978 KOps/s | |
test_tdseq_dispatch | 0.1392ms | 45.3395μs | 22.0558 KOps/s | 23.2249 KOps/s | |
test_instantiation_functorch | 1.5111ms | 1.2753ms | 784.1383 Ops/s | 767.4028 Ops/s | |
test_instantiation_td | 1.5200ms | 0.9943ms | 1.0057 KOps/s | 984.7598 Ops/s | |
test_exec_functorch | 0.2868ms | 0.1559ms | 6.4158 KOps/s | 6.2516 KOps/s | |
test_exec_functional_call | 0.2894ms | 0.1448ms | 6.9055 KOps/s | 6.8039 KOps/s | |
test_exec_td | 0.2712ms | 0.1417ms | 7.0554 KOps/s | 6.9648 KOps/s | |
test_exec_td_decorator | 0.7888ms | 0.1766ms | 5.6612 KOps/s | 5.5024 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1994ms | 0.8795ms | 1.1370 KOps/s | 1.1135 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.9215ms | 0.4683ms | 2.1352 KOps/s | 2.1330 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0474ms | 0.7680ms | 1.3022 KOps/s | 1.2800 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6313ms | 0.3814ms | 2.6218 KOps/s | 2.5896 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.0529ms | 2.4278ms | 411.9023 Ops/s | 409.7337 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9152ms | 0.5173ms | 1.9333 KOps/s | 1.9154 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.8347ms | 1.9720ms | 507.1080 Ops/s | 508.8530 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6914ms | 0.3967ms | 2.5211 KOps/s | 2.4707 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1372ms | 13.5620μs | 73.7355 KOps/s | 71.0700 KOps/s | |
test_plain_set_stack_nested | 0.1547ms | 0.1184ms | 8.4445 KOps/s | 8.4665 KOps/s | |
test_plain_set_nested_inplace | 41.3800μs | 14.7983μs | 67.5756 KOps/s | 65.3531 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1894ms | 0.1446ms | 6.9158 KOps/s | 6.8981 KOps/s | |
test_items | 24.4700μs | 4.6759μs | 213.8627 KOps/s | 208.9761 KOps/s | |
test_items_nested | 0.4090ms | 0.3382ms | 2.9572 KOps/s | 2.9467 KOps/s | |
test_items_nested_locked | 0.4118ms | 0.3394ms | 2.9462 KOps/s | 2.9118 KOps/s | |
test_items_nested_leaf | 0.2810ms | 0.1990ms | 5.0264 KOps/s | 4.9840 KOps/s | |
test_items_stack_nested | 1.4323ms | 1.2988ms | 769.9445 Ops/s | 770.4830 Ops/s | |
test_items_stack_nested_leaf | 1.2235ms | 1.1288ms | 885.8827 Ops/s | 877.2682 Ops/s | |
test_items_stack_nested_locked | 1.0425ms | 0.8950ms | 1.1173 KOps/s | 1.1006 KOps/s | |
test_keys | 18.4500μs | 4.6023μs | 217.2825 KOps/s | 216.0096 KOps/s | |
test_keys_nested | 0.7904ms | 94.6765μs | 10.5623 KOps/s | 10.6740 KOps/s | |
test_keys_nested_locked | 0.1351ms | 94.5496μs | 10.5765 KOps/s | 10.7692 KOps/s | |
test_keys_nested_leaf | 0.1804ms | 78.1864μs | 12.7900 KOps/s | 12.9408 KOps/s | |
test_keys_stack_nested | 1.1881ms | 1.1391ms | 877.8924 Ops/s | 884.3274 Ops/s | |
test_keys_stack_nested_leaf | 1.2472ms | 1.1165ms | 895.6563 Ops/s | 892.1104 Ops/s | |
test_keys_stack_nested_locked | 0.8254ms | 0.7175ms | 1.3937 KOps/s | 1.3866 KOps/s | |
test_values | 13.2537μs | 1.8843μs | 530.6959 KOps/s | 523.8714 KOps/s | |
test_values_nested | 75.7310μs | 44.9285μs | 22.2576 KOps/s | 21.9666 KOps/s | |
test_values_nested_locked | 75.5420μs | 47.0740μs | 21.2432 KOps/s | 20.9648 KOps/s | |
test_values_nested_leaf | 64.8810μs | 39.1641μs | 25.5336 KOps/s | 25.3813 KOps/s | |
test_values_stack_nested | 1.0526ms | 0.9434ms | 1.0599 KOps/s | 1.0378 KOps/s | |
test_values_stack_nested_leaf | 1.0577ms | 0.9317ms | 1.0733 KOps/s | 1.0656 KOps/s | |
test_values_stack_nested_locked | 0.6998ms | 0.5722ms | 1.7476 KOps/s | 1.7342 KOps/s | |
test_membership | 3.6860μs | 0.9394μs | 1.0646 MOps/s | 1.0673 MOps/s | |
test_membership_nested | 32.6100μs | 2.3063μs | 433.5911 KOps/s | 433.8712 KOps/s | |
test_membership_nested_leaf | 16.2600μs | 2.2283μs | 448.7656 KOps/s | 446.4456 KOps/s | |
test_membership_stacked_nested | 32.5800μs | 11.2142μs | 89.1730 KOps/s | 90.3751 KOps/s | |
test_membership_stacked_nested_leaf | 30.6710μs | 11.3287μs | 88.2717 KOps/s | 91.8350 KOps/s | |
test_membership_nested_last | 39.9510μs | 4.7251μs | 211.6352 KOps/s | 210.3031 KOps/s | |
test_membership_nested_leaf_last | 18.7410μs | 4.7398μs | 210.9787 KOps/s | 208.3336 KOps/s | |
test_membership_stacked_nested_last | 0.1774ms | 0.1373ms | 7.2813 KOps/s | 7.3688 KOps/s | |
test_membership_stacked_nested_leaf_last | 51.1210μs | 13.3490μs | 74.9118 KOps/s | 78.7540 KOps/s | |
test_nested_getleaf | 38.5910μs | 8.4056μs | 118.9686 KOps/s | 119.5909 KOps/s | |
test_nested_get | 24.8700μs | 7.9422μs | 125.9090 KOps/s | 126.1635 KOps/s | |
test_stacked_getleaf | 0.4353ms | 0.3217ms | 3.1086 KOps/s | 3.1497 KOps/s | |
test_stacked_get | 0.3394ms | 0.2899ms | 3.4489 KOps/s | 3.5417 KOps/s | |
test_nested_getitemleaf | 36.6900μs | 8.4657μs | 118.1242 KOps/s | 118.4880 KOps/s | |
test_nested_getitem | 31.4700μs | 8.0128μs | 124.8004 KOps/s | 125.0007 KOps/s | |
test_stacked_getitemleaf | 0.3977ms | 0.3240ms | 3.0868 KOps/s | 3.1446 KOps/s | |
test_stacked_getitem | 0.3415ms | 0.2915ms | 3.4304 KOps/s | 3.5210 KOps/s | |
test_lock_nested | 4.5991ms | 0.4250ms | 2.3532 KOps/s | 2.3822 KOps/s | |
test_lock_stack_nested | 92.4297ms | 6.7673ms | 147.7683 Ops/s | 149.7002 Ops/s | |
test_unlock_nested | 0.8865ms | 0.4145ms | 2.4127 KOps/s | 2.4081 KOps/s | |
test_unlock_stack_nested | 89.1323ms | 7.0499ms | 141.8461 Ops/s | 141.2035 Ops/s | |
test_flatten_speed | 0.7903ms | 0.2620ms | 3.8169 KOps/s | 3.8528 KOps/s | |
test_unflatten_speed | 0.4292ms | 0.3525ms | 2.8371 KOps/s | 2.7980 KOps/s | |
test_common_ops | 1.0261ms | 0.5780ms | 1.7300 KOps/s | 1.6579 KOps/s | |
test_creation | 17.5500μs | 1.6032μs | 623.7384 KOps/s | 618.6956 KOps/s | |
test_creation_empty | 21.9000μs | 7.9206μs | 126.2525 KOps/s | 108.3920 KOps/s | |
test_creation_nested_1 | 42.3910μs | 9.8248μs | 101.7831 KOps/s | 89.6687 KOps/s | |
test_creation_nested_2 | 31.7500μs | 12.3112μs | 81.2267 KOps/s | 73.5771 KOps/s | |
test_clone | 0.1558ms | 12.8589μs | 77.7672 KOps/s | 77.1915 KOps/s | |
test_getitem[int] | 25.6900μs | 11.1683μs | 89.5391 KOps/s | 87.9813 KOps/s | |
test_getitem[slice_int] | 44.4810μs | 21.7223μs | 46.0357 KOps/s | 45.3078 KOps/s | |
test_getitem[range] | 69.0910μs | 37.7513μs | 26.4892 KOps/s | 27.3363 KOps/s | |
test_getitem[tuple] | 55.6810μs | 18.8605μs | 53.0209 KOps/s | 53.4351 KOps/s | |
test_getitem[list] | 0.4048ms | 35.0628μs | 28.5203 KOps/s | 29.3296 KOps/s | |
test_setitem_dim[int] | 66.4710μs | 27.7454μs | 36.0421 KOps/s | 35.9479 KOps/s | |
test_setitem_dim[slice_int] | 82.5810μs | 49.0132μs | 20.4027 KOps/s | 21.0275 KOps/s | |
test_setitem_dim[range] | 0.1087ms | 65.1509μs | 15.3490 KOps/s | 15.9271 KOps/s | |
test_setitem_dim[tuple] | 61.3710μs | 41.7874μs | 23.9306 KOps/s | 23.9579 KOps/s | |
test_setitem | 0.1372ms | 17.3407μs | 57.6678 KOps/s | 55.0099 KOps/s | |
test_set | 0.1325ms | 16.7488μs | 59.7058 KOps/s | 56.8552 KOps/s | |
test_set_shared | 2.8979ms | 0.1026ms | 9.7492 KOps/s | 9.8526 KOps/s | |
test_update | 0.1280ms | 19.4351μs | 51.4533 KOps/s | 48.1437 KOps/s | |
test_update_nested | 0.1527ms | 25.5954μs | 39.0695 KOps/s | 36.7373 KOps/s | |
test_set_nested | 0.1329ms | 18.0311μs | 55.4599 KOps/s | 53.1385 KOps/s | |
test_set_nested_new | 0.1380ms | 20.9831μs | 47.6575 KOps/s | 45.3712 KOps/s | |
test_select | 0.1565ms | 41.8579μs | 23.8903 KOps/s | 23.4074 KOps/s | |
test_to | 74.6020μs | 54.2570μs | 18.4308 KOps/s | 18.1902 KOps/s | |
test_to_nonblocking | 73.3010μs | 34.8042μs | 28.7322 KOps/s | 28.8325 KOps/s | |
test_unbind_speed | 0.3722ms | 0.3311ms | 3.0206 KOps/s | 3.0242 KOps/s | |
test_unbind_speed_stack0 | 86.5820ms | 4.1775ms | 239.3772 Ops/s | 256.5125 Ops/s | |
test_unbind_speed_stack1 | 1.5750μs | 0.5388μs | 1.8559 MOps/s | 1.8863 MOps/s | |
test_split | 1.8554ms | 1.5727ms | 635.8451 Ops/s | 575.5000 Ops/s | |
test_chunk | 79.0481ms | 1.7026ms | 587.3235 Ops/s | 584.0735 Ops/s | |
test_creation[device0] | 0.1417ms | 72.6205μs | 13.7702 KOps/s | 13.8337 KOps/s | |
test_creation_from_tensor | 0.1317ms | 53.3616μs | 18.7401 KOps/s | 17.5757 KOps/s | |
test_add_one[memmap_tensor0] | 0.1472ms | 7.0969μs | 140.9062 KOps/s | 140.7511 KOps/s | |
test_contiguous[memmap_tensor0] | 23.7800μs | 0.6506μs | 1.5370 MOps/s | 1.5207 MOps/s | |
test_stack[memmap_tensor0] | 33.5810μs | 4.5966μs | 217.5522 KOps/s | 224.1874 KOps/s | |
test_memmaptd_index | 0.2710ms | 0.2485ms | 4.0249 KOps/s | 4.0655 KOps/s | |
test_memmaptd_index_astensor | 0.3311ms | 0.3012ms | 3.3197 KOps/s | 3.2936 KOps/s | |
test_memmaptd_index_op | 0.7806ms | 0.5868ms | 1.7042 KOps/s | 1.6495 KOps/s | |
test_serialize_model | 0.1701s | 99.1106ms | 10.0897 Ops/s | 9.6108 Ops/s | |
test_serialize_model_pickle | 1.3487s | 1.2365s | 0.8088 Ops/s | 0.8056 Ops/s | |
test_serialize_weights | 0.1696s | 96.1028ms | 10.4055 Ops/s | 9.7487 Ops/s | |
test_serialize_weights_returnearly | 0.2726s | 79.0735ms | 12.6465 Ops/s | 14.7876 Ops/s | |
test_serialize_weights_pickle | 1.3531s | 1.2382s | 0.8077 Ops/s | 0.8082 Ops/s | |
test_reshape_pytree | 57.2110μs | 24.6835μs | 40.5129 KOps/s | 40.9625 KOps/s | |
test_reshape_td | 58.9410μs | 29.0580μs | 34.4139 KOps/s | 35.5274 KOps/s | |
test_view_pytree | 54.6410μs | 24.2635μs | 41.2141 KOps/s | 42.3294 KOps/s | |
test_view_td | 21.4110μs | 4.1009μs | 243.8491 KOps/s | 245.0510 KOps/s | |
test_unbind_pytree | 52.7810μs | 29.8786μs | 33.4688 KOps/s | 33.4730 KOps/s | |
test_unbind_td | 83.7810μs | 51.8030μs | 19.3039 KOps/s | 18.0397 KOps/s | |
test_split_pytree | 51.9610μs | 28.2703μs | 35.3729 KOps/s | 34.5890 KOps/s | |
test_split_td | 0.7514ms | 40.3405μs | 24.7890 KOps/s | 24.3731 KOps/s | |
test_add_pytree | 60.4610μs | 36.3874μs | 27.4820 KOps/s | 27.8288 KOps/s | |
test_add_td | 96.3610μs | 47.4363μs | 21.0809 KOps/s | 20.5102 KOps/s | |
test_distributed | 3.9383ms | 73.2619μs | 13.6497 KOps/s | 13.4299 KOps/s | |
test_tdmodule | 36.7400μs | 17.2074μs | 58.1144 KOps/s | 53.5232 KOps/s | |
test_tdmodule_dispatch | 0.2483ms | 33.5821μs | 29.7777 KOps/s | 28.7829 KOps/s | |
test_tdseq | 39.9410μs | 20.7328μs | 48.2327 KOps/s | 46.1799 KOps/s | |
test_tdseq_dispatch | 53.1210μs | 36.2253μs | 27.6050 KOps/s | 26.4358 KOps/s | |
test_instantiation_functorch | 1.7707ms | 1.6760ms | 596.6663 Ops/s | 599.9548 Ops/s | |
test_instantiation_td | 1.7637ms | 1.1931ms | 838.1392 Ops/s | 865.1082 Ops/s | |
test_exec_functorch | 0.1953ms | 0.1558ms | 6.4183 KOps/s | 6.3371 KOps/s | |
test_exec_functional_call | 0.1854ms | 0.1578ms | 6.3356 KOps/s | 6.3427 KOps/s | |
test_exec_td | 0.2165ms | 0.1474ms | 6.7820 KOps/s | 6.7759 KOps/s | |
test_exec_td_decorator | 0.7590ms | 0.1889ms | 5.2941 KOps/s | 5.3562 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.2023ms | 1.1157ms | 896.3102 Ops/s | 899.5287 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.7271ms | 0.6647ms | 1.5044 KOps/s | 1.5109 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1615ms | 1.0275ms | 973.2353 Ops/s | 978.4973 Ops/s | |
test_vmap_mlp_speed[False-False] | 0.6537ms | 0.5919ms | 1.6896 KOps/s | 1.6981 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.2408ms | 2.5343ms | 394.5879 Ops/s | 391.1642 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0136ms | 0.7109ms | 1.4067 KOps/s | 1.3960 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.5364ms | 2.1120ms | 473.4815 Ops/s | 466.8251 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.1096ms | 0.6106ms | 1.6378 KOps/s | 1.6428 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.6607ms | 12.5604ms | 79.6150 Ops/s | 79.8489 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.2966ms | 8.2350ms | 121.4329 Ops/s | 121.6993 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.4867ms | 12.4337ms | 80.4263 Ops/s | 80.9636 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.1948ms | 8.1601ms | 122.5480 Ops/s | 122.8381 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 0.1665s | 83.0680ms | 12.0383 Ops/s | 11.9058 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 21.5018ms | 19.7528ms | 50.6258 Ops/s | 50.7519 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 69.9475ms | 68.9055ms | 14.5126 Ops/s | 14.4450 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.1530ms | 19.4443ms | 51.4289 Ops/s | 47.0814 Ops/s |

@vmoens The PR looks good to me. I'm thinking of writing a DCP storage plugin with TensorDict to see how it works, but we will need TensorDict to support 1) DTensor and 2) optimizer state_dict.
@fegin I made the from_module method a little more recursive. Let me look into DTensor and optimizer state_dict compatibility.
Labels: CLA Signed, enhancement (New feature or request)
Optionally calls the state_dict hooks in state_dict and load_state_dict.
cc @fegin
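
To make the description concrete, here is a minimal, dependency-free sketch of what "optionally calling the state_dict hooks" could look like. ToyModule, its two hook lists, and the use_state_dict flag are hypothetical names chosen for illustration. They stand in for torch.nn.Module and for the from_module / to_module methods named in the title, and are not taken from the PR's diff.

```python
# Toy stand-in for torch.nn.Module: only the hook mechanics are modeled.
# All names here (ToyModule, use_state_dict, ...) are illustrative assumptions.
class ToyModule:
    def __init__(self, **params):
        self.params = dict(params)
        self.state_dict_hooks = []  # each called as hook(module, state) after collection
        self.load_pre_hooks = []    # each called as hook(module, state) before loading

    def state_dict(self):
        state = dict(self.params)
        for hook in self.state_dict_hooks:
            result = hook(self, state)
            if result is not None:  # a hook may return a replacement dict
                state = result
        return state

    def load_state_dict(self, state):
        state = dict(state)  # copy so hooks can edit it in place
        for hook in self.load_pre_hooks:
            hook(self, state)
        self.params.update(state)


def from_module(module, use_state_dict=False):
    """Collect parameters; go through state_dict() only if hooks should fire."""
    return module.state_dict() if use_state_dict else dict(module.params)


def to_module(params, module, use_state_dict=False):
    """Write parameters back; go through load_state_dict() only if hooks should fire."""
    if use_state_dict:
        module.load_state_dict(params)
    else:
        module.params.update(params)


def add_prefix(module, state):
    # state_dict hook: rename every key on the way out
    return {"wrapped." + key: value for key, value in state.items()}


def strip_prefix(module, state):
    # load pre-hook: undo the renaming in place on the way in
    for key in list(state):
        if key.startswith("wrapped."):
            state[key[len("wrapped."):]] = state.pop(key)


module = ToyModule(weight=1.0)
module.state_dict_hooks.append(add_prefix)
module.load_pre_hooks.append(strip_prefix)

plain = from_module(module)                        # hooks skipped
hooked = from_module(module, use_state_dict=True)  # hooks applied
to_module({"wrapped.weight": 2.0}, module, use_state_dict=True)
```

In this sketch the default path reads and writes the parameters directly, so existing callers see no change, while the opt-in path routes through state_dict() and load_state_dict() so that any registered hooks get a chance to rewrite the keys.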