[Feature] inplace to_module #610
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 31.3390μs | 15.5836μs | 64.1700 KOps/s | 63.8748 KOps/s | |
test_plain_set_stack_nested | 0.1948ms | 0.1426ms | 7.0130 KOps/s | 7.0588 KOps/s | |
test_plain_set_nested_inplace | 63.2390μs | 17.9156μs | 55.8173 KOps/s | 55.4594 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2320ms | 0.1741ms | 5.7449 KOps/s | 5.6606 KOps/s | |
test_items | 17.5230μs | 2.5462μs | 392.7437 KOps/s | 398.5276 KOps/s | |
test_items_nested | 0.3926ms | 0.2659ms | 3.7613 KOps/s | 3.7150 KOps/s | |
test_items_nested_locked | 0.5567ms | 0.2672ms | 3.7431 KOps/s | 3.6776 KOps/s | |
test_items_nested_leaf | 1.1396ms | 0.1762ms | 5.6750 KOps/s | 5.9643 KOps/s | |
test_items_stack_nested | 1.4327ms | 1.3073ms | 764.9628 Ops/s | 752.5044 Ops/s | |
test_items_stack_nested_leaf | 1.2740ms | 1.1788ms | 848.3528 Ops/s | 841.3675 Ops/s | |
test_items_stack_nested_locked | 0.8622ms | 0.7567ms | 1.3215 KOps/s | 1.2934 KOps/s | |
test_keys | 18.6050μs | 3.9125μs | 255.5913 KOps/s | 255.3199 KOps/s | |
test_keys_nested | 50.3692ms | 0.1589ms | 6.2914 KOps/s | 6.7272 KOps/s | |
test_keys_nested_locked | 0.1992ms | 0.1486ms | 6.7290 KOps/s | 6.6958 KOps/s | |
test_keys_nested_leaf | 0.1997ms | 0.1295ms | 7.7238 KOps/s | 7.6947 KOps/s | |
test_keys_stack_nested | 2.4466ms | 1.2746ms | 784.5386 Ops/s | 781.1785 Ops/s | |
test_keys_stack_nested_leaf | 1.8233ms | 1.2637ms | 791.3312 Ops/s | 785.5006 Ops/s | |
test_keys_stack_nested_locked | 5.2063ms | 0.7059ms | 1.4166 KOps/s | 1.4264 KOps/s | |
test_values | 8.0232μs | 1.1358μs | 880.4555 KOps/s | 833.2648 KOps/s | |
test_values_nested | 0.1031ms | 52.0042μs | 19.2292 KOps/s | 19.3325 KOps/s | |
test_values_nested_locked | 99.8880μs | 52.3951μs | 19.0857 KOps/s | 19.1451 KOps/s | |
test_values_nested_leaf | 0.1201ms | 46.8142μs | 21.3611 KOps/s | 21.8429 KOps/s | |
test_values_stack_nested | 1.6950ms | 1.0459ms | 956.0710 Ops/s | 946.6361 Ops/s | |
test_values_stack_nested_leaf | 1.1340ms | 1.0271ms | 973.5986 Ops/s | 961.9214 Ops/s | |
test_values_stack_nested_locked | 0.8717ms | 0.5052ms | 1.9796 KOps/s | 1.9165 KOps/s | |
test_membership | 35.9780μs | 1.3743μs | 727.6463 KOps/s | 757.3635 KOps/s | |
test_membership_nested | 19.9480μs | 2.8709μs | 348.3212 KOps/s | 347.3336 KOps/s | |
test_membership_nested_leaf | 37.4900μs | 2.8765μs | 347.6480 KOps/s | 342.3955 KOps/s | |
test_membership_stacked_nested | 28.8040μs | 11.9950μs | 83.3681 KOps/s | 82.6833 KOps/s | |
test_membership_stacked_nested_leaf | 45.2950μs | 12.0510μs | 82.9804 KOps/s | 82.1650 KOps/s | |
test_membership_nested_last | 46.9180μs | 6.0310μs | 165.8105 KOps/s | 167.4666 KOps/s | |
test_membership_nested_leaf_last | 26.6400μs | 6.0588μs | 165.0481 KOps/s | 167.6926 KOps/s | |
test_membership_stacked_nested_last | 0.2239ms | 0.1673ms | 5.9759 KOps/s | 5.9813 KOps/s | |
test_membership_stacked_nested_leaf_last | 58.9910μs | 14.0671μs | 71.0878 KOps/s | 70.5914 KOps/s | |
test_nested_getleaf | 35.2070μs | 10.6989μs | 93.4673 KOps/s | 94.2008 KOps/s | |
test_nested_get | 30.8780μs | 10.0835μs | 99.1715 KOps/s | 99.0649 KOps/s | |
test_stacked_getleaf | 0.5962ms | 0.4660ms | 2.1459 KOps/s | 2.1388 KOps/s | |
test_stacked_get | 0.7765ms | 0.4371ms | 2.2879 KOps/s | 2.2742 KOps/s | |
test_nested_getitemleaf | 54.1720μs | 10.5839μs | 94.4833 KOps/s | 93.2539 KOps/s | |
test_nested_getitem | 48.1810μs | 10.0355μs | 99.6466 KOps/s | 99.2503 KOps/s | |
test_stacked_getitemleaf | 0.8406ms | 0.4685ms | 2.1345 KOps/s | 2.1220 KOps/s | |
test_stacked_getitem | 0.7172ms | 0.4381ms | 2.2828 KOps/s | 2.2686 KOps/s | |
test_lock_nested | 1.2448ms | 0.4128ms | 2.4222 KOps/s | 2.4009 KOps/s | |
test_lock_stack_nested | 77.0164ms | 6.3670ms | 157.0588 Ops/s | 155.4492 Ops/s | |
test_unlock_nested | 66.7179ms | 0.4840ms | 2.0663 KOps/s | 2.3482 KOps/s | |
test_unlock_stack_nested | 73.7984ms | 6.0421ms | 165.5061 Ops/s | 163.8035 Ops/s | |
test_flatten_speed | 0.6382ms | 0.3686ms | 2.7129 KOps/s | 2.7017 KOps/s | |
test_unflatten_speed | 0.9484ms | 0.4523ms | 2.2108 KOps/s | 2.2313 KOps/s | |
test_common_ops | 4.7481ms | 0.6565ms | 1.5231 KOps/s | 1.4978 KOps/s | |
test_creation | 16.7710μs | 1.9427μs | 514.7372 KOps/s | 501.6324 KOps/s | |
test_creation_empty | 26.3400μs | 7.7181μs | 129.5650 KOps/s | 122.0734 KOps/s | |
test_creation_nested_1 | 33.7240μs | 10.6103μs | 94.2476 KOps/s | 91.7586 KOps/s | |
test_creation_nested_2 | 69.5810μs | 15.7877μs | 63.3406 KOps/s | 58.6848 KOps/s | |
test_clone | 0.1023ms | 12.5682μs | 79.5656 KOps/s | 78.7840 KOps/s | |
test_getitem[int] | 34.3250μs | 11.8912μs | 84.0955 KOps/s | 79.9204 KOps/s | |
test_getitem[slice_int] | 84.4290μs | 23.4317μs | 42.6772 KOps/s | 41.8593 KOps/s | |
test_getitem[range] | 0.1176ms | 44.2870μs | 22.5800 KOps/s | 23.2440 KOps/s | |
test_getitem[tuple] | 44.6840μs | 18.9664μs | 52.7249 KOps/s | 51.5002 KOps/s | |
test_getitem[list] | 99.7070μs | 37.3930μs | 26.7430 KOps/s | 26.2794 KOps/s | |
test_setitem_dim[int] | 51.5470μs | 26.5222μs | 37.7043 KOps/s | 34.9769 KOps/s | |
test_setitem_dim[slice_int] | 0.1203ms | 52.8620μs | 18.9172 KOps/s | 18.3631 KOps/s | |
test_setitem_dim[range] | 0.1161ms | 70.5367μs | 14.1770 KOps/s | 13.8018 KOps/s | |
test_setitem_dim[tuple] | 78.0270μs | 40.6446μs | 24.6035 KOps/s | 22.4314 KOps/s | |
test_setitem | 0.1409ms | 17.0645μs | 58.6011 KOps/s | 54.8712 KOps/s | |
test_set | 0.1172ms | 16.5621μs | 60.3790 KOps/s | 56.9595 KOps/s | |
test_set_shared | 6.9972ms | 0.1382ms | 7.2359 KOps/s | 7.1438 KOps/s | |
test_update | 0.1266ms | 18.4173μs | 54.2968 KOps/s | 51.4774 KOps/s | |
test_update_nested | 0.1413ms | 25.7088μs | 38.8972 KOps/s | 37.6592 KOps/s | |
test_set_nested | 0.1235ms | 18.5587μs | 53.8830 KOps/s | 51.5273 KOps/s | |
test_set_nested_new | 0.1336ms | 22.3316μs | 44.7796 KOps/s | 42.7529 KOps/s | |
test_select | 0.1207ms | 46.7060μs | 21.4105 KOps/s | 21.1261 KOps/s | |
test_unbind_speed | 0.4996ms | 0.3387ms | 2.9524 KOps/s | 2.8890 KOps/s | |
test_unbind_speed_stack0 | 71.1752ms | 4.2798ms | 233.6541 Ops/s | 238.8738 Ops/s | |
test_unbind_speed_stack1 | 2.7728μs | 0.6456μs | 1.5488 MOps/s | 1.5853 MOps/s | |
test_split | 1.6159ms | 1.5430ms | 648.0863 Ops/s | 588.2220 Ops/s | |
test_chunk | 66.5630ms | 1.6356ms | 611.4123 Ops/s | 597.7710 Ops/s | |
test_creation[device0] | 3.3799ms | 0.2960ms | 3.3780 KOps/s | 3.4331 KOps/s | |
test_creation_from_tensor | 66.2213ms | 0.3642ms | 2.7458 KOps/s | 3.0539 KOps/s | |
test_add_one[memmap_tensor0] | 0.2822ms | 25.0403μs | 39.9356 KOps/s | 29.3441 KOps/s | |
test_contiguous[memmap_tensor0] | 26.7200μs | 5.8796μs | 170.0804 KOps/s | 174.2072 KOps/s | |
test_stack[memmap_tensor0] | 51.0760μs | 19.5590μs | 51.1274 KOps/s | 51.0145 KOps/s | |
test_memmaptd_index | 0.3013ms | 0.1925ms | 5.1958 KOps/s | 5.0513 KOps/s | |
test_memmaptd_index_astensor | 0.4967ms | 0.2508ms | 3.9867 KOps/s | 3.9123 KOps/s | |
test_memmaptd_index_op | 0.9956ms | 0.4850ms | 2.0620 KOps/s | 1.9807 KOps/s | |
test_serialize_model | 0.1653s | 0.1047s | 9.5510 Ops/s | 9.9819 Ops/s | |
test_serialize_model_filesystem | 98.1557ms | 91.8300ms | 10.8897 Ops/s | 10.6636 Ops/s | |
test_serialize_model_pickle | 0.4479s | 0.3817s | 2.6198 Ops/s | 2.6130 Ops/s | |
test_serialize_weights | 96.2699ms | 93.5229ms | 10.6926 Ops/s | 9.4470 Ops/s | |
test_serialize_weights_filesystem | 0.1589s | 97.0804ms | 10.3007 Ops/s | 10.2041 Ops/s | |
test_serialize_weights_returnearly | 0.1783s | 0.1276s | 7.8341 Ops/s | 8.2044 Ops/s | |
test_serialize_weights_pickle | 1.1604s | 0.6491s | 1.5406 Ops/s | 2.0662 Ops/s | |
test_reshape_pytree | 54.9740μs | 23.5764μs | 42.4153 KOps/s | 42.4681 KOps/s | |
test_reshape_td | 58.8810μs | 30.6858μs | 32.5884 KOps/s | 33.5099 KOps/s | |
test_view_pytree | 80.5820μs | 23.4322μs | 42.6763 KOps/s | 42.9318 KOps/s | |
test_view_td | 63.0370μs | 4.8700μs | 205.3386 KOps/s | 206.4646 KOps/s | |
test_unbind_pytree | 56.7870μs | 26.0398μs | 38.4027 KOps/s | 37.6495 KOps/s | |
test_unbind_td | 0.1203ms | 55.4112μs | 18.0469 KOps/s | 18.1243 KOps/s | |
test_split_pytree | 62.8780μs | 26.5273μs | 37.6970 KOps/s | 38.2006 KOps/s | |
test_split_td | 0.5552ms | 43.2915μs | 23.0992 KOps/s | 22.6622 KOps/s | |
test_add_pytree | 0.1040ms | 32.0347μs | 31.2162 KOps/s | 30.5618 KOps/s | |
test_add_td | 0.1072ms | 42.8259μs | 23.3504 KOps/s | 22.0961 KOps/s | |
test_distributed | 19.5970μs | 5.9364μs | 168.4531 KOps/s | 166.4235 KOps/s | |
test_tdmodule | 0.9021ms | 21.9435μs | 45.5715 KOps/s | 46.7917 KOps/s | |
test_tdmodule_dispatch | 0.1748ms | 37.9904μs | 26.3225 KOps/s | 25.1699 KOps/s | |
test_tdseq | 56.5060μs | 24.2363μs | 41.2604 KOps/s | 39.5901 KOps/s | |
test_tdseq_dispatch | 0.1357ms | 42.1431μs | 23.7287 KOps/s | 23.0886 KOps/s | |
test_instantiation_functorch | 2.0637ms | 1.2886ms | 776.0438 Ops/s | 777.4379 Ops/s | |
test_instantiation_td | 1.4324ms | 0.9895ms | 1.0106 KOps/s | 994.3935 Ops/s | |
test_exec_functorch | 0.2777ms | 0.1565ms | 6.3885 KOps/s | 6.3230 KOps/s | |
test_exec_functional_call | 0.2311ms | 0.1481ms | 6.7500 KOps/s | 6.8421 KOps/s | |
test_exec_td | 0.2151ms | 0.1418ms | 7.0535 KOps/s | 6.9617 KOps/s | |
test_exec_td_decorator | 0.6322ms | 0.1748ms | 5.7218 KOps/s | 5.7667 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.9490ms | 0.8609ms | 1.1616 KOps/s | 1.1326 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8443ms | 0.4595ms | 2.1763 KOps/s | 2.1200 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.2431ms | 0.7597ms | 1.3163 KOps/s | 1.2980 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.8438ms | 0.3846ms | 2.6000 KOps/s | 2.5339 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.5624ms | 1.7151ms | 583.0654 Ops/s | 573.6739 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9792ms | 0.5086ms | 1.9660 KOps/s | 1.9299 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.9490ms | 1.4468ms | 691.1698 Ops/s | 686.8760 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6833ms | 0.3981ms | 2.5121 KOps/s | 2.4951 KOps/s |
@@ -308,6 +308,8 @@ def is_empty(self):
    def to_module(
        self,
        module,
        *,
        inplace: bool | None = None,
Is `None` supported for backward compatibility?
We want to catch inplace=True when use_state_dict=False since this is not implemented.
By making it None, and given that inplace is True for state dict, we tell users: you explicitly asked for False but we can't make that happen.
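The `None` default described above is a tri-state sentinel: it lets the function tell "the caller did not pass `inplace`" apart from "the caller passed `inplace=False`". A minimal sketch of that pattern follows. The helper `resolve_inplace` is hypothetical and not part of the library, and which combinations are rejected is an assumption: it follows the last sentence of the comment, where the state-dict path always writes in place and an explicit `False` cannot be honoured.

```python
from __future__ import annotations


def resolve_inplace(inplace: bool | None, use_state_dict: bool) -> bool:
    """Hypothetical helper illustrating a None-as-unspecified default."""
    if use_state_dict:
        # Assumption: loading through a state dict always copies in place,
        # so only an *explicit* False is an error; None just means "default".
        if inplace is False:
            raise ValueError(
                "inplace=False cannot be honoured when use_state_dict=True"
            )
        return True
    # Assumption: without a state dict, an unspecified value means out-of-place.
    return bool(inplace)


print(resolve_inplace(None, use_state_dict=True))
print(resolve_inplace(None, use_state_dict=False))
```

With a plain `inplace: bool = False` default, the first branch could not distinguish an omitted argument from an explicit `False`, so every call with `use_state_dict=True` would either raise or silently ignore the request.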
Thanks for explaining!
Allows `to_module` to write tensors in-place.