[Refactor] Faster instantiation #550
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 56.0350μs | 15.2140μs | 65.7291 KOps/s | 65.8717 KOps/s | |
test_plain_set_stack_nested | 0.2698ms | 0.1423ms | 7.0250 KOps/s | 7.1038 KOps/s | |
test_plain_set_nested_inplace | 46.5170μs | 18.1511μs | 55.0932 KOps/s | 55.4405 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2331ms | 0.1708ms | 5.8558 KOps/s | 5.9144 KOps/s | |
test_items | 38.9030μs | 2.4179μs | 413.5894 KOps/s | 388.8505 KOps/s | |
test_items_nested | 0.6022ms | 0.2656ms | 3.7647 KOps/s | 3.6904 KOps/s | |
test_items_nested_locked | 0.4476ms | 0.2668ms | 3.7477 KOps/s | 3.7076 KOps/s | |
test_items_nested_leaf | 0.5749ms | 0.1650ms | 6.0621 KOps/s | 6.0240 KOps/s | |
test_items_stack_nested | 2.3944ms | 1.3860ms | 721.5018 Ops/s | 671.5397 Ops/s | |
test_items_stack_nested_leaf | 1.5137ms | 1.2583ms | 794.6927 Ops/s | 745.6299 Ops/s | |
test_items_stack_nested_locked | 0.8473ms | 0.7495ms | 1.3343 KOps/s | 1.3371 KOps/s | |
test_keys | 17.8130μs | 3.8384μs | 260.5233 KOps/s | 258.1388 KOps/s | |
test_keys_nested | 1.4759ms | 0.1371ms | 7.2915 KOps/s | 6.9607 KOps/s | |
test_keys_nested_locked | 0.1967ms | 0.1372ms | 7.2891 KOps/s | 7.3779 KOps/s | |
test_keys_nested_leaf | 0.2504ms | 0.1352ms | 7.3939 KOps/s | 7.4647 KOps/s | |
test_keys_stack_nested | 5.1950ms | 1.3304ms | 751.6449 Ops/s | 722.5429 Ops/s | |
test_keys_stack_nested_leaf | 1.4026ms | 1.2750ms | 784.3283 Ops/s | 718.3891 Ops/s | |
test_keys_stack_nested_locked | 0.9216ms | 0.6293ms | 1.5891 KOps/s | 1.6131 KOps/s | |
test_values | 11.4237μs | 1.1851μs | 843.8122 KOps/s | 840.0529 KOps/s | |
test_values_nested | 97.1920μs | 46.9313μs | 21.3078 KOps/s | 20.5361 KOps/s | |
test_values_nested_locked | 0.1006ms | 47.5879μs | 21.0137 KOps/s | 20.6198 KOps/s | |
test_values_nested_leaf | 96.0510μs | 42.2372μs | 23.6758 KOps/s | 22.8973 KOps/s | |
test_values_stack_nested | 1.7276ms | 1.1130ms | 898.4781 Ops/s | 829.3066 Ops/s | |
test_values_stack_nested_leaf | 1.2550ms | 1.1004ms | 908.7589 Ops/s | 841.0574 Ops/s | |
test_values_stack_nested_locked | 0.8348ms | 0.4954ms | 2.0185 KOps/s | 2.0759 KOps/s | |
test_membership | 15.5990μs | 1.3404μs | 746.0672 KOps/s | 741.6596 KOps/s | |
test_membership_nested | 23.7540μs | 2.7659μs | 361.5422 KOps/s | 358.1436 KOps/s | |
test_membership_nested_leaf | 42.1790μs | 2.7848μs | 359.0900 KOps/s | 360.5792 KOps/s | |
test_membership_stacked_nested | 39.6240μs | 11.4176μs | 87.5843 KOps/s | 88.0807 KOps/s | |
test_membership_stacked_nested_leaf | 55.9850μs | 11.3277μs | 88.2794 KOps/s | 88.2459 KOps/s | |
test_membership_nested_last | 24.7760μs | 5.7278μs | 174.5857 KOps/s | 172.2179 KOps/s | |
test_membership_nested_leaf_last | 21.0800μs | 5.7247μs | 174.6813 KOps/s | 166.7107 KOps/s | |
test_membership_stacked_nested_last | 0.3913ms | 0.1772ms | 5.6439 KOps/s | 5.6516 KOps/s | |
test_membership_stacked_nested_leaf_last | 40.9160μs | 13.4471μs | 74.3653 KOps/s | 75.3878 KOps/s | |
test_nested_getleaf | 45.4860μs | 11.8564μs | 84.3427 KOps/s | 84.0394 KOps/s | |
test_nested_get | 46.6350μs | 11.1719μs | 89.5101 KOps/s | 89.0916 KOps/s | |
test_stacked_getleaf | 0.9925ms | 0.5792ms | 1.7264 KOps/s | 1.5247 KOps/s | |
test_stacked_get | 0.6278ms | 0.5502ms | 1.8177 KOps/s | 1.5966 KOps/s | |
test_nested_getitemleaf | 34.9350μs | 11.9062μs | 83.9896 KOps/s | 83.6307 KOps/s | |
test_nested_getitem | 51.2670μs | 11.2211μs | 89.1174 KOps/s | 88.4688 KOps/s | |
test_stacked_getitemleaf | 0.8596ms | 0.5785ms | 1.7285 KOps/s | 1.5282 KOps/s | |
test_stacked_getitem | 0.9713ms | 0.5524ms | 1.8102 KOps/s | 1.5956 KOps/s | |
test_lock_nested | 53.9103ms | 0.9434ms | 1.0600 KOps/s | 896.1367 Ops/s | |
test_lock_stack_nested | 67.9741ms | 12.5843ms | 79.4640 Ops/s | 66.9588 Ops/s | |
test_unlock_nested | 54.3506ms | 0.9564ms | 1.0456 KOps/s | 842.7497 Ops/s | |
test_unlock_stack_nested | 72.7516ms | 13.0772ms | 76.4688 Ops/s | 63.6767 Ops/s | |
test_flatten_speed | 0.7630ms | 0.6886ms | 1.4521 KOps/s | 1.2708 KOps/s | |
test_unflatten_speed | 1.7833ms | 1.1957ms | 836.3437 Ops/s | 715.1534 Ops/s | |
test_common_ops | 0.7243ms | 0.6222ms | 1.6072 KOps/s | 1.1750 KOps/s | |
test_creation | 18.2640μs | 2.1775μs | 459.2436 KOps/s | 215.6383 KOps/s | |
test_creation_empty | 32.0400μs | 7.4258μs | 134.6649 KOps/s | 96.1951 KOps/s | |
test_creation_nested_1 | 30.8480μs | 11.5200μs | 86.8053 KOps/s | 53.9119 KOps/s | |
test_creation_nested_2 | 46.4470μs | 13.8338μs | 72.2867 KOps/s | 48.1362 KOps/s | |
test_clone | 73.2880μs | 10.6714μs | 93.7084 KOps/s | 56.1069 KOps/s | |
test_getitem[int] | 52.4790μs | 13.1137μs | 76.2561 KOps/s | 48.0495 KOps/s | |
test_getitem[slice_int] | 85.6310μs | 29.4133μs | 33.9982 KOps/s | 24.3198 KOps/s | |
test_getitem[range] | 0.1335ms | 55.4246μs | 18.0425 KOps/s | 14.9624 KOps/s | |
test_getitem[tuple] | 55.9350μs | 23.3660μs | 42.7973 KOps/s | 29.9441 KOps/s | |
test_getitem[list] | 0.2219ms | 49.6428μs | 20.1439 KOps/s | 16.1251 KOps/s | |
test_setitem_dim[int] | 52.2680μs | 26.3926μs | 37.8894 KOps/s | 37.7733 KOps/s | |
test_setitem_dim[slice_int] | 95.0680μs | 49.8568μs | 20.0574 KOps/s | 19.2580 KOps/s | |
test_setitem_dim[range] | 0.1633ms | 73.1597μs | 13.6687 KOps/s | 14.0077 KOps/s | |
test_setitem_dim[tuple] | 74.6800μs | 39.4273μs | 25.3631 KOps/s | 24.5337 KOps/s | |
test_setitem | 97.2620μs | 14.7664μs | 67.7212 KOps/s | 43.1305 KOps/s | |
test_set | 0.1010ms | 14.1035μs | 70.9042 KOps/s | 44.2877 KOps/s | |
test_set_shared | 1.6091ms | 0.1625ms | 6.1551 KOps/s | 6.0987 KOps/s | |
test_update | 0.2170ms | 18.9540μs | 52.7592 KOps/s | 38.5237 KOps/s | |
test_update_nested | 0.1171ms | 28.4260μs | 35.1790 KOps/s | 25.6557 KOps/s | |
test_set_nested | 77.7060μs | 16.3969μs | 60.9873 KOps/s | 40.6194 KOps/s | |
test_set_nested_new | 0.1244ms | 22.3930μs | 44.6568 KOps/s | 25.3488 KOps/s | |
test_select | 0.1726ms | 47.1530μs | 21.2075 KOps/s | 12.9721 KOps/s | |
test_unbind_speed | 0.5076ms | 0.2918ms | 3.4266 KOps/s | 2.0365 KOps/s | |
test_unbind_speed_stack0 | 63.7672ms | 4.6376ms | 215.6292 Ops/s | 158.3695 Ops/s | |
test_unbind_speed_stack1 | 2.0544μs | 0.6130μs | 1.6313 MOps/s | 1.6486 MOps/s | |
test_creation[device0] | 0.4010ms | 0.2868ms | 3.4862 KOps/s | 3.5075 KOps/s | |
test_creation_from_tensor | 3.3112ms | 0.3215ms | 3.1104 KOps/s | 3.1271 KOps/s | |
test_add_one[memmap_tensor0] | 0.4685ms | 24.5745μs | 40.6926 KOps/s | 40.8964 KOps/s | |
test_contiguous[memmap_tensor0] | 27.1310μs | 5.6998μs | 175.4451 KOps/s | 174.9651 KOps/s | |
test_stack[memmap_tensor0] | 53.1800μs | 18.8392μs | 53.0808 KOps/s | 54.4446 KOps/s | |
test_memmaptd_index | 0.4509ms | 0.2383ms | 4.1958 KOps/s | 4.1151 KOps/s | |
test_memmaptd_index_astensor | 1.1044ms | 0.9109ms | 1.0978 KOps/s | 1.0681 KOps/s | |
test_memmaptd_index_op | 2.8473ms | 2.1405ms | 467.1745 Ops/s | 451.7064 Ops/s | |
test_reshape_pytree | 77.4050μs | 23.3870μs | 42.7588 KOps/s | 43.2620 KOps/s | |
test_reshape_td | 54.6630μs | 21.0596μs | 47.4842 KOps/s | 32.8756 KOps/s | |
test_view_pytree | 63.7090μs | 23.2909μs | 42.9352 KOps/s | 43.2808 KOps/s | |
test_view_td | 19.0550μs | 4.1221μs | 242.5935 KOps/s | 155.4520 KOps/s | |
test_unbind_pytree | 68.3880μs | 26.4016μs | 37.8764 KOps/s | 37.9928 KOps/s | |
test_unbind_td | 82.6140μs | 40.7063μs | 24.5662 KOps/s | 13.7284 KOps/s | |
test_split_pytree | 91.3110μs | 26.4871μs | 37.7542 KOps/s | 38.5426 KOps/s | |
test_split_td | 0.5411ms | 55.1011μs | 18.1485 KOps/s | 12.0454 KOps/s | |
test_add_pytree | 73.0060μs | 32.3796μs | 30.8836 KOps/s | 31.0878 KOps/s | |
test_add_td | 2.7767ms | 43.0637μs | 23.2214 KOps/s | 17.3195 KOps/s | |
test_distributed | 26.4700μs | 6.0095μs | 166.4025 KOps/s | 161.1689 KOps/s | |
test_tdmodule | 0.1687ms | 21.0695μs | 47.4621 KOps/s | 44.7298 KOps/s | |
test_tdmodule_dispatch | 0.2111ms | 37.3291μs | 26.7887 KOps/s | 24.3657 KOps/s | |
test_tdseq | 0.1163ms | 24.0006μs | 41.6656 KOps/s | 41.9140 KOps/s | |
test_tdseq_dispatch | 0.1414ms | 42.4701μs | 23.5460 KOps/s | 21.5663 KOps/s | |
test_instantiation_functorch | 1.4210ms | 1.2976ms | 770.6753 Ops/s | 723.8403 Ops/s | |
test_instantiation_td | 1.6292ms | 1.0503ms | 952.0721 Ops/s | 948.1749 Ops/s | |
test_exec_functorch | 0.2145ms | 0.1435ms | 6.9695 KOps/s | 6.9405 KOps/s | |
test_exec_td | 0.2186ms | 0.1407ms | 7.1062 KOps/s | 6.9808 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.2700ms | 0.8467ms | 1.1810 KOps/s | 1.0789 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7000ms | 0.4569ms | 2.1888 KOps/s | 2.1132 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0869ms | 0.7420ms | 1.3477 KOps/s | 1.2335 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5978ms | 0.3792ms | 2.6371 KOps/s | 2.6389 KOps/s |
Woohoo! +112% on creation speed!
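The Change column came through empty, so here is a minimal sketch of how the relative change can be recomputed from the two Ops columns. The helper name `relative_change` is hypothetical, and the formula (new throughput over HEAD throughput, minus one) is an assumption about how the benchmark report derives its percentage; the numbers are copied from the table above.

```python
def relative_change(ops_new: float, ops_head: float) -> float:
    """Percent change in throughput of this branch relative to repo HEAD."""
    return (ops_new / ops_head - 1.0) * 100.0


# (Ops on this branch, Ops on repo HEAD), both in KOps/s, taken from the table.
rows = {
    "test_creation": (459.2436, 215.6383),
    "test_creation_empty": (134.6649, 96.1951),
    "test_clone": (93.7084, 56.1069),
    "test_unbind_td": (24.5662, 13.7284),
}

changes = {name: relative_change(new, head) for name, (new, head) in rows.items()}

for name, pct in changes.items():
    # Positive values mean this branch is faster than HEAD.
    print(f"{name}: {pct:+.1f}%")
```

Under that formula `test_creation` lands between +112% and +113%, which matches the figure quoted in the comment.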
I see, so defining the attributes directly as class-level attributes is faster at runtime than assigning them in the `__new__` method, is that it?
Yes, that and removing `__slots__`.
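To make the exchange above concrete, here is a rough sketch of the two patterns, assuming "the new method" refers to `__new__` and "slots" to `__slots__`. The class names and attribute names (`_is_locked`, `_is_shared`, `_is_memmap`, `_cache`) are hypothetical and are not taken from the PR diff; this is not the library's actual code.

```python
import timeit


class AssignInNew:
    """Hypothetical 'before': __slots__ plus per-instance assignment in __new__."""

    __slots__ = ("_is_locked", "_is_shared", "_is_memmap", "_cache")

    def __new__(cls):
        self = super().__new__(cls)
        # Every instantiation pays for these four attribute stores.
        self._is_locked = False
        self._is_shared = False
        self._is_memmap = False
        self._cache = None
        return self


class ClassLevelDefaults:
    """Hypothetical 'after': defaults live on the class, no __slots__, no __new__."""

    # Looked up on the class until an instance overrides them.
    # Only immutable defaults are safe here, since the value is shared.
    _is_locked = False
    _is_shared = False
    _is_memmap = False
    _cache = None


# Time plain instantiation of each variant; results depend on the
# interpreter version and machine, so no expected numbers are given.
n = 100_000
t_new = timeit.timeit(AssignInNew, number=n)
t_cls = timeit.timeit(ClassLevelDefaults, number=n)
print(f"assign in __new__:    {t_new / n * 1e9:.0f} ns per instance")
print(f"class-level defaults: {t_cls / n * 1e9:.0f} ns per instance")

# Assigning on an instance shadows the class default without changing it.
td = ClassLevelDefaults()
td._is_locked = True
print(td._is_locked, ClassLevelDefaults._is_locked)
```

The trade-off is that instances without `__slots__` carry a `__dict__`, so this pattern favors instantiation speed over per-instance memory.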
Looking forward to seeing how it affects torchrl speed! :)