[BugFix] Remove shared/memmap inheritance from clone / select / exclude #624
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 42.2890μs | 16.9386μs | 59.0369 KOps/s | 59.6602 KOps/s | |
test_plain_set_stack_nested | 0.1859ms | 0.1405ms | 7.1163 KOps/s | 6.9729 KOps/s | |
test_plain_set_nested_inplace | 42.9310μs | 18.9778μs | 52.6931 KOps/s | 51.6330 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2962ms | 0.1742ms | 5.7398 KOps/s | 5.6393 KOps/s | |
test_items | 14.0360μs | 2.5062μs | 399.0073 KOps/s | 407.1817 KOps/s | |
test_items_nested | 1.0618ms | 0.2726ms | 3.6689 KOps/s | 3.6607 KOps/s | |
test_items_nested_locked | 0.3292ms | 0.2720ms | 3.6760 KOps/s | 3.6400 KOps/s | |
test_items_nested_leaf | 0.5716ms | 0.1695ms | 5.8982 KOps/s | 5.9558 KOps/s | |
test_items_stack_nested | 1.6044ms | 1.3337ms | 749.8019 Ops/s | 741.7511 Ops/s | |
test_items_stack_nested_leaf | 1.3147ms | 1.1994ms | 833.7642 Ops/s | 806.6262 Ops/s | |
test_items_stack_nested_locked | 0.9955ms | 0.8694ms | 1.1502 KOps/s | 1.1151 KOps/s | |
test_keys | 20.0080μs | 3.8631μs | 258.8607 KOps/s | 256.9797 KOps/s | |
test_keys_nested | 48.2249ms | 0.1588ms | 6.2961 KOps/s | 6.6597 KOps/s | |
test_keys_nested_locked | 0.2577ms | 0.1561ms | 6.4053 KOps/s | 6.4663 KOps/s | |
test_keys_nested_leaf | 0.2366ms | 0.1310ms | 7.6307 KOps/s | 7.6512 KOps/s | |
test_keys_stack_nested | 1.4084ms | 1.2843ms | 778.6379 Ops/s | 771.5054 Ops/s | |
test_keys_stack_nested_leaf | 1.4993ms | 1.2745ms | 784.6350 Ops/s | 773.0753 Ops/s | |
test_keys_stack_nested_locked | 1.0147ms | 0.8101ms | 1.2344 KOps/s | 1.1906 KOps/s | |
test_values | 7.3587μs | 1.2240μs | 817.0106 KOps/s | 909.7908 KOps/s | |
test_values_nested | 97.6520μs | 51.1912μs | 19.5346 KOps/s | 19.1978 KOps/s | |
test_values_nested_locked | 99.8370μs | 51.7165μs | 19.3362 KOps/s | 19.4083 KOps/s | |
test_values_nested_leaf | 0.1342ms | 45.4476μs | 22.0034 KOps/s | 21.6535 KOps/s | |
test_values_stack_nested | 1.2796ms | 1.0487ms | 953.5922 Ops/s | 954.7465 Ops/s | |
test_values_stack_nested_leaf | 1.6131ms | 1.0383ms | 963.1018 Ops/s | 923.4548 Ops/s | |
test_values_stack_nested_locked | 1.0163ms | 0.6182ms | 1.6176 KOps/s | 1.5886 KOps/s | |
test_membership | 13.6060μs | 1.3050μs | 766.2579 KOps/s | 714.9944 KOps/s | |
test_membership_nested | 41.7280μs | 3.4475μs | 290.0659 KOps/s | 288.6202 KOps/s | |
test_membership_nested_leaf | 26.8500μs | 3.4470μs | 290.1094 KOps/s | 296.1731 KOps/s | |
test_membership_stacked_nested | 48.7910μs | 12.0983μs | 82.6561 KOps/s | 82.2248 KOps/s | |
test_membership_stacked_nested_leaf | 35.5070μs | 12.1105μs | 82.5727 KOps/s | 80.5293 KOps/s | |
test_membership_nested_last | 36.8190μs | 6.6402μs | 150.5981 KOps/s | 151.4362 KOps/s | |
test_membership_nested_leaf_last | 29.4350μs | 6.6871μs | 149.5410 KOps/s | 151.1608 KOps/s | |
test_membership_stacked_nested_last | 0.3954ms | 0.1727ms | 5.7897 KOps/s | 5.5188 KOps/s | |
test_membership_stacked_nested_leaf_last | 53.7710μs | 14.3136μs | 69.8635 KOps/s | 69.5821 KOps/s | |
test_nested_getleaf | 48.6010μs | 10.3813μs | 96.3268 KOps/s | 91.7751 KOps/s | |
test_nested_get | 30.5870μs | 9.7941μs | 102.1025 KOps/s | 97.3025 KOps/s | |
test_stacked_getleaf | 0.6597ms | 0.4003ms | 2.4981 KOps/s | 2.5207 KOps/s | |
test_stacked_get | 0.6643ms | 0.3703ms | 2.7002 KOps/s | 2.7132 KOps/s | |
test_nested_getitemleaf | 36.8790μs | 10.4062μs | 96.0964 KOps/s | 93.4833 KOps/s | |
test_nested_getitem | 35.6370μs | 9.8850μs | 101.1629 KOps/s | 98.7431 KOps/s | |
test_stacked_getitemleaf | 0.7243ms | 0.4019ms | 2.4880 KOps/s | 2.4836 KOps/s | |
test_stacked_getitem | 0.5587ms | 0.3696ms | 2.7057 KOps/s | 2.7300 KOps/s | |
test_lock_nested | 1.2829ms | 0.3902ms | 2.5625 KOps/s | 2.5178 KOps/s | |
test_lock_stack_nested | 81.4792ms | 6.5024ms | 153.7889 Ops/s | 154.4270 Ops/s | |
test_unlock_nested | 70.4953ms | 0.4633ms | 2.1584 KOps/s | 2.5495 KOps/s | |
test_unlock_stack_nested | 81.5934ms | 6.1058ms | 163.7794 Ops/s | 166.1743 Ops/s | |
test_flatten_speed | 0.6276ms | 0.3641ms | 2.7463 KOps/s | 2.6964 KOps/s | |
test_unflatten_speed | 0.6529ms | 0.4504ms | 2.2203 KOps/s | 2.1842 KOps/s | |
test_common_ops | 3.9323ms | 0.6699ms | 1.4928 KOps/s | 1.4685 KOps/s | |
test_creation | 15.3390μs | 1.9279μs | 518.6997 KOps/s | 544.1994 KOps/s | |
test_creation_empty | 45.4650μs | 9.9300μs | 100.7052 KOps/s | 95.1942 KOps/s | |
test_creation_nested_1 | 31.0090μs | 12.4779μs | 80.1417 KOps/s | 77.9646 KOps/s | |
test_creation_nested_2 | 0.1079ms | 15.8204μs | 63.2095 KOps/s | 62.1888 KOps/s | |
test_clone | 0.2255ms | 14.0684μs | 71.0812 KOps/s | 77.3347 KOps/s | |
test_getitem[int] | 33.9640μs | 11.0734μs | 90.3063 KOps/s | 89.6644 KOps/s | |
test_getitem[slice_int] | 61.6250μs | 22.0697μs | 45.3111 KOps/s | 44.7815 KOps/s | |
test_getitem[range] | 85.2390μs | 39.6212μs | 25.2390 KOps/s | 24.4706 KOps/s | |
test_getitem[tuple] | 55.4040μs | 18.3067μs | 54.6248 KOps/s | 54.9633 KOps/s | |
test_getitem[list] | 78.3770μs | 35.2188μs | 28.3939 KOps/s | 26.8868 KOps/s | |
test_setitem_dim[int] | 60.6040μs | 30.9644μs | 32.2952 KOps/s | 32.6556 KOps/s | |
test_setitem_dim[slice_int] | 0.1144ms | 55.9193μs | 17.8829 KOps/s | 17.5232 KOps/s | |
test_setitem_dim[range] | 0.1372ms | 74.7204μs | 13.3832 KOps/s | 13.4004 KOps/s | |
test_setitem_dim[tuple] | 87.2740μs | 44.7652μs | 22.3388 KOps/s | 22.1361 KOps/s | |
test_setitem | 0.2559ms | 19.8637μs | 50.3432 KOps/s | 50.6162 KOps/s | |
test_set | 0.1619ms | 19.0798μs | 52.4116 KOps/s | 51.7975 KOps/s | |
test_set_shared | 3.2159ms | 0.1393ms | 7.1811 KOps/s | 7.0870 KOps/s | |
test_update | 0.2345ms | 21.4769μs | 46.5618 KOps/s | 44.8736 KOps/s | |
test_update_nested | 0.2384ms | 29.2373μs | 34.2028 KOps/s | 32.9524 KOps/s | |
test_set_nested | 0.1630ms | 21.1267μs | 47.3334 KOps/s | 47.3968 KOps/s | |
test_set_nested_new | 0.1506ms | 25.5473μs | 39.1431 KOps/s | 39.8361 KOps/s | |
test_select | 0.2761ms | 38.2821μs | 26.1218 KOps/s | 25.9240 KOps/s | |
test_select_nested | 0.1107ms | 57.8653μs | 17.2815 KOps/s | 16.3649 KOps/s | |
test_exclude_nested | 0.2787ms | 0.1080ms | 9.2588 KOps/s | 8.8417 KOps/s | |
test_empty[True] | 0.5275ms | 0.3192ms | 3.1333 KOps/s | 3.0846 KOps/s | |
test_empty[False] | 7.6704μs | 1.0531μs | 949.5375 KOps/s | 986.6617 KOps/s | |
test_unbind_speed | 0.5437ms | 0.3226ms | 3.0995 KOps/s | 3.1585 KOps/s | |
test_unbind_speed_stack0 | 75.1477ms | 3.9142ms | 255.4789 Ops/s | 224.6859 Ops/s | |
test_unbind_speed_stack1 | 2.3104μs | 0.6320μs | 1.5822 MOps/s | 1.5745 MOps/s | |
test_split | 72.4722ms | 1.6060ms | 622.6782 Ops/s | 684.9593 Ops/s | |
test_chunk | 70.0045ms | 1.5794ms | 633.1364 Ops/s | 646.5099 Ops/s | |
test_creation[device0] | 4.0314ms | 0.1048ms | 9.5418 KOps/s | 9.8661 KOps/s | |
test_creation_from_tensor | 0.2363ms | 84.0051μs | 11.9040 KOps/s | 12.0808 KOps/s | |
test_add_one[memmap_tensor0] | 0.7008ms | 5.4741μs | 182.6793 KOps/s | 194.2736 KOps/s | |
test_contiguous[memmap_tensor0] | 15.9700μs | 0.6373μs | 1.5691 MOps/s | 1.5382 MOps/s | |
test_stack[memmap_tensor0] | 80.5410μs | 3.6435μs | 274.4636 KOps/s | 286.6242 KOps/s | |
test_memmaptd_index | 0.9572ms | 0.2204ms | 4.5373 KOps/s | 4.5790 KOps/s | |
test_memmaptd_index_astensor | 0.8331ms | 0.2865ms | 3.4908 KOps/s | 3.6242 KOps/s | |
test_memmaptd_index_op | 0.9333ms | 0.5643ms | 1.7722 KOps/s | 1.7512 KOps/s | |
test_serialize_model | 0.1679s | 0.1118s | 8.9443 Ops/s | 9.8357 Ops/s | |
test_serialize_model_pickle | 0.4482s | 0.3773s | 2.6505 Ops/s | 2.5956 Ops/s | |
test_serialize_weights | 0.1008s | 98.1991ms | 10.1834 Ops/s | 9.2937 Ops/s | |
test_serialize_weights_returnearly | 0.1935s | 0.1322s | 7.5670 Ops/s | 7.4707 Ops/s | |
test_serialize_weights_pickle | 0.6441s | 0.5165s | 1.9361 Ops/s | 2.3574 Ops/s | |
test_serialize_weights_filesystem | 0.1019s | 92.0737ms | 10.8609 Ops/s | 10.5353 Ops/s | |
test_serialize_model_filesystem | 0.1636s | 0.1001s | 9.9919 Ops/s | 11.0915 Ops/s | |
test_reshape_pytree | 54.8030μs | 22.9998μs | 43.4786 KOps/s | 43.0717 KOps/s | |
test_reshape_td | 67.5770μs | 29.7799μs | 33.5797 KOps/s | 33.0899 KOps/s | |
test_view_pytree | 89.3450μs | 22.9332μs | 43.6049 KOps/s | 43.1154 KOps/s | |
test_view_td | 23.7550μs | 4.8695μs | 205.3591 KOps/s | 205.1644 KOps/s | |
test_unbind_pytree | 74.9000μs | 26.7364μs | 37.4022 KOps/s | 37.4725 KOps/s | |
test_unbind_td | 0.1068ms | 49.6731μs | 20.1316 KOps/s | 19.9238 KOps/s | |
test_split_pytree | 72.5050μs | 26.1610μs | 38.2249 KOps/s | 37.8936 KOps/s | |
test_split_td | 0.5874ms | 41.0676μs | 24.3501 KOps/s | 24.4809 KOps/s | |
test_add_pytree | 73.0470μs | 32.4915μs | 30.7773 KOps/s | 30.8714 KOps/s | |
test_add_td | 0.1219ms | 50.5245μs | 19.7924 KOps/s | 19.4473 KOps/s | |
test_distributed | 0.2264ms | 97.5997μs | 10.2459 KOps/s | 9.8100 KOps/s | |
test_tdmodule | 0.7514ms | 23.2421μs | 43.0253 KOps/s | 43.4493 KOps/s | |
test_tdmodule_dispatch | 0.2097ms | 40.0978μs | 24.9390 KOps/s | 24.5050 KOps/s | |
test_tdseq | 54.9630μs | 25.4597μs | 39.2777 KOps/s | 39.6679 KOps/s | |
test_tdseq_dispatch | 0.1507ms | 45.0776μs | 22.1839 KOps/s | 22.5768 KOps/s | |
test_instantiation_functorch | 1.7439ms | 1.2864ms | 777.3358 Ops/s | 769.0395 Ops/s | |
test_instantiation_td | 1.6259ms | 1.0046ms | 995.4550 Ops/s | 994.8229 Ops/s | |
test_exec_functorch | 0.2908ms | 0.1583ms | 6.3158 KOps/s | 6.3262 KOps/s | |
test_exec_functional_call | 0.2807ms | 0.1470ms | 6.8030 KOps/s | 6.9104 KOps/s | |
test_exec_td | 0.2209ms | 0.1418ms | 7.0528 KOps/s | 6.9258 KOps/s | |
test_exec_td_decorator | 0.9016ms | 0.1756ms | 5.6948 KOps/s | 5.5813 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.7351ms | 0.9171ms | 1.0904 KOps/s | 1.0956 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8289ms | 0.4817ms | 2.0759 KOps/s | 2.0810 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0559ms | 0.7925ms | 1.2618 KOps/s | 1.2575 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6922ms | 0.3920ms | 2.5508 KOps/s | 2.5532 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.0883ms | 2.4047ms | 415.8560 Ops/s | 405.7896 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0524ms | 0.5291ms | 1.8901 KOps/s | 1.8875 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.6140ms | 1.9652ms | 508.8506 Ops/s | 497.8718 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 70.7597ms | 0.4328ms | 2.3107 KOps/s | 2.4870 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1455ms | 14.3114μs | 69.8742 KOps/s | 74.2509 KOps/s | |
test_plain_set_stack_nested | 0.2618ms | 0.1211ms | 8.2573 KOps/s | 8.4109 KOps/s | |
test_plain_set_nested_inplace | 0.1467ms | 15.7448μs | 63.5132 KOps/s | 67.5397 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2879ms | 0.1485ms | 6.7344 KOps/s | 6.7936 KOps/s | |
test_items | 0.1192ms | 4.8356μs | 206.7997 KOps/s | 209.7806 KOps/s | |
test_items_nested | 0.4625ms | 0.3418ms | 2.9254 KOps/s | 2.9301 KOps/s | |
test_items_nested_locked | 0.3871ms | 0.3449ms | 2.8991 KOps/s | 2.9110 KOps/s | |
test_items_nested_leaf | 0.7181ms | 0.2024ms | 4.9411 KOps/s | 4.9454 KOps/s | |
test_items_stack_nested | 1.3993ms | 1.3224ms | 756.2146 Ops/s | 758.0649 Ops/s | |
test_items_stack_nested_leaf | 1.2036ms | 1.1524ms | 867.7376 Ops/s | 878.0033 Ops/s | |
test_items_stack_nested_locked | 0.9797ms | 0.9215ms | 1.0851 KOps/s | 1.0834 KOps/s | |
test_keys | 23.4600μs | 4.6085μs | 216.9926 KOps/s | 218.1781 KOps/s | |
test_keys_nested | 1.5706ms | 95.7549μs | 10.4433 KOps/s | 10.5486 KOps/s | |
test_keys_nested_locked | 0.1247ms | 98.8597μs | 10.1153 KOps/s | 10.1640 KOps/s | |
test_keys_nested_leaf | 0.1861ms | 79.0062μs | 12.6572 KOps/s | 12.6946 KOps/s | |
test_keys_stack_nested | 1.2593ms | 1.1691ms | 855.3567 Ops/s | 855.5136 Ops/s | |
test_keys_stack_nested_leaf | 1.3777ms | 1.1536ms | 866.8177 Ops/s | 864.2538 Ops/s | |
test_keys_stack_nested_locked | 0.8502ms | 0.7618ms | 1.3128 KOps/s | 1.3573 KOps/s | |
test_values | 7.4500μs | 1.9021μs | 525.7354 KOps/s | 532.4510 KOps/s | |
test_values_nested | 65.1610μs | 45.7644μs | 21.8510 KOps/s | 22.0624 KOps/s | |
test_values_nested_locked | 71.2110μs | 48.1040μs | 20.7883 KOps/s | 21.0466 KOps/s | |
test_values_nested_leaf | 93.9920μs | 40.0274μs | 24.9829 KOps/s | 25.1810 KOps/s | |
test_values_stack_nested | 1.1951ms | 0.9659ms | 1.0353 KOps/s | 1.0359 KOps/s | |
test_values_stack_nested_leaf | 1.0053ms | 0.9536ms | 1.0486 KOps/s | 1.0470 KOps/s | |
test_values_stack_nested_locked | 0.6566ms | 0.5914ms | 1.6909 KOps/s | 1.7122 KOps/s | |
test_membership | 23.8100μs | 1.1073μs | 903.1064 KOps/s | 1.0428 MOps/s | |
test_membership_nested | 51.8410μs | 2.9406μs | 340.0667 KOps/s | 338.9692 KOps/s | |
test_membership_nested_leaf | 20.2100μs | 2.9532μs | 338.6152 KOps/s | 338.6660 KOps/s | |
test_membership_stacked_nested | 33.4000μs | 11.2527μs | 88.8673 KOps/s | 89.1557 KOps/s | |
test_membership_stacked_nested_leaf | 31.9900μs | 11.2872μs | 88.5960 KOps/s | 88.6289 KOps/s | |
test_membership_nested_last | 23.1000μs | 5.4096μs | 184.8549 KOps/s | 184.6922 KOps/s | |
test_membership_nested_leaf_last | 23.9700μs | 5.4351μs | 183.9890 KOps/s | 185.4654 KOps/s | |
test_membership_stacked_nested_last | 0.1726ms | 0.1442ms | 6.9352 KOps/s | 6.9684 KOps/s | |
test_membership_stacked_nested_leaf_last | 62.4810μs | 13.2765μs | 75.3210 KOps/s | 75.9614 KOps/s | |
test_nested_getleaf | 30.6100μs | 8.4019μs | 119.0207 KOps/s | 118.5097 KOps/s | |
test_nested_get | 35.2600μs | 7.9921μs | 125.1240 KOps/s | 124.6804 KOps/s | |
test_stacked_getleaf | 1.1712ms | 0.3211ms | 3.1144 KOps/s | 3.1078 KOps/s | |
test_stacked_get | 0.3198ms | 0.2845ms | 3.5145 KOps/s | 3.4739 KOps/s | |
test_nested_getitemleaf | 22.9200μs | 8.4476μs | 118.3773 KOps/s | 118.4795 KOps/s | |
test_nested_getitem | 22.7210μs | 8.0070μs | 124.8908 KOps/s | 124.7536 KOps/s | |
test_stacked_getitemleaf | 0.3771ms | 0.3229ms | 3.0966 KOps/s | 3.1000 KOps/s | |
test_stacked_getitem | 0.3409ms | 0.2909ms | 3.4381 KOps/s | 3.4284 KOps/s | |
test_lock_nested | 7.1087ms | 0.4240ms | 2.3586 KOps/s | 2.4483 KOps/s | |
test_lock_stack_nested | 83.4369ms | 6.5117ms | 153.5708 Ops/s | 156.4724 Ops/s | |
test_unlock_nested | 0.8096ms | 0.4108ms | 2.4340 KOps/s | 2.4709 KOps/s | |
test_unlock_stack_nested | 83.7354ms | 6.8483ms | 146.0220 Ops/s | 145.4811 Ops/s | |
test_flatten_speed | 76.4320ms | 0.2860ms | 3.4968 KOps/s | 3.7809 KOps/s | |
test_unflatten_speed | 0.4087ms | 0.3638ms | 2.7489 KOps/s | 2.7562 KOps/s | |
test_common_ops | 1.0938ms | 0.6360ms | 1.5723 KOps/s | 1.6591 KOps/s | |
test_creation | 13.9900μs | 1.5908μs | 628.6097 KOps/s | 629.1395 KOps/s | |
test_creation_empty | 24.3700μs | 9.6024μs | 104.1401 KOps/s | 123.1066 KOps/s | |
test_creation_nested_1 | 30.0210μs | 11.3914μs | 87.7856 KOps/s | 100.9130 KOps/s | |
test_creation_nested_2 | 37.6710μs | 13.8790μs | 72.0511 KOps/s | 80.7371 KOps/s | |
test_clone | 42.7610μs | 13.8477μs | 72.2143 KOps/s | 73.9101 KOps/s | |
test_getitem[int] | 52.2110μs | 11.2019μs | 89.2704 KOps/s | 90.2336 KOps/s | |
test_getitem[slice_int] | 43.9110μs | 21.9539μs | 45.5500 KOps/s | 44.9223 KOps/s | |
test_getitem[range] | 66.4810μs | 37.6578μs | 26.5549 KOps/s | 26.9676 KOps/s | |
test_getitem[tuple] | 37.0010μs | 19.5947μs | 51.0341 KOps/s | 51.2907 KOps/s | |
test_getitem[list] | 59.6010μs | 34.1074μs | 29.3192 KOps/s | 28.7976 KOps/s | |
test_setitem_dim[int] | 47.8010μs | 28.6115μs | 34.9509 KOps/s | 36.4315 KOps/s | |
test_setitem_dim[slice_int] | 69.0410μs | 51.1806μs | 19.5386 KOps/s | 19.9754 KOps/s | |
test_setitem_dim[range] | 0.1027ms | 64.4552μs | 15.5147 KOps/s | 15.7422 KOps/s | |
test_setitem_dim[tuple] | 72.4810μs | 44.3011μs | 22.5728 KOps/s | 23.1271 KOps/s | |
test_setitem | 0.1210ms | 19.4449μs | 51.4274 KOps/s | 53.5379 KOps/s | |
test_set | 0.1162ms | 18.9621μs | 52.7367 KOps/s | 54.9180 KOps/s | |
test_set_shared | 2.7272ms | 0.1035ms | 9.6578 KOps/s | 9.5917 KOps/s | |
test_update | 0.1118ms | 22.0207μs | 45.4119 KOps/s | 49.1886 KOps/s | |
test_update_nested | 0.1389ms | 28.8037μs | 34.7178 KOps/s | 36.6568 KOps/s | |
test_set_nested | 0.1086ms | 20.3345μs | 49.1776 KOps/s | 51.4352 KOps/s | |
test_set_nested_new | 0.1461ms | 23.0911μs | 43.3067 KOps/s | 43.8457 KOps/s | |
test_select | 60.5610μs | 36.1147μs | 27.6896 KOps/s | 27.5858 KOps/s | |
test_select_nested | 70.6220μs | 53.9351μs | 18.5408 KOps/s | 17.9740 KOps/s | |
test_exclude_nested | 0.1331ms | 0.1110ms | 9.0064 KOps/s | 9.1260 KOps/s | |
test_empty[True] | 0.3470ms | 0.3250ms | 3.0767 KOps/s | 3.1274 KOps/s | |
test_empty[False] | 7.0771μs | 0.8628μs | 1.1590 MOps/s | 1.1598 MOps/s | |
test_to | 73.3220μs | 53.4790μs | 18.6989 KOps/s | 18.8983 KOps/s | |
test_to_nonblocking | 0.1819ms | 33.4945μs | 29.8556 KOps/s | 30.1714 KOps/s | |
test_unbind_speed | 0.3588ms | 0.3317ms | 3.0145 KOps/s | 3.0765 KOps/s | |
test_unbind_speed_stack0 | 79.1580ms | 3.9088ms | 255.8318 Ops/s | 266.5742 Ops/s | |
test_unbind_speed_stack1 | 1.6216μs | 0.5648μs | 1.7706 MOps/s | 1.8645 MOps/s | |
test_split | 76.0209ms | 1.6636ms | 601.1196 Ops/s | 596.7511 Ops/s | |
test_chunk | 75.2368ms | 1.6464ms | 607.3954 Ops/s | 642.4269 Ops/s | |
test_creation[device0] | 0.1409ms | 73.1134μs | 13.6774 KOps/s | 12.5668 KOps/s | |
test_creation_from_tensor | 0.1869ms | 53.3730μs | 18.7360 KOps/s | 17.0657 KOps/s | |
test_add_one[memmap_tensor0] | 72.9110μs | 7.2352μs | 138.2133 KOps/s | 137.3295 KOps/s | |
test_contiguous[memmap_tensor0] | 13.0420μs | 0.6485μs | 1.5420 MOps/s | 1.5065 MOps/s | |
test_stack[memmap_tensor0] | 29.5410μs | 4.5838μs | 218.1599 KOps/s | 215.6404 KOps/s | |
test_memmaptd_index | 1.1439ms | 0.2621ms | 3.8155 KOps/s | 3.7234 KOps/s | |
test_memmaptd_index_astensor | 0.6342ms | 0.3204ms | 3.1212 KOps/s | 3.0587 KOps/s | |
test_memmaptd_index_op | 0.9440ms | 0.6357ms | 1.5732 KOps/s | 1.6127 KOps/s | |
test_serialize_model | 0.1713s | 98.1766ms | 10.1857 Ops/s | 10.5638 Ops/s | |
test_serialize_model_pickle | 1.3524s | 1.2357s | 0.8092 Ops/s | 0.8075 Ops/s | |
test_serialize_weights | 0.1666s | 95.7195ms | 10.4472 Ops/s | 9.7269 Ops/s | |
test_serialize_weights_returnearly | 0.2437s | 72.7801ms | 13.7400 Ops/s | 12.7676 Ops/s | |
test_serialize_weights_pickle | 1.3499s | 1.2366s | 0.8087 Ops/s | 0.8081 Ops/s | |
test_reshape_pytree | 54.4010μs | 24.7216μs | 40.4505 KOps/s | 40.2897 KOps/s | |
test_reshape_td | 69.5310μs | 29.3283μs | 34.0968 KOps/s | 33.7475 KOps/s | |
test_view_pytree | 49.8910μs | 24.4770μs | 40.8547 KOps/s | 40.3201 KOps/s | |
test_view_td | 29.2510μs | 4.2396μs | 235.8695 KOps/s | 235.2203 KOps/s | |
test_unbind_pytree | 0.1468ms | 31.1125μs | 32.1414 KOps/s | 32.2104 KOps/s | |
test_unbind_td | 0.1412ms | 51.8813μs | 19.2748 KOps/s | 19.4172 KOps/s | |
test_split_pytree | 48.9010μs | 28.9207μs | 34.5773 KOps/s | 34.8089 KOps/s | |
test_split_td | 0.7214ms | 40.5487μs | 24.6617 KOps/s | 24.7573 KOps/s | |
test_add_pytree | 61.2810μs | 37.6843μs | 26.5363 KOps/s | 26.4829 KOps/s | |
test_add_td | 78.3820μs | 51.9838μs | 19.2368 KOps/s | 19.3492 KOps/s | |
test_distributed | 0.2221ms | 70.7222μs | 14.1398 KOps/s | 9.6632 KOps/s | |
test_tdmodule | 0.1041ms | 18.6882μs | 53.5098 KOps/s | 57.3867 KOps/s | |
test_tdmodule_dispatch | 0.1340ms | 35.0506μs | 28.5302 KOps/s | 29.3954 KOps/s | |
test_tdseq | 44.8000μs | 21.7094μs | 46.0630 KOps/s | 48.7115 KOps/s | |
test_tdseq_dispatch | 54.7100μs | 38.7468μs | 25.8086 KOps/s | 27.1922 KOps/s | |
test_instantiation_functorch | 1.8125ms | 1.6762ms | 596.6030 Ops/s | 592.6178 Ops/s | |
test_instantiation_td | 1.7368ms | 1.1709ms | 854.0749 Ops/s | 849.8057 Ops/s | |
test_exec_functorch | 0.2845ms | 0.1612ms | 6.2043 KOps/s | 6.1968 KOps/s | |
test_exec_functional_call | 0.2035ms | 0.1612ms | 6.2033 KOps/s | 6.2694 KOps/s | |
test_exec_td | 0.1825ms | 0.1509ms | 6.6285 KOps/s | 6.5515 KOps/s | |
test_exec_td_decorator | 0.9101ms | 0.1914ms | 5.2258 KOps/s | 5.2029 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4015ms | 1.1224ms | 890.9360 Ops/s | 894.8465 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.7467ms | 0.6782ms | 1.4745 KOps/s | 1.5016 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0701ms | 1.0296ms | 971.2909 Ops/s | 970.9900 Ops/s | |
test_vmap_mlp_speed[False-False] | 0.6437ms | 0.5971ms | 1.6748 KOps/s | 1.6783 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.2392ms | 2.5494ms | 392.2545 Ops/s | 397.5071 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1103ms | 0.7198ms | 1.3892 KOps/s | 1.3268 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.5897ms | 2.1471ms | 465.7398 Ops/s | 462.2565 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9057ms | 0.6179ms | 1.6183 KOps/s | 1.6000 KOps/s | |
test_vmap_transformer_speed[True-True] | 13.0929ms | 12.5757ms | 79.5185 Ops/s | 79.9598 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.4284ms | 8.2730ms | 120.8750 Ops/s | 121.1317 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.9421ms | 12.4510ms | 80.3152 Ops/s | 80.6125 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.4369ms | 8.2005ms | 121.9444 Ops/s | 122.4640 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 0.1681s | 83.0502ms | 12.0409 Ops/s | 13.2058 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 21.5738ms | 20.1129ms | 49.7194 Ops/s | 50.7109 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 71.8196ms | 69.3777ms | 14.4139 Ops/s | 13.4165 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 0.1162s | 21.3953ms | 46.7391 Ops/s | 51.7405 Ops/s |
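
The Change column is empty in both tables above. As a rough sketch, a relative change could be derived from the Ops and Ops on Repo HEAD columns as shown below. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling; the formula (PR throughput relative to HEAD throughput) is an assumption about what the column is meant to show.

```python
# Hypothetical helpers for reading the benchmark tables; not part of any tool.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a throughput string such as '71.0812 KOps/s' to ops per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(pr_ops: str, head_ops: str) -> float:
    """Throughput change of the PR relative to repo HEAD, in percent (assumed formula)."""
    pr, head = parse_ops(pr_ops), parse_ops(head_ops)
    return 100.0 * (pr - head) / head


# test_clone row from the first table
clone_change = relative_change("71.0812 KOps/s", "77.3347 KOps/s")
# test_membership row from the second table (mixed KOps/s and MOps/s units)
membership_change = relative_change("903.1064 KOps/s", "1.0428 MOps/s")
print(f"test_clone: {clone_change:+.2f}%")
print(f"test_membership: {membership_change:+.2f}%")
```

A negative value means the PR ran fewer operations per second than HEAD for that benchmark.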
I merged to fix torchrl's CI but I'm happy to revert / edit if you feel changes are needed.
Makes sense!
This seems to be roughly in line with the idea that if the input and output of an op share storage (e.g. with unbind or transpose), we propagate the lock and shared/memmap attributes, while in other cases we do not. This is very reasonable and should not "surprise" the user. Some things to look out for:
Good points.
Makes sense!
Description
@shagunsodhani I may benefit from a bit of feedback on this.
Context
When a tensordict is placed in shared memory or memmapped, it is locked. We do this to prevent people from writing to it and hoping that these changes will be reflected in another process (which won't be the case). Locking the tensordict ensures that you must first unlock it (and hence lose the `is_shared()` attribute) before writing.

For some operations (typically all the ops that don't change the `data_ptr()`) I thought it was cool to keep the `is_shared()` attribute since we can be sure that the content is still shared. That means that coming from a shared / memmap tensordict all these ops would return a shared and locked tensordict:

Problem
In the past we just copied the private `_is_memmap` and `_is_shared` attributes but not the lock: that meant that you ended up with a shared but not locked TD (which is bad!). I solved this in #621.

Unfortunately that breaks a lot of stuff in torchrl:
The problem is that usually if you do `clone`, `select` or `exclude` you may want to modify the resulting tensordict.

So the plan now will be:
Thoughts?
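
To make the discussion concrete, here is a minimal toy model of the policy named in the PR title and in the review comment: ops whose output shares storage with the input propagate the lock and shared flags, while `clone`, `select` and `exclude` return an unlocked, non-shared object. `ToyTD` and its `view_like` method are hypothetical stand-ins written only for illustration; this is not the tensordict implementation, and no real tensors or shared memory are involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTD:
    """Toy stand-in for a tensordict (illustration only, not the real class)."""

    data: dict = field(default_factory=dict)
    is_shared: bool = False
    is_locked: bool = False

    def share_memory_(self):
        # Placing the container in shared memory also locks it.
        self.is_shared = True
        self.is_locked = True
        return self

    def unlock_(self):
        # Unlocking drops the shared flag: later writes are not guaranteed shared.
        self.is_locked = False
        self.is_shared = False
        return self

    def set(self, key, value):
        if self.is_locked:
            raise RuntimeError("Cannot modify a locked ToyTD; unlock it first.")
        self.data[key] = value
        return self

    def view_like(self):
        # Hypothetical stand-in for ops that share storage with the input
        # (e.g. unbind or transpose): both flags propagate.
        return ToyTD(self.data, self.is_shared, self.is_locked)

    def clone(self):
        # New container: flags are not inherited.
        return ToyTD(dict(self.data))

    def select(self, *keys):
        return ToyTD({k: self.data[k] for k in keys})

    def exclude(self, *keys):
        return ToyTD({k: v for k, v in self.data.items() if k not in keys})


td = ToyTD({"a": 1, "b": 2}).share_memory_()

sub = td.select("a")
sub.set("c", 3)  # allowed: the selection is neither shared nor locked

view = td.view_like()
try:
    view.set("c", 3)
    view_write_blocked = False
except RuntimeError:
    view_write_blocked = True  # the storage-sharing result stays locked
```

With this policy, the result of `clone`, `select` or `exclude` can be modified right away, while results that still point at the shared storage keep refusing writes until they are explicitly unlocked.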