[Feature] Robust to lazy_legacy set to false and context managers for reshape ops #634
Merged
facebook-github-bot added the CLA Signed label on Jan 24, 2024
vmoens changed the title from "[Feature] Robust to lazy_legacy set to false" to "[Feature] Robust to lazy_legacy set to false and context managers for reshape ops" on Jan 24, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 31.1980μs | 16.8970μs | 59.1820 KOps/s | 56.7609 KOps/s | |
test_plain_set_stack_nested | 0.2873ms | 0.1450ms | 6.8967 KOps/s | 6.6405 KOps/s | |
test_plain_set_nested_inplace | 74.6390μs | 19.8458μs | 50.3886 KOps/s | 49.8573 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3378ms | 0.1819ms | 5.4966 KOps/s | 5.4755 KOps/s | |
test_items | 39.0630μs | 2.5017μs | 399.7218 KOps/s | 404.2763 KOps/s | |
test_items_nested | 0.4726ms | 0.2726ms | 3.6685 KOps/s | 3.7348 KOps/s | |
test_items_nested_locked | 1.3260ms | 0.2754ms | 3.6310 KOps/s | 3.6948 KOps/s | |
test_items_nested_leaf | 0.3087ms | 0.1691ms | 5.9136 KOps/s | 6.0574 KOps/s | |
test_items_stack_nested | 1.6208ms | 1.3573ms | 736.7794 Ops/s | 757.6026 Ops/s | |
test_items_stack_nested_leaf | 1.6526ms | 1.2282ms | 814.1909 Ops/s | 847.2262 Ops/s | |
test_items_stack_nested_locked | 1.1679ms | 0.9002ms | 1.1108 KOps/s | 1.1449 KOps/s | |
test_keys | 39.5530μs | 3.8554μs | 259.3785 KOps/s | 253.2860 KOps/s | |
test_keys_nested | 60.2289ms | 0.1585ms | 6.3087 KOps/s | 6.6199 KOps/s | |
test_keys_nested_locked | 0.2583ms | 0.1515ms | 6.5991 KOps/s | 6.4914 KOps/s | |
test_keys_nested_leaf | 0.2999ms | 0.1302ms | 7.6793 KOps/s | 7.5586 KOps/s | |
test_keys_stack_nested | 1.7501ms | 1.2952ms | 772.0596 Ops/s | 792.1747 Ops/s | |
test_keys_stack_nested_leaf | 1.5721ms | 1.3001ms | 769.2001 Ops/s | 794.3950 Ops/s | |
test_keys_stack_nested_locked | 1.0972ms | 0.8231ms | 1.2149 KOps/s | 1.2288 KOps/s | |
test_values | 8.5678μs | 1.1793μs | 847.9437 KOps/s | 871.6537 KOps/s | |
test_values_nested | 0.1115ms | 51.8978μs | 19.2686 KOps/s | 19.3935 KOps/s | |
test_values_nested_locked | 0.1537ms | 51.9222μs | 19.2596 KOps/s | 19.3255 KOps/s | |
test_values_nested_leaf | 96.6980μs | 46.1667μs | 21.6606 KOps/s | 21.7876 KOps/s | |
test_values_stack_nested | 1.3007ms | 1.0455ms | 956.4685 Ops/s | 973.8104 Ops/s | |
test_values_stack_nested_leaf | 1.3111ms | 1.0308ms | 970.1172 Ops/s | 952.5286 Ops/s | |
test_values_stack_nested_locked | 1.0738ms | 0.6145ms | 1.6274 KOps/s | 1.6139 KOps/s | |
test_membership | 15.9600μs | 1.3453μs | 743.3146 KOps/s | 723.4350 KOps/s | |
test_membership_nested | 23.4030μs | 3.5393μs | 282.5431 KOps/s | 285.9554 KOps/s | |
test_membership_nested_leaf | 46.8870μs | 3.5528μs | 281.4668 KOps/s | 291.1438 KOps/s | |
test_membership_stacked_nested | 55.1930μs | 11.7959μs | 84.7751 KOps/s | 86.3721 KOps/s | |
test_membership_stacked_nested_leaf | 75.0790μs | 11.8049μs | 84.7108 KOps/s | 85.7202 KOps/s | |
test_membership_nested_last | 64.0190μs | 6.8051μs | 146.9476 KOps/s | 150.8164 KOps/s | |
test_membership_nested_leaf_last | 32.0200μs | 6.7877μs | 147.3257 KOps/s | 150.5584 KOps/s | |
test_membership_stacked_nested_last | 0.2984ms | 0.1773ms | 5.6405 KOps/s | 5.5933 KOps/s | |
test_membership_stacked_nested_leaf_last | 65.1110μs | 13.8645μs | 72.1266 KOps/s | 72.5560 KOps/s | |
test_nested_getleaf | 41.5970μs | 10.5484μs | 94.8012 KOps/s | 91.7091 KOps/s | |
test_nested_get | 34.9250μs | 10.0197μs | 99.8038 KOps/s | 94.8423 KOps/s | |
test_stacked_getleaf | 0.7701ms | 0.3948ms | 2.5328 KOps/s | 2.4903 KOps/s | |
test_stacked_get | 0.4596ms | 0.3616ms | 2.7657 KOps/s | 2.7227 KOps/s | |
test_nested_getitemleaf | 53.3990μs | 12.0311μs | 83.1176 KOps/s | 79.6608 KOps/s | |
test_nested_getitem | 35.9370μs | 11.5891μs | 86.2883 KOps/s | 83.5899 KOps/s | |
test_stacked_getitemleaf | 0.5010ms | 0.3975ms | 2.5157 KOps/s | 2.4933 KOps/s | |
test_stacked_getitem | 0.6647ms | 0.3678ms | 2.7188 KOps/s | 2.6896 KOps/s | |
test_lock_nested | 0.8614ms | 0.3418ms | 2.9260 KOps/s | 2.8751 KOps/s | |
test_lock_stack_nested | 96.3494ms | 6.2762ms | 159.3321 Ops/s | 160.1247 Ops/s | |
test_unlock_nested | 1.0878ms | 0.3479ms | 2.8746 KOps/s | 2.3845 KOps/s | |
test_unlock_stack_nested | 99.9662ms | 6.3416ms | 157.6896 Ops/s | 153.2252 Ops/s | |
test_flatten_speed | 1.7280ms | 0.3653ms | 2.7375 KOps/s | 2.6874 KOps/s | |
test_unflatten_speed | 0.9359ms | 0.4681ms | 2.1362 KOps/s | 2.1169 KOps/s | |
test_common_ops | 3.7391ms | 0.6981ms | 1.4325 KOps/s | 1.3865 KOps/s | |
test_creation | 16.7420μs | 1.8484μs | 541.0035 KOps/s | 511.3186 KOps/s | |
test_creation_empty | 31.5090μs | 10.1736μs | 98.2937 KOps/s | 93.2626 KOps/s | |
test_creation_nested_1 | 51.5560μs | 12.8968μs | 77.5385 KOps/s | 74.3629 KOps/s | |
test_creation_nested_2 | 41.2370μs | 16.0139μs | 62.4458 KOps/s | 60.6085 KOps/s | |
test_clone | 0.1293ms | 13.1480μs | 76.0569 KOps/s | 76.0677 KOps/s | |
test_getitem[int] | 43.7120μs | 10.9273μs | 91.5138 KOps/s | 87.4933 KOps/s | |
test_getitem[slice_int] | 73.2470μs | 22.3648μs | 44.7132 KOps/s | 43.3079 KOps/s | |
test_getitem[range] | 0.2593ms | 42.6977μs | 23.4205 KOps/s | 23.3559 KOps/s | |
test_getitem[tuple] | 55.3930μs | 17.9256μs | 55.7861 KOps/s | 54.1860 KOps/s | |
test_getitem[list] | 0.2728ms | 37.4585μs | 26.6962 KOps/s | 26.4048 KOps/s | |
test_setitem_dim[int] | 0.1023ms | 31.0600μs | 32.1957 KOps/s | 32.2782 KOps/s | |
test_setitem_dim[slice_int] | 0.1132ms | 58.1238μs | 17.2047 KOps/s | 17.5436 KOps/s | |
test_setitem_dim[range] | 0.1411ms | 76.2716μs | 13.1110 KOps/s | 13.0195 KOps/s | |
test_setitem_dim[tuple] | 90.0390μs | 44.8220μs | 22.3105 KOps/s | 20.9754 KOps/s | |
test_setitem | 0.1461ms | 19.4024μs | 51.5399 KOps/s | 49.4559 KOps/s | |
test_set | 0.1942ms | 18.6582μs | 53.5959 KOps/s | 50.6630 KOps/s | |
test_set_shared | 2.1547ms | 0.1471ms | 6.7998 KOps/s | 6.7999 KOps/s | |
test_update | 0.1377ms | 21.4446μs | 46.6317 KOps/s | 44.2365 KOps/s | |
test_update_nested | 0.2042ms | 28.8170μs | 34.7017 KOps/s | 33.2592 KOps/s | |
test_set_nested | 0.1478ms | 20.4796μs | 48.8290 KOps/s | 46.8943 KOps/s | |
test_set_nested_new | 0.2056ms | 24.3493μs | 41.0690 KOps/s | 40.1797 KOps/s | |
test_select | 0.1799ms | 37.4507μs | 26.7018 KOps/s | 26.2041 KOps/s | |
test_select_nested | 0.1235ms | 57.4250μs | 17.4140 KOps/s | 17.1551 KOps/s | |
test_exclude_nested | 0.2147ms | 0.1075ms | 9.2998 KOps/s | 9.2201 KOps/s | |
test_empty[True] | 0.4802ms | 0.3218ms | 3.1075 KOps/s | 3.0932 KOps/s | |
test_empty[False] | 8.0108μs | 1.0307μs | 970.2021 KOps/s | 972.3781 KOps/s | |
test_unbind_speed | 0.3771ms | 0.2453ms | 4.0766 KOps/s | 4.0697 KOps/s | |
test_unbind_speed_stack0 | 84.8449ms | 3.3996ms | 294.1512 Ops/s | 292.7253 Ops/s | |
test_unbind_speed_stack1 | 22.9430μs | 2.0106μs | 497.3567 KOps/s | 515.6001 KOps/s | |
test_split | 77.7187ms | 1.6422ms | 608.9278 Ops/s | 589.7991 Ops/s | |
test_chunk | 0.1045s | 1.6395ms | 609.9350 Ops/s | 617.6075 Ops/s | |
test_creation[device0] | 3.6584ms | 0.1060ms | 9.4302 KOps/s | 9.4975 KOps/s | |
test_creation_from_tensor | 0.2507ms | 83.8240μs | 11.9298 KOps/s | 11.9819 KOps/s | |
test_add_one[memmap_tensor0] | 0.6054ms | 5.3538μs | 186.7849 KOps/s | 182.7295 KOps/s | |
test_contiguous[memmap_tensor0] | 8.0050μs | 0.6411μs | 1.5598 MOps/s | 1.5579 MOps/s | |
test_stack[memmap_tensor0] | 0.1620ms | 3.6425μs | 274.5345 KOps/s | 274.4170 KOps/s | |
test_memmaptd_index | 1.2162ms | 0.2221ms | 4.5026 KOps/s | 4.4714 KOps/s | |
test_memmaptd_index_astensor | 0.6714ms | 0.2816ms | 3.5518 KOps/s | 3.5143 KOps/s | |
test_memmaptd_index_op | 0.8908ms | 0.5675ms | 1.7622 KOps/s | 1.7077 KOps/s | |
test_serialize_model | 0.1888s | 0.1127s | 8.8770 Ops/s | 8.7665 Ops/s | |
test_serialize_model_pickle | 0.4500s | 0.3778s | 2.6472 Ops/s | 2.6109 Ops/s | |
test_serialize_weights | 0.1845s | 0.1115s | 8.9720 Ops/s | 8.8701 Ops/s | |
test_serialize_weights_returnearly | 0.3224s | 0.1551s | 6.4471 Ops/s | 8.0272 Ops/s | |
test_serialize_weights_pickle | 0.8979s | 0.6177s | 1.6189 Ops/s | 2.5744 Ops/s | |
test_serialize_weights_filesystem | 0.1034s | 94.3361ms | 10.6004 Ops/s | 10.1109 Ops/s | |
test_serialize_model_filesystem | 0.1808s | 0.1036s | 9.6549 Ops/s | 9.3314 Ops/s | |
test_reshape_pytree | 78.9870μs | 22.9130μs | 43.6434 KOps/s | 44.0655 KOps/s | |
test_reshape_td | 77.6850μs | 29.7244μs | 33.6424 KOps/s | 34.4624 KOps/s | |
test_view_pytree | 90.9090μs | 22.7611μs | 43.9345 KOps/s | 44.1057 KOps/s | |
test_view_td | 92.9196ms | 12.0704μs | 82.8473 KOps/s | 206.3805 KOps/s | |
test_unbind_pytree | 59.3600μs | 25.8909μs | 38.6236 KOps/s | 37.9758 KOps/s | |
test_unbind_td | 0.1268ms | 34.7215μs | 28.8006 KOps/s | 28.3749 KOps/s | |
test_split_pytree | 83.2350μs | 25.8320μs | 38.7117 KOps/s | 38.7217 KOps/s | |
test_split_td | 0.1354ms | 40.2334μs | 24.8550 KOps/s | 24.0348 KOps/s | |
test_add_pytree | 0.1152ms | 31.7344μs | 31.5116 KOps/s | 31.4110 KOps/s | |
test_add_td | 0.1172ms | 49.0615μs | 20.3826 KOps/s | 20.1332 KOps/s | |
test_distributed | 0.3415ms | 0.1004ms | 9.9634 KOps/s | 9.7168 KOps/s | |
test_tdmodule | 0.2275ms | 22.2541μs | 44.9355 KOps/s | 43.8722 KOps/s | |
test_tdmodule_dispatch | 0.2148ms | 44.1638μs | 22.6430 KOps/s | 22.6723 KOps/s | |
test_tdseq | 66.4340μs | 25.5696μs | 39.1089 KOps/s | 38.4527 KOps/s | |
test_tdseq_dispatch | 0.1639ms | 48.0628μs | 20.8061 KOps/s | 20.3407 KOps/s | |
test_instantiation_functorch | 2.0506ms | 1.3072ms | 764.9911 Ops/s | 773.7957 Ops/s | |
test_instantiation_td | 1.9145ms | 1.0059ms | 994.0989 Ops/s | 994.5674 Ops/s | |
test_exec_functorch | 0.2893ms | 0.1578ms | 6.3381 KOps/s | 6.2240 KOps/s | |
test_exec_functional_call | 0.3720ms | 0.1480ms | 6.7581 KOps/s | 6.7387 KOps/s | |
test_exec_td | 0.2312ms | 0.1423ms | 7.0280 KOps/s | 6.8768 KOps/s | |
test_exec_td_decorator | 1.0975ms | 0.1783ms | 5.6089 KOps/s | 5.4616 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1775ms | 0.9068ms | 1.1028 KOps/s | 1.1164 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8149ms | 0.4797ms | 2.0848 KOps/s | 2.0750 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1517ms | 0.7934ms | 1.2604 KOps/s | 1.3116 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7000ms | 0.3880ms | 2.5774 KOps/s | 2.5468 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.5917ms | 2.3785ms | 420.4299 Ops/s | 431.2044 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9591ms | 0.5348ms | 1.8699 KOps/s | 1.8599 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 3.0284ms | 1.9143ms | 522.3898 Ops/s | 519.5748 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.1121s | 0.4528ms | 2.2086 KOps/s | 2.4216 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 63.1198ms | 16.5707μs | 60.3476 KOps/s | 73.6924 KOps/s | |
test_plain_set_stack_nested | 0.1630ms | 0.1181ms | 8.4676 KOps/s | 8.3845 KOps/s | |
test_plain_set_nested_inplace | 48.3500μs | 14.8734μs | 67.2342 KOps/s | 67.0859 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2667ms | 0.1461ms | 6.8442 KOps/s | 6.7392 KOps/s | |
test_items | 30.9310μs | 4.7876μs | 208.8731 KOps/s | 209.7709 KOps/s | |
test_items_nested | 0.4315ms | 0.3409ms | 2.9335 KOps/s | 2.9529 KOps/s | |
test_items_nested_locked | 0.4205ms | 0.3446ms | 2.9018 KOps/s | 2.9212 KOps/s | |
test_items_nested_leaf | 0.2408ms | 0.2017ms | 4.9580 KOps/s | 4.9917 KOps/s | |
test_items_stack_nested | 1.4273ms | 1.3051ms | 766.2119 Ops/s | 776.3303 Ops/s | |
test_items_stack_nested_leaf | 1.2616ms | 1.1392ms | 877.8327 Ops/s | 882.3034 Ops/s | |
test_items_stack_nested_locked | 1.9467ms | 0.8895ms | 1.1242 KOps/s | 1.1416 KOps/s | |
test_keys | 46.6610μs | 4.5802μs | 218.3295 KOps/s | 218.8466 KOps/s | |
test_keys_nested | 0.5789ms | 94.3991μs | 10.5933 KOps/s | 10.5688 KOps/s | |
test_keys_nested_locked | 0.1661ms | 99.5571μs | 10.0445 KOps/s | 10.2508 KOps/s | |
test_keys_nested_leaf | 0.1998ms | 77.8415μs | 12.8466 KOps/s | 12.8393 KOps/s | |
test_keys_stack_nested | 1.2130ms | 1.1236ms | 890.0358 Ops/s | 883.3763 Ops/s | |
test_keys_stack_nested_leaf | 1.1593ms | 1.1095ms | 901.3401 Ops/s | 883.5407 Ops/s | |
test_keys_stack_nested_locked | 0.7726ms | 0.7105ms | 1.4074 KOps/s | 1.4156 KOps/s | |
test_values | 10.7170μs | 1.8890μs | 529.3863 KOps/s | 527.0881 KOps/s | |
test_values_nested | 82.5420μs | 44.8711μs | 22.2861 KOps/s | 22.1974 KOps/s | |
test_values_nested_locked | 77.1410μs | 46.9885μs | 21.2818 KOps/s | 21.2508 KOps/s | |
test_values_nested_leaf | 64.9010μs | 39.2474μs | 25.4794 KOps/s | 25.2687 KOps/s | |
test_values_stack_nested | 1.0998ms | 0.9340ms | 1.0707 KOps/s | 1.0638 KOps/s | |
test_values_stack_nested_leaf | 1.0217ms | 0.9481ms | 1.0548 KOps/s | 1.0644 KOps/s | |
test_values_stack_nested_locked | 0.7497ms | 0.5630ms | 1.7763 KOps/s | 1.7823 KOps/s | |
test_membership | 5.3902μs | 0.9515μs | 1.0509 MOps/s | 924.8893 KOps/s | |
test_membership_nested | 31.8300μs | 2.9343μs | 340.7919 KOps/s | 343.6826 KOps/s | |
test_membership_nested_leaf | 36.5010μs | 2.9252μs | 341.8568 KOps/s | 345.2472 KOps/s | |
test_membership_stacked_nested | 0.2136ms | 11.4452μs | 87.3732 KOps/s | 88.0263 KOps/s | |
test_membership_stacked_nested_leaf | 0.1092ms | 11.4445μs | 87.3783 KOps/s | 87.6316 KOps/s | |
test_membership_nested_last | 30.7810μs | 5.3512μs | 186.8739 KOps/s | 186.2732 KOps/s | |
test_membership_nested_leaf_last | 37.5610μs | 5.3172μs | 188.0684 KOps/s | 187.5942 KOps/s | |
test_membership_stacked_nested_last | 0.1960ms | 0.1552ms | 6.4453 KOps/s | 6.3917 KOps/s | |
test_membership_stacked_nested_leaf_last | 37.7810μs | 13.3030μs | 75.1713 KOps/s | 75.9778 KOps/s | |
test_nested_getleaf | 34.3310μs | 8.4050μs | 118.9770 KOps/s | 118.9860 KOps/s | |
test_nested_get | 35.6610μs | 7.9140μs | 126.3588 KOps/s | 126.2931 KOps/s | |
test_stacked_getleaf | 0.3717ms | 0.3312ms | 3.0198 KOps/s | 3.0534 KOps/s | |
test_stacked_get | 0.3381ms | 0.2922ms | 3.4223 KOps/s | 3.3940 KOps/s | |
test_nested_getitemleaf | 92.6420μs | 9.7831μs | 102.2172 KOps/s | 102.0406 KOps/s | |
test_nested_getitem | 45.2300μs | 9.3103μs | 107.4084 KOps/s | 106.8863 KOps/s | |
test_stacked_getitemleaf | 0.3964ms | 0.3322ms | 3.0103 KOps/s | 3.0337 KOps/s | |
test_stacked_getitem | 0.3373ms | 0.3006ms | 3.3261 KOps/s | 3.3375 KOps/s | |
test_lock_nested | 0.8067ms | 0.3483ms | 2.8711 KOps/s | 2.9002 KOps/s | |
test_lock_stack_nested | 90.5102ms | 6.3063ms | 158.5723 Ops/s | 159.3859 Ops/s | |
test_unlock_nested | 86.2669ms | 0.4318ms | 2.3157 KOps/s | 2.3458 KOps/s | |
test_unlock_stack_nested | 90.3924ms | 6.3845ms | 156.6281 Ops/s | 158.2681 Ops/s | |
test_flatten_speed | 0.6486ms | 0.2615ms | 3.8238 KOps/s | 3.7875 KOps/s | |
test_unflatten_speed | 0.4035ms | 0.3592ms | 2.7843 KOps/s | 2.7537 KOps/s | |
test_common_ops | 1.0517ms | 0.5927ms | 1.6872 KOps/s | 1.7323 KOps/s | |
test_creation | 45.5810μs | 1.5555μs | 642.8700 KOps/s | 647.4263 KOps/s | |
test_creation_empty | 27.5100μs | 8.2288μs | 121.5248 KOps/s | 123.3079 KOps/s | |
test_creation_nested_1 | 45.2810μs | 9.9301μs | 100.7040 KOps/s | 101.1327 KOps/s | |
test_creation_nested_2 | 35.4200μs | 12.3403μs | 81.0351 KOps/s | 81.8396 KOps/s | |
test_clone | 82.9310μs | 13.6118μs | 73.4656 KOps/s | 74.8148 KOps/s | |
test_getitem[int] | 43.8510μs | 10.5118μs | 95.1312 KOps/s | 95.3816 KOps/s | |
test_getitem[slice_int] | 47.1710μs | 20.1582μs | 49.6076 KOps/s | 48.8041 KOps/s | |
test_getitem[range] | 0.1515ms | 35.3505μs | 28.2881 KOps/s | 28.5866 KOps/s | |
test_getitem[tuple] | 62.2710μs | 18.2131μs | 54.9056 KOps/s | 54.4739 KOps/s | |
test_getitem[list] | 0.1730ms | 31.9813μs | 31.2683 KOps/s | 32.0131 KOps/s | |
test_setitem_dim[int] | 0.1533ms | 25.7352μs | 38.8572 KOps/s | 36.9770 KOps/s | |
test_setitem_dim[slice_int] | 71.0710μs | 45.6561μs | 21.9029 KOps/s | 21.6948 KOps/s | |
test_setitem_dim[range] | 83.8910μs | 58.4405μs | 17.1114 KOps/s | 16.4174 KOps/s | |
test_setitem_dim[tuple] | 65.2310μs | 40.0421μs | 24.9737 KOps/s | 24.2618 KOps/s | |
test_setitem | 65.2210μs | 18.1367μs | 55.1369 KOps/s | 54.9165 KOps/s | |
test_set | 68.1010μs | 17.7229μs | 56.4242 KOps/s | 56.5483 KOps/s | |
test_set_shared | 2.8645ms | 0.1019ms | 9.8172 KOps/s | 9.2530 KOps/s | |
test_update | 77.9120μs | 20.3988μs | 49.0225 KOps/s | 50.1076 KOps/s | |
test_update_nested | 72.7020μs | 26.7708μs | 37.3542 KOps/s | 37.5501 KOps/s | |
test_set_nested | 73.0510μs | 18.6605μs | 53.5891 KOps/s | 53.3405 KOps/s | |
test_set_nested_new | 0.1507ms | 21.9829μs | 45.4899 KOps/s | 44.8353 KOps/s | |
test_select | 0.1549ms | 34.6094μs | 28.8939 KOps/s | 29.0932 KOps/s | |
test_select_nested | 83.5110μs | 53.4305μs | 18.7159 KOps/s | 18.8529 KOps/s | |
test_exclude_nested | 0.1523ms | 0.1080ms | 9.2630 KOps/s | 9.4041 KOps/s | |
test_empty[True] | 0.3979ms | 0.3189ms | 3.1354 KOps/s | 3.1203 KOps/s | |
test_empty[False] | 2.8971μs | 0.8534μs | 1.1718 MOps/s | 1.1653 MOps/s | |
test_to | 70.5010μs | 50.5188μs | 19.7946 KOps/s | 19.2431 KOps/s | |
test_to_nonblocking | 0.1775ms | 32.0314μs | 31.2194 KOps/s | 30.5073 KOps/s | |
test_unbind_speed | 0.3539ms | 0.2594ms | 3.8550 KOps/s | 3.8277 KOps/s | |
test_unbind_speed_stack0 | 88.8827ms | 3.7073ms | 269.7395 Ops/s | 269.0517 Ops/s | |
test_unbind_speed_stack1 | 20.9200μs | 1.7993μs | 555.7570 KOps/s | 559.3722 KOps/s | |
test_split | 2.1643ms | 1.5258ms | 655.3886 Ops/s | 586.3826 Ops/s | |
test_chunk | 82.4656ms | 1.6555ms | 604.0583 Ops/s | 611.9235 Ops/s | |
test_creation[device0] | 0.2182ms | 70.0396μs | 14.2776 KOps/s | 14.3688 KOps/s | |
test_creation_from_tensor | 0.2318ms | 53.9095μs | 18.5496 KOps/s | 18.1711 KOps/s | |
test_add_one[memmap_tensor0] | 0.2323ms | 6.2598μs | 159.7491 KOps/s | 159.9731 KOps/s | |
test_contiguous[memmap_tensor0] | 11.9200μs | 0.6439μs | 1.5530 MOps/s | 1.6080 MOps/s | |
test_stack[memmap_tensor0] | 52.8810μs | 4.3416μs | 230.3280 KOps/s | 233.0558 KOps/s | |
test_memmaptd_index | 1.0615ms | 0.2590ms | 3.8616 KOps/s | 3.8381 KOps/s | |
test_memmaptd_index_astensor | 0.5732ms | 0.3151ms | 3.1736 KOps/s | 3.1652 KOps/s | |
test_memmaptd_index_op | 0.9178ms | 0.5826ms | 1.7165 KOps/s | 1.7058 KOps/s | |
test_serialize_model | 0.1715s | 97.6774ms | 10.2378 Ops/s | 9.7012 Ops/s | |
test_serialize_model_pickle | 1.3491s | 1.2358s | 0.8092 Ops/s | 0.8084 Ops/s | |
test_serialize_weights | 0.1731s | 96.2534ms | 10.3892 Ops/s | 10.1526 Ops/s | |
test_serialize_weights_returnearly | 0.2796s | 82.8811ms | 12.0655 Ops/s | 14.1696 Ops/s | |
test_serialize_weights_pickle | 1.3551s | 1.2369s | 0.8084 Ops/s | 0.8025 Ops/s | |
test_reshape_pytree | 57.3210μs | 24.3702μs | 41.0337 KOps/s | 40.9014 KOps/s | |
test_reshape_td | 0.1727ms | 28.6792μs | 34.8685 KOps/s | 35.2958 KOps/s | |
test_view_pytree | 0.1465ms | 24.0822μs | 41.5245 KOps/s | 42.2855 KOps/s | |
test_view_td | 0.5082ms | 6.7057μs | 149.1276 KOps/s | 233.0941 KOps/s | |
test_unbind_pytree | 68.5110μs | 29.7736μs | 33.5868 KOps/s | 33.5378 KOps/s | |
test_unbind_td | 0.3155ms | 39.6207μs | 25.2393 KOps/s | 24.1321 KOps/s | |
test_split_pytree | 62.9910μs | 27.6777μs | 36.1302 KOps/s | 35.5294 KOps/s | |
test_split_td | 0.1817ms | 38.2237μs | 26.1618 KOps/s | 26.2763 KOps/s | |
test_add_pytree | 0.1482ms | 34.4325μs | 29.0423 KOps/s | 29.3697 KOps/s | |
test_add_td | 84.0220μs | 44.6760μs | 22.3834 KOps/s | 21.1279 KOps/s | |
test_distributed | 0.2094ms | 69.1120μs | 14.4693 KOps/s | 13.5714 KOps/s | |
test_tdmodule | 37.6610μs | 17.4688μs | 57.2449 KOps/s | 57.1544 KOps/s | |
test_tdmodule_dispatch | 0.2096ms | 36.1061μs | 27.6961 KOps/s | 27.6691 KOps/s | |
test_tdseq | 0.1356ms | 20.3192μs | 49.2145 KOps/s | 48.9279 KOps/s | |
test_tdseq_dispatch | 67.4720μs | 38.4089μs | 26.0356 KOps/s | 25.7441 KOps/s | |
test_instantiation_functorch | 1.8052ms | 1.6502ms | 605.9815 Ops/s | 608.6008 Ops/s | |
test_instantiation_td | 1.6708ms | 1.1430ms | 874.9251 Ops/s | 868.7172 Ops/s | |
test_exec_functorch | 0.2136ms | 0.1561ms | 6.4075 KOps/s | 6.4281 KOps/s | |
test_exec_functional_call | 0.2268ms | 0.1511ms | 6.6188 KOps/s | 6.5880 KOps/s | |
test_exec_td | 0.2384ms | 0.1455ms | 6.8741 KOps/s | 7.0015 KOps/s | |
test_exec_td_decorator | 0.1149s | 0.2096ms | 4.7711 KOps/s | 5.4386 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.2841ms | 0.9987ms | 1.0013 KOps/s | 963.2093 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.7155ms | 0.5735ms | 1.7436 KOps/s | 1.6892 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0473ms | 0.9103ms | 1.0985 KOps/s | 1.0557 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6560ms | 0.5019ms | 1.9926 KOps/s | 1.9632 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.8769ms | 2.2657ms | 441.3618 Ops/s | 436.2224 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0580ms | 0.6166ms | 1.6218 KOps/s | 1.5884 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.3326ms | 1.8946ms | 527.8186 Ops/s | 527.0352 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9009ms | 0.5203ms | 1.9220 KOps/s | 1.9158 KOps/s | |
test_vmap_transformer_speed[True-True] | 11.9208ms | 11.7518ms | 85.0937 Ops/s | 85.0095 Ops/s | |
test_vmap_transformer_speed[True-False] | 7.7924ms | 7.6843ms | 130.1347 Ops/s | 130.0908 Ops/s | |
test_vmap_transformer_speed[False-True] | 11.8732ms | 11.6624ms | 85.7458 Ops/s | 85.2202 Ops/s | |
test_vmap_transformer_speed[False-False] | 7.7854ms | 7.5950ms | 131.6654 Ops/s | 130.8730 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 71.5844ms | 70.8657ms | 14.1112 Ops/s | 12.8763 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.0229ms | 18.4275ms | 54.2667 Ops/s | 53.7577 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 65.2971ms | 63.7827ms | 15.6782 Ops/s | 15.5960 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.6511ms | 18.0549ms | 55.3868 Ops/s | 50.0764 Ops/s |
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement
New feature or request
Soon we will make all ops non-lazy by default (for v0.3).
I expect this to be bc-breaking for some users, but most should not be impacted. In fact, life will be much easier (a lot of `.contiguous()` calls will be unnecessary).
The plan is to keep `LazyStackedTensorDict` there, as it's a useful abstraction to carry heterogeneous data structures or whenever one does not want to stack all the tensors of a data source.
In this PR, I set `lazy_legacy` to False and check that all the tests pass. The plan is to reset it to True before merging, so that we're sure this PR does not break anything in torchrl, for instance.
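
To make the idea of a switchable lazy/non-lazy default concrete, here is a minimal, standard-library-only sketch of a flag that defaults to an environment variable and can be overridden temporarily with a context manager. It is not tensordict's implementation: `LAZY_LEGACY_DEMO`, `lazy_legacy_enabled`, `set_lazy_legacy_demo` and `stack_demo` are all hypothetical names chosen for illustration.

```python
import os
from contextlib import contextmanager

# Hypothetical names for illustration only; not tensordict's actual code.
_ENV_VAR = "LAZY_LEGACY_DEMO"  # stand-in for the real environment variable
_override = None               # None means "defer to the environment"


def lazy_legacy_enabled() -> bool:
    """Effective flag: explicit override first, then the env var, else False."""
    if _override is not None:
        return _override
    return os.environ.get(_ENV_VAR, "0").lower() in ("1", "true")


@contextmanager
def set_lazy_legacy_demo(mode: bool):
    """Temporarily force the flag and restore the previous state on exit."""
    global _override
    previous = _override
    _override = mode
    try:
        yield
    finally:
        _override = previous


def stack_demo(items):
    """Return a lazy placeholder in legacy mode, an eager result otherwise."""
    if lazy_legacy_enabled():
        return ("lazy", tuple(items))  # inputs are only referenced, not merged
    return ("dense", list(items))      # inputs are materialized immediately
```

With such a switch, a test suite can be run once under each setting (for example by wrapping a test body in `with set_lazy_legacy_demo(False):`) without changing the shipped default, which is the workflow described above.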
The plan for `torch.stack` is that we'll be looking at the `lazy_legacy` env variable for 1 release: