[Feature] Memory-mapped nested tensors #618
Merged
facebook-github-bot added the CLA Signed label on Jan 15, 2024
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
test_plain_set_nested | 45.7860μs | 18.2277μs | 54.8614 KOps/s | 56.5695 KOps/s | |
test_plain_set_stack_nested | 40.1240μs | 18.0632μs | 55.3613 KOps/s | 56.9798 KOps/s | |
test_plain_set_nested_inplace | 75.2600μs | 20.4980μs | 48.7853 KOps/s | 50.6004 KOps/s | |
test_plain_set_stack_nested_inplace | 48.9820μs | 20.5627μs | 48.6316 KOps/s | 50.2712 KOps/s | |
test_items | 53.9800μs | 2.6472μs | 377.7578 KOps/s | 399.0030 KOps/s | |
test_items_nested | 0.4647ms | 0.2733ms | 3.6596 KOps/s | 3.7740 KOps/s | |
test_items_nested_locked | 1.3142ms | 0.2720ms | 3.6762 KOps/s | 3.7642 KOps/s | |
test_items_nested_leaf | 0.1410ms | 78.9915μs | 12.6596 KOps/s | 12.8765 KOps/s | |
test_items_stack_nested | 0.4515ms | 0.2721ms | 3.6751 KOps/s | 3.7083 KOps/s | |
test_items_stack_nested_leaf | 0.1633ms | 78.5239μs | 12.7350 KOps/s | 12.5144 KOps/s | |
test_items_stack_nested_locked | 0.3547ms | 0.2760ms | 3.6231 KOps/s | 3.7516 KOps/s | |
test_keys | 17.9040μs | 3.9339μs | 254.2022 KOps/s | 259.1716 KOps/s | |
test_keys_nested | 0.2645ms | 0.1385ms | 7.2207 KOps/s | 7.3393 KOps/s | |
test_keys_nested_locked | 0.6909ms | 0.1438ms | 6.9560 KOps/s | 7.1204 KOps/s | |
test_keys_nested_leaf | 0.2199ms | 0.1179ms | 8.4782 KOps/s | 8.8102 KOps/s | |
test_keys_stack_nested | 0.2291ms | 0.1359ms | 7.3573 KOps/s | 7.4408 KOps/s | |
test_keys_stack_nested_leaf | 0.2251ms | 0.1165ms | 8.5819 KOps/s | 8.7188 KOps/s | |
test_keys_stack_nested_locked | 0.2461ms | 0.1412ms | 7.0800 KOps/s | 7.1558 KOps/s | |
test_values | 6.3217μs | 1.1716μs | 853.5398 KOps/s | 864.8169 KOps/s | |
test_values_nested | 0.1023ms | 51.2144μs | 19.5258 KOps/s | 19.7718 KOps/s | |
test_values_nested_locked | 99.6250μs | 51.8848μs | 19.2735 KOps/s | 19.7298 KOps/s | |
test_values_nested_leaf | 0.1249ms | 46.4955μs | 21.5074 KOps/s | 21.7105 KOps/s | |
test_values_stack_nested | 0.1067ms | 52.3423μs | 19.1050 KOps/s | 19.0051 KOps/s | |
test_values_stack_nested_leaf | 88.1530μs | 45.7597μs | 21.8533 KOps/s | 21.6479 KOps/s | |
test_values_stack_nested_locked | 0.1068ms | 51.9422μs | 19.2522 KOps/s | 19.1400 KOps/s | |
test_membership | 23.5240μs | 1.3278μs | 753.1529 KOps/s | 711.6572 KOps/s | |
test_membership_nested | 22.0810μs | 3.4764μs | 287.6559 KOps/s | 288.0907 KOps/s | |
test_membership_nested_leaf | 26.6690μs | 3.4624μs | 288.8191 KOps/s | 285.8401 KOps/s | |
test_membership_stacked_nested | 18.1840μs | 3.4450μs | 290.2788 KOps/s | 288.2840 KOps/s | |
test_membership_stacked_nested_leaf | 26.4900μs | 3.4557μs | 289.3757 KOps/s | 289.4047 KOps/s | |
test_membership_nested_last | 31.6790μs | 4.2850μs | 233.3735 KOps/s | 234.8705 KOps/s | |
test_membership_nested_leaf_last | 24.7460μs | 4.2758μs | 233.8760 KOps/s | 240.7023 KOps/s | |
test_membership_stacked_nested_last | 31.7890μs | 13.8196μs | 72.3608 KOps/s | 211.2828 KOps/s | |
test_membership_stacked_nested_leaf_last | 35.1550μs | 13.7313μs | 72.8263 KOps/s | 207.6757 KOps/s | |
test_nested_getleaf | 34.5650μs | 10.8989μs | 91.7524 KOps/s | 92.6570 KOps/s | |
test_nested_get | 49.3620μs | 10.2502μs | 97.5594 KOps/s | 98.0809 KOps/s | |
test_stacked_getleaf | 31.7690μs | 10.7638μs | 92.9036 KOps/s | 93.9817 KOps/s | |
test_stacked_get | 43.6250μs | 10.0050μs | 99.9496 KOps/s | 98.4121 KOps/s | |
test_nested_getitemleaf | 51.3750μs | 11.3175μs | 88.3587 KOps/s | 87.4075 KOps/s | |
test_nested_getitem | 41.5470μs | 10.3518μs | 96.6016 KOps/s | 95.1406 KOps/s | |
test_stacked_getitemleaf | 43.7820μs | 11.2580μs | 88.8257 KOps/s | 88.3316 KOps/s | |
test_stacked_getitem | 29.5250μs | 10.4874μs | 95.3528 KOps/s | 95.2134 KOps/s | |
test_lock_nested | 51.3768ms | 0.4000ms | 2.5000 KOps/s | 2.8217 KOps/s | |
test_lock_stack_nested | 0.5846ms | 0.2964ms | 3.3741 KOps/s | 3.2670 KOps/s | |
test_unlock_nested | 0.7024ms | 0.3494ms | 2.8621 KOps/s | 2.4651 KOps/s | |
test_unlock_stack_nested | 0.3925ms | 0.3029ms | 3.3016 KOps/s | 3.1690 KOps/s | |
test_flatten_speed | 0.1791ms | 96.6209μs | 10.3497 KOps/s | 10.5530 KOps/s | |
test_unflatten_speed | 1.3479ms | 0.4315ms | 2.3177 KOps/s | 2.4194 KOps/s | |
test_common_ops | 4.1537ms | 0.7836ms | 1.2762 KOps/s | 1.3427 KOps/s | |
test_creation | 23.4440μs | 1.9530μs | 512.0210 KOps/s | 525.9846 KOps/s | |
test_creation_empty | 30.9170μs | 12.4589μs | 80.2638 KOps/s | 85.5076 KOps/s | |
test_creation_nested_1 | 49.8020μs | 15.5432μs | 64.3370 KOps/s | 66.2247 KOps/s | |
test_creation_nested_2 | 53.7300μs | 18.9296μs | 52.8273 KOps/s | 55.8277 KOps/s | |
test_clone | 76.1420μs | 13.5443μs | 73.8316 KOps/s | 72.6302 KOps/s | |
test_getitem[int] | 39.3430μs | 11.5166μs | 86.8313 KOps/s | 84.6807 KOps/s | |
test_getitem[slice_int] | 53.8800μs | 22.9536μs | 43.5661 KOps/s | 42.7721 KOps/s | |
test_getitem[range] | 85.1080μs | 61.0112μs | 16.3904 KOps/s | 15.6229 KOps/s | |
test_getitem[tuple] | 50.3640μs | 19.3397μs | 51.7071 KOps/s | 51.1106 KOps/s | |
test_getitem[list] | 0.1088ms | 41.3608μs | 24.1775 KOps/s | 23.6225 KOps/s | |
test_setitem_dim[int] | 56.1050μs | 36.8686μs | 27.1233 KOps/s | 27.5362 KOps/s | |
test_setitem_dim[slice_int] | 0.3353ms | 63.6019μs | 15.7228 KOps/s | 15.4846 KOps/s | |
test_setitem_dim[range] | 0.2355ms | 85.0775μs | 11.7540 KOps/s | 11.5190 KOps/s | |
test_setitem_dim[tuple] | 88.6740μs | 52.1191μs | 19.1868 KOps/s | 19.1052 KOps/s | |
test_setitem | 0.2940ms | 22.0002μs | 45.4541 KOps/s | 46.8677 KOps/s | |
test_set | 66.4440μs | 21.1004μs | 47.3925 KOps/s | 45.8582 KOps/s | |
test_set_shared | 1.8132ms | 0.1437ms | 6.9584 KOps/s | 6.8679 KOps/s | |
test_update | 0.3016ms | 24.6384μs | 40.5871 KOps/s | 42.5529 KOps/s | |
test_update_nested | 76.9830μs | 32.3593μs | 30.9030 KOps/s | 30.6379 KOps/s | |
test_update__nested | 72.4140μs | 25.0365μs | 39.9418 KOps/s | 38.7920 KOps/s | |
test_set_nested | 83.2550μs | 23.2616μs | 42.9893 KOps/s | 43.9523 KOps/s | |
test_set_nested_new | 70.4410μs | 27.1063μs | 36.8918 KOps/s | 37.6120 KOps/s | |
test_select | 89.1560μs | 43.1938μs | 23.1515 KOps/s | 24.0197 KOps/s | |
test_select_nested | 0.1130ms | 60.8250μs | 16.4406 KOps/s | 16.1877 KOps/s | |
test_exclude_nested | 0.2870ms | 0.1229ms | 8.1337 KOps/s | 8.1149 KOps/s | |
test_empty[True] | 1.0314ms | 0.4028ms | 2.4827 KOps/s | 2.5113 KOps/s | |
test_empty[False] | 20.4740μs | 1.0787μs | 927.0072 KOps/s | 917.9824 KOps/s | |
test_unbind_speed | 1.6171ms | 0.2626ms | 3.8080 KOps/s | 3.7414 KOps/s | |
test_unbind_speed_stack0 | 0.3678ms | 0.2485ms | 4.0241 KOps/s | 3.9386 KOps/s | |
test_unbind_speed_stack1 | 65.7003ms | 0.7391ms | 1.3530 KOps/s | 1.3140 KOps/s | |
test_split | 1.7195ms | 1.4950ms | 668.9017 Ops/s | 622.0807 Ops/s | |
test_chunk | 66.7037ms | 1.6088ms | 621.5862 Ops/s | 623.0959 Ops/s | |
test_creation[device0] | 0.1970ms | 0.1060ms | 9.4336 KOps/s | 9.2353 KOps/s | |
test_creation_from_tensor | 3.3174ms | 85.2446μs | 11.7309 KOps/s | 11.8106 KOps/s | |
test_add_one[memmap_tensor0] | 51.5560μs | 5.3353μs | 187.4303 KOps/s | 177.7118 KOps/s | |
test_contiguous[memmap_tensor0] | 16.6710μs | 0.6377μs | 1.5681 MOps/s | 1.5663 MOps/s | |
test_stack[memmap_tensor0] | 47.3540μs | 3.5913μs | 278.4491 KOps/s | 275.1956 KOps/s | |
test_memmaptd_index | 0.9927ms | 0.2499ms | 4.0014 KOps/s | 4.1409 KOps/s | |
test_memmaptd_index_astensor | 66.4264ms | 0.3506ms | 2.8521 KOps/s | 3.1706 KOps/s | |
test_memmaptd_index_op | 0.9467ms | 0.6383ms | 1.5667 KOps/s | 1.6119 KOps/s | |
test_serialize_model | 0.1727s | 0.1085s | 9.2153 Ops/s | 8.9537 Ops/s | |
test_serialize_model_pickle | 0.4470s | 0.3818s | 2.6190 Ops/s | 2.6059 Ops/s | |
test_serialize_weights | 0.1702s | 0.1079s | 9.2700 Ops/s | 9.3285 Ops/s | |
test_serialize_weights_returnearly | 0.1838s | 0.1303s | 7.6740 Ops/s | 7.5341 Ops/s | |
test_serialize_weights_pickle | 0.7037s | 0.4587s | 2.1801 Ops/s | 2.4446 Ops/s | |
test_serialize_weights_filesystem | 98.0380ms | 91.5172ms | 10.9269 Ops/s | 10.5872 Ops/s | |
test_serialize_model_filesystem | 0.1616s | 97.6320ms | 10.2425 Ops/s | 8.9940 Ops/s | |
test_reshape_pytree | 62.8770μs | 25.1758μs | 39.7206 KOps/s | 35.7473 KOps/s | |
test_reshape_td | 73.3460μs | 33.9450μs | 29.4594 KOps/s | 29.1542 KOps/s | |
test_view_pytree | 60.2320μs | 25.2184μs | 39.6536 KOps/s | 38.8682 KOps/s | |
test_view_td | 68.2870μs | 37.2996μs | 26.8099 KOps/s | 26.2492 KOps/s | |
test_unbind_pytree | 71.8640μs | 28.9349μs | 34.5603 KOps/s | 34.0054 KOps/s | |
test_unbind_td | 0.3873ms | 38.4378μs | 26.0161 KOps/s | 26.0593 KOps/s | |
test_split_pytree | 67.1850μs | 28.7848μs | 34.7406 KOps/s | 33.1698 KOps/s | |
test_split_td | 0.1204ms | 41.4001μs | 24.1545 KOps/s | 24.0333 KOps/s | |
test_add_pytree | 74.1180μs | 34.7226μs | 28.7997 KOps/s | 27.8086 KOps/s | |
test_add_td | 0.1301ms | 57.4540μs | 17.4052 KOps/s | 17.3799 KOps/s | |
test_distributed | 0.1781ms | 99.0099μs | 10.1000 KOps/s | 9.7654 KOps/s | |
test_tdmodule | 64.3890μs | 18.4375μs | 54.2374 KOps/s | 46.0387 KOps/s | |
test_tdmodule_dispatch | 54.9430μs | 36.6283μs | 27.3013 KOps/s | 23.2146 KOps/s | |
test_tdseq | 41.6280μs | 21.4955μs | 46.5215 KOps/s | 41.5284 KOps/s | |
test_tdseq_dispatch | 78.5360μs | 42.5859μs | 23.4820 KOps/s | 22.3103 KOps/s | |
test_instantiation_functorch | 1.7756ms | 1.3408ms | 745.8326 Ops/s | 730.0239 Ops/s | |
test_instantiation_td | 1.5530ms | 1.0322ms | 968.8174 Ops/s | 951.7012 Ops/s | |
test_exec_functorch | 0.2848ms | 0.1602ms | 6.2438 KOps/s | 5.5431 KOps/s | |
test_exec_functional_call | 0.3278ms | 0.1508ms | 6.6319 KOps/s | 5.8650 KOps/s | |
test_exec_td | 0.2957ms | 0.1500ms | 6.6652 KOps/s | 6.3201 KOps/s | |
test_exec_td_decorator | 0.7981ms | 0.2240ms | 4.4642 KOps/s | 4.3312 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8655ms | 0.4878ms | 2.0502 KOps/s | 1.9373 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7520ms | 0.4859ms | 2.0579 KOps/s | 1.9658 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6073ms | 0.3960ms | 2.5252 KOps/s | 2.4110 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6047ms | 0.4122ms | 2.4262 KOps/s | 2.4016 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2645ms | 0.5556ms | 1.7999 KOps/s | 1.7437 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7033ms | 0.5520ms | 1.8115 KOps/s | 1.7556 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7050ms | 0.4556ms | 2.1948 KOps/s | 2.1071 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6511ms | 0.4537ms | 2.2039 KOps/s | 2.1026 KOps/s | |
test_to_module_speed[True] | 1.7797ms | 1.6938ms | 590.3917 Ops/s | 574.4005 Ops/s | |
test_to_module_speed[False] | 1.7628ms | 1.6669ms | 599.9133 Ops/s | 590.7810 Ops/s |

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
test_plain_set_nested | 78.6910μs | 13.9830μs | 71.5156 KOps/s | 77.6362 KOps/s | |
test_plain_set_stack_nested | 0.1476ms | 0.1186ms | 8.4301 KOps/s | 8.4200 KOps/s | |
test_plain_set_nested_inplace | 36.8310μs | 15.2954μs | 65.3791 KOps/s | 70.3905 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1861ms | 0.1444ms | 6.9254 KOps/s | 6.8348 KOps/s | |
test_items | 19.6810μs | 4.7096μs | 212.3331 KOps/s | 210.0282 KOps/s | |
test_items_nested | 0.3950ms | 0.3406ms | 2.9362 KOps/s | 2.9094 KOps/s | |
test_items_nested_locked | 0.3919ms | 0.3414ms | 2.9294 KOps/s | 2.8880 KOps/s | |
test_items_nested_leaf | 0.2471ms | 0.1995ms | 5.0133 KOps/s | 4.9477 KOps/s | |
test_items_stack_nested | 1.4001ms | 1.3100ms | 763.3530 Ops/s | 752.2565 Ops/s | |
test_items_stack_nested_leaf | 1.2439ms | 1.1509ms | 868.8751 Ops/s | 856.1777 Ops/s | |
test_items_stack_nested_locked | 0.9757ms | 0.9140ms | 1.0940 KOps/s | 1.0753 KOps/s | |
test_keys | 24.8310μs | 4.6356μs | 215.7197 KOps/s | 208.0632 KOps/s | |
test_keys_nested | 0.7451ms | 95.7431μs | 10.4446 KOps/s | 10.5487 KOps/s | |
test_keys_nested_locked | 0.1230ms | 95.1373μs | 10.5111 KOps/s | 10.6231 KOps/s | |
test_keys_nested_leaf | 0.1821ms | 78.7503μs | 12.6984 KOps/s | 12.8588 KOps/s | |
test_keys_stack_nested | 1.2602ms | 1.1610ms | 861.3324 Ops/s | 856.8784 Ops/s | |
test_keys_stack_nested_leaf | 1.2693ms | 1.1408ms | 876.5694 Ops/s | 871.9504 Ops/s | |
test_keys_stack_nested_locked | 0.7946ms | 0.7336ms | 1.3631 KOps/s | 1.3696 KOps/s | |
test_values | 8.1770μs | 1.9073μs | 524.3097 KOps/s | 525.6596 KOps/s | |
test_values_nested | 66.7310μs | 44.9970μs | 22.2237 KOps/s | 21.9759 KOps/s | |
test_values_nested_locked | 67.7810μs | 46.9872μs | 21.2824 KOps/s | 21.0426 KOps/s | |
test_values_nested_leaf | 61.4710μs | 39.2926μs | 25.4501 KOps/s | 25.3630 KOps/s | |
test_values_stack_nested | 1.0456ms | 0.9617ms | 1.0398 KOps/s | 1.0194 KOps/s | |
test_values_stack_nested_leaf | 1.0169ms | 0.9559ms | 1.0461 KOps/s | 1.0393 KOps/s | |
test_values_stack_nested_locked | 0.6764ms | 0.5929ms | 1.6865 KOps/s | 1.7029 KOps/s | |
test_membership | 4.9982μs | 0.9345μs | 1.0701 MOps/s | 937.5690 KOps/s | |
test_membership_nested | 28.7500μs | 2.2541μs | 443.6363 KOps/s | 433.1978 KOps/s | |
test_membership_nested_leaf | 12.0250μs | 2.1823μs | 458.2336 KOps/s | 448.8892 KOps/s | |
test_membership_stacked_nested | 30.1210μs | 10.9830μs | 91.0500 KOps/s | 90.6520 KOps/s | |
test_membership_stacked_nested_leaf | 36.7600μs | 10.9325μs | 91.4707 KOps/s | 91.0358 KOps/s | |
test_membership_nested_last | 37.6310μs | 4.6744μs | 213.9325 KOps/s | 214.5814 KOps/s | |
test_membership_nested_leaf_last | 20.1100μs | 4.6652μs | 214.3551 KOps/s | 214.2584 KOps/s | |
test_membership_stacked_nested_last | 0.1872ms | 0.1367ms | 7.3144 KOps/s | 7.3445 KOps/s | |
test_membership_stacked_nested_leaf_last | 40.6610μs | 12.9835μs | 77.0209 KOps/s | 78.1081 KOps/s | |
test_nested_getleaf | 33.2600μs | 8.4535μs | 118.2938 KOps/s | 119.2126 KOps/s | |
test_nested_get | 29.4600μs | 7.9830μs | 125.2659 KOps/s | 125.8032 KOps/s | |
test_stacked_getleaf | 0.3860ms | 0.3241ms | 3.0854 KOps/s | 3.1171 KOps/s | |
test_stacked_get | 0.3542ms | 0.2911ms | 3.4356 KOps/s | 3.4771 KOps/s | |
test_nested_getitemleaf | 29.9200μs | 8.4899μs | 117.7865 KOps/s | 118.1802 KOps/s | |
test_nested_getitem | 40.8100μs | 8.0514μs | 124.2019 KOps/s | 125.6241 KOps/s | |
test_stacked_getitemleaf | 0.3925ms | 0.3261ms | 3.0666 KOps/s | 3.0963 KOps/s | |
test_stacked_getitem | 0.3636ms | 0.2925ms | 3.4186 KOps/s | 3.4866 KOps/s | |
test_lock_nested | 4.2707ms | 0.4195ms | 2.3837 KOps/s | 2.3965 KOps/s | |
test_lock_stack_nested | 84.7477ms | 6.6316ms | 150.7937 Ops/s | 152.2350 Ops/s | |
test_unlock_nested | 0.8417ms | 0.4149ms | 2.4103 KOps/s | 2.3959 KOps/s | |
test_unlock_stack_nested | 83.2413ms | 6.9509ms | 143.8668 Ops/s | 143.4190 Ops/s | |
test_flatten_speed | 0.8159ms | 0.2663ms | 3.7558 KOps/s | 3.8123 KOps/s | |
test_unflatten_speed | 0.4125ms | 0.3578ms | 2.7952 KOps/s | 2.7814 KOps/s | |
test_common_ops | 1.1133ms | 0.6304ms | 1.5863 KOps/s | 1.7145 KOps/s | |
test_creation | 16.7400μs | 1.6029μs | 623.8618 KOps/s | 620.0926 KOps/s | |
test_creation_empty | 36.6600μs | 9.2012μs | 108.6817 KOps/s | 155.4726 KOps/s | |
test_creation_nested_1 | 24.8300μs | 11.0711μs | 90.3257 KOps/s | 119.9234 KOps/s | |
test_creation_nested_2 | 35.4710μs | 15.4865μs | 64.5725 KOps/s | 77.0570 KOps/s | |
test_clone | 0.1077ms | 13.2683μs | 75.3673 KOps/s | 74.8796 KOps/s | |
test_getitem[int] | 33.2700μs | 11.5322μs | 86.7141 KOps/s | 87.6510 KOps/s | |
test_getitem[slice_int] | 42.0900μs | 22.3651μs | 44.7126 KOps/s | 46.1288 KOps/s | |
test_getitem[range] | 69.4210μs | 37.4616μs | 26.6940 KOps/s | 26.4708 KOps/s | |
test_getitem[tuple] | 81.8710μs | 19.6886μs | 50.7907 KOps/s | 50.6524 KOps/s | |
test_getitem[list] | 69.6110μs | 34.1562μs | 29.2773 KOps/s | 28.4869 KOps/s | |
test_setitem_dim[int] | 44.6910μs | 29.6466μs | 33.7307 KOps/s | 38.8304 KOps/s | |
test_setitem_dim[slice_int] | 0.1129ms | 49.6823μs | 20.1279 KOps/s | 21.3500 KOps/s | |
test_setitem_dim[range] | 81.9010μs | 64.4931μs | 15.5055 KOps/s | 16.3804 KOps/s | |
test_setitem_dim[tuple] | 60.8310μs | 43.4990μs | 22.9890 KOps/s | 24.9983 KOps/s | |
test_setitem | 0.1047ms | 18.2752μs | 54.7189 KOps/s | 58.6208 KOps/s | |
test_set | 0.1036ms | 17.7703μs | 56.2737 KOps/s | 61.5016 KOps/s | |
test_set_shared | 2.6725ms | 0.1059ms | 9.4456 KOps/s | 9.5698 KOps/s | |
test_update | 0.1052ms | 20.8988μs | 47.8497 KOps/s | 54.8069 KOps/s | |
test_update_nested | 0.1149ms | 27.2181μs | 36.7403 KOps/s | 40.9936 KOps/s | |
test_set_nested | 0.1037ms | 19.2626μs | 51.9142 KOps/s | 57.0433 KOps/s | |
test_set_nested_new | 0.1029ms | 22.0281μs | 45.3966 KOps/s | 48.5748 KOps/s | |
test_select | 72.1810μs | 43.2788μs | 23.1060 KOps/s | 24.2165 KOps/s | |
test_to | 74.2810μs | 54.5266μs | 18.3397 KOps/s | 17.9176 KOps/s | |
test_to_nonblocking | 60.0310μs | 34.9344μs | 28.6251 KOps/s | 28.5408 KOps/s | |
test_unbind_speed | 0.3938ms | 0.3311ms | 3.0198 KOps/s | 3.0484 KOps/s | |
test_unbind_speed_stack0 | 79.8306ms | 3.9118ms | 255.6381 Ops/s | 258.2614 Ops/s | |
test_unbind_speed_stack1 | 1.7020μs | 0.5404μs | 1.8506 MOps/s | 1.8817 MOps/s | |
test_split | 74.4525ms | 1.7626ms | 567.3481 Ops/s | 567.1699 Ops/s | |
test_chunk | 1.7427ms | 1.6126ms | 620.1160 Ops/s | 575.4602 Ops/s | |
test_creation[device0] | 0.1455ms | 72.9066μs | 13.7162 KOps/s | 13.7046 KOps/s | |
test_creation_from_tensor | 0.1557ms | 54.8215μs | 18.2410 KOps/s | 17.5030 KOps/s | |
test_add_one[memmap_tensor0] | 0.1309ms | 7.1415μs | 140.0263 KOps/s | 139.4473 KOps/s | |
test_contiguous[memmap_tensor0] | 10.5100μs | 0.6594μs | 1.5164 MOps/s | 1.5558 MOps/s | |
test_stack[memmap_tensor0] | 28.7600μs | 4.7133μs | 212.1642 KOps/s | 215.9260 KOps/s | |
test_memmaptd_index | 0.3078ms | 0.2473ms | 4.0440 KOps/s | 4.0337 KOps/s | |
test_memmaptd_index_astensor | 0.3792ms | 0.3054ms | 3.2741 KOps/s | 3.2479 KOps/s | |
test_memmaptd_index_op | 0.7982ms | 0.6117ms | 1.6348 KOps/s | 1.7084 KOps/s | |
test_serialize_model | 92.3941ms | 88.7824ms | 11.2635 Ops/s | 9.7365 Ops/s | |
test_serialize_model_pickle | 1.6736s | 1.3043s | 0.7667 Ops/s | 0.8078 Ops/s | |
test_serialize_weights | 0.1642s | 94.2504ms | 10.6100 Ops/s | 9.8813 Ops/s | |
test_serialize_weights_returnearly | 0.2556s | 77.3048ms | 12.9358 Ops/s | 14.7583 Ops/s | |
test_serialize_weights_pickle | 1.3508s | 1.2364s | 0.8088 Ops/s | 0.8086 Ops/s | |
test_reshape_pytree | 52.0010μs | 24.5403μs | 40.7493 KOps/s | 40.7775 KOps/s | |
test_reshape_td | 45.9610μs | 29.7057μs | 33.6636 KOps/s | 34.0989 KOps/s | |
test_view_pytree | 48.0310μs | 24.3914μs | 40.9980 KOps/s | 40.9617 KOps/s | |
test_view_td | 16.9610μs | 4.0917μs | 244.3992 KOps/s | 243.8218 KOps/s | |
test_unbind_pytree | 53.2110μs | 30.7500μs | 32.5203 KOps/s | 32.8811 KOps/s | |
test_unbind_td | 84.9110μs | 52.9478μs | 18.8865 KOps/s | 19.2954 KOps/s | |
test_split_pytree | 56.4810μs | 28.6968μs | 34.8471 KOps/s | 35.1274 KOps/s | |
test_split_td | 0.7338ms | 42.1409μs | 23.7299 KOps/s | 24.7466 KOps/s | |
test_add_pytree | 63.6910μs | 36.4969μs | 27.3996 KOps/s | 25.8637 KOps/s | |
test_add_td | 84.9510μs | 49.9184μs | 20.0327 KOps/s | 21.3978 KOps/s | |
test_distributed | 1.9565ms | 77.3753μs | 12.9240 KOps/s | 13.8942 KOps/s | |
test_tdmodule | 0.1091ms | 18.5853μs | 53.8059 KOps/s | 59.7216 KOps/s | |
test_tdmodule_dispatch | 0.1528ms | 35.9143μs | 27.8441 KOps/s | 30.9851 KOps/s | |
test_tdseq | 36.3500μs | 21.7804μs | 45.9129 KOps/s | 51.0085 KOps/s | |
test_tdseq_dispatch | 65.8600μs | 39.2804μs | 25.4580 KOps/s | 28.5701 KOps/s | |
test_instantiation_functorch | 1.8348ms | 1.7069ms | 585.8599 Ops/s | 597.4758 Ops/s | |
test_instantiation_td | 1.8416ms | 1.1939ms | 837.5702 Ops/s | 854.5622 Ops/s | |
test_exec_functorch | 0.2185ms | 0.1634ms | 6.1216 KOps/s | 6.1819 KOps/s | |
test_exec_functional_call | 0.2210ms | 0.1626ms | 6.1503 KOps/s | 5.9580 KOps/s | |
test_exec_td | 0.2168ms | 0.1555ms | 6.4288 KOps/s | 6.4201 KOps/s | |
test_exec_td_decorator | 0.9811ms | 0.1975ms | 5.0628 KOps/s | 5.1379 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.2178ms | 1.1361ms | 880.2297 Ops/s | 895.9307 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.8477ms | 0.6769ms | 1.4774 KOps/s | 1.4907 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1215ms | 1.0403ms | 961.2550 Ops/s | 959.9178 Ops/s | |
test_vmap_mlp_speed[False-False] | 0.6912ms | 0.6014ms | 1.6628 KOps/s | 1.5998 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.2957ms | 2.5889ms | 386.2585 Ops/s | 404.0691 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9916ms | 0.7241ms | 1.3810 KOps/s | 1.3785 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.5446ms | 2.1532ms | 464.4220 Ops/s | 484.3077 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0656ms | 0.6204ms | 1.6118 KOps/s | 1.6002 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.8833ms | 12.6478ms | 79.0652 Ops/s | 80.0815 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.6056ms | 8.3497ms | 119.7645 Ops/s | 120.6891 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.9328ms | 12.6050ms | 79.3335 Ops/s | 80.9700 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.5080ms | 8.2825ms | 120.7365 Ops/s | 121.5871 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 0.1669s | 84.6779ms | 11.8095 Ops/s | 12.2013 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 21.7837ms | 20.0070ms | 49.9826 Ops/s | 50.0742 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 71.5679ms | 70.2682ms | 14.2312 Ops/s | 14.7320 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 0.1180s | 21.5114ms | 46.4870 Ops/s | 46.4841 Ops/s |

Blocked by pytorch/pytorch#117711
# Conflicts:
#	tensordict/persistent.py
#	test/test_h5.py
Labels: CLA Signed, enhancement
This PR makes it possible to create memory-mapped tensors with heterogeneous (jagged) shapes using nested tensors as a backend.
The use case I have in mind is amortizing the cost of reading files: the reading can be done once and for all, provided there is enough space to store a single gigantic uint8 tensor on some scratch storage. More elaborate pipelines are conceivable, for example if the dataset is split into several chunks.
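To make the caching idea concrete, here is a minimal sketch of the layout it relies on: items of different shapes are packed back to back into one flat uint8 buffer on disk, and a table of shapes and byte offsets recovers each item from a memory-mapped view. This uses only the Python standard library and is not the API added by this PR; `pack_jagged`, `read_item`, and the file name `cache.bin` are hypothetical names chosen for illustration.

```python
import math
import mmap
import os
import tempfile


def pack_jagged(path, arrays):
    """Write (shape, payload) pairs back to back; return shapes and byte offsets."""
    shapes, offsets = [], [0]
    with open(path, "wb") as f:
        for shape, payload in arrays:
            if len(payload) != math.prod(shape):  # one uint8 per element
                raise ValueError("payload size does not match shape")
            f.write(payload)
            shapes.append(shape)
            offsets.append(offsets[-1] + len(payload))
    return shapes, offsets


def read_item(buffer, shapes, offsets, index):
    """Return (shape, bytes) for one item by slicing the memory-mapped buffer."""
    return shapes[index], buffer[offsets[index]:offsets[index + 1]]


# Three stand-ins for decoded images with different (height, width) shapes.
arrays = [
    (shape, bytes(range(math.prod(shape))))
    for shape in [(2, 3), (4, 1), (1, 5)]
]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "cache.bin")  # hypothetical scratch file
    shapes, offsets = pack_jagged(path, arrays)  # the expensive step, done once
    with open(path, "rb") as f:
        buffer = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            # Later reads only touch the bytes of the requested item.
            shape, payload = read_item(buffer, shapes, offsets, 1)
        finally:
            buffer.close()
```

A nested-tensor backend plays the role of the shapes and offsets table here, so that indexing the memory-mapped storage returns an item with its own shape.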
The gist linked below is a code example with torchvision: we decode a small dataset into a single tensor to cache the decoding phase, then run a similar preprocessing step using resize with a resized buffer.
cc @albanD @cpuhrsch @NicolasHug @mikaylagawarecki
Gist: https://gist.github.com/vmoens/d50dc6a7defe823444bcc80143bf37fd