[Feature] Using native torch.Tensor for memmap #554
Closed
facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Nov 8, 2023
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.2165ms | 19.9547μs | 50.1136 KOps/s | 49.8169 KOps/s | |
test_plain_set_stack_nested | 0.2109ms | 0.1859ms | 5.3786 KOps/s | 5.3328 KOps/s | |
test_plain_set_nested_inplace | 41.7000μs | 23.6245μs | 42.3289 KOps/s | 42.3551 KOps/s | |
test_plain_set_stack_nested_inplace | 0.9053ms | 0.2205ms | 4.5351 KOps/s | 4.5077 KOps/s | |
test_items | 68.4010μs | 3.3933μs | 294.7023 KOps/s | 289.2369 KOps/s | |
test_items_nested | 2.2512ms | 0.3761ms | 2.6590 KOps/s | 2.7689 KOps/s | |
test_items_nested_locked | 0.4630ms | 0.3746ms | 2.6697 KOps/s | 2.7114 KOps/s | |
test_items_nested_leaf | 0.2573ms | 0.2266ms | 4.4134 KOps/s | 4.5251 KOps/s | |
test_items_stack_nested | 1.9300ms | 1.8507ms | 540.3428 Ops/s | 548.3711 Ops/s | |
test_items_stack_nested_leaf | 1.7733ms | 1.6782ms | 595.8665 Ops/s | 609.8826 Ops/s | |
test_items_stack_nested_locked | 2.9863ms | 0.9972ms | 1.0028 KOps/s | 1.0285 KOps/s | |
test_keys | 64.6010μs | 5.1100μs | 195.6930 KOps/s | 198.6078 KOps/s | |
test_keys_nested | 1.1433ms | 0.1834ms | 5.4529 KOps/s | 4.9614 KOps/s | |
test_keys_nested_locked | 0.2114ms | 0.1814ms | 5.5124 KOps/s | 5.4895 KOps/s | |
test_keys_nested_leaf | 0.3254ms | 0.1741ms | 5.7443 KOps/s | 5.7169 KOps/s | |
test_keys_stack_nested | 1.9701ms | 1.7131ms | 583.7210 Ops/s | 594.8389 Ops/s | |
test_keys_stack_nested_leaf | 1.9219ms | 1.7093ms | 585.0240 Ops/s | 592.1189 Ops/s | |
test_keys_stack_nested_locked | 1.2155ms | 0.8534ms | 1.1717 KOps/s | 1.2147 KOps/s | |
test_values | 17.6010μs | 1.5496μs | 645.3435 KOps/s | 646.6685 KOps/s | |
test_values_nested | 0.1098ms | 66.8500μs | 14.9589 KOps/s | 14.9438 KOps/s | |
test_values_nested_locked | 0.1250ms | 67.1448μs | 14.8932 KOps/s | 14.9604 KOps/s | |
test_values_nested_leaf | 0.1112ms | 58.4599μs | 17.1057 KOps/s | 17.0415 KOps/s | |
test_values_stack_nested | 2.5307ms | 1.5751ms | 634.8784 Ops/s | 688.3212 Ops/s | |
test_values_stack_nested_leaf | 1.6564ms | 1.4746ms | 678.1706 Ops/s | 690.9528 Ops/s | |
test_values_stack_nested_locked | 0.7508ms | 0.6528ms | 1.5320 KOps/s | 1.5469 KOps/s | |
test_membership | 16.7000μs | 1.8025μs | 554.7919 KOps/s | 538.0480 KOps/s | |
test_membership_nested | 72.5010μs | 3.6758μs | 272.0470 KOps/s | 280.8171 KOps/s | |
test_membership_nested_leaf | 37.9010μs | 3.7307μs | 268.0475 KOps/s | 282.3751 KOps/s | |
test_membership_stacked_nested | 28.1000μs | 14.4149μs | 69.3726 KOps/s | 69.8285 KOps/s | |
test_membership_stacked_nested_leaf | 44.5010μs | 14.3137μs | 69.8634 KOps/s | 69.9738 KOps/s | |
test_membership_nested_last | 25.3000μs | 7.6266μs | 131.1195 KOps/s | 133.4385 KOps/s | |
test_membership_nested_leaf_last | 40.7010μs | 7.6233μs | 131.1763 KOps/s | 133.4076 KOps/s | |
test_membership_stacked_nested_last | 0.2586ms | 0.2288ms | 4.3709 KOps/s | 4.4373 KOps/s | |
test_membership_stacked_nested_leaf_last | 0.1041ms | 16.9126μs | 59.1276 KOps/s | 60.3171 KOps/s | |
test_nested_getleaf | 46.3010μs | 15.8512μs | 63.0869 KOps/s | 63.4089 KOps/s | |
test_nested_get | 41.1010μs | 15.0983μs | 66.2324 KOps/s | 66.8117 KOps/s | |
test_stacked_getleaf | 0.8806ms | 0.7751ms | 1.2902 KOps/s | 1.3298 KOps/s | |
test_stacked_get | 0.7789ms | 0.7403ms | 1.3507 KOps/s | 1.3887 KOps/s | |
test_nested_getitemleaf | 45.3000μs | 15.8170μs | 63.2229 KOps/s | 63.8070 KOps/s | |
test_nested_getitem | 45.6000μs | 15.0693μs | 66.3603 KOps/s | 66.5420 KOps/s | |
test_stacked_getitemleaf | 0.8661ms | 0.7773ms | 1.2865 KOps/s | 1.3270 KOps/s | |
test_stacked_getitem | 0.7882ms | 0.7412ms | 1.3492 KOps/s | 1.3871 KOps/s | |
test_lock_nested | 87.3496ms | 1.2588ms | 794.4000 Ops/s | 859.3824 Ops/s | |
test_lock_stack_nested | 0.1103s | 18.5665ms | 53.8604 Ops/s | 53.0010 Ops/s | |
test_unlock_nested | 85.6639ms | 1.2647ms | 790.6730 Ops/s | 791.6967 Ops/s | |
test_unlock_stack_nested | 0.1224s | 18.9277ms | 52.8325 Ops/s | 52.1645 Ops/s | |
test_flatten_speed | 0.9479ms | 0.8913ms | 1.1220 KOps/s | 1.1435 KOps/s | |
test_unflatten_speed | 1.5999ms | 1.5631ms | 639.7538 Ops/s | 643.2815 Ops/s | |
test_common_ops | 7.1427ms | 0.8486ms | 1.1785 KOps/s | 1.1826 KOps/s | |
test_creation | 30.0000μs | 3.0429μs | 328.6351 KOps/s | 335.7273 KOps/s | |
test_creation_empty | 40.0000μs | 9.6262μs | 103.8832 KOps/s | 104.7657 KOps/s | |
test_creation_nested_1 | 41.1000μs | 14.9491μs | 66.8937 KOps/s | 66.8246 KOps/s | |
test_creation_nested_2 | 52.3000μs | 18.2782μs | 54.7100 KOps/s | 56.3633 KOps/s | |
test_clone | 0.1030ms | 14.9301μs | 66.9789 KOps/s | 66.6809 KOps/s | |
test_getitem[int] | 51.5010μs | 17.7106μs | 56.4634 KOps/s | 56.9871 KOps/s | |
test_getitem[slice_int] | 0.1009ms | 42.8428μs | 23.3411 KOps/s | 24.3608 KOps/s | |
test_getitem[range] | 0.1076ms | 66.2080μs | 15.1039 KOps/s | 14.7580 KOps/s | |
test_getitem[tuple] | 53.6000μs | 33.3563μs | 29.9794 KOps/s | 30.0005 KOps/s | |
test_getitem[list] | 0.1236ms | 61.6785μs | 16.2131 KOps/s | 15.6210 KOps/s | |
test_setitem_dim[int] | 57.5010μs | 33.4385μs | 29.9056 KOps/s | 30.3521 KOps/s | |
test_setitem_dim[slice_int] | 86.4010μs | 59.0476μs | 16.9355 KOps/s | 16.9177 KOps/s | |
test_setitem_dim[range] | 99.7020μs | 77.7468μs | 12.8623 KOps/s | 12.5608 KOps/s | |
test_setitem_dim[tuple] | 67.6010μs | 49.1656μs | 20.3394 KOps/s | 20.2549 KOps/s | |
test_setitem | 0.1308ms | 20.7524μs | 48.1872 KOps/s | 47.1550 KOps/s | |
test_set | 97.6010μs | 19.9024μs | 50.2452 KOps/s | 49.9589 KOps/s | |
test_set_shared | 4.1728ms | 0.1897ms | 5.2727 KOps/s | 5.3118 KOps/s | |
test_update | 0.1290ms | 27.3040μs | 36.6246 KOps/s | 36.1333 KOps/s | |
test_update_nested | 0.2052ms | 38.3315μs | 26.0882 KOps/s | 25.7119 KOps/s | |
test_set_nested | 0.1297ms | 22.4563μs | 44.5310 KOps/s | 43.6582 KOps/s | |
test_set_nested_new | 0.1045ms | 31.9442μs | 31.3046 KOps/s | 31.6632 KOps/s | |
test_select | 0.2523ms | 61.0972μs | 16.3674 KOps/s | 16.6373 KOps/s | |
test_unbind_speed | 0.4089ms | 0.3736ms | 2.6766 KOps/s | 2.6513 KOps/s | |
test_unbind_speed_stack0 | 0.1036s | 6.5598ms | 152.4433 Ops/s | 155.7020 Ops/s | |
test_unbind_speed_stack1 | 29.0010μs | 1.1630μs | 859.8182 KOps/s | 1.0840 MOps/s | |
test_creation[device0] | 5.0859ms | 0.4586ms | 2.1804 KOps/s | 2.1703 KOps/s | |
test_creation_from_tensor | 4.5601ms | 0.5268ms | 1.8984 KOps/s | 1.9830 KOps/s | |
test_add_one[memmap_tensor0] | 1.9731ms | 33.2242μs | 30.0986 KOps/s | 29.0448 KOps/s | |
test_contiguous[memmap_tensor0] | 39.4010μs | 8.5837μs | 116.4993 KOps/s | 110.3460 KOps/s | |
test_stack[memmap_tensor0] | 87.4010μs | 27.5748μs | 36.2649 KOps/s | 37.1110 KOps/s | |
test_memmaptd_index | 0.4145ms | 0.3092ms | 3.2337 KOps/s | 3.2317 KOps/s | |
test_memmaptd_index_astensor | 1.5341ms | 1.2336ms | 810.6350 Ops/s | 826.4057 Ops/s | |
test_memmaptd_index_op | 4.9454ms | 2.6570ms | 376.3674 Ops/s | 373.1715 Ops/s | |
test_reshape_pytree | 93.7010μs | 32.8187μs | 30.4704 KOps/s | 30.0254 KOps/s | |
test_reshape_td | 81.0010μs | 28.1854μs | 35.4794 KOps/s | 35.0693 KOps/s | |
test_view_pytree | 93.9010μs | 32.7482μs | 30.5361 KOps/s | 29.1430 KOps/s | |
test_view_td | 22.9000μs | 5.7097μs | 175.1398 KOps/s | 176.0301 KOps/s | |
test_unbind_pytree | 80.0010μs | 37.8529μs | 26.4181 KOps/s | 26.3873 KOps/s | |
test_unbind_td | 93.6010μs | 54.6265μs | 18.3061 KOps/s | 18.5444 KOps/s | |
test_split_pytree | 0.1296ms | 37.6170μs | 26.5837 KOps/s | 26.7130 KOps/s | |
test_split_td | 0.1427ms | 0.1043ms | 9.5846 KOps/s | 9.9171 KOps/s | |
test_add_pytree | 90.6010μs | 47.2979μs | 21.1426 KOps/s | 21.0150 KOps/s | |
test_add_td | 0.1155ms | 59.9673μs | 16.6758 KOps/s | 16.8787 KOps/s | |
test_distributed | 49.8000μs | 8.8974μs | 112.3919 KOps/s | 109.7695 KOps/s | |
test_tdmodule | 0.1285ms | 25.7310μs | 38.8637 KOps/s | 34.9672 KOps/s | |
test_tdmodule_dispatch | 0.2872ms | 45.8991μs | 21.7869 KOps/s | 21.7395 KOps/s | |
test_tdseq | 55.8000μs | 31.0622μs | 32.1934 KOps/s | 30.0316 KOps/s | |
test_tdseq_dispatch | 0.6151ms | 55.1967μs | 18.1170 KOps/s | 17.7016 KOps/s | |
test_instantiation_functorch | 1.9643ms | 1.7000ms | 588.2423 Ops/s | 604.3940 Ops/s | |
test_instantiation_td | 2.0972ms | 1.3354ms | 748.8308 Ops/s | 754.3229 Ops/s | |
test_exec_functorch | 0.2503ms | 0.1980ms | 5.0499 KOps/s | 5.0515 KOps/s | |
test_exec_td | 0.2392ms | 0.1881ms | 5.3170 KOps/s | 5.3010 KOps/s | |
test_vmap_mlp_speed[True-True] | 11.0368ms | 1.1717ms | 853.4732 Ops/s | 878.5634 Ops/s | |
test_vmap_mlp_speed[True-False] | 14.3520ms | 0.6421ms | 1.5575 KOps/s | 1.6186 KOps/s | |
test_vmap_mlp_speed[False-True] | 5.0708ms | 0.9808ms | 1.0196 KOps/s | 978.6893 Ops/s | |
test_vmap_mlp_speed[False-False] | 9.2924ms | 0.4948ms | 2.0211 KOps/s | 2.0354 KOps/s |
vmoens added the enhancement (New feature or request) and Refactor (Refactoring code - not a new feature) labels on Nov 13, 2023
Because sharing a file-backed, non-shared tensor serializes it, we're closing this PR for the time being.
Introduces a new backend for memory-mapped tensors that doesn't rely on NumPy (`np`).
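For readers unfamiliar with the idea, here is a minimal sketch of a memory-mapped tensor built without NumPy. It assumes `torch.from_file` as the mapping primitive; the PR's actual implementation is not shown on this page and may differ. The helper name `memmap_tensor` and the file name `data.bin` are hypothetical.

```python
import os
import tempfile

import torch


def memmap_tensor(path, numel, dtype=torch.float32):
    """Hypothetical helper: map `numel` elements of `path` into a CPU tensor."""
    # shared=True requests a shared mapping, so writes to the tensor
    # are reflected in the backing file.
    return torch.from_file(path, shared=True, size=numel, dtype=dtype)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")  # illustrative file name
    numel = 6
    itemsize = torch.empty((), dtype=torch.float32).element_size()

    # Pre-allocate the backing file so the mapping has room for every element.
    with open(path, "wb") as f:
        f.write(bytes(numel * itemsize))

    t = memmap_tensor(path, numel)
    t.copy_(torch.arange(numel, dtype=torch.float32))  # write through the mapping
    del t  # release the mapping

    # Map the same file again and read the values back.
    reopened = memmap_tensor(path, numel)
    values = reopened.tolist()
    del reopened  # release before the temporary directory is removed
```

In this sketch the tensor's storage is the file itself, so no `np.memmap` object is involved at any point.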