Recompute linear_cache_indices for pipeline prefetching #2147
This pull request was exported from Phabricator. Differential Revision: D50983176
Summary: When pipeline prefetching is enabled (`prefetch_pipeline=True`) for `EmbeddingLocation.MANAGED_CACHING`, TBE has to update `lxu_cache_locations` before the backward pass to keep the cache consistent. That update takes `linear_cache_indices` as an input. Prior to this diff, TBE kept `linear_cache_indices` alive from prefetching until the `lxu_cache_locations` update consumed it. Holding the tensor that long increases memory pressure, which prevents enabling pipeline prefetching for some models. This diff addresses the memory limitation by recomputing `linear_cache_indices` when it is needed instead of keeping it alive. Differential Revision: D50983176
e93e108 to 26da350
Reviewed By: jspark1105
26da350 to 6fbcef2
6fbcef2 to 9ce5068
9ce5068 to 693bb6f
693bb6f to 6c8e0a6
This pull request has been merged in 37111f5.
Summary:
When pipeline prefetching is enabled (`prefetch_pipeline=True`) for
`EmbeddingLocation.MANAGED_CACHING`, TBE has to update
`lxu_cache_locations` before the backward pass to keep the cache
consistent. That update takes `linear_cache_indices` as an input. Prior
to this diff, TBE kept `linear_cache_indices` alive from prefetching
until the `lxu_cache_locations` update consumed it. Holding the tensor
that long increases memory pressure, which prevents enabling pipeline
prefetching for some models. This diff addresses the memory limitation
by recomputing `linear_cache_indices` when it is needed instead of
keeping it alive.
Differential Revision: D50983176
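
The trade-off described in the summary can be illustrated with a small, self-contained sketch. This is not the FBGEMM implementation: `toy_linearize`, `ToyCachedTable`, `table_offsets`, and the `recompute` flag are hypothetical names made up for illustration, plain Python lists stand in for GPU tensors, and the sketch assumes one index segment per table. It only shows that recomputing the linearized indices from the original `indices` and `offsets` yields the same cache locations as keeping the linearized copy alive between prefetch and the location update.

```python
from typing import Dict, List, Optional


def toy_linearize(table_offsets: List[int], indices: List[int], offsets: List[int]) -> List[int]:
    """Hypothetical stand-in for index linearization.

    table_offsets[t] is where table t starts in the shared linear index space;
    indices[offsets[t]:offsets[t + 1]] are the row indices that belong to table t.
    """
    linear = []
    for t in range(len(offsets) - 1):
        base = table_offsets[t]
        for i in range(offsets[t], offsets[t + 1]):
            linear.append(base + indices[i])
    return linear


class ToyCachedTable:
    """Toy cache that either keeps or recomputes the linearized indices."""

    def __init__(self, table_offsets: List[int], recompute: bool) -> None:
        self.table_offsets = table_offsets
        self.recompute = recompute
        # Held between prefetch and the location update only when recompute is False.
        self.saved_linear: Optional[List[int]] = None
        self.cache_slot: Dict[int, int] = {}  # linear index -> cache slot

    def prefetch(self, indices: List[int], offsets: List[int]) -> None:
        linear = toy_linearize(self.table_offsets, indices, offsets)
        for idx in linear:
            # Assign the next free slot to every index not yet cached.
            self.cache_slot.setdefault(idx, len(self.cache_slot))
        if not self.recompute:
            self.saved_linear = linear  # extra memory stays allocated

    def update_cache_locations(self, indices: List[int], offsets: List[int]) -> List[int]:
        if self.recompute:
            # Pay the linearization cost again instead of holding the list.
            linear = toy_linearize(self.table_offsets, indices, offsets)
        else:
            linear = self.saved_linear
            self.saved_linear = None
        # -1 marks an index that is not present in the cache.
        return [self.cache_slot.get(idx, -1) for idx in linear]


table_offsets = [0, 100]
indices = [3, 7, 3, 1]
offsets = [0, 2, 4]

keep = ToyCachedTable(table_offsets, recompute=False)
keep.prefetch(indices, offsets)

redo = ToyCachedTable(table_offsets, recompute=True)
redo.prefetch(indices, offsets)

# Both strategies resolve to the same cache locations.
assert keep.update_cache_locations(indices, offsets) == redo.update_cache_locations(indices, offsets)
```

In the sketch, the `recompute=True` path holds nothing between the two calls, at the cost of running the linearization a second time; that is the memory-for-compute exchange the summary describes.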