This repository has been archived by the owner on Jul 31, 2024. It is now read-only.

comm.py should maybe consider backend-specific support of different devices #30

Open
lucaslie opened this issue Jan 14, 2022 · 0 comments

lucaslie commented Jan 14, 2022

Depending on the backend, distributed communication may only be supported on the CPU or on the GPU; see the backend support table in the PyTorch distributed documentation.

Right now, communication in comm.py is always done on the GPU; see, e.g.:

# serialize
if context.rank() == src:
    tensor = _serialize(obj).cuda()

I would suggest handling backend-specific device support in both allgather() and broadcast() so that the functions are usable across multiple backends (a sketch of one possible approach follows below).
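As a minimal sketch, backend-specific device selection could look something like this. The helper name _comm_device is hypothetical; context.rank() and _serialize are the existing helpers from comm.py, and the NCCL/Gloo behavior assumed here follows the PyTorch backend table:

import torch
import torch.distributed as dist

def _comm_device() -> torch.device:
    # NCCL only supports CUDA tensors; Gloo (and typically MPI) can
    # communicate CPU tensors, so fall back to the CPU for those backends.
    if dist.get_backend() == dist.Backend.NCCL:
        return torch.device("cuda", torch.cuda.current_device())
    return torch.device("cpu")

# serialize (adapted from the snippet above; context and _serialize are
# the existing helpers in comm.py)
if context.rank() == src:
    tensor = _serialize(obj).to(_comm_device())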

torch.distributed.broadcast_object_list and torch.distributed.all_gather_object might be useful starting points for this.
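For reference, a hedged sketch of how those built-ins could replace the manual serialize-and-move path: they have been available since PyTorch 1.8, pickle arbitrary Python objects, and handle backend-appropriate device placement internally (src and obj here are reused from the snippet above):

import torch.distributed as dist

# Broadcast: every rank passes a list of the same length; after the call,
# non-src ranks receive src's object.
objects = [obj if dist.get_rank() == src else None]
dist.broadcast_object_list(objects, src=src)
obj = objects[0]

# All-gather: collect one object from every rank into `gathered`.
gathered = [None for _ in range(dist.get_world_size())]
dist.all_gather_object(gathered, obj)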

@zhijian-liu zhijian-liu self-assigned this Jan 22, 2022
@zhijian-liu zhijian-liu added the enhancement New feature or request label Jan 22, 2022