Commit

Fix Checkpoint issue when using Horovod distributed backend (PyTorchLightning#6947) (#6958)

Co-Authored-By: Adrian Wälchli <[email protected]>

(cherry picked from commit b37b58a)
liob authored and SeanNaren committed Apr 13, 2021
1 parent b60824f commit 84e6900
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion pytorch_lightning/plugins/training_type/horovod.py
@@ -136,7 +136,9 @@ def reduce(self, output, group: Optional[Any] = None, reduce_op: Optional[Union[
                 "Unset `group`."
             )

-        if reduce_op is None or reduce_op == "sum":
+        if reduce_op in (None, "avg", "mean"):
+            reduce_op = hvd.Average
+        elif reduce_op in ("sum", ReduceOp.SUM):
             reduce_op = hvd.Sum
         elif isinstance(reduce_op, str) and reduce_op in ("avg", "mean"):
             reduce_op = hvd.Average
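
Reading the hunk above, the behavioral change is that an unset `reduce_op` (`None`) now resolves to `hvd.Average` rather than `hvd.Sum`, and `ReduceOp.SUM` is accepted alongside the string "sum". The following is a minimal sketch of the before and after mapping, not the plugin's actual code. It runs without Horovod or PyTorch installed, so `AVERAGE`, `SUM`, and the `ReduceOp` enum are hypothetical stand-ins for `hvd.Average`, `hvd.Sum`, and the real `ReduceOp`. The function names and the `ValueError` fallback are also assumptions, since the rest of the method is not shown in the diff.

```python
from enum import Enum


class ReduceOp(Enum):
    """Hypothetical stand-in for the ReduceOp type referenced in the diff."""
    SUM = "SUM"


# Hypothetical stand-ins for hvd.Average and hvd.Sum (Horovod is not imported here).
AVERAGE = "hvd.Average"
SUM = "hvd.Sum"


def resolve_reduce_op_before(reduce_op=None):
    """Mapping as it reads on the removed line: None falls through to a sum."""
    if reduce_op is None or reduce_op == "sum":
        return SUM
    if isinstance(reduce_op, str) and reduce_op in ("avg", "mean"):
        return AVERAGE
    # Assumed fallback; the original else-branch is outside the shown hunk.
    raise ValueError(f"unrecognized `reduce_op`: {reduce_op!r}")


def resolve_reduce_op_after(reduce_op=None):
    """Mapping as it reads on the added lines: None now resolves to an average."""
    if reduce_op in (None, "avg", "mean"):
        return AVERAGE
    if reduce_op in ("sum", ReduceOp.SUM):
        return SUM
    # Assumed fallback; the original else-branch is outside the shown hunk.
    raise ValueError(f"unrecognized `reduce_op`: {reduce_op!r}")


if __name__ == "__main__":
    for op in (None, "avg", "mean", "sum"):
        print(repr(op), resolve_reduce_op_before(op), resolve_reduce_op_after(op))
```

Only the `None` case changes result between the two functions for the string inputs. `ReduceOp.SUM` is rejected by the first sketch and accepted by the second.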