Fix deadlinks in docs (#10739)
Co-authored-by: Ethan Harris <[email protected]>
kaushikb11 and ethanwharris authored Dec 2, 2021
1 parent 5b9995d commit 541b983
Showing 4 changed files with 7 additions and 5 deletions.
4 changes: 2 additions & 2 deletions docs/source/advanced/advanced_gpu.rst
@@ -117,7 +117,7 @@ To activate parameter sharding, you must wrap your model using provided ``wrap``
When not using Fully Sharded, these wrap functions are no-ops, which means that once the changes have been made, there is no need to remove them for other plugins.

``auto_wrap`` will recursively wrap `torch.nn.Modules` within the ``LightningModule`` with nested Fully Sharded Wrappers,
signalling that we'd like to partition these modules across data parallel devices, discarding the full weights when not required (information `here <https://fairscale.readthedocs.io/en/latest/api/nn/fsdp_tips.html>`__).
signalling that we'd like to partition these modules across data parallel devices, discarding the full weights when not required (information :class:`here <fairscale.nn.fsdp>`).

``auto_wrap`` can have varying levels of success depending on the complexity of your model. **Auto Wrap does not support models with shared parameters**.
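
For readers less familiar with the FairScale API, the snippet below is a minimal, illustrative sketch of the wrapping described above (not taken from this commit). It assumes FairScale is installed, that ``wrap`` and ``auto_wrap`` are imported from ``fairscale.nn``, and the layer shapes are made up.

.. code-block:: python

    import torch.nn as nn
    import pytorch_lightning as pl
    from fairscale.nn import auto_wrap, wrap


    class MyModel(pl.LightningModule):
        def configure_sharded_model(self):
            # ``wrap`` shards a single module; ``auto_wrap`` recursively wraps
            # submodules in nested Fully Sharded wrappers. Outside the Fully
            # Sharded plugin both calls are no-ops and return the module unchanged.
            self.linear = wrap(nn.Linear(32, 32))
            self.block = auto_wrap(nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 2)))

        def forward(self, x):
            return self.block(self.linear(x))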

@@ -182,7 +182,7 @@ Activation checkpointing frees activations from memory as soon as they are not n

Unlike the PyTorch implementation, FairScale's checkpointing wrapper also handles batch norm layers correctly, ensuring their statistics are tracked correctly despite the multiple forward passes.

This saves memory when training larger models; however, it requires wrapping the modules you'd like to use activation checkpointing on. See `here <https://fairscale.readthedocs.io/en/latest/api/nn/misc/checkpoint_activations.html>`__ for more information.
This saves memory when training larger models; however, it requires wrapping the modules you'd like to use activation checkpointing on. See :class:`here <fairscale.nn.checkpoint.checkpoint_wrapper>` for more information.
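
As an illustration of that wrapping requirement, here is a minimal sketch (assuming ``checkpoint_wrapper`` is imported from ``fairscale.nn``; the module and layer sizes are arbitrary).

.. code-block:: python

    import torch.nn as nn
    from fairscale.nn import checkpoint_wrapper


    class EncoderBlock(nn.Module):
        def __init__(self):
            super().__init__()
            # Activations inside the wrapped submodule are freed after the forward
            # pass and recomputed during backward, trading compute for memory.
            self.feed_forward = checkpoint_wrapper(
                nn.Sequential(nn.Linear(128, 512), nn.GELU(), nn.Linear(512, 128))
            )

        def forward(self, x):
            return x + self.feed_forward(x)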

.. warning::

4 changes: 2 additions & 2 deletions docs/source/advanced/ipu.rst
@@ -114,7 +114,7 @@ PopVision Graph Analyser
:alt: PopVision Graph Analyser
:width: 500

Lightning supports integration with the `PopVision Graph Analyser Tool <https://docs.graphcore.ai/projects/graphcore-popvision-user-guide/en/latest/popvision.html>`__. This helps you inspect the utilization of your IPU devices and provides helpful metrics during the lifecycle of your trainer. Once you have gained access, the PopVision Graph Analyser Tool can be downloaded via the `Graphcore download website <https://downloads.graphcore.ai/>`__.
Lightning supports integration with the `PopVision Graph Analyser Tool <https://docs.graphcore.ai/projects/graph-analyser-userguide/en/latest/>`__. This helps you inspect the utilization of your IPU devices and provides helpful metrics during the lifecycle of your trainer. Once you have gained access, the PopVision Graph Analyser Tool can be downloaded via the `Graphcore download website <https://downloads.graphcore.ai/>`__.

Lightning supports dumping all reports to a directory to open using the tool.

@@ -127,7 +127,7 @@ Lightning supports dumping all reports to a directory to open using the tool.
trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(autoreport_dir="report_dir/"))
trainer.fit(model)

This will dump all reports to ``report_dir/``, which can then be opened using the Graph Analyser Tool; see `Opening Reports <https://docs.graphcore.ai/projects/graphcore-popvision-user-guide/en/latest/graph/graph.html#opening-reports>`__.
This will dump all reports to ``report_dir/``, which can then be opened using the Graph Analyser Tool; see `Opening Reports <https://docs.graphcore.ai/projects/graph-analyser-userguide/en/latest/graph-analyser.html#opening-reports>`__.
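
For completeness, the snippet above in a self-contained form. This is a sketch only: it assumes ``IPUPlugin`` is importable from ``pytorch_lightning.plugins`` in this Lightning version, and ``MyLightningModule`` is a hypothetical placeholder for your own module.

.. code-block:: python

    import pytorch_lightning as pl
    from pytorch_lightning.plugins import IPUPlugin

    model = MyLightningModule()  # hypothetical LightningModule defined elsewhere

    # Dump PopVision reports for the executed graphs into ``report_dir/``.
    trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(autoreport_dir="report_dir/"))
    trainer.fit(model)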

.. _ipu-model-parallelism:

2 changes: 2 additions & 0 deletions docs/source/conf.py
@@ -273,6 +273,8 @@ def _transform_changelog(path_in: str, path_out: str) -> None:
"numpy": ("https://numpy.org/doc/stable/", None),
"PIL": ("https://pillow.readthedocs.io/en/stable/", None),
"torchmetrics": ("https://torchmetrics.readthedocs.io/en/stable/", None),
"fairscale": ("https://fairscale.readthedocs.io/en/latest/", None),
"graphcore": ("https://docs.graphcore.ai/en/latest/", None),
}

# -- Options for todo extension ----------------------------------------------
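
For context, the two new ``intersphinx_mapping`` entries above are what let roles such as the ``:class:`` references added in ``advanced_gpu.rst`` resolve against external documentation. A minimal sketch of the mechanism in a Sphinx ``conf.py`` (illustrative only; the two entries are the ones from this diff, the rest is an assumption about a typical setup):

.. code-block:: python

    # conf.py (sketch)
    extensions = ["sphinx.ext.intersphinx"]

    # Each entry maps a project name to the base URL of its published docs.
    # Sphinx fetches ``objects.inv`` from that URL and uses it to resolve
    # cross-references (e.g. ``:class:`` roles) that point at that project.
    intersphinx_mapping = {
        "fairscale": ("https://fairscale.readthedocs.io/en/latest/", None),
        "graphcore": ("https://docs.graphcore.ai/en/latest/", None),
    }
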
@@ -21,7 +21,7 @@ class FullyShardedNativeMixedPrecisionPlugin(ShardedNativeMixedPrecisionPlugin):
"""Native AMP for Fully Sharded Training."""

def clip_grad_by_norm(self, *_: Any, **__: Any) -> None:
# see https://fairscale.readthedocs.io/en/latest/api/nn/fsdp_tips.html
# see https://fairscale.readthedocs.io/en/latest/api/nn/fsdp.html
# section `Gradient Clipping`: using `torch.nn.utils.clip_grad_norm_` is incorrect
# for an FSDP module. To overcome this, one needs to call sharded_module.clip_grad_norm(clip_val);
# however, we rely on the LightningModule's configure_sharded_model to wrap FSDP, so it would be hard to
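
To make the comment above concrete, here is a hedged sketch of the distinction it describes. This is not the plugin's implementation; it assumes a FairScale ``FullyShardedDataParallel`` instance, and the exact method name (``clip_grad_norm_`` vs. ``clip_grad_norm``) may vary between FairScale releases, so check the installed version.

.. code-block:: python

    from fairscale.nn import FullyShardedDataParallel as FSDP


    def clip_gradients(sharded_module: FSDP, clip_val: float) -> None:
        # Incorrect for FSDP: each rank only holds a shard of the parameters, so
        # torch.nn.utils.clip_grad_norm_(sharded_module.parameters(), clip_val)
        # would clip against a local gradient norm rather than the global one.

        # Instead, let the FSDP wrapper aggregate the norm across shards first
        # (spelled ``clip_grad_norm`` in some releases).
        sharded_module.clip_grad_norm_(clip_val)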
