Fix deadlinks in docs (#10739)
Co-authored-by: Ethan Harris <[email protected]>
kaushikb11 and ethanwharris authored Dec 2, 2021
1 parent 5b9995d commit 541b983
Showing 4 changed files with 7 additions and 5 deletions.
4 changes: 2 additions & 2 deletions docs/source/advanced/advanced_gpu.rst
@@ -117,7 +117,7 @@ To activate parameter sharding, you must wrap your model using provided ``wrap``
When not using Fully Sharded, these wrap functions are no-ops, which means that once the changes have been made, there is no need to remove them for other plugins.

``auto_wrap`` will recursively wrap `torch.nn.Modules` within the ``LightningModule`` with nested Fully Sharded Wrappers,
signalling that we'd like to partition these modules across data parallel devices, discarding the full weights when not required (information `here <https://fairscale.readthedocs.io/en/latest/api/nn/fsdp_tips.html>`__).
signalling that we'd like to partition these modules across data parallel devices, discarding the full weights when not required (information :class:`here <fairscale.nn.fsdp>`).

``auto_wrap`` can have varying levels of success depending on the complexity of your model. **Auto Wrap does not support models with shared parameters**.
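
For readers less familiar with the FairScale API, the snippet below is a minimal, illustrative sketch of the wrapping described above (not taken from this commit). It assumes FairScale is installed, that ``wrap`` and ``auto_wrap`` are imported from ``fairscale.nn``, and the layer shapes are made up.

.. code-block:: python

    import torch.nn as nn
    import pytorch_lightning as pl
    from fairscale.nn import auto_wrap, wrap


    class MyModel(pl.LightningModule):
        def configure_sharded_model(self):
            # ``wrap`` shards a single module; ``auto_wrap`` recursively wraps
            # submodules in nested Fully Sharded wrappers. Outside the Fully
            # Sharded plugin both calls are no-ops and return the module unchanged.
            self.linear = wrap(nn.Linear(32, 32))
            self.block = auto_wrap(nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 2)))

        def forward(self, x):
            return self.block(self.linear(x))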

@@ -182,7 +182,7 @@ Activation checkpointing frees activations from memory as soon as they are not n

Unlike the PyTorch implementation, FairScale's checkpointing wrapper also handles batch norm layers correctly, ensuring their statistics are tracked correctly despite the multiple forward passes.

This saves memory when training larger models; however, it requires wrapping the modules you'd like to use activation checkpointing on. See `here <https://fairscale.readthedocs.io/en/latest/api/nn/misc/checkpoint_activations.html>`__ for more information.
This saves memory when training larger models; however, it requires wrapping the modules you'd like to use activation checkpointing on. See :class:`here <fairscale.nn.checkpoint.checkpoint_wrapper>` for more information.
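
As an illustration of that wrapping requirement, here is a minimal sketch (assuming ``checkpoint_wrapper`` is imported from ``fairscale.nn``; the module and layer sizes are arbitrary).

.. code-block:: python

    import torch.nn as nn
    from fairscale.nn import checkpoint_wrapper


    class EncoderBlock(nn.Module):
        def __init__(self):
            super().__init__()
            # Activations inside the wrapped submodule are freed after the forward
            # pass and recomputed during backward, trading compute for memory.
            self.feed_forward = checkpoint_wrapper(
                nn.Sequential(nn.Linear(128, 512), nn.GELU(), nn.Linear(512, 128))
            )

        def forward(self, x):
            return x + self.feed_forward(x)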

.. warning::

4 changes: 2 additions & 2 deletions docs/source/advanced/ipu.rst
@@ -114,7 +114,7 @@ PopVision Graph Analyser
:alt: PopVision Graph Analyser
:width: 500

Lightning supports integration with the `PopVision Graph Analyser Tool <https://docs.graphcore.ai/projects/graphcore-popvision-user-guide/en/latest/popvision.html>`__. This helps you inspect the utilization of your IPU devices and provides helpful metrics during the lifecycle of your trainer. Once you have gained access, the PopVision Graph Analyser Tool can be downloaded via the `Graphcore download website <https://downloads.graphcore.ai/>`__.
Lightning supports integration with the `PopVision Graph Analyser Tool <https://docs.graphcore.ai/projects/graph-analyser-userguide/en/latest/>`__. This helps you inspect the utilization of your IPU devices and provides helpful metrics during the lifecycle of your trainer. Once you have gained access, the PopVision Graph Analyser Tool can be downloaded via the `Graphcore download website <https://downloads.graphcore.ai/>`__.

Lightning supports dumping all reports to a directory to open using the tool.

@@ -127,7 +127,7 @@ Lightning supports dumping all reports to a directory to open using the tool.
trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(autoreport_dir="report_dir/"))
trainer.fit(model)

This will dump all reports to ``report_dir/``, which can then be opened using the Graph Analyser Tool; see `Opening Reports <https://docs.graphcore.ai/projects/graphcore-popvision-user-guide/en/latest/graph/graph.html#opening-reports>`__.
This will dump all reports to ``report_dir/``, which can then be opened using the Graph Analyser Tool; see `Opening Reports <https://docs.graphcore.ai/projects/graph-analyser-userguide/en/latest/graph-analyser.html#opening-reports>`__.
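
For completeness, the snippet above in a self-contained form. This is a sketch only: it assumes ``IPUPlugin`` is importable from ``pytorch_lightning.plugins`` in this Lightning version, and ``MyLightningModule`` is a hypothetical placeholder for your own module.

.. code-block:: python

    import pytorch_lightning as pl
    from pytorch_lightning.plugins import IPUPlugin

    model = MyLightningModule()  # hypothetical LightningModule defined elsewhere

    # Dump PopVision reports for the executed graphs into ``report_dir/``.
    trainer = pl.Trainer(ipus=8, strategy=IPUPlugin(autoreport_dir="report_dir/"))
    trainer.fit(model)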

.. _ipu-model-parallelism:

2 changes: 2 additions & 0 deletions docs/source/conf.py
@@ -273,6 +273,8 @@ def _transform_changelog(path_in: str, path_out: str) -> None:
"numpy": ("https://numpy.org/doc/stable/", None),
"PIL": ("https://pillow.readthedocs.io/en/stable/", None),
"torchmetrics": ("https://torchmetrics.readthedocs.io/en/stable/", None),
"fairscale": ("https://fairscale.readthedocs.io/en/latest/", None),
"graphcore": ("https://docs.graphcore.ai/en/latest/", None),
}

# -- Options for todo extension ----------------------------------------------
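
For context, the two new ``intersphinx_mapping`` entries above are what let roles such as the ``:class:`` references added in ``advanced_gpu.rst`` resolve against external documentation. A minimal sketch of the mechanism in a Sphinx ``conf.py`` (illustrative only; the two entries are the ones from this diff, the rest is an assumption about a typical setup):

.. code-block:: python

    # conf.py (sketch)
    extensions = ["sphinx.ext.intersphinx"]

    # Each entry maps a project name to the base URL of its published docs.
    # Sphinx fetches ``objects.inv`` from that URL and uses it to resolve
    # cross-references (e.g. ``:class:`` roles) that point at that project.
    intersphinx_mapping = {
        "fairscale": ("https://fairscale.readthedocs.io/en/latest/", None),
        "graphcore": ("https://docs.graphcore.ai/en/latest/", None),
    }
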
@@ -21,7 +21,7 @@ class FullyShardedNativeMixedPrecisionPlugin(ShardedNativeMixedPrecisionPlugin):
"""Native AMP for Fully Sharded Training."""

def clip_grad_by_norm(self, *_: Any, **__: Any) -> None:
# see https://fairscale.readthedocs.io/en/latest/api/nn/fsdp_tips.html
# see https://fairscale.readthedocs.io/en/latest/api/nn/fsdp.html
# section `Gradient Clipping`: using `torch.nn.utils.clip_grad_norm_` is incorrect
# for an FSDP module. To overcome this, one needs to call sharded_module.clip_grad_norm(clip_val);
# however, we rely on the LightningModule's configure_sharded_model to wrap FSDP, so it would be hard to
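
To make the comment above concrete, here is a hedged sketch of the distinction it describes. This is not the plugin's implementation; it assumes a FairScale ``FullyShardedDataParallel`` instance, and the exact method name (``clip_grad_norm_`` vs. ``clip_grad_norm``) may vary between FairScale releases, so check the installed version.

.. code-block:: python

    from fairscale.nn import FullyShardedDataParallel as FSDP


    def clip_gradients(sharded_module: FSDP, clip_val: float) -> None:
        # Incorrect for FSDP: each rank only holds a shard of the parameters, so
        # torch.nn.utils.clip_grad_norm_(sharded_module.parameters(), clip_val)
        # would clip against a local gradient norm rather than the global one.

        # Instead, let the FSDP wrapper aggregate the norm across shards first
        # (spelled ``clip_grad_norm`` in some releases).
        sharded_module.clip_grad_norm_(clip_val)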
