diff --git a/docs/lang/articles/differentiable/differentiable_programming.md b/docs/lang/articles/differentiable/differentiable_programming.md
index b95fa0b0bc3b0..d0a1b1443e203 100644
--- a/docs/lang/articles/differentiable/differentiable_programming.md
+++ b/docs/lang/articles/differentiable/differentiable_programming.md
@@ -448,3 +448,79 @@ Check out [the DiffTaichi paper](https://arxiv.org/pdf/1910.00935.pdf) and
 [video](https://www.youtube.com/watch?v=Z1xvAZve9aE) to learn more about
 Taichi differentiable programming.
 :::
+
+
+## Forward-Mode Autodiff
+
+There are two modes of automatic differentiation: forward mode and reverse mode. Forward mode computes the Jacobian-vector product (JVP), i.e., one column of the Jacobian matrix at a time, while reverse mode computes the vector-Jacobian product (VJP), i.e., one row of the Jacobian matrix at a time. Reverse mode is therefore more efficient for functions with more inputs than outputs, and `ti.ad.Tape` and `kernel.grad()` are built on it. Forward mode is more efficient for functions with more outputs than inputs, and Taichi's autodiff supports it as well.
+
+### Using `ti.ad.FwdMode`
+The usage of `ti.ad.FwdMode` is very similar to `ti.ad.Tape`. Here we reuse the reverse-mode example above for the explanation.
+1. Enable the `needs_dual=True` option when declaring the fields involved in the derivative chain.
+2. Use the context manager `with ti.ad.FwdMode(loss=y, param=x):` to capture the kernel invocations that you want to differentiate automatically. `loss` and `param` are the output and the input of the function, respectively.
+3. The value of dy/dx at the current x is now available at the function output `y.dual[None]`.
+The following code snippet illustrates the steps above:
+
+```python
+import taichi as ti
+ti.init()
+
+x = ti.field(dtype=ti.f32, shape=(), needs_dual=True)
+y = ti.field(dtype=ti.f32, shape=(), needs_dual=True)
+
+
+@ti.kernel
+def compute_y():
+    y[None] = ti.sin(x[None])
+
+
+with ti.ad.FwdMode(loss=y, param=x):
+    compute_y()
+
+print('dy/dx =', y.dual[None], ' at x =', x[None])
+```
+
+:::note
+The name `dual` refers to the `dual number` in mathematics: forward-mode autodiff is equivalent to evaluating the function with dual numbers (see the short sketch below).
+:::
+
+:::note
+`ti.ad.FwdMode` automatically clears the dual field of `loss`.
+:::
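+
+To make the dual-number note above concrete, here is a minimal plain-Python sketch. The `Dual` class and `dual_sin` function are purely illustrative and are not part of the Taichi API; they reproduce the `y = sin(x)` example by hand.
+
+```python
+import math
+
+
+# Purely illustrative dual-number type; NOT part of the Taichi API.
+class Dual:
+    def __init__(self, real, dual):
+        self.real = real  # function value
+        self.dual = dual  # derivative part (coefficient of the epsilon term)
+
+
+def dual_sin(d):
+    # sin(a + b*eps) = sin(a) + b*cos(a)*eps
+    return Dual(math.sin(d.real), d.dual * math.cos(d.real))
+
+
+x = Dual(0.5, 1.0)     # seed the derivative part with 1.0
+y = dual_sin(x)
+print(y.real, y.dual)  # sin(0.5) and cos(0.5), i.e. dy/dx at x = 0.5
+```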
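+
+Before moving on to multiple inputs and outputs, the following NumPy sketch (illustrative only; it does not use Taichi) shows the row/column picture from the introduction for the same function used in the next example, `y_i = i * (sin(x_0) + sin(x_1))`. Multiplying the Jacobian by a standard basis vector from the right extracts one column (forward mode), and from the left extracts one row (reverse mode); this is exactly the role the `seed` argument plays below.
+
+```python
+import numpy as np
+
+# For y_i = i * (sin(x_0) + sin(x_1)), the Jacobian J has shape
+# (N_loss, N_param) with J[i, j] = i * cos(x_j).
+N_param, N_loss = 2, 5
+x = np.array([0.3, 0.7])
+J = np.outer(np.arange(N_loss), np.cos(x))
+
+seed = np.array([1.0, 0.0])    # the 'vector' in the Jacobian-vector product
+print(J @ seed)                # column 0 of J: dy_i/dx_0 for every i (forward mode)
+print(np.eye(N_loss)[1] @ J)   # row 1 of J: dy_1/dx_j for every j (reverse mode)
+```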
+
+`ti.ad.FwdMode` supports multiple inputs and outputs. `param` can be an N-D field, and `loss` can be an individual N-D field or a list of N-D fields. The `seed` argument is the 'vector' in the Jacobian-vector product; it selects the parameter with respect to which the derivative is computed. The two cases below use multiple inputs and outputs: with `seed=[1.0, 0.0]` or `seed=[0.0, 1.0]`, we compute the derivatives solely with respect to `x_0` or `x_1`, respectively.
+
+```python
+import taichi as ti
+ti.init()
+N_param = 2
+N_loss = 5
+x = ti.field(dtype=ti.f32, shape=N_param, needs_dual=True)
+y = ti.field(dtype=ti.f32, shape=N_loss, needs_dual=True)
+
+
+@ti.kernel
+def compute_y():
+    for i in range(N_loss):
+        for j in range(N_param):
+            y[i] += i * ti.sin(x[j])
+
+
+# Compute derivatives with respect to x_0
+with ti.ad.FwdMode(loss=y, param=x, seed=[1.0, 0.0]):
+    compute_y()
+print('dy/dx_0 =', y.dual, ' at x_0 =', x[0])
+
+# Compute derivatives with respect to x_1
+with ti.ad.FwdMode(loss=y, param=x, seed=[0.0, 1.0]):
+    compute_y()
+print('dy/dx_1 =', y.dual, ' at x_1 =', x[1])
+```
+
+:::note
+The `seed` argument is required if `param` is not a scalar field.
+:::
+
+:::tip
+Similar to reverse-mode autodiff, Taichi provides an API `ti.root.lazy_dual()` that automatically places the dual fields following the layout of their primal fields.
+:::
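+
+A minimal sketch of how `ti.root.lazy_dual()` might be used, assuming it mirrors the usage pattern of `ti.root.lazy_grad()` from reverse mode (the field layout below is just an example):
+
+```python
+import taichi as ti
+ti.init()
+
+x = ti.field(dtype=ti.f32)
+y = ti.field(dtype=ti.f32)
+# Place the primal fields explicitly, then let Taichi place the matching
+# dual fields automatically, following the same layout.
+ti.root.place(x, y)
+ti.root.lazy_dual()
+
+
+@ti.kernel
+def compute_y():
+    y[None] = ti.sin(x[None])
+
+
+with ti.ad.FwdMode(loss=y, param=x):
+    compute_y()
+
+print('dy/dx =', y.dual[None], ' at x =', x[None])
+```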