Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tutorial example for computing the adjoint gradient of a level set #2792

Merged
merged 13 commits into from
Jun 28, 2024
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
386 changes: 386 additions & 0 deletions doc/docs/Python_Tutorials/Adjoint_Solver.md
Original file line number Diff line number Diff line change
Expand Up @@ -804,6 +804,392 @@ if __name__ == "__main__":
)
```

Adjoint Gradient of a Level Set
-------------------------------
stevengj marked this conversation as resolved.
Show resolved Hide resolved

It is also possible to compute the derivative of the Meep outputs with respect to a geometric parameter via a [level-set](https://en.wikipedia.org/wiki/Level_set) formulation (an implicit-function representation of a material discontinuity) using the density-based adjoint solver. This is useful for applications involving [shape optimization](https://en.wikipedia.org/wiki/Shape_optimization) of explicitly parameterized geometries.

As a demonstration, we will compute the gradient of the diffraction efficiency of the first transmitted order with $\mathcal{S}$ polarization (electric field out of the plane, i.e. $E_z$) of a 1D binary grating with respect to the grating height. The accuracy of the adjoint gradient is validated using a brute-force finite difference via the directional derivative.
stevengj marked this conversation as resolved.
Show resolved Hide resolved

The calculation of the diffraction efficiency involves [mode decomposition](Python_User_Interface/#mode-decomposition) of planewaves which is described in [Tutorial/Transmissive Diffraction Spectrum for Planewave at Normal Incidence](Mode_Decomposition.md#transmissive-diffraction-spectrum-for-planewave-at-normal-incidence). An important aspect of setting up this simulation is specifying a diffraction order (planewave) in 2D using the `meep.adjoint.EigenmodeCoefficient` object. This involves three components: (1) the wavevector via the `kpoint_func` parameter, (2) the polarization via the `eig_parity` parameter, and (3) the `eig_vol` parameter must be set to have a *length of one pixel in the periodic direction*. The latter is necessary for MPB to correctly interpret the wavevector in the Cartesian basis rather than the reciprocal-lattice basis defined by the grating period.

The calculation of the gradient of the diffraction efficiency ($F$) with respect to the grating height ($h$) involves using the chain rule to obtain a product of two derivatives: (1) $\partial F / \partial \rho$ and (2) $\partial \rho / \partial h$, where $\rho$ are the density weights (2D grid of $N_x \times N_y = N$ points) used to represent the binary grating as a levelset. (1) is the adjoint gradient computed by Meep. (2) requires the levelset to be *smoothed* to ensure it is differentiable. A non-smoothed levelset is generally not differentiable. (2) is implemented in Autograd using a custom vector-Jacobian product (vJP) which is used as part of reverse-mode differentiation (backpropagation) to compute $\partial F / \partial h$. An overview of this calculation is shown below.
stevengj marked this conversation as resolved.
Show resolved Hide resolved

![](../images/levelset_gradient_backpropagation.png#center)

The derivative $\partial \rho / \partial h$ can be approximated by a finite difference using a one-pixel perturbation applied to the grating height. The calculation is summarized in the schematic below. There are two function evaluations of $\rho(h)$ for which the cost is negligible. Smoothing of the levelset can be performed using a number of different methods including a [signed-distance function](https://en.wikipedia.org/wiki/Signed_distance_function) or convolution filter. In this example, the smoothing is based on downsampling the levelset from a high-resolution grid (10X the resolution of the simulation grid) to the lower-resolution simulation grid using bilinear interpolation. Note that only the boundary pixels are nonzero in the Jacobian matrix in this case.
stevengj marked this conversation as resolved.
Show resolved Hide resolved

![](../images/levelset_jacobian_matrix.png#center)

A schematic of the simulation layout showing the design region containing the grating as a levelset is shown below. The adjoint gradient $\partial F / \partial \rho$ computed by Meep at a resolution of 400 pixels/$\mu$m is shown next to this schematic.

![](../images/levelset_adjoint_gradient.png#center)

The simulation script is in [python/examples/adjoint_optimization/binary_grating_levelset.py](https://github.com/NanoComp/meep/tree/master/python/examples/adjoint_optimization/binary_grating_levelset.py). Running this script at five different resolutions produces the output below for the directional derivative computed using the finite difference and the adjoint gradient as well as the relative error from these two results.

```
RESOLUTION_UM = 50
dir-deriv:, -0.02535355 (finite difference), -0.01469133 (adjoint), 0.420542 (error)

RESOLUTION_UM = 100
dir-deriv:, -0.00604817 (finite difference), -0.00473221 (adjoint), 0.217580 (error)

RESOLUTION_UM = 200
dir-deriv:, -0.00284452 (finite difference), -0.00252470 (adjoint), 0.112432 (error)

RESOLUTION_UM = 400
dir-deriv:, -0.00140221 (finite difference), -0.00132065 (adjoint), 0.058165 (error)

RESOLUTION_UM = 800
dir-deriv:, -0.00069606 (finite difference), -0.00067547 (adjoint), 0.029583 (error)
```

A logarithmic plot of the relative error vs. grid resolution based on these results demonstrates linear convergence. This is expected for fields right on the boundary of a discontinuous interface.

Note: because of a likely bug in the adjoint gradients for the $\mathcal{P}$ polarization (electric field in the $xy$ plane), there is a large error in the adjoint gradient which [does not decrease with resolution](https://github.com/NanoComp/meep/pull/2792#issuecomment-2171942477). This is why this demonstration only involved the $\mathcal{S}$ polarization.
stevengj marked this conversation as resolved.
Show resolved Hide resolved

![](../images/levelset_adjoint_gradient_error_vs_resolution.png#center)

```py
from enum import Enum
from typing import Callable, Tuple

from autograd.extend import primitive, defvjp
from autograd import numpy as npa
from autograd import tensor_jacobian_product
from mpl_toolkits.axes_grid1 import make_axes_locatable
import matplotlib.pyplot as plt
import meep as mp
import meep.adjoint as mpa
import numpy as np
from scipy.interpolate import RegularGridInterpolator


RESOLUTION_UM = 100
WAVELENGTH_UM = 0.53
GRATING_PERIOD_UM = 1.28
GRATING_HEIGHT_UM = 0.52
GRATING_DUTY_CYCLE = 0.65
GRATING_HEIGHT_PERTURBATION_UM = 1.0 / RESOLUTION_UM
DIFFRACTION_ORDER = 1
SUBSTRATE_UM = 1.5
PML_UM = 1.0
AIR_UM = 1.0
N_GLASS = 1.5
DESIGN_REGION_RESOLUTION_UM = 10 * RESOLUTION_UM

Polarization = Enum("Polarization", "S P")

design_region_size = mp.Vector3(
GRATING_HEIGHT_UM + GRATING_HEIGHT_PERTURBATION_UM, GRATING_PERIOD_UM, 0
)
nx_design_grid = int(design_region_size.x * DESIGN_REGION_RESOLUTION_UM) + 1
ny_design_grid = int(design_region_size.y * DESIGN_REGION_RESOLUTION_UM) + 1
nx_sim_grid = int(design_region_size.x * RESOLUTION_UM) + 1
ny_sim_grid = int(design_region_size.y * RESOLUTION_UM) + 1

DEBUG_OUTPUT = False


def design_region_to_meshgrid(nx: int, ny: int) -> Tuple[np.ndarray, np.ndarray]:
"""Returns the 2D coordinates of the meshgrid for the design region.

Args:
nx, ny: the number of grid points in the x and y directions, respectively.

Returns:
The coordinates of the x and y grid points (2D arrays) as a 2-tuple.
"""

xcoord = np.linspace(-0.5 * design_region_size.x, +0.5 * design_region_size.x, nx)
ycoord = np.linspace(-0.5 * design_region_size.y, +0.5 * design_region_size.y, ny)
xv, yv = np.meshgrid(xcoord, ycoord, indexing="ij")

return xv, yv


@primitive
def levelset_and_smoothing(grating_height_um: float) -> np.ndarray:
"""Returns the density weights for a binary grating as a levelset.

Args:
grating_height_um: the height of the grating.

Returns:
The density weights as a flattened (1D) array.
"""

xv, yv = design_region_to_meshgrid(nx_design_grid, ny_design_grid)
weights = np.where(
np.abs(yv) <= 0.5 * GRATING_DUTY_CYCLE * GRATING_PERIOD_UM,
np.where(xv <= xv[0][0] + grating_height_um, 1, 0),
0,
)
xcoord = np.linspace(
-0.5 * design_region_size.x, +0.5 * design_region_size.x, nx_design_grid
)
ycoord = np.linspace(
-0.5 * design_region_size.y, +0.5 * design_region_size.y, ny_design_grid
)
interp = RegularGridInterpolator((xcoord, ycoord), weights)

# Smooth the design weights by downsampling from the design grid
# to the simulation grid using bilinear interpolation.
sim_grid_xv, sim_grid_yv = design_region_to_meshgrid(nx_sim_grid, ny_sim_grid)
smoothed_weights = interp((sim_grid_xv, sim_grid_yv))

if DEBUG_OUTPUT:
fig, ax = plt.subplots()
im = ax.imshow(
np.transpose(smoothed_weights),
cmap="binary",
interpolation="none",
aspect="equal",
)
ax.set_title(r"$\rho_{smooth-levelset}(h)$")
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(im, cax=cax)
if mp.am_master():
fig.savefig("smoothed_levelset.png", dpi=150, bbox_inches="tight")

return smoothed_weights.flatten()


def levelset_and_smoothing_vjp(
ans: np.ndarray,
grating_height_um: float,
) -> Callable[[np.ndarray], np.ndarray]:
"""Returns a function for computing the vector-Jacobian product.

The Jacobian is computed manually using a finite difference.

Args:
ans: the design weights for the smoothed grating (no perturbation).
grating_height_um: the height of the grating.

Returns:
An anonymous function for computing the vector-Jacobian product.
"""

xv, yv = design_region_to_meshgrid(nx_design_grid, ny_design_grid)
weights = np.where(
np.abs(yv) <= 0.5 * GRATING_DUTY_CYCLE * GRATING_PERIOD_UM,
np.where(
xv <= xv[0][0] + grating_height_um + GRATING_HEIGHT_PERTURBATION_UM, 1, 0
),
0,
)
xcoord = np.linspace(
-0.5 * design_region_size.x, +0.5 * design_region_size.x, nx_design_grid
)
ycoord = np.linspace(
-0.5 * design_region_size.y, +0.5 * design_region_size.y, ny_design_grid
)
interp = RegularGridInterpolator((xcoord, ycoord), weights)

# Smooth the design weights by downsampling from the design grid
# to the simulation grid using bilinear interpolation.
xv_sim_grid, yv_sim_grid = design_region_to_meshgrid(nx_sim_grid, ny_sim_grid)
smoothed_weights = interp((xv_sim_grid, yv_sim_grid))
smoothed_weights = smoothed_weights.flatten()

jacobian = (smoothed_weights - ans) / GRATING_HEIGHT_PERTURBATION_UM

if DEBUG_OUTPUT:
fig, ax = plt.subplots()
im = ax.imshow(
np.transpose(jacobian.reshape(nx_sim_grid, ny_sim_grid)),
cmap="inferno",
interpolation="none",
aspect="equal",
)
ax.set_title(r"$\partial \rho_{smooth-levelset} / \partial h$")
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(im, cax=cax)
if mp.am_master():
fig.savefig("gradient_smooth_levelset.png", dpi=150, bbox_inches="tight")

return lambda g: np.tensordot(g, jacobian, axes=1)


def grating_1d(pol: Polarization) -> mpa.OptimizationProblem:
"""Sets up the adjoint optimization of a 1D grating.

Args:
pol: the polarization state (S or P).

Returns:
A meep.adjoint.OptimizationProblem object for the simulation.
"""

frequency = 1 / WAVELENGTH_UM
pml_layers = [mp.PML(direction=mp.X, thickness=PML_UM)]
glass = mp.Medium(index=N_GLASS)
size_x_um = (
PML_UM
+ SUBSTRATE_UM
+ GRATING_HEIGHT_UM
+ GRATING_HEIGHT_PERTURBATION_UM
+ AIR_UM
+ PML_UM
)
size_y_um = GRATING_PERIOD_UM
cell_size = mp.Vector3(size_x_um, size_y_um, 0)
k_point = mp.Vector3()

if pol.name == "S":
eig_parity = mp.ODD_Z
src_cmpt = mp.Ez
else:
eig_parity = mp.EVEN_Z
src_cmpt = mp.Hz

src_pt = mp.Vector3(-0.5 * size_x_um + PML_UM, 0, 0)
sources = [
mp.Source(
mp.GaussianSource(frequency, fwidth=0.1 * frequency),
component=src_cmpt,
center=src_pt,
size=mp.Vector3(0, size_y_um, 0),
)
]

matgrid = mp.MaterialGrid(
mp.Vector3(nx_sim_grid, ny_sim_grid),
mp.air,
glass,
weights=np.ones((nx_sim_grid, ny_sim_grid)),
do_averaging=False,
)

matgrid_region = mpa.DesignRegion(
matgrid,
volume=mp.Volume(
center=mp.Vector3(
(-0.5 * size_x_um + PML_UM + SUBSTRATE_UM + 0.5 * design_region_size.x),
0,
0,
),
size=design_region_size,
),
)

geometry = [
mp.Block(
material=glass,
size=mp.Vector3(PML_UM + SUBSTRATE_UM, mp.inf, mp.inf),
center=mp.Vector3(-0.5 * size_x_um + 0.5 * (PML_UM + SUBSTRATE_UM), 0, 0),
),
mp.Block(
material=matgrid,
size=matgrid_region.size,
center=matgrid_region.center,
),
]

sim = mp.Simulation(
resolution=RESOLUTION_UM,
cell_size=cell_size,
boundary_layers=pml_layers,
k_point=k_point,
sources=sources,
geometry=geometry,
)

tran_pt = mp.Vector3(0.5 * size_x_um - PML_UM, 0, 0)

kdiff = mp.Vector3(
(frequency**2 - (DIFFRACTION_ORDER / GRATING_PERIOD_UM) ** 2) ** 0.5,
DIFFRACTION_ORDER / GRATING_PERIOD_UM,
0,
)

obj_args = [
mpa.EigenmodeCoefficient(
sim,
mp.Volume(
center=tran_pt,
size=mp.Vector3(0, size_y_um, 0),
),
mode=1,
kpoint_func=lambda *not_used: kdiff,
eig_parity=eig_parity,
eig_vol=mp.Volume(center=tran_pt, size=mp.Vector3(0, 1 / RESOLUTION_UM, 0)),
),
]

def obj_func(mode_coeff):
return npa.abs(mode_coeff) ** 2

opt = mpa.OptimizationProblem(
simulation=sim,
objective_functions=obj_func,
objective_arguments=obj_args,
design_regions=[matgrid_region],
frequencies=[frequency],
)

return opt


if __name__ == "__main__":
# Only the S polarization is supported (for now).
opt = grating_1d(Polarization.S)

smoothed_design_weights = levelset_and_smoothing(GRATING_HEIGHT_UM)

obj_val_unperturbed, grad_unperturbed = opt(
[smoothed_design_weights], need_gradient=True
)

if DEBUG_OUTPUT:
fig, ax = plt.subplots()
im = ax.imshow(
np.transpose(grad_unperturbed.reshape(nx_sim_grid, ny_sim_grid)),
cmap="inferno",
interpolation="none",
aspect="equal",
)
ax.set_title(r"$\partial F / \partial \rho_{smooth-levelset}$")
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(im, cax=cax)
if mp.am_master():
fig.savefig(
"gradient_wrt_smoothed_design_weights.png", dpi=150, bbox_inches="tight"
)

fig, ax = plt.subplots()
opt.plot2D(init_opt=False, ax=ax)
if mp.am_master():
fig.savefig("sim_layout.png", dpi=150, bbox_inches="tight")

defvjp(levelset_and_smoothing, levelset_and_smoothing_vjp)
grad_backprop = tensor_jacobian_product(levelset_and_smoothing, 0)(
GRATING_HEIGHT_UM,
grad_unperturbed,
)

perturbed_design_weights = levelset_and_smoothing(
GRATING_HEIGHT_UM + GRATING_HEIGHT_PERTURBATION_UM
)
perturbed_design_weights = perturbed_design_weights.flatten()

obj_val_perturbed, _ = opt([perturbed_design_weights], need_gradient=False)

adj_directional_deriv = GRATING_HEIGHT_PERTURBATION_UM * grad_backprop
fnd_directional_deriv = obj_val_perturbed[0] - obj_val_unperturbed[0]
rel_err = abs(
(fnd_directional_deriv - adj_directional_deriv) / fnd_directional_deriv
)
print(
f"dir-deriv:, {fnd_directional_deriv:.8f} (finite difference), "
f"{adj_directional_deriv:.8f} (adjoint), {rel_err:.6f} (error)"
)
```

Compact Notebook Tutorials of Basic Features
--------------------------------------------

Expand Down
Binary file added doc/docs/images/levelset_adjoint_gradient.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/docs/images/levelset_jacobian_matrix.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading