-
Notifications
You must be signed in to change notification settings - Fork 637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tutorial for shape optimization of a multilayer stack #2862
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #2862 +/- ##
==========================================
- Coverage 73.81% 73.78% -0.04%
==========================================
Files 18 18
Lines 5423 5428 +5
==========================================
+ Hits 4003 4005 +2
- Misses 1420 1423 +3 |
It seems that indeed the optimizer is getting stuck in local optima. For the case with 9 layers shown above, I replaced the initial design which originally used a fixed layer thickness of 0.2 μm with random thicknesses. Running with initial random designs did eventually produce optimal designs which were close to the quarter-wavelength stack. Results are shown below for five different runs. Note the large spread in the transmittance values of the optimal designs. modification rng = np.random.RandomState()
layer_thickness_um = (MIN_LAYER_UM + rng.rand(NUM_LAYERS) *
(MAX_LAYER_UM - MIN_LAYER_UM))
optimal_layer_thickness_um = solver.optimize(layer_thickness_um) results
|
For lots of layers, you can easily get a non-quarter-wave structure with very low transmission, and once you get to that point the gradients will be almost zero so probably it will converge very slowly towards a QW solution (it's very slow to try to reduce transmission from @oskooi suggested just taking the log of the transmission. This seems worth trying, but I'm skeptical — it seems like a free lunch? It wouldn't help if there is noise in the gradient that is limiting us at low transmission levels, for example. Another alternative is simply to minimize ∫|E|² in the mirrors (or similar), which will push the optimization towards a field that decays as quickly as possible. Intuitively, this uses more information than the transmission optimization, since it uses E at every point in the mirror rather than just at the end. PS. Note that L-BFGS, being a higher-order scheme, will be more vulnerable to errors in the gradient (e.g. discretization errors) compared to first-order methods like CCSA. |
I tried the two suggested objective functions: (1) log of the transmission and (2) ∫|E|² including taking its log. Unfortunately, neither approach seemed to produce better results than the original objective function of just the transmission. Based on these results, I think the choice of objective function may not be as important as simply running lots of trials with random initial designs in order to try to find the global optima of the quarter-wavelength stack. The design space seems to be highly non convex regardless of the objective function. We may want to consider switching to a different problem. 1. Objective Function: tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)
obj_args = [
mpa.FourierFields(sim, mp.Volume(center=tran_pt), mp.Ex),
mpa.FourierFields(sim, mp.Volume(center=tran_pt), mp.Hy),
]
def obj_func(dft_ex, dft_hy):
"""Objective function for the optimization.
Args:
dft_ex, dft_hy: the discrete Fourier-transformed Ex and Hy fields.
Returns:
The transmittance through the stack.
"""
return npa.log(npa.real(npa.conj(dft_ex) * dft_hy) / norm_flux) 2. Objective Function: ∫|E|² obj_args = [
mpa.FourierFields(
sim,
matgrid_region.volume,
mp.Ex
)
]
def obj_func(dft_ex):
"""Objective function for the optimization.
Args:
dft_ex: the discrete Fourier-transformed Ex field.
Returns:
The field intensity summed over the design region.
"""
return npa.log(npa.sum(npa.absolute(dft_ex) **2)) |
When you minimize ∫|E|², did you get a more rapid spatial decay of the fields? I don't think the transmission should get better, but maybe it is more localized. Fundamentally, once ∫|E|² gets very localized, changing the exponential tail from So, if you're not happy just showing that minimizing ∫|E|² produces some rapid spatial decay, I would look at something that is going to be better conditioned. (And, at a single ω, we already know that quarter-wave stacks are a good, perhaps optimal, solution for this.). For example, you can try to design a broad-band AR coating with only 3–4 layers. In this case if you can get an optimum of > 90% at all wavelengths you would be pretty happy, and there is no worry about going from 99.9% to 99.999%. (Either worst-case or average-case broadband would be interesting, perhaps both.) |
(Note that the fact that optimization takes some steps that make things worse initially is a common phenomenon in many algorithms — initially, it takes steps that are too large and then has to backtrack. After a few iterations, the algorithm has more information about a "good" step size, and in the case of L-BFGS has more information about the 2nd derivative.) |
I did verify that minimizing ∫|E|² rather than the transmittance does produce more rapid spatial decay of the fields, as expected. The plot below shows a comparison of the field intensity within the multilayer stack (the design region) for the two cases of the objective function: (1) ∫|E|² vs. (2) transmittance. The design produced using (1) had a smaller transmittance than (2). This is perhaps a good indication that (1) may be a more suitable objective function than (2) for this problem. (I was probably unlucky in the local optima I chose for (1) in the results I posted in my previous comment because that design did not have a lower transmittance than the design obtained using (2).) Case 1: objective function = ∫|E|²
Case 2: objective function = transmittance
|
I changed the original optimization problem of minimizing ∫|E|² for a single wavelength to multiple wavelengths using minimax (i.e., worst-case optimization). The updated problem involves two wavelengths (0.95 and 1.05 μm) and nine layers in the stack. Because the epigraph formulation involves the use of nonlinear constraints which are not supported by the BFGS algorithm in NLopt, I switched to using the MMA algorithm instead. The changes to the script A plot of the history of the objective function vs. iteration number of the optimizer for the two wavelengths is shown below. Also included in this plot is the epigraph variable. The initial layer thicknesses are chosen randomly. The optimizer seems to converge to a local optima within ~20 iterations. In this case, the optimizer ran for the maximum number of iterations (50). As expected, the optimal layer thicknesses are quite different from the quarter-wave stack of alternating thicknesses of 0.19 and 0.25 μm.
Additional results showing the field decay in the stack as well as the transmittance for this design will be added soon. |
A plot of the Perhaps we just happened to choose a poor local optima for this demonstration? Or maybe this is a feature of this particular optimization problem? rom typing import List, Tuple
import matplotlib.pyplot as plt
import meep as mp
import numpy as np
RESOLUTION_UM = 800
AIR_UM = 1.0
PML_UM = 1.0
NUM_LAYERS = 9
MIN_LAYER_UM = 0.1
MAX_LAYER_UM = 0.5
N_LAYER_1 = 1.0
N_LAYER_2 = 1.3
LAYER_PERTURBATION_UM = 1.0 / RESOLUTION_UM
DESIGN_WAVELENGTHS_UM = (0.95, 1.05)
DESIGN_REGION_RESOLUTION_UM = 10 * RESOLUTION_UM
DESIGN_REGION_UM = mp.Vector3(0, 0, NUM_LAYERS * MAX_LAYER_UM)
NZ_DESIGN_GRID = int(DESIGN_REGION_UM.z * DESIGN_REGION_RESOLUTION_UM) + 1
NZ_SIM_GRID = int(DESIGN_REGION_UM.z * RESOLUTION_UM) + 1
num_wavelengths = len(DESIGN_WAVELENGTHS_UM)
frequencies = [1 / wavelength_um for wavelength_um in DESIGN_WAVELENGTHS_UM]
frequency_center = np.mean(frequencies)
# Set source bandwidth to be larger than the range of design wavelengths.
frequency_width = 1.2 * (np.max(frequencies) - np.min(frequencies))
def design_region_to_grid(nz: int) -> np.ndarray:
"""Returns the coordinates of the grid for the design region.
Args:
nz: number of grid points.
Returns:
The 1D coordinates of the grid points.
"""
z_grid = np.linspace(
-0.5 * DESIGN_REGION_UM.z,
+0.5 * DESIGN_REGION_UM.z,
nz
)
return z_grid
def levelset_and_smoothing(layer_thickness_um: np.ndarray) -> np.ndarray:
"""Returns the density weights for a multilayer stack as a levelset.
Args:
layer_thickness_um: thickness of each layer in the stack.
Returns:
The density weights as a flattened (1D) array.
"""
air_padding_um = 0.5 * (DESIGN_REGION_UM.z - np.sum(layer_thickness_um))
weights = np.zeros(NZ_DESIGN_GRID)
# Air padding at left edge
z_start = 0
z_end = int(air_padding_um * DESIGN_REGION_RESOLUTION_UM)
weights[z_start:z_end] = 0
z_start = z_end
for j in range(NUM_LAYERS):
z_end = z_start + int(layer_thickness_um[j] * DESIGN_REGION_RESOLUTION_UM)
weights[z_start:z_end] = 1 if (j % 2 == 0) else 0
z_start = z_end
# Air padding at right edge
z_end = z_start + int(air_padding_um * DESIGN_REGION_RESOLUTION_UM)
weights[z_start:z_end] = 0
# Smooth the design weights by downsampling from the design grid
# to the simulation grid using bilinear interpolation.
z_sim_grid = design_region_to_grid(NZ_SIM_GRID)
z_design_grid = design_region_to_grid(NZ_DESIGN_GRID)
smoothed_weights = np.interp(z_sim_grid, z_design_grid, weights)
return smoothed_weights.flatten()
def input_flux() -> np.ndarray:
"""Returns the flux generated by the source in vacuum."""
pml_layers = [mp.PML(thickness=PML_UM)]
size_z_um = PML_UM + AIR_UM + DESIGN_REGION_UM.z + AIR_UM + PML_UM
cell_size = mp.Vector3(0, 0, size_z_um)
src_cmpt = mp.Ex
src_pt = mp.Vector3(0, 0, -0.5 * size_z_um + PML_UM)
sources = [
mp.Source(
mp.GaussianSource(frequency_center, fwidth=frequency_width),
component=src_cmpt,
center=src_pt,
)
]
sim = mp.Simulation(
resolution=RESOLUTION_UM,
cell_size=cell_size,
dimensions=1,
boundary_layers=pml_layers,
sources=sources
)
tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)
flux_mon = sim.add_flux(
frequencies,
mp.FluxRegion(center=tran_pt)
)
sim.run(
until_after_sources=mp.stop_when_fields_decayed(
25.0, src_cmpt, tran_pt, 1e-6
)
)
source_flux = mp.get_fluxes(flux_mon)
return source_flux
def multilayer_stack(
input_flux: np.ndarray,
layer_thickness_um: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
"""Returns the DFT fields of a multilayer stack."""
pml_layers = [mp.PML(thickness=PML_UM)]
size_z_um = PML_UM + AIR_UM + DESIGN_REGION_UM.z + AIR_UM + PML_UM
cell_size = mp.Vector3(0, 0, size_z_um)
src_cmpt = mp.Ex
src_pt = mp.Vector3(0, 0, -0.5 * size_z_um + PML_UM)
sources = [
mp.Source(
mp.GaussianSource(frequency_center, fwidth=frequency_width),
component=src_cmpt,
center=src_pt,
)
]
mat_1 = mp.Medium(index=N_LAYER_1)
mat_2 = mp.Medium(index=N_LAYER_2)
weights = levelset_and_smoothing(layer_thickness_um)
matgrid = mp.MaterialGrid(
mp.Vector3(0, 0, NZ_SIM_GRID),
mat_1,
mat_2,
weights
)
geometry = [
mp.Block(
material=matgrid,
center=mp.Vector3(),
size=DESIGN_REGION_UM
)
]
sim = mp.Simulation(
resolution=RESOLUTION_UM,
cell_size=cell_size,
dimensions=1,
boundary_layers=pml_layers,
sources=sources,
geometry=geometry
)
dft_field_mon = sim.add_dft_fields(
[mp.Ex],
frequencies,
center=mp.Vector3(),
size=DESIGN_REGION_UM,
yee_grid=True
)
tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)
flux_mon = sim.add_flux(
frequencies,
mp.FluxRegion(center=tran_pt)
)
sim.run(
until_after_sources=mp.stop_when_fields_decayed(
25.0, src_cmpt, tran_pt, 1e-8
)
)
ex_dft = np.zeros(
(num_wavelengths, NZ_SIM_GRID),
dtype=np.complex128
)
for j in range(num_wavelengths):
ex_dft[j, :] = sim.get_dft_array(dft_field_mon, mp.Ex, j)
tran_flux = mp.get_fluxes(flux_mon)
transmittance = np.asarray(tran_flux) / np.asarray(norm_flux)
return transmittance, ex_dft
if __name__ == "__main__":
norm_flux = input_flux()
layer_thickness_um = [
0.1664, 0.2452, 0.1399, 0.2719, 0.1786, 0.3060, 0.4952, 0.1688, 0.2306
]
tran, ex_dft = multilayer_stack(norm_flux, layer_thickness_um)
markers = ['bo-', 'ro-']
z_grid = design_region_to_grid(NZ_SIM_GRID)
fig, ax = plt.subplots()
for j in range(num_wavelengths):
ax.plot(
z_grid,
np.absolute(ex_dft[j, :])**2,
markers[j],
label=(
f"$\lambda$ = {DESIGN_WAVELENGTHS_UM[j]} μm, "
f"T = {tran[j]:.4f}"
)
)
ax.set_xlabel("z coordinate (μm)")
ax.set_ylabel("$|E_x|^2$ in multilayer stack")
ax.legend()
if mp.am_master():
fig.savefig(
"broadband_field_decay_in_multilayer_stack.png",
dpi=150,
bbox_inches="tight"
) |
For comparison, a simple hand design would be to use a quarter-wave stack at the center frequency given by the harmonic mean of the wavelengths e.g. plot the figure of merit normalized by the q-wave values. I would use the CCSA-quadratic (#2400) algorithm in NLopt rather than MMA. Hopefully you are just getting stuck due to a problem in the optimizer. If it is an actual local minimum you are stuck at, then you might need to try lots of random starting points (maybe reduce the number of parameters) or try an even fancier global algorithm. |
…ic of design problem
I made a number of changes based on the suggestions which together seem to produce decent results:
The two existing figures have been updated with the new results. The plot of the field decay in the stack has been updated to normalize the fields by those from the reference design of the quarter-wavelength stack. A new schematic showing the design problem has also been added. |
Shape Optimization of a Multilayer Stack | ||
---------------------------------------- | ||
|
||
We extend the demonstration of the shape derivative from the previous tutorial to perform shape optimization of a multilayer stack over a broad bandwidth. The 1D design problem is shown in the schematic below and involves finding the layer thicknesses for a fixed number of layers (9) which minimize the largest transmittance (or alternatively, maximize the smallest reflectance) at two wavelengths: $\lambda_1$ = 0.95 μm and $\lambda_2$ = 1.05 μm. However, rather than use the transmittance as the objective function, we use the *field decay* via (the logarithm of) the integral of the fields in the stack ($\int |E_x|^2$). This provides more information to the optimizer which should produce better results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should be minimizing transmittance now that it's not too small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried that we are saying something that is false here.
It would be good to double-check the quarter-wave transmission — I'm surprised that the transmission is so low, even though the index contrast is low, though I have no reason to doubt Meep. Might be nice to just try minimizing transmission too. (A quarter-wave stack might not be completely optimal even at a single frequency because there might be a better termination for the first and last layers? I don't remember.) |
I confirmed that the relatively large transmittance of ~0.3 of the quarter-wavelength stack in the example is due to the index contrast of 1.3. For comparison, the transmittance is 1e-5 for a quarter-wavelength stack with nine layers for an index contrast of 3.5. This information has been added to the tutorial in #2881. |
Closes #2820.
This PR adds a new tutorial for the adjoint solver which demonstrates shape optimization of a multilayer stack in 1D. The design problem involves finding the layer thicknesses (for a fixed number of layers) which minimize the transmittance of a planewave at normal incidence in air with$\lambda = 1.0$ μm through a stack of alternating materials of $n_1=1.0$ and $n_2=1.3$ . The layers are arranged as $n_2$ , $n_1$ , $n_2$ , $n_1$ , ..., $n_2$ . The analytic solution is a quarter-wavelength stack with thicknesses $\lambda / (4 n_1) = 0.25$ and $\lambda / (4 n_2) = 0.1923$ .
The optimization uses the BFGS algorithm of NLopt. An initial value of 0.2 μm is used for the layer thicknesses. Results are shown below for five different cases in which the number of layers are 5, 7, 9, 11, and 13. The quarter-wavelength stack configuration is found by the optimization only for layer numbers 5 and 7. Layer numbers 9, 11, and 13 seem to get trapped in a local minima. In all cases, the optimization terminates due to the threshold tolerance of the objective function (return code 3 of NLopt). The thickness of all layers is constrained to be in the range of 0.1 μm to 0.5 μm. Since these are 1D simulations, the runtimes are generally less than a couple of minutes using a single core (Intel Xeon i7-7700K CPU @ 4.20GHz).
A plot of the transmittance vs. iteration number for the five cases of the layer numbers shows that the transmittance decreases with the number of layers, as expected. The optimizer seems to move the design in the wrong direction (increasing transmittance) during the early iterations before eventually finding an optimal design which minmizes the transmittance.
We may want to consider creating a unit test based on this tutorial since currently there is no unit test for the 1D adjoint solver.