Efficient inverse design of optical multilayer nano-thin films using neural network principles: backpropagation and gradient descent

Jun Hee Han
School of Electrical Engineering, the Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 34141, Republic of Korea. E-mail: jjunihan@kaist.ac.kr

Received 16th April 2024, Accepted 9th August 2024

First published on 12th August 2024


Abstract

Optical multilayer thin films have a wide range of applications because the wavelengths they transmit or reflect can be tuned by adjusting the thickness of their constituent layers. Although their light weight, flexible nature and ease of fabrication position them as promising components for future devices, determining the optimal layer thickness for a desired functionality demands extensive simulations, leading to inefficient utilization of computational resources and time. To overcome these challenges, inverse design methods leveraging machine learning and deep learning are being explored. However, these methods necessitate learning processes, despite the presence of well-established formulas that describe these phenomena. Furthermore, deriving accurate answers for conditions not included in the learning process proves to be challenging. This paper introduces an innovative inverse design approach that utilizes the backpropagation of a networked transfer matrix, which effectively describes the characteristics of optical multilayer thin films. By exploiting the chain rule of the network, this method calculates gradients to discern how each layer thickness influences the outcomes. Consequently, the optimal thickness is determined without the need for an additional learning process. The operational principle of this approach is described mathematically. Optimizing the use of computing resources through the network configuration reduces the calculation time compared to conventional methods. The efficacy of this method is demonstrated through its application to the inverse design of transmissive and reflective films, verifying its potential for enhancing efficiency and accuracy in optical multilayer thin-film design and manufacturing processes.


Introduction

Optical multilayer thin films, composed of nanometer-scale layers, serve as functional optical devices with diverse applications. These include color filters,1–4 transparent electrodes,5–7 and functional filters,8–11 prompting extensive research in various fields.12–14 Their thinness, light weight, and ease of fabrication make them particularly appealing for integration into flexible and wearable devices. The properties of optical multilayer thin films are governed by the interference arising from the interaction of reflected and transmitted light among the stacked layers, along with phase shifts and absorption within the film structure. To produce an optical multilayer thin film with the desired functionality, the initial step involves establishing a layered structure using accessible materials. Subsequently, the thickness of each layer is adjusted through optical simulation or numerical computation. However, this procedure is inefficient: it requires considerable computational resources to perform a multitude of calculations, resulting in time-intensive processes. This challenge is particularly pronounced when dealing with a high number of stacked layers and a wide range of permissible thicknesses for each layer, because the number of calculations grows exponentially with the number of layers, substantially compromising design efficiency.

To mitigate the inefficiencies inherent in the design process, researchers have proposed evolutionary optimization methods such as genetic algorithms15–17 and memetic algorithms.18 These methodologies emulate biological evolution, employing selection, hybridization, and mutation processes to iteratively refine designs towards optimal solutions. However, the computational overhead associated with these methods escalates rapidly with increasing design complexity. Moreover, limitations in the crossover process can reduce confidence that the global minimum of the loss function has been reached. Concurrently, there is interest in inverse design techniques leveraging deep learning19–23 (ESI Fig. S1). This approach involves using thickness as an input parameter to train a deep neural network to discern the relationship between output spectra and thickness. Through iterative refinement of the network to align with the desired spectra, a neural network can be established that inversely suggests the input parameters corresponding to the targeted spectra, specifically the thickness of each layer. Nonetheless, a paradox emerges: inverse design can be achieved without deep learning, because all spectra related to the input parameters are already available during construction of the training dataset. Moreover, predicting thicknesses for spectra beyond the training dataset may require an additional learning process. Further advancements include methodologies utilizing recurrent neural networks24 to learn layer-stacking processes from sequential data or leveraging decoder-only transformer models, such as GPT,25 pretrained on extensive datasets. However, a notable limitation of these approaches is their substantial computational demands.

If there is a function that can characterize the variance between the targeted spectra and the predicted spectra, gradients for each input parameter can be obtained through a single forward propagation and backpropagation of this function.26 The utilization of an optimal learning rate for parameter updates facilitates the discovery of the global optimal point while requiring minimal computational resources. In the structural design process of a metasurface that can process the wavefront of light into a desired shape, a gradient method is widely used to efficiently calculate the slope and perform inverse design using the forward propagation method and the adjoint method of Maxwell's equations.27–32 For optical multilayer thin films, there is a well-defined function called the transfer matrix method (TMM) that can describe the output results with respect to the input parameters. Thus, optical multilayer thin films can also be inversely designed using gradient methods. However, a significant hurdle arises from the fact that the loss function, which is a function of the difference between the target and predicted results, varies depending on the film structure, making it an inefficient method. In other words, this inefficiency stems from the necessity to calculate the loss function and gradients for each input parameter whenever inverse design is conducted with different film structures. In this paper, to overcome this obstacle and enhance the efficiency of this method, we introduced a novel approach to obtain the optimized layer thickness simply by appropriately updating the input parameters through forward and backward propagations of the corresponding formulas. This is achieved by constructing a suitable network structure tailored to effectively utilize computational resources. 
The proposed method significantly reduces computational overhead by updating parameter values via gradient-based minimization of the loss function, as opposed to the conventional approach that necessitates calculations for all possible cases. This computational efficiency is particularly pronounced in systems with a greater number of layers and larger layer sizes. Furthermore, enhanced optimization efficacy is achieved through the incorporation of an adaptive moment estimation algorithm (Adam) for parameter updates.

The TMM stands out for its ability to keep calculations straightforward, no matter how many layers are involved. This is because each additional layer is handled by simply including one more matrix in the matrix product. The method introduced in this paper leverages both forward and backward propagations within the framework of a networked TMM structure. Consequently, as the number of layers increases, the same simplicity carries over: the computation is extended by merely adding a network layer. Employing the presented methodology, we have effectively predicted the optimal configuration of optical multilayer thin films to attain desired performance characteristics. We envision that this approach will emerge as a crucial tool for inverse design, streamlining the fabrication of optical multilayer thin films customized to meet distinct functional specifications.

Methodology

The neural network methodology employs interconnected neurons to discern intricate relationships between the input and output data (ESI Fig. S2). Its effectiveness stems from its capacity to describe complex associations through a combination of linear functions and activation functions beyond the purview of traditional methods.33,34 Through iterative learning via backpropagation, gradients of the weights and biases of the network are determined, utilizing the chain rule to facilitate parameter updates towards optimal representation (ESI Fig. S3). Gradients signify the extent of alterations in each parameter that influence shifts in outcomes.33,34 Importantly, the gradient method streamlines optimization by ascertaining the direction and magnitude for weight and bias adjustments, thereby avoiding the need to try out all possible combinations of settings for the parameters, which would be time consuming. Parameter updates are guided by a loss function, quantifying the disparity between actual and predicted outputs, with adjustments orchestrated to minimize this discrepancy.

While neural networks represent a promising model, their efficacy can diminish in scenarios where a well-established relationship exists between input and output data. For instance, in the case of optical multilayer thin films, the performance of such films can be accurately assessed by employing techniques like the TMM, which considers the structure, material composition, and layer thicknesses. However, in inverse optical design tasks, where the input parameters corresponding to a desired output must be identified, challenges persist despite established relationships. Computational complexity poses a significant hurdle in such cases. For example, to fabricate an optical film transmitting only a specific wavelength within a multilayer thin film setup, exhaustive calculations across various cases are necessary to obtain an optimal solution. For a five-layer film with layer thicknesses ranging from 0 nm to 100 nm and calculations performed at 1 nm intervals, roughly 100 candidate values per layer give 100⁵, or 10 billion, calculations to determine the film thickness necessary to achieve the desired functionality. To address these limitations, this paper introduces a novel methodology for configuring TMM networks to facilitate more efficient inverse design. By structuring the TMM as a network, gradients can be readily computed using the chain rule, resembling the parameter update schemes in neural networks. These gradients are then utilized to iteratively adjust layer thicknesses, thereby determining optimal optical multilayer configurations.

The optical properties of optical multilayer thin films can be accurately forecasted utilizing the TMM, a widely recognized theoretical framework. To facilitate effective inverse design leveraging gradients within the TMM framework, it is imperative to establish a robust TMM network. In this paper, we introduce a novel approach incorporating an identity matrix, as depicted in eqn (1), into the fundamental equation to construct a TMM network. Through this augmentation, a functional representation spanning from eqn (2) to (5) is devised, enabling the construction of a network for optical output prediction and design optimization. The term Δϕn represents the phase change that occurs when light passes through a layer with thickness tn and refractive index nn. A is the set of input variables used in eqn (1) through eqn (4) when making the calculations. When formulating a function organized from eqn (2) to (5), it is possible to apply the strength of the TMM equally to the network, wherein calculations can be performed by simply adding a matrix even when additional layers are introduced. Fig. 1 illustrates a network employing the generated functions. In the figure, the squared elements denote the equations, while the circled elements represent the corresponding results derived from these equations. The circles positioned at the far left signify the input values fed into the network. Elements a1, b1, c1, and d1 correspond to components of the identity matrix, thus maintaining fixed values of 1, 0, 0, and 1, respectively. Expanding the network with additional layers is achieved by simply incorporating the box delineated by the red dashed lines, facilitating straightforward configuration adjustments. The ability to construct a network with a repetitive structure offers the advantage of facilitating efficient computations when utilizing computer-based calculations. 
In the context of calculating the transmittance or reflectance of an optical multilayer thin film, parameters denoted as B and C need to be derived. These parameters can be obtained through eqn (6), from which functions such as eqn (7) and (8) can be derived. Leveraging the derived values of B and C, the transmittance and reflectance can be computed using eqn (9) and (10), respectively. While computing the transmittance and reflectance provides a comprehensive portrayal of the TMM as a network, the addition of another function is necessary to efficiently compute the optimal thickness using gradient methods. This additional function, the error function (Ferr), facilitates the determination of discrepancies between the anticipated outcomes obtained through reverse engineering (E) and those computed via the TMM. Eqn (11) and (12) delineate the expression of the error function, which is also integrated into the network representation.

 
image file: d4nr01667j-t1.tif(1)
 
$F_1(A) = a_{n-1}\cos\Delta\phi_{n-1} + i\,b_{n-1}\,n_{n-1}\sin\Delta\phi_{n-1} = a_n$ (2)
 
image file: d4nr01667j-t2.tif(3)
 
$F_3(A) = c_{n-1}\cos\Delta\phi_{n-1} + i\,d_{n-1}\,n_{n-1}\sin\Delta\phi_{n-1} = c_n$ (4)
 
image file: d4nr01667j-t3.tif(5)
 
image file: d4nr01667j-t4.tif(6)
 
$F_B(a, b, n) = a_n + b_n n_n = B$ (7)
 
$F_C(c, d, n) = c_n + d_n n_n = C$ (8)
 
image file: d4nr01667j-t5.tif(9)
 
image file: d4nr01667j-t6.tif(10)
 
$F_{\mathrm{err}}(T, E) = (T - E)^2 = Y$ (11)
 
$F_{\mathrm{err}}(R, E) = (R - E)^2 = Y$ (12)


image file: d4nr01667j-f1.tif
Fig. 1 The network designed for inverse modeling utilizes the transfer matrix method alongside a loss function. The squared elements within the network represent the implemented functions, while the circled elements depict the resulting outcomes obtained through function computation. The gray-colored circles positioned on the left signify the input values, corresponding to the components of the identity matrix.
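
The forward pass through the network of Fig. 1 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's code: it assumes normal incidence and a phase change of 2πnt/λ per layer, takes eqn (2) and (4) as printed, and fills in eqn (3), (5), (9) and (10), which appear only as images here, with the standard characteristic-matrix and admittance expressions. The index multiplying b and d in eqn (7) and (8) is taken to be that of the exit medium. The names `tmm_forward`, `squared_error`, `n_in` and `n_sub` are hypothetical.

```python
import cmath
import math


def tmm_forward(thicknesses, indices, wavelength, n_in=1.0, n_sub=1.5):
    """Return (T, R) at one wavelength for a stack at normal incidence.

    thicknesses and wavelength share one length unit; indices may be complex.
    """
    # Inputs a1, b1, c1, d1 of the network: the identity matrix.
    a, b, c, d = 1 + 0j, 0j, 0j, 1 + 0j
    for t, n in zip(thicknesses, indices):
        phi = 2 * math.pi * n * t / wavelength    # assumed phase change
        cos_p, sin_p = cmath.cos(phi), cmath.sin(phi)
        a, b, c, d = (
            a * cos_p + 1j * b * n * sin_p,       # eqn (2)
            1j * a * sin_p / n + b * cos_p,       # assumed form of eqn (3)
            c * cos_p + 1j * d * n * sin_p,       # eqn (4)
            1j * c * sin_p / n + d * cos_p,       # assumed form of eqn (5)
        )
    B = a + b * n_sub                             # eqn (7), exit index assumed
    C = c + d * n_sub                             # eqn (8), exit index assumed
    denom = n_in * B + C
    # Standard admittance expressions, assumed for eqn (9) and (10).
    T = 4 * n_in * n_sub.real / abs(denom) ** 2
    R = abs((n_in * B - C) / denom) ** 2
    return T, R


def squared_error(value, target):
    """Error function of eqn (11) and (12)."""
    return (value - target) ** 2


if __name__ == "__main__":
    # Two lossless layers at 550 nm (thicknesses in nm); values are illustrative.
    T, R = tmm_forward([100.0, 50.0], [2.0, 1.38], 550.0)
    print(T, R, squared_error(T, 1.0))
```

Because each layer only updates the four running values a, b, c and d, adding a layer means one more pass through the loop body, which mirrors the repeated box in Fig. 1.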

If a network can be constructed and the loss value, representing the disparity between the network-obtained results and the expected outcomes, can be calculated, gradients can be computed using the chain rule, allowing for inverse design through gradient descent. In the case of a single layer, the forward propagation utilizing input parameters within the network depicted in Fig. 1 can be articulated through the dependency relationships of each function, as described in eqn (13). Since the objective of this paper is to ascertain the optimal layer thickness, the partial derivative of Y with respect to the layer thickness is required; it is given by eqn (14). Even with two layers, the gradient of the loss function with respect to the thicknesses of layer 1 and layer 2 can be readily calculated by considering only the augmented network, as demonstrated in eqn (15) through eqn (20). Eqn (15)–(18) present the expressions calculated from F1 to F4 for the second layer of the network. The results of these calculations are then substituted into eqn (13) to obtain the loss value. This reveals that the computation is executed by reusing F1(A) to F4(A). This approach simplifies formula implementation by reusing existing code, thereby eliminating the need to create complex formulas based on the number of layers. This method is particularly advantageous when the calculation is implemented as computer program code. Eqn (19) and (20) represent the expressions used to obtain the gradients of Y with respect to t1 and t2 during the backpropagation process. As layers are added and the network changes, the gradient with respect to t1 evolves from eqn (14) to (20). Whenever a layer is added, this adjustment applies to every thickness parameter, yet manual intervention is unnecessary. 
Instead, the procedure is automated by tracking the path of each parameter during forward propagation and computing its gradient from the partial derivative values stored during this phase (Fig. 2). This methodology draws inspiration from the technique used to update weights and biases in deep neural networks.33

 
image file: d4nr01667j-t7.tif(13)
 
image file: d4nr01667j-t8.tif(14)
 
$F_1(A) = F_1(t, F_1(A), F_2(A), F_3(A), F_4(A), n, k)$ (15)
 
$F_2(A) = F_2(t, F_1(A), F_2(A), F_3(A), F_4(A), n, k)$ (16)
 
$F_3(A) = F_3(t, F_1(A), F_2(A), F_3(A), F_4(A), n, k)$ (17)
 
$F_4(A) = F_4(t, F_1(A), F_2(A), F_3(A), F_4(A), n, k)$ (18)
 
image file: d4nr01667j-t9.tif(19)
 
image file: d4nr01667j-t10.tif(20)


image file: d4nr01667j-f2.tif
Fig. 2 Exposition of the computational procedures involved in both forward propagation and backpropagation within the inverse design.
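
The chain-rule computation summarized in Fig. 2 can be illustrated with a short sketch that stores the partial products from the forward pass and reuses them to obtain the gradient of Y with respect to every thickness. It is an assumption-laden illustration rather than the paper's implementation: it treats a single wavelength, normal incidence, a real exit index, and the standard admittance form of the transmittance, since eqn (9), (13), (14), (19) and (20) are not reproduced in the text. The names `layer_matrix`, `matmul` and `loss_and_gradients` are hypothetical.

```python
import cmath
import math


def layer_matrix(t, n, wavelength):
    """Return one layer's matrix (a, b, c, d) and its derivative in t."""
    k = 2 * math.pi * n / wavelength           # phase change per unit thickness
    cs, sn = cmath.cos(k * t), cmath.sin(k * t)
    m = (cs, 1j * sn / n, 1j * n * sn, cs)
    dm = (-k * sn, 1j * k * cs / n, 1j * n * k * cs, -k * sn)
    return m, dm


def matmul(p, q):
    """Multiply two 2x2 matrices stored as flat tuples (a, b, c, d)."""
    return (p[0] * q[0] + p[1] * q[2], p[0] * q[1] + p[1] * q[3],
            p[2] * q[0] + p[3] * q[2], p[2] * q[1] + p[3] * q[3])


def loss_and_gradients(thicknesses, indices, wavelength, target,
                       n_in=1.0, n_sub=1.5):
    """Return Y = (T - target)^2 and dY/dt for every layer."""
    identity = (1 + 0j, 0j, 0j, 1 + 0j)
    layers = [layer_matrix(t, n, wavelength)
              for t, n in zip(thicknesses, indices)]
    # Forward pass: prefixes[j] is the product of the matrices of layers 0..j-1.
    prefixes = [identity]
    for m, _ in layers:
        prefixes.append(matmul(prefixes[-1], m))
    # suffixes[j] is the product of the matrices of layers j..N-1.
    suffixes = [identity] * (len(layers) + 1)
    for j in range(len(layers) - 1, -1, -1):
        suffixes[j] = matmul(layers[j][0], suffixes[j + 1])

    def admittance_sum(p):
        # n_in * B + C, with B and C built as in eqn (7) and (8).
        return n_in * (p[0] + p[1] * n_sub) + (p[2] + p[3] * n_sub)

    D = admittance_sum(prefixes[-1])
    T = 4 * n_in * n_sub / abs(D) ** 2          # assumed form of eqn (9)
    Y = (T - target) ** 2                       # eqn (11)
    grads = []
    for j, (_, dm) in enumerate(layers):
        # Only the matrix of layer j depends on t_j.
        dP = matmul(matmul(prefixes[j], dm), suffixes[j + 1])
        dD = admittance_sum(dP)
        dT = -2 * T * (D.conjugate() * dD).real / abs(D) ** 2
        grads.append(2 * (T - target) * dT)     # dY/dT times dT/dt_j
    return Y, grads
```

A quick way to gain confidence in such a sketch is to compare each returned gradient with a central finite difference of Y; the two should agree to within numerical error.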

If the gradient of the loss value with respect to the layer thickness is known, one can adjust the layer thickness to achieve the desired function through suitable learning rates and optimization techniques. In neural networks, a variety of methods are employed to obtain the optimal weights and biases of the linear function. These techniques encompass gradient descent (GD),35 stochastic gradient descent (SGD),36 momentum,34 adaptive moment estimation (Adam),37 root mean square prop (RMSprop),37 adaptive gradient algorithm (AdaGrad),38 and Nesterov accelerated gradient (NAG),38 among others. While these techniques have conventionally been employed to optimize weights and biases in neural networks, this paper adapts them for the optimization of layer thickness. Among the various methods explored, Adam, renowned for its effective utilization of gradient history in optimization, was utilized. Additionally, a random initialization method was employed concurrently with Adam to further enhance performance.
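
A single Adam update applied to layer thicknesses instead of weights and biases can be sketched as follows. The update rule is the standard Adam formulation with bias correction; the function name `adam_step`, the list-based state and the default `eps` value are illustrative assumptions rather than details taken from the paper.

```python
import math


def adam_step(params, grads, m, v, step, lr,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; step counts from 1. Returns (params, m, v)."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - beta1 ** step)          # bias correction
        v_hat = v_i / (1 - beta2 ** step)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


if __name__ == "__main__":
    # Two thicknesses with illustrative gradients of the loss.
    t, m, v = adam_step([100.0, 50.0], [0.5, -2.0], [0.0, 0.0], [0.0, 0.0],
                        step=1, lr=1.0)
    print(t)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each thickness moves by roughly `lr` regardless of the gradient scale; later steps shrink as the gradient history accumulates.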

Fig. 3 illustrates the inverse design process flowchart of an optical multilayer thin film using backward propagation. As depicted in the chart, the process begins with random initialization within a specified range to establish a starting point. Subsequently, forward propagation and backward propagation occur from this starting point for a specified number of iterations (i1). Following this, the loss value is evaluated to determine if it falls below the predefined threshold. If the criteria are satisfied, the thickness is recorded. Finally, to ensure the reliability of the results, the process concludes after confirming the attainment of the set-saved thickness count (i2). If there are insufficient data to meet i2, an additional iteration is performed starting from random initialization. Both i1 and i2 are hyperparameters.


image file: d4nr01667j-f3.tif
Fig. 3 Flowchart of the inverse design process of an optical multilayer thin film using backward propagation.
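
The loop described by the flowchart can be sketched as below. The sketch assumes a caller-supplied function `loss_and_grad` that returns the loss and its gradient for a list of thicknesses; that function, the `max_restarts` safeguard, the fixed `seed` and the name `inverse_design` are hypothetical additions for illustration, while `i1`, `i2` and the loss threshold follow the text.

```python
import math
import random


def inverse_design(loss_and_grad, n_layers, t_range, i1, i2, threshold,
                   lr, max_restarts=1000, seed=0):
    """Collect up to i2 thickness sets whose final loss is below threshold."""
    rng = random.Random(seed)
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    saved = []
    for _ in range(max_restarts):
        if len(saved) >= i2:
            break
        # Random initialization within the permitted thickness range.
        t = [rng.uniform(*t_range) for _ in range(n_layers)]
        m = [0.0] * n_layers
        v = [0.0] * n_layers
        for step in range(1, i1 + 1):
            _, grads = loss_and_grad(t)        # forward and backward propagation
            for j, g in enumerate(grads):      # Adam update of each thickness
                m[j] = beta1 * m[j] + (1 - beta1) * g
                v[j] = beta2 * v[j] + (1 - beta2) * g * g
                m_hat = m[j] / (1 - beta1 ** step)
                v_hat = v[j] / (1 - beta2 ** step)
                t[j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        loss, _ = loss_and_grad(t)
        if loss < threshold:                   # the "Save Thickness" branch
            saved.append((loss, list(t)))
    return saved


if __name__ == "__main__":
    # Toy quadratic loss standing in for the TMM network.
    def toy(t):
        return (sum((x - 30.0) ** 2 for x in t),
                [2 * (x - 30.0) for x in t])

    for loss, t in inverse_design(toy, 2, (20.0, 40.0), i1=50, i2=3,
                                  threshold=250.0, lr=0.1):
        print(loss, t)
```

Runs that end above the threshold are simply discarded and replaced by a new random start, which is how the flowchart keeps iterating until the saved count reaches i2.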

Even after layer thickness optimization using Adam is completed, additional steps are necessary to ascertain the optimal thickness. Upon completion of the optimization, the layer thickness is not obtained as a single representative value; instead, the optimal value varies with the wavelength (ESI Fig. S4). This variability arises from differences in the k value corresponding to the wavenumber and in the refractive index of the layer material, both of which change with the wavelength. Consequently, layer thickness optimization is performed separately for each wavelength. Therefore, during the ‘Save Thickness’ process (Fig. 3), the task of identifying the representative value among the thicknesses for each wavelength is executed at the same time (ESI Fig. S5). Among the thicknesses obtained for the individual wavelengths, the one saved as the representative thickness is the one that yields the lowest loss value when it is applied as the layer thickness (ESI Fig. S6).
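
One possible reading of this selection step is sketched below: each per-wavelength thickness set is tried as a candidate for the whole spectrum, and the candidate with the lowest mean loss over all wavelengths is kept. The averaging over wavelengths, the function name `pick_representative` and the `loss_at` callback are assumptions made for illustration; the exact procedure is given in the ESI.

```python
def pick_representative(candidates, wavelengths, loss_at):
    """Return the candidate with the lowest mean loss over all wavelengths.

    candidates: one thickness list per optimized wavelength.
    loss_at(thicknesses, wavelength): loss of a candidate at one wavelength.
    """
    def mean_loss(t):
        return sum(loss_at(t, w) for w in wavelengths) / len(wavelengths)

    best = min(candidates, key=mean_loss)
    return best, mean_loss(best)


if __name__ == "__main__":
    # Toy loss whose optimum thickness shifts with wavelength.
    def toy_loss(t, w):
        return (t[0] - w / 10.0) ** 2

    print(pick_representative([[40.0], [50.0], [60.0]],
                              [400.0, 500.0, 600.0], toy_loss))
```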

Results and discussion

The method proposed in this paper was analyzed by executing the inverse design of optical multilayer thin films transmitting red, green, and blue colors, as well as those reflecting cyan, magenta, and yellow colors. This inverse design method receives two inputs: the structure and the target spectrum. The structure outlines the number of layers and materials used to construct the optical multilayer thin film, while the target spectrum represents the desired optical characteristics of the structure. The target spectrum serves as reference data in the inverse design process and is utilized to determine the loss value. Here, a structure comprising tungsten trioxide (materials 1, 3 and 5) and silver (materials 2 and 4) was employed. However, given the absence of material-type restrictions, a wide range of materials can be utilized. Upon receiving these inputs, forward propagation and backward propagation are sequentially conducted to derive a gradient for each layer thickness, and the layer thickness is subsequently updated using this gradient. Fig. 4 illustrates the process of updating layer thickness using gradients. As a result of inverse design, the layer thickness was derived, as shown in Table 1, and the output spectrum corresponding to each target spectrum was produced. Upon examining the shape of the output spectrum, it exhibits a similar profile to the target spectrum provided as input data. Furthermore, upon assessing the mean squared error values between the target spectrum and the output spectrum, loss values for R, G, and B are observed to be 0.03, 0.03, and 0.02, respectively.
image file: d4nr01667j-f4.tif
Fig. 4 Flowchart of the inverse design process of a transmissive optical multilayer thin film using backward propagation. The input consists of the structure and target spectrum of the multilayer thin film to be implemented, while the output presents the resulting spectrum of the designed multilayer thin film intended to match the input target spectrum. Additionally, a conceptual image illustrating the thickness of the implemented structure is shown.
Table 1 Layer thickness by color of the transmissive multilayer thin film implemented through inverse design
Color   t1 (nm)   t2 (nm)   t3 (nm)   t4 (nm)   t5 (nm)
R       79        27        121       25        230
G       169       34        81        18        40
B       125       22        35        15        108


When conducting inverse design, a learning rate of 10⁻⁹ was employed, with i1 set to 200 iterations and i2 set to 20. To assess the validity of the obtained thickness, a loss threshold of 0.07 was designated. The value of 200 was chosen for i1 because the thickness saturated before reaching 200 iterations, while i2 was an arbitrary value chosen for this study and can be adjusted either to enhance the simulation speed or to improve simulation accuracy. Commonly used values for the hyperparameters in Adam were utilized: 0.9 for β1 and 0.999 for β2.

Fig. 5 illustrates the optimization of layer thickness through inverse design. Each graph depicts the evolution of values during the inverse design process for two layers, offering insight into how the layer thicknesses progress towards their optimum. In practice, the gradient of the loss function with respect to each thickness is influenced by all five layers, and the parameters are updated accordingly. However, as a five-dimensional parameter space cannot be represented graphically, two thicknesses at a time were visualized to observe the parameter update process. Apart from the thicknesses used as variables in each graph, optimized values were applied to the remaining thicknesses. Fig. 5(a–c) depict the parameter update process for achieving a spectrum that transmits the red area in the target spectrum. Specifically, (a) represents t1 and t2, (b) represents t3 and t4, while (c) exhibits the change in loss value as t5 and t1 vary, portrayed as a contour map. The gradient on the map signifies the update direction for each thickness value. Notably, the graph illustrates each value converging towards a point of minimal loss, indicating appropriate thickness updates. Moreover, it demonstrates a diminishing update magnitude in the latter stages, attributed not only to a decreasing gradient but also to the optimization effects of Adam. To facilitate interpretation, a shared color bar for the contour map is presented on the right-hand side of graph (c). Additionally, (d–f) delineate the thickness update process for achieving a spectrum transmitting green in the target spectrum, while (g–i) illustrate the same for blue transmission. Across all graphs, it is evident that thickness updates progress towards minimizing the loss value. Thus, it is confirmed that by acquiring the gradient of the loss value with respect to the parameters and updating them accordingly, an appropriate thickness can be determined efficiently and rapidly without exhaustive simulation.


image file: d4nr01667j-f5.tif
Fig. 5 Parameter update process for achieving a spectrum that transmits the red, green, and blue colors. (a), (b), and (c) depict (t1, t2), (t3, t4), and (t5, t1) of the red color, respectively. (d), (e), and (f) depict (t1, t2), (t3, t4), and (t5, t1) of the green color, respectively. (g), (h), and (i) depict (t1, t2), (t3, t4), and (t5, t1) of the blue color, respectively.

The operating principle of an optical multilayer thin film is based on light interference, resulting in a periodic value dependent on the wavelength of interference occurrence. Consequently, the optimal layer thickness may vary depending on the location of the initial starting point, owing to this fundamental principle. In essence, there exists no singular global minimum but rather multiple local minima (ESI Fig. S7). However, the absence of a definitive global minimum can be advantageous in designing optical multilayer thin films, as it allows for adaptability according to specific requirements. For instance, certain applications may necessitate a very thin film, while others may require a thicker film to minimize level differences with surrounding structures. The ESI figures, from Fig. S8 to S10, depict the results of extracting ten optimal thicknesses for each color target spectrum input into the inverse design process. While each spectrum shares a similar shape, they represent varying thicknesses, as detailed in ESI Tables S1–S3. Attempting to derive all these candidates without utilizing inverse design would necessitate simulating a vast number of cases, proving highly inefficient in terms of time and cost. Thus, the inverse design method proposed in this paper offers the advantage of furnishing researchers with a diverse array of structural candidates fulfilling the same function. To fully capitalize on this benefit, the inverse design process outlined above was devised to enable users to obtain a variety of candidates by adjusting the number of random initializations, a hyperparameter denoted as i2.
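
The periodicity behind these multiple minima can be illustrated for the simplest case. The sketch below assumes a single lossless layer at normal incidence and one wavelength, with the standard admittance form of the transmittance; the function name and the parameter values are illustrative only. Under these assumptions, adding a half-wave thickness λ/(2n) leaves the transmittance at that wavelength unchanged, while the transmittance at other wavelengths generally changes, which is why a broadband target yields distinct minima rather than exact copies.

```python
import cmath
import math


def single_layer_transmittance(t, n, wavelength, n_in=1.0, n_sub=1.5):
    """Transmittance of one layer on a substrate at normal incidence."""
    phi = 2 * math.pi * n * t / wavelength
    cs, sn = cmath.cos(phi), cmath.sin(phi)
    B = cs + (1j * sn / n) * n_sub       # a + b * n_sub for a single layer
    C = 1j * n * sn + cs * n_sub         # c + d * n_sub for a single layer
    return 4 * n_in * n_sub / abs(n_in * B + C) ** 2


if __name__ == "__main__":
    n, design_wavelength = 2.0, 550.0            # illustrative values, nm
    period = design_wavelength / (2 * n)         # half-wave thickness
    for t in (40.0, 40.0 + period):
        print(t,
              single_layer_transmittance(t, n, 550.0),
              single_layer_transmittance(t, n, 450.0))
```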

We devised an inverse design approach for creating an optical multilayer thin film tailored to reflect light of a specific spectrum. This involved updating the thickness through gradient calculations within the network configuration proposed in this paper. In the inverse design of the transmissive films described above, the calculation passed through the transmittance node corresponding to eqn (9), situated immediately before the network loss. Conversely, for devices intended for reflectance, the relevant node was changed to a reflectance node corresponding to eqn (10), while all other nodes remained unchanged (Fig. 6). As in the transmissive device design process, the required inputs are the device structure and the target spectrum. However, in the case of reflectance, unlike transmittance, the substrate was configured as a reflector. Upon performing inverse design using the structure and target spectrum, we acquired the layer thickness for the target spectrum and obtained the reflectance spectrum (Table 2). The results confirmed that the reflection closely resembled the target spectrum, as depicted in Fig. 6. Examination of the mean squared error values between the target spectrum and the output spectrum revealed values of 0.05, 0.02, and 0.02 for C, M, and Y, respectively. Fig. 7 illustrates the inverse design process of the reflective optical multilayer thin film. Each graph visualizes the changes in the thickness values of two layers during the update process. Fig. 7(a–c) depict the thickness update process aimed at reflecting blue and green from the target spectrum to produce a cyan color. Specifically, (a) represents t1 and t2, (b) represents t3 and t4, while (c) exhibits the change in loss value as t5 and t1 vary, portrayed as a contour map. The gradient on the map signifies the update direction for each thickness value. 
Additionally, (d–f) delineate the thickness update process for achieving a spectrum reflecting magenta in the target spectrum, while (g–i) illustrate the same for yellow reflection. Across all graphs, it is evident that thickness updates progress towards minimizing the loss value.


image file: d4nr01667j-f6.tif
Fig. 6 Flowchart of the inverse design process of a reflective optical multilayer thin film using backpropagation. The input consists of the structure and target spectrum of the multilayer thin film to be implemented, while the output presents the resulting spectrum of the designed multilayer thin film intended to match the input target spectrum. Additionally, a conceptual image illustrating the thickness of the implemented structure is shown.

image file: d4nr01667j-f7.tif
Fig. 7 Parameter update process for achieving a spectrum that reflects the cyan, magenta, and yellow colors. (a), (b), and (c) depict (t1, t2), (t3, t4), and (t5, t1) of the cyan color, respectively. (d), (e), and (f) depict (t1, t2), (t3, t4), and (t5, t1) of the magenta color, respectively. (g), (h), and (i) depict (t1, t2), (t3, t4), and (t5, t1) of the yellow color, respectively.
Table 2 Layer thickness by color of the reflective multilayer thin film implemented through inverse design
Color   t1 (nm)   t2 (nm)   t3 (nm)   t4 (nm)   t5 (nm)
C       72        41        117       94        241
M       166       48        75        106       11
Y       2         15        21        25        33


The effectiveness of inverse design, which involves configuring the TMM as a network to derive gradients of the loss value with respect to each layer thickness and subsequently updating the thicknesses accordingly, has been validated in preceding sections. This approach drastically reduces the computational workload, facilitating efficient device design. To further illustrate its effectiveness, we examined the required calculations as the number of layers increased. Assuming a layer thickness range between 0 nm and 300 nm with 1 nm increments, conventional methods require exhaustive grid searches to locate the desired spectrum among the results. For instance, with one layer, only 300 calculations are necessary, but this figure is multiplied by 300 with each additional layer. Consequently, designing for 5 layers necessitates a staggering 2.43 × 10¹² calculations. Grid search is a common method for thickness optimization. However, to reduce computational effort, it is sometimes applied by fixing the thickness of certain layers and performing the grid search only for the remaining layers. In this approach, the choice of which layers to fix and their corresponding thicknesses often relies on the engineer's experience.5 For example, in the structure used in this paper, an engineer could fix the thicknesses of the two Ag layers, and a grid search would be performed for the remaining three layers of WO3. Even with this method, a computational load of 27 × 10⁶ calculations is required. Even after completing these calculations, it is challenging to have confidence that the obtained value is the optimal one. Alternatively, a grid search with broad intervals for each parameter can be performed first, followed by a more detailed grid search around the optimal point. When both methods presented above are applied, the computational load is reduced to 22.4 × 10⁴ for the same structure discussed in this paper. 
However, the computational demand still increases with the number of layers, and the performance can be suboptimal compared with the results obtained in this paper.¹ Conversely, the proposed inverse design method fixes only the number of iterations, significantly reducing the workload compared to grid search. In this study, we found that the thickness updates saturate after 200 iterations, so we set the iteration limit accordingly. The disparity is shown in Fig. 8. The efficiency advantage of the proposed method grows with the number of layers constituting the film structure. To enhance the accuracy of inverse design and to explore different thickness combinations with similar functions, the number of initial thickness points must be increased, which increases the calculation workload. However, this increase has a negligible impact on efficiency. For instance, applying 10 random initial points requires only 2 × 10³ calculations (10 initial points × 200 iterations), regardless of the layer count.
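The counts quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it follows the text in counting 300 candidate thicknesses per layer and 200 iterations per initial point.

```python
def grid_search_evaluations(num_layers, points_per_layer=300):
    """TMM evaluations for an exhaustive grid over every searched layer."""
    return points_per_layer ** num_layers


def gradient_descent_evaluations(num_initial_points, iterations=200):
    """Iterations summed over all initial points; independent of layer count."""
    return num_initial_points * iterations


if __name__ == "__main__":
    for layers in range(1, 6):
        print(layers,
              grid_search_evaluations(layers),
              gradient_descent_evaluations(10))
```

With five searched layers the grid count is 300⁵ = 2.43 × 10¹², with three searched layers it is 300³ = 27 × 10⁶, and the gradient descent count stays at 2 × 10³ in every case.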


Fig. 8 Comparison of the number of calculations required for inverse design between the grid search and gradient descent methods.

The reduced computational load translates into a significant advantage in inverse design time. Typically, the grid search is performed using MATLAB simulations. Even if MATLAB takes only 0.1 ms to calculate the TMM for a 5-layer optical nano-thin film at one fixed set of thicknesses, and a coarse 5 nm spacing is used for the grid search between 0 nm and 300 nm, the total computation time amounts to at least 21.6 hours (60⁵ × 10⁻⁴ seconds). In contrast, our proposed method completes 200 iterations for a single initial condition in less than 2 seconds. When the number of initial conditions is increased to 10 for enhanced accuracy, the execution time is approximately 1 minute. The disparity in execution time between our method and the traditional approach becomes more pronounced as the number of layers increases. AI-based methods capable of performing inverse design in just 0.1 seconds have been developed.²⁵ However, a significant amount of time is required to construct the training dataset, since the model is trained on 10 million designs. Additionally, such models remain limited in predicting accurate results for new structures not included among the 10 million designs. Therefore, the capability to rapidly execute inverse design by adapting the network configuration to various device conditions is a significant advantage of the proposed method, distinguishing it from existing approaches.
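For reference, the grid search time estimate quoted above follows from the number of grid points per layer and the assumed 0.1 ms per TMM evaluation. The symbols N_grid and t_grid are introduced here only for illustration:

```latex
\begin{align}
N_{\mathrm{grid}} &= \left(\frac{300\ \mathrm{nm}}{5\ \mathrm{nm}}\right)^{5} = 60^{5} = 7.776 \times 10^{8} \\
t_{\mathrm{grid}} &= N_{\mathrm{grid}} \times 10^{-4}\ \mathrm{s} = 7.776 \times 10^{4}\ \mathrm{s} = 21.6\ \mathrm{h}
\end{align}
```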

Conclusions

In this paper, we introduced a novel approach to inverse design that differs from previously proposed methods based solely on artificial intelligence or grid search. The technique leverages the fundamental principle of deep neural networks: calculating the gradient of the loss with respect to each parameter and using it to update the design. To achieve this, we developed a formula incorporating an identity matrix and established a network capable of performing efficient computations using available computing resources. Unlike conventional methods, our approach eliminates the need for a separate learning process, allowing inverse design to begin immediately and determine the optimal thicknesses for the desired structure. Moreover, it offers the flexibility to accommodate various structural configurations. For optical nano-thin films with a laminated structure of planar layers, the proposed method can be implemented by simply adjusting the network configuration, irrespective of the number of layers. However, it is challenging to apply the proposed method to cases where the layers have nanostructures or the film is non-planar. In this paper, we constructed a network to examine the optical properties in the normal direction, but it is anticipated that a network capable of assessing angular properties can be developed in the future.

Through mathematical derivations, we demonstrate the efficient computation of gradient values using the chain rule within a network framework. Compared to exhaustive grid search methods, our approach significantly reduces the computational burden when updating layer thicknesses to find optimal values through gradient calculations, thereby enabling efficient design processes. Furthermore, its efficacy improves with increasing numbers of layers. Utilizing our proposed method, we designed both transmissive and reflective multilayer optical nano-thin films tailored to specific wavelengths, verifying the feasibility of designing devices that achieve target spectra.
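As a rough illustration of how thickness gradients can be obtained through a product of layer matrices, the sketch below uses the standard characteristic-matrix form of the TMM at normal incidence and differentiates the matrix product layer by layer. It is not the network formulation or the code from this paper: all function names are hypothetical, the default media indices (1.0 and 1.5) are placeholders, and absorbing layers are assumed to follow the n − ik convention that matches this matrix form.

```python
import cmath
import math

IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def layer_matrix(n, d, lam):
    """Characteristic matrix of one layer (index n, thickness d), normal incidence."""
    delta = 2 * math.pi * n * d / lam
    c, s = cmath.cos(delta), cmath.sin(delta)
    return [[c, 1j * s / n], [1j * n * s, c]]


def layer_matrix_grad(n, d, lam):
    """Derivative of layer_matrix with respect to the thickness d."""
    k = 2 * math.pi * n / lam
    c, s = cmath.cos(k * d), cmath.sin(k * d)
    return [[-k * s, 1j * k * c / n], [1j * n * k * c, -k * s]]


def denominator(m, n_in, n_out):
    """n_in*B + C with [B, C] = M [1, n_out]; linear in the entries of M."""
    return n_in * (m[0][0] + m[0][1] * n_out) + m[1][0] + m[1][1] * n_out


def transmittance_and_grad(indices, thicknesses, lam, n_in=1.0, n_out=1.5):
    """Return T and the list dT/dd_j for a stack between two lossless media."""
    mats = [layer_matrix(n, d, lam) for n, d in zip(indices, thicknesses)]
    count = len(mats)
    # prefix[j] = M_0 ... M_(j-1); suffix[j] = M_j ... M_(count-1)
    prefix = [IDENTITY]
    for m in mats:
        prefix.append(matmul(prefix[-1], m))
    suffix = [IDENTITY] * (count + 1)
    for j in range(count - 1, -1, -1):
        suffix[j] = matmul(mats[j], suffix[j + 1])
    big_d = denominator(prefix[count], n_in, n_out)
    t = 4 * n_in * n_out / abs(big_d) ** 2
    grads = []
    for j in range(count):
        # Only layer j depends on d_j, so only its matrix is differentiated.
        dm_j = layer_matrix_grad(indices[j], thicknesses[j], lam)
        dm = matmul(matmul(prefix[j], dm_j), suffix[j + 1])
        d_big = denominator(dm, n_in, n_out)
        grads.append(-2 * t * (big_d.conjugate() * d_big).real / abs(big_d) ** 2)
    return t, grads


def descent_step(indices, thicknesses, lam, target, rate):
    """One plain gradient descent update on the loss (T - target)^2."""
    t, grads = transmittance_and_grad(indices, thicknesses, lam)
    return [max(0.0, d - rate * 2 * (t - target) * g)
            for d, g in zip(thicknesses, grads)]
```

In practice the loss would be summed over many wavelengths, and the step size `rate` would need tuning or replacement by an adaptive optimizer. A quick sanity check is to compare the returned gradients against central finite differences of the transmittance.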

Optical nano-thin films have diverse applications including transmissive/reflective color filters, organic light-emitting diodes, organic photovoltaics, optical communication, etc. Their ability to selectively transmit or reflect specific wavelengths allows devices to emit desired light, absorb targeted wavelengths, or improve the efficiency of light-based communication. If the method proposed in this paper enables the efficient design of films with optimal performance, it is anticipated to broaden the functionalities and improve the efficiency of devices incorporating optical nano-thin films.

Materials

Computer resources

The processor utilized in this study was the AMD Ryzen 5 PRO 4650G with Radeon Graphics clocked at 3.7 GHz. The system's RAM capacity was 8.00 GB.

Refractive indices

The experimentally measured refractive indices of silver and tungsten trioxide, obtained using a spectroscopic ellipsometer (M2000D), were used for the calculations and inverse design.

Author contributions

J. H. Han: conceptualization, data curation, formal analysis, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing – original draft, and writing – review and editing.

Data availability

The data supporting this article have been included as part of the ESI. The code to perform the inverse design presented in this paper is accessible at https://github.com/Jun-Hee-Han/NanoFilm_Inverse_Design.git.

Conflicts of interest

There are no conflicts to declare.

References

  1. J. H. Han, D. Y. Kim, D. Kim and K. C. Choi, Sci. Rep., 2016, 6, 29341.
  2. J. H. Han, D. Kim, T. W. Lee, E. G. Jeong, H. S. Lee and K. C. Choi, ACS Photonics, 2018, 5, 1891–1897.
  3. C. Yang, W. Shen, Y. Zhang, K. Li, X. Fang, X. Zhang and X. Liu, Sci. Rep., 2015, 5, 9285.
  4. C. Ji, K. T. Lee, T. Xu, J. Zhou, H. J. Park and L. J. Guo, Adv. Opt. Mater., 2017, 5, 1700368.
  5. J. H. Han, D. H. Kim, E. G. Jeong, T. W. Lee, M. K. Lee, J. W. Park, H. Lee and K. C. Choi, ACS Appl. Mater. Interfaces, 2017, 9, 16343–16350.
  6. H. Cho, C. Yun, J. W. Park and S. Yoo, Org. Electron., 2009, 10, 1163–1169.
  7. S. M. Lee, C. S. Choi, K. C. Choi and H. C. Lee, Org. Electron., 2012, 13, 1654–1659.
  8. J. H. Han, D. Kim, T. W. Lee, Y. Jeon, H. S. Lee and K. C. Choi, ACS Photonics, 2018, 5, 3322–3330.
  9. S. Wang, T. Jiang, Y. Meng, R. Yang, G. Tan and Y. Long, Science, 2021, 374, 1501–1504.
  10. C. Yang, C. Ji, W. Shen, K.-T. Lee, Y. Zhang, X. Liu and L. J. Guo, ACS Photonics, 2016, 3, 590–596.
  11. A. Ghobadi, H. Hajian, M. Gokbayrak, B. Butun and E. Ozbay, Nanophotonics, 2019, 8, 823–832.
  12. W. Li, Y. Shi, K. Chen, L. Zhu and S. Fan, ACS Photonics, 2017, 4, 774–782.
  13. J. K. Tong, W. C. Hsu, Y. Huang, S. V. Boriskina and G. Chen, Sci. Rep., 2015, 5, 10661.
  14. Z. Chen, L. Zhu, A. Raman and S. Fan, Nat. Commun., 2016, 7, 13729.
  15. E. N. Cho, P. Moon, C. E. Kim and I. Yun, Expert Syst. Appl., 2012, 39, 8885–8889.
  16. R. Li Voti, J. Eur. Opt. Soc., 2018, 14, 1–12.
  17. T. Jalali, M. Jafari and A. Mohammadi, Mater. Sci. Eng., B, 2019, 247, 114354.
  18. Y. Shi, W. Li, A. Raman and S. Fan, ACS Photonics, 2018, 5, 684–691.
  19. Q. Guan, A. Raza, S. S. Mao, L. F. Vega and T. J. Zhang, ACS Photonics, 2023, 10, 715–726.
  20. A. Jiang, Y. Osamu and L. Chen, Sci. Rep., 2020, 10, 12780.
  21. A. Luce, A. Mahdavi, H. Wankerl and F. Marquardt, Mach. Learn.: Sci. Technol., 2023, 4, 015014.
  22. Q. Pan, S. Zhou, S. Chen, C. Yu, Y. Guo and Y. Shuai, Opt. Express, 2023, 31, 23944.
  23. W. Chen, Y. Gao, Y. Li, Y. Yan, J. Y. Ou, W. Ma and J. Zhu, Adv. Sci., 2023, 10, 2206718.
  24. H. Wang, Z. Zheng, C. Ji and L. Jay Guo, Mach. Learn.: Sci. Technol., 2021, 2, 025013.
  25. T. Ma, H. Wang and L. J. Guo, arXiv, 2023, preprint, arXiv:2304.10294 [physics.optics], DOI:10.48550/arXiv.2304.10294.
  26. D. E. Rumelhart, G. E. Hinton and R. J. Williams, Nature, 1986, 323, 533–536.
  27. Z. Li, R. Pestourie, Z. Lin, S. G. Johnson and F. Capasso, ACS Photonics, 2022, 9, 2178–2192.
  28. J. S. Jensen and O. Sigmund, Laser Photonics Rev., 2011, 5, 308–321.
  29. W. Ji, J. Chang, H. X. Xu, J. R. Gao, S. Gröblacher, H. P. Urbach and A. J. L. Adam, Light: Sci. Appl., 2023, 12, 169.
  30. R. Paniagua-Domínguez, Y. F. Yu, E. Khaidarov, S. Choi, V. Leong, R. M. Bakker, X. Liang, Y. H. Fu, V. Valuckas, L. A. Krivitsky and A. I. Kuznetsov, Nano Lett., 2018, 18, 2124–2132.
  31. J. Jiang and J. A. Fan, Nano Lett., 2019, 19, 5366–5372.
  32. Z. Li, R. Pestourie, J. S. Park, Y. W. Huang, S. G. Johnson and F. Capasso, Nat. Commun., 2022, 13, 1–11.
  33. S. Goki, Deep running starting from the bottom, Hanvit Media, 2017.
  34. I. Goodfellow, Y. Bengio and A. Courville, Deep Learning (Adaptive Computation and Machine Learning series), The MIT Press, 2016.
  35. A. Krizhevsky, I. Sutskever and G. E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, in Advances in Neural Information Processing Systems, ed. F. Pereira, C. J. Burges, L. Bottou and K. Q. Weinberger, 2012, vol. 25, https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf.
  36. H. Robbins and S. Monro, Ann. Math. Stat., 1951, 400–407.
  37. D. P. Kingma and J. Ba, arXiv, 2014, preprint, arXiv:1412.6980 [cs.LG], DOI:10.48550/arXiv.1412.6980.
  38. S. Ruder, arXiv, 2016, preprint, arXiv:1609.04747 [cs.LG], DOI:10.48550/arXiv.1609.04747.

Footnote

Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4nr01667j

This journal is © The Royal Society of Chemistry 2024