Study of Inverse Lithography Approaches based on Deep Learning

Abstract: Computational lithography (CL) has become an indispensable technology for improving the imaging resolution and fidelity of deep sub-wavelength lithography. State-of-the-art CL approaches optimize pixel-based mask patterns, which effectively increases the degrees of optimization freedom. However, as the data volume of photomask layouts grows, computational complexity has become a challenging problem that hinders the application of advanced CL algorithms. A number of innovative methods, such as machine learning and deep learning methods, have been developed to improve the computational efficiency of CL algorithms. After a brief introduction to optical lithography, this paper reviews some recent advances in fast CL approaches based on deep learning. At the end, this paper briefly discusses some potential developments for future work.


Introduction
Optical lithography is a crucial technology for manufacturing integrated circuits (ICs) in the semiconductor industry. Figure 1(a) shows the schematic diagram of a typical deep-ultraviolet (DUV) optical lithography system, which is used to transfer the IC layouts from the photomask onto the wafer [1,2]. The illumination source of the DUV lithography system emits light rays with a wavelength of 193 nm, which pass through the optical lens and uniformly illuminate the photomask. The light rays transmitted through the mask are collected by the projection optics, and then form the aerial image on the wafer. The top surface of the wafer is coated with a thin layer of photosensitive material, namely the photoresist, which is exposed and developed. Finally, the IC layout pattern is replicated on the wafer after the etch process.
As the critical dimensions (CDs) continuously shrink, computational lithography has been widely used to improve the resolution of the wafer image and extend the life of Moore's Law [3]. Computational lithography refers to a set of technologies that design and optimize lithography systems and processes through mathematical and algorithmic approaches. Inverse lithography technology (ILT) is a representative computational lithography approach that compensates for image distortion by pre-warping the photomask patterns. ILT regards the mask pattern as a binary pixelated image, where the zero-valued and one-valued pixels represent opaque and transparent regions, respectively. Figure 1(b) presents an illustration of the ILT method [4]. Pixel-based ILT greatly increases the degrees of freedom in mask optimization, so it can effectively improve the imaging performance of lithography systems.
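As an illustration of the pixelated mask representation, the following minimal sketch evaluates how a binary mask prints under a deliberately simplified imaging model. It assumes a coherent system whose optical kernel is a Gaussian, a sigmoid resist model, and a pattern error counted as the number of mismatched pixels; the kernel shape, the threshold and steepness values, and all function names are hypothetical choices made for this example and are not taken from the cited works.

```python
import numpy as np

def gaussian_kernel(size=11, sigma=2.0):
    """Hypothetical stand-in for the optical kernel of the imaging system."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def convolve_same(image, kernel):
    """Circular 2-D convolution via FFT; the output has the input's shape."""
    kh, kw = kernel.shape
    padded = np.zeros(image.shape, dtype=float)
    padded[:kh, :kw] = kernel
    # Shift the kernel centre to index (0, 0) so the result is not displaced.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def print_image(mask, kernel, threshold=0.3, steepness=50.0):
    """Coherent aerial image |h * M|^2 followed by a sigmoid resist model."""
    aerial = convolve_same(mask, kernel) ** 2
    return 1.0 / (1.0 + np.exp(-steepness * (aerial - threshold)))

def pattern_error(mask, target, kernel):
    """Number of pixels where the binarised print differs from the target."""
    printed = print_image(mask, kernel) > 0.5
    return int(np.sum(printed != (target > 0.5)))

# Target layout: one rectangle (1 = transparent pixel, 0 = opaque pixel).
target = np.zeros((64, 64))
target[24:40, 16:48] = 1.0
kernel = gaussian_kernel()

# Using the target itself as the mask, i.e. without any pre-warping.
pe_naive = pattern_error(target, target, kernel)
```

In this toy setting the uncorrected mask already leaves a nonzero pattern error because the corners of the rectangle are rounded off by the low-pass imaging; ILT aims to remove this kind of residual by modifying the mask pixels.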
In the past, researchers have proposed a number of optimization algorithms, many of them gradient-based, to solve ILT problems [6,7]. For instance, Liu et al. compensated the image distortion for both binary masks and phase-shifting masks using the branch and bound algorithm, as well as the simulated annealing algorithm [8]. Sherif et al. proposed a binary mask optimization method for incoherent diffraction-limited imaging systems, where the problem was formulated as a mixed linear integer program (MLIP) and then solved by the branch and bound method [9]. Granik et al. formulated ILT as nonlinear, constrained minimization problems over a domain of mask pixels, and then applied local variation and gradient descent methods to quickly solve ILT problems [4]. Poonawala et al. proposed a set of gradient-based algorithms and regularization methods to solve the ILT problem in coherent lithography imaging systems [7]. However, traditional gradient-based ILT methods suffer from a heavy computational load and low efficiency [10].
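To make the gradient-based formulation concrete, the sketch below runs a steepest-descent ILT loop on a toy problem. It assumes a coherent imaging model with a symmetric Gaussian kernel, a sigmoid resist model, the cost F = ||Z - T||^2 between the print image Z and the target T, and the common parametrization M = (1 + cos θ)/2 that keeps mask pixels in [0, 1]. The backtracking step-size rule and all numerical values are assumptions added for robustness of the example and are not part of any cited algorithm.

```python
import numpy as np

def make_kernel_fft(shape, size=11, sigma=2.0):
    """FFT of a centred, normalised Gaussian kernel (hypothetical optics)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    k /= k.sum()
    padded = np.zeros(shape)
    padded[:size, :size] = k
    padded = np.roll(padded, (-(size // 2), -(size // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def conv(x, kernel_fft):
    """Circular convolution with the kernel whose FFT is given."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * kernel_fft))

def cost_and_grad(theta, target, kernel_fft, a=25.0, t_r=0.3):
    """Cost ||Z - T||^2 and its gradient with respect to theta."""
    mask = 0.5 * (1.0 + np.cos(theta))
    field = conv(mask, kernel_fft)
    z = 1.0 / (1.0 + np.exp(-a * (field**2 - t_r)))
    cost = float(np.sum((z - target) ** 2))
    # Chain rule: dF/dZ * dZ/dI * dI/dfield, with I = field^2.
    g = 2.0 * (z - target) * a * z * (1.0 - z) * 2.0 * field
    # The kernel is symmetric, so the adjoint of conv is conv itself.
    grad_mask = conv(g, kernel_fft)
    grad_theta = grad_mask * (-0.5 * np.sin(theta))
    return cost, grad_theta

def sd_ilt(target, n_iter=30, step=1.0):
    """Steepest descent with a simple backtracking step-size rule."""
    kernel_fft = make_kernel_fft(target.shape)
    theta = np.where(target > 0.5, np.pi / 5, 4 * np.pi / 5)
    cost, grad = cost_and_grad(theta, target, kernel_fft)
    history = [cost]
    for _ in range(n_iter):
        s = step
        while s > 1e-8:  # halve the step until the cost decreases
            trial = theta - s * grad
            trial_cost, trial_grad = cost_and_grad(trial, target, kernel_fft)
            if trial_cost < cost:
                theta, cost, grad = trial, trial_cost, trial_grad
                break
            s *= 0.5
        history.append(cost)
    binary_mask = (0.5 * (1.0 + np.cos(theta)) > 0.5).astype(float)
    return binary_mask, history

target = np.zeros((64, 64))
target[24:40, 16:48] = 1.0
mask, history = sd_ilt(target)
```

Every iteration requires several forward and adjoint evaluations of the imaging model over the whole pixel grid, which is the source of the computational burden noted above when the layout is large.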
In order to overcome the computational complexity, many machine learning techniques have been applied to accelerate the ILT algorithms [11]. Wang et al. used machine learning to efficiently generate sub-resolution assist features (SRAFs) on full-chip layouts at the 20nm technology node, and achieved a high imaging accuracy [12]. Guajardo et al. used machine learning methods to jointly optimize the main features (MFs) and SRAFs [13]. Ma et al. proposed fast mask optimization algorithms based on non-parametric kernel regression, which can effectively improve the computational efficiency and mask manufacturability [14]. Xu et al. proposed a fast SRAF generation method that involved support vector machines (SVMs) and logistic regression models in the complete mask optimization process [15]. K. Luo et al. and R. Luo et al. proposed fast mask optimization methods based on SVMs [16] and multilayer perceptron neural networks [17], respectively.
Due to the high nonlinearity of the ILT problem, traditional machine learning methods have inherent limitations. For instance, they often require a large number of training samples to accurately construct the nonlinear mapping between the IC layout and the corresponding ILT solution [10]. Over the past decade, deep learning has moved to the forefront of fast ILT approaches, since deep networks are able to fit highly complex nonlinear functions [18]. This paper describes and discusses in detail several ILT methods based on deep learning.
The rest of this paper is organized as follows. Section 2 summarizes the ILT methods based on standard deep learning approaches, and Section 3 describes and discusses several ILT methods based on a radically new learning method, namely the model-driven convolution neural network (MCNN). Section 4 concludes the paper.

ILT based on Standard Deep Learning
Deep learning has been used to solve a series of problems in computational lithography, for example the defect characterization and classification of masks based on convolutional neural networks [19], and hotspot correction based on the cycle-consistent generative adversarial network [20]. Lan et al. proposed a new technique to apply deep neural networks in a GPU-accelerated mask optimization platform, which provided a fast and accurate ILT solution for the 10nm and below technology nodes [21]. Shi et al. proposed an automatic design method for optimal feature vectors based on a convolutional neural network (CNN), which greatly improved the computational efficiency of ILT [22]. Chen et al. used the Auto Pattern Selection (APS) tool to train the Newron SRAF deep learning network and successfully realized inverse mask optimization on full-chip layouts [23]. As examples, this section details two ILT methods based on the variational autoencoder (VAE) [24] and the generative adversarial network (GAN) [25].

ILT using VAE
In 2018, Zhang et al. proposed a mask design method based on a widely used deep learning framework, i.e., the variational autoencoder [26]. As shown in Fig. 2(a), a VAE is composed of an encoder and a decoder. In the VAE method, a large number of mask patterns are used as the training data set, which must be obtained in advance through other methods. Here, the mask patterns of the training data are obtained by adding or removing rectangular features at some control points on the target layout, as shown in Fig. 2(b). The network regards the mask and the corresponding print image as the input data pair, which is passed to the encoder to obtain latent variables that follow a Gaussian distribution. Then, the latent variables are sampled from the Gaussian distribution and processed by the decoder to obtain the output data pair. First, the VAE network is trained to learn the relationship between the mask patterns and their corresponding print images. This is implemented by minimizing the distance between the input of the encoder and the output of the decoder. Once the network is trained, any new latent variable sampled from the Gaussian distribution is mapped by the decoder to a new data pair. After that, the optimal latent variable is found by minimizing the error between the print image and the target layout, and the optimized mask is then calculated from the optimal latent variable. Figure 2(c) shows the optimized mask and the corresponding print image obtained by the VAE method. Although the VAE improves the efficiency of the mask design process, it still requires a large amount of training data to be collected through other methods.
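The latent-variable search in the last step can be illustrated with the following sketch. It is not the implementation of [26]: the decoder here is a hypothetical stand-in with random weights rather than a trained network, and the optimal latent variable is found by plain random sampling, which is an assumption because the search strategy is not specified above. The sketch only shows the data flow, in which each latent sample is decoded into a (mask, print image) pair and the sample whose print image is closest to the target is kept.

```python
import numpy as np

def make_toy_decoder(latent_dim, n_pixels, rng):
    """Hypothetical stand-in for a trained VAE decoder (random weights)."""
    w_mask = rng.normal(size=(n_pixels, latent_dim))
    w_print = rng.normal(size=(n_pixels, latent_dim))

    def decode(z):
        # The decoder outputs a data pair: mask and predicted print image.
        mask = 1.0 / (1.0 + np.exp(-(w_mask @ z)))
        printed = 1.0 / (1.0 + np.exp(-(w_print @ z)))
        return mask, printed

    return decode

def search_latent(decode, target, latent_dim, n_samples, rng):
    """Keep the latent sample whose decoded print best matches the target."""
    best_z, best_err = None, np.inf
    for _ in range(n_samples):
        z = rng.standard_normal(latent_dim)  # sample from N(0, I)
        _, printed = decode(z)
        err = float(np.sum((printed - target) ** 2))
        if err < best_err:
            best_z, best_err = z, err
    best_mask, _ = decode(best_z)
    return best_z, best_mask, best_err

rng = np.random.default_rng(0)
side, latent_dim = 16, 8
target = np.zeros((side, side))
target[4:12, 6:10] = 1.0

decode = make_toy_decoder(latent_dim, side * side, rng)
best_z, best_mask, best_err = search_latent(
    decode, target.ravel(), latent_dim, n_samples=200, rng=rng
)
```

Because the search runs in a low-dimensional latent space rather than over all mask pixels, each candidate is cheap to evaluate, but the quality of the result is bounded by what the decoder has learned from the training set.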

ILT using GAN
In 2018, Yang et al. proposed a mask optimization method, also called optical proximity correction (OPC), based on a generative adversarial network model to improve the imaging performance of the lithography system [27]. This method modifies the conventional GAN generator by using an autoencoder, which is composed of an encoder and a decoder, as shown in Figs. 3(a) and 3(b). The modified generator can learn the nonlinear mapping between the target layout and the OPC solution. The discriminator in Fig. 3(c) is responsible for distinguishing the true dataset from the OPC solutions emulated by the generator. In order to further improve the prediction capacity of the generator, the GAN-OPC network is pre-trained by an ILT-guided method. After the network is trained, the OPC solution can be obtained by inputting the target layout into the generator, followed by a refinement process via a gradient-based ILT method, as shown in Fig. 4(a). Figure 4(b) illustrates the simulation results of the GAN-OPC method.

ILT based on Model-Driven Deep Learning
The two methods described in Section 2 were migrated from the standard deep learning networks, which were modified slightly to adapt to mask optimization problem. This section describes the ILT methods developed recently based on a new kind of deep learning approach called MCNN.

Model-Driven Convolution Neural Network
In 2018, Ma et al. introduced the principles of MCNN to the computational lithography realm. The MCNN was used to provide an initial guess of the ILT solution for a given layout pattern, and then the steepest descent (SD) algorithm was used to refine the mask pattern and further reduce the lithography image distortion [5]. The MCNN network is not inherited from an existing deep learning architecture, but is derived from a general inverse optimization model. In addition, MCNN provides a systematic initialization method for the network parameters based on the mathematical model of the optimization problem. That is where the network's name came from. In this approach, the network structure of MCNN was constructed by unfolding and truncating the SD-ILT algorithm [28], as shown in Figs. 5(a) and 5(b). The network parameters were systematically initialized according to the imaging model of the lithography system. As shown in Fig. 5(c), the lithography imaging model was used as a decoder, which allows the MCNN to be trained in an unsupervised manner. The unsupervised training method avoids the time-consuming labelling process that plagues many machine learning architectures. The MCNN leads to much faster convergence than the SD algorithm, and its pattern error (PE) is smaller, as shown in Fig. 6.
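The unfolding idea can be sketched as follows, under simplifying assumptions that are not taken from [5]: a coherent imaging model with a Gaussian kernel, a sigmoid resist model, and layers that each apply one SD update using their own kernel and step size. Each layer starts from the imaging-model kernel, which mimics the model-driven initialization; in a real network these per-layer parameters would then be trained, a step that is omitted here. The unsupervised loss passes the network output through the lithography model, which acts as the decoder, and compares the result with the target, so no labelled masks are needed. All names and numerical values are hypothetical.

```python
import numpy as np

def kernel_fft(shape, size=11, sigma=2.0):
    """FFT of a centred Gaussian kernel (hypothetical imaging model)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    k /= k.sum()
    padded = np.zeros(shape)
    padded[:size, :size] = k
    padded = np.roll(padded, (-(size // 2), -(size // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def conv(x, k_fft):
    return np.real(np.fft.ifft2(np.fft.fft2(x) * k_fft))

def litho_decoder(mask, k_fft, a=25.0, t_r=0.3):
    """Imaging model: returns the print image and the optical field."""
    field = conv(mask, k_fft)
    z = 1.0 / (1.0 + np.exp(-a * (field**2 - t_r)))
    return z, field

def init_layers(shape, n_layers, step=0.5):
    """Model-driven initialisation: every layer starts from the true kernel."""
    return [{"k_fft": kernel_fft(shape), "step": step} for _ in range(n_layers)]

def initial_theta(target):
    return np.where(target > 0.5, np.pi / 5, 4 * np.pi / 5)

def mcnn_forward(target, layers, a=25.0, t_r=0.3):
    """Truncated, unfolded SD iterations; one layer equals one update."""
    theta = initial_theta(target)
    for layer in layers:
        k = layer["k_fft"]  # per-layer parameter (trainable in principle)
        mask = 0.5 * (1.0 + np.cos(theta))
        z, field = litho_decoder(mask, k, a, t_r)
        g = 4.0 * a * (z - target) * z * (1.0 - z) * field
        grad_theta = -0.5 * np.sin(theta) * conv(g, k)
        theta = theta - layer["step"] * grad_theta
    return 0.5 * (1.0 + np.cos(theta))

def unsupervised_loss(mask, target, k_fft_true):
    """Decoder-based loss: no ground-truth mask labels are required."""
    z, _ = litho_decoder(mask, k_fft_true)
    return float(np.sum((z - target) ** 2))

target = np.zeros((64, 64))
target[24:40, 16:48] = 1.0
layers = init_layers(target.shape, n_layers=5)
mask_out = mcnn_forward(target, layers)
loss = unsupervised_loss(mask_out, target, kernel_fft(target.shape))
```

Since the number of layers is fixed and small, a forward pass costs only a handful of convolutions, in contrast to the many iterations of a conventional SD loop.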

Dual-Channel Model-Driven Deep Learning
Recently, Ma et al. generalized the prior MCNN method to a dual-channel model-driven deep learning (DMDL) method. The DMDL approach outperforms traditional ILT algorithms in terms of both computational efficiency and image fidelity [29]. Similar to MCNN, the network structure of DMDL is derived from the SD-ILT model, as shown in Figs. 7(a) and 7(b). However, the DMDL approach formulates the mask pattern as the superposition of MFs and SRAFs. Thus, the DMDL network divides the data flow into two parallel channels, as shown in Fig. 7(b), which are used to predict the optimization results of the MFs and SRAFs, respectively. Therefore, the DMDL method can successfully insert SRAFs on the mask to improve the image fidelity of the lithography system. As shown in Fig. 7(c), the DMDL approach uses an unsupervised training strategy, where the lithography imaging model serves as the decoder. In addition, the DMDL can effectively alleviate the vanishing gradient problem and extend the depth of the network, which greatly improves its prediction capacity. It was shown that for simple layout patterns, the DMDL method was capable of obtaining the ILT solution directly, without the subsequent refinement process. As shown in Fig. 8, the DMDL approach can achieve higher image fidelity compared to the traditional SD method.
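The superposition of the two channels can be illustrated with the toy example below. It does not reproduce the DMDL network of [29]: the two channel outputs are fixed, hand-drawn arrays standing in for network predictions, the way they are combined (sum followed by clipping) is an assumption, and the imaging model is a coherent system with a hypothetical Gaussian kernel and a constant resist threshold. The example only shows how an MF channel and an SRAF channel are merged into one mask and passed to the imaging model.

```python
import numpy as np

def gaussian_kernel_fft(shape, size=11, sigma=2.0):
    """FFT of a centred Gaussian kernel (hypothetical imaging model)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    k /= k.sum()
    padded = np.zeros(shape)
    padded[:size, :size] = k
    padded = np.roll(padded, (-(size // 2), -(size // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def aerial_image(mask, k_fft):
    """Coherent aerial image |h * M|^2."""
    field = np.real(np.fft.ifft2(np.fft.fft2(mask) * k_fft))
    return field**2

def superpose(mf, sraf):
    """Combine the two channel outputs into a single mask (assumed rule)."""
    return np.clip(mf + sraf, 0.0, 1.0)

shape = (64, 64)
# Stand-in for the MF channel output: one vertical line, 8 pixels wide.
mf = np.zeros(shape)
mf[16:48, 28:36] = 1.0
# Stand-in for the SRAF channel output: two narrow bars beside the line.
sraf = np.zeros(shape)
sraf[16:48, 22:24] = 1.0
sraf[16:48, 40:42] = 1.0

k_fft = gaussian_kernel_fft(shape)
threshold = 0.3  # hypothetical constant resist threshold

aerial_without = aerial_image(mf, k_fft)
aerial_with = aerial_image(superpose(mf, sraf), k_fft)
printed = aerial_with > threshold
```

In this toy configuration the narrow bars stay below the print threshold, so they do not appear on the wafer, while they slightly raise the intensity at the edge of the main feature. Keeping the two kinds of features in separate channels lets each channel be parametrized and constrained according to its role.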

Conclusion and Discussion
This paper briefly described the concepts of computational lithography, and then reviewed the development of some ILT algorithms, with a focus on the fast ILT methods based on deep learning. Due to the length limitation, only a set of representative methods was introduced. Deep learning brings opportunities for advancing novel computational lithography methodologies. In the future, different deep learning frameworks may be introduced and applied to solve computational lithography problems, including ILT and source-mask optimization (SMO). How to exploit the synergy between the existing deep learning approaches and model-based deep learning approaches may be an interesting topic to study. In addition, several other aspects may have considerable impacts on the applications of those deep learning methods, including the generation of reliable training data sets or sample libraries, as well as effective and efficient training methods.