Automated optimized overlay sampling for high-order processing in double patterning lithography
Chiew-seng Koay (a), Matthew E. Colburn (a), Pavel Izikson (b), John C. Robinson* (c), Cindy Kato (d), Hiroyuki Kurita (d), Venkat Nagaswami (c)
(a) IBM Corp., 257 Fuller Road, Albany NY 12303 USA; (b) KLA-Tencor Israel, 1 Halavian Street, Migdal Ha'Emak 23100, Israel; (c) KLA-Tencor Corp., One Technology Drive, Milpitas, CA 95035 USA; (d) KLA-Tencor Japan Ltd., 134 Godo-Cho, Hodogaya-Ku, Yokohama, 240-0005 Japan
ABSTRACT
A primary concern when selecting an overlay sampling plan is the balance between accuracy and throughput. Two significant inflections in the semiconductor industry require even more careful sampling consideration: the transition from linear to high-order overlay control, and the transition to double patterning lithography (DPL) processes. To address the sampling challenges, an analysis tool in KT Analyzer has been developed to enable quantitative evaluation of sampling schemes for both stage-grid and within-field analysis. Our previous studies indicated (1) the need for fully automated solutions that take individual interpretation out of the optimization process, and (2) the need for improved algorithms for this automation; both are described here.
Keywords: overlay, metrology, sampling, double patterning
1. INTRODUCTION
Sampling has always been a key topic for metrology, with the primary focus being the balance between data accuracy and cycle time [1,2]. In high-end semiconductor manufacturing, as design rules shrink, the accuracy requirements become ever more stringent. Sampling garners considerable attention during major industry inflection points. In the past these inflection points have included the transitions from aligners to steppers, from steppers to scanners, and between wafer diameters. Currently the industry is undergoing a lithographic inflection from linear control to high-order (HO) control, covering both inter-field and intra-field control [3]. This work addresses overlay metrology and the transition to high-order control. Historically, sampling analysis has been a manual and difficult process that often involves engineering judgment by individuals. This work evaluates a fully automated software tool in KLA-Tencor Corp.'s KT Analyzer to address sampling requirements. As was reported previously, a key requirement is an automated tool that reduces or eliminates the need for individual judgment calls [2,4].
1.1 Design of Experiment
To perform the sampling analysis, a super-set of data was required that contained a high number of overlay data points corresponding to intra-field and inter-field targets. For this, a DPL mask-set was generated which contained test patterns and various inter-field and intra-field alignment and registration marks. The mask-set consisted of a Zero Layer (L0) and a test pattern split into two separate masks (L1 and L2). The zero layer mask was generated using a Pattern Generator, and L1 and L2 were generated using an e-beam system at a third-party mask shop. L1 contained one portion of the alignment mark and L2 contained the remaining portion, color coded in green and red respectively in Fig. 1.
The Test Pattern on the mask contained an 11 x 14 array of cells (rows by columns), and each cell contained various Overlay Registration targets, over a field size of approximately 24 x 30 mm. In each field, 49 locations were available for measurement. All possible exposure fields (71) were measured, resulting in approximately 3500 data points (intra-field and inter-field) per wafer. Five wafers were used in this experiment (referred to as W21 to W25 later in the paper) to account for wafer-to-wafer variability. The wafers were exposed using a DPL Litho-Freeze-Litho process by IBM at Albany [5,6]. From this super-set of data (henceforth referred to as 'omniscient'), it was possible to select a desired number of data points in a step-wise fashion (e.g. 6, 12, 20, 27, 35, etc.) and test the model accuracy against the maximum available data.
*John.Robinson@KLA-Tencor.com; phone 1 512-231-4221
Metrology, Inspection, and Process Control for Microlithography XXIV, edited by Christopher J. Raymond, Proc. of SPIE Vol. 7638, 76381R · © 2010 SPIE · CCC code: 0277-786X/10/$18 · doi: 10.1117/12.846371
Proc. of SPIE Vol. 7638 76381R-1 Downloaded from SPIE Digital Library on 01 Apr 2010 to 192.146.1.254. Terms of Use: http://spiedl.org/terms
Figure 1: Archer™ AIM overlay target for L1 and L2 masks
1.2 Analysis Flow
The analysis used in this publication makes use of KT Analyzer Version 7 software, in particular the High Order Correction (HOC) option. The analysis flow is as follows. Intra-field sampling, inter-field sampling, or both can be considered. First, it is necessary to have an estimated overlay residual error from historical data. Other inputs are the required overlay specifications, the degree of the desired model (linear, 3rd, 5th, etc.), the maximum sample size, and the wafer and/or field test location geometry. In addition, it is necessary to decide the relative weighting of model uncertainty versus throughput to establish a proper cost function. More formally, the cost function is expressed as
Cost Function = Normalized Sample Size + w × Normalized Uncertainty
where the normalized sample size and the normalized uncertainty are both in the range of 0 to 1. The weighting factor w can be any positive real number. It controls the relative importance of model uncertainty and throughput.
Once the inputs have been gathered, one chooses an algorithm to generate the optimal sample plan. The algorithms available in the software, in order of increasing accuracy, are:
(1) Fixed Uncertainty, which calculates the optimal sample plan as the one with the smallest size whose maximum uncertainty is held below a specified threshold value;
(2) Optimized Uncertainty, which calculates the optimal sample plan as the one with the minimum cost, where the cost consists of a sample size component and a maximum uncertainty component. The goal is to drive the cost as low as possible;
(3) Optimized Monte Carlo, which also calculates the optimal sample plan as the one with the minimum cost, but with a cost function that consists of the sample size component and a Monte Carlo Fit component.
Algorithms (2) and (3) control the ratio of the two components with the weighting factor.
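As an illustration of the cost function, the following minimal Python sketch selects the candidate plan with the lowest cost for several weights. It is not the KT Analyzer implementation: the normalization (dividing by the largest candidate value), the function names cost and pick_plan, and the 1/sqrt(n) uncertainty curve are all assumptions made for this example.

```python
def cost(size, uncertainty, max_size, max_uncertainty, w):
    """Cost = normalized sample size + w * normalized uncertainty.

    Normalizing by the largest candidate value is an assumption; the
    source only states that both terms lie in the range 0 to 1.
    """
    return size / max_size + w * uncertainty / max_uncertainty


def pick_plan(candidates, w):
    """Return the (size, uncertainty) candidate with the lowest cost.

    candidates: list of (sample_size, max_uncertainty_nm) pairs.
    """
    max_size = max(n for n, _ in candidates)
    max_unc = max(u for _, u in candidates)
    return min(candidates, key=lambda c: cost(c[0], c[1], max_size, max_unc, w))


# Hypothetical uncertainty curve: uncertainty falls roughly as 1/sqrt(n).
candidates = [(n, 5.0 / n ** 0.5) for n in range(6, 72)]

for w in (0.5, 1, 2, 5):
    n, u = pick_plan(candidates, w)
    print(f"w={w}: {n} fields, uncertainty {u:.2f} nm")
```

With this kind of curve, a larger weight shifts the minimum toward larger sample sizes, which is the qualitative behavior the weighting factor is meant to provide.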
The outputs from each optimal sample plan include recommendations on the number of fields and/or sites per field, as well as the recommended locations of those fields and/or sites.
1.3 Methodology
Analyses for the inter-field and intra-field samplings were treated separately, and are discussed in Sections 2 and 3. In each analysis the weighting factor was varied between 0.5 and 5, and an optimal sample plan was calculated for each weight. Next, the optimum sample was compared against the omniscient sample by taking the difference in their model-predicted overlay. A model-predicted overlay from a sample is meant to mathematically represent the overlay fingerprint throughout the wafer. The omniscient sample provides the most complete representation, which any other sample plan can only approximate. The residuals between the models were also compared. Since the omniscient sample was available, the residuals were calculated using all locations rather than the optimum sample alone. This allows direct comparison relative to the omniscient performance. We also compared the correctables between various samplings. To make the comparisons between different correctables more meaningful, the correctables were converted into nanometers of displacement at the edge of the field or wafer, thereby providing a direct comparison in a technologically meaningful metric. Further, a combined simultaneous intra-field and inter-field sampling optimization was performed to demonstrate that the results are similar to those of the independent intra-field and inter-field analyses. Finally, geometrically defined sampling plans were compared to optimized sampling plans with equal numbers of fields and/or sites to demonstrate the advantage of the optimization.
2. INTER-FIELD SAMPLING
In the inter-field (or wafer grid) analysis, we address the question: what is the optimal number of fields on the wafer, and where should they be located? By varying the weighting factor, we can control the relative importance of model uncertainty and throughput. The sample size was used to quantify the throughput. The algorithm described in Sec. 1.2 calculates the optimal number of fields and their locations for a specified weight, as shown in Fig. 2. The optimal number of fields increased with the weight, which controls the importance of the model accuracy, resulting in decreased maximum uncertainty (MU). Beyond the point of diminishing returns, a large increase in the number of fields results in only a tiny improvement in MU.
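The diminishing return in MU can be illustrated with a deliberately simplified case. The sketch below assumes that uncertainty is measured as the least-squares prediction standard error, and it uses a one-dimensional straight-line model on a hypothetical set of 21 candidate positions. How KT Analyzer actually computes uncertainty is not stated in this paper, so the function names and the model are assumptions for illustration only.

```python
import math


def max_prediction_uncertainty(sample_x, eval_x, sigma=1.0):
    """Largest prediction standard error of a straight-line fit.

    For y = a + b*x fitted by least squares to the points sample_x with
    measurement noise sigma, the standard error of the fitted value at x
    is sigma * sqrt(1/n + (x - mean)^2 / Sxx).
    """
    n = len(sample_x)
    mean = sum(sample_x) / n
    sxx = sum((x - mean) ** 2 for x in sample_x)
    return max(sigma * math.sqrt(1.0 / n + (x - mean) ** 2 / sxx) for x in eval_x)


# Hypothetical 1-D "wafer": 21 candidate positions from -1 to 1 (normalized).
positions = [i / 10 - 1 for i in range(21)]


def evenly_spaced(k):
    """Pick k of the candidate positions, spread evenly from edge to edge."""
    return [positions[round(i * 20 / (k - 1))] for i in range(k)]


for k in (3, 5, 11, 21):
    mu = max_prediction_uncertainty(evenly_spaced(k), positions)
    print(f"{k} samples: max uncertainty {mu:.3f} (units of sigma)")
```

In this toy case the maximum uncertainty keeps falling as samples are added, but each additional sample buys less than the previous one.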
[Figure 2 panels, each plotting uncertainty against number of sampled fields for random and optimal sampling: Weight = 0.5, 12 fields, MU = 1.5 nm; Weight = 1, 20 fields, MU = 0.9 nm; Weight = 2, 27 fields, MU = 0.82 nm; Weight = 5, 35 fields, MU = 0.77 nm.]
Figure 2: Uncertainty versus number of sampled fields for various cost function weight factors. The green diamond represents the optimal sampling for a given weight. Also shown are the maximum uncertainty (MU), the recommended field sampling layout, and the number of sampled fields.
[Figure 3 panels: histograms of overlay deltas for 12, 20, 27, and 35 fields.]
3 Sigma Differences
Sampling (fields)    3 Sigma
12                   1.59
20                   1.06
27                   0.89
35                   0.86
Figure 3: The deltas between model predicted overlay based on omniscient sampling and the optimized cases shown in Figure 2.
Each of the optimum samples was also compared against the omniscient sample by taking the difference in their model-predicted overlay. The point-by-point overlay deltas for each weighting factor are plotted as histograms in Fig. 3. There is a clear trend between sample size and the width of the distribution: the larger the sample size, the more closely the optimal sample resembles the omniscient sample. A major improvement was achieved when the sample size increased from 12 to 20 fields. Only a small improvement was observed between 27 and 35 sampled fields. The histograms are also rather symmetric about zero, which is the mean delta.
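The comparison described above can be sketched in a few lines of Python. This is an illustration on synthetic data, not the analysis performed in the paper: the linear model dx = c0 + c1*x + c2*y, the noise level, the grid of field centers, and the use of random sub-samples in place of optimized ones are all assumptions.

```python
import random
import statistics


def solve(a, b):
    """Solve the square linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(points):
    """Least-squares fit of dx = c0 + c1*x + c2*y to (x, y, dx) points."""
    rows = [(1.0, x, y) for x, y, _ in points]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * p[2] for r, p in zip(rows, points)) for i in range(3)]
    return solve(ata, atb)


def predict(coef, x, y):
    """Model-predicted overlay at position (x, y)."""
    return coef[0] + coef[1] * x + coef[2] * y


# Synthetic "omniscient" data: field centers on a grid with a linear
# signature plus noise (all values hypothetical, in nm and mm).
rng = random.Random(0)
full = [(x, y, 2.0 + 0.01 * x - 0.02 * y + rng.gauss(0, 1.0))
        for x in range(-120, 121, 30) for y in range(-120, 121, 30)]

omniscient = fit(full)
for size in (6, 12, 20, 35):
    subset = rng.sample(full, size)  # random stand-in for an optimized plan
    coef = fit(subset)
    # Deltas are evaluated at every location, not only the sampled ones.
    deltas = [predict(coef, x, y) - predict(omniscient, x, y) for x, y, _ in full]
    three_sigma = 3 * statistics.pstdev(deltas)
    print(f"{size} fields: 3 sigma of delta = {three_sigma:.2f} nm")
```

Evaluating both models at all locations mirrors the approach in the text, where the sub-sample is judged against the full data set rather than against itself.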
[Figure 4 panels: Overall Residuals, X and Y.]
Figure 4: Comparison of residuals in X and Y for 5 wafers for the optimized sample plans and the full (omniscient) sample plan.
The residual comparisons in Fig. 4 are broken down by X and Y, and by wafer for the 5 wafers in the experiment. Again we see that 12-field sampling is not sufficient in most cases to characterize the overlay variation present in the omniscient sample. A comparison of overlay correctables between those calculated from the omniscient sample and the individual optimized sub-samplings is shown in Fig. 5; it is clearest when the results are converted into nm of translation at the edge of the wafer. As with the residuals, the biggest improvement is between 12 and 20 fields.
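The conversion of correctables into displacement at the wafer edge can be sketched as follows. The 150 mm radius (a 300 mm wafer), the coefficient units of nm/mm^order, the function name edge_displacement_nm, and the coefficient values are assumptions made for this example; the paper does not list them.

```python
WAFER_RADIUS_MM = 150.0  # assumed 300 mm wafer


def edge_displacement_nm(coefficient, order, radius_mm=WAFER_RADIUS_MM):
    """Displacement at the wafer edge produced by one polynomial term.

    coefficient is assumed to be in nm / mm**order, so the term
    coefficient * r**order evaluated at r = radius_mm is in nanometers.
    """
    return coefficient * radius_mm ** order


# Hypothetical deltas between a sub-sample fit and the omniscient fit.
delta_terms = [
    ("translation", 0.30, 0),       # nm
    ("magnification", 2.0e-3, 1),   # nm/mm, i.e. 2e-3 ppm
    ("2nd order", 1.0e-5, 2),       # nm/mm^2
    ("3rd order", 5.0e-8, 3),       # nm/mm^3
]

for name, coef, order in delta_terms:
    print(f"{name}: {edge_displacement_nm(coef, order):.3f} nm at wafer edge")
```

Expressing every term in nanometers at a common radius puts coefficients of different orders, which otherwise have incompatible units, on a single scale.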
[Figure 5 panels: (a) Linear Corrections, (b) 2nd Order Correctables, (c) 3rd Order Correctables, (d) All correctables converted into translation (nm) at the edge of wafer. Horizontal axis: Sampling.]
Figure 5: Comparison of overlay correctables for the various sampling plans, shown as deltas between the omniscient sample and the optimized sample plans. The horizontal axis is the sample size. (a), (b), and (c) compare the coefficients directly. (d) shows the deltas calculated at the edge of the wafer in nm.
3. INTRA-FIELD SAMPLING
Historically, intra-field sampling has been limited to only the 4 corners of the field, or possibly 5-site sampling. The recent trend in mask design has been to include increased numbers of overlay targets on the reticle to take advantage of the high-order overlay correction capability of immersion scanners. In this section, our analysis addresses the question: what is the optimal number of sites per field, and where should they be located? The methodology used for intra-field sampling is the same as in the previous section. Here three weighting factors were evaluated: 0.5, 1, and 2. The algorithm described in Sec. 1.2 calculates the optimal number of sites and their locations for a given weight, as shown in Fig. 6. The optimal number of sites increased with the weight, which controls the importance of the model accuracy, resulting in decreased maximum uncertainty (MU). The model-predicted overlay for the optimized sample was compared against the model from the omniscient sample. The histograms in Fig. 7 are overlay deltas between the two models. Their distributions are rather symmetric about zero. Also, the larger the sample, the narrower the distribution. This confirms the fundamental behavior of sampling: the larger the sample size, the more accurately it represents the population. In this evaluation, the model from a 22-site sample was able to provide a predicted overlay extremely close to that of the model from the omniscient sample.
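One reason 4-corner sampling falls short for high-order intra-field models is a simple term count. The sketch below assumes a full two-dimensional polynomial in the field coordinates, which has (n+1)(n+2)/2 coefficients per axis at order n. The exact set of terms used by the tool is not stated in this paper and may be a subset, so the numbers are an upper-bound illustration.

```python
def polynomial_terms(order):
    """Number of coefficients in a full 2-D polynomial of the given order."""
    return (order + 1) * (order + 2) // 2


for order, label in ((1, "linear"), (2, "2nd order"), (3, "3rd order")):
    n = polynomial_terms(order)
    print(f"{label}: {n} terms per axis, needs at least {n} sites")
# prints 3, 6, and 10 terms for the linear, 2nd order, and 3rd order models
```

Under this assumption a 4-corner sample can determine a linear model (3 terms per axis) but not a full 3rd order model (10 terms per axis), so the smallest usable high-order sample is already considerably larger than the traditional layouts.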
[Figure 6 panels, each plotting uncertainty against number of sampled sites per field for random and optimal sampling: Weight = 0.5, 12 sites, MU = 1.2 nm; Weight = 1, 19 sites, MU = 0.95 nm; Weight = 2, 22 sites, MU = 0.9 nm.]
Figure 6: Uncertainty versus number of sampled sites per field for various cost function weight factors. The green diamond represents the optimal sampling for the given case. Also shown are the maximum uncertainty (MU), the recommended site sampling layout within the field, and the number of sampled sites.
[Figure 7 panels: histograms of overlay deltas for 12, 19, and 22 sites.]
3 Sigma Differences
Sampling (sites)    3 Sigma (nm)
12                  1.0
19                  0.8
22                  0.4
Figure 7: The deltas between model-predicted overlay based on full "omniscient" sampling and the optimized cases shown in Figure 6.
[Figure 8 panels: Overall Residuals, X and Y.]
Figure 8: Comparison of residuals (3 sigma) in X and Y for 5 wafers for the optimized sample plans and the full sample plan. Improved accuracy is seen as sample size is increased; however, the amount of improvement is less obvious than in the inter-field case because even the minimum sample size is relatively large in order to support the high-order model.
[Figure 9 panels: (a) Linear Correctables, (b) 2nd Order Correctables, (c) 3rd Order Correctables, (d) All correctables converted into translation (nm) at the edge of the field. Horizontal axis: Sampling.]
Figure 9: Comparison of correctables for the various sampling plans, shown as deltas between the omniscient sample and the optimized sample. The horizontal axis is sample size. (a), (b), and (c) compare the coefficients directly. (d) shows the deltas calculated at the edge of the field in nm.
The residuals were compared as shown in Fig. 8. In each model the residuals are evaluated at all points of the omniscient set, and not just the subset of points in a given sampling, thus providing the best indication of model accuracy. The trend of decreased residuals with increased sample size is clear, though the improvement is smaller than in the inter-field case. Even the residuals from the smallest sample considered (12 sites) are significantly lower than those from the traditional 4-corner sample. A comparison of overlay correctables between those calculated from the omniscient sample and the individual optimized sub-samplings is shown in Fig. 9; it is clearest when the results are converted into nm of translation at the edge of the field.
4. OPTIMIZED VERSUS NON-OPTIMIZED INTRA-FIELD SAMPLING
In this section we compare an a priori selection of sample sites to the optimized layout for the same number of sites. This evaluation was carried out for 13 and 22 out of the 48 sites available in the field. First we note that the optimized layout, not surprisingly, differs from the very symmetric a priori selection, as shown in Fig. 10. As observed in the comparisons of model-predicted overlay in the previous sections, the larger the sample size, the more closely the model resembles that of the omniscient sample. The results in Fig. 11 show that, for the same sample size, the overlay model of the optimized layout is more accurate than that of the a priori layout. In this case the impact of an optimized layout is more evident for the large (22) than for the small (13) number of sites.
[Figure 10 panels: N=13 and N=22.]
Figure 10: Identical number of sites per field, optimized layout (upper) versus a priori layout (lower) for 13 and 22 sites per field out of a possible 48 available sites.
In the comparison of residuals (Fig. 12), we see improvement in the residuals for the increased sampling as well as for the optimized versus non-optimized layouts at a given sample size. Additionally, we examined the impact of high-order (HO) modeling as compared to linear modeling, and the interaction between sampling and accuracy (Fig. 13). The important conclusion is that HO modeling with realistic levels of sampling achieves results superior to linear modeling with even 100% sampling.
[Figure 11 panels: histograms of overlay deltas for OPTIMIZED and NON OPTIMIZED layouts at 12/13 sites and 22 sites.]
3 Sigma Differences
                 12/13 Sites    22 Sites
Optimized        1.0            0.4
Non Optimized    1.0            0.8
Figure 11: The deltas between model-predicted overlay based on the omniscient sample and the optimized and non-optimized (a priori) examples, as well as the associated 3 sigma values.
[Figure 12 panels: Intra-field Residuals, X and Y.]
Figure 12: Comparison of residuals in X and Y for 5 wafers of optimized and non-optimized (a priori) sampling proposals for 12 and 22 sites.
[Figure 13 panels: Residuals in X and Y, linear (left) and high order (right).]
Figure 13: Comparison of linear (left) and high order (right) intra-field residuals in X and Y and the impact of sampling with 5 wafers averaged.
5. CONCLUSIONS
In Sections 2 and 3 we described independent optimization and analysis of inter-field and intra-field sampling. An interesting question is whether simultaneous optimization of the combined inter-field and intra-field sampling would change our conclusions. We found that it does not change them in any appreciable way compared to the individual optimizations, and therefore treating the analysis in two separate stages is justified.
In this study, we have assessed the overlay residuals performance achieved using the automated sampling algorithms, as implemented in KLA-Tencor Corp.'s KT Analyzer. Performance of the optimized sampling was improved relative to an a priori sampling method, and the optimization improvements plateau beyond 20 fields. In our intra-field example, 22 sites per field are recommended for the conditions here. It was determined that intra-field and inter-field sampling optimization can be performed independently and achieve performance comparable to joint optimization. Optimized inter-field and intra-field sampling with high-order (HO) modeling significantly reduces residuals compared to linear modeling with even full sampling. The optimization routine can be used for establishing and maintaining baseline sampling plans, especially in the transition to high-order control schemes.
ACKNOWLEDGMENTS This work was performed by the Research Alliance Teams at various IBM Research and Development Facilities.
REFERENCES
[1] Bo Yun Hsueh, George K. C. Huang, Chun-Chi Yu, et al., "Sampling strategy: optimization and correction for high-order overlay control for 45nm process node," Proc. SPIE 7272, 727231 (2009)
[2] C. Kato, et al., "Sampling for advanced overlay process control," Proc. SPIE 7272, 727206 (2009)
[3] M. Adel, P. Izikson, D. Tien, et al., "The challenges of transitioning from linear to high-order overlay control in advanced lithography," Proc. SPIE 6827, 682722 (2007)
[4] Christian Sparka, Anna Golotsvan, Yosef Avrahamov, et al., "Automated overlay recipe setup in high-volume manufacturing: improving performance, efficiency, and robustness," Proc. SPIE 7272, 727232 (2009)
[5] Steven Holmes, Chiew-Seng Koay, Karen Petrillo, et al., "Engine for characterization of defects, overlay, and critical dimension control for double exposure processes for advanced logic nodes," Proc. SPIE 7273, 727305 (2009)
[6] V. Nagaswami, M. Colburn, et al., "Overlay components in Double Patterning Lithography," 6th International Symposium on Immersion Lithography and Extensions, Prague, October 2009.