Reflectivity metrics for optimization of anti-reflection coatings on wafers with topography
Mark D. Smith, Trey Graves, John Biafore, and Stewart Robertson
KLA-Tencor Corp, 8834 N. Capital of Texas Hwy, Suite 301, Austin, TX
Abstract
Anti-reflection coatings are commonly used in advanced photolithography to minimize CD variability caused by deviations in resist thickness and in the films and structures comprising the substrate. For a planar film stack, reflectivity calculations are a critical tool for optimization of parameters such as coating thicknesses and optical properties of anti-reflection coatings (TARCs and BARCs). However, with the exception of the first lithography layer, all layers on a production wafer have some degree of topography, so that reflectivity calculations for a planar film stack are not strictly correct. In this study, we evaluate three different reflectivity metrics that can be applied to wafers with topography: reflectivity for simplified planar film stacks, standing wave amplitude, and reflected diffraction efficiencies. Each of these metrics has a simple, physical meaning that will be described in detail in the presentation. We then evaluate how well these reflectivity metrics correlate with CD variability for two different example lithography steps: implant layers with STI (where a developable BARC might be used), and Litho-Etch-Litho-Etch style double patterning.
Advances in Resist Materials and Processing Technology XXVII, edited by Robert D. Allen, Mark H. Somervell, Proc. of SPIE Vol. 7639, 763935 · © 2010 SPIE · CCC code: 0277-786X/10/$18 · doi: 10.1117/12.846540
Proc. of SPIE Vol. 7639 763935-1 Downloaded from SPIE Digital Library on 26 Mar 2010 to 192.146.1.254. Terms of Use: http://spiedl.org/terms

Introduction
Anti-reflection coatings are typically used to make lithography processes more stable in the presence of process variations. The most common example is a CD swing curve, where reflections from the bottom surface of the resist film cause a sinusoidal variation in the printed CD as the resist thickness increases. A similar variation in the printed CD can also occur when the thickness of the layers under the resist changes, due to a change in the reflectivity of the resist-substrate interface. However, anti-reflection coatings do more than simply alleviate swing effects: they make the lithographic process as a whole more robust. The reason for this overall improvement is that the standing wave pattern caused by reflections competes with the contrast of the desired printed image. When the standing wave pattern is reduced or eliminated, imaging of the desired pattern is improved.

The standard method for optimization of an anti-reflection coating is to calculate the reflectivity. For a bottom anti-reflection coating (BARC), one would calculate the reflectivity of the resist-substrate interface, while for a top anti-reflection coating (TARC), one would calculate the reflectivity of the entire stack. One can then optimize the thickness of the anti-reflection coating by choosing an ARC thickness that minimizes the change in reflectivity in the presence of different process variations. For example, to optimize for variations in the thickness of an underlying layer, an optimal BARC thickness would minimize the change in the substrate reflectivity when the thickness of the underlayer deviates from its target thickness.

Calculation of the reflectivity is straightforward and very fast for planar film stacks. However, every layer except the first on a wafer has some degree of topography, so it is desirable to be able to perform a similar optimization for lithography process steps where the incoming wafer does not consist of a stack of planar films. Two examples that we will consider in this paper are an implant layer with STI and double patterning with a hard mask (litho-etch-litho-etch).

The STI process we will examine is similar to the one presented by Bailey et al. in reference 1, and shown in Figure 1. In their paper, Bailey et al. saw significant CD variability caused by variations in the oxide step height due to variations in the CMP process. They created a test structure that was an isolated trench printed over an isolated active region. This test structure was measured more than 1000 times to generate the histograms shown in the figure. This experiment was used to compare different anti-reflection strategies, such as TARC and developable BARC (dBARC). The histograms shown in the figure demonstrate that the CD variability for the dBARC process (7.4 nm 3σ) was much smaller than that for the TARC process (21 nm 3σ). Experiments such as these are very useful because they provide a direct measurement of the CD variability that should be minimized by the anti-reflection strategy. Unfortunately, this experiment requires a very large number of measurements to build statistically significant histograms, and many of these experimental histograms would have to be generated in order to optimize the dBARC thickness.

Figure 1: Experimental implant with STI test structure presented in reference 1. The CD of the isolated trench over the isolated active region was measured more than 1000 times to create the histograms shown on the right for resist with TARC and dBARC. The most significant process variation was reported as the variation in oxide step height due to variations in CMP.

A second example we will investigate is relevant to litho-etch-litho-etch type double patterning, where there will be an embedded hard mask structure during imaging of the second pass. A detailed study of a test structure for this type of process was presented by Robertson et al. in reference 2. The test structure was created by etching 40nm lines into silicon on a varying pitch. The target etch depth for this grating structure was 60nm. Next, a BARC was coated over the silicon gratings with a spin coating process that gives a nominal thickness of 26nm on a flat silicon wafer. Cross-sections were taken to determine the shape of the BARC top surface, as shown in Figure 3. The BARC-coated wafer was then coated with resist, and 40nm lines on a 130nm pitch were imaged. Finally, the CDs were measured to give the response shown in Figure 2. Here we see two undesirable interactions between the printed resist lines and the pitch of the wafer topography. First, there is a significant dip in the CD around a wafer topography pitch of 150nm. Second, the top-down SEM images show variations in the line width, which typically becomes smaller on top of the silicon lines and larger halfway in between. Again, it would be beneficial to optimize the film stack so that these interactions are minimized.

Figure 2: Double patterning test structure with resist lines printed at a constant pitch (130nm) over a grating structure with varying pitch, from reference 2. The grating was created by etching 40nm lines to a depth of 60nm into bare silicon. The grating was then coated with BARC with a spin coating process that would give a 26nm thickness on a flat wafer. The CD of the second patterning step varies as the pitch of the underlying topography is changed. (Figure is from reference 2.)

Figure 3: Cross-section of the wafer after coating with BARC with a spin coat process that gives a 26nm thickness on a flat silicon wafer (top of figure). The approximate location of the grating structure is shown by the white lines drawn on the image. The BARC coating was found to be about 10 to 15nm thick over the top of the gratings, and around 26nm thick in the spaces between each grating. The approximate BARC top surface used in the simulations is shown at the bottom of the figure. (The cross-section figure is from reference 2.)

The purpose of this investigation is to examine simulation methods that can help solve these problems by greatly reducing the number of required experiments. Ideally, there would be a reflectivity metric that can be applied to wafers with topography that is analogous to the classical reflectivity reported for wafers with planar film stacks. We will investigate three candidate metrics: reflectivity of simplified planar film stacks, standing wave amplitude, and reflected diffraction efficiencies. In Section 2 of this paper, we will define each of these metrics and explain their physical significance. We will then apply these metrics to the STI and the double patterning example problems in Section 3. Finally, in Section 4, we will give a summary of our findings and provide some conclusions.
Reflectivity Metrics for Wafers with Topography
The simplest reflectivity metric for a wafer with topography is to approximate the film stack as planar. For the STI example, we might simplify the actual topography into two stacks, one typical of the oxide region and a second typical of the active area, as shown in Figure 4. This approach is very appealing because it allows us to re-use the reflectivity metric that we use for planar film stacks. However, it is probably only useful far from the active area, because near the active area scattering will influence the image in the resist and ultimately the printed CD.

The second metric we will investigate is the standing wave amplitude. This metric was first used by Maaike Op de Beeck from IMEC [3] and later by Chris Mack [4]. The basic idea is to calculate the image in resist using the imaging conditions and mask that will be used for a target feature on the wafer. Next, the intensity data is extracted along a vertical slice at the nominal feature edge. If a standing wave is present, then the intensity plotted against depth into the resist will look like a swing curve. The standing wave amplitude metric is the amplitude of the swing divided by the average intensity in the slice (see Figure 5). One advantage of this metric is that the standing wave amplitude can be calculated at a specific location, so it can be used to examine the response of a specific, critical feature. This is different from the standard reflectivity metric, which is characteristic of the entire wafer and not localized at a specific location. One problem with this metric is that the standing wave amplitude may become very difficult to determine, or even meaningless, for very thin films where there is not an entire swing in the intensity.
Figure 4: Simplified planar film stacks for analysis of the STI example shown in Figure 1. For one simplified stack (top right) the wafer has an oxide layer with a thickness equal to the STI depth, and for the other film stack (bottom right) the wafer does not have any oxide. Both simplified film stacks have resist and dBARC layers.
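The reflectivity of simplified planar stacks like those in Figure 4 can be sketched with a standard thin-film characteristic-matrix calculation. The block below is a minimal normal-incidence version and is not the PROLITH implementation. The function names and every optical constant are illustrative placeholders, not values from this paper or from reference 1. The resist is treated as the incident medium, so the result is the reflectivity of the resist-substrate interface.

```python
import cmath
import math


def planar_reflectivity(n_incident, layers, n_substrate, wavelength_nm):
    """Normal-incidence intensity reflectivity of a planar film stack.

    layers is a list of (complex_index, thickness_nm) pairs ordered from the
    incident medium down to the substrate. Indices use the n + ik convention.
    """
    # Accumulate the product of the layer characteristic matrices, top to bottom.
    m11, m12, m21, m22 = 1 + 0j, 0j, 0j, 1 + 0j
    for n_layer, thickness_nm in layers:
        beta = 2 * math.pi * n_layer * thickness_nm / wavelength_nm
        c, s = cmath.cos(beta), cmath.sin(beta)
        a11, a12, a21, a22 = c, -1j * s / n_layer, -1j * n_layer * s, c
        m11, m12, m21, m22 = (
            m11 * a11 + m12 * a21,
            m11 * a12 + m12 * a22,
            m21 * a11 + m22 * a21,
            m21 * a12 + m22 * a22,
        )
    b = m11 + m12 * n_substrate
    c = m21 + m22 * n_substrate
    r = (n_incident * b - c) / (n_incident * b + c)
    return abs(r) ** 2


# Placeholder optical constants near 193 nm (assumed, NOT taken from the paper).
WAVELENGTH_NM = 193.0
N_RESIST = 1.70 + 0.02j
N_DBARC = 1.80 + 0.30j
N_OXIDE = 1.56 + 0.00j
N_SILICON = 0.88 + 2.78j


def stack_reflectivity(dbarc_nm, oxide_nm):
    """Resist / dBARC / (optional oxide) / silicon, mirroring the two Figure 4 stacks."""
    layers = [(N_DBARC, dbarc_nm)]
    if oxide_nm > 0:
        layers.append((N_OXIDE, oxide_nm))
    return planar_reflectivity(N_RESIST, layers, N_SILICON, WAVELENGTH_NM)


if __name__ == "__main__":
    # Spread of reflectivity over an assumed oxide thickness variation; a small
    # spread marks a dBARC thickness that is insensitive to that variation.
    for dbarc_nm in range(0, 101, 20):
        values = [stack_reflectivity(dbarc_nm, t) for t in (355.0, 360.0, 365.0)]
        spread = max(values) - min(values)
        print(f"dBARC {dbarc_nm:3d} nm: R = {values[1]:.4f}, spread = {spread:.4f}")
```

Calling `stack_reflectivity(dbarc_nm, 0.0)` gives the no-oxide stack. Oblique incidence and polarization are left out to keep the sketch short.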
Figure 5: Definition of the standing wave amplitude metric from an image in resist. A vertical slice through the image in resist is taken at the nominal feature edge, and then the amplitude of the “swing” is calculated. The standing wave amplitude is the ratio of the swing amplitude to the average intensity. This is analogous to the swing ratio for a CD swing curve.
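As a rough illustration of the definition in Figure 5, the sketch below computes the standing wave amplitude from a list of intensity samples taken along one vertical slice. It assumes that the "amplitude of the swing" means half the peak-to-peak range of the slice, a detail the text does not spell out. The helper `toy_slice` is a hypothetical, absorption-free intensity profile used only to exercise the function; a real slice would come from a simulated image in resist.

```python
import math


def standing_wave_amplitude(intensity_vs_depth):
    """Swing amplitude divided by mean intensity for one vertical slice.

    Assumption: the swing amplitude is half the peak-to-peak range of the
    slice. Bulk absorption is not separated from the standing wave here.
    """
    if len(intensity_vs_depth) < 2:
        raise ValueError("need at least two intensity samples")
    mean_intensity = sum(intensity_vs_depth) / len(intensity_vs_depth)
    swing = 0.5 * (max(intensity_vs_depth) - min(intensity_vs_depth))
    return swing / mean_intensity


def toy_slice(relative_swing, n_resist=1.7, wavelength_nm=193.0,
              periods=3, samples_per_period=100):
    """Hypothetical slice I(z) = 1 + a*cos(4*pi*n*z/lambda) over whole periods."""
    period_nm = wavelength_nm / (2.0 * n_resist)
    step_nm = period_nm / samples_per_period
    return [
        1.0 + relative_swing
        * math.cos(4.0 * math.pi * n_resist * k * step_nm / wavelength_nm)
        for k in range(periods * samples_per_period)
    ]


if __name__ == "__main__":
    print(standing_wave_amplitude(toy_slice(0.2)))
```

For a slice that covers less than one full swing, the maximum and minimum no longer bracket a complete period, which is the thin-film limitation noted in the text.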
The final metric we will investigate is the reflected diffraction efficiency. This metric is commonly used to describe gratings used in optical systems, and has also been applied to lithography by Shao et al. [5]. Here we calculate the reflected, scattered waves from the topography for an open frame exposure. The diffraction efficiency is the energy associated with each of these scattered waves. It is defined as:

$$R_i = \rho_i^{2}\,\frac{\cos\theta_i^{R}}{\cos\theta_{I}} \qquad (1)$$

where $\rho_i$ is the fraction of the incident electric field that is reflected with an angle $\theta_i^{R}$ (written $\theta_i$ below), and $\theta_I$ is the incident angle (written $\theta_{\mathrm{incident}}$ below). The relationship between the incident and reflected angles is given by:

$$\sin\theta_i = \frac{\lambda_0\, i}{nP} - \sin\theta_{\mathrm{incident}} \qquad (2)$$

The angles are numbered so that $i = 0$ corresponds to a reflected wave with the same angle as the incident wave. Reflected waves with a nonzero value of $i$ are scattered waves. We report three different types of reflected diffraction efficiencies. First, we can sum all of the reflected diffraction efficiencies to give the total reflected energy, $R_{\mathrm{All}}$. Second, we can report only $R_0$, which is the reflectivity if the scattered waves are ignored. Finally, we can report only the scattered waves, $R_{\mathrm{Scat}} = R_{\mathrm{All}} - R_0$.
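The bookkeeping in Equations (1) and (2) can be sketched as follows. This is only an illustration under stated assumptions: the field fractions passed in as `rho_by_order` are hypothetical inputs that would in practice come from a rigorous Maxwell solver, and `n_medium` is assumed to be the refractive index of the medium in which the reflected waves propagate, with $P$ taken as the topography pitch. Neither symbol is defined explicitly in the text.

```python
import math


def reflected_order_angles(wavelength_nm, n_medium, pitch_nm, theta_incident_rad):
    """Propagating reflected orders from Eq. (2): {order i: angle in radians}."""
    angles = {}
    # |sin(theta_i)| <= 1 bounds the order index by 2 * n * P / lambda.
    max_order = int(2.0 * n_medium * pitch_nm / wavelength_nm) + 1
    for i in range(-max_order, max_order + 1):
        sin_theta = (i * wavelength_nm / (n_medium * pitch_nm)
                     - math.sin(theta_incident_rad))
        if abs(sin_theta) <= 1.0:  # evanescent orders are skipped
            angles[i] = math.asin(sin_theta)
    return angles


def diffraction_efficiencies(rho_by_order, angles, theta_incident_rad):
    """Eq. (1) for each propagating order.

    abs() is applied to rho so that a complex field amplitude also works.
    """
    cos_incident = math.cos(theta_incident_rad)
    return {
        i: abs(rho) ** 2 * math.cos(angles[i]) / cos_incident
        for i, rho in rho_by_order.items()
        if i in angles
    }


def summarize(efficiencies):
    """Return (R_All, R_0, R_Scat) with R_Scat = R_All - R_0."""
    r_all = sum(efficiencies.values())
    r_0 = efficiencies.get(0, 0.0)
    return r_all, r_0, r_all - r_0


if __name__ == "__main__":
    # Assumed example: 193 nm light, index 1.7, 700 nm pitch, normal incidence.
    theta_inc = 0.0
    angles = reflected_order_angles(193.0, 1.7, 700.0, theta_inc)
    rho = {0: 0.15, 1: 0.05, -1: 0.05}  # made-up field fractions
    r_all, r_0, r_scat = summarize(diffraction_efficiencies(rho, angles, theta_inc))
    print(sorted(angles), r_all, r_0, r_scat)
```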
Example Optimizations for STI and Double Patterning
In the Introduction, we showed experimentally measured histograms from reference 1 for the implant with STI test structure. We generated similar histograms using PROLITH, as shown in Figure 6. Unfortunately, the precise settings for the test structure are not given in the IBM paper (reference 1), so we used estimated values for the dimensions of the mask and topography structures. We started with the simplified topography shown on the left side of Figure 4, with a resist thickness of 200nm, an STI depth of 350nm, and a target step height of 10nm. (This makes the total height of the oxide 360nm.) The width of the active area between STI structures was 100nm. We used the optical properties for the resist and BARC given in the IBM paper, but otherwise the resist model parameters were simply chosen to be “typical” for an ArF photoresist. The imaging conditions were 180nm trenches on a 700nm pitch with a 6% attenuated PSM, combined with an NA of 0.75 and a partial coherence of 0.6. To generate the histograms, we assumed that the step height varied from the target height with a Gaussian-shaped probability with σ = 5nm. The simulated results in Figure 6 look similar to the experimental results shown in Figure 1, so this appears to be a reasonable set-up for our example calculation.

Figure 6: PROLITH simulated histograms for the implant with STI test structure. The histogram on the left is without a dBARC and has 3σ = 19nm. The one on the right has 62nm dBARC with 3σ = 9.6nm.

We can now use these settings to generate a series of histograms, and plot the expected 3σ variation in the CD as a function of dBARC thickness, as shown in Figure 7. As shown in the figure, there are a few sharp minima, around 20nm, 50nm, and 80nm, with the overall trend that variability decreases as the dBARC thickness increases. Also shown in the figure are results for a 300nm trench on a 700nm pitch. There are also local minima in the CD variability for this feature, but they occur at different dBARC thickness values. This means that we must choose between a dBARC thickness that is a general solution for both features and a dBARC thickness that optimizes for a specific feature. For example, a general solution might be a dBARC thickness larger than 45nm, because both features then have less than 5nm 3σ variation. If we are only concerned with the 180nm structure, then we could choose a dBARC thickness of 20nm, which would give about 3nm 3σ variation, even though this is obviously not an optimal case for the 300nm structure.

Figure 7: Simulated CD variability as a function of dBARC thickness for two test structures: the 180nm trench, which is similar to the structure in reference 1, and a 300nm trench.

An ideal reflectivity metric would correlate well with the optimal dBARC thickness values found from the histograms. We start with the results for the simplified planar film stacks, as shown in Figure 8. We have calculated results for the simplified planar film stack with a homogeneous slab of oxide, and we have varied the oxide thickness from 355nm to 365nm, that is, the target thickness ±5nm, which is the expected variation. At dBARC thickness values around 58nm and 81nm, the reflectivity is approximately constant as the oxide thickness changes, so these are potentially good operating points according to the reflectivity metric. Also shown in the figure is the histogram data, and we see that these two operating points correlate fairly well with the histograms. However, we cannot use the reflectivity of a simplified planar film stack to find the local minima for the 180nm trench or for the 300nm trench. This is because the reflectivity metric is a global metric (reflectivity of the entire wafer) instead of a localized metric.

Figure 8: Reflectivity for simplified planar film stack and the histogram data for the implant with STI test structure. Reflectivity results are shown for oxide thicknesses of 355nm, 360nm, and 365nm. The arrows show dBARC thickness values where the reflectivity does not change much as the oxide thickness varies. Histogram data is shown for the 180nm and 300nm trench test structures.

The next metric is the standing wave amplitude. Again, we plot the metric and the histogram data together to see if there is correlation between the standing wave amplitude metric and the CD variability (see Figure 9). Note that this is a localized reflectivity metric because the intensity is extracted at the feature edge, which will be in a different location for the 180nm and 300nm test structures. In addition, the calculation is performed with the 180nm or 300nm mask. From the figure, we see good correlation between the standing wave amplitude metric and the CD variability for the 180nm test structure. A correlation for the 300nm test structure is also present, but it is not as good. It is important to notice that because this is a localized reflectivity metric, we get different results for the 180nm and 300nm test structures, and we are able to find local minima in the variability with this metric.

Figure 9: Standing wave amplitude and 3σ CD variability for the 180nm and 300nm implant with STI test structures. The arrows show dBARC thickness values where the standing wave amplitude does not vary much as the oxide thickness changes.

The final test metric is the reflected diffraction efficiencies. Results are shown in Figure 10. On the left side of the figure, we see the different reflected diffraction efficiency metrics along with the reflectivity for a simplified planar film stack. Note that the diffraction efficiencies are a property of the topography on the wafer and the incident angles from the illuminator, and are completely independent of the mask, so there is only a single set of results for both trench test structures. It is interesting that the curve for $R_{\mathrm{All}}$ has a shape that is very similar to the reflectivity for a simplified planar stack, but that the first minimum is shifted to thinner dBARCs. Also, $R_{\mathrm{All}}$ never goes to zero, which indicates that the scattering from the STI structure is always present and never fully suppressed by the dBARC. The results on the right side of the figure show $R_{\mathrm{All}}$ and $R_{\mathrm{Scat}}$ compared with the 3σ CD variability. Again, we find that we cannot locate the local minima for each test structure because the diffraction efficiencies are independent of the mask, but the scattered diffraction efficiency appears to correlate well with the overall trend in the CD variability for both the 180nm and 300nm test structures. For this example, a scattered diffraction efficiency less than 3% to 4% gives CD variability less than 5nm 3σ.
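The histogram-based 3σ CD variability used throughout the STI example can be sketched as a small Monte Carlo calculation: draw step heights from a Gaussian with the stated target (10nm) and σ (5nm), evaluate the printed CD for each draw, and report three standard deviations of the result. In the sketch below, `cd_model` stands in for a full lithography simulation. The sinusoidal `toy_cd_model` and its numbers are invented for illustration and do not reproduce any result in this paper.

```python
import math
import random
import statistics


def cd_three_sigma(cd_model, target_step_nm=10.0, sigma_step_nm=5.0,
                   n_samples=2000, seed=0):
    """3-sigma CD spread when the oxide step height is Gaussian-distributed.

    cd_model is any callable mapping step height (nm) to printed CD (nm).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    cds = [cd_model(rng.gauss(target_step_nm, sigma_step_nm))
           for _ in range(n_samples)]
    return 3.0 * statistics.pstdev(cds)


def toy_cd_model(swing_nm, period_nm=60.0, nominal_cd_nm=180.0):
    """Hypothetical sinusoidal CD response to step height (illustration only)."""
    def model(step_nm):
        return nominal_cd_nm + swing_nm * math.sin(
            2.0 * math.pi * step_nm / period_nm)
    return model


if __name__ == "__main__":
    # Pretend the CD swing shrinks as the dBARC gets thicker (made-up numbers).
    for dbarc_nm, swing_nm in [(0, 8.0), (30, 4.0), (60, 1.5)]:
        print(dbarc_nm, round(cd_three_sigma(toy_cd_model(swing_nm)), 2))
```

Repeating the calculation over a range of dBARC thicknesses gives a variability curve of the kind plotted in Figure 7, at the cost of one simulation per sample.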
Figure 10: Reflected diffraction efficiencies and 3σ CD variability for the STI test structures. On the left, the reflectivity for the simplified film stack is shown for comparison, along with the sum of all reflected diffraction efficiencies, the zero order reflected diffraction efficiency, and the scattered diffraction efficiency. Comparison between different diffraction efficiency metrics and the 3σ CD variability is shown on the right.
Figure 11: Results for the double patterning test structure. Shown on the left is the reflected diffraction efficiency and CD variability versus BARC thickness for a grating pitch of 150nm. CD variability is CDmax – CDmin, so it is a measure of the “waviness” of the printed line. Note that for the BARC thickness with the lowest CD variability, the BARC is approximately planarizing. Shown on the right is the CD versus topography pitch. The experimental data are identical to the results in Figure 2, but the simulated results are for a planarizing BARC.
The final example is the double patterning test structure shown in Figure 2. For this test structure, we will minimize the difference between CDmax and CDmin for a specific grating pitch value. We would like the printed line to be less “wavy” than it appears in Figure 2, so this is similar to minimizing reflective notching. Because we are coating the BARC over a grating structure with a height of 60nm, we need a model for the shape of the spin coated surface. A simple model has been implemented in PROLITH, where the inputs are an interaction length and a minimum thickness. We chose an interaction length of 50nm and a minimum thickness of 12nm, which reflects the minimum thickness observed over the top of the grating in the experimental cross-sections. A comparison between experimental cross-sections and the spin coat model is shown in Figure 3.

Results are shown for a grating pitch of 150nm on the left side of Figure 11. Here we see that the scattered diffraction efficiency correlates well with the general trend in the CD variability, and that the variability is minimized for a BARC thickness of around 50nm. This BARC thickness corresponds to a spin coat surface that is almost planarizing. If we use a planarizing BARC, and we simulate the CD versus wafer topography pitch, we obtain the results shown on the right half of Figure 11. From these two results, we conclude that the planarizing BARC minimizes both the waviness of the line and the variability in the CD vs. pitch interaction. This condition corresponds to a scattered diffraction efficiency of less than 1%.
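The waviness metric used for the double patterning example (CDmax minus CDmin along the printed line) and the choice of the BARC thickness that minimizes it can be written in a few lines. The CD profiles below are invented placeholders. In practice each profile would come from a simulation of the line printed over the grating.

```python
def waviness(cd_along_line_nm):
    """CD variability as defined for Figure 11: CDmax - CDmin along the line."""
    return max(cd_along_line_nm) - min(cd_along_line_nm)


def best_barc_thickness(cd_profiles_by_thickness):
    """Return the BARC thickness (dict key) with the smallest waviness."""
    return min(cd_profiles_by_thickness,
               key=lambda t: waviness(cd_profiles_by_thickness[t]))


if __name__ == "__main__":
    # Hypothetical CD samples (nm) along one line for three BARC thicknesses.
    profiles = {
        26: [36.0, 41.0, 44.0, 40.0, 35.5],
        40: [38.5, 40.5, 42.0, 40.0, 38.0],
        50: [39.6, 40.1, 40.4, 40.0, 39.5],
    }
    print(best_barc_thickness(profiles))
```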
Summary and Conclusions
We have analyzed three different reflectivity metrics that are suitable for wafers with topography. Reflectivity for simplified planar film stacks has the advantage that it can be calculated very quickly (less than 1 second), but it is a global metric: it cannot capture any interaction between the topography and the printed feature, and it cannot differentiate between different structures on the mask.

Second, we examined the standing wave amplitude. This metric takes several minutes to calculate because the Maxwell equations must be solved at the wafer in order to predict the image in resist. This metric correlated fairly well with the CD variability data for the 180nm trench, and less well with the CD variability data for the 300nm trench. It is a local metric, so it is able to distinguish between different critical features on the wafer.

Finally, we examined reflected diffraction efficiencies. This metric takes about the same amount of time to calculate as the standing wave amplitude, and we found reasonable correlation between the scattered part of the reflected waves and the CD variability. Like the reflectivity for simplified planar film stacks, reflected diffraction efficiencies are a global metric, so they can only be used to track trends typical of all of the features on the wafer instead of finding a specific solution for a single critical feature.

It seems that a combination of all three metrics might be useful, depending on the problem at hand. When scattering and reflective notching behavior is not important, reflectivity for simplified planar film stacks will probably give good results. Standing wave amplitude is a good local metric when scattering is important, and the scattered reflected diffraction efficiency is a good global metric.

Once an optimization has been performed with these reflectivity metrics, it is worthwhile to do a more detailed analysis by simulating the process variations directly. For the implant layer with STI, the process variations were characterized by generating histograms, and for the double patterning process, we characterized the waviness of the printed line and the CD through pitch behavior. Direct simulations like these take more time than evaluating the reflectivity metrics: a single histogram requires about 10 minutes on an 8-core 2.4 GHz desktop computer, and a single crossed grating structure requires about 10 to 20 minutes depending on the pitch of the grating. However, if the reflectivity metrics can be used to get very close to a good operating point, then perhaps only a few of the more complicated calculations will be required.
References
1. Bailey, McIntyre, Zhang, Deschner, Mehta, Song, Lee, Hu, and Brodsky, “Reflectivity-induced Variation in Implant Layer Lithography,” Proc. SPIE 6924, 69244F (2008).
2. Robertson, Reilly, Graves, Biafore, Smith, Perret, Ivin, Potashov, Silakov, and Elistratov, “Simulation of optical lithography in the presence of topography and spin coated films,” Proc. SPIE 7273, 727340 (2009).
3. Maaike Op de Beeck, personal communication (2005).
4. Mack, Fundamental Principles of Optical Lithography: The Science of Microfabrication, Wiley, New York (2008).
5. Shao, Evanschitzky, Fuhner, and Erdmann, “Efficient simulation and optimization of wafer topographies in double patterning,” JM3 (2009).