Calculation Corner
Thermal Facts & Fairy Tales
Technical Brief
Data Center Design:
Electro-Thermal Simulation of Power Semiconductor Devices
Reduce Energy and Maximize Performance through Improved Cooling
www.electronics-cooling.com
Is CFD a Must?
Contents

Editorial
Looking Back
Bob Simons

Thermal Facts and Fairy Tales
Heat Sinks, Heat Exchangers, and History
Jim Wilson

Technical Brief
Design Considerations for High Performance Processor Liquid Cooled Cold Plates
Michael J. Ellsworth, Jr. and Levi A. Campbell, IBM

Calculation Corner
Transient Modeling of a High-Power IC Package, Part 1
Bruce Guenin

Feature Articles
Efficient Electro-Thermal Simulation of Power Semiconductor Devices via Model Order Reduction
Tamara Bechtold, CADFEM GmbH; Torsten Hauck, Freescale Semiconductor Inc.; Leon Voss, ANSYS Germany GmbH; E. B. Rudnyi, CADFEM GmbH

The Increasing Challenge of Data Center Design and Management: Is CFD a Must?
Mark Seymour, Christopher Aldham, Matthew Warner, Hassan Moezzi, Future Facilities

Energy Reduction and Performance Maximization through Improved Cooling
David Copeland, Oracle

Index of Advertisers
Editorial Robert Simons
Editor-in-Chief, December 2011 Issue
Looking Back
In the editorial in the fall issue of ElectronicsCooling, my good friend and colleague Clemens Lasance wrote, “It is the privilege of an editor to discuss any subject that comes to mind, as long as it is (remotely) linked to the theme of the journal.” So, as the editor responsible for this issue of the magazine, I want to avail myself of this privilege. As I contemplate what I want to write, it is with the realization that when you read these words, we will have come to the end of another year. Even more significantly for me, it is also with the realization that this will be my last opportunity to communicate with you, the readers, from this forum. This is because, after much deliberation, I have decided to leave ElectronicsCooling.

Naturally, as I progressed through my decision process, my thoughts turned back to the many years that I have been associated with the magazine. I first joined the editorial staff in December 2000 at the invitation of the then editor-in-chief, Kaveh Azar. Although I have served 11 years as an Associate Technical Editor, in a sense my association with the magazine goes back as much as 15 years, to when I contributed my first article for the May 1996 issue. I still remember the article very well; its title was “Direct Liquid Immersion Cooling for High Density Micro-electronics” and it drew upon my many years of experience in this area at IBM. For those who may be interested, the article is still available on the Electronics-Cooling website by searching on “direct liquid immersion simons”. So, I owe a debt of gratitude to my long-time friend Kaveh for his invitation to submit that first article (as well as some subsequent articles) and especially for inviting me to join the editorial staff.
Many of you may be aware from my Calculation Corner article (“A Useful Catalog of Calculation Corner Articles”) in the September 2011 issue that I have shared the writing of this column with my friend and colleague Bruce Guenin almost since I first joined the editorial staff. I have found it both challenging and interesting to come up with suitable analysis topics that I thought would be of interest to you, the reader, and amenable to explanation in 2 to 3 pages. I hope that you have found them as interesting to read as they have been for me to prepare and write about. I have always derived a great sense of satisfaction when, at a conference such as SEMI-THERM, a reader would introduce himself or herself to me and let me know how much they liked the Calculation Corner articles and how useful they found them. I want to sincerely thank those readers for having done so. I also hope that their comments to me were representative of the vast majority of readers I have not had the pleasure to meet.

As you may have gathered from my words so far, I have truly enjoyed my years of working on the magazine and I have derived a true sense of accomplishment from the fruits of our (i.e., my co-editors' and my own) labor. I particularly enjoyed my close working relationship with my fellow editors (Clemens Lasance and Bruce Guenin, whom I have already mentioned, and Jim Wilson) over all these years. I have known them not only through our association as co-editors, but even before that through the many SEMI-THERM Symposia we were involved in. I regard each of them not only as first class thermal professionals, but equally importantly as friends.

So, you may be wondering why I have chosen to leave at this time. As has been said by others, and in particular by Kaveh Azar when he gave up his position as editor-in-chief, “every good thing must come to an end”. My decision is one that has been brewing for some time now. I find that between my engineering work and the other activities I would like to pursue during my leisure time, I simply no longer have the amount of time to devote to the magazine that the position of associate technical editor warrants. In case you are wondering who will take over my position on the editorial staff, it has not yet been decided. However, I have made a recommendation of someone with excellent qualifications. In any event, I am confident that whoever ITEM Publications and my co-editors choose will do a first class job in continuing the fine reputation ElectronicsCooling magazine has earned over the years.

In closing, I want to extend my personal thanks to you, our readers. It is your interest and support that have helped ElectronicsCooling to become the premier publication of the electronics cooling community it is today. In particular, I want to single out for recognition and thanks the many readers over the years who have contributed articles for publication in the magazine. It is these articles that make up the content and substance of the magazine. So, thank you all!
www.electronics-cooling.com

Editorial Board
Associate Technical Editors

Bruce Guenin, Ph.D.
Principal Hardware Engineer, Oracle
bruce.guenin@oracle.com

Clemens Lasance, Ir
Principal Scientist - Retired
Consultant at SomelikeitCool
lasance@onsnet.nu

Robert Simons
Senior Technical Staff Member - Retired, IBM
resimons@att.net

Jim Wilson, Ph.D., P.E.
Engineering Fellow, Raytheon Company
jsw@raytheon.com
Published by
ITEM Publications 1000 Germantown Pike, F-2 Plymouth Meeting, PA 19462 USA Phone: +1 484-688-0300 Fax: +1 484-688-0303 info@electronics-cooling.com www.electronics-cooling.com
Publisher
Paul Salotto psalotto@electronics-cooling.com
Content Manager
Sarah Long editor@electronics-cooling.com
Business Development Manager Bob Poust bpoust@electronics-cooling.com
Reprints
Reprints are available on a custom basis at reasonable prices in quantities of 500 or more. Please call +1 484-688-0300.
Subscriptions
Subscription for this quarterly publication is FREE. Subscribe online at: www.electronics-cooling.com All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, or stored in a retrieval system of any nature, without the prior written permission of the publishers (except in accordance with the Copyright Designs and Patents Act 1988). The opinions expressed in the articles, letters and other contributions included in this publication are those of the authors and the publication of such articles, letters or other contributions does not necessarily imply that such opinions are those of the publisher. In addition, the publishers cannot accept any responsibility for any legal or other consequences which may arise directly or indirectly as a result of the use or adaptation of any of the material or information in this publication. ElectronicsCooling is a trademark of Mentor Graphics Corporation and its use is licensed to ITEM. ITEM is solely responsible for all content published, linked to, or otherwise presented in conjunction with the ElectronicsCooling trademark.
Produced by ITEM Publications
thermal facts and fairy tales
Heat Sinks, Heat Exchangers, and History Jim Wilson
Associate Technical Editor
As far as history goes, the field of electronics cooling does not have a very long past. A quick look through my personal reference material geared strictly towards cooling electronics turned up, as the earliest items, some US Navy documents from the 1950s [1]. Comparing the solution techniques available today to those available then shows that we have both much better tools and harder problems (although I still like to refer to the suggested heat transfer coefficient value of ~10 W/m²·K for natural convection when not much else is known). Perhaps because of the shorter history, and because engineers working in this field are always exposed to the latest and greatest electronics, the electronics cooling community sometimes doesn’t venture out and learn from related fields. When we start solving problems without doing significant research, we can live in a fairy tale world where we think that our problems are strictly unique to us. There can be a benefit in taking some time to examine the past and finding out that other smart engineers have often looked at similar problems and may have relevant information that would help our understanding.

The use of heat exchanger theory provides a good example where there is a possible benefit from thinking about the problems we are solving from more than one viewpoint. Consider the simple illustration of a heat sink shown in Figure 1. Heat sink suppliers and designers, especially in air cooled electronics, like to use a thermal resistance type description such as

$$R_{th} = \frac{1}{hA} = \frac{T_{surf} - T_{cool\text{-}in}}{q}$$
Since the resistance can vary with the coolant velocity, information about how Rth varies with velocity may be provided. The coolant temperature of reference is the inlet temperature. While this approach is convenient, it gives little incentive to think about using a minimum amount of coolant; other constraints, such as noise and the prime power needed to move the coolant, may dictate the flow rate. Engineers who come from an avionics background typically consider that the coolant will change temperature as the waste heat is added. Often, the coolant flow rate is specified in terms of flow rate per kW of heat such that the temperature rise of the coolant from inlet to exit is constant for different electronic assemblies. The mindset is to use the coolant as efficiently as possible because additional coolant is either not possible or very expensive.

Figure 1. Simple heat sink.

One of the potential benefits of applying heat exchanger theory to electronics cooling is that it can provide one way of looking at efficiency. Moffat [2] provides a good discussion on heat exchanger theory applied to air cooled heat sinks, and general heat exchanger theory can be found in most heat transfer textbooks [3]. While the theory was mostly developed to deal with two fluids, a bounding case where one of the fluid temperatures does not change (such as with condensation or evaporation) simplifies the equations and can be representative of heat sinks used to cool electronics. A uniform temperature for Tsurf is only an approximation but useful for illustration. Heat exchanger analysis frequently uses what is known as the effectiveness-NTU method, where the effectiveness represents the actual heat transfer divided by the maximum possible heat transfer. The NTU, or Number of Transfer Units, is a dimensionless parameter that relates the convective heat transfer resistance to the coolant flow heat capacity. While the details are beyond the scope of this short column, a typical heat sink (or cold plate) can be described with the following equations (assuming that the simplifying assumption of one surface temperature is reasonable).

1. Actual heat transfer = $C_{cool}\,(T_{cool\text{-}out} - T_{cool\text{-}in})$, where $C_{cool}$ is the mass flow times the heat capacity for the coolant
2. Maximum possible heat transfer = $C_{cool}\,(T_{surf} - T_{cool\text{-}in})$
3. Effectiveness = $E = (T_{cool\text{-}out} - T_{cool\text{-}in}) / (T_{surf} - T_{cool\text{-}in})$
4. Effectiveness = $E = 1 - \exp(-NTU)$, where $NTU = hA / C_{cool}$

At this point, someone used to using resistance based concepts for heat sinks might ask why go to all this trouble. The reason for thinking about the problem in these terms is the form of equation 4, which is shown graphically in Figure 2. Note that if we want to increase the effectiveness of our heat sink we need to increase NTU. One way to increase the NTU term is to decrease the coolant heat capacity (that is, reduce the coolant flow), but although the effectiveness increases, the resulting temperature for the surface may not be acceptable. The other way is to increase the hA term, which means either larger area or a higher effective heat transfer coefficient.
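Equations 1 to 4 above can be tried out numerically. The following is a minimal Python sketch under the same uniform-surface-temperature assumption; the function names and the numerical values (hA, coolant capacity rate, temperatures) are hypothetical and chosen only for illustration.

```python
import math


def ntu(h_a, c_cool):
    """Number of transfer units, NTU = hA / C_cool (dimensionless)."""
    return h_a / c_cool


def effectiveness(ntu_value):
    """Equation 4: E = 1 - exp(-NTU) for a uniform surface temperature."""
    return 1.0 - math.exp(-ntu_value)


def outlet_temperature(t_surf, t_in, eff):
    """Equation 3 rearranged: T_out = T_in + E * (T_surf - T_in)."""
    return t_in + eff * (t_surf - t_in)


def heat_transferred(c_cool, t_in, t_out):
    """Equation 1: q = C_cool * (T_out - T_in)."""
    return c_cool * (t_out - t_in)


# Hypothetical example values (not taken from the column).
H_A = 5.0        # heat transfer coefficient times area, W/K
C_COOL = 10.0    # coolant mass flow times specific heat, W/K
T_IN = 25.0      # coolant inlet temperature, deg C
T_SURF = 75.0    # uniform surface temperature, deg C

example_ntu = ntu(H_A, C_COOL)
example_eff = effectiveness(example_ntu)
example_t_out = outlet_temperature(T_SURF, T_IN, example_eff)
example_q = heat_transferred(C_COOL, T_IN, example_t_out)

print(f"NTU = {example_ntu:.3f}, E = {example_eff:.3f}")
print(f"T_out = {example_t_out:.2f} C, q = {example_q:.1f} W")
```

Halving `C_COOL` in this sketch raises the effectiveness, but for a fixed heat load it also raises the surface temperature, which is the trade-off described in the text.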
The engineering challenge is to minimize the decrease in effectiveness as coolant flow rates increase. Note that the limit is when the surface temperature and the coolant exit are at the same temperature, or the system has an effectiveness of 1. This limit is formally only for a constant temperature boundary condition (as opposed to uniform flux or mixed boundary condition) and is only a theoretical maximum, but it does provide a basis for comparison. More complex systems such as cold plates with multiple heat sources or significant coolant temperature variation may require more detail to assess, but the general trends are similar. Historical electrical analogy treatment of cold plates (typical in avionics) sub-divides the problem into zones and accounts for the coolant rise, but makes the assumption that a heat transfer coefficient relative to a local coolant temperature is known. Some heat sink resistance calculations add in a “fluid resistance”, but the reader is cautioned that this terminology can be problematic [2].

Figure 2. Plot of equation 4.

We are often asked to help design more efficient systems, but sometimes the definition of efficient is vague. From the perspective of coolant in and out temperatures, the heat transfer process is 100% efficient: all of the heat ends up in the coolant. However, the real question may be whether the cooling job can be accomplished with a lower flow rate while still having acceptable temperatures. Thermal engineers can participate in and even lead discussions on developing even more efficient systems, but we benefit when we look at problems from multiple viewpoints and even venture out into the research from related fields.
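The question of whether a lower flow rate still gives acceptable temperatures can be explored with the same effectiveness relation. The sketch below is one possible way to do this, assuming a uniform surface temperature and, as a strong simplification, an hA that does not change with flow; the function names and numbers are hypothetical.

```python
import math


def surface_temperature(q, h_a, c_cool, t_in):
    """Uniform surface temperature needed to reject q (W) to the coolant.

    Uses E = 1 - exp(-NTU) with NTU = hA / C_cool, together with
    q = E * C_cool * (T_surf - T_in).
    """
    eff = -math.expm1(-h_a / c_cool)  # 1 - exp(-NTU), accurate for small NTU
    return t_in + q / (c_cool * eff)


def minimum_capacity_rate(q, h_a, t_in, t_surf_max, c_lo=1.0e-6, c_hi=1.0e6):
    """Smallest C_cool (W/K) that keeps T_surf <= t_surf_max.

    Returns None when even the largest capacity rate considered (c_hi)
    cannot meet the limit. Assumes hA is independent of flow (hypothetical)
    and that c_lo is small enough to violate the limit.
    """
    if surface_temperature(q, h_a, c_hi, t_in) > t_surf_max:
        return None
    # Surface temperature falls monotonically as C_cool rises, so bisect.
    for _ in range(200):
        c_mid = 0.5 * (c_lo + c_hi)
        if surface_temperature(q, h_a, c_mid, t_in) > t_surf_max:
            c_lo = c_mid
        else:
            c_hi = c_mid
    return c_hi


# Hypothetical example: 100 W load, hA = 5 W/K, 25 C inlet, 60 C surface limit.
c_required = minimum_capacity_rate(q=100.0, h_a=5.0, t_in=25.0, t_surf_max=60.0)
print(f"Minimum coolant capacity rate: {c_required:.3f} W/K")
```

In practice h usually depends on flow rate, so a real assessment would replace the constant `h_a` with a flow-dependent correlation or measured data.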
References
1. Bureau of Ships NavShips 900,190, Guide Manual of Cooling Methods for Electronic Equipment, 1 November, 1956.
2. Moffat, R., “Modelling Air-Cooled Heat Sinks as Heat Exchangers”, ElectronicsCooling, February 2008.
3. Incropera, F. P. and DeWitt, D. P., Fundamentals of Heat and Mass Transfer, 3rd edition, Wiley, New York, 1990.
technical brief
Design Considerations for High Performance Processor Liquid Cooled Cold Plates Michael J. Ellsworth, Jr. and Levi A. Campbell IBM, Poughkeepsie, New York, USA
The meteoric rise in cooling requirements of commercial computer products has been driven by an exponential increase in microprocessor performance over the last decade. The conventional way to cool microprocessors has been to utilize air to carry the heat away from the chip, and reject it to the ambient. Air cooled heat sinks are the most commonly used air-cooling devices, with the highest performers incorporating heat pipes or vapor chambers. Such air cooling techniques are inherently limited with respect to their ability to extract heat from semiconductor devices with high heat fluxes as well as carry heat away from server nodes that have high power densities. Thus, the need to cool future high heat load, high heat flux electronics mandates the development of low thermal resistance and highly energy efficient thermal management techniques, such as liquid cooling using cold plate devices.

Liquid cooling of electronics is not a new technology. The need to further increase packaging density and reduce signal delay between communicating circuits led to the development of multi-chip modules beginning in the late 1970s. The heat flux associated with bipolar circuit technologies steadily increased from the very beginning and really took off in the 1980s [1]. IBM had determined that the most effective way to manage chip temperatures in these systems was through the use of indirect water-cooling [2]. Several other mainframe manufacturers also came to the same conclusion [3-7]. The decision to switch from bipolar to Complementary Metal Oxide Semiconductor (CMOS) based circuit technology in the early 1990s led to a significant reduction in power dissipation and a return to totally air-cooled machines. However, this was but a brief respite as power and packaging density rapidly increased, matching and then exceeding the performance of the earlier bipolar machines. These increased packaging densities and power levels have resulted in unprecedented cooling demands at the package, system and data center levels, necessitating a return to water-cooling [1, 8-9].

This article focuses on some of the design trade-offs associated with the processor cold plate. Two general types of cold plates are considered: a relatively high performance, high cost machined copper cold plate and a relatively low performance, low cost copper tube embedded in aluminum cold plate. Thermal performance in the context of the product application is also addressed.
Figure 1. Plan (transparent) view; machined copper cold plates (1), (2), and (3) along with molded copper pin fin cold plate (7).
Figure 2. Cold plates with ¼” OD copper tubes embedded in aluminium or copper plates.
Cold Plate Descriptions
A summary of the cold plates under consideration in this exercise can be found in Table 1. Cold plates (1), (2), and (3) are copper blocks with rectangular channels for the water to flow through. Cold plates (1) and (2) have 0.5 mm wide channels while cold plate (3) has 0.25 mm wide channels. Cold plate (1) is a 3-pass flow design in that flow passes sequentially through 3 groupings of channels (Figure 1), each with a third of the total number of channels. By contrast, cold plates (2) and (3) are 1-pass flow designs; water flows through all the channels in parallel. Cold plates (2) and (3) are shown in plan view (transparent to show channels and fins) in Figure 1. Also depicted in Figure 1 is cold plate (7) with pin fin geometry. Cold plates (4), (5), and (6) are comprised of 6.35 mm (¼”) outside diameter (OD) copper tubes embedded in an aluminum or copper plate. Cold plate (4) has its tubes flattened slightly to provide favorable thermal contact with the module lid. The formed tube cross section can be seen in Figure 2. The tubes in cold plates (5) and (6) are flattened to a greater degree for increased thermal performance: a greater fraction of the module lid area is in contact with the copper tubes along with a reduced hydraulic diameter for increased convective heat transfer. The price of this enhancement is increased pressure drop. Note, however, how inexpensive the copper tube cold plates (4), (5), and (6) are in comparison to the machined copper cold plates (1), (2), (3), and (7). The 0.25 mm channel cold plate (3) is by far the most expensive as it required wire electrical discharge machining (EDM) to form the channels.

Analysis Results and Discussion

Conjugate thermal analyses were performed on the cold plates described in the previous section using a commercially available computational fluid dynamics code [11]. In order to objectively compare cold plate thermal performance, a uniform heat flux boundary condition was applied to each cold plate’s active cooling area, which is defined for the purposes of this comparison as the cold plate area which is conduction coupled to the module lid by a thermal interface material (TIM). The cold plate thermal performance is defined by a total unit thermal resistance (mm²·°C/W),

$$R''_{total} = \frac{T_{base} - T_{in}}{q''} \qquad (1)$$

where Tbase is the average temperature of the cold plate base active cooling area (°C), Tin is the water inlet temperature (°C), and q” is the heat flux applied to the cold plate base active cooling area. R”total, which captures both convective and advective heat transfer behavior, does not completely capture the true nature of the cold plate’s thermal performance. The copper tube cold plates, having larger characteristic dimensions, rely on transitional or turbulent flows to achieve heat transfer performance, while the machined cold plates with smaller characteristic dimensions rely on large wetted areas to achieve performance. Since the two types of cold plates compared in this study have very different pressure drop characteristics, implementations of the two types would necessarily require disparate flow rates (high flow rates for the formed tubes, and much lower flow rates for the small machined channels). The thermal resistance calculated in equation 1 includes the temperature rise of the fluid within the cold plate implicitly. Consider a micro-scale channeled cold plate, with extremely enhanced area but a high pressure drop, and a pressed tube cold plate, both having equivalent thermal performance at the module level. For this to be true, the microchannel cold plate would have to be at a low flow rate, and the pressed tube at a relatively high flow rate; clearly most of the thermal resistance of the microchannel in this case is temperature rise of the fluid, whereas in the pressed tube the resistance is dominated by the heat transfer coefficient on the tube walls and the available area. To better observe this differing behavior, we also compare the cold plates by their convective performance only, and so define a convective unit thermal resistance (mm²·°C/W),

(2)

where $\dot{m}$ is the water mass flow rate (kg/s) and Cp is the water specific heat (J/kg·K). The convective performance of the cold plates is depicted in the graph in Figure 3. The machined (plus pin fin) cold plates achieve a high level of performance with relatively little flow. The slightly formed copper tube cold plate (4) performed the worst, with comparable behavior to the machined cold plates only at much higher flows. The crushed tube cold plates (5) and (6) fared much better, although the resulting pressure drop, seen in Figure 4, is an order of magnitude higher than that associated with the 0.5 mm channel cold plates (1 and 2) or the slightly formed copper tube cold plate (4).

Figure 3: Cold plate convective unit thermal resistance (R”conv) as a function of volumetric flow rate

Figure 5: Cold plate total unit thermal resistance (R”total) as a function of flow power

It is also useful to compare the cold plates using the total resistance as a function of flow power (Figure 5) to capture the convective and advective resistance as well as the pumping power cost associated with the varying flow rates and pressure drops inherent to these different geometries. Flow power (W) is defined as,
Figure 4. Cold plate pressure drop as a function of volumetric flow rate.
$$P_{flow} = \Delta P \times V \qquad (3)$$

where ΔP is pressure drop (Pa) and V is volumetric flow rate (m³/s). Clearly, the higher pressure drop, higher flow cold plates required much higher flow power (2 orders of magnitude in some cases).

Finally, consideration must also be given to how the cold plate performs in the actual product application. Several of the cold plates in this study were incorporated into two different product module applications (depicted in Figure 6). The cold plate thermal resistance in the context of the module application is determined by

$$R_{prod} = \frac{T_{pr} - T_{in}}{q} \qquad (4)$$

where Tpr is a point reference temperature on the cold plate base corresponding to the x-y center of the processor and q is the total processor heat load. This thermal resistance is compared to the thermal resistance resulting from the uniform heat flux boundary condition previously discussed, which is defined as

$$R_{UHF} = \frac{R''_{total}}{A_{mod}} \qquad (5)$$

where Amod is the module lid (active) area. By graphing in Figure 6 the ratio of these resistances, (RUHF / Rprod), it can be seen that the actual performance can be considerably less (by a factor of 2) than the idealized uniform heat flux case; the RUHF / Rprod metric as shown represents an efficiency, where values less than 100% indicate to what extent the final design underperforms the idealized uniform heat flux case. The magnitude and variation with flow will vary with cold plate type and module application. One interesting feature of these packages is that as the overall thermal resistance of the package decreases, less spreading is taking place in the lid and cold plate base; conversely, as the cold plate resistance increases, the heat flux from the lid becomes more uniform. In the case of a laminar flow cold plate, such as (1) and (7), the heat transfer coefficient between the channel walls and the coolant is not a function of flow rate; only the advective resistance is affected by increasing flow, resulting in a characteristic RUHF / Rprod that decreases as flow increases. Alternatively, a cold plate with a larger hydraulic diameter and higher flow rates can be in transitional or turbulent flow, such as cold plates (4), (5), and (6); with transitional and turbulent flows the heat transfer coefficient between the tube walls and the coolant increases with increased flow rate, lowering the resistance of the path from the lid through the interface to the cold plate base and into the side walls of the tube.

Figure 6: Comparison of the uniform heat flux cold plate thermal resistance to its thermal resistance in a product applic

Conclusions

Two differing cold plate technologies for cooling processor modules have been compared. The following observations and conclusions can be drawn from the analysis results presented herein:

1. Better cold plate thermal resistance performance, whether inclusive or exclusive of advective resistance, does not necessarily translate to a better design (i.e., one that most closely satisfies all of the design constraints of pressure drop, cost, and thermal performance). A more costly cold plate does not guarantee a better design.
2. The entire system design must be taken into account when designing / specifying a cold plate. Choices at the system level can alleviate design requirements on the cold plate (i.e., allow for a lower performance / lower cost design). Specifically, given a thermal resistance target that can be met with a low flow, expensive, high pressure drop cold plate or a high flow, inexpensive, low pressure drop alternative, the coolant routing within the system (parallel versus serial paths, for example) and the available pump performance are key to the cold plate selection.
3. Cold plate thermal resistance based on a uniform heat flux (UHF) boundary condition cannot be applied directly to a product application specification. Performance in the product application can be very different from what the UHF resistance would suggest.
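The resistance and flow power metrics used in this brief can be collected in a few helper functions. This is a minimal Python sketch of the definitions as read here: total unit resistance as temperature rise over heat flux, flow power as pressure drop times volumetric flow, product resistance as point temperature rise over heat load, and the UHF resistance as unit resistance over module area. The function names and all numerical values are hypothetical and are not taken from the study.

```python
def total_unit_resistance(t_base, t_in, heat_flux):
    """Total unit thermal resistance, (T_base - T_in) / q''.

    With heat_flux in W/mm^2 the result is in mm^2*C/W.
    """
    return (t_base - t_in) / heat_flux


def flow_power(pressure_drop, volumetric_flow):
    """Flow power in W from pressure drop (Pa) and volumetric flow (m^3/s)."""
    return pressure_drop * volumetric_flow


def product_resistance(t_pr, t_in, heat_load):
    """Product-application resistance, (T_pr - T_in) / q, in C/W."""
    return (t_pr - t_in) / heat_load


def uhf_resistance(unit_resistance, module_area):
    """Uniform-heat-flux resistance, R''_total / A_mod, in C/W."""
    return unit_resistance / module_area


def uhf_efficiency(r_uhf, r_prod):
    """Ratio R_UHF / R_prod; values below 1 mean the product case is worse."""
    return r_uhf / r_prod


# Hypothetical inputs for illustration only.
r_unit = total_unit_resistance(t_base=45.0, t_in=25.0, heat_flux=0.5)
r_uhf = uhf_resistance(r_unit, module_area=2000.0)           # area in mm^2
r_prod = product_resistance(t_pr=33.0, t_in=25.0, heat_load=200.0)
p_flow = flow_power(pressure_drop=20000.0, volumetric_flow=1.0e-3 / 60.0)

print(f"R''_total = {r_unit:.1f} mm^2*C/W, R_UHF = {r_uhf:.3f} C/W")
print(f"R_prod = {r_prod:.3f} C/W, efficiency = {uhf_efficiency(r_uhf, r_prod):.0%}")
print(f"Flow power = {p_flow:.3f} W")
```

Keeping the area units consistent between the heat flux and the module area is the main thing to watch when using functions like these.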
References
[1] Ellsworth, Jr., M.J., Campbell, L.A., Simons, R.E., Iyengar, M.K., Schmidt, R.R., Chu, R.C., “The Evolution of Water Cooling for IBM Large Server Systems: Back to the Future,” Proceedings of the 2008 ITherm Conference, Orlando, FL, USA, May 28-31.
[2] Simons, R.E., “The Evolution of IBM High Performance Cooling Technology,” Proceedings of the Eleventh Annual IEEE Semiconductor Thermal Measurement and Management Symposium, 1995, pp. 102-112.
[3] Kaneko, K., Seyama, K. and Suzuki, M., “LSI Packaging and Cooling Technologies for Fujitsu VP2000 Series,” Fujitsu Scientific & Technology Journal, v. 41, no. 1, 1990, pp. 12-19.
[4] Kaneko, K., Kuwabara, K., Kikuchi, S. and Kano, T., “Hardware Technology for Fujitsu VP2000 Series,” Fujitsu Scientific & Technology Journal, v. 37, no. 2, 1991, pp. 158-168.
[5] Kobayashi, F., Watanabe, Y., Yamamoto, M., Anzai, A., Takahashi, A., Daikoku, T., Fujita, T., “Hardware Technology for Hitachi M-880 Processor Group,” Proceedings of the 41st Electronic Components and Technology Conference, 1991, pp. 693-703.
[6] Watari, T., Murano, H., “Packaging Technology for the NEC SX Supercomputer,” IEEE Transactions on Components, Hybrids, and Manufacturing Technology, Volume 8, Issue 4, 1985, pp. 462-467.
[7] Murano, H., Watari, T., “Packaging technology for the NEC SX-3 Supercomputers,” IEEE Transactions on Components, Hybrids, and Manufacturing Technology, Volume 15, Issue 4, 1992, pp. 411-417.
[8] Ellsworth, Jr., M.J., Goth, G.F., Zoodsma, R.J., Arvelo, A., Campbell, L.A., Anderl, W.J., “An Overview of the IBM Power 775 Supercomputer Water Cooling System,” Proceedings of the ASME 2011 Pacific Rim Technical Conference & Exposition on Packaging and Integration of Electronic and Photonic Systems (InterPACK 2011), Portland, Oregon, USA, July 6-8.
[9] Wei, J., “Hybrid Cooling Technology for Large-Scale Computing Systems – From Back to the Future,” Proceedings of InterPACK 2011, Portland, Oregon, USA, July 6-8.
[10] Sahan, R.A., Rahima, M.K., Xia, A., and Pang, YF, “Advanced Liquid Cooling Technology Evaluation for High Powered CPUs and GPUs,” Proceedings of InterPACK 2011, Portland, Oregon, USA, July 6-8.
[11] Fluent, distributed by ANSYS, Inc., Southpointe, 275 Technology Drive, Canonsburg, PA, 15317, www.ANSYS.com
calculation corner
Transient Modeling of a High-Power IC Package, Part 1 Bruce Guenin Associate Technical Editor
There is an increasing need for calculating integrated circuit temperatures during conditions of changing chip power. Varying computer workloads and the implementation of power-saving strategies are leading to greater variability in chip power levels than in the past. As reliability requirements get more stringent, there are growing concerns about the effect of temperature changes on die and package integrity. The latest JEDEC standard for measuring the junction-to-case thermal resistance uses a transient method [1]. This article seeks to provide greater insight into transient thermal phenomena in high-power IC packages and examines a number of approaches for predicting transient thermal behavior.
OVERVIEW
This work extends the effort described in two recent installments of this column devoted to the steady-state thermal analysis of high-power IC packages attached to heat sinks [2, 3]. The present effort is also divided into two parts. Part 1 focuses on nuances of transient heat transfer and explores three different methodologies using a simplified model of the package and heat sink. Part 2 will extend this analysis to more practical examples such as those treated in the recent columns as well as to the JEDEC junction-to-case thermal resistance test.
This article explores three analysis methods: 1) finite element analysis (FEA), 2) an analytical, multi-stage RC model, and 3) a numerical, multi-stage RC model (RC = Resistor Capacitor). A commercial code was used to implement the FEA work [4]. Methods 2 and 3 were implemented in spreadsheets.
The current analysis assumes the same package construction as in the recent columns, namely a high-power IC package attached to a heat sink, as depicted in Figure 1. In this model, two simplifying assumptions were made: 1) heat flow to the package substrate and to the PCB is neglected and 2) the cooling effect of the heat sink fins is represented by the application of a suitable heat transfer coefficient directly to the heat sink base. Hence, the only components explicitly represented in the model are the die, TIM1, lid, TIM2, and the heat sink base (where TIM = Thermal Interface Material).
A further simplifying assumption, for Part 1 only of this article, is that the width of all components is equal to that of the die, making the heat flow one-dimensional. This simplifies the task of evaluating the relative accuracy of the analytical and numerical multi-stage RC models.
Finite Element Analysis Solution
Assumptions
Figure 2a depicts the solid model representing the five aforementioned components. It is a one-quarter model, because of the symmetry in the component geometry, heat load, and boundary condition. In the subsequent text, comments regarding up and down directions (or top and bottom) in the component stack-up are made with reference to this figure. The quantitative assumptions of the model are listed in Tables 1, 2, and 3. Note that the stated values of die width, thickness of all the components, material composition, and thermal resistance are consistent with those in the recent articles [2, 3]. Differences involve the width of the lid, TIM2, and heat sink base, and the value of the heat transfer coefficient, h. The current value of h is much larger than the ones assumed in the cited articles. It was chosen to produce a value of heat sink-to-air thermal resistance comparable to those in the earlier work despite having a much smaller surface area for heat removal. All material properties are assumed to be independent of temperature. Note that, for convenience, the ambient temperature was assumed to equal zero. Hence, the reported component temperatures correspond to the temperature rise with respect to the ambient.
Results
The results of the FEA solution are shown in Figures 2 and 3. Figure 2a shows the thermal contour maps for elapsed times in the range 2E-7 to 100 seconds. As expected, the heat flow is one-dimensional. Interpretation of the contour plots is assisted by Figure 2b, which attempts to capture the spatial and temporal variations of temperature in the model. It plots the temperature for each node at a given z-axis location for each FEA solution illustrated in Figure 2a. (The z-axis is assumed to have an “up” orientation.) Clearly, the temperature rise at all levels in the model proceeds most rapidly in the early portions of the time interval explored. The steady state is nearly achieved after only 8.5 seconds. There is only a slight further increase in temperature after 100 seconds have elapsed. Most of the temperature rise occurs within the TIM layers and at the heat sink-to-air interface. In contrast, the temperature variation within the die, lid, and heat sink base is relatively small. Hence the thermal behavior of the package/heat sink stack can be efficiently captured by tracking the temperature of nodes at the top of the die, lid, and heat sink base.
Figure 3a is a linear plot of the temperatures at these three locations versus time. It displays the behavior normally expected in a transient thermal situation, with the rapid initial temperature rise as discussed in the context of Figure 2b. Unfortunately, since most of the graph represents time intervals during which the system is near or at the steady state, it is not the best vehicle for documenting the nuances of a transient thermal model. In contrast, the log-log plot in Figure 3b expands the scale considerably at the very short time scales, down to 1E-7 seconds. It clearly shows that, up to approximately 5E-4 seconds, the only heat flow occurs within the die. Heat flow doesn’t reach the heat sink base until about 2E-3 seconds. Because of their advantages, log-log plots are used in the subsequent analyses.
Lumped-Parameter Thermal Circuit Models
A further examination of Figure 2a indicates that there are alternating regions of different transient behavior: 1) small thickness and large temperature rise (TIMs and heat-sink-to-air thermal resistance) and 2) larger thickness and small temperature rise across them (die, lid, and heat sink base). The first class of regions has a dominant resistive behavior while the second exhibits capacitive behavior. The one-dimensional heat flow situation allows us to easily calculate the thermal resistance, R, and heat capacity, C, of each layer using the following formulas:
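$$R_{\mathrm{CONDUCTIVE}} = \frac{\text{Thickness}}{\text{Thermal Conductivity} \times \text{Area}} \qquad (1)$$
$$R_{\mathrm{CONVECTIVE}} = \frac{1}{h \times \text{Area}} \qquad (2)$$
$$C = \text{Specific Heat} \times \text{Density} \times \text{Volume} \qquad (3)$$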
Table 4 lists the calculated values of R_CONDUCTIVE and C for each component, along with the heat sink base-to-air thermal resistance, R_CONVECTIVE. The stated values confirm the dominant behavior of each region as described above. This suggests the possibility that a multi-stage RC circuit model could be used to represent the thermal behavior of the package/heat sink stack. Figure 4 represents a template for an RC ladder circuit that could be scaled to an arbitrary number of RC stages. As indicated in the lower part of Table 4, it is possible to lump the various layers in the package stack-up into RC ladder models consisting of various numbers of stages. The thermal resistance and heat capacity of each stage each equal the sum of these parameters over the components lumped into that stage. The thermal time constant for each stage equals the product of its total R and C values. It should be noted that, up to 3 stages, the lumped elements are delineated by component interfaces. To create 4 stages, the die is divided into upper and lower parts. In the subsequent sections, these lumped parameter values are used for both analytical and numerical transient calculations. It should also be noted that, as the number of stages is increased, the time constant for the first stage (containing the die) gets smaller and smaller and, hence, should enable the model to better track the short-time-scale transient behavior of the die.
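The layer-by-layer calculation and the lumping into stages can be sketched in a few lines of Python. This is a minimal illustration, not the author's spreadsheet: the footprint area, the heat transfer coefficient, and every material value below are placeholders assumed for the example (the values of Tables 1 to 3 are not reproduced here), and the grouping of layers into three stages is one plausible reading of "delineated by component interfaces". The names `Layer`, `conductive_r`, `heat_capacity`, `convective_r`, and `lump` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    thickness: float      # m
    conductivity: float   # W/(m*K)
    density: float        # kg/m^3
    specific_heat: float  # J/(kg*K)


def conductive_r(layer, area):
    """Equation 1: thickness / (thermal conductivity * area), in K/W."""
    return layer.thickness / (layer.conductivity * area)


def heat_capacity(layer, area):
    """Equation 3: specific heat * density * volume, in J/K."""
    return layer.specific_heat * layer.density * area * layer.thickness


def convective_r(h, area):
    """Equation 2: 1 / (h * area), in K/W."""
    return 1.0 / (h * area)


def lump(groups):
    """Sum R and C over each group of (R, C) pairs; return (R, C, tau) per stage."""
    stages = []
    for group in groups:
        r = sum(r_i for r_i, _ in group)
        c = sum(c_i for _, c_i in group)
        stages.append((r, c, r * c))
    return stages


# Placeholder inputs, assumed for illustration only (not the article's tables).
AREA = 0.01 * 0.01  # 10 mm x 10 mm footprint, m^2
H = 5000.0          # heat transfer coefficient, W/(m^2*K)
LAYERS = [
    Layer("die", 0.5e-3, 120.0, 2330.0, 700.0),
    Layer("TIM1", 0.05e-3, 3.0, 2500.0, 800.0),
    Layer("lid", 2.0e-3, 390.0, 8900.0, 385.0),
    Layer("TIM2", 0.1e-3, 3.0, 2500.0, 800.0),
    Layer("heat sink base", 5.0e-3, 200.0, 2700.0, 900.0),
]

rc = {l.name: (conductive_r(l, AREA), heat_capacity(l, AREA)) for l in LAYERS}

# Assumed 3-stage grouping: die+TIM1, lid+TIM2, heat sink base+convection.
three_stage = lump([
    [rc["die"], rc["TIM1"]],
    [rc["lid"], rc["TIM2"]],
    [rc["heat sink base"], (convective_r(H, AREA), 0.0)],
])
for r, c, tau in three_stage:
    print(f"R = {r:.4f} K/W, C = {c:.4f} J/K, tau = {tau:.4e} s")
```

Treating the convective resistance as a pair with zero heat capacity lets it be summed into the last stage with the same helper used for the solid layers.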
Analytical Multi-Stage RC Model
Equation 4 provides a convenient method of calculating the transient behavior of the junction temperature, T_J, in response to a stepwise increase in power from zero to a constant value, beginning at time = 0.
$$T_J(t) = P\left[\Theta_{JA} - \sum_{i=1}^{n} R_i\, e^{-t/\tau_i}\right] \qquad (4)$$
where Θ_JA, the junction-to-air thermal resistance, is equal to the sum of all the R_i values and τ_i is the thermal time constant of the ith stage, equal to R_i*C_i. Note that this equation is not an exact solution for this type of problem. However, it often provides adequate accuracy in situations such as the present one involving a multi-layer structure, with each layer having a significantly different time constant from the others [5]. The reader is directed to Ref. [6] for a more rigorous derivation of multi-stage thermal networks. Figure 5 compares predictions of the analytical model with those of the FEA model (and the numerical model, to be discussed below). As expected, increasing the number of stages in the analytical model improves accuracy. The calculation using the 4-stage analytical model is in good agreement with the FEA prediction. The maximum discrepancy between them is 0.2°C, occurring at time = 0.1 second.
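The analytical step response is simple enough to evaluate directly. The sketch below assumes the sum-of-exponentials form T_J(t) = P * Σ R_i * (1 - exp(-t/τ_i)), which is consistent with the definitions of Θ_JA and τ_i given above; the exact printed form of Equation 4 is an assumption here. The stage values and power level are placeholders, and `junction_rise` is a hypothetical name. Because the ambient is taken as zero, the result is a temperature rise.

```python
import math


def junction_rise(t, power, stages):
    """Approximate junction temperature rise above ambient at time t.

    stages is a list of (R_i, C_i) pairs; each time constant is R_i * C_i.
    """
    return power * sum(r * (1.0 - math.exp(-t / (r * c))) for r, c in stages)


# Hypothetical 3-stage ladder (placeholder values in K/W and J/K).
STAGES = [(0.2, 0.1), (0.1, 0.7), (2.0, 1.2)]
POWER = 50.0  # W, placeholder step power

theta_ja = sum(r for r, _ in STAGES)  # junction-to-air resistance
for t in (1e-6, 1e-4, 1e-2, 1.0, 100.0):
    print(f"t = {t:g} s, rise = {junction_rise(t, POWER, STAGES):.3f} K")
print(f"steady-state limit = {POWER * theta_ja:.1f} K")
```

At long times every exponential vanishes and the rise tends to P * Θ_JA, which is a quick sanity check on any set of stage values.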
Numerical Multi-Stage RC Model
The preceding analysis has demonstrated acceptable accuracy using the analytical model. It has the virtue of being very easy to implement for a simple power on/power off situation. The main problem, however, is that it becomes much more complicated to use it to calculate the effect of an arbitrary change in power on the junction temperature. The normal approach to dealing with this situation is to perform a Laplace transform transfer function analysis, which can be quite challenging [6].
Numerical methods, such as the one described below, are more difficult to implement initially. However, they can readily be adapted to account for the effect of an arbitrary power waveform on the junction temperature.
Methodology
Assume a series RC circuit containing the desired number of stages. The heat flow from the junction to the ambient at a given value of instantaneous power can be calculated by discretizing time into sufficiently small steps and applying the following procedure [7].
• At time < 0, zero power is applied to the die. The temperature of all nodes is equal to that of the ambient.
• For time ≥ 0, in a given time step, δt, a quantity of thermal energy is injected into the left side of the ladder network by the die, as indicated in Figure 4.
• This quantity of energy, Q, is equal to δt*P, where P is the instantaneous dissipated power.
• The heat then propagates toward the right end of the network by flowing through each stage in the network ladder.
• The quantity of heat flowing from node i to node i+1 during time step j is calculated with reference to Figure 6 using the following expression. It is directly proportional to the difference in the nodal temperatures and inversely proportional to the resistance, R_i.
$$Q_{i,j} = \frac{(T_{i,j} - T_{i+1,j})\,\delta t}{R_i} \qquad (5)$$
• The change in temperature at a given node i, during time step j, is calculated using the net heat remaining in the node and the heat capacity of that node according to the expression below (for the first node, the incoming heat Q_{0,j} is the injected energy δt*P):
$$\Delta T_{i,j} = \frac{Q_{i-1,j} - Q_{i,j}}{C_i} \qquad (6)$$
• Finally, the temperature at node i can be calculated at the next time step, j + 1, from:
$$T_{i,j+1} = T_{i,j} + \Delta T_{i,j} \qquad (7)$$
Once the updated temperatures are calculated for all nodes, the process repeats by sequentially applying equations 5, 6, and 7 to all nodes. It is repeated until the process steps through the time interval of interest.
Spreadsheet Structure
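The time-stepping procedure is not tied to a spreadsheet. As a hedged illustration, the Python sketch below applies the same three updates per time step to a ladder of (R, C) stages: heat leaving each node, the resulting temperature change, and the updated temperature. It is a sketch under stated assumptions rather than the author's implementation: the function name `simulate_ladder`, the stage values, and the power waveform are all hypothetical, the ambient is taken as zero, and the power is supplied as one value per time step so that an arbitrary waveform can be used.

```python
def simulate_ladder(stages, power_per_step, dt):
    """Explicit time stepping of an RC ladder with ambient at zero.

    stages: list of (R_i, C_i); node i has capacity C_i and is connected
    to node i+1 (or to ambient, for the last node) through R_i.
    power_per_step: one power value per time step (arbitrary waveform).
    Returns the node temperatures at every step, starting with time zero.
    """
    n = len(stages)
    temps = [0.0] * n
    history = [list(temps)]
    for p in power_per_step:
        # Heat leaving each node toward the next one during this step.
        q_out = []
        for i, (r, _c) in enumerate(stages):
            t_next = temps[i + 1] if i + 1 < n else 0.0
            q_out.append((temps[i] - t_next) * dt / r)
        # Heat entering each node: injected energy for node 0, else upstream flow.
        q_in = [p * dt] + q_out[:-1]
        # Temperature change from the net heat and the node capacity, then update.
        temps = [temps[i] + (q_in[i] - q_out[i]) / stages[i][1] for i in range(n)]
        history.append(list(temps))
    return history


# Hypothetical 3-stage ladder and a power waveform that steps, then ramps down.
STAGES = [(0.2, 0.1), (0.1, 0.7), (2.0, 1.2)]
DT = 1e-3  # s, fixed time step
STEPS = 2000
power = [50.0 if k < 1000 else 50.0 * (STEPS - k) / 1000 for k in range(STEPS)]

result = simulate_ladder(STAGES, power, DT)
print(f"junction rise after {STEPS * DT:.1f} s: {result[-1][0]:.3f} K")
```

Because the scheme is explicit, the time step has to stay small relative to the shortest stage time constant, which is in line with the fixed-step requirement and the use of several step sizes in the spreadsheet version.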
The spreadsheet structure to perform this calculation as described above is shown in Table 5a. Note that the calculation requires a fixed time step to converge. As we have seen, a transient solution can span several decades of time. This spreadsheet uses a technique to deal with this efficiently. There are three different sheets implementing the calculation with different time steps: Sheet1, 1E-6 sec.; Sheet2, 1E-4 sec.; and Sheet3, 1E-2 sec. The calculation on each sheet begins at time 0 and proceeds at a different rate per the chosen time step. Hence, Sheet1 provides a much finer time resolution, suitable for the early portion of the transient. Each successive sheet lends itself to calculating the temperatures with a larger time step and, therefore, covering a larger span of elapsed time. The calculated results from all the calculation sheets, at the desired time steps, can be readily aggregated on the “Summary” sheet (described below) for graphing and further analysis.
Note that the power in a given time step is entered directly into the appropriate cell in column C. As such, it can be varied at any desired time step(s).
A spreadsheet for RC circuits with more than one stage can be generated by the following recursive process:
• To create a 2-stage model, copy the cell contents from the range [Col E, row 2: Col G, last row] and paste them into the range [Col G, row 2: Col I, last row]. Re-label cells G2 and F2 as “R2” and “C2”, respectively. Input correct values of these parameters into cells G3 and F3.
• To create a 3-stage model from the 2-stage model, copy the cell contents from the range [Col G, row 2: Col I, last row] and paste them into the range [Col I, row 2: Col K, last row], etc.
This spreadsheet has a fourth sheet described as the “Summary” sheet. Its structure is described in Table 5b. It uses the nested formulas “INDIRECT(ADDRESS(row,col,,,sheet))” to access an arbitrary solution temperature on any of the
calculation sheets, per the row and sheet inputs to the ADDRESS function, in columns A and B. The present example only shows links to values on Sheet1. However, as used by this author, these formulas are copied to many different rows to selectively access calculated values from all of the sheets.
Results
Figure 5 compares the numerical results for the 1-stage and 4-stage models with both the FEA and analytical results. The 1-stage results for the numerical and analytical methods are nearly identical. The difference between them depends on the value of the time step. It was largest for the largest time step, equal to 1E-2 seconds. In this case it is 5E-4°C. Since the 1-stage analytical result represents an exact solution, this precise correlation with the 1-stage numerical method is a confirmation of its accuracy. Figures 7a and 7b compare the 3- and 4-stage numerical results with the FEA solution. Clearly, the 4-stage numerical method provides a calculation of the junction temperature that is in better agreement with the FEA result than either the 3-stage numerical or the 4-stage analytical result. The lid temperature calculations for the numerical method are in good agreement with the FEA results. The calculated heat sink base temperatures are in slightly worse agreement. It is likely that the agreement could be improved by subdividing the heat sink base into an upper and lower half, as was done with the die.
Numerical 4-Stage RC Model with Variable Power
The 4-stage model, whose results with constant power are presented in Figure 7b, was modified simply by manually changing the power level at selected time steps in Col C of Sheet3. In some cases a linear function relating power and time was input. The choices for instantaneous power were quite arbitrary and are intended mainly to exercise the numerical model. The results, extracted by the Summary sheet from Sheet3 only, are presented in Figure 8. As expected, the process for modifying the spreadsheet to accommodate the variable power input took only a few minutes.
Conclusions
With a sufficient number of RC stages, both a simple analytical model and a straightforward numerical model give adequate accuracy when accounting for a simple power on/power off situation with a typical high-power IC package. However, if there is a need to account for the effect of a time-varying power level, the use of the numerical model provides the more convenient path to a solution. In Part 2, the numerical methods applied here to one-dimensional heat flow situations will be extended to more typical situations involving diverging heat flows, using the methods recently developed in this column for steady-state thermal analyses of high-power packages.
References
1. JEDEC Standard JESD51-14, “Transient Dual Interface Test Method for the Measurement of the Thermal Resistance Junction to Case of Semiconductor Devices with Heat Flow Through a Single Path.” Available for free download at www.jedec.com.
2. B. Guenin, “Thermal Interactions Between High-Power Packages and Heat Sinks, Part 1,” ElectronicsCooling, Vol. 16, No. 4, Winter, 2010.
3. B. Guenin, “Thermal Interactions Between High-Power Packages and Heat Sinks, Part 2,” ElectronicsCooling, Vol. 17, No. 1, Spring, 2011.
4. ANSYS®, Version 13.0.
5. B. Guenin, “Simplified Transient Model for IC Packages,” ElectronicsCooling, Vol. 8, No. 3, August, 2002.
6. Y-L Xu, R. Stout, and D. Billings, “Electronic Package Thermal Response Prediction to Power Surge,” Proceedings, I-Therm Conference, May, 2000, pp. 366-371.
7. B. Guenin, “Transient Thermal Model for the MQUAD Microelectronic Package,” Proceedings SEMI-THERM X Conference, February, 1994, pp. 86-95.
ADVANCE
PROGRAM
SEMI-THERM is proud to announce the Advance Program for our 28th year of technical excellence in semiconductor thermal sciences.
March 18-22, 2012
Doubletree by Hilton, San Jose, CA USA
• Pre-conference short courses from world-class thermal experts
• Over 55 peer-reviewed papers presented by the industry’s best and brightest semiconductor thermal professionals and educators
• Chip to system thermal management, modeling and measurement solutions
• Tracks include 3D Packaging, Liquid and Air Cooling, Multidisciplinary Thermal Management, Thermal Management in Multi-Core Architectures and more…
• New for SEMI-THERM 28: Special Data Center Cooling track with short courses and paper presentations
• Co-located with MEPTEC “Heat is On” Symposium, March 19
• SEMI-THERM exhibition with over 40 vendors plus workshops and food and beverage receptions Tuesday and Wednesday evening
• Keynote and opening ceremony, evening technical program and special luncheon speakers
• THERMI and Harvey Rosten Award presentations
Check out the information on the next few pages and visit www.semi-therm.org for details on the program and other events coming to SEMI-THERM 2012!
THANK YOU TO OUR SEMI-THERM 28 PROGRAM COMMITTEE
General Chair: Herman Chu, Cisco Systems, Inc.
Program Chair: Zeki Celik, LSI Corporation
Vice-Program Chair: Genevieve Martin, Philips Research
www.SEMI-THERM.org SEMI-THERM 28 SELECTION AND REVIEW COMMITTEE Bennett Joiner George Meyer Guohua Greg Xiong Sean Too Mertol, Atila Attila Aranyosi Bernie Siegal Bonnie Mack Cathy Biber Celik, Zeki
Clemens Lasance Dave Saums David Copeland Genevieve Martin Herman Chu Himanshu Pokharna Hsiao-Kang Ma Kazuaki Yazawa Keiji Matsumoto Kevin Hanson
Marta Rencz Peter Rodgers Prasad Tota Qian Han Raghavan Mahalingam Rahima Mohammed Ross Wilcoxon Sai Ankireddi Sharon Adam Shlomo Novotny
Susheela Narasimhan Thomas Tarter Wendy Luiten William Maltz Xi Wang Jim Wilson Yan Zhang Yongguo Chen Younes Shabany
ADVANCE
TECHNICAL
PROGRAM
NEW SHORT COURSES FOR SEMI-THERM 28!
Six new technical short courses are being offered in the SEMI-THERM pre-conference program. These highly informative courses span two days: Sunday March 18 - Monday March 19, 2012. These reasonably priced and outstanding technical courses are focused on the thermal sciences and are presented by leaders in the semiconductor cooling field. New for 2012 are two half-day short courses on Data Center thermal management. Lunch is included with registration and CEU credits are available. Registration opens November 15, 2011.
Short Course A. Sunday and Monday, 8AM-5:30PM, Two Days
Thermal Management for LED-based Applications
Clemens Lasance, Philips Research Emeritus; Cathy Biber, Thermal Consulting
Course Description
This two-day course covers all aspects of thermal management knowledge required to solve thermally-related problems in LED-based components and systems. Designers will learn about:
• Theory of heat transfer mechanisms from an industrial point of view
• Unique features of LEDs
• Basics of thermal design for LEDs
• Critical elements of a successful thermal design
• Active and passive cooling of LED-based systems
• Lifetime and performance issues
• Rules-of-thumb and hand calculations, numerical and experimental analysis and more!
Who Should Attend
Engineers directly involved with thermal design and cooling of LED-based products and students who want to understand and learn more about this increasingly important field of electronics would benefit from the course.
Short Course B. Sunday, 8AM-Noon, Half-day
Thermal Design of Electronic Systems for use in Data Centers
Dr. Chris Aldham and Mark Seymour, Future Facilities
Course Description
This half-day course will discuss the thermal design of electronic systems with specific emphasis on the interaction between the system and its environment. It will also mention future trends in both system and data center design and the need to ensure that both are fully compatible.
Who Should Attend
This short course is intended as an introduction for engineers responsible for thermal management of electronic systems designed for use in data centers and anyone interested in understanding more about the interaction of the cooling systems used in data centers and the electronic systems they house.
Short Course C. Sunday, 1:30PM-5:30PM, Half-day
Data Center Design Cannot Ignore Electronic System Details
Mark Seymour and Dr. Chris Aldham, Future Facilities
Course Description
The course will demonstrate the resulting thermal problems and show the ad-hoc fixes that are becoming standard in data centre operation to cope with the disconnect between the electronics thermal design and data centre cooling design.
Who Should Attend
This short course is intended as an introduction for engineers interested in designing cooling strategies for data centers or managing data centers. Attendees include facility managers interested in understanding the common challenges and how to address them and IT managers interested in maximising the capacity and life of their data center whilst minimising risk by making informed deployment decisions.
Short Course D. Monday, 8AM-5:30PM, Full Day
Transitioning from Air to Liquid Cooling: Design Fundamentals for Heat Sinks and Cold Plates
Alfonso (Al) Ortega, Ph.D., Villanova University; Rahima Mohammed, Ph.D., Intel Corporation
Course Description
The course is organized into the following modules:
• Semiconductor industry and role of thermal management
• Fundamentals of heat transfer and fluid flow
• Design of Air-Cooled Heat Sinks: Conventional and advanced air-cooled heat sinks; Air-Cooling limits
• Fundamental elements of thermal-fluid design: Compact Heat Exchanger Effectiveness
• Design of liquid cooled macro to micro-scale heat sinks and cold plates
• Thermal and Hydraulic design elements for box and server level liquid cooling systems
• Supply chain management in designing and deploying liquid cooled systems
Who Should Attend
Engineers tasked with thermal management design, engineers considering the transition from air to liquid cooled systems, technical leads and technical managers overseeing strategic direction on thermal management decisions.
Short Course E. Monday, 1:30PM-5:30PM, Half-day
Overview of Thermal Management Analysis and Testing for Electronics used in Harsh Environments
Ross Wilcoxon, Dave Dlouhy, Rockwell-Collins
Course Description
The course will present a brief overview of heat transfer fundamental concepts and describe how they establish physical limitations for thermal management. First order analysis approaches using general design guidelines and thermal resistance-based calculations will be presented. The course will also include an overview on thermal/fluid testing that describes proper use of thermocouples, IR cameras, flow recapture, etc. The course will also include a discussion on the capabilities and limitations of a number of advanced thermal management technologies that may be considered for especially challenging problems.
Who Should Attend
This is a review of electronics cooling topics that focuses on providing guidelines for assessing thermal management requirements and designs for equipment used in harsh environments. The target audience is primarily mechanical, electrical, and systems engineers who need to understand how thermal management issues affect overall system design.
Short Course F. Monday, 8:00AM-Noon, Half-day
CFD Applied to Electronics Cooling, focusing on Realistic Design Expectations
Peter Rodgers, Ph.D., The Petroleum Institute, Abu Dhabi, United Arab Emirates
Course Description
This course provides numerical modeling strategies and methodologies for the thermal design of electronic equipment, which are generically applicable with CFD codes dedicated to the thermal analysis of electronic systems. The course also permits an understanding of the technical capabilities and limitations of CFD analysis at the various phases of the product design cycle. Practical application examples are presented for the analysis of component, board, system, and facility-level heat transfer. The course has no bias towards a particular CFD code.
Who Should Attend
This half-day short course will benefit engineers, managers and scientists involved in the thermal management of electronic systems. It is aimed at participants with varying expertise levels in Computational Fluid Dynamics (CFD), from novice to advanced.
Visit www.SEMI-THERM.org today for full course descriptions and registration information!
SEMI-THERM TECHNICAL PROGRAM 2012 TUESDAY MARCH 20, 2012 OPENING CEREMONY
History and Charter of SEMI-THERM: Herman Chu, ST28 General Chair, Cisco Systems KEYNOTE: The Energy Costs of Cooling Electronic Systems: Reflections on Past Lessons and Future Opportunities Dr. Alfonso (Al) Ortega, James R. Birle Professor of Energy Technology, Villanova University
SEMI-THERM TRACK A 3-D Packaging
Measurement of Microbump Thermal Resistance in 3D Chip Stack Experimental Thermal Resistance Evaluation of a Three-dimensional (3D) Chip stack Impact of Die-to-Die Thermal Coupling on the Electrical Characteristics of 3D Stacked SRAM Cache Dynamic Compact Thermal Model for Stacked-Die Components
SEMI-THERM TRACK B
DATA CENTER TRACK A
Multidisciplinary Thermal Management
Novel 3D Electro-Thermal Robustness Optimization Approach of Super Junction Power MOSFETs Energy Reduction in Server Cooling Via Real Time Thermal Control A Case study on Impact of Free Air Cooling on Performance of Telecom Equipment Enhancement of Photovoltaic Solar Module Performance for Power Generation in the Middle East
Sustainable Data Centers
Air Vs.Liquid Vs. Two-Phase Cooling of Data Centers—Principles, Energy Efficiency Considerations and Market Driven Forces Design And Management Of Data Center Effectiveness, Risks And Costs Sustainable Data Centers Powered by Renewable Energy Sources
Liquid Cooling
Modeling and Analysis
Two-Phase Flow Control of On-Chip Two-Phase Cooling Systems of Servers Numerical Prediction of the Junction-to-Fluid Thermal Resistance of a 2-Phase Immersion-Cooled IBM Dual Core POWER6 Processor Thermal Performances of Sub-Atmospheric Loop Thermosyphon With and Without Enhanced Boiling Surface Loop Heat Pipes with High Performance Wick Structures
Embedded Tutorial Thermal Test Methodology and Standards for Lighting LEDs Andras Poppe Mentor Graphics Corporation Mechanical Analysis Division
Data Center Cooling Management And Analysis – A Model Based Approach Datacenter Power Savings through High Ambient Datacenter Operation: CFD Modeling Study CFD Analysis of Free Cooling of Modular Data Centers
EXHIBIT HALL OPEN 1:30-6:30PM - VENDOR WORKSHOPS 2:00 - 5:00PM - RECEPTION 6:00PM Visit the exhibits and join us in the exhibition hall for a reception followed by the ST28 evening tutorial. WEDNESDAY MARCH 21, 2012 SEMI-THERM TRACK C Thermal Management in Multi-Core Architectures
Performance Optimization of Multi-Core Processors using Core Hopping - Thermal and Structural Thermal System Identification (TSI): A Methodology for Post-silicon Characterization and Prediction of the Transient Thermal Field in Multicore Chips On-chip Cooling of Hot-Spots with a Copper Micro-Evaporator Energy Efficient Liquid-Thermoelectric Hybrid Cooling for Hot-Spot Removal Test ASIC for Investigation of Thermal Coupling in Many-Core Architectures Thermal Issues in GaN Devices Thermoreflectance CCD Imaging of Self Heating in AlGaN/GaN High Electron Mobility Power Transistors at High Drain Voltage Thermal Factors Influencing the Reliability of GaN-based HEMTs Integration of a Phase Change Material for Junction Level Cooling in GaN Devices
SEMI-THERM TRACK D
DATA CENTER TRACK B
Experimental Methods
Panel: Current trends in data center industry: challenges and opportunities
Socket Thermal Testing Procedure and Correlation A Method to Measure the Heat Dissipation from Component on PCB Thermal Conductivity Measurements of Novel SOI Films Using Submicron Thermography and Transient Thermoreflectance Measurement of Thermal Conductivity of Carbon Nanotube Films by Steady-State Heat Conduction Application of Thermal Transient Testing for Solar Cell Characterization
Panelists: Madhusudan Iyengar (IBM), Tahir Cader (HP), Kushagra Vaid (Microsoft), Amir Michael (Facebook), Mark Seymour (Future Facilities)
Server Liquid Cooling with Chiller-Less Data Center Design to Enable Significant Energy Savings Experimental Investigation of Water Cooled Server Microprocessors and Memory Devices in an Energy Efficient Chillerless Data Center Experimental Characterization of an Energy Efficient Chiller-less Data Center Test Facility with Warm Water Cooled Services
Liquid Cooling
Heat Transfer Fundamentals I
A Heatsink Design Methodology Using the Thermal ShortCut Concept A Framework Theory for Dynamic Compact Thermal Models A Method to Adapt Zth-Junction-to-Ambient Curves to Varying Ambient Conditions
SEMI-THERM TECHNICAL PROGRAM 2012 WEDNESDAY MARCH 21, 2012 (Continued) SEMI-THERM TRACK D
SEMI-THERM TRACK C
Heat Transfer Fundamentals II
Fan Cooling
DATA CENTER TRACK B
A Multiple Vibrating Fans System Using Interactive Magnetic Force and Piezoelectric Force Study of a Cooling System with a Piezoelectric Fan Modeling of Fan Failures in Networking Enclosures SynJet Augmented Cooling of a 1U Security Chassis
Two-Layer Heat Spreading Revisited Heat Spreading From a Small Source on a Thin Plate Optimization of Thermal Performance for Scale-roughened Surface Design Considerations for Heat Spreader in High Heat Flux Systems
Cooling Management
Achieving Energy Efficient Data Centers Using Cooling Path Management Coupled with ASHRAE Standards Use of Ducting to Improve Inlet Conditions for Side-to-Side Airflow Switches in Data Centers Effect of Perforated Tile Vent Pattern on Rack Air Flow Distribution in a Raised Floor Data Center Facility
EXHIBIT HALL OPEN 1:30-6:30PM - VENDOR WORKSHOPS 2:00 - 5:00PM - RECEPTION 6:00PM Visit the exhibits and join us in the exhibition hall for a reception and exciting giveaways! THURSDAY MARCH 22, 2012 (Single Track) Thermal Aware IC Design
Enabling Reliability-Aware Floorplanning With Fast Thermal Simulations Cache Leakage Power Estimation Using Architectural Model for 32 nm and 16 nm Technology New Simulation Approaches Supporting Temperature-Aware Design of Digital ICs 2012 THERMI Award Presentation
LED Thermal Management
A Step Forward in Multi-Domain Simulation of Power LEDs How Thermal Environment Affects OLEDs’ Operational Characteristics An Innovative Passive Cooling Method for High Performance Light-emitting Diodes
SEMI-THERM EXHIBITION 1:30 - 6:30PM TUESDAY, WEDNESDAY MARCH 20 and 21, 2012
Advances in Materials, Thermal Imaging and Validation Platforms
High Performance Thermal Interface Materials with Enhanced Reliability System and Methodology of Chip Thermal Imaging Capability to Explore a Flexible and Efficient Supplement to Traditional Thermal Test Vehicle (TTV) Side-by-Side Comparison Between Infrared and Thermoreflectance Imaging Using a Thermal Test Chip with Embedded Diode Temperature Sensors Performance Improvements of Air-Cooled Thermal Tool with Advanced Technologies
Don’t miss the SEMI-THERM 28 exhibits! The exhibit hall features over 40 vendors focused on thermal management products and services, who are there to discuss their newest products and solutions. Parking vouchers, giveaways and nightly complimentary food and beverage receptions are offered in the exhibit area. Free vendor workshops are featured in the afternoons on the hour!
Closing Ceremony and Luncheon
SPECIAL SEMI-THERM PROGRAMS
LUNCHEON SPEAKERS
Tuesday: Status and Challenges of Meeting Global Energy Demands for 2030 - Peter Rodgers, Ph.D.
Wednesday: How We Caught the Unabomber - Terry Turchie, FBI (ret.)
EVENING TUTORIAL 6:30-8:00PM Tuesday, March 20
Thermal Challenges and Opportunities in Concentrated Photo-Voltaics - Seri Lee, Nanyang Technological University, Singapore
In this tutorial, technical challenges and market opportunities associated with the thermal management of concentrated photovoltaics (CPV) are examined.
The Heat is On Symposium 2012 MEPTEC will again join SEMI-THERM with their popular Thermal Management symposium to be held on Monday, March 19. Industry forecasts and roadmaps, regardless of the source, are in agreement that semiconductor devices are hot and getting hotter. This symposium will bring together a group of key speakers in the industry to share their concerns and solutions. Individuals from management to development groups to sales and marketing continue to monitor the thermal industry to look for new business opportunities. All professionals involved in or concerned about effective thermal management should attend. Visit www.MEPTEC.org. FOR INFORMATION, DETAILS, FULL EVENT DESCRIPTIONS AND MORE, VISIT WWW.SEMI-THERM.ORG
Efficient Electro-Thermal Simulation of Power Semiconductor Devices via Model Order Reduction
Tamara Bechtold, CADFEM GmbH, Stuttgart, Germany
Torsten Hauck, Freescale Semiconductor Inc., Munich, Germany
Leon Voss, ANSYS Germany GmbH, Darmstadt, Germany
E. B. Rudnyi, CADFEM GmbH, Stuttgart, Germany
Tamara Bechtold received her M.Sc. in microelectronics and microsystem engineering from the University of Bremen, Germany, in 2000 and her PhD in microsystem simulation from the University of Freiburg in 2005. Between 2006 and 2010, Bechtold worked as a researcher for Philips/NXP research laboratories in Eindhoven, the Netherlands. Since 2010 she has been with CADFEM GmbH in Stuttgart, Germany, where she supports industry and academia in the application of advanced modelling and simulation tools for system-level simulation and quasi-static electromagnetic device simulation.
Torsten Hauck received his diploma in material science from the University of Halle and his PhD degree in fracture mechanics from the University of Paderborn, both in Germany. In 1996 he joined Motorola, where he worked as a senior staff engineer in the semiconductor products sector. Since 2004 he has been with Freescale Semiconductor GmbH in Munich, Germany, where he is responsible for modelling and simulation of microelectronics components and microsystems.
Dr. Leon Voss received his PhD in the area of Power Electronics from the University of Bochum, Germany, and his M.A.Sc and B.A.Sc. in Electrical Engineering from the University of Waterloo, Canada. He has over 15 years professional experience in the areas of power electronics, electric power systems and drive technology. He joined ANSYS in 2007 as a Senior Application Engineer, where he is responsible for FEM, circuit and system simulation for electromechanical systems.
Evgenii Rudnyi received his Ph.D. in physical chemistry from Moscow State University (MSU), Russia. From 1985 to 2000, he was a Research Worker and Associate Professor in the Chemistry Department, MSU. In 1991, he was a Guest Scientist at the National Institute of Standards and Technology. From 2001 to 2006 he worked at the Institute of Microsystem Technology (IMTEK), University of Freiburg, Germany, as a Senior Scientist. Since 2006 he has worked at CADFEM in the area of thermal management, multiphysics and electromobility.
Thermal management is an increasingly important consideration in the development of power electronic systems. Electro-thermal simulation at the system level is required to ensure that the thermal requirements, which strongly influence performance, reliability and efficiency, are met under realistic loads. A significant modeling issue is how to obtain the compact but accurate thermal model needed for system level simulation. We suggest an efficient methodology based on modern mathematical algorithms of model order reduction (MOR) and demonstrate the approach on a power MOSFET model.
Electro-Thermal Simulation
Figure 1. Simple electro-thermal simulation with conservative coupling between thermal and electrical subsystems.
Figure 2. The idea of model order reduction: an automatic path from a finite element thermal model to electro-thermal simulation at the system level.
Electro-thermal simulation at the system level is a combined simulation of the electrical and thermal aspects of the system, as shown schematically in Figure 1. The circuit model calculates the power dissipation, which is used by the thermal subsystem to evaluate the temperatures. The temperatures in turn influence the circuit parameters, thus requiring a two-way coupling. The thermal model in Figure 1 is a simple, lumped-element model. For general complex geometries, a more accurate, physical model in the form of the heat transfer partial differential equation is required:
$$\nabla \cdot (\kappa \nabla T) + Q - \rho c_p \frac{\partial T}{\partial t} = 0 \qquad (1)$$
where $\kappa(r)$ is the thermal conductivity in W/(m·K) at the position r, $c_p(r)$ is the specific heat capacity in J/(kg·K), $\rho(r)$ is the mass density in kg/m³, $T(r,t)$ is the temperature distribution and $Q(r,t)$ is the heat generation rate per unit volume in W/m³. Spatial discretisation of (1), e.g. via the finite element method (FEM), results in a large-scale ordinary differential equation (ODE) system with n equations of the form:
$$E \dot{T} + K T = F \qquad (2)$$
where $E \in \mathbb{R}^{n \times n}$ is the heat capacity matrix, $K \in \mathbb{R}^{n \times n}$ is the heat conductivity matrix, $T \in \mathbb{R}^n$ is the vector of nodal temperatures as functions of time and $F \in \mathbb{R}^n$ is the heat source load vector. System (2) can also be considered as an electrical network in which the vector T is equivalent to the unknown voltages, the matrix E is the capacitance matrix and the matrix K is the conductance matrix (see [1] for methods to synthesize electrical networks). In both cases, (2) is not compatible with system level simulation, as the vector T usually contains several hundred thousand degrees of freedom. The remedy is to apply modern mathematical algorithms of model order reduction [2], [3], which enable a formal transformation of (2) to a low-dimensional ODE system of the same form but with a much smaller dimension, as described in the following section.
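To make the structure of (2) concrete, the following is a minimal sketch, assuming a hypothetical three-node lumped chain (junction, case, heat sink) rather than a finite element model. The capacities, conductances, heat load and the function name `step_response` are illustrative assumptions and are not taken from the article. The sketch integrates (2) with the backward Euler scheme and compares the result with the static solution K T = F.

```python
import numpy as np

def step_response(E, K, F, dt, n_steps):
    """Integrate E*dT/dt + K*T = F from T = 0 using the backward Euler scheme."""
    T = np.zeros(E.shape[0])
    A = E / dt + K                       # system matrix of every implicit step
    history = [T.copy()]
    for _ in range(n_steps):
        # Backward Euler: (E/dt + K) T_new = F + (E/dt) T_old
        T = np.linalg.solve(A, F + E @ T / dt)
        history.append(T.copy())
    return np.array(history)

# Hypothetical three-node chain: junction - case - heat sink - ambient.
# Temperatures are rises above ambient; all values are illustrative only.
c = np.array([0.01, 0.5, 5.0])           # heat capacities in J/K
g = np.array([2.0, 1.0, 0.5])            # link conductances in W/K

E = np.diag(c)                           # heat capacity matrix
K = np.array([[ g[0], -g[0],         0.0],
              [-g[0],  g[0] + g[1], -g[1]],
              [ 0.0,  -g[1],         g[1] + g[2]]])   # heat conductivity matrix
F = np.array([10.0, 0.0, 0.0])           # 10 W dissipated at the junction

history = step_response(E, K, F, dt=0.05, n_steps=4000)
T_steady = np.linalg.solve(K, F)         # static solution of K*T = F
print("temperature rise after 200 s:", history[-1])
print("static solution:            ", T_steady)
```

With three unknowns the linear solve in each time step is trivial; the same loop applied to several hundred thousand unknowns is what makes the full model impractical inside a system level simulator.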
Model Order Reduction
Model order reduction is an area of mathematics which enables a formal approximation of the physical model and hence the generation of a compact model suitable for system level simulation (see Figure 2). Model order reduction starts with the ordinary differential equation system (2), which is slightly modified as follows:
$$E \dot{T} + K T = B u(t), \qquad y = C T \qquad (3)$$
In (3), the load vector $F \in \mathbb{R}^n$ from (2) has been split into the product of a constant input matrix $B \in \mathbb{R}^{n \times p}$ and a vector $u(t) \in \mathbb{R}^p$ of p input functions (heat sources), and a second, output equation has been added. The output equation in (3) defines the linear combinations of the state vector T that are of interest for the particular application in system level simulation. This equation has to be defined by the user. Note that when the complete temperature field is required, the output matrix $C \in \mathbb{R}^{m \times n}$ (where m is the number of outputs of interest) has to be defined as the identity matrix.
Model reduction is based on the assumption that the “movement in time” of the high-dimensional state vector T can be well approximated within a low-dimensional subspace (ε in Figure 3 should be small). This subspace is defined by the projection matrix V. Provided the subspace is known, the original system can be projected onto it, as shown in Figure 3. How the subspace, i.e. the matrix V, is constructed depends on the particular MOR algorithm. For thermal systems, we propose to use the Krylov-subspace-based Arnoldi algorithm [7], which defines the projection matrix in such a way that the leading Taylor-series coefficients of the transfer function of the reduced system match those of the original dynamic system. Mathematically speaking, this approach is a form of Padé approximation and is also known as the moment-matching method, the Taylor-series coefficients being called moments. A detailed description of the algorithm and the theorems proving its moment-matching properties can be found in [2], [3] and [7].
For performing model order reduction of finite element models, we employ the software tool MOR for ANSYS [4]. It reads the system matrices created by the finite element simulator ANSYS [5], runs the Arnoldi algorithm and writes out the reduced system, i.e. its matrices $E_r$, $K_r$, $B_r$ and $C_r$ (see Figure 3). The reduced matrices can be read directly into Simplorer [6], MATLAB/Simulink and other system level simulation tools. It is also possible to write out the reduced model as a Spice model or as a template for use in Verilog-A and VHDL-AMS.
Figure 3. Model order reduction as a projection of the system onto the low-dimensional subspace defined by the matrix V. Different MOR algorithms define V in different ways. The approximation is good if the error ε is small. For thermal models, E is the heat capacity matrix, K is the heat conductivity matrix, F is a heat source load vector and T is a vector of nodal temperatures in time.
Electro-thermal System Simulation of a power MOSFET
The self-heating of a power MOSFET can cause a significant temperature rise associated with temperature gradients and thermal stresses in the device. Thermal cross-talk between transistor channels as well as temperature-dependent heat generation rates will accelerate the self-heating phenomenon, which can occasionally even cause an instance of thermal runaway. A good understanding of the electro-thermal behavior of the power switch, the electronic control unit and the load is key for the system design. The model generation path and the electro-thermal system simulation are demonstrated for a novel high-side switch device [8] with four independent channels. The device is packaged in a Small Outline IC (SOIC) package, which itself is assembled onto a printed circuit board. Figure 4 shows a model of the de-capsulated package (plastic molding compound removed) with the wire bonds connecting each transistor channel with its associated external lead fingers. The corresponding electro-thermal system model (Figure 5) contains two distinct thermal submodels, for the power switch and the printed circuit board (PCB), and five electrical submodels, one for each transistor channel and one for the light bulb. The thermal submodels have been extracted from the 3D finite element models by means of model order reduction, as described in the previous section. Power switch and PCB share one common node at the system level, which represents the thermal port between the SOIC exposed pad and the adjacent PCB land pad. The PCB model also contains the heat convection from the system into the ambient air. The electrical transistor models are able to account for dynamic self-heating phenomena. In addition to the electrical I/Os (gate, drain and source), these models each contain one electro-thermal port T interconnecting the electrical and thermal domains of the power switch.
Figure 4. Solid model of the power switch, SOIC package mounted on PCB, de-capsulated.
Figure 5. Schematics of a power switch driving four light bulbs. Only the first light bulb is shown; the other three are identically connected to T2, … , T4.
Figure 6 shows the thermal response of a full-scale model and of a reduced order model of a single transistor. The difference is less than one percent. Model reduction takes only 80 s, while a transient simulation of the full-scale model with 60 time steps requires about 2000 s. By contrast, the system level simulation takes less than a second. In the system level model shown in Figure 7, a single transistor has been loaded sequentially with one, two or three lamps.
The light-bulb model is represented by an existing Spice model, which was imported directly. Figure 8 shows the junction temperature as a function of time for different numbers of lamps. One can see that with three lamps the temperature exceeds 200 degrees Celsius for a short time of about 5 ms. The thermal model considered in this study cannot tell us whether this is acceptable or not, but it demonstrates the need for transient electro-thermal simulation: the stationary temperatures are acceptable for all three curves in Figure 8, and only transient simulation can reveal that the temperature goes over the critical temperature. More information can be found in [9] and [10].
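The reason a transient peak can exceed the stationary temperature is the inrush current of a cold filament combined with the two-way coupling between junction temperature and on-resistance. The following is a minimal sketch of that mechanism under loudly stated assumptions: every parameter value, the exponential warm-up law of the lamp resistance, the linear temperature coefficient of the on-resistance and the two-node thermal network are hypothetical and are not the Spice lamp model, the reduced thermal model or the device data used in the study.

```python
import math

# All values below are hypothetical, chosen only to illustrate the mechanism.
V_SUPPLY = 12.0                 # supply voltage in V
R_COLD, R_HOT = 0.5, 6.0        # lamp resistance, cold and hot, in ohm
TAU_LAMP = 0.02                 # assumed filament warm-up time constant in s
R_ON_25, ALPHA = 0.015, 0.005   # on-resistance at 25 C in ohm, temperature coefficient in 1/K
R_JC, C_J = 3.0, 1e-3           # junction-to-case resistance in K/W, junction capacity in J/K
R_CA, C_C = 30.0, 0.5           # case-to-ambient resistance in K/W, case capacity in J/K
T_AMB = 25.0                    # ambient temperature in C

def on_resistance(t_junction):
    """Temperature-dependent on-resistance (assumed linear law)."""
    return R_ON_25 * (1.0 + ALPHA * (t_junction - 25.0))

def peak_junction_temp(n_lamps, t_end=0.05, dt=1e-5):
    """Explicit Euler integration of the coupled electrical and thermal equations."""
    tj = tc = peak = T_AMB
    for k in range(int(round(t_end / dt))):
        t = k * dt
        r_lamp = R_HOT - (R_HOT - R_COLD) * math.exp(-t / TAU_LAMP)
        r_on = on_resistance(tj)                        # thermal -> electrical
        current = V_SUPPLY / (r_lamp / n_lamps + r_on)  # lamps in parallel
        power = current ** 2 * r_on                     # electrical -> thermal
        d_tj = (power - (tj - tc) / R_JC) / C_J
        d_tc = ((tj - tc) / R_JC - (tc - T_AMB) / R_CA) / C_C
        tj += dt * d_tj
        tc += dt * d_tc
        peak = max(peak, tj)
    return peak

def steady_junction_temp(n_lamps):
    """Stationary junction temperature with hot lamps, by fixed-point iteration."""
    tj = T_AMB
    for _ in range(100):
        r_on = on_resistance(tj)
        current = V_SUPPLY / (R_HOT / n_lamps + r_on)
        tj = T_AMB + (R_JC + R_CA) * current ** 2 * r_on
    return tj

results = {n: (peak_junction_temp(n), steady_junction_temp(n)) for n in (1, 2, 3)}
for n, (peak, steady) in results.items():
    print(f"{n} lamp(s): transient peak {peak:.1f} C, stationary {steady:.1f} C")
```

In this toy model the transient peak lies above the stationary value for every lamp count, which is the qualitative behavior the article reports; the absolute temperatures depend entirely on the assumed parameters.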
Conclusion
We have shown that model order reduction is an efficient tool to automatically generate an accurate compact thermal model for power electronic devices. The advantages of the method are as follows:
• The method uses the system matrices from the accurate finite element model directly.
• The method is automatic. The user only has to set the dimension of the reduced model and define the outputs of interest. In our experience, an accuracy of one percent is achieved with 10-15 degrees of freedom per input.
• The model reduction process is fast. The time for model reduction is comparable with that of a static solution and usually, depending on the chosen integration time step, shorter than a single transient run with the original finite element model.
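The workflow summarized above (build a Krylov subspace, project the system matrices, compare reduced and full responses) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MOR for ANSYS implementation: the full model is a hypothetical 100-node one-dimensional conduction chain with made-up material values, dense NumPy solves replace the sparse factorization a real tool would use, and the names `arnoldi_basis` and `step_response` are ours.

```python
import numpy as np

def arnoldi_basis(E, K, b, r):
    """Orthonormal basis of span{K^-1 b, (K^-1 E) K^-1 b, ...}, i.e. expansion about s = 0."""
    V = np.zeros((K.shape[0], r))
    v = np.linalg.solve(K, b)
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(1, r):
        w = np.linalg.solve(K, E @ V[:, j - 1])
        for _ in range(2):                    # Gram-Schmidt, repeated once for robustness
            for i in range(j):
                w -= (V[:, i] @ w) * V[:, i]
        V[:, j] = w / np.linalg.norm(w)
    return V

def step_response(E, K, b, c, dt, n_steps):
    """Output y = c.T of E*dT/dt + K*T = b*u(t) for a unit step u (backward Euler)."""
    T = np.zeros(K.shape[0])
    A = E / dt + K
    y = np.empty(n_steps)
    for k in range(n_steps):
        T = np.linalg.solve(A, b + E @ T / dt)
        y[k] = c @ T
    return y

# Hypothetical full model: chain of n nodes, heated at node 0, tied to ambient at the far end.
n, r = 100, 15                    # full and reduced dimensions
g, cap = 1.0, 0.01                # link conductance in W/K, node heat capacity in J/K (assumed)
K = g * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
K[0, 0] = g                       # heated end has a single neighbour
E = cap * np.eye(n)
b = np.zeros(n)
b[0] = 1.0                        # input: 1 W at node 0
c = b.copy()                      # output: temperature rise of node 0

V = arnoldi_basis(E, K, b, r)
Er, Kr = V.T @ E @ V, V.T @ K @ V  # projected (reduced) matrices
br, cr = V.T @ b, V.T @ c

y_full = step_response(E, K, b, c, dt=0.5, n_steps=200)
y_red = step_response(Er, Kr, br, cr, dt=0.5, n_steps=200)
T_ss = c @ np.linalg.solve(K, b)   # stationary temperature rise of node 0
rel_err = np.max(np.abs(y_full - y_red)) / T_ss
print(f"reduced order {r} vs full order {n}: max error {100 * rel_err:.3f} % of the stationary rise")
```

Because the first basis vector is the static solution, the reduced model reproduces the stationary temperature of the full model, and the remaining vectors improve the transient accuracy.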
Figure 6. Comparison of the transient thermal response of a transistor model: the reduced model of order 30 vs. the original model of order 300,000.
Figure 7. Simulation model with bulb lamps.
Figure 8. Junction temperature of the transistor vs. time as a function of the number of lamps.
REFERENCES
[1] J. T. Hsu, L. Vu-Quoc. A rational formulation of thermal circuit models for electrothermal simulation. Part I: Finite element method. IEEE Trans. Circ. Syst. I: Fund. Theor. Appl., v. 43, N 9, pp. 721-732, 1996.
[2] T. Bechtold, E. B. Rudnyi, J. G. Korvink. Fast Simulation of Electro-Thermal MEMS: Efficient Dynamic Compact Models. Springer, 2006, ISBN: 978-3540-34612-8.
[3] A. C. Antoulas. Approximation of Large-Scale Dynamical Systems. Society for Industrial and Applied Mathematics, 2005, ISBN: 0898715296.
[4] E. B. Rudnyi, J. G. Korvink. Model Order Reduction for Large Scale Engineering Models Developed in ANSYS. Lecture Notes in Computer Science, v. 3732, pp. 349-356, 2006, http://ModelReduction.com.
[5] ANSYS® Mechanical, Release 13.0, http://www.ansys.com
[6] ANSYS Simplorer®, Release 9.0, http://www.ansys.com
[7] R. W. Freund. Krylov-subspace methods for reduced order modeling in circuit simulation. Journal of Computational and Applied Mathematics, v. 123, pp. 395-421, 2000.
[8] MC15XS3400: Quad High Side Switch, http://www.freescale.com
[9] T. Hauck, W. Teulings, E. B. Rudnyi. Electro-Thermal Simulation of Multi-channel Power Devices on PCB with SPICE. THERMINIC 2009, 7-9 October, Leuven, Belgium, 6 p.
[10] T. Hauck, I. Schmadlak, L. Voss, E. B. Rudnyi. Electro-Thermal Simulation of Multi-channel Power Devices: From Workbench to Simplorer by means of Model Reduction. NAFEMS Seminar: Multi-Disciplinary Simulations - The Future of Virtual Product Development, 9-10 November 2009, Wiesbaden, Germany, paper #5, 10 p.
The Increasing Challenge of Data Center Design and Management: Is CFD a Must?
Mark Seymour, Christopher Aldham, Matthew Warner, Hassan Moezzi
Future Facilities, San Jose, California, USA
Mark Seymour, MSEE, MASHRAE, director of Future Facilities, started his career developing experiments to verify simulation tools he developed in the defence industry. In 1988, he moved into the building services industry, establishing CFD analysis alongside on-site testing and full- and part-scale modelling of the built environment. During this time he was responsible for the user requirements for the first building HVAC CFD program at Flomerics. In 2004 with two colleagues he formed Future Facilities and was subsequently responsible for development of the 6SigmaDC software for data centers.
Dr. Chris Aldham, PhD, software product manager at Future Facilities, has worked in computational fluid dynamics (CFD) for over 30 years and for more than 20 years in the field of electronics cooling. After 16 years at Flomerics, Aldham joined Future Facilities to work on 6SigmaDC, a suite of integrated software products that tackles head-on the challenges of data center lifecycle engineering (including equipment design analysis) through the Virtual Facility.
Matt Warner, BEng, PhD, software product manager at Future Facilities, is a mechanical engineer and received a PhD in numerical analysis and simulation from the University of Bath. For 15 years Warner has developed virtual modelling techniques for a diversity of industries where physical prototyping is not feasible. Over the last five years at Future Facilities, he has worked closely with data center professionals to improve resilience, energy efficiency and utilization in data centers using engineering simulation techniques.
Hassan Moezzi, director of Future Facilities, has been involved with the CFD industry for 25 years. As one of the founders of Flomerics (now part of Mentor Graphics), Moezzi led the introduction of FLOTHERM, the first CFD software for electronics cooling. Moezzi left Flomerics in 2004 to set up Future Facilities Ltd, a provider of the 6SigmaDC suite of CFD-based software solutions for the data center and electronics industries worldwide.
Data centers have been cooled for many years by delivering cool air to the IT equipment via the room. One of the key advantages of this approach is the flexibility that it provides the owner / operator in terms of equipment deployment. In principle all that is necessary is to determine the maximum power consumption of the equipment and provide an equivalent amount of cooling to the data center. Why then, since we have been building and operating data centers for decades using air cooling, do data centers experience hot spots and fail to reach their design expectations for capacity? This article will explain some of the challenges faced in the search for the perfect data center and why, given the variability of equipment design and the time-varying nature of the data center load, CFD, while not the only tool to be used in the design and / or management of a data center, is an essential tool to enable maximum performance.
IT Equipment Airflow Is Important
Design For Airflow As Well As kW
Figure 1. Example showing poor airflow balancing, with over-supply causing waste and under-supply causing recirculation and potentially overheating.
Figure 2. Example showing several different airflow regimes from equipment: front to back, upward outflow, and mixed side to side and front to back.
Traditional design simply ensured that the cooling system provided sufficient cooling in terms of kW per area or per cabinet. When power densities were low, this was sufficient, since any recirculation of air only resulted in moderately warmed air entering the equipment. Increasing heat densities in electronics often result in higher-temperature recirculation, putting equipment at risk. The room cooling performance will be affected by the ratio of the air supplied to the room to the IT equipment airflow demand. Too much cooling air (Figure 1, left) results in wasted energy, with cool air returning unused to the cooling system. Too little cooling air (right) will be compensated for by drawing potentially warm ‘used’ air in from the surrounding environment, risking overheating. The designer must therefore consider airflow balance, as well as cooling power, and incorporate a strategy to address the changing needs likely during the life of the data center.
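The airflow balance discussed above can be estimated from a steady-state energy balance on the air: the power removed equals density times specific heat times volume flow times air temperature rise. The following is a minimal sketch, assuming typical air properties and an entirely hypothetical set of cabinet loads and supply flow; none of the numbers come from the article, and the function names are ours.

```python
RHO_AIR = 1.2     # kg/m^3, assumed air density
CP_AIR = 1005.0   # J/(kg K), assumed specific heat of air

def airflow_demand(power_w, delta_t_k):
    """Volume flow in m^3/s needed to remove power_w with an air temperature rise delta_t_k."""
    return power_w / (RHO_AIR * CP_AIR * delta_t_k)

def supply_ratio(supply_m3s, cabinets):
    """Ratio of the cooling air supplied to the room to the total IT airflow demand."""
    demand = sum(airflow_demand(p, dt) for p, dt in cabinets)
    return supply_m3s / demand

# Hypothetical room: (power in W, air temperature rise in K) for each cabinet
cabinets = [(5000.0, 12.0), (5000.0, 12.0), (8000.0, 15.0)]
demand = sum(airflow_demand(p, dt) for p, dt in cabinets)
ratio = supply_ratio(1.0, cabinets)   # 1.0 m^3/s of supply air (assumed)
print(f"IT airflow demand: {demand:.3f} m^3/s, supply/demand ratio: {ratio:.2f}")
```

A ratio above one corresponds to the over-supply case of Figure 1 (left) and a ratio below one to the under-supply case (right). Such a room-level balance says nothing about where the air actually goes, which is the point of the sections that follow.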
Impact Of Real Equipment On Cooling Performance
Figure 3. Example showing the impact of air volume and temperature rise on the flow from a server.
There are no generally accepted standards for equipment cooling. This provides a blank canvas for the equipment designer, who can freely choose how to package the equipment and cool it. The equipment design will change the equipment’s demands on its environment and how best to configure the rack and the data center. This variability in design affects the room cooling requirement, because equipment draws air in from differing locations / faces and similarly exhausts it from different locations and in different directions. Historically, while servers have generally been designed with front to back airflow, other equipment has often varied. Figure 2 shows some typical examples. The choice of flow rate and temperature rise for the equipment also affects room airflow patterns and temperatures. A low flow (Figure 3, left) results in slow-moving air that rises due to buoyancy. At higher flows (right), velocity dominates and the air shoots away without rising appreciably. Flow rate will depend upon equipment design but also on operational details such as equipment utilization or environmental conditions. The designer and operator must recognize the need for the design to be flexible so that it can accommodate change over time. The choice of IT equipment and how it is utilized will affect the cooling performance of the data center; consequently, airflow must be considered throughout the data center life.
Configuring The Data Center
Rack Configuration
The data center is an evolving entity in which deployments are easily made but often not easily reversed. In practice the consequences of these deployments are often not seen until later in the data center lifetime. Failure to plan for the consequences of deployments on airflow and cooling can, and commonly does, result in a gradual deterioration of cooling performance and hence of the capacity of the data center. It is not unusual for a data center to experience cooling difficulties and hotspots at only two-thirds of its design capacity in terms of kW of equipment deployed. This apparently unusable capacity is often called ‘stranded capacity’. It is stranded because, without further assessment and reconfiguration, it may not be possible to use the full design capacity without creating hotspots.
This is not an issue at room level alone; it is affected by deployment decisions within the cabinet too. It is common to consider issues such as blanking when placing equipment in a cabinet but less common to consider equipment interaction. Figure 4 shows the air circulation and temperatures in a cabinet where only two IT deployments have been made. First a blade system has been deployed and then three 1U servers added in the slots immediately above. In this instance the hot air from the blade system is confined to the lower part of the cabinet and recirculates under the cabinet, causing high inlet temperatures and reduced resilience. If the equipment is deployed in the opposite sequence (Figure 5), with the 1U servers in the lower 3 slots and the blade system above, then the recirculation is dramatically reduced, resulting in lower inlet temperatures and greater resilience.
Figure 4. Blade system deployed in cabinet below 1U servers.
Figure 5. Blade system deployed in cabinet above 1U servers.
The configuration of equipment inside the cabinet can equally affect room conditions. Figure 6 shows that, with no modification to the rack cooling strategy, the room conditions and equipment temperatures are dramatically changed when front to back ventilated servers are replaced by routers of the same power with mixed airflow.
Figure 6. Example showing front to back servers cool while routers are hot in the same configuration.
Room Configuration
The primary challenges for the owner / operator are:
a. How to deploy, side by side in the data center, different types of equipment with different demands for airflow and cooling, when the data center was intended to provide a relatively uniform cooling capability;
b. How to be prepared for and accommodate future generations of equipment with characteristics as yet unknown.
It is unlikely that the cooling system will deliver airflow and cooling uniformly throughout the data center. This means that the owner / operator must be aware of the impact of the deployment location in the context of the actual performance of the data center. Consider a scenario where 2 new Sun M5000s are to be deployed. Figure 7 shows that the impact differs depending on the choice of installation location, even though in principle all the chosen locations have space, power and cooling. The different results are not easily predicted using simple rules, as they result not only from the equipment heat load and equipment airflow but also from the configuration, in particular the interaction between the equipment and room airflows.
The increase in power density and soaring energy costs, combined with the growing awareness of a need for environmental responsibility, have prompted the fundamental design of data center cooling to be revisited. New approaches, such as aisle containment, are implemented to increase efficiency, but the increased efficiency will only be achieved if equipment airflows are appropriately controlled / balanced. If this is not done, potentially damaging high-temperature recirculation can still occur (Figure 8). Matters affecting the cooling performance are not limited to equipment deployment alone but also include the infrastructure, such as ACU and PDU deployment and the location and size of cable routes.
So should CFD be used?
Alternative approaches to design and analysis, such as rules of thumb and hand calculations, or more sophisticated approaches such as modeling using potential flow theory, provide an insight into cooling performance. However, they all fail, to a greater or lesser degree, to account fully for the important issue of airflow and its momentum.
Figure 7. Example showing 2 different spatial deployments of 2 items of equipment - only option 2 works.
Figure 8. Recirculation in a poorly contained scenario.
Computational Fluid Dynamics, or CFD for short, provides a unique tool capable of modeling the data center and equipment installation from conceptual design right through to detailed modeling for operation. Because it provides a 3-dimensional model of the facility, it can account for almost any feature and combination of features, in a manner very similar to its use in electronic equipment design. However, modeling a data center can be somewhat more challenging, since the data center is a dynamic, changing ‘electronics design’. The above illustrations show the potential for CFD to predict the significant impact of seemingly small variations in configuration. So, what are the barriers that must be overcome for successful use of CFD? From an academic point of view, people may point to the more theoretical aspects of CFD such as turbulence modeling, gridding, solution time and indeed including the full physics. In practice, however, these difficulties are normally far outweighed by the difficulty of capturing the true configuration (such as the features described above) in sufficient detail to predict the resulting environment. CFD can relatively easily be used for concept design decisions, but even here there is a need for education about the significant risk of ignoring equipment type and the resulting airflow, temperature effects and likely deployment locations. For real facilities it is even more challenging, and it is essential that measurements are made alongside the modeling process in order to ascertain that the model reflects reality. Why? Because some details can never be represented precisely (e.g. unstructured cabling, damper settings …) and others may depend on operational factors such as equipment utilization. For the latter, it is hard to gather enough airflow and heat dissipation data for equipment to characterize it fully in deployment. Here the electronics industry can make an important contribution by publishing data more openly and indicating likely trends for planning purposes.
Once this careful data gathering exercise has been completed then, with the right tools, it is relatively straightforward to build a CFD model of the data center and check that it reflects reality. Once a ‘calibrated’ model is achieved, it can be used with reasonable confidence to make deployment decisions and undertake other tests such as failure scenarios. In the view of the authors it is imperative, if the potential for stranded capacity is to be minimized, that simulations be undertaken frequently to understand the implications of deployment decisions, and that monitoring and comparison with measured data be carried out regularly to ensure the model continues to reflect reality. With the increasing availability of (live) measured data, maturing CFD tools and increasing pressure to improve energy efficiency while still maintaining availability, CFD used with an appropriate methodology can be a critical tool for design and management.
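The regular comparison between model and measurement recommended above can be as simple as a per-sensor difference check. The following is a minimal sketch, assuming hypothetical cabinet names, inlet temperatures and a 2 K tolerance; it is not a feature of any particular CFD product, and the function name `calibration_report` is ours.

```python
import math

def calibration_report(measured, simulated, tolerance_k=2.0):
    """Return the RMS difference and the sensors where model and measurement disagree."""
    diffs = {name: simulated[name] - measured[name] for name in measured}
    rms = math.sqrt(sum(d * d for d in diffs.values()) / len(diffs))
    outliers = sorted(name for name, d in diffs.items() if abs(d) > tolerance_k)
    return rms, outliers

# Hypothetical cabinet inlet temperatures in degrees Celsius
measured = {"A01": 22.5, "A02": 24.0, "B01": 27.5}
simulated = {"A01": 23.0, "A02": 23.5, "B01": 31.0}

rms, outliers = calibration_report(measured, simulated)
print(f"RMS difference: {rms:.2f} K, sensors outside tolerance: {outliers}")
```

A sensor that drifts outside the tolerance is a prompt to revisit the model inputs at that location, for example blanking, cable routes or equipment utilization, before the model is used for further deployment decisions.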
Energy Reduction and Performance Maximization through Improved Cooling
David Copeland
Oracle Corporation, Santa Clara, California, USA
David Copeland works in Thermal Engineering and Packaging Technology in Oracle Systems Microelectronics, developing packaging and cooling technology for UltraSPARC processors and the systems which use them. Areas of development include thermal interfaces, heat spreading materials, embedded heat pipe and vapor chamber heatsinks, single-phase and phase-change liquid cooling and data center cooling. Copeland received the BS from Massachusetts Institute of Technology, MS from Stanford University and DrEng from Tokyo Institute of Technology, all in Mechanical Engineering. Before joining Sun Microsystems in 2005, he worked in packaging and cooling at IBM, Hitachi and Fujitsu and in heatsink development and design at Intricast, Sumitomo and Showa Aluminum. Copeland belongs to ASME, IEEE and IMAPS and is a frequent participant at conferences on electronics cooling.
The International Technology Roadmap for Semiconductors [1] predicts high performance processor power density to more than double by the year 2024, while during the same time allowable junction temperature will decrease from 90°C to 70°C, reducing the junction-to-ambient temperature difference to nearly half. This combined challenge to cooling will require the total thermal resistance to decrease by almost a factor of four. Even if these changes are only half as great, significant improvements in cooling will be required.

Figure 1. Package and heatsink.

Yet independent of increases in power density, advantages to improved cooling can be demonstrated. Leakage current has become an increasing fraction of processor power with each technology node. As leakage current is strongly temperature dependent, processor power dissipation can be reduced through improved cooling.

Achievable processor frequency is strongly dependent on temperature and voltage. The voltage dependence is approximately proportional, while temperature dependence of performance has reduced with each technology node. In the near future, the temperature dependence of performance may almost disappear. Through improved cooling, temperature can be reduced while voltage and frequency are increased, resulting in higher system performance.

For a given cooling configuration, a combination of voltage and temperature exists which maximizes system performance per watt. At higher powers, additional performance is achieved at the expense of energy. Such increases may be limited by electromigration and other failure mechanisms, which are functions of both temperature and voltage. The temperature dependence of leakage current has been increasing through recent technology nodes. Significant decreases in energy consumption and/or increases in processor frequency can be achieved through improved cooling.
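The "almost a factor of four" estimate in the opening paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: the article does not state the ambient temperature behind the "nearly half" figure, so the 45°C local ambient is an assumption, as is taking the power density increase to be exactly a factor of two. The function name is hypothetical.

```python
def resistance_reduction_factor(power_density_ratio, tj_old, tj_new, t_ambient):
    """Factor by which thermal resistance (per unit area) must fall.

    Thermal resistance is temperature difference divided by heat flux, so the
    required reduction is the heat flux increase times the shrinkage of the
    junction-to-ambient temperature difference.
    """
    dt_old = tj_old - t_ambient
    dt_new = tj_new - t_ambient
    if dt_old <= 0 or dt_new <= 0:
        raise ValueError("junction temperature must exceed ambient temperature")
    return power_density_ratio * dt_old / dt_new


# Assumed inputs: power density doubles, local ambient is 45 degC (not from the article).
factor = resistance_reduction_factor(2.0, tj_old=90.0, tj_new=70.0, t_ambient=45.0)
print(f"temperature difference shrinks to {25.0 / 45.0:.2f} of its old value")
print(f"thermal resistance must fall by a factor of {factor:.1f}")
```

With a cooler assumed ambient the temperature difference shrinks less and the required factor is smaller, so the result is sensitive to that choice.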
LEAKAGE CURRENT EFFECTS
Leakage current is current which bypasses the transistor gate, which is always present regardless of whether the gate is active, and which increases with both voltage and temperature. It can be considered as wasted energy, unlike energy used to perform computation.

Leakage current effects have increased as semiconductor lithography has progressed well into the subcontinuum range [2]. In some ASICs, leakage current can be over half of the total power. As the industry moves from 45nm to 32nm to 22nm and beyond, designs are intended to limit leakage current to about one-third of total power dissipation. Some of this will be achieved by lower junction temperatures. The dependence of leakage current on junction temperature is also becoming stronger. At 65nm, leakage current would change by a factor of two over a temperature range of about 45°C; at 32nm, the range may shrink to 22°C [3]. So the energy savings due to a given temperature reduction will increase with technology improvements. From the viewpoint of effectiveness of improved cooling, a relevant metric is the percent reduction in total power versus temperature. Recent observations of processors on the market show this to range from about 0.5%/°C up to almost 2%/°C [4].

IMPROVED COOLING
Ellsworth and Simons [5] discussed a variety of package cooling improvements which could be used to cool up to several hundred W/cm². These began with traditional air cooling of a lidded package as shown in Figure 1. As the thermal path through the package substrate offers a high resistance, nearly all the heat flows upwards and outwards while spreading through the package lid and heatsink base, then finally to the heatsink fins and into the air stream. The heat flow path is through the following components:
• Silicon die
• Thermal interface material 1 (TIM1)
• Package lid
• Thermal interface material 2 (TIM2)
• Heatsink base
• Heatsink fins
• Coolant

Figure 2. Thermal resistance components.

Nonuniformity of power dissipation within the die increases package thermal resistance above that of a package with a uniformly powered die by a significant factor, typically two or more [6, 7]. For a given nonuniform power distribution, this factor will increase as the thermal path is improved, moving asymptotically toward the ratio of the highest local power density to the average power density.

Recent developments in packaging and cooling largely consist of replacing traditional materials with those having higher thermal conductivity. In the package, organic TIM1 with metallic or ceramic fillers can be replaced by a completely metallic material, typically indium or indium alloy. Copper package lids can be replaced by composite materials, such as aluminum-graphite, copper-graphite, aluminum-diamond or silicon carbide-diamond. External to the package, heatsinks feature embedded heatpipes or vapor chambers, and in some cases the vapor chamber shares a continuous vapor/liquid space with heatpipes extending into the fins. A cold plate can replace the heatsink or become an integral part of the package, acting as the lid. Ultimately, coolant can contact the silicon with no interfaces between.

At the system level, liquid can be used to transfer heat to air through a liquid-to-air heat exchanger, to another liquid through an interface or liquid-to-liquid heat exchanger, or the liquid can flow across the system boundary to transfer heat to the datacenter and beyond. Such technologies are described in detail by Kang and Miller [8].

PERFORMANCE MODELING
Figure 3. Component temperatures at nominal frequency.
The processor has a nominal power dissipation of 240W, operating at nominal conditions of 0.85V and 95°C. Thermal resistance components of various combinations of packaging and cooling are shown in Figure 2. Values shown are typical for a die approximately one square inch (645mm²) in modern flip-chip packaging. Design variations consist of the following:
• Package thermal interface material (TIM1): The TIM1 can be organic or metallic in a traditional lidded package. Coolant on silicon removes the TIM1 and lid or cold plate, simplifying package construction.
• Package type: The package can be lidded, which requires another thermal interface material (TIM2), or can use a cold plate lid, which eliminates TIM2.
• System cooling configuration, which can be one of three types:
  • Air, in which all heat is dissipated to the air stream without liquid as an intermediary. In this case the air stream is fixed at 35°C, the allowable limit for a typical datacenter.
  • Internal liquid, in which liquid is used to transfer heat to the air stream.
  • External liquid, in which heat is transferred to liquid without air as an intermediary. In this case, the external liquid temperature is fixed at 25°C, slightly above the allowable maximum dewpoint of a typical datacenter.

Electrical performance is based on a generic technology representative of recent and upcoming process nodes, and is modeled by a few simple assumptions:

Achievable frequency is assumed to be proportional to voltage:

$$\mathrm{Freq} \propto V \tag{1}$$

Dynamic power is assumed to be proportional to frequency times voltage squared:

$$P_d \propto \mathrm{Freq} \cdot V^2 \tag{2}$$

So dynamic power becomes proportional to voltage cubed:

$$P_d \propto V^3 \tag{3}$$
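The three proportionalities above can be tried out numerically. This is a minimal sketch rather than the author's model code: it only rescales from the nominal 0.85V operating point, and the function name and the 0.90V trial voltage are illustrative assumptions.

```python
V_NOM = 0.85  # nominal supply voltage in volts, from the article


def scale_with_voltage(v, v_nom=V_NOM):
    """Return (frequency ratio, dynamic power ratio) relative to nominal.

    Frequency is taken as proportional to voltage, and dynamic power as
    proportional to frequency times voltage squared, i.e. voltage cubed.
    """
    ratio = v / v_nom
    frequency_ratio = ratio
    dynamic_power_ratio = frequency_ratio * ratio ** 2
    return frequency_ratio, dynamic_power_ratio


# Hypothetical trial point: raise the supply from 0.85 V to 0.90 V.
freq_ratio, power_ratio = scale_with_voltage(0.90)
print(f"frequency: {100 * (freq_ratio - 1):+.1f}%")
print(f"dynamic power: {100 * (power_ratio - 1):+.1f}%")
```

Under these assumptions a voltage increase of about 6% buys about 6% more frequency at the cost of nearly 19% more dynamic power, which is why the operating point matters so much for efficiency.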
Static power (leakage current) is also proportional to voltage cubed. At nominal conditions, static power is fixed at 35% of total power:

$$P_s / P_t = P_s / (P_s + P_d) = 0.35 \tag{4}$$

Static power is assumed to change by a factor of two every 22°C, expressed by:

$$P_s \propto \exp[0.0315\,(T_j - 95)] \tag{5}$$

At nominal frequency (and voltage) the temperatures of various packaging components are shown in Figure 3. While most of the temperature reduction is due to improved cooling, some is due to the reduction in static power. This positive feedback converges toward a lower operating temperature.

RELIABILITY CONCERNS
One practical limit to increasing power is reliability, which is typically dominated by gate oxide failure. The failure rate is assumed to change by a factor of two every 10°C (typical of modern technology nodes, and coincidentally equal to an often misapplied rule of thumb which has been around for decades) and/or every 15mV in the present study, resulting in:

$$\mathrm{Fail} \propto \exp\{0.0693\,(T_j - 95) + 46.4\,(V - 0.85)\} \tag{6}$$

As the cooling performance is increased and thermal resistance is reduced, the reliability limit is reached at decreasing temperatures, higher voltages and slightly higher power than nominal. Ultimately leakage current is reduced to only one-seventh of its nominal value.

FREQUENCY IMPROVEMENTS
Permitting the processor to operate at the highest combination of temperature and voltage, while maintaining gate oxide reliability equal to or better than that of the nominal case, results in frequency improvements as high as 19%, as shown in Figure 4. The slopes of the lines from the origin to the operating point are proportional to computational efficiency, defined as frequency divided by power. In this case, as voltage is lowered, efficiency improves monotonically, with packaging and cooling technology making very little difference at the lowest values.

Figure 4. Processor frequency.

SYSTEM PERFORMANCE
If the processor provided all or nearly all of the power dissipation in a system, very low voltage (and frequency) operation would be most efficient. But processors consume, in all but a few cases, less power than memory. Volume systems typically feature fewer dual inline memory modules (DIMMs) per socket than high-end systems. High-end systems have been described as large boxes of memory with other hardware to drive it, and this is a reasonable description in terms of power dissipation, volume occupied and component cost. Adding a constant memory power load changes power per socket and system performance to that of Figure 5. In this case memory power is equal to twice that of nominal processor power, typical of a modern high-end system.

Figure 5. System performance.

System performance is assumed to scale with the square root of processor frequency. This exponent relating processor frequency and system performance is at the low end of the range seen, which approaches a value of one at the high end and is strongly dependent on the program being exercised. Unlike the example of a processor only, an optimum operating point at which the slope of power versus performance is maximized falls close to the midpoint of the range evaluated. This is slightly different for each packaging and cooling configuration. Dividing performance by power provides computational efficiency, which is shown compared to the nominal configuration in Figure 6. Here the optimum operating condition shifts to lower power as cooling is improved.
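The pieces of the model described in the preceding sections can be combined into a small electro-thermal sketch. It is not the author's implementation. It treats leakage (static) power as the temperature-dependent term, solves the feedback between power and junction temperature by fixed-point iteration, and adds the constant memory load. The thermal resistances are assumptions: 0.25°C/W is what 240W between 35°C air and a 95°C junction would imply if the nominal case is the air-cooled one, and 0.15°C/W with 25°C liquid is a purely hypothetical improved configuration. The runaway cutoff and all function names are also hypothetical.

```python
import math

# Nominal operating point quoted in the article.
P_NOM = 240.0               # W, processor power
V_NOM = 0.85                # V
TJ_NOM = 95.0               # degC
STATIC_FRACTION = 0.35      # static share of total power at nominal
MEMORY_POWER = 2.0 * P_NOM  # W, constant memory load

LEAK_COEFF = math.log(2.0) / 22.0  # 1/degC, about 0.0315 (doubling every 22 degC)
TJ_RUNAWAY = 200.0                 # degC, hypothetical cutoff for a diverging iteration


def processor_power(v, tj):
    """Processor power in W at voltage v and junction temperature tj."""
    v_cubed = (v / V_NOM) ** 3
    dynamic = (1.0 - STATIC_FRACTION) * P_NOM * v_cubed
    static = STATIC_FRACTION * P_NOM * v_cubed * math.exp(LEAK_COEFF * (tj - TJ_NOM))
    return dynamic + static


def solve_junction_temperature(v, r_th, t_coolant, tol=1e-6, max_iter=500):
    """Iterate Tj = T_coolant + R_th * P(V, Tj) until it stops changing."""
    tj = TJ_NOM
    for _ in range(max_iter):
        tj_next = t_coolant + r_th * processor_power(v, tj)
        if tj_next > TJ_RUNAWAY:
            raise RuntimeError("thermal runaway: no stable operating point found")
        if abs(tj_next - tj) < tol:
            return tj_next
        tj = tj_next
    raise RuntimeError("junction temperature did not converge")


def relative_failure_rate(v, tj):
    """Gate oxide failure rate relative to nominal, using the Equation 6 coefficients."""
    return math.exp(0.0693 * (tj - TJ_NOM) + 46.4 * (v - V_NOM))


def system_efficiency(v, r_th, t_coolant):
    """Performance per watt, with performance taken as sqrt(frequency) and frequency as V."""
    tj = solve_junction_temperature(v, r_th, t_coolant)
    performance = math.sqrt(v / V_NOM)
    return performance / (processor_power(v, tj) + MEMORY_POWER)


# Assumed configurations: (thermal resistance in degC/W, coolant temperature in degC).
configs = {"air (assumed nominal)": (0.25, 35.0), "liquid (hypothetical)": (0.15, 25.0)}
baseline = system_efficiency(V_NOM, *configs["air (assumed nominal)"])
for name, (r_th, t_coolant) in configs.items():
    tj = solve_junction_temperature(V_NOM, r_th, t_coolant)
    print(f"{name}: Tj = {tj:.1f} degC, "
          f"P = {processor_power(V_NOM, tj):.0f} W, "
          f"failure rate x{relative_failure_rate(V_NOM, tj):.2f}, "
          f"efficiency x{system_efficiency(V_NOM, r_th, t_coolant) / baseline:.2f}")
```

At nominal voltage the better-cooled configuration settles at a lower junction temperature and a lower processor power, and its relative failure rate falls well below one. That reliability margin is what the article trades for higher voltage and frequency.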
TOTAL COST OF OWNERSHIP
Figure 6. System efficiency.
While maximization of computational efficiency in terms of energy has been demonstrated, the total cost of ownership should be considered. Until just a few years ago, capital expenditure on computing equipment dominated energy cost, and the two expenses were the concern of different organizations with little communication. Recently higher energy costs, much higher energy usage and environmental concerns have resulted in a more holistic view of datacenter cost and efficiency. In the present study, computing equipment cost is fixed to be equal to energy cost. At 11.4 cents per kilowatt-hour, a continuous watt of electricity costs almost exactly one dollar per year. This is close to an average datacenter's electricity cost, so a rough approximation can be made that the lifetime energy cost of a server in dollars is equal to its average power consumption in watts times its operational lifetime in years. Equal capital and energy costs could thus correspond, for example, to a server with a five year lifetime and a capital cost in dollars equal to five times its power consumption in watts. Figure 7 plots normalized total cost of ownership (TCO) versus economic computational efficiency, a term invented for this study and defined as system performance divided by TCO. All optimum values feature TCO below the nominal value, and improvement in economic computational efficiency can be as high as 7%. As this is a simple model, some factors are neglected which would alter the results if included. The cost of floorspace may be an additional factor, and tax incentives and deductions will alter TCO calculations. Additional equipment external to the computing equipment must be added or removed.
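The dollar-per-watt-year rule of thumb and the TCO definition above are easy to reproduce. The sketch below is illustrative: the function names are hypothetical, a year is taken as 8760 hours, and the example server (720W average, i.e. a 240W processor plus 480W of memory, with a five year lifetime and a capital cost of five dollars per watt) is an assumed case built from the figures in the text.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def energy_cost_per_watt_year(price_per_kwh):
    """Cost in dollars of drawing one watt continuously for one year."""
    return price_per_kwh * HOURS_PER_YEAR / 1000.0


def total_cost_of_ownership(capital_cost, avg_power_w, lifetime_years, price_per_kwh=0.114):
    """Capital cost plus lifetime energy cost, in dollars."""
    energy_cost = avg_power_w * lifetime_years * energy_cost_per_watt_year(price_per_kwh)
    return capital_cost + energy_cost


def economic_computational_efficiency(performance, tco):
    """System performance divided by TCO, as defined in the article."""
    return performance / tco


# Assumed example: 720 W average power, 5 year lifetime, capital cost of 5 dollars per watt.
power_w, years = 720.0, 5.0
capital = 5.0 * power_w
tco = total_cost_of_ownership(capital, power_w, years)
print(f"one watt-year costs ${energy_cost_per_watt_year(0.114):.5f}")
print(f"capital ${capital:.0f}, energy ${tco - capital:.0f}, TCO ${tco:.0f}")
print(f"performance per TCO dollar: {economic_computational_efficiency(1.0, tco):.3e}")
```

Because the energy term scales with both power and lifetime, any cooling improvement that lowers average power reduces TCO directly, while its own added capital cost has to be entered in `capital_cost` for a fair comparison.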
CONCLUSIONS
A case study showing energy reduction and/or performance maximization through improved cooling of a typical modern high-end server has been performed. In both energy efficiency and economic computational efficiency or TCO minimization, an optimum frequency (and voltage) exists for each packaging and cooling configuration. As cooling is improved, the advantage becomes more favorable. Energy computational efficiency can be increased to 15% above nominal, and economic computational efficiency can be increased to 7% above nominal. More detailed models will shift the values but should follow the same general trend. Lower temperature operation of processors is an available design option with immediate rewards.
ACKNOWLEDGMENTS
This article is based on the author's keynote presentation of the same title at The Heat Is On: Performance and Cost Improvements through Thermal Management Design, 21st March 2011 [9]. The author thanks his Oracle colleagues Vadim Gektin, who provided Figure 1, and Bruce Guenin, who invited him to write this article.
Figure 7. System performance versus TCO.

REFERENCES
[1] International Technology Roadmap for Semiconductors, 2010, http://www.itrs.net/Links/2010ITRS/Home2010.htm
[2] N. S. Kim, T. Austin, D. Blaauw, T. Mudge, K. Flautner, J. S. Hu, M. J. Irwin, M. Kandemir and V. Narayanan, 2003, Leakage Current: Moore's Law Meets Static Power, IEEE Computer, Vol. 36, No. 12, pp. 68-75
[3] S. Mukhopadhyay, A. Raychowdhury and K. Roy, 2003, Accurate Estimation of Total Leakage Current in Scaled CMOS Logic Circuits Based on Compact Current Modeling, Proceedings of the Design Automation Conference, pp. 169-174
[4] J. Wei, 2007, Challenges in Package Cooling of High Performance Servers, Proceedings of the 2007 International Electronic Packaging Technical Conference and Exhibition, IPACK2007-33637, ASME
[5] M. J. Ellsworth and R. E. Simons, 2005, High Powered Chip Cooling - Air and Beyond, Electronics Cooling, Vol. 11, No. 3, http://www.electronics-cooling.com/2005/08/high-powered-chip-cooling-air-and-beyond/
[6] J. Deeney, 2002, Thermal Modeling and Measurement of Large High Power Silicon Devices with Asymmetric Power Distribution, Proceedings of the 35th International Symposium on Microelectronics, IMAPS
[7] D. Copeland, 2005, 64-bit Server Cooling Requirements, Proceedings of the Twenty-First Annual IEEE Semiconductor Thermal Measurement and Management Symposium, pp. 94-98
[8] S. Kang and D. Miller, 2011, Thermal Management Architectures for Rack Based Electronics Systems, Proceedings of The Heat Is On: Performance and Cost Improvements through Thermal Management Design, MicroElectronics Packaging and Test Engineering Council (MEPTEC)
[9] D. Copeland, 2011, Energy Reduction and Performance Maximization through Improved Cooling, Proceedings of The Heat Is On: Performance and Cost Improvements through Thermal Management Design, MicroElectronics Packaging and Test Engineering Council (MEPTEC)
Index of Advertisers
Alpha Novatech ..... Inside Back Cover
The Bergquist Company ..... Inside Front Cover
Cofan USA ..... 5
Concept Group ..... 31
Deep-Cool ..... 38
DegreeC ..... 37
ebm papst ..... Back Cover
EIC Solutions, Inc. ..... 11
FLIR Commercial Systems, Inc. ..... 15
Fujipoly ..... 35
Future Facilities ..... 30
Intermark ..... 33
ITEM Publications ..... 9, 13, 25, 29
Malico Inc. ..... 7
Mentor Graphics Corporation ..... 23
Mintres ..... 17
PLANSEE ..... 27
SEMI-THERM ..... 18-21
Summit Thermal System Co., Ltd. ..... 39
Sunon, Inc. ..... 3
Wolverine Tube Inc., MicroCool Division ..... 36
December 2011