EDA Tech Forum Journal: March 2009


The Technical Journal for the Electronic Design Automation Community

www.edatechforum.com

Volume 6, Issue 1 | March 2009

Embedded | ESL/SystemC | Digital/Analog Implementation | Tested Component to System | Verified RTL to Gates | Design to Silicon

INSIDE: Managing your IP for profit • Embedded World bucks recession • Toward the power-aware OS • Turbocharging ESL design • Virtual prototypes for security





contents

< COMMENTARY >

5 Start Here: Mainstream not niche
Our last chance on green design.

6 Analysis: Choose, but choose wisely
Some consumer markets remain busy in spite of the recession's severity.

10 Design Management: A profitable discipline
Efficiently tracking the use of intellectual property and open source software is becoming increasingly vital.

14 Conference Preview: Ahead of the game
The Embedded World conference has actually grown in 2009.

< TECH FORUM >

18 Embedded: The power-aware OS
Mentor Graphics

22 Embedded: FPGA-based speech encrypting and decrypting embedded system
Texas A&M University

26 ESL/SystemC: Rapid design flows for advanced technology pathfinding
NXP-TSMC Research Center and NXP Semiconductors

30 Verified RTL to gates: Multiple cross clock domain verification
eInfochips

34 Digital/analog implementation: TrustMe-ViP: trusted personal devices virtual prototyping
University of Nice-Sophia Antipolis

42 Design to silicon: Chemical mechanical polish: the enabling technology
Intel

46 Tested component to system: Automating sawtooth tuning
Broadcom

EDA Tech Forum Journal is a quarterly publication for the Electronic Design Automation community including design engineers, engineering managers, industry executives and academia. The journal provides an ongoing medium in which to discuss, debate and communicate the electronic design automation industry’s most pressing issues, challenges, methodologies, problem-solving techniques and trends. EDA Tech Forum Journal is distributed to a dedicated circulation of 50,000 subscribers.

EDA Tech Forum is a trademark of Mentor Graphics Corporation, and is owned and published by Mentor Graphics. Rights in contributed works remain the copyright of the respective authors. Rights in the compilation are the copyright of Mentor Graphics Corporation. Publication of information about third party products and services does not constitute Mentor Graphics’ approval, opinion, warranty, or endorsement thereof. Authors’ opinions are their own and may not reflect the opinion of Mentor Graphics Corporation.



< EDITORIAL TEAM >
Editor-in-Chief: Paul Dempsey +1 703 536 1609 pauld@rtcgroup.com
Managing Editor: Marina Tringali +1 949 226 2020 marinat@rtcgroup.com
Copy Editor: Rochelle Cohn

< CREATIVE TEAM >
Creative Director: Jason Van Dorn jasonv@rtcgroup.com
Art Director: Kirsten Wyatt kirstenw@rtcgroup.com
Graphic Designer: Christopher Saucier chriss@rtcgroup.com

< EXECUTIVE MANAGEMENT TEAM >
President: John Reardon johnr@rtcgroup.com
Vice President: Cindy Hickson cindyh@rtcgroup.com
Vice President of Finance: Cindy Muir cindym@rtcgroup.com
Director of Corporate Marketing: Aaron Foellmi aaronf@rtcgroup.com

< SALES TEAM >
Advertising Manager: Stacy Gandre +1 949 226 2024 stacyg@rtcgroup.com
Advertising Manager: Lauren Trudeau +1 949 226 2014 laurent@rtcgroup.com
Advertising Manager: Shandi Ricciotti +1 949 573 7660 shandir@rtcgroup.com

Make sure you get every issue. SUBSCRIBE AT: www.edatechforum.com



start here

Mainstream not niche

Green design will be one of the enduring themes of 2009, or so we are told. As hard-faced as it might sound, though, it is hard to see this theme getting the attention it needs right now. My problem with the whole concept is that it currently seems to attract a premium at just about every point in the design and supply chain, above and beyond what is truly merited by the extra functionality or physical qualities that a tool or component or board or finished product may possess. Slap on “green” and you can slap on 10%.

Companies are even encouraged to think along these lines by reputable research, such as that recently published by the Consumer Electronics Association. It said that just over a fifth of the John and Jane Qs out there would be prepared to pay 20% more on the retail price for an average flat-panel display, if it had been designed with the environment in mind.

Well, sorry, but even people from Greenpeace look at that skeptically. As Casey Harrell, one of the pressure group’s senior campaigners on electronics, says, “There is always a big difference here between what people say and what they then go and do.” In other words, if you frog-marched all those people to Best Buy straight after getting their answers, would they still lay out the extra cash?

And there is another issue that goes begging here. Even if 22% of people will open their pocketbooks that wide for such goods and services, it’s not enough. Global warming, carbon footprints and related issues cannot be solved with niche products. The approach we take has to be mass-market.

Now this might sound as though I’m telling electronics companies to sacrifice margin. However, I don’t think it’s still called a “sacrifice” when you have no choice, and that is the way things are going. President Obama has picked Steven Chu as his energy secretary, a Nobel Laureate who has made climate change the focus of his work, and who has also published and promoted some of the most disturbing research on the topic. The signal is clear—if industry won’t do something, here is a guy who will and has full executive backing.

So, unless electronics wants to face regulations that impose green thinking on its mainstream design processes, it needs to start developing those processes itself, and account for the environment in everyday products in a commercially viable way. Such a strategy will actually protect margins rather than sacrifice them to red tape and regulatory diktat.

Paul Dempsey
Editor-in-Chief




< COMMENTARY > ANALYSIS

Choose, but choose wisely

Our annual review of consumer electronics trends finds there is still scope for well-targeted design activity.

The consumer sector has become a reliable bellwether for the state of the electronics industry. However you slice it, the latest data does not paint a pretty picture.

With the publication of its latest U.S. market forecast, the Consumer Electronics Association (CEA) provided data that put the typical knock-on effect of a recession into concrete terms. “Economists estimate that for every dollar decline in wealth, consumption declines approximately 4-8¢,” said Steve Koenig, the association’s director for industry analysis. Given that macro-economic formula, the further bad news is that the ‘wealth’ number (broadly analogous to disposable income) is now 40% off the base (Figure 1). And remember, consumer spending represents about 70% of the U.S. economy.

FIGURE 1 Large negative wealth effects (Source: Standard & Poor’s). S&P 500 year-on-year % change, 1980-2007, with recessions marked. Economists estimate that for every dollar decline in wealth, consumption declines approximately 4 to 8 cents.

“[2009] will remain a very bad year,” said CEA economist Shawn DuBravac. “This will be the worst two-year period we’ve seen since the early eighties. We project that this year consumer spending will be down 0.3%, which will be a little bit worse than we’ve already experienced in the latter part of 2008.”

Overall, according to CEA chairman and CEO Gary Shapiro, his organization is forecasting a -0.6% year-on-year drop in U.S. consumer electronics factory sales for 2009. This will follow on from a 5.4% rise in 2008, against an originally forecast 6%. If confirmed, this projection will mean that 2009 is only the fourth year since the seventies when CE revenues have contracted, alongside 2001 (a post 9/11 3% drop), 1991 (-0.8%), 1974 and 1975 (both of which suffered double-digit declines).

Even before we entered 2009, retailer Tweeter had shuttered all its stores, Circuit City had filed for Chapter 11 bankruptcy protection and Polaroid Consumer Electronics had done the same. Circuit City has since also been forced to throw in the towel.

At least the CEA is looking for a relatively short and sharp downturn. “2010 should be a year of recovery, and we hope and expect to return to positive industry growth,” Shapiro said. However, in the meantime, should we all head for a dark, dank corner of the basement and pull on a paper bag?

Strangely enough, by mining the data you find evidence to suggest that while things are going to be tough, notions that electronics design activity is about to judder to a halt are misplaced. Recession or not, 4,000 flat screen TVs are sold every day, 20,000 new products were launched at the Consumer Electronics Show in January, and American consumers are still set to buy more than one billion CE products over the course of 2009.

It is probably fair to say that we entered the age of ‘electronics everywhere’ several years ago, largely courtesy of the mobile phone. However, one positive aspect of that may only become clear now that we are in recession: many CE products are no longer viewed as luxuries but more as essentials, and are therefore less likely to fall out of people’s spending plans when belts must be tightened. Consequently, there are numerous instances of products where the unit shipments are set to rise in 2009 even though price pressures and other market dynamics may cause revenues from them to fall. It is all a question of balance. One particularly fast-moving product shows how carefully that question must be considered.

The rise of the laptop ‘lite’

Netbooks, ultra-lights, mini-notebooks—call them what you will (notwithstanding that there are some trademarks in there). The fact is that if you are working on board design for any of these children spawned by the 7-inch-screen Asustek (aka Asus) Eee PC, you are probably looking forward to a busy 2009. The same is true for those working on processors and other chipsets for this market.

It is hard to believe that the first of these products only shipped as recently as September 2007, less than 18 months ago. Already, the CEA says that they account for 7% of all laptop sales, and forecasts that they will move up to an 11% share by the end of the year (Figure 2). The Information Network, a Pennsylvania-based research firm, takes a broadly similar view on the units. It says that 11.4 million of these devices were sold last year and that 21.5 million will sell this year, implying a 12% share.

In many respects, the start of shipments for Intel’s Atom processor last April really ignited this segment, bringing in vendors who competed to follow Asus in shrinking form factors and applying better and better industrial design (rapidly becoming an even more powerful product differentiator here than processor power, OS, or memory). Dell, Lenovo and Acer have all launched their own Atom-based equivalents. HP has also entered the space, but using a Via Technologies processor.

At CES, a second wave of products, now based on rival ARM-powered processors, went on view. These ranged from Cortex A8-based offerings in CPUs from companies such as Freescale Semiconductor to Qualcomm’s Snapdragon engine that is based around an ARM architectural license.

This new set of machines has brought a more problematic side of the ‘netbook’ market into focus. The original Eee PC was Celeron-powered, had a very short battery life and was not intended for much beyond Web-surfing and email. As such, calling these devices ‘netbooks’ actually suited Intel quite well. It implied that this was a wholly separate product segment from that of notebooks, preserving the higher-end market for its more expensive, higher performance (and higher margin) Core processors. In short, the Eee PC (and its equivalents) were ‘cheap’ at around the $300 mark because they were limited.

Then, however, netbooks started to undergo ‘function creep’. The processing power of the Atom allowed OEMs to add Windows XP and programs from Microsoft Office or—to keep costs down—have their products run the OpenOffice suite on Linux. Now, the entry of ARM-based chips has upped the ante still further. Here are the technical specifications for Qualcomm’s Snapdragon platform:

• 1GHz CPU
• 600MHz DSP
• Support for Linux and Windows Mobile
• WWAN, Wi-Fi and Bluetooth connectivity
• Seventh-generation gpsOne engine for standalone-GPS and assisted-GPS modes, as well as gpsOneXTRA™ Assistance
• High-definition video decode (720p)

• 3D graphics with up to 22m triangles/sec and 133m 3D pixels/sec
• High-resolution XGA display support
• 12-megapixel camera
• Support for multiple video codecs
• Audio codecs (AAC+, eAAC+, AMR, FR, EFR, HR, WB-AMR, G.729a, G.711, AAC stereo encode)
• Support for broadcast TV (MediaFLO, DVB-H and ISDB-T)
• Fully tested, highly integrated solution including baseband, software, RF, PMIC, Bluetooth, broadcast & Wi-Fi

FIGURE 2 Netbook market share of laptop sales (Source: CEA): roughly 0% in 2007 and 7% in 2008, forecast to reach 11% in 2009, 13% in 2010 and 2011, and 14% in 2012.

Notwithstanding XP’s absence from that list, this kind of machine looks very much like a traditional notebook. Indeed, the capability of these small form factor machines to take on more and more functionality is not only being driven by increasing competition and performance in the processor space, but also by trends in storage. According to SanDisk, integrating 32GB of solid-state (typically flash) storage (SSD) into one of these machines now costs about the same as an equivalent capacity 2.5” hard disk drive (HDD), and a 64GB SSD is available at only a “small premium” to the equivalent 1.8” HDD.

Prototypes are in circulation that can quite comfortably carry three HD movies and—given that they are based around engines optimized for the low-power obsessed mobile communications space—have the battery life to show them. The portable computer that can run for an entire transatlantic flight is apparently with us—some designs are now claiming eight hours of battery life and above.

The even more important issue here, though, is cost. Netbook OEMs say that an Atom processor adds $30 to a bill of materials, and to date this has set the floor for this new market at about $250 retail. However, the Freescale i.MX515 used in a netbook launched at CES by Pegatron is said to add only $20, and also comes with software and other peripherals. Thus armed, Pegatron is targeting a $199 price point, the figure the far less well-featured Eee PC originally aimed for but could not hit.

FIGURE 3 2009’s fastest growing CE products, based on estimated shipment revenues (Source: CEA):
1) OLED Displays, 149%
2) E-readers, 110%
3) HD Flash Camcorders, 106%
4) Netbooks/subnotebooks, 80%
5) Climate Systems-Communicating Thermostats, 71%
6) Next Generation DVD Players, 62%
7) LCD TVs with 120Hz+ Refresh Rate, 57%
8) Portable Navigation (traffic compatible), 52%
9) MP3 Players with Wireless Connectivity, 41%
10) Home-theater-in-a-box w/ Blu-ray, 30%

Competition for this netbook space is going to intensify still further. Other ARM licensees such as Samsung, Texas Instruments and Marvell will release CPUs for it. AMD, Intel’s chief rival in higher-end PCs, now has the Athlon Neo. Via already has the ultra-low-voltage C7-M processor used by HP. And graphics specialist Nvidia has ambitions in this segment. It is an exciting space, then, and promises to be a very busy one for design in the months ahead. If there is one other thing that is now obvious about these netbooks, it is that their functionality is still in flux.

However, what about the bottom line? There are those who see Intel’s Atom cannibalizing its existing higher-end laptop business. According to the Information Network, consumers do not see netbooks as a separate category, but as a cheaper alternative to a ‘traditional’ laptop. As a result, the research firm has said that Atom may have actually cost Intel $1.14B in revenue in 2008, and could cost it another $2.16B, assuming 50% of netbook purchasers would otherwise have bought notebooks and a $200 price difference between Atom and Core devices (on its 21.5-million-unit forecast for 2009, that is roughly 21.5m × 50% × $200 ≈ $2.2B).

There is one further factor here, the insertion of mobile communications economics into a PC space. So far, Europe has been a far more voracious adopter of netbooks than North America. An important volume driver is that the machines are being bundled with data contracts by the continent’s 3G network operators. The first similar U.S. offer came in January from AT&T, which is discounting Dell’s Inspiron Mini 9 from $449 to $99 in return for a two-year 3G data contract.

When network operators become major customers, suppliers soon find that they are not willing to accept PC-level CPU margins. That is an assumption that ARM and its licensees are making in the business model they are using for netbooks, seeing the space as an upwards evolution from cell phones, through smart phones, through mobile Internet devices and ultimately into something akin to the traditional computing space. Is Intel then trapped with Atom by a process under which it is evolving backwards, to some extent? And bear in mind that recessionary economics and their impact on consumer buying patterns need to be overlaid on all these existing market dynamics.

Just good enough?

In 1997’s The Innovator’s Dilemma, Clayton Christensen famously described and analyzed a number of the scenarios that have challenged and wounded technology companies, even when they have been apparently on top of their games. One tale in particular is being cited by a number of executives right now and concerns the former Digital Equipment Corporation (DEC). DEC was, in its time, the prince of the minicomputer business—but then along came the PC, and for a number of reasons, including its structure and insistence on only high margin business, DEC missed the boat. However, another important factor was that while DEC continued to make the highest performance computers, that equipment’s specifications far exceeded the market’s need. The PC’s strength was that it combined being ‘just good enough’ (JGE) to do what its users needed with a much lower price point.

Sound familiar? To suggest that Intel may be at a similar crossroads now is something that the company’s competitors might well promote for obvious reasons—though you doubt that even they believe it entirely. Nevertheless, there may well be something in the idea that we are entering a period where JGE products are those that are especially well placed to win customers’ business.

Figure 3 shows the products that the CEA expects to post the greatest volume growth in 2009. In reviewing that list, one caveat always has to be made: any percentage ranking will always favor very new products that are growing from a low initial sales volume in the year before. Still, you can reasonably argue that five of the product groups on that list fit JGE criteria.

HD flash camcorders
Flip Video’s MinoHD has an incredibly straightforward iPod-like user interface but few of the effects that you will find on a traditional camcorder. It will capture high-definition H.264 video—about an hour’s worth on a 4GB SSD—and fit in your pocket. Accessing the content is via a simple USB connection. The retail price is $229.99 against a previous entry level for HD camcorders of about $500, but the Mino will do what most people want it to do. More to the point, most established OEMs in this space are bringing their own ‘me too’ products to market.

Netbooks/subnotebooks
See above.

Next-generation DVD players/home-theater-in-a-box (HTIB) with Blu-ray
There was surprise in some quarters that although sales in this segment rose 118% in 2008, the figure should have been much higher given the comparatively low base and Toshiba’s decision to terminate the HD-DVD format almost immediately after CES 2008. Blu-ray Disc should have done much better with the market to itself. However, for most of last year, Sony concentrated on marketing the format through its PlayStation3 (with, apparently, limited results among non-gamers) while other manufacturers held back equipment that fully met the 2.0 specifications until later in the year. More importantly though, low-cost Blu-ray players—such as a $199 model from Best Buy in-house brand Insignia—only reached retail shelves in the fourth quarter. As this price point continues to fall over the course of 2009, more take-up will occur as the format moves from being a premium, performance-based technology, to one whose JGE position is established by the cheaper kit. The bundling of Blu-ray technology in home theaters will also help.

LCD TVs with 120Hz+ refresh rate
The LCD has long been perceived as a junior technology to a plasma display in performance terms, but once it does get to this level of screen refresh, the important point is that it will have improved sufficiently to convert more consumers.

The overarching point here is that none of these products would necessarily have the notion of ‘high-end’ associated with them. They are mass-market in the purest sense, but with perhaps more of an emphasis on the price point underpinning their growth than might be the case in a more robust economy.

FIGURE 4 CE sales outlook, unit sales in millions (Source: CEA). Panels chart 2008-2012 unit sales for Digital Displays (5.8% growth in 2009), Wireless Handsets (2.6%), Personal Computers (5.1%) and Game Consoles (2.8%).

The CEA expects its four core product groups to each post growth in unit sales during 2009 despite the recession. The two tipped for the greater increases are Digital Displays (5.8%) and PCs (5.1%). The tumbling cost of displays is undoubtedly helping the sector maintain sales traffic, but a second factor that will continue to push the 2009 number is the impending switch-off of analog TV broadcasts in the U.S. PC numbers are largely being driven by the netbook subsection of the laptop market, as discussed in the main article. If the CEA numbers here do provide one source of concern, it is in the games consoles segment, with 2009 now pinned as the peak year for the current generation. The latest machines from Sony, Nintendo and Microsoft have been available for two years at least, so some maturing was to have been expected. However, given the PlayStation3’s Blu-ray capabilities and following its repositioning in 2008 as an entertainment center rather than simply a gaming console, Sony may have expected its product to show more longevity than these numbers suggest.


< COMMENTARY > DESIGN MANAGEMENT

A profitable discipline

Managing your software and IP usage is all about the bottom line, say Mahshad Koohgoli and Sorin Cohn-Sfetcu.

The software food chain

Software has become a ubiquitous part of many products and services, some of which are distributed and used on a massive scale. Think of cloud computing, online banking and the engines for today’s cellphones and cameras. Competition requires that such new products are developed more quickly than ever before and yet brought to market at lower and lower cost. Project managers therefore demand the design efficiency inherent in the reuse of existing, proven software code, be it internally generated or acquired from third parties.

Most companies have lots of their own code files. Many, many more can now easily be sourced online and downloaded from open source libraries or acquired from external developers at a fraction of the time and cost it would take to do the work from scratch in-house. Open source code is also attractive because it tends to grow exponentially in many respects—e.g., functionality, richness, stability, quality and security—while being subjected to rigorous peer review. Yet it is also thought to involve no monetary cost. However, such software is still governed by a wide range of complex license regulations and usage terms that require expert interpretation. Keeping track of the use of such software is also difficult.

Companies rarely create the entire software load in their products or services. Rather, most players in the ‘food chain’ (Figure 1) assemble software from a mixture of suppliers, contractors and open sources, then couple it with their own value-add code and pass it along to another stage or competence—at which point, the same process of creative software combination from various sources will take place all over again.

FIGURE 1 The software food chain (Source: Protecode). Open source, development outsourcers, software vendors and chip/sub-system suppliers feed the product company, which in turn feeds the service provider and, ultimately, the end-user.

Today’s typical end product or service is the sum of code provided by many organizations and individuals at multiple stages of a supply chain. Software modules are treated as commodities, as has happened for longer with many hardware components. However, there is one major difference: software code comes laden with licensing and intellectual property (IP) terms that can seriously affect a product’s commercialization and/or certain corporate transactions.

Without a complete and up-to-date view of all the software components and contributors to its products or services, a company runs the risk of encountering unexpected costs, missed deadlines and significant business risks. In particularly egregious scenarios, software components may not meet specification and need to be repaired, or proper IP and copyright obligations may be overlooked, yet the problems they create only come to light after a product has been released. Imagine that a security fault is discovered in the protocol stack supplied by one of the players. The end-user now has to ascertain which products are affected (potentially thousands), and meet the cost of solving the problem and correcting it in each instance.

Software governance and development discipline

The risks outlined above can be greatly reduced, even eliminated, if you have proper, accurate records for all the code in a product and know the pedigree of all its components. Alas, few companies have traditionally managed to capture such a ‘software bill of materials’ because those tools that could determine code’s content and manage IP correctly were mainly retrospective. They also either involved cumbersome manual audits or were expensive tools for automatic code analysis. These problems can now be avoided by implementing real-time records-gathering and preventive IP management processes based on appropriate design flow policies. Moreover, recently released preventive tools allow for the real-time analysis, recording and compliance-checking of legacy code as well as any new code brought into a project.


FIGURE 2 A basic source code management flow (Source: Protecode). IP policy and compliance administration oversees enterprise legacy-code IP assessment, preventive clean-IP development and a market-ready load-build IP audit, all built on source code management.

While good software development practices have evolved to include systems for checking syntax, managing software versions and tracking software bugs, disciplines that are standard practice in structured hardware development flows have yet to be adopted. This latter group includes:

• Approved vendor lists that should contain approved components and licenses, including the commercial terms, vendor history, version details, historic pricing, and so on. A developer can then select components freely from the list without concern.
• Automatic notification in case attempts are made to use code with unapproved licenses, or code modules that are governed by incompatible licenses.
• A bill of materials that fully records which components feature in the final product and that includes necessary details to enable production, determine costs, set export terms, track vendor upgrades and manage other post-design activities.

Such advanced practices must be applied to software development processes in order to ensure proper software governance. They will give organizations a more effective overview of their source code assets and streamline product commercialization. To establish the necessary culture of software development discipline and good IP management, product and development managers must therefore:

• understand the economics of software governance;
• work with legal counsel to create and administer appropriate policies for software IP management;
• acquire full knowledge of licensing and other IP characteristics of the existing code in the institution and clean it where necessary; and
• automate the record keeping and IP management process as an integral part of the software development process going forward.

The latest automated tools can help managers take all these necessary steps and create an effective source code management flow (Figure 2).
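To make the approved-list, notification and bill-of-materials ideas above concrete, here is a minimal sketch in Python. The record fields, the policy set and the module names are all invented for illustration; this is not Protecode’s (or any vendor’s) actual data model.

```python
# A toy 'software bill of materials' checked against an approved-license
# policy, with a notification raised for each violation. All names,
# fields and licenses here are hypothetical.
from dataclasses import dataclass

@dataclass
class BomEntry:
    module: str               # e.g., "zlib 1.2.3"
    origin: str               # supplier, URL or internal repository
    license: str              # license governing this component
    obligations: tuple = ()   # e.g., ("attribution", "source-offer")

APPROVED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0", "zlib"}

def check_bom(bom):
    """Return notifications for components that break the IP policy."""
    notices = []
    for entry in bom:
        if entry.license not in APPROVED_LICENSES:
            notices.append(
                f"{entry.module} ({entry.origin}): license "
                f"'{entry.license}' is not on the approved list")
    return notices

bom = [
    BomEntry("zlib 1.2.3", "zlib.net", "zlib"),
    BomEntry("somelib 0.9", "example.org", "GPL-2.0", ("copyleft",)),
]
for notice in check_bom(bom):
    print("NOTIFICATION:", notice)   # flags somelib's unapproved license
```

In a real flow such a check would run at load-build time, so a violation blocks the build rather than surfacing after the product ships.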

The economics of software IP management

The economic justifications for software IP management include the cost incurred to fix bugs and delays in time-to-market caused by the amount of time that elapses between the detection of IP issues and their correction. The earlier any external IP obligations are detected and managed, the less expensive they are to address and fix.

Until recently, the industry had only limited options. The main preventive solutions are still often entirely manual, and basically involve training developers to avoid certain types of software code. This is time-consuming and expensive because the number and range of code suppliers and sources keep getting larger and more complex. And still, there is no certainty that the end software will be ‘clean’. On the other hand, after-the-fact corrective measures waste resources on removing the offending code, identifying a replacement third-party code (or worse, rewriting code internally), and making changes to the application program interfaces (APIs) so that the new code fits the software. The time required for correction lengthens time-to-market and a project’s commercial window can even close before its release.

It is better to handle record gathering and software IP management during the original code development and quality assurance (QA) processes. Here, automatic tools now place minimal training demands on R&D staff with regards to the minutiae of IP issues. They are easy to adopt and structured so that they do not disturb normal creative development flows. Examples of these tools include the offerings from our company. As shown in Figure 3, they cover the time-efficiency spectrum, from on-command to on-schedule and continuous real-time.

FIGURE 3 Protecode tool options (Source: Protecode). The Enterprise IP Analyzer runs on-command, the Build IP Analyzer on-schedule and the Developer IP Assistant continuously in real time, spanning the spectrum from manual and automated-retrospective to automated-preventive operation.

These tools are:

The Enterprise IP Analyzer. It helps establish corporate records-keeping and corresponding IP policies, analyzes legacy code in a single project or across an entire organization, creates an associated pedigree database, and reports violations of declared policies.


FIGURE 4 A fully integrated code and IP management flow (Source: Protecode). Real-time software IP development tools (the Developer IP Assistant) sit inside the edit/load-build/test loop of the software development process, feeding a software IP database of pedigree records and bills of materials; IP policy administration, the Enterprise IP Analyzer and IP reports draw on the same database through to the final software load.

The Build IP Analyzer. It identifies all software modules that actually feature in a product and checks them against established IP policies.

The Developer IP Assistant. It automatically builds a record of the code’s pedigree as a project develops and provides real-time notifications to ensure disciplined records-keeping and that the software contains ‘clean’ IP.

Software IP management policies and compliance administration

Software managers who want to maintain disciplined development environments and deliver products with clean IP must collaborate toward these goals with legal counsel. Together, they can develop appropriate coding policies that ensure IP compliance and the presence of downstream safeguards, while enabling the constructive use and adoption of open source or third-party software. A sound, clear and enforceable IP policy will contain an effective list of approved software vendors, together with acceptable external-content license attributes and measures to be taken in various situations:

• What is allowed and what is restricted.
• What to do if a piece of code with a hitherto-unknown license is desired.
• What to do in case of unknown code.
• What are the potential distribution restrictions, including those on exports.

Such policies should be consistent with corporate goals, easy to enforce and allow compliance monitoring throughout a project’s lifetime from development to commercialization and on to post-sales support. An organization should also be able to adopt and enforce appropriate IP policies for each class of projects it handles.

Retrospective code mapping and IP issue corrections

To ascertain the IP cleanliness of software that has already been developed or acquired, users need to map the existing (legacy) code and ensure that it is in line with the appropriate IP policy. This can be done by engaging expert teams to perform an IP audit or by using automatic tools—e.g., the Enterprise IP Analyzer—to perform the code mapping and check all the components. Retrospective tools—available from companies like Protecode, BlackDuck and Palamida—establish the IP pedigree of existing code by analyzing it and decomposing it into modules that can be matched against extensive databases of known software.
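The simplest possible form of that matching is a whole-file fingerprint looked up in a database of known modules; the sketch below (in Python, with an invented database) illustrates the idea. Real tools go much further, matching partial and modified code rather than exact copies only.

```python
# Minimal sketch of 'signature' matching: fingerprint a source file and
# look it up in a pedigree database. KNOWN_MODULES is hypothetical; a real
# tool would build it by pre-analyzing large open source corpora.
import hashlib

KNOWN_MODULES = {
    # "sha256-hex-digest...": ("zlib 1.2.3 inflate.c", "zlib"),
}

def fingerprint(path):
    """Return a stable fingerprint for one source file."""
    with open(path, "rb") as f:
        # Normalize line endings so a CRLF/LF difference alone
        # does not change the hash.
        data = f.read().replace(b"\r\n", b"\n")
    return hashlib.sha256(data).hexdigest()

def pedigree(path):
    """Look a file up in the known-code database."""
    return KNOWN_MODULES.get(fingerprint(path), ("unknown", "unknown"))
```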

Real-time preventive IP software management

The final and most cost- and time-effective measure is for management to embed the records-keeping and software IP management process as an intrinsic element of the company’s development and QA processes. Tools that operate unobtrusively at each development workstation are preferable because they do not perturb the normal process of software creation and assembly. Our Developer IP Assistant product is an example of this kind of approach.

Such tools automatically detect and log in real time each piece of code a developer brings into a project. Various techniques are used to determine the code’s pedigree (e.g., its ‘signature’ when compared against a database of known code modules, identifying which URL it came from, identifiers within the code itself, requesting the developer to provide a valid record of how the code was created and brought in). The resulting pedigree is captured in the overall code map and henceforth always associated with that piece of code. The Developer IP Assistant also checks the licensing obligations of the identified code against the IP policy for that project and takes any necessary enforcement steps.

The integration of various analysis and policy enforcement tools (Figure 4) ensures proper record keeping and clean IP software development. Such automatic tools are already being used in academia and by industry. Their utilization will accelerate in the face of enhanced demands for IP indemnification, open source enforcement of copyright infringement and reduced product development cost.

Dr. Mahshad Koohgoli is CEO of Protecode, and Dr. Sorin Cohn-Sfetcu is an executive consultant to the company. More details about its products and services can be accessed online at www.protecode.com.



< COMMENTARY > CONFERENCE PREVIEW

Ahead of the game

The embedded systems community gathers in Nuremberg this March for an event that is bucking the downturn.

The 2009 Embedded World conference and exhibition in Nuremberg, Germany (March 3-5) has managed to maintain much of its scale despite the global downturn. Some speakers have apparently been pulled out by their companies—the most notable contingent, sadly, coming from the USA—but overall the exhibition has expanded by 5% in terms of its physical size and is also reporting a useful increase in international participants. Meanwhile, its collocated conference, organized by German magazine Design&Elektronik, is set to address an admirably wide range of subjects, touching on all of the embedded sector’s existing and emerging concerns. This combination of an increasingly broad-based technical event with a remarkably healthy show provides sufficient justification for Embedded World’s claim to be its sector’s pre-eminent annual global gathering.

The technical side is overseen by Dr.-Ing Matthias Sturm, conference director and professor of Microcomputing and Electronics at the Leipzig University of Applied Sciences. It is a complex task, embracing all three key aspects of embedded technology—hardware, software and tools—and then weaving them into a diverse set of subjects. However, in Sturm, the organizers have very much the right man for the job.

FIGURE 1 Prof. Sturm has been with Embedded World since 2003 (Source: Embedded World).

Sturm’s enthusiasm for electronics dates back to his childhood, his research includes work in embedded systems during the 1970s (before they were even known as such), and he has made significant technological contributions himself. “One project of which I’m particularly proud was a microcontroller-based, embedded Web server, and we published the research on that in September 2001,” he says. “That application may not be seen as so spectacular today, but by describing the server starting with a circuit schematic through to complete software sources, we provided evidence that you can make a Web server with a 16-bit MCU. Numerous designers took up this suggestion, adapted it and developed it further.”

FIGURE 2 The conference emphasizes practical class sessions (Source: Embedded World).

Sturm’s work today is looking toward the use of embedded systems in life science engineering, an area that, in the years to come, is likely to become a fruitful market segment. More immediately, though, there is this year’s conference, Sturm’s seventh.


Embedded World 2009 session topics:
• Network technologies
• Wireless technologies
• Multicore processing
• Development tools
• Microprocessor architectures and cores
• Cryptography and embedded security
• Graphical user interface
• Memory in embedded systems
• M2M-communication
• Automotive applications
• System on chip
• Software development methods I, II
• CompactPCI Plus
• Safe and secure virtualization
• Green electronics
• Automotive software development & test
• Embedded system architecture
• Managing development projects successfully
• Model-based design
• Embedded Linux
• Debug methods
• Software quality/test & verification
• Successfully implementing ARM

Embedded World 2009 class topics:
• Modeling embedded systems with UML
• Introduction to real-time operating systems
• Introduction to real-time Linux
• Linux in safety-related systems
• Solutions workshop—get the most out of Cortex M3 debugging
• Creating multitasking systems with real-world timing
• Open-source project management
• More busting bugs from birth to death of an embedded system running an RTOS
• IEC61508—developing safety-oriented software
• Cryptography and embedded security
• Design of safety-critical systems
• Using standards to analyze and implement multicore platforms
• Design and test of safety-critical systems
• Modeling real-time systems
• Software development in HLL I: JAVA
• Software development in HLL II: C++
• Software design for multicore systems
• Unified Design—innovative architectural design methodology for embedded systems
• USB host workshop with NXP
• ARM quick start workshop for LPC1700 (Cortex-M3)

The length of his involvement puts Sturm in a good position to judge how Embedded World has and will develop, and what it offers to engineers. “The combination of exhibition and conference, both organized and staged by highly motivated teams that do an extremely professional job, is only one reason for the success of the event,” he says. “What’s equally important is that despite Embedded World’s tremendous growth, the design community still sees it as its very own event. It’s no exaggeration to say that the family character of ‘Embedded’—as most of those attending call it—is a major factor in that success.”

Sturm also believes that Embedded World is a vital event because it completes a virtuous circle in terms of both its delegates and the stand personnel. “Both designers and decision-makers get their money’s worth. What’s noticeable is that the majority of exhibitors are very positive when judging the quality of attendees and the discussions they are able to have with them,” he says. “A large percentage of those visiting the exhibition booths to gather more information are development engineers. So, as the exhibitors have realized this, they are increasingly sending engineers from their development departments to man their booths because the technical discussions that result can often be very in-depth. Visitors aren’t interested in a show in the sense of entertainment. They’re looking for direct contact with other developers. They want technical information straight from the source, and close contact with someone who can offer them something.”

Meanwhile, the conference is developed to a very clear agenda that reflects the various different and practical priorities of the attendees. “The conference offers participants various possibilities for broadening and furthering their own knowledge, according to their expectations; it is split between an even balance of classes and sessions,” says Sturm.

Classes typically last a whole day and cover a specific topic. They are aimed primarily at participants who want to familiarize themselves thoroughly and efficiently with a particular field. “There are options for direct dialog with experts to help attendees clarify a whole load of questions. The idea is to offer an excellent opportunity of deepening and widening your knowledge fast. The classes are also didactic in structure to guarantee a high level of learning success,” says Sturm.


FIGURE 3 Despite the downturn, this year’s exhibition is larger than in 2008 (Source: Embedded World).

Meanwhile, the sessions serve primarily for the presentation of more discrete ideas, solutions and experience in embedded system development. “They’re a lively forum for imparting knowledge, and the resulting talks are often just the starting point for further discussions at the exhibition booths. However, for the conference we stress that we want purely technical presentations without any distracting marketing,” says Sturm. “At the same time, the sessions will allow attendees to quickly acquire an overview of certain technologies and the latest trends.”

In that latter respect, Embedded World will this year build on the strong position it has always given to low-power applications and smart energy management with a session dedicated to green electronics. “But there are many other areas of special interest,” says Sturm. “There we have, for example, sessions on safety-oriented systems, system design at a high abstraction level, and the effective development, programming and handling of multicore systems. There are issues that continue to revolve around RTOS, embedded Linux in different builds and open source. Embedded security and cryptography are increasing in importance, and are therefore also on the agenda.”

The conference also attracts name speakers for these areas. At this year’s event, consultant David Kalinsky will speak on RTOS, Prof. Nicholas McGuire of the University of Lanzhou will speak on embedded Linux, Prof. Christof Paar of the Ruhr-University of Bochum will speak on embedded security and Dr. Bruce Powel Douglass of iLogix will speak on model-based development.

Sturm says that Embedded World has—for now, at least—avoided the worst effects of the global recession. “Preparations for a conference always start straight after the preceding event. You need a certain run-up for the jury to review the submissions, to compose the program, and to publish the contents of the conference,” he says. “So the financial crisis hasn’t yet had any effect on arrangements for this year’s conference.

“Moreover, the financial crisis isn’t causing the innovative power of the community to dwindle. Engineers are still implementing innovative ideas, and making things possible that once seemed fantastic or quite impossible. It’ll just take a little longer. Aside from the financial crisis, as a professor at a university, I feel it’s important that engineers continue their education so that they’re well prepared to face future challenges, and that they let others know of the enthusiasm they put into their daily work, and pass it on to the next generation.”

More information on Embedded World’s conference and exhibition is available online at www.embedded-world.de.



< TECH FORUM > EMBEDDED

The power-aware OS

Stephen Olsen, Mentor Graphics

Stephen Olsen is a product marketing manager in the Embedded Systems Division at Mentor Graphics. He has more than 20 years of embedded software experience with an emphasis on system architecture, embedded software and IP. Stephen holds a BS in Physics from Humboldt State University in California.

The article describes the context and need for embedded operating systems that are more responsive to the power management demands placed on today’s electronic devices. It reviews the design objectives for the two main types of power management, reactive and proactive, and examines how both can be implemented.

For system developers designing portable electronic devices, the main dilemma has always concerned power management. How to maximize use and functionality while at the same time minimizing the battery’s cost and size to fit the shrinking form factors and longer lifespan demanded by users?

Furthermore, consumers are demanding more functionality with every successive product generation. If a mobile phone does not have a Web browser, camera, audio/video capabilities, speaker phone, Bluetooth and a GPS receiver, it is a candidate for the scrap pile. With such a range of functionality being packed into today’s electronic devices, power management has become intrinsic to success.

At the foundational embedded operating system (OS) layer—the interface between hardware and end-user applications—device manufacturers are demanding more complete software platforms to meet the requirements of increasingly sophisticated applications. Within the core of the system, this OS layer is uniquely positioned to exploit power-saving features offered by the underlying hardware, while at the same time controlling user applications in a power-efficient manner.

Addressing power consumption

Every application the OS runs drains the device’s power. To take a mobile phone as an example, an application that uses the display heavily is a candidate for excessive power consumption. A typical scenario would be where the LCD keeps the back light active so the user can read the menu screen. Once a selection has been made—e.g., launch a video playback application—the LCD is on full power while the video is streamed to the phone. Is it high, medium, or low bandwidth video? Does the device support hardware acceleration? These variables impact how the system schedules power use. The CPU transfers the video and then decompresses it in either hardware or software. The end result is a user interface that displays the video, and a system that fully performs the task requested by the user—but in the process, too much power might have been consumed.

That was a simplified example, but it illustrates a key point. The OS ultimately controls all of the applications and as a result, must decide what is shut down and when. Power management raises a number of questions the OS must address. Which applications can be controlled? What power is necessary in the lower states and how much power needs to be saved when going into these states? The answers to these questions vary from device to device. The one variable not in question, however, is the fact that any good power-aware OS must be able to deal with a host of possibilities.

It is important to mention that many devices allow users the option to turn off or turn down the functionality of individual applications to extend battery life. While this may result in power savings, it is definitely not the type of power management the industry or consumers are demanding from next-generation portable devices. To truly tackle the problem, we must go deep into the OS and make fundamental changes to how a system is designed.
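One way to picture the questions the OS must answer is as a table of controllable power domains and their states. The sketch below (in Python, with all domain names and milliamp figures invented for the example) shows why the hardware-versus-software decode choice in the video scenario above is a scheduling decision with a measurable cost.

```python
# Illustrative only: controllable power domains, the states each supports,
# and a rough current draw per state. Names and numbers are hypothetical.
POWER_DOMAINS = {
    "lcd_backlight": {"on": 120, "dim": 40, "off": 0},            # mA
    "video_decoder": {"hw_decode": 60, "sw_decode": 150, "off": 0},
    "baseband":      {"active": 90, "idle": 10},
}

def total_draw(config):
    """Sum the draw (mA) of a {domain: state} configuration."""
    return sum(POWER_DOMAINS[dom][state] for dom, state in config.items())

# Streaming video with hardware acceleration vs. software decode:
hw = {"lcd_backlight": "on", "video_decoder": "hw_decode", "baseband": "active"}
sw = {"lcd_backlight": "on", "video_decoder": "sw_decode", "baseband": "active"}
print(total_draw(hw), total_draw(sw))   # 270 vs. 360: why the OS must choose
```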

A greener OS?

For power management on a portable electronic device, we need to look not just at the hardware and software, but the entire system design. The hardware may well have features that reduce power consumption, but the software will often not take advantage of them. Therefore, in order for a battery-operated system to prolong its active life while in use, we must introduce energy-saving techniques at the OS level.

By definition, some embedded OSs are more power-efficient than others. All things being equal, an embedded OS that can be configured with the smallest amount of memory for a specific task is more power-efficient than one with a larger footprint. Where there is less code to execute and less memory required, you will conserve overall system energy. But beyond this basic consideration, a green OS had not been considered a design requirement for most engineers until recent developments in the industry and the increased concern about the global climate.



FIGURE 1 Reactive power management: the user-interface power domain changing states (Source: Mentor Graphics). The chart plots current draw in milliamps against time: the system initializes with LCD and backlight on at around 380mA, steps down as inactivity timers expire, rises again on a touch-panel event, and falls toward 300mA once the system halts.

Back in the day, the hardware was physically wired so that when a device powered up, a peripheral (e.g., an Ethernet controller) would also start up during system initialization. Regardless of its intended use, this ‘set and forget’ mode left the peripheral powered on as long as the system was on.

Further, it is easy to see why battery-operated device manufacturers have a vested interest in batteries that last longer, but what about manufacturers of electrically powered devices such as DVD players and set-top boxes? These too can benefit from improved power management. If they use their power more effectively there is less of a drain on the power grid.

With a green, power-aware OS in mind, let’s take a look at a couple of energy-saving techniques that can improve power management.


FIGURE 2 Proactive power management (Source: Mentor Graphics): the system undergoing dynamic voltage and frequency scaling as it schedules CPU frequency along with scheduling tasks. The chart plots CPU frequency (%) against time: the CPU wakes so an ISR can schedule tasks 1, 2 and 3; frequency drops once tasks 1 and 2 complete; task 3’s file transfer continues at reduced frequency; the CPU then idles, times out and sleeps.

At the system level, there are two ways to solve power drain and extend battery life: reactive and proactive power management.

Reactive power management

We briefly touched upon reactive power management in the earlier discussion of the LCD example. This technique represents the most basic approach to power management. Essentially, it responds to each application in use at a given time, based on a set of preprogrammed conditions. For example, when there are no applications with open sockets, having an Ethernet controller powered on is wasteful because there is nothing using it. If the application opens a socket only when it needs to communicate, that is reactive power management.

A more sophisticated example might involve the process by which a USB drive is accessed by the software. If the file system is mounted and a file is open, and the application is reading and writing the file, then the device is active. But what if the task becomes busy in such a way that it does not read or write data into the file for several seconds? The system can reactively determine that the device is not using the USB drive, and after an inactivity timeout, tell the system to act accordingly. This technique is depicted in Figure 1.

Although basic in its implementation, it remains a good first step toward effective power management. It does have one major drawback though—by the time reactive power management takes place, electrons have already been sent through the system. Reactive power management does nothing to scale the processor. The CPU does not get scheduled correctly and prioritizations of tasks are excluded. So while reactive power management helps start to preserve battery life, it is by no means the final answer.

An even more efficient power management system is one in which an application gets scheduled for power consumption and is prioritized by the device, before the application is launched. Proactive power management is one way of achieving this goal.
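Before moving on, here is a minimal sketch of the reactive inactivity-timeout idea just described. It is a desktop-Python approximation, not Nucleus OS code: the suspend/resume callbacks stand in for a real driver interface, and a real RTOS would hook the driver layer rather than spawn a watchdog thread.

```python
# Reactive power management: power a peripheral down after a period with
# no reads or writes. The device interface here is hypothetical.
import threading
import time

class InactivityManagedDevice:
    def __init__(self, timeout_s, suspend, resume):
        self._timeout_s = timeout_s
        self._suspend = suspend          # callback: power the device down
        self._resume = resume            # callback: power it back up
        self._last_use = time.monotonic()
        self._suspended = False
        self._lock = threading.Lock()
        threading.Thread(target=self._watchdog, daemon=True).start()

    def touch(self):
        """Call on every read/write; wakes the device if it was suspended."""
        with self._lock:
            self._last_use = time.monotonic()
            if self._suspended:
                self._resume()
                self._suspended = False

    def _watchdog(self):
        while True:
            time.sleep(self._timeout_s / 4)
            with self._lock:
                idle = time.monotonic() - self._last_use
                if not self._suspended and idle >= self._timeout_s:
                    self._suspend()      # reactive: act only after the fact
                    self._suspended = True

# Hypothetical usage: five idle seconds power the USB drive down.
usb = InactivityManagedDevice(5.0,
                              suspend=lambda: print("USB drive: low power"),
                              resume=lambda: print("USB drive: active"))
usb.touch()   # a driver would call this from its read/write path
```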



Proactive power management
Proactive power management plays with the notion that developers can predict the future. Of course, that is not entirely possible; however, developers can use complex scheduling techniques to predict what the power use will be when the system is in operation. The data can be discovered manually by programming the system with a power-use scenario, or by dynamically measuring which domains are active and when. For example, let's say a mobile device has been asked to run the Bluetooth radio. The user is typing a text message on the keypad at the same time. What are the power requirements of the Bluetooth radio when the keypad is engaged? A proactive power management system might say, "Do not communicate at the highest speed, communicate at a lower speed." The result is less bandwidth for the Bluetooth radio, but the system knows the keypad is in use, so high-speed bandwidth is not required. It's all about trade-offs that do not affect the apparent device performance. With proactive power management, the user never notices any kind of performance degradation. Proactive power management also works for file transfers. Developers can optimize the system so that a transfer still happens, but consumes less power by scaling back the voltage or frequency of the CPU. Proactive power management senses that instead of transferring a digital photo in five seconds, it can transmit it in 30 seconds with no discernible difference to the end-user experience. Here is one final example. If a system has 10 tasks and all are ready to run, one would expect the system to be busy running these tasks. It makes sense to run the CPU at high power. It is important to note, though, that exactly which 10 tasks are running might make a significant difference. If the system integrator can establish that every time a certain task is made ready to run (regardless of whether it is actually scheduled) the system will increase its power usage, then dynamic voltage and frequency scaling (DVFS) can be used. As seen in Figure 2, DVFS provides enough cycles to get the job done without wasting electrons. Continuing with this line of thinking, it is sometimes better to consume a little more power now so as not to degrade the quality of the user experience while waiting for the power modes to change. Proactive power management techniques are the new wave in power management. Mentor Graphics, through its highly regarded Nucleus OS, is currently investigating several approaches in this area. These techniques will provide system developers and integrators with the necessary elements to implement a more sophisticated power management policy.
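The DVFS decision itself can be pictured as choosing the slowest frequency that still meets every ready task's deadline. The Python sketch below illustrates this in the spirit of Figure 2; the frequency ladder and task numbers are invented for the example and do not come from Nucleus OS:

# Pick the lowest CPU frequency that satisfies all ready tasks' deadlines.
MAX_CYCLES_PER_SEC = 1e9
FREQ_STEPS = [0.2, 0.4, 0.6, 0.8, 1.0]     # fractions of maximum frequency

def pick_frequency(ready_tasks):
    """ready_tasks: list of (cycles_needed, seconds_to_deadline) tuples."""
    demand = sum(c / d for c, d in ready_tasks)     # required cycles/second
    for f in FREQ_STEPS:
        if f * MAX_CYCLES_PER_SEC >= demand:
            return f                                # slowest sufficient step
    return FREQ_STEPS[-1]

# A lone background file transfer can run slowly (cf. the photo sent in 30
# seconds rather than five); adding interactive tasks pushes the frequency up.
print(pick_frequency([(2e9, 30.0)]))                            # 0.2
print(pick_frequency([(2e9, 30.0), (4e8, 0.5), (3e8, 0.5)]))    # 1.0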

Conclusion
The idea of a power-efficient portable electronic device must be taken seriously. System developers can begin

to do this by taking a more holistic approach to power conservation, utilizing hardware and infrastructure designed to scale back power use and by using software that is capable of controlling a device’s overall power consumption. At the very center of this approach is the concept of a power-aware OS that combines both reactive and proactive power management techniques. Whether it is the individual system developer or an electronic device manufacturer who is most actively pursuing a more power-efficient product line, it is the end-user who benefits most by having a device with improved battery life along with built-in capabilities that actually contribute to a greener planet.

Mentor Graphics Corporate Office 8005 SW Boeckman Rd Wilsonville OR 97070 USA T: +1 800 547 3000 W: www.mentor.com




< TECH FORUM > EMBEDDED

FPGA-based speech encrypting and decrypting embedded system

K.A.S. Srikanth Pentakota, Texas A&M University

K.A.S. Srikanth Pentakota holds a Bachelor's degree from N.I.T. Rourkela, India and is pursuing a Master's in mixed-signal devices at Texas A&M University. He has worked as a project assistant at the Indian Institute of Science, Bangalore and as a subject matter expert in DVCI, AMDOCS.

We have undertaken a design to assess the viability of using FPGAs in embedded systems with real-time requirements by using one as the basis for a digital speech encryption and decryption system (Figure 1). Digital speech encryption is one of the most powerful countermeasures against eavesdropping on telephonic communications, and was therefore a good test of both the FPGA and the surrounding infrastructural technology. For the purpose of the exercise, we selected the Xilinx Virtex II Pro platform FPGA.

FPGAs as building blocks
The use of FPGAs and configurable processors has become an increasingly interesting option for embedded system development. FPGAs offer all of the features needed to implement even the most complex designs. Clock management is facilitated by on-chip phase-locked loop (PLL) or delay-locked loop (DLL) circuitry. Dedicated memory blocks can be configured as basic single-port RAMs, ROMs, FIFOs, or CAMs. Data processing capabilities, as embodied in the devices' logic fabric, can vary widely. The ability to link an FPGA with backplanes, high-speed buses and memories is provided through both on-chip and development kit support for various single-ended and differential I/O standards. Also, today's FPGAs feature such system-building resources as high-speed serial I/Os, arithmetic modules, embedded processors and large amounts of memory. There are also options from leading vendors for embedded cores and configurable cores. We developed our FPGA-based embedded system as a completely programmed chip implementing a complex function. Its development faced minimal delays and exploited what FPGAs have to offer such complex systems, particularly their practical use in situations that demand the utmost accuracy and precision.

FIGURE 1 Block diagram for standard encryption and decryption (sender: plain text is encrypted and sent over the network as cipher text; receiver: cipher text is decrypted back to plain text) (Source: Texas A&M)

FIGURE 2 Encryption and decryption (plain text plus an encryption key enter the encryption algorithm to produce cipher text; cipher text plus a decryption key enter the decryption algorithm to recover the plain text) (Source: Texas A&M)

Speech encryption and decryption principles
Speech encryption has always been a very important military communications technology (Figure 2). Given the technology available today, digital encryption is considered the best technological approach. Here, an original speech signal, x, is first digitized into a sequence of bits, x(k), which is then encrypted digitally into a different sequence of bits, y(k), and finally transmitted. But while digital encryption techniques can offer a very high degree of security, they are not entirely or immediately compatible with all of today's communications networks. Most telephone systems are still largely analog. Most practical speech digitizers operate at bit rates higher than those that can easily be transmitted over standard analog telephone channels. Meanwhile, low bit-rate speech digitizers



The Texas A&M team describes an FPGA embedded system design project intended to assess the technology’s suitability for use in real-time, high-performance applications.

still entail relatively high design complexity but offer relatively poor quality results. Furthermore, almost all digital encryption techniques rely on accurate synchronization between the transmitter and the receiver. Exactly the same block of bits must be processed by the encryption and decryption devices. This not only increases design complexity, but also makes transmission much more sensitive to channel conditions; a slight synchronization error due to channel impairment can completely break the transmission. There is another type of speech encryption technique, scrambling. The original speech signal, x, is scrambled directly into a different signal, y(t), in analog form before transmission. Since the scrambled signal is analog, with similar bandwidth and characteristics to the original speech signal, this type of technique can be easily used with existing analog telephone systems. Some conventional scrambling techniques (e.g., frequency inversion, band splitting) do not require synchronization but today offer only relatively low levels of security. More advanced scrambling techniques have recently been developed (e.g., sample data scrambling) and are now used extensively because they continue to offer relative ease of implementation alongside improved levels of security. A typical advanced scrambling sequence is as follows. Original speech, x, is first sampled into a series of sample data, x(n). This is then scrambled into a different series of sample data, y(n), and recovered into a different signal, y(t), for transmission. These techniques offer a relatively high level of security and are compatible with today's technical environment. However, like digital techniques, there is again a heavy dependence on synchronization between transmitter and receiver. The transformation from x(n) into y(n) has to be performed frame-by-frame, and exactly the same frame of sample data has to be used in the scrambling and descrambling processes for the signal to be recovered. As with digital approaches, this complicates implementation and makes transmissions very sensitive to channel conditions. Recently, two new sample data scrambling techniques have emerged. One scrambles the speech in the frequency domain, and the other scrambles it in the time domain.

The test application is a speech encryption system intended for use in military and high-security environments. The team used a Xilinx Virtex II Pro platform FPGA for the project, and was also particularly interested in using the peripheral technologies made available.

FIGURE 3 Symmetric-key cryptography system (sender and receiver share a secret key; plain text is encrypted, sent over the network as cipher text and decrypted back to plain text) (Source: Texas A&M)

FIGURE 4 Offline process (voice input is recorded and converted to a .txt file by Matlab on a PC; the FPGA encrypts the file, then decrypts it, and Matlab plays back the decrypted voice) (Source: Texas A&M)

Both preserve the advantages of traditional sample data scrambling, while eliminating the requirement for synchronization in the receiver. This simplifies the system structure, and significantly improves the feasibility and reliability of sample data scrambling techniques. The basic point here is that the synchronization is only necessary as long as the scrambling and descrambling are performed frame-by-frame. It becomes unnecessary when a 'frame' is not defined in the operation. Scrambling based on frequency band swapping of the analog signal can be used in a wide variety of analog and digital systems since the method can transmit speech signals over a standard telephone line with acceptable quality.
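To show the shape of frequency-domain scrambling, here is a toy Python sketch (not from the project, which used VHDL) that permutes FFT bins with a keyed permutation and inverts it to descramble; a real scrambler works on the analog band and must respect telephone-channel bandwidth:

# Toy frequency-domain scrambler: permute spectral bins with a keyed
# permutation; descrambling applies the inverse permutation.
import numpy as np

def scramble(frame, rng):
    spectrum = np.fft.rfft(frame)
    perm = np.arange(len(spectrum))
    perm[1:-1] = rng.permutation(perm[1:-1])   # keep DC and Nyquist in place
    return np.fft.irfft(spectrum[perm], n=len(frame)), perm

def descramble(frame, perm):
    spectrum = np.fft.rfft(frame)
    inverse = np.argsort(perm)                 # undo the permutation
    return np.fft.irfft(spectrum[inverse], n=len(frame))

rng = np.random.default_rng(seed=42)           # the seed acts as the key
speech = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 256))
garbled, perm = scramble(speech, rng)
restored = descramble(garbled, perm)
print(np.allclose(speech, restored))           # True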





FIGURE 5 Online process (voice input passes through an ADC into FPGA encryption and out through a DAC as encrypted voice; the encrypted voice passes through an ADC into FPGA decryption and out through a DAC as decrypted voice) (Source: Texas A&M)

In the time domain method, a digital speech signal is encrypted. This is based on a redundant bit to protect speech information effectively. Although the method is secure, it is hard to apply to a conventional analog transmission line because the required bandwidth is wide. To reduce the bit rate below the bandwidth of the analog line, a speech encryption system with a low bit-rate coding algorithm is necessary.

Cryptography principles
The main components involved in cryptography are:
• a sender;
• a receiver;
• plain text (the message before it is encrypted);
• cipher text (the message that has been encrypted);
• encryption and decryption algorithms; and
• a key or keys.
All methods of cryptographic encryption are then divided into two groups:
• symmetric-key cryptography (a.k.a. private key cryptography); and
• public-key cryptography.
In public key cryptography, the encryption and decryption algorithms are public but the key is secret. Only the key needs to be protected rather than the encryption and decryption algorithms. In symmetric-key cryptography (Figure 3), a common key is shared by both sender and receiver. Particular advantages of a symmetric-key cryptography algorithm include the following:
• Less time is needed to encrypt a message than when using a public key algorithm.
• The key is usually smaller, so symmetric-key algorithms are used to encrypt and decrypt long messages.

Disadvantages include:
• Each pair of users must have a unique key.
• So many keys may be required that their distribution between the two parties becomes difficult.
Symmetric-key algorithms can be divided into traditional ciphers and block ciphers. Traditional ciphers encrypt the bits of the message one at a time. Block ciphers take several bits and encrypt them as a single unit. Blocks of 64 bits have been commonly used. Today, some advanced algorithms can encrypt blocks of 128 bits.

FPGA implementation
We worked on two simulation processes for our system:
• an offline simulation using the Xilinx 9.1i project navigator; and
• an online simulation using the Virtex II Pro platform FPGA.

Offline process
The offline simulation process is shown in Figure 4. It did not include a hardware implementation but was intended only to allow for self-test of the encryption/decryption algorithm. Assuming this was successful, our plan was then to move on to the online process. The following steps were required:
• In the recording process, we first took a voice source input. Then, using Matlab, we created a text (.txt) file. The parameters given during creation of the .txt file were voice duration and its bit rate. Matlab code was written for the conversion of speech (.wav file) to text (.txt) file.
• After successfully creating a text file, a testbench read it character-by-character and then these values were mapped for the encryption/decryption operation.
• After the encryption/decryption operation had been performed, the testbench again created a text file, but this one contained the encrypted version of the original voice input.
• In the final step, Matlab code again read the encrypted version of the text file and played it at the specified bit rate.

Online process
The online process (Figure 5) included the software as well as the hardware implementation. The board on which the VHDL code was burned contained built-in analog-to-digital and digital-to-analog converter (ADC and DAC) ports. The VHDL code written at the transmitting end therefore participates in three operations:
• It converts analog voice into digital data by an ADC.
• It encrypts the data using symmetric-key cryptography.
• It sends the encrypted values to the outside world via a DAC.



Similarly, at the receiver end, the code participates in these three tasks:
• analog-to-digital conversion;
• use of the standard decryption algorithm; and
• digital-to-analog conversion.
This is a real-time process. Input comes continuously from a microphone and is given to the ADC on the Virtex II Pro board. We preferred symmetric-key cryptography for encryption because here the same code performs both encryption and decryption operations. As noted, symmetric-key cryptography involves both sender and recipient using a common key. The sender uses an encryption algorithm and the key for encryption of data, and the recipient uses a decryption algorithm and the same key for the decryption of data. In this process of cryptography, the algorithm used for decryption is the reverse of the encryption algorithm. The shared key must be set up in advance and kept secret from all other parties. The Stream Cipher of the Data Encryption Standard algorithm was used here. This is a class of ciphers in which encryption or decryption is performed using separate keys created on numerous occasions by a keygen. In our context, the key space consists of 30 different keys for 30 data samples, and the key space repeats itself for each subsequent data sample. A shift algorithm is also used to obtain these 30 different keys. The algorithm consists of the generation of the key space and the XORing of the space in which the data samples are designed, simulated, implemented on an FPGA, and then tested on hardware. In this implementation, the keys used for encryption/decryption were each 14 bits long. Since the ADC gives a 14 bit output, our keys are confined to 14 bits. But the DAC present on the Virtex II Pro takes a 12 bit input, so we neglect two MSBs of the ADC output. The DAC then sends the encrypted version of the speech to the outside world.
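The article does not reproduce the VHDL, and the exact shift algorithm behind the keygen is not specified. As a rough Python sketch of the scheme's shape, the version below uses a simple rotate-left as a stand-in keygen; because the cipher is a plain XOR, the same routine encrypts and decrypts:

# Sketch of the 30-key, 14-bit XOR stream cipher described above.
def keygen(seed, nkeys=30, width=14):
    keys, k = [], seed & ((1 << width) - 1)
    for _ in range(nkeys):
        keys.append(k)
        k = ((k << 1) | (k >> (width - 1))) & ((1 << width) - 1)  # rotate left
    return keys

def crypt(samples, keys):
    # XOR is its own inverse, so one routine does both directions.
    return [s ^ keys[i % len(keys)] for i, s in enumerate(samples)]

keys = keygen(seed=0x2A5F)                    # seed value is invented
adc_samples = [512, 1023, 8191, 0, 300]       # 14-bit ADC codes
cipher = crypt(adc_samples, keys)
assert crypt(cipher, keys) == adc_samples     # round trip restores the voice
dac_codes = [c & 0xFFF for c in cipher]       # keep 12 bits for the DAC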

Encryption and decryption in practice
Offline process
In the offline process, we used Matlab to record and play speech, to store the recorded speech in a text file, and to play a speech waveform by reading it from a text file. There are, as noted, Matlab codes that write to and read from text files. However, we then had to write some VHDL code. One module was for the speech encryption/decryption process and the others were needed to read the text file we had generated in Matlab through the Xilinx testbench. We observed that the larger the number of bits used to represent the discrete values of a sampled speech signal, the greater the clarity of the speech, although a more complex algorithm was required for higher bit representations.

Ultimately, we used 12 bit representations for each sample value and took 20 samples at a time for encryption with 20 different random bit sequences.

Online process
In the online process, we first programmed the ADC/DAC converters on the Virtex II Pro board to sample the incoming analog speech signal at a frequency of 1MHz. Once we had the sampled data value for the speech signal, we encrypted it using a random bit sequence. We took a 20-state code for this purpose with a key space of 20 random bit sequences. The encrypted data values were then sent simultaneously through a DAC to speakers. For decryption, since we had used an XORing scheme, we could reuse the same code we had developed for encryption. The ADC output is a 14 bit number whereas the DAC input had to be 12 bit, so we had to convert the 14 bit number to a 12 bit number by rounding off the two LSBs.
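A rough Python analogue of the offline flow can make the framing explicit. The listing below quantizes a synthetic waveform to 12-bit samples, XOR-scrambles 20 samples at a time with 20 random 12-bit sequences, and writes the result to a text file; file names, the RNG seed and the waveform are placeholders, since the team's actual Matlab and VHDL code is not reproduced here:

# Offline flow sketch: 12-bit samples, 20-sample frames, XOR key of 20 words.
import numpy as np

FRAME, BITS = 20, 12
rng = np.random.default_rng(7)
key = rng.integers(0, 1 << BITS, size=FRAME)          # 20 random sequences

speech = (np.sin(np.linspace(0, 40, 1000)) * 2047 + 2048).astype(np.int64)

def scramble(samples):
    out = samples.copy()
    for start in range(0, len(out) - FRAME + 1, FRAME):
        out[start:start + FRAME] ^= key               # frame-wise XOR
    return out

encrypted = scramble(speech)
np.savetxt("speech_encrypted.txt", encrypted, fmt="%d")  # the .txt hand-off
assert np.array_equal(scramble(encrypted), speech)       # XOR round trip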


Department of Electrical & Computer Engineering Dwight Look College of Engineering Texas A&M University Zachry Engineering Center College Station TX 77843 USA T: 1 979 845 7441 W: www.ece.tamu.edu




< TECH FORUM > ESL/SYSTEMC

Rapid design flows for advanced technology pathfinding

P. Christie, A. Nackaerts & G. Doornbos, NXP-TSMC Research Center; A. Kumar & A. S. Terechko, NXP Semiconductors

NXP Semiconductors is one of Europe's leading silicon design companies. The NXP-TSMC Research Center is an R&D collaboration between it and TSMC, the world's largest foundry.

Introduction
The complexity of current design flows makes it extremely time-consuming to evaluate new device technologies in terms of the parameters that designers need (e.g., clock rate, die area, battery life). The process can take months. This research describes several simplifications to standard design flows that enable extremely short experiment turn-around times (often less than a day) while maintaining reasonable timing accuracy (better than 10%). This approach is illustrated using two use-case examples. In the first example, the impact of two competing 15nm technologies on the clock-rate versus area trade-off of a large block of intellectual property (IP) is analyzed. In the second example, rapid design flows are then coupled to a design flow description language to enable unique experiments at the 45nm node that directly link process-level variability to timing variations, without recourse to perturbations of device model parameters.

Rapid design flow components
The components of an RDF are shown in Figure 1. Transistors incorporating new materials, architectures and transport mechanisms are designed using Technology Computer Aided Design (TCAD) tools and embedded in the RDF using a model designed to be extracted from only six I-V curves, with eight model parameters:
beta: gain factor for a one micron wide device
vt0: threshold voltage at zero back-bias
m0: sub-threshold slope
gam0: drain-induced threshold shift for zero gate bias
gam1: drain-induced threshold shift for large gate bias
the1: mobility reduction due to vertical field
the3: mobility reduction due to lateral field
rs: source/drain series resistance
Although simple, the RDF model maintains the ability to model deep submicron device non-idealities, such as DIBL, velocity saturation, reduced sub-threshold slope and series resistance.

TABLE 1 RDF benchmark using 65nm technology node and a set of 15 combinatorial cells (Source: NXP/TSMC)
Extraction time: automated RDF model extracted in 0.3 sec
Area: RDF library 25% larger area than reference
Delay: RDF library 11% faster than reference
Output slope: RDF library 13% larger slope than reference
Characterization time: RDF model 30% faster than BSIM4 in same library

Such dramatic model simplification is possible because the dominant pole (i.e., RC constant) of the standard cell frequency response is given by the product of the cell output resistance and the distributed interconnect capacitance. Only the DC properties of the device/cell then need to be modeled accurately [1]. Table 1 summarizes the main results of a benchmarking exercise at the 65nm node, to compare a fully automated RDF-generated library with a commercially produced library. For this comparison, the RDF model was extracted from I-V curves generated from a 65nm BSIM4 model, and design rules for cell layout were generated automatically from the specifications for the lithography tools used at that node. We observed that 10% timing accuracy was maintained at the cost of a 25% increase in cell area due to the automated cell compaction procedures. A similar exercise at the 45nm node also showed an area penalty of approximately 25%, indicating that the cell area offset is predictable and node-independent. Although a significant 30% advantage in run-time is observed, the real benefit of the RDF model is the rapid extraction time, 0.3s. This is exploited in the last section of the article.
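The mm9p8 equations themselves are not given in the article. Purely to illustrate how an eight-parameter DC model of this kind might be evaluated, the Python sketch below uses generic textbook expressions for each listed effect; the formulation, the sub-threshold prefactor and the series-resistance treatment are invented for the example, not NXP's model:

# Toy DC MOSFET evaluation using the eight RDF parameter roles.
import math

def drain_current(vgs, vds, p):
    # DIBL: threshold shifts with vds, interpolating gam0 -> gam1 with vgs
    vt = p["vt0"] - (p["gam0"] + (p["gam1"] - p["gam0"]) * min(vgs, 1.0)) * vds
    vov = vgs - vt
    if vov <= 0:                                   # sub-threshold, slope m0
        return 1e-7 * math.exp(vov / (p["m0"] * 0.026))
    # mobility degradation from vertical (the1) and lateral (the3) fields
    beta_eff = p["beta"] / ((1 + p["the1"] * vov) * (1 + p["the3"] * vds))
    vde = min(vds, vov)                            # crude saturation clamp
    ids = beta_eff * (vov * vde - 0.5 * vde ** 2)
    return ids / (1 + ids * p["rs"] / max(vgs, 1e-9))  # crude rs loss

params = dict(beta=0.065, vt0=0.53, m0=1.66, gam0=0.05, gam1=0.02,
              the1=0.3, the3=0.1, rs=123.0)        # cf. Table 2's Si NMOS
print(drain_current(1.0, 1.0, params))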

Coupling to standard synthesis/timing flows
It is anticipated that at the 15nm node, die area will be more important than processing speed. An RDF was therefore used to calculate the area of a system L2 cache controller [2] (approximately 400,000 cells) at constant clock rate, using two competing 15nm technologies. The first was a fully depleted SOI (FDSOI) technology, and the second was a III-V NMOS/Ge PMOS technology.



In both cases, the RDF model was based on a bulk CMOS device and therefore the extracted parameters are empirically rather than physically based. Table 2 lists a selection of extracted parameters for each device. The predicted trade-offs between the area and clock rate for the two technologies are shown in Figure 2, where the III-V/Ge library implements the controller in half the area of the FDSOI library at a clock rate of 3GHz.

The paper describes several innovative modifications to standard design flows that enable new device technologies to be rapidly assessed at the system level. Cell libraries from these rapid flows are employed by a design flow description language (PSYCHIC) for the exploration of highly speculative 'what if' scenarios. These rapid design flows are used to explore the performance of two competing 15nm technologies in a system L2 cache controller and a PSYCHIC analysis of statistical timing variations in a 45nm memory concentrator.

FIGURE 1 Schematic representation of RDF components (TCAD silicon I-V data and core design rules feed a reduced-order device model (mm9p8) and complex design rules; together with an IP library (VHDL), the Rapid Library Creation tool (RLC) generates library files and wire load models for synthesis, static timing/power analysis, and the PSYCHIC toolbox with its timing/power scripts) (Source: NXP/TSMC)

FIGURE 2 Clock rate-area trade-offs for 15nm system L2 cache controller (400,000 cells, cell height 0.624um): area (0.8-2.2 x10^5 um^2) against clock rate (1,000-4,500MHz) for the FDSOI and III-V/Ge libraries (Source: NXP/TSMC)

A design flow description language
For even more rapid technology assessment scenarios, we have developed a design flow description language, called PSYCHIC. This is implemented as a set of functions in a Matlab toolbox (Table 3). The PSYCHIC approach relies on the use of defined functions to construct custom scripts, tailored to each modeling problem, rather than develop a single compiled program. Here is the script for emulating the static timing of the critical path within the system L2 cache controller.

% STA script for path from system L2 cache controller
tech = GaAsGeTech ;    % imported technology information
lib = GaAsGeLib2 ;     % imported 15nm III-V/Ge RDF library
logicDepth = 9 ;       % number of cells in timing path
slew = zeros(1,logicDepth) ;    % starting transition times
delay = zeros(1,logicDepth) ;   % starting delays
M = 630 ; N = 630 ;    % rows and columns of cells
height = GaAsGeLib2.INVD1.height ; width = height ;   % square cells
peff = 0.8 ; reff = [0.0 0.7 0.7 0.7 0.7] ;   % set place\route efficiency
tc = 3 ; tn = 2 ;      % set terminals per cell and net
% calculate average wire length within array
lav = savlength(height,width,peff,reff,tc,tn,tech) ;
% convert to capacitance
cload = cint(2,lav,tech).*ones(1,logicDepth) ;
initSlew = 0 ;         % first input transition time
[delay(1),slew(1)] = timing(GaAsGeLib2.DFCND1,cload(1),initSlew) ;
[delay(2),slew(2)] = timing(GaAsGeLib2.INVD1,cload(2),slew(1)) ;
[delay(3),slew(3)] = timing(GaAsGeLib2.AN2XD1,cload(3),slew(2)) ;
[delay(4),slew(4)] = timing(GaAsGeLib2.ND3D2,cload(4),slew(3)) ;
[delay(5),slew(5)] = timing(GaAsGeLib2.NR2D1,cload(5),slew(4)) ;
[delay(6),slew(6)] = timing(GaAsGeLib2.INVD2,cload(6),slew(5)) ;
[delay(7),slew(7)] = timing(GaAsGeLib2.NR2D1,cload(7),slew(6)) ;
[delay(8),slew(8)] = timing(GaAsGeLib2.NR2D1,cload(8),slew(7)) ;
[delay(9),slew(9)] = timing(GaAsGeLib2.IOA21D2,cload(9),slew(8)) ;
pathDelay = sum(delay) ;

Note that the script itself is node-independent, and technology and library information are 'fire-walled' within separate tech and lib files, respectively. The toolbox makes it easy to couple RDF libraries with new design flow concepts such as statistical static timing analysis. In this context, the output load, input and output transition times and delay parameters of the timing function are not single values but probability density functions. Scripts based on this approach have been used to assess the impact of process-level variability on critical path timing. Variations were introduced into a TCAD model of a 45nm device by varying the gate insulator thickness (σEOT = 1Å) and gate length (σL = 3nm), in order to produce 50 sets of NMOS and PMOS I-V curves (five hours processing time). RDF device models were then extracted for each device variant (extraction time 30 sec). Fifty libraries were then generated and characterized (total time one day) and imported into PSYCHIC. Statistical static timing experiments were carried out on the slowest timing path extracted from a 45nm memory concentrator block within a multimedia processor SoC. Figure 3 shows the results. The timing histograms for each cell in the path were convolved to produce the overall path delay.
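The convolution step follows from the fact that path delay is the sum of independent cell delays, so their distributions convolve. A small Python sketch (the bin width and example histogram are invented; PSYCHIC itself is Matlab) shows the mechanics:

# Statistical path delay as a convolution of per-cell delay histograms.
import numpy as np

BIN_PS = 5                                   # histogram bin width in ps

def path_delay_pdf(cell_pdfs):
    pdf = np.array([1.0])                    # delta at zero delay
    for cell in cell_pdfs:
        pdf = np.convolve(pdf, cell)         # add one cell's random delay
    return pdf / pdf.sum()

# A toy cell histogram: most mass around 10-15ps, a slow tail to 25ps.
cell = np.array([0.0, 0.1, 0.3, 0.4, 0.15, 0.05])
path = path_delay_pdf([cell] * 9)            # a 9-cell path, cf. logicDepth
mean_ps = BIN_PS * np.dot(np.arange(len(path)), path)
print(f"mean path delay ~ {mean_ps:.1f} ps")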




TABLE 2 Some key RDF model parameters for 15nm devices (Source: NXP/TSMC)
Parameter: Si NMOS / Si PMOS / III-V NMOS / Ge PMOS
beta (A/V^2): 0.0647 / 0.0284 / 0.2055 / 0.1023
vt0 (V): 0.5267 / 0.4719 / 0.5424 / 0.4996
m0: 1.6593 / 1.6904 / 1.9846 / 2.0000
rs (Ω): 123.2600 / 146.4500 / 33.1310 / 24.9610

TABLE 3 Partial PSYCHIC function listing (Source: NXP/TSMC)
cint: interconnect capacitance
rint: interconnect resistance
savlength: average wire length
crosstalk: calculate induced cross-talk voltage
sintenergy: supply-side energy dissipation in interconnect during logic transition
dintenergy: demand-side energy dissipation in interconnect during logic transition
pplaceflat: pseudo-placement using site function
proute: allocate wire length distribution to layers
timing: cell static timing analysis
stiming: cell statistical static timing analysis

FIGURE 3 PSYCHIC prediction for statistical timing delay for critical data path in memory concentrator IP block due to process-level variability (probability (%) against path delay, 50-350ps) (Source: NXP/TSMC)

This approach allows fundamental experiments to be performed on the effects of process-level variability on system-level timing, and avoids issues associated with varying individual parameters within a single compact model to generate timing statistics [3].

Conclusions
Benchmarking with commercially generated libraries shows that 10% timing accuracy and 30% run-time gain can be maintained with RDF libraries at the cost of a consistent, node-independent 25% cell area penalty. As an example, this approach was used to analyze the clock rate-area trade-off for a system L2 cache controller implemented using two competing 15nm technologies. In order to explore more speculative 'what if' scenarios and to avoid costly synthesis and timing tools, a design flow description language was developed. Scripts written in this language were shown to reproduce critical path timing data from, for example, Cadence Encounter to 2% accuracy. The unique capabilities of RDF libraries and the ease of implementing new PSYCHIC functions were employed to analyze statistical timing variations in a 45nm memory concentrator.

References
[1] P. Christie, et al., Proc. IEDM, 2007.
[2] P. Stravers, et al., Proc. Int. Symp. VLSI-TSA, 2001.
[3] C. Visweswariah, Proc. Design Automation Conference, 2003.

NXP-TSMC Research Center Kapeldreef 75 B-3001 Leuven Belgium NXP Semiconductors High Tech Campus 5656 AE Eindhoven The Netherlands W: www.nxp.com W: www.tsmc.com


TAPE-OUT COMES A LOT FASTER

WITH CALIBRE nmDRC.

Calibre® nmDRC

There’s nothing like having an advantage when you’re racing to market.

That’s exactly what you get with Calibre nmDRC. Widely recognized as the world’s most popular physical verification solution, Calibre’s hyperscaling architecture produces the fastest run times available. To accelerate things even more, Calibre nmDRC adds incremental verification and a dynamic results-viewing and debugging environment. Designers can check, fix and re-verify DRC violations in parallel, dramatically reducing total cycle time. Start out and stay ahead of the pack. Go to mentor.com/go/calibre_nmdrc or call us at 800.547.3000.

©2008 Mentor Graphics Corporation. All Rights Reserved. Mentor Graphics and Calibre are registered trademarks of Mentor Graphics Corporation.



< TECH FORUM > VERIFIED RTL TO GATES

Multiple cross clock domain verification

Snehal Patel, eInfochips

Snehal Patel is a project leader in the ASIC Verification division of eInfochips, a company providing integrated silicon, embedded system and software design and development.

The design of multi-million-gate systems-on-chip (SoCs) is made still more complex where engineers must account for the presence of multiple asynchronous clocks. The problem is not uncommon: for example, different external interface standards such as PCI Express (PCIe) and Internal Bus (IB) use different clock frequencies. Another factor can be the additional presence of a single fast clock that is effectively distributed over the entire chip. The verification of such designs has become especially tedious, time-consuming and therefore costly. Moreover, traditional functional simulation is proving inadequate for the verification of clock domain crossings (CDCs). Instead, this article describes a methodology for verifying the synchronization of different asynchronous clock domains that uses both structural and functional verification techniques. It also outlines the use of an assertion-based verification strategy that can be added to a simulation-based flow to improve design quality. SoCs with interfaces like PCIe and IB have two different operating frequencies and require synchronization during data transfer. Failing to synchronize data and control transfers between asynchronous clock interfaces leads to timing violations (e.g., setup and hold time) that cause signals to enter a metastable state. These metastable timing errors are difficult to detect as they are randomly generated. To detect them, you need the right combination of clock edges and data.

Metastability
The proper operation of a clocked flip-flop depends on the input signal being stable for a certain period of time before and after the clock edge. If the requirements for setup and hold time are met, then a valid output will appear at the output of the flip-flop after a maximum delay. But if the requirements are not met, the signal will take much longer to reach a valid output level. We refer to the result as either an 'unstable state' or 'metastability' (Figure 1). Metastability is avoided by adding various features to a design. Two of the most common such features are:
• two flip-flop circuits (this technique requires a stable design, since incorrect design will lead to synchronization failure); and/or
• FIFO buffers (handshaking mechanisms).

FIGURE 1 Metastability (clock B samples input InA while it is still charging, leaving the synchronized signal InB in a metastable state) (Source: eInfochips)

FIGURE 2 Two flip-flop circuit (d_in passes through two flip-flops clocked by clk to produce d_out) (Source: eInfochips)

Structural clock domain verification
Structural analysis checks the connectivity and combinational logic of a design. It should be performed on both the RTL code and the post-synthesis gate-level netlist. It can be performed using automated CDC tools, script-based verification and code review, and identifies the following issues:


• unresolved clocks;
• combinational logic failures (jitter and glitches); and
• insufficient synchronization of signals where synchronizers are based on flip-flops.

FIGURE 3 Timing diagram (clk, data in and real output waveforms for the two flip-flop synchronizer) (Source: eInfochips)

Functional clock domain verification
Structural analysis will find these errors. However, it alone will not verify whether synchronizers have been used correctly in the design. Here, we pull upon functional analysis techniques as they can identify the following types of issue:
• data stability between two different clock speeds (fast and slow);
• FIFO read and write when both pointers operate on different clock frequencies; and
• data stability with a handshake-based synchronizer.
To identify synchronizers, functional clock domain verification uses the same algorithm as structural verification. However, under functional verification, assertions can be implemented for each type of synchronizer and as checks on all possible input conditions. Clock domain analysis with assertions can also be performed at different levels (e.g., block level, full-chip level and gate level) and helps to detect real timing violations in the gate-level simulation. Let's now explore in more detail two of the design features used to overcome metastability.

1. Two flip-flop circuit
A two flip-flop circuit is used to add delay for synchronization on a signal path. For the example shown in Figure 2, input data values must be stable for three destination clock edges. There are three cases where the input will not be transferred to the synchronizer output: where it is sampled by only one clock edge, sampled by two clock edges, or not sampled by any clock at all. The advantage of a two flip-flop synchronizer is that it filters the metastability state, and this process is shown in Figure 3. The first 'd_in' pulse is never sampled by the clock and will not be transferred to 'd_out'. But if the rising edge of this pulse violates the previous hold time or the falling edge violates the current setup time, this value may propagate to the real circuit. The second 'd_in' pulse is sampled by the clock, but if this signal changes just before the clock edge because of a setup time violation, the simulation will transfer the new value. In this case, the real circuit will become metastable and will transfer the original low value.

Today's system-on-chip designs often need to encompass multiple asynchronous clocks. This raises the problem of verification for the resultant clock domain crossings. It is becoming apparent that functional simulation alone is not up to the task. Instead, engineers need to consider hybrid methodologies, combining structural and functional verification approaches. The use of assertions is also proving increasingly important to CDC verification, and the author provides examples of code he built to address the techniques and features demanded by increasingly complex designs.

The third 'd_in' pulse is sampled by two consecutive clock edges and will be filtered by the synchronizer. Since the rising edge of 'd_in' violates the setup time for the first clock edge and the falling edge violates the hold time for the next clock edge, simulation will propagate the pulse across two clocks. Here are the assertions and related descriptions required during this step.

To check stability of the data:

property stability;
  @(posedge clk) !$stable(d_in) |=> $stable(d_in) [*2];
endproperty : stability

To check for a glitch:

property no_glitch;
  logic data;
  @(d_in) (1, data = !d_in) |=> @(posedge clk) (d_in == data);
endproperty : no_glitch

assert property(stability);
assert property(no_glitch);

The ‘stability’ assertion checks the stability of ‘d_in’ for two clocks. The ‘no_glitch’ assertion checks for a glitch on ‘d_in’.

2. FIFO buffer circuits
2.1. Handshake synchronizer
Different types of handshake synchronizer are used to avoid CDC problems. In the type shown in Figure 4, the transmitter sends a request that is synchronized by two flip-flops. Here, the number of flip-flops is not fixed as it depends on the frequency of the clock. Once the receiver receives a request, it latches the data and sends an acknowledgement back to the transmitter. In this case, the data must be valid when the request is asserted. If the request is not synchronized properly because of a flip-flop error, it can be caught during structural verification. That step identifies errors that arise when incorrect data is latched through glitches generated by combinational logic driving a flop-based synchronizer. Insufficient synchronization is also a structural error. High-speed clocks require a greater number of flip-flops and any shortfall can be detected by structural verification. For verification of the handshake synchronizer circuit, the following conditions can be asserted:
1. Every request gets acknowledged.
2. No acknowledgement without a request.
3. Data stability.



32

FIGURE 4 Handshake synchronizer circuit (the transmitter control's request passes through flip-flops F1 and F2 into the clk2 domain, where the receiver control latches the data; the acknowledgement returns through F3 and F4 into the clk1 domain) (Source: eInfochips)

Here are the assertions and related descriptions required for this stage.

Every request gets acknowledged:

sequence req_transfer;
  @(posedge clk) req ##1 !req [*1:max] ##0 ack;
endsequence : req_transfer

property req_gets_ack;
  @(posedge clk) req |-> req_transfer;
endproperty : req_gets_ack

No acknowledgement without request:

property ack_had_req;
  @(posedge clk) ack |-> req_transfer.ended;
endproperty : ack_had_req

Data stability:

property data_stability;
  @(posedge clk) req |=> $stable(data) [*1:max] ##0 ack;
endproperty : data_stability

assert property(req_gets_ack);
assert property(ack_had_req);
assert property(data_stability);

The 'req_gets_ack' assertion checks whether every request gets acknowledged. The 'ack_had_req' assertion checks whether every acknowledgement has a request. The 'data_stability' assertion checks if the data is stable for the period of the request, including an acknowledgement.

2.2 Asynchronous FIFO circuit
A dual clock asynchronous FIFO circuit is used for CDC synchronization when high latency in the handshake protocol cannot be tolerated. The circuit changes according to the requirements of the SoC but its basic operation is constant (Figure 5). Data is written into the FIFO in the source clock domain and read from the FIFO in the destination clock domain. Read and write pointers are passed into different clock domains to generate full and empty status flags.

Data write and read are kept in synchronization by the write and read pointer positions. The following conditions can be asserted for verification of dual clock asynchronous FIFO circuits:
1. Data integrity.
2. Do not write when the FIFO is full.
3. Do not read when the FIFO is empty.
Here are the assertions and related descriptions for this step.

Don't write when the FIFO is full / don't read when the FIFO is empty:

property full_empty_access(clk, inc, empty_full_flag);
  @(posedge clk) inc |-> !empty_full_flag;
endproperty : full_empty_access

Data integrity:

int write_cnt, read_cnt;

always @(posedge write_clk or negedge write_rst_n)
  if (!write_rst_n) write_cnt = 0;
  else if (winc) write_cnt = write_cnt + 1;

always @(posedge read_clk or negedge read_rst_n)
  if (!read_rst_n) read_cnt = 0;
  else if (rinc) read_cnt = read_cnt + 1;

property data_integrity;
  int cnt;
  logic [DSIZE-1:0] data;
  disable iff (!write_rst_n || !read_rst_n)
  @(posedge write_clk) (winc, cnt = write_cnt, data = wdata) |=>
    @(posedge read_clk) first_match(##[0:$] (rinc && (read_cnt == cnt))) ##0 (rdata == data);
endproperty : data_integrity

assert property(full_empty_access(write_clk, winc, wfull));
assert property(full_empty_access(read_clk, rinc, rempty));
assert property(data_integrity);

The 'full_empty_access' assertion ensures that there is no write access when the FIFO is full and no read access when the FIFO is empty. The 'data_integrity' assertion starts a new thread with an initial count value whenever data is written into the FIFO, and checks the read data value against a local data variable whenever the corresponding read operation occurs.



FIGURE 5 Asynchronous FIFO circuit (the write pointer crosses into the read clock domain through flip-flops F1 and F2, and the read pointer crosses into the write clock domain through F3 and F4; write control drives the memory's write clock, write enable and write data, while read control drives the read clock, read enable and read data) (Source: eInfochips)

CDC jitter emulation
One problem that can arise because of synchronization is CDC jitter. Even though all signals may be synchronized properly for the destination clock domain, the actual arrival time can become uncertain if the signal goes metastable in the first synchronization flip-flop. This CDC jitter can lead to functional failures in the destination logic. Normally, this jitter would not show up in a simulation. Both formal and dynamic verification methodologies can be used to determine if the design still works with CDC jitter. Formal verification can prove that design properties are never violated for all possible jitter conditions, whereas simulation can only demonstrate that no assertion checks are violated for a particular jitter combination. Actual violations can also be difficult to debug in simulation. CDC jitter emulation provides an improvement in overall verification quality.
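As a behavioral illustration of what jitter emulation injects, the Python sketch below randomly stretches each crossing by one destination clock, mimicking the metastable flop settling either way; the two-flop latency and event times are invented, and a real emulation would be done in the HDL simulator:

# Behavioral sketch of CDC jitter: arrival stretched by 0 or 1 cycle.
import random

def synchronize_with_jitter(events, rng):
    """events: destination-clock cycles at which a CDC signal nominally
    arrives; returns arrivals with two-flop latency plus random jitter."""
    return [t + 2 + rng.randint(0, 1) for t in events]

rng = random.Random(1234)
nominal = [10, 25, 40]
for trial in range(3):
    print(synchronize_with_jitter(nominal, rng))
# Downstream checkers (assertions) must hold for every jitter combination.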

Gray encoding
In addition to handshake-based and FIFO-based synchronizers, another method of synchronizing data is first to gray encode it and then use multi-flop synchronizers to transfer it across domains. For multibit signals, a gray code ensures that only a single bit changes when a group of signals counts. This makes it possible to detect whether all the bits have been captured by the receiving clock together or if they have been skewed across multiple clock cycles due to metastability. Another example is in an asynchronous FIFO where the read and write pointers cross the write and read clock domains respectively. These pointers are control signals that are flop-synchronized, and each is gray-encoded prior to the crossover. This can be verified using assertions. The assertion is simply the exclusive-or of the data and the previous value of the data, with the condition that the result must have at most one active bit (i.e., it should be 'one hot').

The relevant assertion statement is shown here:

property check_graycoded = always ((gc == 8'h0) ||
  (((gc ^ prev(gc)) & ((gc ^ prev(gc)) - 8'h1)) == 8'h0));
assert check_graycoded;
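A small Python mirror of the same check can help make the 'one hot' trick concrete (the 4-bit counter below is only an example):

# Behavioral mirror of check_graycoded: successive gray values differ in
# at most one bit, i.e. their XOR is zero or a power of two.
def gray_encode(n):
    return n ^ (n >> 1)

def changes_at_most_one_bit(prev, curr):
    diff = prev ^ curr
    return diff == 0 or (diff & (diff - 1)) == 0   # same trick as the SVA

ptr = [gray_encode(i) for i in range(16)]          # a 4-bit gray counter
assert all(changes_at_most_one_bit(a, b) for a, b in zip(ptr, ptr[1:]))
assert changes_at_most_one_bit(ptr[15], ptr[0])    # wrap-around is also safe
print("gray sequence:", [format(g, '04b') for g in ptr])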

All the assertion statements in the examples above verify that the receiver gets the expected data. Since the latency between sending and receiving data is not known, verifying the sequence will ensure that no data gets dropped. The actual value in the data sequence should not make a difference to the verification; an assertion could be written using a more random sequence or specific data values. Formal verification can use this assertion to prove that the control logic does not allow any loss of data, and will generate counterexample traces if any failures are found.

Conclusion
Multiple cross clock domain verification is best performed using both structural and functional verification techniques. The verification process may start with structural verification to remove errors related to insufficient synchronization, combinational logic driving synchronizers or missing synchronizers. After structural verification, functional verification may be used to implement additional assertions and check the usage of the synchronizers.

eInfochips, Inc. 1230 Midas Way Suite# 200 Sunnyvale CA 94085, USA T: +1 408 496 1882 W: www.einfochips.com




< TECH FORUM > DIGITAL/ANALOG IMPLEMENTATION

TrustMe-ViP: trusted personal devices virtual prototyping

Gilles Jacquemod, LEAT, University of Nice-Sophia Antipolis

Gilles Jacquemod received his PhD from INSA Lyon in 1989. In 2000, he joined the LEAT laboratory at the Ecole Polytechnique of Nice-Sophia Antipolis University as a full professor. His primary research interests include analog design and the behavioral modeling of mixed domain systems.

Introduction
Like all electronic products, trusted personal devices (TPDs) have to be more and more competitive, delivering higher performance at lower cost. Reducing the bill of materials and time-to-market can best be achieved through the broadest and most complete integration, one that also includes elements such as the antenna and/or the packaging where smart cards or system-in-package (SiP) technologies are involved. This paper proceeds from the position that designing such complex systems requires the development of a single EDA framework composed of multi-engine simulators that are associated with a library of hierarchical models. Figure 1 shows the different parts in a TPD and the different types of signal involved in a typical communication between two such devices. It is clear that preparing them to handle the different signal shapes and frequencies is no trivial task. Indeed, during different design phases, engineers have to account for such factors as bit error rate (BER) on one hand, and power consumption on the other. Block design requires that electrical properties are set with due consideration for factors such as noise, non-linearity, gain or impedance matching. Each step requires different simulators, and all must be interoperable. Even then, there is usually a design gap between the system specification and the RF block design. It typically centers on:
• the use of different design frameworks and different simulators; and
• a huge frequency ratio between the RF (GHz) and the digital baseband (MHz).
Nevertheless, to meet an original system-level specification, engineers need to mix different levels of abstraction so that they can explore implementation architectures and then validate the final design at the circuit level. Luckily, existing behavioral models in the VHDL-AMS analog and mixed-signal language provide the basis for such a

seamless system-to-transistor-level approach. Relevant features include:
• top-level functional simulations for architecture validation using a top-down methodology;
• bottom-up verification with accurately characterized models;
• test program development using tester resource models;
• traditional measurement, post-processing and/or the use of testbenches; and
• IP exchange and protection to help assess a model against a specification.
This paper describes a methodology for the design of TPDs that can interface to multiple terminals and networks. This methodology is based on the Mentor Graphics ADVance MS (ADMS) framework [1], Matlab/Simulink from The MathWorks, and a hierarchical analog and mixed-signal intellectual property (AMS-IP) library.

The virtual RF system platform
The virtual platform
The different parts of a complex embedded system are usually specified and designed separately by engineers working with various EDA tools. For TPDs and similar communications devices, one must account for critical design parameters (e.g., cost, power consumption, channel effects) at the system level. The exploration and evaluation of various system architectures then requires access to a complete, hierarchical AMS-IP model library and a design environment that supports the use of several design levels. Then, test vectors that have been initially applied at the system level should also be available for reuse at each subsequent level in the design flow. Simulation time is very important. To control its influence on time-to-market, this methodology uses a four-level hierarchical model library (Figure 2): transistor (SPICE or physical level), structural (low level), behavioral (intermediate) and high level (ESL or specification). It is also tailored to the three main types of design process: bottom-up, top-down and meet-in-the-middle [2]. The overarching goal is to define the primitive components and objects, and to then develop them and the system structure simultaneously, so that the final system is constructed from these primitives in the middle of the design process.



As the market continues to push for higher performance at lower cost, trusted personal devices (TPDs) must also be able to exchange greater volumes of data, voice or streaming video at ever higher bit rates, while transmitting and receiving securely under multiple telecommunication standards. Reductions in cost and time-to-market can only be achieved by integrating the complete system design including such aspects as the antenna and the packaging.

The same paradigm is used to develop the hierarchical models. SystemC is used to provide a fast, high-level, C++ based framework for the simulation and to validate the system specifications [3]. Meanwhile, for non-electrical devices (e.g., micro-electromechanical systems [4], photonic components, etc.) finite element methods (FEMs) provide golden simulations [5] at the physical level, in the same way as SPICE does for electronic circuits. These simulations are used to validate higher-level models or to develop them. In fact, both top-down and bottom-up approaches are suitable in that the hierarchical library can be built by refinement or abstraction.

Limitations of the methodology
The main limitation concerns the development of the hierarchical library. Model development is difficult and needs a high level of expertise to be carried out effectively. Achieving an accurate RF implementation demands the full consideration of such factors as non-linearity effects, compression, noise, phase noise, frequency response, mismatch and so on. Moreover, RF designers also often use proprietary models during simulation, and this often means that there are limited opportunities for their reuse across different design environments. The problem of coupling by the substrate or the wires must also be considered. Increased digital signal processing (DSP) lowers the quality of analog signal processing. Luckily, mixed-signal simulation is now a mature technology, and several unified simulators and co-simulation solutions are available [6]. This smoothes our path to the objective that, wherever possible, one should reuse existing models and unify various platforms in ways that are transparent to the user. The key factors for success are:
• the ability to take a hybrid approach to the integration of different tools;
• the use of open databases using standard languages; and
• the availability of standard, compatible input/output formats and interfaces.

The TrustMe-ViP project is part of a larger CIM-PACA (Centre Intégré de Microélectronique Provence Alpes Côte d'Azur) initiative that has developed a virtual RF system-level design platform, and it is specifically dedicated to the design of complex TPD systems. It provides a unified EDA framework that features multiple simulation engines based around dedicated hierarchical model libraries available for multiple levels of abstraction. The benefits and structure of this methodology are described here and illustrated by way of the example of a Bluetooth transceiver design.

Tool flow
As noted earlier, the platform is based on the ADMS framework coupled with Matlab/Simulink. The main idea is to reuse models developed under Simulink, which offers an RF system baseband modeling simulation where the number of simulation steps is significantly reduced, as well as Simulink user-defined continuous models. For example, in a Bluetooth [7] transceiver, the wireless system transmits a GFSK-modulated signal at 1Mb/s, using a frequency-hopping algorithm over 79 1MHz sub-carrier channels, with a center frequency around 2.4GHz.

FIGURE 1 TPD communication and signals characteristics (DSP, analog and RF stages at each end of the link; I/Q impairments and modulation influence on the baseband side; phase noise, white noise, non-linearities and matching on the RF side; BER = 10^-12) (Source: LEAT)




FIGURE 2 Methodology for complex SoC development (use cases and high-level specifications in Simulink and VHDL-AMS are refined top-down through system-level models (Matlab/Simulink, VHDL-AMS/SystemC), DSP/RF/baseband sub-system and block-level intermediate models (VHDL-AMS with Simulink export, then VHDL-AMS plus SPICE) down to low-level transistor models in SPICE, and abstracted bottom-up into a hierarchical AMS-IP library; simulation time and model accuracy increase toward the lower levels, alongside synthesis, validation and test activities) (Source: LEAT)

Source: LEAT Verilog-AMS

VHDL-AMS

SystemC

VHDL

Block

Verilog

System

SPICE

Matlab-Simulink

the established models only take into account the spectral peaks within a bandwidth around the carrier frequency. As a result, the first important drawback is the loss of simulated nonlinear behavior outside the bandwidth. A second is a limitation to the modulated steady state (MODSST) simulation. It requires the development of Simulink continuous models because the tool lacks appropriate baseband models. So, for behavioral and functional approaches, we need to run different RF simulations to analyze the complete system. The solution is to export and compile the Simulink models in ADMS, using the Real Time Workshop (RTW) toolbox [8] (Figure 3). Simulink is a graphical ‘what-you-see-is-what-you-get’ editor based on the Matlab engine. This simulator covers the system level, delivering its results easily and quickly. It allows the simulation of complete transmission systems, including both the digital and analog elements. The simulation accuracy is admittedly limited for the analog portion, but that is counterbalanced by the tool’s system approach to verification and validation. Simulink has thus become a standard language by default, particularly for system-level designers. VHDL-AMS (or Verilog-AMS) is used to describe the digital, analog and RF behavior. The ADMS framework can cope with mixtures of different languages (e.g., SystemC, VHDL, Verilog, VHDL-AMS, Verilog-AMS, Netlist SPICE) and simulators (e.g., Questa, ModelSim, Eldo and Eldo RF) (Figure 4). To carry out high-level simulations, ADVance-MS allows simulators that support VHDL or SystemC to be run. For the analog and RF parts of the circuit, Eldo and EldoRF are best suited to the simulation, and both support Commlib and Commlib RF libraries.

[FIGURE 3 Matlab/Simulink over ADVance MS: Matlab circuit models running alongside ModelSim, Eldo RF, ADVance MS RF and the ADVance System Model Extractor (ADSME) (Source: LEAT)]

Hierarchical models
The design levels most commonly used in today's methodologies are the system level, the block or register level, and the circuit or transistor level. The system level is usually the first to be considered, and it is here that the design's specifications are extracted. It can be useful in determining the modulation to be used for a given design. It also allows for the definition of the critical parts of the design, of the modeling level that will be used across the flow, and of the simulators to be used later on. At the block level, one must provide more detail to increase simulation accuracy.
One must mainly account for electrical considerations in more specific terms, rather than using the mathematical equations seen at higher abstractions. This does, however, increase simulation time. Finally, at the transistor level, one needs to link the performance of the components to the technology. Here, the necessary SPICE simulations are quite time-consuming and must therefore be tightly controlled.

To improve simulation efficiency, we recommend that engineers mix these description levels during both design and verification. The development of such hierarchical models is both possible and desirable whether the project uses a top-down methodology, a bottom-up methodology, or a combination of the two.

[FIGURE 4 ADVance MS framework: one description, one user interface and one results database; a test-bench drives MODSST and transient simulation of RF, analog baseband (A-BB) and digital baseband (D-BB) blocks across Eldo RF, Eldo & ADiT (Verilog-AMS) and ModelSim/Questa (VHDL and Verilog), with the source, main nets, structure, variables and processes inspected in one interface and results viewed in EZwave (Source: LEAT)]

Bluetooth implementation and results
System considerations
Bluetooth dominates the wireless personal area network (WPAN) space and, as one would expect, has mainly been used to date in the mobile communications market, where both low cost and low power consumption are required. The latest specification introduced an enhanced data rate (EDR) mode, raising throughput from the original 1Mb/s (using Gaussian frequency-shift-keying modulation) to 3Mb/s (using eight-phase differential phase-shift-keying, or 8DPSK). Bluetooth uses a slotted protocol with a frequency-hopping spread-spectrum technique in the typically unlicensed ISM frequency band (2.402-2.483GHz), across 79 channels of 1MHz each. The transmission channel changes 1,600 times per second; time is accordingly divided into 625μs slots using a time-division-multiplexing (TDM) scheme.

The Bluetooth specification defines not only a radio interface but an entire communication protocol stack with the following layers (shown graphically in Figure 5):
• Applications/Profiles. This is the upper layer of the Open System Interface (OSI) protocol stack and deals with end-user processes. Profiles define how to use the lower Bluetooth layers to accomplish specific tasks.
• Logical Link Control and Adaptation Protocol (L2CAP). This is responsible for packet multiplexing, segmentation and reassembly.
• Host Control Interface (HCI) or Control. This defines the interface by which upper layers access lower layers of the stack.
• Link Manager Protocol (LMP). This is responsible for the link state and establishes the power control mode.
• BaseBand (BB). This deals with framing, flow control, medium access control (MAC) and timeout mechanisms.
• Radio Frequency layer. This is concerned with the design of the Bluetooth transceiver itself.

[FIGURE 5 SystemC model architecture: a host (Profile #1 to #N, L2CAP, Host Controller Interface (HCI)) above a controller (Link Manager, Device Manager, Baseband Resource Manager, Link Controller) and the radio (Simulink/VHDL-AMS/SPICE) (Source: LEAT)]
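The slot and hop timing quoted above is easy to sanity-check. The following sketch (Python; the 2402MHz base channel is the standard Bluetooth channel map, which the article does not state explicitly) reproduces the 625μs slot duration and the 79-channel grid:

# Bluetooth timing and channel map from the figures quoted above.
HOPS_PER_SECOND = 1600
slot_us = 1e6 / HOPS_PER_SECOND
print("slot duration:", slot_us, "us")        # 625.0 us

# Channel k (k = 0..78) sits at 2402 + k MHz, inside the
# 2.402-2.483GHz ISM band (the top of the band is guard space).
channels_mhz = [2402 + k for k in range(79)]
print(channels_mhz[0], "...", channels_mhz[-1], "MHz")   # 2402 ... 2480 MHz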

Transceiver simulation from SystemC to RF
Figure 6 illustrates the modeled architecture of a Bluetooth transceiver; we focus here on a single transceiver. The baseband Simulink model includes modulation/demodulation and gathers the information needed to generate a sequence at 1Mb/s before we address the RF part of the design. The SystemC master model of the Bluetooth device can be substituted with a random bit generator for the RF simulation. For the RF architecture, we use a heterodyne structure with two intermediate frequencies (IF), equal to 1/3 and 2/3 of the carrier, to reduce the classical coupling between the voltage-controlled oscillator (VCO) and the power amplifier (PA).

[FIGURE 6 Bluetooth transceiver architecture (Source: LEAT)]

The frequency synthesizer is based on a fractional phase-locked loop (PLL) [12]. The PA and the mixers are modeled in both Simulink and VHDL-AMS at different levels of abstraction. An RF transceiver contains key models (e.g., low-noise amplifiers (LNAs), mixers, channel descriptions, PAs, PLLs with VCOs, filters). To simulate these elements, we need to develop an RF library [13]. This is described in a combination of VHDL-AMS behavioral language and Simulink on the ADVance System Model Extractor (ADSME) [1]. The main features of the different blocks are then characterized (e.g., the third-order intercept point and clipping effects, conversion gain, phase noise, noise figure, and input/output impedances and matching).

After a first simulation, a preliminary study shows that the PA is a sensitive block. Its power consumption is by far the largest in the RF part of the design, and its output power directly affects the transceiver's bit error rate (BER). For these reasons, a full SPICE description of this component is worth having. The other blocks are described in Simulink, VHDL-AMS or Verilog-AMS. Under the ADMS RF software, we can again mix different abstraction levels and description languages across the various parts of one block.

Simulink allows us to simulate a complete transmission system containing RF, MAC, modems and channel by combining the digital and analog parts. We start with Simulink at the system level, where it offers baseband modeling simulation for RF systems that significantly reduces the number of time-domain (transient) simulation steps. Based on the elapsed time for a 68μs Simulink simulation frame sequence, we modeled our transceiver by choosing from several possible simulation options (e.g., MODSST or transient) and behavioral language descriptions (e.g., SystemC, Simulink, VHDL-AMS, Verilog-AMS), and then used circuit-level SPICE for our PA. A final comparison with full SPICE visibility can then be performed. Several conclusions can be drawn: a Simulink design exported into ADMS runs 6x faster; a design combining SystemC and Simulink on ADMS runs 7x faster; and SystemC decreases the simulation time by 1x, while behavioral languages reduce it by 2x. Armed with this information, we can determine the adequate abstraction level at which to analyze the design, from SPICE to system models, and estimate the bit error rate. We note that MODSST can significantly reduce simulation time (a 34x gain is reported in [1]), but the SPICE device count must be increased to use MODSST simulation correctly.

To validate our methodology, a system simulation is first done using SystemC and the corresponding frame sequence is generated. Figure 7 shows the generation of the sequence of bits for an identification (ID) packet, the simplest packet in the Bluetooth standard.

[FIGURE 7 SystemC generation of a bit train: enableTx, clock, devAddr (0xa3482c8), bitstream and channel (29) signals around t = 19.380ms (Source: LEAT)]

To quickly extract an estimated BER from the transceiver, an analytic model has been developed.


Based on the SystemC Bluetooth frame, we simulate the transceiver, including modulation, demodulation and the RF architecture, with the compression, frequency response and third-order harmonic characteristics extracted from previous simulations. This simulation is done without noise and runs relatively quickly. The next step is the estimation of the BER using formula (1) [5]:

$$\mathrm{BER} = \frac{1}{2}\sum_{i}\operatorname{erfc}\!\left(\frac{I_i - D}{\sigma_i\sqrt{2}}\right)p(i) \qquad (1)$$

We can approximate this formula by equation (2):

$$\mathrm{BER} = \frac{1}{N}\sum_{i=1}^{N}\frac{\sigma_i}{\sqrt{2\pi}\,(I_i - D)}\exp\!\left(-\frac{(I_i - D)^2}{2\sigma_i^2}\right) \qquad (2)$$

where $D$ represents the decision level, $\sigma_i$ is the variance of block $i$, $N$ is the number of blocks and $p(i)$ is the sequence probability, linked to the total variance (see equation (3)):

$$\sigma^2 = \sigma_{\mathrm{mixers}}^2 + \sigma_{\mathrm{LNA}}^2 + \sigma_{\mathrm{PA}}^2 + \sigma_{\mathrm{filters}}^2 + \sigma_{\mathrm{PLL}}^2 + \sigma_{\mathrm{channel}}^2 \qquad (3)$$

An approximation of the phase noise can be made using the Allan variance [12] for the mixer and PLL models, or by phase noise area computation. Figure 8 presents a section from a BER analyzer simulation with the 'cleaned' signals (before and after modulation), the variance computation and the resulting estimate for the BER.

[FIGURE 8 BER estimation simulation from SystemC frame (Source: LEAT)]

We computed the BER using different simulations based on different hierarchical models and languages. To compare the results, we give the elapsed simulation time per bit (Xeon PC Linux, 2.8GHz, 8Gbyte RAM) and the corresponding estimated BER:
• Ideal Simulink models (without noise): 1.4s/bit, with a BER equal to 0.
• Simulink models with RF noise for the PA: 55.1s/bit, with a BER of 6.7x10^-4.
• Simulink or SystemC models for the digital part, Simulink models for the MODEM part and VHDL-AMS models including noise for the RF part: 26s/bit, with a BER of 8.1x10^-4.
• Quasi-analytic BER estimation: 16ms/bit, with a BER of 6.4x10^-4.
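As an illustration of the quasi-analytic estimate, the sketch below (Python, with made-up decision levels and block variances, since the article derives its σi values from the RF block simulations) evaluates both the exact erfc form of equation (1) and the Gaussian-tail approximation of equation (2):

# Quasi-analytic BER: equation (1) versus the approximation of
# equation (2). The (I_i, sigma_i, p_i) triples are illustrative.
import math

D = 0.0                   # decision level
blocks = [                # (level I_i, deviation sigma_i, probability p_i)
    (1.0, 0.20, 0.5),
    (1.0, 0.25, 0.5),
]

# Equation (1): BER = 1/2 * sum_i erfc((I_i - D)/(sigma_i*sqrt(2))) * p(i)
ber_exact = 0.5 * sum(math.erfc((I - D) / (s * math.sqrt(2))) * p
                      for I, s, p in blocks)

# Equation (2): Gaussian-tail approximation, accurate when (I_i - D)/sigma_i >> 1
N = len(blocks)
ber_approx = sum(math.exp(-((I - D) / s) ** 2 / 2) /
                 (math.sqrt(2 * math.pi) * (I - D) / s)
                 for I, s, p in blocks) / N

print("equation (1): %.3e" % ber_exact)    # ~1.6e-5
print("equation (2): %.3e" % ber_approx)   # close to (1) at high SNR

For these made-up values the two estimates agree to within several percent, which is why the 16ms/bit quasi-analytic pass above is a useful early filter before the much slower noisy simulations.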

Circuit modeling and design
Armed with the nominal BER, one can perform RF system simulations to determine the noise performance required of each RF block (e.g., oscillator, PLL, amplifier, mixers). For example, Figure 9 shows the performance range of a PLL that can be used in Bluetooth applications; note that the Bluetooth specification's phase noise requirement of -119dBc/Hz at a 1.5MHz offset is respected. A time-domain simulation comparing different hierarchical models (using a combination of Simulink and VHDL-AMS) for the settling time is given in Jacquemod, Geynet et al. [15]. Given adequate abstraction, this simulation time can be reduced by a factor of between five and six.

[FIGURE 9 PLL phase noise: measured versus simulated phase noise over frequency shifts from 10^2 to 10^8 Hz (Source: LEAT)]


Conclusion This paper demonstrated the feasibility and the efficiency of mixing analog RF and digital simulators in a single design environment. A dedicated RF design and verification platform was coupled with a SystemC emulation environment that enabled the simulation of a full Bluetooth transmit/receive flow. This started with protocol emulation followed by sent-data generation, modulation, and modeling and optimization of the RF front-end. To achieve this, we developed a multi-language description (SystemC, Simulink, VHDL-AMS and SPICE netlist) using multiple engines (ModelSim, Matlab, Eldo and EldoRF), across multiple domains (MAC, BB, and RF) and hierarchical description levels (system, behavioral, structural and transistor) to simulate a BT transceiver. In comparison with other generic solutions, the simulation time is significantly decreased while the precision is increased. In addition, most of the models used for the simulations were generic, enabling easy reuse of the methodology.

Acknowledgments
The author would like to thank the CIM-PACA Design Platform for its support.

References
[1] ADVance MS User's Manual, v4.5_1, AMS 2006.2.
[2] H. De Man, F. Catthoor, G. Goossens, J. Vanhoof, J. Van Meerbergen, S. Note, J. Huisken, "Architecture-driven Synthesis Techniques for VLSI Implementation of DSP Algorithms", Proc. IEEE, vol. 78, no. 2, 1990, pp. 319-334.
[3] Y. Lahbib, R. Kamdem, M. Benalycherif, R. Tourki, "An automatic ABV methodology enabling PSL assertions across SLD flow for SOCs modelled in SystemC", Computers and Electrical Engineering, vol. 31, 2005, pp. 282-302, Elsevier.
[4] M. Zorzi, N. Speciale, G. Masetti, "A new VHDL-AMS simulation framework in Matlab", BMAS 2002.
[5] P. Bontoux, I. O'Connor, F. Gaffiot, G. Jacquemod, "Behavioral modeling and simulation of optical integrated devices", Analog Integrated Circuits and Signal Processing, vol. 29, issue 1/2, 2001, pp. 37-47.
[6] G. Jacquemod, "Virtual RF System Platform - Myth or Reality?", Design Automation and Test in Europe 2007, Exhibition Theatre, Technical Panel: H. Guegan, F. Lemery, Y. Deval, A. Tudose, W. Krenik, O. Vermesan, Nice, 2007.
[7] http://www.bluetooth.com
[8] Matlab/Simulink User's Manual, v2006b.
[9] http://www.eda.org/vhdl-ams
[10] R. Frevert et al., Modeling and Simulation for RF System Design, Springer, 2005.
[11] BT SIG, "Specification of the Bluetooth System - Core", Core Specification v2.1 + EDR, 2007, https://www.bluetooth.org/spec
[12] R.L. Clark, "Phase Noise measurements on vibrating crystal filters", Proceedings of the 44th Annual Symposium on Frequency Control, 1990, pp. 493-497.
[13] B. Nicolle, W. Tatinian, J-J. Mayol, J. Oudinot, G. Jacquemod, "RF library based on block diagram and behavioral description", BMAS 2007, San Jose, CA, 2007.
[14] R.L. Clark, "Phase Noise measurements on vibrating crystal filters", Proceedings of the 44th Annual Symposium on Frequency Control, 1990, pp. 493-497.
[15] G. Jacquemod, L. Geynet, B. Nicolle, E. de Foucauld, W. Tatinian, P. Vincent, "Design and Modelling of a Multi-standard Fractional PLL in CMOS/SOI Technology", Microelectronics Journal, vol. 39, no. 9, 2008, pp. 1130-1139.

Electronics, Antennas and Telecommunications Laboratory (LEAT) University of Nice-Sophia Antipolis UMR CNRS 6071 Bâtiment 4 250 rue Albert Einstein 06560 Valbonne France T: + 33 (0)4 92 94 28 00 W: www.elec.unice.fr





Chemical mechanical polish: the enabling technology
Joseph M. Steigerwald, Intel

Joe Steigerwald is director of Chemical Mechanical Polish Technology, Intel Technology and Manufacturing Group. He received his bachelor's in electrical engineering from Clarkson University, and his master's and doctorate in electrical engineering from Rensselaer Polytechnic Institute.

Chemical mechanical polishing (CMP) has traditionally been considered an enabling technology. It was first used in the early 1990s for BEOL metallization to replanarize the wafer substrate, thus enabling advanced lithography, which was becoming ever more sensitive to wafer surface topography. Subsequent uses of CMP included density scaling via shallow trench isolation and interconnect formation via copper CMP. As silicon devices scale to 45nm and beyond, many new uses of CMP are becoming attractive options to enable new transistor technologies. These new uses will demand improved CMP performance (uniformity, topography, low defects) at lower cost, and this will in turn require breakthroughs in hardware, software, metrology and materials (e.g., slurry, pad, cleaning chemicals).

This paper reviews the module-level and integration challenges of applying traditional CMP steps to enable high-k metal gate for 45nm technology and to advance copper metallization from the 65nm to the 45nm node. These challenges are then considered with respect to new CMP applications for 32nm and beyond.

When CMP was first introduced to IC manufacturing in the early 1990s, common sense insisted that the process was too crude and riddled with defects to use in the production of modern electron devices. However, CMP overcame predictions of an early demise, and as the decade passed its use expanded considerably. Table 1 shows the insertion of new CMP steps into logic IC technology nodes by year of insertion, as well as the technology element enabled by each CMP step.

Node   | Year  | CMP          | Enabling
-------|-------|--------------|----------------------------------------------
0.8um  | 1990  | ILD          | Multilevel metallization
0.35um | 1995  | STI, PSP, W  | Compact isolation; poly-Si patterning; yield/defect reduction
0.18um | 1999  | SiOF ILD     | RC scaling
0.13um | 2001  | Cu           | RC scaling
90nm   | 2003  | SiOC ILD     | RC scaling
65nm   | 2005  | -            | -
45nm   | 2007  | RMG          | HiK - metal gate
<32nm  | 2009+ | ???          | Continued scaling; new devices; new architecture

TABLE 1 Enabling CMP technologies (Source: Intel)

Introduction of these early CMP steps was key to IC manufacturers' ability to maintain scaling trends. The insertion of new steps slowed during the first part of the current decade, after the insertion of copper CMP at the 130nm node. A major reason for the slowdown was concern over CMP's inadequacies: mainly that the technique was expensive, induced yield-limiting defect modes, and resulted in thickness variations that were inconsistent with the scaling trends of the IC industry. While the first implementations of CMP were required to enable technology scaling, the original concerns around its crudeness appeared to reassert themselves. During these years, methods were found to scale dimensions and improve transistor performance without requiring new CMP steps.

However, at the 45nm node, CMP is once again being used to enable a critical advancement in silicon technology. It is an integral component of the replacement metal gate (RMG) approach for defining the metal gate structures required for high-k, metal gate (HKMG) dielectrics [1, 2]. Copper CMP has also been carried forward from the 65nm node to form Cu interconnects.

Requirements for RMG CMP
CMP technology is extensively utilized to create metal gate electrodes for the introduction of HKMGs at the 45nm technology node [1, 2]. Figure 1 shows the RMG process flow, utilizing poly opening polish (POP) and metal gate polish steps. Because of the small dimensions, and the consequently small dimensional tolerances, of the gate structure, traditional CMP processes are inadequate for these RMG steps. For functional devices and requisite yield, thickness control and defect performance have to be significantly improved over the CMP processes used for previous technologies.

[FIGURE 1 RMG process flow showing CMP steps: (a) ILD0 deposition post transistor formation; (b) POP CMP to expose the poly-Si gate; (c) poly etch; (d) metals deposition; (e) metal gate CMP (Source: Intel)]

Table 2 shows the integration issues associated with the insertion of the new RMG CMP steps. Many of these arise from insufficient thickness control, either during CMP or incoming to CMP. Low polish rates at the POP step result in tall gates that may not be properly filled with gate metals (due to their high aspect ratio). Taller gates also require a longer contact etch, potentially resulting in under-etched contacts. Severe underpolish, such that the poly-Si is not exposed, leaves poly in the gate, preventing proper metal fill; the resulting poly-Si gate transistor will fail due to improper work function (Vt shift) and high gate resistance. Underpolishing at the metal CMP step results in incomplete overburden metal removal and hence shorting. The metal gate polish step must undergo sufficient overpolishing to remove any topography evolved during poly opening.


Excessive overpolishing at either of the RMG CMP steps results in thin gates with high gate resistance and the potential for over-etched contacts. Severe overpolishing exposes the adjacent raised source/drain regions, which are then attacked in the post-CMP poly removal etch step. These integration concerns mean there is a narrow process window at both CMP steps.

Figure 2 shows the historical improvement in within-die (WID) thickness variation obtained for shallow trench isolation (STI)/POP CMP processes. Due to the thickness control issues listed in Table 2, circuit yield declines precipitously for WID values above the dashed line. Note that the historic 70% scaling from previous technologies is inadequate to meet the required WID control.

[FIGURE 2 Improvements in CMP topography by technology node: normalized CMP topography (log scale, from 1 down to 0.01) for STI and POP WID across the 350nm to 45nm nodes; values must sit below a threshold line to enable functional HiK/metal gate transistors, which requires scaling faster than the historic 70% rate (Source: Intel)]

The required thickness control is gained via several key CMP innovations. First, WID/topography control is achieved by the selection of a high-selectivity slurry (HSS) and polish pad, and by the optimization of machine parameters around the selected set of consumables. With the HSS, the polish process slows significantly when the void-free interlayer dielectric (ILD0) overburden is cleared and the gate is exposed, resulting in an autostop to the process. Next, within-wafer (WIW) uniformity is optimized through structured experimental designs varying polish pressure, head and pad velocities, pad dressing and polish head design. Thickness control within a standard edge exclusion is insufficient: poor polish control of the bevel region of the wafer potentially leads to subsequent redistribution of bevel films during poly etch and subsequent wet etch operations.

Significant CMP defect improvement is also required for satisfactory HKMG yield. Typical CMP defects are listed in Table 3 (these modes were experienced during the development of the RMG CMP processes). Because of the narrow dimensions at the gate layer, HKMG yields are particularly sensitive to CMP defects, and without significant improvements from the initial levels of RMG CMP defects, HKMG yield would not have been viable.
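To make the scaling argument in Figure 2 concrete, the following sketch (Python, with purely illustrative rates, since the article does not give the underlying topography budgets) shows how a 70%-per-node improvement compounds against a faster rate over the node sequence on the Figure 2 axis:

# Compounded WID topography scaling across nodes, normalized to 1.0
# at 350nm. The rates are illustrative; Figure 2 only shows that the
# historic 70% trend is insufficient at 45nm.
nodes = [350, 250, 180, 130, 90, 65, 45]   # nm

wid_70, wid_55 = 1.0, 1.0
for _ in nodes[1:]:
    wid_70 *= 0.70     # historic 70% per node
    wid_55 *= 0.55     # a faster-than-historic rate

print("70%%/node after %d nodes: %.3f" % (len(nodes) - 1, wid_70))  # 0.118
print("55%%/node after %d nodes: %.3f" % (len(nodes) - 1, wid_55))  # 0.028

The point is only that small per-node differences compound into a large gap after six nodes, which is what the widening distance between the trend lines in Figure 2 expresses.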


Issue | Cause | Impact to Device
------|-------|------------------
Gate resistance variation | Poor polish rate control (WIW, WID, WTW) | Parametrics
Poor gate fill | Thick poly opening / large aspect ratio | High gate resistance/defects
Unexposed poly-Si gate | Low polish rate | Vt shift, high gate resistance
Residual material (underpolish) | Non-uniform polish; poor planarization | Shorting/opens
Raised S/D exposure | Overpolish; thick epi S/D | S/D removal during poly etch; particle generation
Contact etch window | Metal gate height determines etch depth | Opens/shorts for under/over-etch
Bevel edge redistribution of poly/epi | Under/overpolish in bevel region generates defect source | Defects due to redistribution of underexposed gate or overpolished S/D regions
Structural damage to device layer | High CMP shear forces | Low yield/reliability
Strain/stress relaxation | High shear force during CMP | Loss of strain-induced carrier mobility

TABLE 2 Integration issues with RMG CMP (Source: Intel)

Defect mode | Potential Causes | Impact to device | Potential Solutions
------------|------------------|------------------|---------------------
Particles | Slurry/pad residue; polish byproducts | Shorting/opens; pattern distortion | Cleaner tooling; clean chemistries
Macro scratches | Large/hard foreign particles on polish pad | Pattern removal over multiple die | Pad conditioning; pad cleaning; environment
Micro scratches | Slurry agglomeration; pad asperities | Shorting/opens | Slurry filters; pad/pad conditioning
Corrosion (metal CMP) | Slurry chemistry; clean chemistry | Opens; reliability | Passivating films; chemistry optimization
Film delamination | Weak adhesion; CMP shear force | Shorting/opens; device parametrics | Improve adhesion; low-pressure CMP
Organic residue | Inadequate cleaning; residual slurry components | Shorting/opens; disturbed patterning of next layer | Cleaner tooling; slurry optimization; clean chemistries

TABLE 3 CMP defect modes (Source: Intel)

Back-end metallization requirements
The scaling of circuit dimensions at the 45nm node also requires significant improvement to the copper CMP process. As the metal line width scales, variation in the height of the line results in greater variation in its resistance and capacitance. Copper metal loss during CMP (dishing and erosion effects) is a primary cause of interconnect height variation at the 45nm node, and a significant reduction in copper loss is required to ensure proper functioning of the interconnect. Copper thickness loss decreases as WIW and WID thickness control improves, and also as improvements in surface topography at a given layer allow a reduction in the amount of oxide removed at subsequent layers. Underlying surface topography requires additional oxide removal to ensure that all of the metal overburden is removed from the low-lying areas of the topography; hence improvements in front-end topography translate into a need for less oxide removal in the upper back-end metal layers.

Improvements in the uniformity of the copper CMP WIW and WID removal rates at the 45nm node are the result of improvements in slurry selectivity as well as polish pad, pad dressing and polish machine parameter optimizations. As with the RMG CMP steps, the defect modes listed in Table 3 are a challenge at the copper CMP steps. Because modern IC technologies contain up to 10 layers of metal, even low levels of defect density in the copper CMP step can have a significant impact on yield.
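As a rough illustration of why copper loss matters, consider the resistance of a line whose height is reduced by dishing. The sketch below uses invented 45nm-class dimensions; only the proportional argument is the point:

# Line resistance rises as dishing removes copper height:
# R per unit length = rho / (width * height).
RHO_CU = 1.7e-8    # ohm*m, bulk copper resistivity

def ohms_per_um(width_nm, height_nm):
    area_m2 = (width_nm * 1e-9) * (height_nm * 1e-9)
    return RHO_CU / area_m2 * 1e-6   # ohms per micrometer of length

nominal = ohms_per_um(70, 120)        # assumed nominal line cross-section
dished  = ohms_per_um(70, 120 - 15)   # 15nm of copper lost to dishing

print("nominal: %.2f ohm/um" % nominal)                 # 2.02
print("dished : %.2f ohm/um (+%.0f%%)" %
      (dished, (dished / nominal - 1) * 100))           # 2.31 (+14%)

A roughly 12% height loss produces a roughly 14% resistance increase (plus a capacitance shift), which is why dishing and erosion control must tighten with every node.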


Application | CMP Enabling Aspect | Potential Challenges | Reference
------------|---------------------|----------------------|----------
FUSI replacement metal gates | CMP used to expose p and n gates | Inadvertent exposure of opposing gates | H.Y. Yu et al. [4]
Novel FUSI metal gates | Enables differential silicidation of poly-Si gate | Requires CMP of dissimilar metals | C. Park et al. [5]
FINFET devices | Planarization of poly-Si; reduction of topography by fins | Poly-Si thickness variation resulting in patterning issues | A. Kaneko et al. [6]
FINFET devices | Damascene (inlaid material) approach to FINFET formation | Thickness variation caused by multiple CMP steps | Y-S. Kim et al. [7]
3D integration - chip stacking | Oxide CMP post Cu CMP to recess oxide and promote Cu bonding | Smearing of Cu bumps | K.N. Chen et al. [8]
3D stacked NAND flash devices (2 papers) | 3D integration | Ultra-flat topography required for building multiple layers of devices | S.M. Jung et al. [9]; E.K. Lai et al. [10]
Novel memory - ferroelectric media | CMP eliminates roughness typical of ferroelectric material | Device layer thickness control | D.C. Yoo et al. [11]
Phase change memory | Planarize and expose phase change element | Thickness control of small device region | T. Nirschl et al. [12]

TABLE 4 Potential new CMP applications (Source: Intel)


New uses of CMP
Recent conference proceedings and journal articles are rich with potential new uses of CMP under consideration for 32nm and beyond. Table 4 lists some of those intriguing possibilities, as well as some of the potential challenges they can be expected to bring. It is evident from all this activity that CMP is still considered useful by process technology architects. However, the difficulties experienced in introducing the RMG steps at the 45nm node demonstrate that any new CMP steps introduced into the front end of the line will require exceedingly tight film thickness and defect control. Without such control, the recently added RMG CMP steps would not have successfully advanced from R&D to high-volume manufacturing. Emerging options can be expected to demand even greater control of thickness and defects, as will shrinking to dimensions of 32nm, 22nm and beyond. To continue the growth of CMP, the industry must find processing conditions (slurry, pad, tooling) that improve performance in these critical areas.

Acknowledgements
The author thanks Francis Tambwe, Matthew Prince and Gary Ding for assistance in preparing data, and Tahir Ghani and Anand Murthy for valuable discussions in preparing the manuscript.

References
[1] K. Mistry et al., IEDM Tech. Dig., p.247, 2007.
[2] C. Auth et al., Symp. VLSI Dig., p.128, 2008.
[3] K. Kuhn, IEDM Tech. Dig., p.471, 2007.
[4] H.Y. Yu et al., IEDM Tech. Dig., p.638, 2005.
[5] C. Park et al., IEDM Tech. Dig., p.299, 2004.
[6] A. Kaneko et al., IEDM Tech. Dig., p.884, 2005.
[7] Y-S. Kim et al., IEDM Tech. Dig., p.315, 2005.
[8] K.N. Chen et al., IEDM Tech. Dig., 2006.
[9] S.M. Jung et al., IEDM Tech. Dig., 2006.
[10] E.K. Lai et al., IEDM Tech. Dig., 2006.
[11] D.C. Yoo et al., IEDM Tech. Dig., 2006.
[12] T. Nirschl et al., IEDM Tech. Dig., p.461, 2007.

Intel RA3-301 2501 NW 229th Ave Hillsboro OR 97124 USA T: 1 503 613-8472 W: www.intel.com


Automating sawtooth tuning
Alex Diaz, John Mehlmauer, Broadcom

Alex Diaz and John Mehlmauer are both on the senior staff, PCB Layout at Broadcom.


PCB designers at Broadcom were tasked with accomplishing length matching within a differential pair. The task involves applying serpentine or sawtooth routing to a trace. Doing this manually is tedious and time-consuming, whereas automation cuts the time and effort involved significantly. The automation layer in Mentor Graphics' Expedition product provides all the tools needed to complete this job. The paper demonstrates how the script form editor and automation object model were used to create an interface for applying a sawtooth route on an existing trace to achieve proper length matching for differential pairs.

Length matching within a differential pair can be one of the more tedious tasks facing a PCB designer. This article describes how a team at communications and consumer electronics semiconductor company Broadcom overcame this by using the automation options available within their design tools. While the article describes automation tailored to a specific task, the authors present it here as an example of what can be done more generally. The workflow they developed is described first, followed by a detailed description of the code that underpinned the automation process.

Tuning requirements
All signal traces in the MAC section have controlled impedance and need to be routed with a specific trace width. They must maintain a minimum 3-W spacing from other traces and differential pairs on the same layer, with target impedances of:
• 95Ω +/- 10% for the PCIe differential pairs (17 pairs in total); and
• 100Ω +/- 10% for the XAUI differential pairs (two sets of 16 pairs each).

The two traces within each PCIe differential pair must be length-matched so that the long trace is no more than 5mils longer than the short trace; the same 5mil limit applies to the two traces within each XAUI differential pair. Figure 1 shows how to achieve this 5mil length matching within a differential pair. The serpentine is applied at the end of the trace run where the mismatch occurs, for both the PCIe and the XAUI differential pairs.

[FIGURE 1 Sawtooth serpentine for length matching: trace width w, bump segment length >3w, pair spacing S and bump spacing S1 < 2S (Source: Broadcom)]

To achieve this sawtooth serpentine, we had two options.

Option one
Create an AutoCad DXF file of a small two-bump sawtooth pattern following the requirements shown in Figure 1. Import this DXF file into Expedition as a drawn line object onto a temporary user layer, where it is copied-and-pasted until the desired length is met. Then select all the segments and convert them into a continuous polyline. Change this drawn polyline's type from 'Draw Object' to 'Trace', changing the 'Layer' and 'Net' properties at the same time. Move the trace to the desired location and finish routing the differential pair.

Option two
Select the longest line in the differential pair net and 'fix' or 'lock' it down. Manually create a small segment on the shortest net of the differential pair and pull this segment away from the longest net. Adjust this small segment to meet the sawtooth requirement shown in Figure 1. Continue creating and adjusting these small segments until the desired length is met.

Either option demands a lot of time and effort. For one particular design, we chose option two because it best suited the board's routing density. The initial routing pass of 49 differential pairs took almost 32 hours to match. By replacing this manual process with sawtooth tuning automation, we cut the time required to route and match all 49 pairs to five hours.

Sawtooth tuning automation
The tuning form that we developed modifies the route of an existing trace to add sawtooth bumps according to user-defined parameters. It was designed using Expedition's Script Form Editor. Figure 2 shows the form that is presented to the user.

[FIGURE 2 The sawtooth tuning form (Source: Broadcom)]

Differential pair section
In this section, the user must first select a differential pair net to work with, using the 'ComboBox' control.

On selection, the pair is highlighted in the Expedition workspace. Just under the ComboBox control is a checkbox labeled 'Fit window when selected'. If this is checked, the program will fit the Expedition workspace area to the differential pair traces that are highlighted. Once a pair is selected, the information for the net is automatically gathered through the Constraint Editor System (CES) and presented to the user in the form. This information comprises the two electrical net names for the differential pair, the length of each electrical net, the length difference between them, and the differential pair tolerance constraint value. Next to each electrical net name is a 'Highlight' button; if clicked, all of that net's features are highlighted in the Expedition workspace. Line width and spacing information for the differential pair needs to be manually entered in this section of the form. The spacing value is shown as variable 'S' in Figure 1. The 'OK' button is then clicked to enable the 'Sawtooth' and 'Routing' sections of the form.

Sawtooth section
This is where the user describes the parameters for the sawtooth bump that will be added to the trace. The parameters are spacing, length, quantity and type.

The spacing parameter is the distance between the edge of the opposing trace segment and the edge of the 'bumped' sawtooth segment. It is shown in Figure 1 as variable S1. This value is automatically filled out to be just less than double the spacing value from the differential pair section.

The length parameter defines how long the 'bumped' sawtooth segment runs after the angle. This is shown as the variable >3w in Figure 1 (i.e., greater than three times the width of the trace).

The quantity ('Qty') parameter states how many sawtooth segments the user wants to add. This can be an integer entered into the EditBox control, or the user can click a 'Maximum' CheckBox control to keep adding segments until the length matching is within tolerance or there is no more room on the trace, whichever comes first.

The type parameter is a RadioButton control whose value can only be positive or negative. This determines which way to 'bump' the trace with the sawtooth segment. 'Positive' bumps above horizontal and angled traces and to the right of vertical traces. 'Negative' bumps below horizontal and angled traces and to the left of vertical traces.

Routing section
In this section, the user fills in two parameters, 'Start With' and 'Direction'. 'Start With' determines whether to start with the bump right away ('Sawtooth') or to continue on the same path with a segment that is the length of the sawtooth and then apply the bump ('Segment'). The direction can be either right or left for horizontal and angled lines, and either up or down for vertical lines. This determines the start/end-point of the working trace where the sawtooth segments begin to be added.

'Apply' button
This is at the bottom of the form and executes the sawtooth route modification on the selected trace in the Expedition document. It is not enabled unless the current selection in the Expedition workspace is a one-segment trace that belongs to the working differential pair that the user has selected in the form.
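Before turning to the code, it is worth quantifying how much length a single bump adds. Assuming 45-degree entry and exit segments (the article does not spell out the exact geometry, so treat this as a sanity-check model rather than the tool's internal math), only the two diagonals contribute extra length:

# Extra routed length contributed by one sawtooth bump with 45-degree
# entry/exit: the flat top covers axis distance 1:1, but each diagonal
# of rise h covers h of axis distance using h*sqrt(2) of trace.
import math

def extra_length_per_bump(offset):
    return 2 * offset * (math.sqrt(2) - 1)

offset_mils = 10.0
mismatch_mils = 7.5   # hypothetical remaining mismatch to absorb
per_bump = extra_length_per_bump(offset_mils)
bumps = math.ceil(mismatch_mils / per_bump)

print("extra length per bump: %.2f mils" % per_bump)   # 8.28 mils
print("bumps needed:", bumps)                           # 1

An estimate of this kind is also what makes the form's 'Maximum' option sensible: the script can keep adding bumps until the measured mismatch drops inside the 5mil budget.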

Programming
This section considers key programming techniques used in this program, as provided by the automation layer in the CES and Expedition. The language used for this program, and in the examples that follow, is VBScript.

Populating the ComboBox
Upon program start-up, the ComboBox control in the differential pair section of the form is populated. To do this, the program accesses the CES Application object. The code is shown in Figure 3.

Source: Broadcom
01 ' Get the CES application object
02 Set objCesApp = GetObject(,"CES.Application")
03
04 ' Get the CES object
05 Set objCES = objCesApp.Tools("CES")
06
07 ' Get the Design object
08 Set objDesign = objCES.ObjectDefs.Design
09
10 For Each DiffPair In objDesign.DiffPairs
11   Set objDiffPairRow = objCES.ObjectDefs.Sheets("ENET").FindObject(DiffPair.Name)
12   ' Get the Pair Tol Max value
13   dpTol = objDiffPairRow.ConstraintEx("DP_TOL")
14   ' Get the Pair Tol Actual Value
15   dpAct = objDiffPairRow.ConstraintEx("ACT_DP_DELTA")
16   ' Check if this is in violation
17   If CDbl(dpAct) > CDbl(dpTol) Then
18     ' This is a violation so add it to ComboBoxDiffPairs
19     ComboBoxDiffPair.AddString(DiffPair.Name)
20   End If
21 Next

FIGURE 3 Populating the ComboBox

The key object needed here is a CES 'Design' object. This lets us access the 'DiffPairs' collection in order to check the pair tolerance constraint value as well as the actual difference between the lengths of the differential pair nets. Lines 1 through 8 perform the steps necessary to retrieve the 'Design' object: first getting the 'Application' object, then the CES object, and finally the 'Design' object itself on line 8. Beginning on line 10, a 'For Each' loop traverses the 'objDesign.DiffPairs' collection. Within this loop, on line 11, the variable 'objDiffPairRow' is set as a spreadsheet object, retrieved from the 'ENET' page in CES using the differential pair name as the argument to the 'FindObject' method. This object lets us retrieve the constraint values we want through the 'ConstraintEx' property. Line 13 gets the maximum pair tolerance and assigns it to the variable 'dpTol', and line 15 gets the actual difference in length, assigning it to the variable 'dpAct'. From there, we do the math on line 17 to see if this differential pair is in violation. If it is, the differential pair name is added to the 'ComboBox' control on line 19 using the 'AddString' method.


Selecting the differential pair
When the user selects a differential pair from the 'ComboBox' control, the electrical net names and lengths are presented in the form automatically. Figure 4 shows the code.

Source: Broadcom
01 ' Get the CES application object
02 Set objCesApp = GetObject(,"CES.Application")
03
04 ' Get the CES object
05 Set objCES = objCesApp.Tools("CES")
06
07 ' Get the Design object
08 Set objDesign = objCES.ObjectDefs.Design
09
10 ' Get the Expedition Application object
11 Set objExpApp = GetObject(,"MGCPCB.Application")
12
13 ' Get the Expedition Document object
14 Set objExpDoc = objExpApp.ActiveDocument
15
16 ' Set the DiffPairName variable to be the value that is selected
17 DiffPairName = ComboBoxDiffPair.TextValue
18
19 ' Clear any select and highlight
20 objExpDoc.UnSelectAll()
21 objExpDoc.UnHighlightAll()
22
23 ' Get the DiffPair object
24 Set objDiffPair = objDesign.GetDiffPair(DiffPairName)
25
26 ' Get each Enet and fill out the EditName and EditLength controls appropriately
27 ct = 1 ' counter
28 For Each Enet In objDiffPair.Nets
29   ' Use Eval to get the control object based on the counter
30   Set EditName = Eval("EditDiffPairNet" & ct)
31
32   ' Use Eval to get the control object based on the counter
33   Set EditLength = Eval("EditDiffPairLength" & ct)
34
35   EditName.Text = Enet.Name
36
37   ' Get the CES spreadsheet row for this electrical net
38   Set objEnetRow = objCES.ObjectDefs.Sheets("ENET").FindObject(Enet.Name)
39
40   EditLength.Text = objEnetRow.ConstraintEx("ACT_LENGTH_TOF")
41
42   ' Highlight each physical net in the electrical net
43   For Each Pnet In Enet.PhysicalNets
44     ' Get the Net object from the Document object
45     Set netObj = objExpDoc.FindNet(Pnet.Name)
46     ' Highlight the net
47     netObj.Highlighted = True
48   Next
49   ct = ct + 1
50 Next
51
52 ' Fill in the Length Difference and Pair Tolerance.
53 Set objDiffPairRow = objCES.ObjectDefs.Sheets("ENET").FindObject(objDiffPair.Name)
54 ' Get the Pair Tol Max value
55 dpTol = objDiffPairRow.ConstraintEx("DP_TOL")
56 ' Get the Pair Tol Actual Value
57 dpAct = objDiffPairRow.ConstraintEx("ACT_DP_DELTA")
58
59 EditLengthDifference.Text = dpAct
60 EditPairTolerance.Text = dpTol
61
62 ' Enable the Update and Ok buttons.
63 ButtonUpdateDiffPair.Enable = 1
64 ButtonPairOk.Enable = 1
65
66 ' Set the extents to the highlighted nets if the
67 ' CheckFitWindow checkbox control is checked.
68 If CheckFitWindow.Check = 1 Then
69   objExpDoc.ActiveView.SetExtentsToSelection True
70 End If

FIGURE 4 Selecting a differential pair name

Along with the CES 'Application' object, we also use Expedition's 'Application' and 'Document' objects. 'Document' allows us to interact with the working PCB design; it is derived from the 'Application.ActiveDocument' property and is used to highlight the net's features and zoom to their extents in the Expedition workspace. Lines 10 through 14 retrieve the Document object: first the 'Application' object on line 11, and then the 'Document' object itself on line 14. Line 17 sets a variable named 'DiffPairName' to the differential pair name that has been selected in the ComboBox control.

This is accomplished using the 'ComboBox.TextValue' attribute. With this differential pair name we can now get the 'DiffPair' object on line 24 via the 'objDesign.GetDiffPair' method. Now that we have a 'DiffPair' object, we can loop through its child electrical nets. Line 28 starts the appropriate 'For Each' loop, in which the name of each electrical net is placed in the corresponding 'EditBox' control on line 35. The length of the electrical net is then retrieved from the CES spreadsheet on line 38 and placed in the corresponding length 'EditBox' control on line 40.

Next, the features of the electrical nets are highlighted in a nested 'For Each' loop starting on line 43, which iterates over the child physical nets within each electrical net. The physical 'Net' object is retrieved from the Expedition Document object on line 45 and highlighted on line 47.

After the electrical net information is filled out, we populate the length difference and pair tolerance 'EditBox' controls. This is done on lines 52 through 60: the constraint values are retrieved from the CES on lines 55 and 57, and the text properties of the controls are set on lines 59 and 60. We can now enable the 'Update' and 'OK' buttons within the differential pair section, which is done on lines 63 and 64 by setting their respective 'Enable' properties to 1. Line 68 checks whether the 'Fit window when selected' checkbox is checked and, if it is, line 69 sets the extents of the Expedition workspace to the highlighted features.

Enabling the 'Apply' button
This button is not enabled unless the current selection in the Expedition workspace is a single trace that belongs to the working differential pair that the user has selected in the form. To check for this, the program uses the 'OnSelectionChange' event of the 'Document' object. Figure 5 shows the code.

Source: Broadcom
01 ' Get the CES application object
02 Set objCesApp = GetObject(,"CES.Application")
03 ' Get the CES object
04 Set objCES = objCesApp.Tools("CES")
05 ' Get the Design object
06 Set objDesign = objCES.ObjectDefs.Design
07 ' Get the Expedition Application object
08 Set objExpApp = GetObject(,"MGCPCB.Application")
09 ' Get the Expedition Document object
10 Set objExpDoc = objExpApp.ActiveDocument
11
12 Call Scripting.AttachEvents(objExpDoc, "docEvents")
13
14 Sub docEvents_OnSelectionChange(SelectionType)
15   Dim enable
16   Dim pcbTraces, netName, objDiffPair
17   Dim Enet, Pnet
18
19   enable = 0
20   Set pcbTraces = objExpDoc.Traces(epcbSelectSelected)
21
22   ' Make sure only 1 trace is selected
23   If pcbTraces.Count = 1 Then
24     netName = pcbTraces.Item(1).Net.Name
25     Set objDiffPair = objDesign.GetDiffPair(ComboBoxDiffPair.TextValue)
26
27     ' Verify the trace selected belongs to the working differential pair.
28     For Each Enet In objDiffPair.Nets
29       For Each Pnet In Enet.PhysicalNets
30         If netName = Pnet.Name Then
31           enable = 1
32         End If
33       Next
34     Next
35   End If
36
37   ButtonApply.Enable = enable
38 End Sub

FIGURE 5 Enabling the apply button

First we bind the document object event to our event handler function; this is done on line 12. The 'AttachEvents' method takes two parameters: the object reference and the event function prefix. We specify 'docEvents' as the prefix that the automation layer will look for when an event occurs in the 'Document' object. The event handler function we define is 'docEvents_OnSelectionChange', on line 14. The event passes one argument, the 'SelectionType'. Whenever a selection change occurs in the 'Document', this function executes. For our purposes, we want to verify that the selection is a single trace that is part of the differential pair currently selected in the form. Line 20 gets the collection of 'Traces' selected in the 'Document', and line 23 verifies that exactly one trace is selected. Line 24 gets the physical net name of the selected trace. We then get the 'DiffPair' object from the CES on line 25, based on what is currently selected in the form's ComboBox control.


From there, we can traverse the child electrical and physical nets to verify that the selected physical net is part of the 'DiffPair'. If it is, the 'enable' variable is set to 1 on line 31, which in turn enables the 'Apply' button on line 37.

Applying the sawtooth modification
When the 'Apply' button is pressed, the program modifies the selected trace to have the sawtooth bumps according to the parameters from the form. Figure 6 shows the code.

Source: Broadcom
01 ' Get the Trace object that is selected
02 '
03 Set pcbTraces = objExpDoc.Traces(epcbSelectSelected)
04 Set pcbTrace = pcbTraces.Item(1)
05
06 ' Layer that the trace belongs to
07 '
08 pcbTraceLayer = pcbTrace.Layer
09
10 ' Trace width
11 '
12 pcbTraceWidth = pcbTrace.Geometry.LineWidth
13
14 ' Get the Net object for the trace we're working on
15 '
16 pcbTraceNet = pcbTrace.Net.Name
17 Set pcbNetObj = objExpDoc.FindNet(pcbTraceNet)
18
19 ' Delete the original trace
20 '
21 pcbTrace.Delete
22
23 ' Get the parameters from the form
24 '
25 toothLength = CDbl(editToothLength.Text)
26 toothSpacing = CDbl(editToothSpacing.Text)
27 pairSpacing = CDbl(editPairSpacing.Text)
28 If radioToothType.Value = 0 Then
29   toothType = "positive" ' positive | negative: the side of the trace the sawtooth bump goes
30 Else
31   toothType = "negative"
32 End If
33
34 ' number of teeth to add. "Max" for as much as the trace length allows
35 If editToothQty.Text = "Max" Then
36   addAmount = "Max"
37 Else
38   addAmount = CInt(editToothQty.Text)
39 End If
40
41 If radioStartWith.Value = 0 Then
42   startWith = "sawtooth" ' segment | sawtooth
43 Else
44   startWith = "segment" ' segment | sawtooth
45 End If
46
47 Select Case radioDirection.Value
48
49   Case 0 : routDirection = "right"
50   Case 1 : routDirection = "left"
51   Case 2 : routDirection = "up"
52   Case 3 : routDirection = "down"
53 End Select
54
55 ' Do While room is left or the addAmount has been reached
56 ' Use the parameters and calculate the points array for this sawtooth
57 '
58
59 pntsArr(0,0) = sawtoothX1 : pntsArr(1,0) = sawtoothY1 : pntsArr(2,0) = 0.0
60 pntsArr(0,1) = sawtoothX2 : pntsArr(1,1) = sawtoothY2 : pntsArr(2,1) = 0.0
61 pntsArr(0,2) = sawtoothX3 : pntsArr(1,2) = sawtoothY3 : pntsArr(2,2) = 0.0
62 pntsArr(0,3) = sawtoothX4 : pntsArr(1,3) = sawtoothY4 : pntsArr(2,3) = 0.0
63 '
64 ' Add the sawtooth trace.
65 Set pcbAddTrace = objExpDoc.PutTrace(pcbTraceLayer, pcbNetObj, _
66     pcbTraceWidth, 4, pntsArr, _
67     Nothing, epcbUnitCurrent, epcbAnchorNone)

FIGURE 6 Applying the sawtooth modification

First, information is gathered from the selected trace to be modified. This is done on lines 1 through 17: we need to know the layer the trace is on and the width of the trace, and we need to retrieve the 'Net' object the trace belongs to. This information is necessary in order to execute the 'PutTrace' method for the sawtooth bump later on. The original trace is then deleted on line 21 using the 'Trace.Delete' method. The next step is to retrieve all the parameter values from the form's controls, which is done on lines 23 through 53. Using these parameters, we calculate where to start the sawtooth bumps and set up a points array for the 'Document.PutTrace' method. This is performed within a 'Do While' loop that executes only while there is room left to add a sawtooth and the user-defined number of sawtooth segments has not been reached. Lines 59 through 62 demonstrate setting the points array, and line 65 executes the 'Document.PutTrace' method.
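The listing leaves the actual point calculation behind a comment ('calculate the points array for this sawtooth'). Purely as an illustration, and not the authors' actual implementation, the following Python sketch shows one plausible geometry for a horizontal, left-to-right trace with 45-degree entry and exit, mirroring the four-point (x, y, 0.0) layout of 'pntsArr':

# Four points of one sawtooth bump starting at (x0, y0). 'offset' is
# the bump height; 45-degree diagonals mean dx equals dy on entry/exit.
def sawtooth_points(x0, y0, offset, tooth_length, tooth_type="positive"):
    dy = offset if tooth_type == "positive" else -offset
    return [
        (x0, y0, 0.0),                                   # bump start
        (x0 + offset, y0 + dy, 0.0),                     # 45-degree rise
        (x0 + offset + tooth_length, y0 + dy, 0.0),      # end of flat top
        (x0 + 2 * offset + tooth_length, y0, 0.0),       # 45-degree return
    ]

for point in sawtooth_points(0.0, 0.0, offset=10.0, tooth_length=15.0):
    print(point)

Each bump's end point becomes the next bump's start point, which is how the 'Do While' loop in Figure 6 can keep marching along the trace until the quantity or the available room runs out.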

Conclusion This paper has demonstrated the power of the script form editor and the automation layer that Expedition and the CES have available. This problem is only one of many out there that can be solved using automation. We hope this paper will increase your interest in and knowledge of automation. If your PCB design group finds itself repeating time-consuming tasks, consider automation as the solution.

Broadcom 16215 Alton Parkway Irvine CA 92618 USA T: +1 949 926 5000 W: www.broadcom.com



